git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v2 0/4] propose config-based hooks
@ 2020-05-21 18:54 Emily Shaffer
  2020-05-21 18:54 ` [PATCH v2 1/4] doc: propose hooks managed by the config Emily Shaffer
                   ` (4 more replies)
  0 siblings, 5 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-05-21 18:54 UTC (permalink / raw)
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon

This series implements "Stage 1" of the config-based hooks rollout
process as proposed in the design doc. It does not touch the existing
hook implementation or change the way that Git functions - it only adds
a new, independent command.

In the design doc, I mentioned the possibility of including 'git hook
add' and 'git hook edit' in this stage. However, I'd like to get input
from our UX team internally before I get started - I know my own limits,
and coming up with good UX design is one of them ;) Unfortunately, I
won't be able to get time with them until the first week of June, so I
haven't included those commands here.

The series is listed as v2 because I included the updated design doc
with changes pointed out by Junio and brian. That's a good place to
start if you're reviewing the series for the first time. (I'm also
breaking thread with the contributor summit notes to bring the series to
the attention of more contributors who may be interested.)

One point I'd like discussion on especially is the '--porcelain'
command. The intent was to make it very easy for non-builtins to run
hooks; but I'm starting to wonder whether it makes more sense to include
a `git hook run <hookname>`, which makes parallelization possible in the
future if we decide to implement that. Even if we decide it makes sense
to keep 'list --porcelain', I'm not sure what information to include;
providing simply the line to pass to 'sh' seems a little thin.

The next stage from here is to migrate internal callers who use
'find_hook()' now to call the hook library (and teach the hook library
to call find_hook()), which will essentially turn on config-based hooks;
does it make sense to include that stage at the same time as this
series so we aren't checking in unused code?

Thanks all.
 - Emily

Emily Shaffer (4):
  doc: propose hooks managed by the config
  hook: scaffolding for git-hook subcommand
  hook: add list command
  hook: add --porcelain to list command

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/git-hook.txt                    |  63 ++++
 .../technical/config-based-hooks.txt          | 320 ++++++++++++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/hook.c                                |  77 +++++
 git.c                                         |   1 +
 hook.c                                        |  90 +++++
 hook.h                                        |  15 +
 t/t1360-config-based-hooks.sh                 |  69 ++++
 11 files changed, 640 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 Documentation/technical/config-based-hooks.txt
 create mode 100644 builtin/hook.c
 create mode 100644 hook.c
 create mode 100644 hook.h
 create mode 100755 t/t1360-config-based-hooks.sh

-- 
2.27.0.rc0.183.gde8f92d652-goog


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH v2 1/4] doc: propose hooks managed by the config
  2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
@ 2020-05-21 18:54 ` Emily Shaffer
  2020-05-22 10:13   ` Phillip Wood
  2020-05-21 18:54 ` [PATCH v2 2/4] hook: scaffolding for git-hook subcommand Emily Shaffer
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-05-21 18:54 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/Makefile                        |   1 +
 .../technical/config-based-hooks.txt          | 320 ++++++++++++++++++
 2 files changed, 321 insertions(+)
 create mode 100644 Documentation/technical/config-based-hooks.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 15d9d04f31..5b21f31d31 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -80,6 +80,7 @@ SP_ARTICLES += $(API_DOCS)
 TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
+TECH_DOCS += technical/config-based-hooks
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
new file mode 100644
index 0000000000..59cdc25a47
--- /dev/null
+++ b/Documentation/technical/config-based-hooks.txt
@@ -0,0 +1,320 @@
+Configuration-based hook management
+===================================
+
+== Motivation
+
+Treat hooks as a first-class citizen by replacing the .git/hook/hookname path as
+the only source of hooks to execute, in a way which is friendly to users with
+multiple repos which have similar needs.
+
+Redefine "hook" as an event rather than a single script, allowing users to
+perform unrelated actions on a single event.
+
+Take a step closer to safety when copying zipped Git repositories from untrusted
+users.
+
+Make it easier for users to discover Git's hook feature and automate their
+workflows.
+
+== User interfaces
+
+=== Config schema
+
+Hooks can be introduced by editing the configuration manually. There are two new
+sections added, `hook` and `hookcmd`.
+
+==== `hook`
+
+Primarily contains subsections for each hook event. These subsections define
+hook command execution order; hook commands can be specified by passing the
+command directly if no additional configuration is needed, or by passing the
+name of a `hookcmd`. If Git does not find a `hookcmd` whose subsection matches
+the value of the given command string, Git will try to execute the string
+directly. Hooks are executed by passing the resolved command string to the
+shell. Hook event subsections can also contain per-hook-event settings.
+
+Also contains top-level hook execution settings, for example,
+`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`.
+
+----
+[hook "pre-commit"]
+  command = perl-linter
+  command = /usr/bin/git-secrets --pre-commit
+
+[hook "pre-applypatch"]
+  command = perl-linter
+  error = ignore
+
+[hook]
+  runHookDir = interactive
+----
+
+==== `hookcmd`
+
+Defines a hook command and its attributes, which will be used when a hook event
+occurs. Unqualified attributes are assumed to apply to this hook during all hook
+events, but event-specific attributes can also be supplied. The example runs
+`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
+include this config, the hook command will be skipped for all events to which
+it's normally subscribed _except_ `pre-commit`.
+
+----
+[hookcmd "perl-linter"]
+  command = /usr/bin/lint-it --language=perl
+  skip = true
+  pre-commit-skip = false
+----
+
+=== Command-line API
+
+Users should be able to view, reorder, and create hook commands via the command
+line. External tools should be able to view a list of hooks in the correct order
+to run.
+
+*`git hook list <hook-event>`*
+
+*`git hook list (--system|--global|--local|--worktree)`*
+
+*`git hook edit <hook-event>`*
+
+*`git hook add <hook-command> <hook-event> <options...>`*
+
+=== Hook editor
+
+The tool which is presented by `git hook edit <hook-command>`. Ideally, this
+tool should be easier to use than manually editing the config, and then produce
+a concise config afterwards. It may take a form similar to `git rebase
+--interactive`.
+
+== Implementation
+
+=== Library
+
+`hook.c` and `hook.h` are responsible for interacting with the config files. In
+the case when the code generating a hook event doesn't have special concerns
+about how to run the hooks, the hook library will provide a basic API to call
+all hooks in config order with an `argv_array` provided by the code which
+generates the hook event:
+
+*`int run_hooks(const char *hookname, struct argv_array *args)`*
+
+This call includes the hook command provided by `run-command.h:find_hook()`;
+eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
+config is checked against a number of cases:
+
+- "no": the legacy hook will not be run
+- "interactive": Git will prompt the user before running the legacy hook
+- "warn": Git will print a warning to stderr before running the legacy hook
+- "yes" (default): Git will silently run the legacy hook
+
+In case this list is expanded in the future, if a value for `hook.runHookDir` is
+given which Git does not recognize, Git should discard that config entry. For
+example, if "warn" was specified at system level and "junk" was specified at
+global level, Git would resolve the value to "warn"; if the only time the config
+was set was to "junk", Git would use the default value of "yes".
+
+If the caller wants to do something more complicated, the hook library can also
+provide a callback API:
+
+*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
+
+Finally, to facilitate the builtin, the library will also provide the following
+APIs to interact with the config:
+
+----
+int set_hook_commands(const char *hookname, struct string_list *commands,
+	enum config_scope scope);
+int set_hookcmd(const char *hookcmd, struct hookcmd options);
+
+int list_hook_commands(const char *hookname, struct string_list *commands);
+int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
+----
+
+`struct hookcmd` is expected to grow in size over time as more functionality is
+added to hooks; so that other parts of the code don't need to understand the
+config schema, `struct hookcmd` should contain logical values instead of string
+pairs.
+
+----
+struct hookcmd {
+  const char *name;
+  const char *command;
+
+  /* for illustration only; not planned at present */
+  int parallelizable;
+  const char *hookcmd_before;
+  const char *hookcmd_after;
+  enum recovery_action on_fail;
+}
+----
+
+=== Builtin
+
+`builtin/hook.c` is responsible for providing the frontend. It's responsible for
+formatting user-provided data and then calling the library API to set the
+configs as appropriate. The builtin frontend is not responsible for calling the
+config directly, so that other areas of Git can rely on the hook library to
+understand the most recent config schema for hooks.
+
+=== Migration path
+
+==== Stage 0
+
+Hooks are called by running `run-command.h:find_hook()` with the hookname and
+executing the result. The hook library and builtin do not exist. Hooks only
+exist as specially named scripts within `.git/hooks/`.
+
+==== Stage 1
+
+`git hook list --porcelain <hook-event>` is implemented. Users can replace their
+`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
+output. Modifier commands like `git hook add` and `git hook edit` can be
+implemented around this time as well.
+
+==== Stage 2
+
+`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
+end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
+opt-in to config-based hooks simply by creating some in their config; otherwise
+users should remain unaffected by the change.
+
+==== Stage 3
+
+The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
+`hook.runHookDir`. Users can opt into managing their hooks completely via the
+config this way.
+
+==== Stage 4
+
+`.git/hooks` is removed from the template and the hook directory is considered
+deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
+not changed, and `find_hook()` is not removed.
+
+== Caveats
+
+=== Security and repo config
+
+Part of the motivation behind this refactor is to mitigate hooks as an attack
+vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
+however, as the design stands, users can still provide hooks in the repo-level
+config, which is included when a repo is zipped and sent elsewhere.  The
+security of the repo-level config is still under discussion; this design
+generally assumes the repo-level config is secure, which is not true yet. The
+goal is to avoid an overcomplicated design to work around a problem which has
+ceased to exist.
+
+=== Ease of use
+
+The config schema is nontrivial; that's why it's important for the `git hook`
+modifier commands to be usable. Contributors with UX expertise are encouraged to
+share their suggestions.
+
+== Alternative approaches
+
+A previous summary of alternatives exists in the
+archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
+
+=== Status quo
+
+Today users can implement multihooks themselves by using a "trampoline script"
+as their hook, and pointing that script to a directory or list of other scripts
+they wish to run.
+
+=== Hook directories
+
+Other contributors have suggested Git learn about the existence of a directory
+such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
+
+=== Comparison table
+
+.Comparison of alternatives
+|===
+|Feature |Config-based hooks |Hook directories |Status quo
+
+|Supports multiple hooks
+|Natively
+|Natively
+|With user effort
+
+|Safer for zipped repos
+|A little
+|No
+|No
+
+|Previous hooks just work
+|If configured
+|Yes
+|Yes
+
+|Can install one hook to many repos
+|Yes
+|No
+|No
+
+|Discoverability
+|Better (in `git help git`)
+|Same as before
+|Same as before
+
+|Hard to run unexpected hook
+|If configured
+|No
+|No
+|===
+
+== Future work
+
+=== Execution ordering
+
+We may find that config order is insufficient for some users; for example,
+config order makes it difficult to add a new hook to the system or global config
+which runs at the end of the hook list. A new ordering schema should be:
+
+1) Specified by a `hook.order` config, so that users will not unexpectedly see
+their order change;
+
+2) Either dependency or numerically based.
+
+Dependency-based ordering is prone to classic linked-list problems, like a
+cycles and handling of missing dependencies. But, it paves the way for enabling
+parallelization if some tasks truly depend on others.
+
+Numerical ordering makes it tricky for Git to generate suggested ordering
+numbers for each command, but is easy to determine a definitive order.
+
+=== Parallelization
+
+Users with many hooks might want to run them simultaneously, if the hooks don't
+modify state; if one hook depends on another's output, then users will want to
+specify those dependencies. If we decide to solve this problem, we may want to
+look to modern build systems for inspiration on how to manage dependencies and
+parallel tasks.
+
+=== Securing hookdir hooks
+
+With the design as written in this doc, it's still possible for a malicious user
+to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
+zip their repo and send it to another user. It may be necessary to teach Git to
+only allow one-line hooks like this if they were configured outside of the local
+scope; or another approach, like a list of safe projects, might be useful. It
+may also be sufficient (or at least useful) to teach a `hook.disableAll` config
+or similar flag to the Git executable.
+
+=== Submodule inheritance
+
+It's possible some submodules may want to run the identical set of hooks that
+their superrepo runs. While a globally-configured hook set is helpful, it's not
+a great solution for users who have multiple repos-with-submodules under the
+same user. It would be useful for submodules to learn how to run hooks from
+their superrepo's config, or inherit that hook setting.
+
+== Glossary
+
+*hook event*
+
+A point during Git's execution where user scripts may be run, for example,
+_prepare-commit-msg_ or _pre-push_.
+
+*hook command*
+
+A user script or executable which will be run on one or more hook events.
-- 
2.27.0.rc0.183.gde8f92d652-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v2 2/4] hook: scaffolding for git-hook subcommand
  2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
  2020-05-21 18:54 ` [PATCH v2 1/4] doc: propose hooks managed by the config Emily Shaffer
@ 2020-05-21 18:54 ` Emily Shaffer
  2020-05-21 18:54 ` [PATCH v2 3/4] hook: add list command Emily Shaffer
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-05-21 18:54 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 .gitignore                    |  1 +
 Documentation/git-hook.txt    | 19 +++++++++++++++++++
 Makefile                      |  1 +
 builtin.h                     |  1 +
 builtin/hook.c                | 21 +++++++++++++++++++++
 git.c                         |  1 +
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 7 files changed, 55 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 builtin/hook.c
 create mode 100755 t/t1360-config-based-hooks.sh

diff --git a/.gitignore b/.gitignore
index ee509a2ad2..0694a34884 100644
--- a/.gitignore
+++ b/.gitignore
@@ -75,6 +75,7 @@
 /git-grep
 /git-hash-object
 /git-help
+/git-hook
 /git-http-backend
 /git-http-fetch
 /git-http-push
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
new file mode 100644
index 0000000000..2d50c414cc
--- /dev/null
+++ b/Documentation/git-hook.txt
@@ -0,0 +1,19 @@
+git-hook(1)
+===========
+
+NAME
+----
+git-hook - Manage configured hooks
+
+SYNOPSIS
+--------
+[verse]
+'git hook'
+
+DESCRIPTION
+-----------
+You can list, add, and modify hooks with this command.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 3d3a39fc19..fce6ee154e 100644
--- a/Makefile
+++ b/Makefile
@@ -1080,6 +1080,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
+BUILTIN_OBJS += builtin/hook.o
 BUILTIN_OBJS += builtin/index-pack.o
 BUILTIN_OBJS += builtin/init-db.o
 BUILTIN_OBJS += builtin/interpret-trailers.o
diff --git a/builtin.h b/builtin.h
index a5ae15bfe5..4e736499c0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -157,6 +157,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
 int cmd_hash_object(int argc, const char **argv, const char *prefix);
 int cmd_help(int argc, const char **argv, const char *prefix);
+int cmd_hook(int argc, const char **argv, const char *prefix);
 int cmd_index_pack(int argc, const char **argv, const char *prefix);
 int cmd_init_db(int argc, const char **argv, const char *prefix);
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);
diff --git a/builtin/hook.c b/builtin/hook.c
new file mode 100644
index 0000000000..b2bbc84d4d
--- /dev/null
+++ b/builtin/hook.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+#include "builtin.h"
+#include "parse-options.h"
+
+static const char * const builtin_hook_usage[] = {
+	N_("git hook"),
+	NULL
+};
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index a2d337eed7..99372529a2 100644
--- a/git.c
+++ b/git.c
@@ -517,6 +517,7 @@ static struct cmd_struct commands[] = {
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
+	{ "hook", cmd_hook, RUN_SETUP },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "init", cmd_init_db },
 	{ "init-db", cmd_init_db },
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
new file mode 100755
index 0000000000..34b0df5216
--- /dev/null
+++ b/t/t1360-config-based-hooks.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+test_description='config-managed multihooks, including git-hook command'
+
+. ./test-lib.sh
+
+test_expect_success 'git hook command does not crash' '
+	git hook
+'
+
+test_done
-- 
2.27.0.rc0.183.gde8f92d652-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v2 3/4] hook: add list command
  2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
  2020-05-21 18:54 ` [PATCH v2 1/4] doc: propose hooks managed by the config Emily Shaffer
  2020-05-21 18:54 ` [PATCH v2 2/4] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-05-21 18:54 ` Emily Shaffer
  2020-05-22 10:27   ` Phillip Wood
  2020-05-24 23:00   ` Johannes Schindelin
  2020-05-21 18:54 ` [PATCH v2 4/4] hook: add --porcelain to " Emily Shaffer
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
  4 siblings, 2 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-05-21 18:54 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.

Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

For example:

  $ git config --list | grep ^hook
  hook.pre-commit.command=baz
  hook.pre-commit.command=~/bar.sh
  hookcmd.baz.command=~/baz/from/hookcmd.sh

  $ git hook list pre-commit
  ~/baz/from/hookcmd.sh
  ~/bar.sh

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 37 +++++++++++++-
 Makefile                      |  1 +
 builtin/hook.c                | 55 +++++++++++++++++++--
 hook.c                        | 90 +++++++++++++++++++++++++++++++++++
 hook.h                        | 15 ++++++
 t/t1360-config-based-hooks.sh | 51 +++++++++++++++++++-
 6 files changed, 242 insertions(+), 7 deletions(-)
 create mode 100644 hook.c
 create mode 100644 hook.h

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 2d50c414cc..e458586e96 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,12 +8,47 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook'
+'git hook' list <hook-name>
 
 DESCRIPTION
 -----------
 You can list, add, and modify hooks with this command.
 
+This command parses the default configuration files for sections "hook" and
+"hookcmd". "hook" is used to describe the commands which will be run during a
+particular hook event; commands are run in config order. "hookcmd" is used to
+describe attributes of a specific command. If additional attributes don't need
+to be specified, a command to run can be specified directly in the "hook"
+section; if a "hookcmd" by that name isn't found, Git will attempt to run the
+provided value directly. For example:
+
+Global config
+----
+  [hook "post-commit"]
+    command = "linter"
+    command = "~/typocheck.sh"
+
+  [hookcmd "linter"]
+    command = "/bin/linter --c"
+----
+
+Local config
+----
+  [hook "prepare-commit-msg"]
+    command = "linter"
+  [hook "post-commit"]
+    command = "python ~/run-test-suite.py"
+----
+
+COMMANDS
+--------
+
+list <hook-name>::
+
+List the hooks which have been configured for <hook-name>. Hooks appear
+in the order they should be run, and note the config scope where the relevant
+`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index fce6ee154e..b7bbf3be7b 100644
--- a/Makefile
+++ b/Makefile
@@ -894,6 +894,7 @@ LIB_OBJS += grep.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += interdiff.o
 LIB_OBJS += json-writer.o
diff --git a/builtin/hook.c b/builtin/hook.c
index b2bbc84d4d..cfd8e388bd 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -1,21 +1,68 @@
 #include "cache.h"
 
 #include "builtin.h"
+#include "config.h"
+#include "hook.h"
 #include "parse-options.h"
+#include "strbuf.h"
 
 static const char * const builtin_hook_usage[] = {
-	N_("git hook"),
+	N_("git hook list <hookname>"),
 	NULL
 };
 
-int cmd_hook(int argc, const char **argv, const char *prefix)
+static int list(int argc, const char **argv, const char *prefix)
 {
-	struct option builtin_hook_options[] = {
+	struct list_head *head, *pos;
+	struct hook *item;
+	struct strbuf hookname = STRBUF_INIT;
+
+	struct option list_options[] = {
 		OPT_END(),
 	};
 
-	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+	argc = parse_options(argc, argv, prefix, list_options,
 			     builtin_hook_usage, 0);
 
+	if (argc < 1) {
+		usage_msg_opt("a hookname must be provided to operate on.",
+			      builtin_hook_usage, list_options);
+	}
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	head = hook_list(&hookname);
+
+	if (!head) {
+		printf(_("no commands configured for hook '%s'\n"),
+		       hookname.buf);
+		return 0;
+	}
+
+	list_for_each(pos, head) {
+		item = list_entry(pos, struct hook, list);
+		if (item)
+			printf("%s:\t%s\n",
+			       config_scope_name(item->origin),
+			       item->command.buf);
+	}
+
+	clear_hook_list();
+	strbuf_release(&hookname);
+
 	return 0;
 }
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+	if (argc < 2)
+		usage_with_options(builtin_hook_usage, builtin_hook_options);
+
+	if (!strcmp(argv[1], "list"))
+		return list(argc - 1, argv + 1, prefix);
+
+	usage_with_options(builtin_hook_usage, builtin_hook_options);
+}
diff --git a/hook.c b/hook.c
new file mode 100644
index 0000000000..9dfc1a885e
--- /dev/null
+++ b/hook.c
@@ -0,0 +1,90 @@
+#include "cache.h"
+
+#include "hook.h"
+#include "config.h"
+
+static LIST_HEAD(hook_head);
+
+void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		strbuf_release(&ptr->command);
+		free(ptr);
+	}
+}
+
+static void emplace_hook(struct list_head *pos, const char *command)
+{
+	struct hook *to_add = malloc(sizeof(struct hook));
+	to_add->origin = current_config_scope();
+	strbuf_init(&to_add->command, 0);
+	strbuf_addstr(&to_add->command, command);
+
+	list_add_tail(&to_add->list, pos);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(void)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, &hook_head)
+		remove_hook(pos);
+}
+
+static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
+{
+	const char *hook_key = hook_key_cb;
+
+	if (!strcmp(key, hook_key)) {
+		const char *command = value;
+		struct strbuf hookcmd_name = STRBUF_INIT;
+		struct list_head *pos = NULL, *tmp = NULL;
+
+		/* Check if a hookcmd with that name exists. */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
+		git_config_get_value(hookcmd_name.buf, &command);
+
+		if (!command)
+			BUG("git_config_get_value overwrote a string it shouldn't have");
+
+		/*
+		 * TODO: implement an option-getting callback, e.g.
+		 *   get configs by pattern hookcmd.$value.*
+		 *   for each key+value, do_callback(key, value, cb_data)
+		 */
+
+		list_for_each_safe(pos, tmp, &hook_head) {
+			struct hook *hook = list_entry(pos, struct hook, list);
+			/*
+			 * The list of hooks to run can be reordered by being redeclared
+			 * in the config. Options about hook ordering should be checked
+			 * here.
+			 */
+			if (0 == strcmp(hook->command.buf, command))
+				remove_hook(pos);
+		}
+		emplace_hook(pos, command);
+	}
+
+	return 0;
+}
+
+struct list_head* hook_list(const struct strbuf* hookname)
+{
+	struct strbuf hook_key = STRBUF_INIT;
+
+	if (!hookname)
+		return NULL;
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
+
+	git_config(hook_config_lookup, (void*)hook_key.buf);
+
+	return &hook_head;
+}
diff --git a/hook.h b/hook.h
new file mode 100644
index 0000000000..aaf6511cff
--- /dev/null
+++ b/hook.h
@@ -0,0 +1,15 @@
+#include "config.h"
+#include "list.h"
+#include "strbuf.h"
+
+struct hook
+{
+	struct list_head list;
+	enum config_scope origin;
+	struct strbuf command;
+};
+
+struct list_head* hook_list(const struct strbuf *hookname);
+
+void free_hook(struct hook *ptr);
+void clear_hook_list(void);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 34b0df5216..4e46d7dd4e 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -4,8 +4,55 @@ test_description='config-managed multihooks, including git-hook command'
 
 . ./test-lib.sh
 
-test_expect_success 'git hook command does not crash' '
-	git hook
+test_expect_success 'git hook rejects commands without a mode' '
+	test_must_fail git hook pre-commit
+'
+
+
+test_expect_success 'git hook rejects commands without a hookname' '
+	test_must_fail git hook list
+'
+
+test_expect_success 'setup hooks in global, and local' '
+	git config --add --local hook.pre-commit.command "/path/ghi" &&
+	git config --add --global hook.pre-commit.command "/path/def"
+'
+
+test_expect_success 'git hook list orders by config order' '
+	cat >expected <<-\EOF &&
+	global:	/path/def
+	local:	/path/ghi
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list dereferences a hookcmd' '
+	git config --add --local hook.pre-commit.command "abc" &&
+	git config --add --global hookcmd.abc.command "/path/abc" &&
+
+	cat >expected <<-\EOF &&
+	global:	/path/def
+	local:	/path/ghi
+	local:	/path/abc
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list reorders on duplicate commands' '
+	git config --add --local hook.pre-commit.command "/path/def" &&
+
+	cat >expected <<-\EOF &&
+	local:	/path/ghi
+	local:	/path/abc
+	local:	/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
 '
 
 test_done
-- 
2.27.0.rc0.183.gde8f92d652-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v2 4/4] hook: add --porcelain to list command
  2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
                   ` (2 preceding siblings ...)
  2020-05-21 18:54 ` [PATCH v2 3/4] hook: add list command Emily Shaffer
@ 2020-05-21 18:54 ` Emily Shaffer
  2020-05-24 23:00   ` Johannes Schindelin
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
  4 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-05-21 18:54 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list --porcelain <hookname>', which prints simply the
commands to be run in the order suggested by the config. This option is
intended for use by user scripts, wrappers, or out-of-process Git
commands which still want to execute hooks. For example, the following
snippet might be added to git-send-email.perl to introduce a
`pre-send-email` hook:

  sub pre_send_email {
    open(my $fh, 'git hook list --porcelain pre-send-email |');
    chomp(my @hooks = <$fh>);
    close($fh);

    foreach $hook (@hooks) {
            system $hook
    }

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 13 +++++++++++--
 builtin/hook.c                | 17 +++++++++++++----
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 3 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index e458586e96..0854035ce2 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,7 +8,7 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook' list <hook-name>
+'git hook' list [--porcelain] <hook-name>
 
 DESCRIPTION
 -----------
@@ -43,11 +43,20 @@ Local config
 COMMANDS
 --------
 
-list <hook-name>::
+list [--porcelain] <hook-name>::
 
 List the hooks which have been configured for <hook-name>. Hooks appear
 in the order they should be run, and note the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
++
+If `--porcelain` is specified, instead print the commands alone, separated by
+newlines, for easy parsing by a script.
+
+OPTIONS
+-------
+--porcelain::
+	With `list`, print the commands in the order they should be run,
+	separated by newlines, for easy parsing by a script.
 
 GIT
 ---
diff --git a/builtin/hook.c b/builtin/hook.c
index cfd8e388bd..2e51c84c81 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -16,8 +16,11 @@ static int list(int argc, const char **argv, const char *prefix)
 	struct list_head *head, *pos;
 	struct hook *item;
 	struct strbuf hookname = STRBUF_INIT;
+	int porcelain = 0;
 
 	struct option list_options[] = {
+		OPT_BOOL(0, "porcelain", &porcelain,
+			 "format for execution by a script"),
 		OPT_END(),
 	};
 
@@ -29,6 +32,8 @@ static int list(int argc, const char **argv, const char *prefix)
 			      builtin_hook_usage, list_options);
 	}
 
+
+
 	strbuf_addstr(&hookname, argv[0]);
 
 	head = hook_list(&hookname);
@@ -41,10 +46,14 @@ static int list(int argc, const char **argv, const char *prefix)
 
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
-		if (item)
-			printf("%s:\t%s\n",
-			       config_scope_name(item->origin),
-			       item->command.buf);
+		if (item) {
+			if (porcelain)
+				printf("%s\n", item->command.buf);
+			else
+				printf("%s:\t%s\n",
+				       config_scope_name(item->origin),
+				       item->command.buf);
+		}
 	}
 
 	clear_hook_list();
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 4e46d7dd4e..3296d8af45 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -55,4 +55,15 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git hook list --porcelain prints just the command' '
+	cat >expected <<-\EOF &&
+	/path/ghi
+	/path/abc
+	/path/def
+	EOF
+
+	git hook list --porcelain pre-commit >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.27.0.rc0.183.gde8f92d652-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 1/4] doc: propose hooks managed by the config
  2020-05-21 18:54 ` [PATCH v2 1/4] doc: propose hooks managed by the config Emily Shaffer
@ 2020-05-22 10:13   ` Phillip Wood
  2020-06-09 20:26     ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Phillip Wood @ 2020-05-22 10:13 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

Thanks for working on this

On 21/05/2020 19:54, Emily Shaffer wrote:
> Begin a design document for config-based hooks, managed via git-hook.
> Focus on an overview of the implementation and motivation for design
> decisions. Briefly discuss the alternatives considered before this
> point. Also, attempt to redefine terms to fit into a multihook world.
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  Documentation/Makefile                        |   1 +
>  .../technical/config-based-hooks.txt          | 320 ++++++++++++++++++
>  2 files changed, 321 insertions(+)
>  create mode 100644 Documentation/technical/config-based-hooks.txt
> 
> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index 15d9d04f31..5b21f31d31 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -80,6 +80,7 @@ SP_ARTICLES += $(API_DOCS)
>  TECH_DOCS += MyFirstContribution
>  TECH_DOCS += MyFirstObjectWalk
>  TECH_DOCS += SubmittingPatches
> +TECH_DOCS += technical/config-based-hooks
>  TECH_DOCS += technical/hash-function-transition
>  TECH_DOCS += technical/http-protocol
>  TECH_DOCS += technical/index-format
> diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
> new file mode 100644
> index 0000000000..59cdc25a47
> --- /dev/null
> +++ b/Documentation/technical/config-based-hooks.txt
> @@ -0,0 +1,320 @@
> +Configuration-based hook management
> +===================================
> +
> +== Motivation
> +
> +Treat hooks as a first-class citizen by replacing the .git/hook/hookname path as
> +the only source of hooks to execute, in a way which is friendly to users with
> +multiple repos which have similar needs.
> +
> +Redefine "hook" as an event rather than a single script, allowing users to
> +perform unrelated actions on a single event.
> +
> +Take a step closer to safety when copying zipped Git repositories from untrusted
> +users.

Having read through this (admittedly fairly quickly) I'm not sure what
that step is

> +
> +Make it easier for users to discover Git's hook feature and automate their
> +workflows.
> +
> +== User interfaces
> +
> +=== Config schema
> +
> +Hooks can be introduced by editing the configuration manually. There are two new
> +sections added, `hook` and `hookcmd`.
> +
> +==== `hook`
> +
> +Primarily contains subsections for each hook event. These subsections define
> +hook command execution order;

May be "The order of these subsections define the hook command execution
order" ?

> hook commands can be specified by passing the
> +command directly if no additional configuration is needed, or by passing the
> +name of a `hookcmd`.

I know what you mean by "passing" but as this section is talking about
config settings perhaps it should refer to the keys and values.

> If Git does not find a `hookcmd` whose subsection matches
> +the value of the given command string, Git will try to execute the string
> +directly. Hooks are executed by passing the resolved command string to the
> +shell.

Do we really need to invoke the shell just to split a command-line and
look up the command in $PATH? If we used split_commandline() in alias.c
then we could avoid invoking this extra process for each hook command.

> Hook event subsections can also contain per-hook-event settings.
> +
> +Also contains top-level hook execution settings, for example,
> +`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`.

(see sections ...) ? for the forward references to these settings?

> +
> +----
> +[hook "pre-commit"]
> +  command = perl-linter
> +  command = /usr/bin/git-secrets --pre-commit
> +
> +[hook "pre-applypatch"]
> +  command = perl-linter
> +  error = ignore
> +
> +[hook]
> +  runHookDir = interactive
> +----
> +
> +==== `hookcmd`
> +
> +Defines a hook command and its attributes, which will be used when a hook event
> +occurs. Unqualified attributes are assumed to apply to this hook during all hook
> +events, but event-specific attributes can also be supplied. The example runs
> +`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
> +include this config, the hook command will be skipped for all events to which
> +it's normally subscribed _except_ `pre-commit`.
> +
> +----
> +[hookcmd "perl-linter"]
> +  command = /usr/bin/lint-it --language=perl
> +  skip = true
> +  pre-commit-skip = false
> +----
> +
> +=== Command-line API
> +
> +Users should be able to view, reorder, and create hook commands via the command
> +line. External tools should be able to view a list of hooks in the correct order
> +to run.
> +
> +*`git hook list <hook-event>`*
> +
> +*`git hook list (--system|--global|--local|--worktree)`*
> +
> +*`git hook edit <hook-event>`*
> +
> +*`git hook add <hook-command> <hook-event> <options...>`*
> +
> +=== Hook editor
> +
> +The tool which is presented by `git hook edit <hook-command>`. Ideally, this
> +tool should be easier to use than manually editing the config, and then produce
> +a concise config afterwards. It may take a form similar to `git rebase
> +--interactive`.

rebase -i is not necessarily an exemplar of user interface design, what
sort of thing do you have in mind?

> +
> +== Implementation
> +
> +=== Library
> +
> +`hook.c` and `hook.h` are responsible for interacting with the config files. In
> +the case when the code generating a hook event doesn't have special concerns
> +about how to run the hooks, the hook library will provide a basic API to call
> +all hooks in config order with an `argv_array` provided by the code which
> +generates the hook event:
> +
> +*`int run_hooks(const char *hookname, struct argv_array *args)`*
> +
> +This call includes the hook command provided by `run-command.h:find_hook()`;
> +eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
> +config is checked against a number of cases:
> +
> +- "no": the legacy hook will not be run
> +- "interactive": Git will prompt the user before running the legacy hook
> +- "warn": Git will print a warning to stderr before running the legacy hook
> +- "yes" (default): Git will silently run the legacy hook
> +
> +In case this list is expanded in the future, if a value for `hook.runHookDir` is
> +given which Git does not recognize, Git should discard that config entry. For
> +example, if "warn" was specified at system level and "junk" was specified at
> +global level, Git would resolve the value to "warn"; if the only time the config
> +was set was to "junk", Git would use the default value of "yes".
> +
> +If the caller wants to do something more complicated, the hook library can also
> +provide a callback API:
> +
> +*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
> +
> +Finally, to facilitate the builtin, the library will also provide the following
> +APIs to interact with the config:
> +
> +----
> +int set_hook_commands(const char *hookname, struct string_list *commands,
> +	enum config_scope scope);
> +int set_hookcmd(const char *hookcmd, struct hookcmd options);
> +
> +int list_hook_commands(const char *hookname, struct string_list *commands);
> +int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
> +----
> +
> +`struct hookcmd` is expected to grow in size over time as more functionality is
> +added to hooks; so that other parts of the code don't need to understand the
> +config schema, `struct hookcmd` should contain logical values instead of string
> +pairs.
> +
> +----
> +struct hookcmd {
> +  const char *name;
> +  const char *command;
> +
> +  /* for illustration only; not planned at present */
> +  int parallelizable;
> +  const char *hookcmd_before;
> +  const char *hookcmd_after;
> +  enum recovery_action on_fail;
> +}
> +----
> +
> +=== Builtin
> +
> +`builtin/hook.c` is responsible for providing the frontend. It's responsible for
> +formatting user-provided data and then calling the library API to set the
> +configs as appropriate. The builtin frontend is not responsible for calling the
> +config directly, so that other areas of Git can rely on the hook library to
> +understand the most recent config schema for hooks.
> +
> +=== Migration path
> +
> +==== Stage 0
> +
> +Hooks are called by running `run-command.h:find_hook()` with the hookname and
> +executing the result. The hook library and builtin do not exist. Hooks only
> +exist as specially named scripts within `.git/hooks/`.
> +
> +==== Stage 1
> +
> +`git hook list --porcelain <hook-event>` is implemented. Users can replace their
> +`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
> +output. Modifier commands like `git hook add` and `git hook edit` can be
> +implemented around this time as well.
> +
> +==== Stage 2
> +
> +`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
> +end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
> +opt-in to config-based hooks simply by creating some in their config; otherwise
> +users should remain unaffected by the change.
> +
> +==== Stage 3
> +
> +The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
> +`hook.runHookDir`. Users can opt into managing their hooks completely via the
> +config this way.
> +
> +==== Stage 4
> +
> +`.git/hooks` is removed from the template and the hook directory is considered
> +deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
> +not changed, and `find_hook()` is not removed.
> +
> +== Caveats
> +
> +=== Security and repo config
> +
> +Part of the motivation behind this refactor is to mitigate hooks as an attack
> +vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
> +however, as the design stands, users can still provide hooks in the repo-level
> +config, which is included when a repo is zipped and sent elsewhere.  The
> +security of the repo-level config is still under discussion; this design
> +generally assumes the repo-level config is secure, which is not true yet. The
> +goal is to avoid an overcomplicated design to work around a problem which has
> +ceased to exist.
> +
> +=== Ease of use
> +
> +The config schema is nontrivial; that's why it's important for the `git hook`
> +modifier commands to be usable.

That's an important point

> Contributors with UX expertise are encouraged to
> +share their suggestions.
> +
> +== Alternative approaches
> +
> +A previous summary of alternatives exists in the
> +archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
> +
> +=== Status quo
> +
> +Today users can implement multihooks themselves by using a "trampoline script"
> +as their hook, and pointing that script to a directory or list of other scripts
> +they wish to run.
> +
> +=== Hook directories
> +
> +Other contributors have suggested Git learn about the existence of a directory
> +such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
> +
> +=== Comparison table
> +
> +.Comparison of alternatives
> +|===
> +|Feature |Config-based hooks |Hook directories |Status quo
> +
> +|Supports multiple hooks
> +|Natively
> +|Natively
> +|With user effort
> +
> +|Safer for zipped repos
> +|A little
> +|No
> +|No
> +
> +|Previous hooks just work
> +|If configured
> +|Yes
> +|Yes
> +
> +|Can install one hook to many repos
> +|Yes
> +|No
> +|No
> +
> +|Discoverability
> +|Better (in `git help git`)
> +|Same as before
> +|Same as before
> +
> +|Hard to run unexpected hook
> +|If configured
> +|No
> +|No
> +|===
> +
> +== Future work
> +
> +=== Execution ordering
> +
> +We may find that config order is insufficient for some users; for example,
> +config order makes it difficult to add a new hook to the system or global config
> +which runs at the end of the hook list. A new ordering schema should be:
> +
> +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> +their order change;
> +
> +2) Either dependency or numerically based.
> +
> +Dependency-based ordering is prone to classic linked-list problems, like a
> +cycles and handling of missing dependencies. But, it paves the way for enabling
> +parallelization if some tasks truly depend on others.
> +
> +Numerical ordering makes it tricky for Git to generate suggested ordering
> +numbers for each command, but is easy to determine a definitive order.
> +
> +=== Parallelization
> +
> +Users with many hooks might want to run them simultaneously, if the hooks don't
> +modify state; if one hook depends on another's output, then users will want to
> +specify those dependencies. If we decide to solve this problem, we may want to
> +look to modern build systems for inspiration on how to manage dependencies and
> +parallel tasks.
> +
> +=== Securing hookdir hooks
> +
> +With the design as written in this doc, it's still possible for a malicious user
> +to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
> +zip their repo and send it to another user. It may be necessary to teach Git to
> +only allow one-line hooks like this if they were configured outside of the local
> +scope;

Does "disabling one-line hooks" mean "disable passing command line
arguments to the hook"? I'm not sure that gains much security - can't I
just set 'hook.pre-receive.command = ./delete-everything' and include
delete-everything in my malicious repo?

Best Wishes

Phillip

> or another approach, like a list of safe projects, might be useful. It
> +may also be sufficient (or at least useful) to teach a `hook.disableAll` config
> +or similar flag to the Git executable.
> +
> +=== Submodule inheritance
> +
> +It's possible some submodules may want to run the identical set of hooks that
> +their superrepo runs. While a globally-configured hook set is helpful, it's not
> +a great solution for users who have multiple repos-with-submodules under the
> +same user. It would be useful for submodules to learn how to run hooks from
> +their superrepo's config, or inherit that hook setting.
> +
> +== Glossary
> +
> +*hook event*
> +
> +A point during Git's execution where user scripts may be run, for example,
> +_prepare-commit-msg_ or _pre-push_.
> +
> +*hook command*
> +
> +A user script or executable which will be run on one or more hook events.
> 


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 3/4] hook: add list command
  2020-05-21 18:54 ` [PATCH v2 3/4] hook: add list command Emily Shaffer
@ 2020-05-22 10:27   ` Phillip Wood
  2020-06-09 21:49     ` Emily Shaffer
  2020-05-24 23:00   ` Johannes Schindelin
  1 sibling, 1 reply; 170+ messages in thread
From: Phillip Wood @ 2020-05-22 10:27 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

On 21/05/2020 19:54, Emily Shaffer wrote:
> Teach 'git hook list <hookname>', which checks the known configs in
> order to create an ordered list of hooks to run on a given hook event.
> 
> Multiple commands can be specified for a given hook by providing
> multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
> run in config order. If more properties need to be set on a given hook
> in the future, commands can also be specified by providing
> "hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
> <hookcmd-name>]" subsection; at minimum, this subsection must contain a
> "hookcmd.<hookcmd-name>.command = <path-to-hook>" line.
> 
> For example:
> 
>   $ git config --list | grep ^hook
>   hook.pre-commit.command=baz
>   hook.pre-commit.command=~/bar.sh
>   hookcmd.baz.command=~/baz/from/hookcmd.sh
> 
>   $ git hook list pre-commit
>   ~/baz/from/hookcmd.sh
>   ~/bar.sh
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  Documentation/git-hook.txt    | 37 +++++++++++++-
>  Makefile                      |  1 +
>  builtin/hook.c                | 55 +++++++++++++++++++--
>  hook.c                        | 90 +++++++++++++++++++++++++++++++++++
>  hook.h                        | 15 ++++++
>  t/t1360-config-based-hooks.sh | 51 +++++++++++++++++++-
>  6 files changed, 242 insertions(+), 7 deletions(-)
>  create mode 100644 hook.c
>  create mode 100644 hook.h
> 
> diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
> index 2d50c414cc..e458586e96 100644
> --- a/Documentation/git-hook.txt
> +++ b/Documentation/git-hook.txt
> @@ -8,12 +8,47 @@ git-hook - Manage configured hooks
>  SYNOPSIS
>  --------
>  [verse]
> -'git hook'
> +'git hook' list <hook-name>
>  
>  DESCRIPTION
>  -----------
>  You can list, add, and modify hooks with this command.
>  
> +This command parses the default configuration files for sections "hook" and
> +"hookcmd". "hook" is used to describe the commands which will be run during a
> +particular hook event; commands are run in config order. "hookcmd" is used to
> +describe attributes of a specific command. If additional attributes don't need
> +to be specified, a command to run can be specified directly in the "hook"
> +section; if a "hookcmd" by that name isn't found, Git will attempt to run the
> +provided value directly. For example:
> +
> +Global config
> +----
> +  [hook "post-commit"]
> +    command = "linter"
> +    command = "~/typocheck.sh"
> +
> +  [hookcmd "linter"]
> +    command = "/bin/linter --c"
> +----
> +
> +Local config
> +----
> +  [hook "prepare-commit-msg"]
> +    command = "linter"
> +  [hook "post-commit"]
> +    command = "python ~/run-test-suite.py"
> +----
> +
> +COMMANDS
> +--------
> +
> +list <hook-name>::
> +
> +List the hooks which have been configured for <hook-name>. Hooks appear
> +in the order they should be run, and note the config scope where the relevant
> +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
> +
>  GIT
>  ---
>  Part of the linkgit:git[1] suite
> diff --git a/Makefile b/Makefile
> index fce6ee154e..b7bbf3be7b 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -894,6 +894,7 @@ LIB_OBJS += grep.o
>  LIB_OBJS += hashmap.o
>  LIB_OBJS += help.o
>  LIB_OBJS += hex.o
> +LIB_OBJS += hook.o
>  LIB_OBJS += ident.o
>  LIB_OBJS += interdiff.o
>  LIB_OBJS += json-writer.o
> diff --git a/builtin/hook.c b/builtin/hook.c
> index b2bbc84d4d..cfd8e388bd 100644
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -1,21 +1,68 @@
>  #include "cache.h"
>  
>  #include "builtin.h"
> +#include "config.h"
> +#include "hook.h"
>  #include "parse-options.h"
> +#include "strbuf.h"
>  
>  static const char * const builtin_hook_usage[] = {
> -	N_("git hook"),
> +	N_("git hook list <hookname>"),
>  	NULL
>  };
>  
> -int cmd_hook(int argc, const char **argv, const char *prefix)
> +static int list(int argc, const char **argv, const char *prefix)
>  {
> -	struct option builtin_hook_options[] = {
> +	struct list_head *head, *pos;
> +	struct hook *item;
> +	struct strbuf hookname = STRBUF_INIT;
> +
> +	struct option list_options[] = {
>  		OPT_END(),
>  	};
>  
> -	argc = parse_options(argc, argv, prefix, builtin_hook_options,
> +	argc = parse_options(argc, argv, prefix, list_options,
>  			     builtin_hook_usage, 0);
>  
> +	if (argc < 1) {
> +		usage_msg_opt("a hookname must be provided to operate on.",
> +			      builtin_hook_usage, list_options);
> +	}
> +
> +	strbuf_addstr(&hookname, argv[0]);
> +
> +	head = hook_list(&hookname);
> +
> +	if (!head) {
> +		printf(_("no commands configured for hook '%s'\n"),
> +		       hookname.buf);
> +		return 0;
> +	}
> +
> +	list_for_each(pos, head) {
> +		item = list_entry(pos, struct hook, list);
> +		if (item)
> +			printf("%s:\t%s\n",
> +			       config_scope_name(item->origin),
> +			       item->command.buf);
> +	}
> +
> +	clear_hook_list();
> +	strbuf_release(&hookname);
> +
>  	return 0;
>  }
> +
> +int cmd_hook(int argc, const char **argv, const char *prefix)
> +{
> +	struct option builtin_hook_options[] = {
> +		OPT_END(),
> +	};
> +	if (argc < 2)
> +		usage_with_options(builtin_hook_usage, builtin_hook_options);
> +
> +	if (!strcmp(argv[1], "list"))
> +		return list(argc - 1, argv + 1, prefix);
> +
> +	usage_with_options(builtin_hook_usage, builtin_hook_options);
> +}
> diff --git a/hook.c b/hook.c
> new file mode 100644
> index 0000000000..9dfc1a885e
> --- /dev/null
> +++ b/hook.c
> @@ -0,0 +1,90 @@
> +#include "cache.h"
> +
> +#include "hook.h"
> +#include "config.h"
> +
> +static LIST_HEAD(hook_head);
> +
> +void free_hook(struct hook *ptr)
> +{
> +	if (ptr) {
> +		strbuf_release(&ptr->command);
> +		free(ptr);
> +	}
> +}
> +
> +static void emplace_hook(struct list_head *pos, const char *command)
> +{
> +	struct hook *to_add = malloc(sizeof(struct hook));
> +	to_add->origin = current_config_scope();
> +	strbuf_init(&to_add->command, 0);
> +	strbuf_addstr(&to_add->command, command);
> +
> +	list_add_tail(&to_add->list, pos);
> +}
> +
> +static void remove_hook(struct list_head *to_remove)
> +{
> +	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
> +	list_del(to_remove);
> +	free_hook(hook_to_remove);
> +}
> +
> +void clear_hook_list(void)
> +{
> +	struct list_head *pos, *tmp;
> +	list_for_each_safe(pos, tmp, &hook_head)
> +		remove_hook(pos);
> +}
> +
> +static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
> +{
> +	const char *hook_key = hook_key_cb;
> +
> +	if (!strcmp(key, hook_key)) {
> +		const char *command = value;
> +		struct strbuf hookcmd_name = STRBUF_INIT;
> +		struct list_head *pos = NULL, *tmp = NULL;
> +
> +		/* Check if a hookcmd with that name exists. */
> +		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
> +		git_config_get_value(hookcmd_name.buf, &command);

This looks dodgy to me. This code is called by git_config() as it parses
the config files, so it has not had a chance to fully populate the
config cache used by git_config_get_value(). I think the test below
passes because the hookcmd setting is set in the global file and the
hook setting is set in the local file so when we have already parsed the
hookcmd setting when we come to look it up. The same comment applies to
the hypothetical ordering config mentioned below. I think it would be
better to collect the list of hook.<event>.command settings in this
callback and then look up any hookcmd settings for those hook commands
after we've finished reading all of the config files.

> +
> +		if (!command)
> +			BUG("git_config_get_value overwrote a string it shouldn't have");
> +
> +		/*
> +		 * TODO: implement an option-getting callback, e.g.
> +		 *   get configs by pattern hookcmd.$value.*
> +		 *   for each key+value, do_callback(key, value, cb_data)
> +		 */
> +
> +		list_for_each_safe(pos, tmp, &hook_head) {
> +			struct hook *hook = list_entry(pos, struct hook, list);
> +			/*
> +			 * The list of hooks to run can be reordered by being redeclared
> +			 * in the config. Options about hook ordering should be checked
> +			 * here.
> +			 */
> +			if (0 == strcmp(hook->command.buf, command))
> +				remove_hook(pos);
> +		}
> +		emplace_hook(pos, command);
> +	}
> +
> +	return 0;
> +}
> +
> +struct list_head* hook_list(const struct strbuf* hookname)
> +{
> +	struct strbuf hook_key = STRBUF_INIT;
> +
> +	if (!hookname)
> +		return NULL;
> +
> +	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
> +
> +	git_config(hook_config_lookup, (void*)hook_key.buf);
> +
> +	return &hook_head;
> +}
> diff --git a/hook.h b/hook.h
> new file mode 100644
> index 0000000000..aaf6511cff
> --- /dev/null
> +++ b/hook.h
> @@ -0,0 +1,15 @@
> +#include "config.h"
> +#include "list.h"
> +#include "strbuf.h"
> +
> +struct hook
> +{
> +	struct list_head list;
> +	enum config_scope origin;
> +	struct strbuf command;
> +};
> +
> +struct list_head* hook_list(const struct strbuf *hookname);
> +
> +void free_hook(struct hook *ptr);
> +void clear_hook_list(void);
> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index 34b0df5216..4e46d7dd4e 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -4,8 +4,55 @@ test_description='config-managed multihooks, including git-hook command'
>  
>  . ./test-lib.sh
>  
> -test_expect_success 'git hook command does not crash' '
> -	git hook
> +test_expect_success 'git hook rejects commands without a mode' '
> +	test_must_fail git hook pre-commit
> +'
> +
> +
> +test_expect_success 'git hook rejects commands without a hookname' '
> +	test_must_fail git hook list
> +'
> +
> +test_expect_success 'setup hooks in global, and local' '
> +	git config --add --local hook.pre-commit.command "/path/ghi" &&

Can I make a plea for the use of test_config please. Writing tests which
rely on previous tests for their set-up creates a chain of hidden
dependencies that make it hard to add/alter tests later or run a subset
of the tests when developing a new patch. t3404-rebase-interactive.sh is
a prime example of this and I dread touching it.

> +	git config --add --global hook.pre-commit.command "/path/def"
> +'
> +
> +test_expect_success 'git hook list orders by config order' '
> +	cat >expected <<-\EOF &&
> +	global:	/path/def
> +	local:	/path/ghi
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'git hook list dereferences a hookcmd' '
> +	git config --add --local hook.pre-commit.command "abc" &&
> +	git config --add --global hookcmd.abc.command "/path/abc" &&
> +
> +	cat >expected <<-\EOF &&
> +	global:	/path/def
> +	local:	/path/ghi
> +	local:	/path/abc

We should make it clear in the documentation that the config origin
applies to the hook setting, even though we display the hookcmd command
which is set globally here for the last hook.

Best Wishes

Phillip

> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'git hook list reorders on duplicate commands' '
> +	git config --add --local hook.pre-commit.command "/path/def" &&
> +
> +	cat >expected <<-\EOF &&
> +	local:	/path/ghi
> +	local:	/path/abc
> +	local:	/path/def
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
>  '
>  
>  test_done
> 


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 3/4] hook: add list command
  2020-05-21 18:54 ` [PATCH v2 3/4] hook: add list command Emily Shaffer
  2020-05-22 10:27   ` Phillip Wood
@ 2020-05-24 23:00   ` Johannes Schindelin
  2020-05-27 23:37     ` Emily Shaffer
  1 sibling, 1 reply; 170+ messages in thread
From: Johannes Schindelin @ 2020-05-24 23:00 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi Emily,

On Thu, 21 May 2020, Emily Shaffer wrote:

> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index 34b0df5216..4e46d7dd4e 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -4,8 +4,55 @@ test_description='config-managed multihooks, including git-hook command'
>
>  . ./test-lib.sh
>
> -test_expect_success 'git hook command does not crash' '
> -	git hook
> +test_expect_success 'git hook rejects commands without a mode' '
> +	test_must_fail git hook pre-commit
> +'
> +
> +
> +test_expect_success 'git hook rejects commands without a hookname' '
> +	test_must_fail git hook list
> +'
> +
> +test_expect_success 'setup hooks in global, and local' '
> +	git config --add --local hook.pre-commit.command "/path/ghi" &&
> +	git config --add --global hook.pre-commit.command "/path/def"
> +'
> +
> +test_expect_success 'git hook list orders by config order' '
> +	cat >expected <<-\EOF &&
> +	global:	/path/def
> +	local:	/path/ghi
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual

This, as well as the next two test cases, won't work on Windows, as you
almost certainly realized from looking at the failed GitHub workflow run
of your branch.

The reason is that Unix-like absolute paths like `/path/def` do _not_ do
what you think on Windows: they are relative to the MSYS2 root (because
the shell script runs in an MSYS2 Bash). The Git executable, however, has
not the slightest idea about MSYS2 and does not handle those. To remedy
that, the MSYS2 Bash prefixes those paths with the absolute
_Windows-style_ path when passing them to `git.exe` (in your case,
actually in the `setup hooks` test case above).

So you will need to squash this (or an equivalent fix) into your patch:

-- snip --
From f2568d47509130a9c35590d907797d2eb813ac0d Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 25 May 2020 15:03:16 +0200
Subject: [PATCH] fixup??? hook: add list command

This is needed to make the tests pass on Windows, where Unix-like
absolute paths are not what you think they are.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1360-config-based-hooks.sh | 39 +++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 3296d8af4587..c862655fd4d9 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -18,10 +18,19 @@ test_expect_success 'setup hooks in global, and local' '
 	git config --add --global hook.pre-commit.command "/path/def"
 '

+ROOT=
+if test_have_prereq MINGW
+then
+	# In Git for Windows, Unix-like paths work only in shell scripts;
+	# `git.exe`, however, will prefix them with the pseudo root directory
+	# (of the Unix shell). Let's accommodate for that.
+	ROOT="$(cd / && pwd)"
+fi
+
 test_expect_success 'git hook list orders by config order' '
-	cat >expected <<-\EOF &&
-	global:	/path/def
-	local:	/path/ghi
+	cat >expected <<-EOF &&
+	global:	$ROOT/path/def
+	local:	$ROOT/path/ghi
 	EOF

 	git hook list pre-commit >actual &&
@@ -32,10 +41,10 @@ test_expect_success 'git hook list dereferences a hookcmd' '
 	git config --add --local hook.pre-commit.command "abc" &&
 	git config --add --global hookcmd.abc.command "/path/abc" &&

-	cat >expected <<-\EOF &&
-	global:	/path/def
-	local:	/path/ghi
-	local:	/path/abc
+	cat >expected <<-EOF &&
+	global:	$ROOT/path/def
+	local:	$ROOT/path/ghi
+	local:	$ROOT/path/abc
 	EOF

 	git hook list pre-commit >actual &&
@@ -45,10 +54,10 @@ test_expect_success 'git hook list dereferences a hookcmd' '
 test_expect_success 'git hook list reorders on duplicate commands' '
 	git config --add --local hook.pre-commit.command "/path/def" &&

-	cat >expected <<-\EOF &&
-	local:	/path/ghi
-	local:	/path/abc
-	local:	/path/def
+	cat >expected <<-EOF &&
+	local:	$ROOT/path/ghi
+	local:	$ROOT/path/abc
+	local:	$ROOT/path/def
 	EOF

 	git hook list pre-commit >actual &&
@@ -56,10 +65,10 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 '

 test_expect_success 'git hook list --porcelain prints just the command' '
-	cat >expected <<-\EOF &&
-	/path/ghi
-	/path/abc
-	/path/def
+	cat >expected <<-EOF &&
+	$ROOT/path/ghi
+	$ROOT/path/abc
+	$ROOT/path/def
 	EOF

 	git hook list --porcelain pre-commit >actual &&
--
2.27.0.rc1.windows.1

-- snap --

Ciao,
Dscho

> +'
> +
> +test_expect_success 'git hook list dereferences a hookcmd' '
> +	git config --add --local hook.pre-commit.command "abc" &&
> +	git config --add --global hookcmd.abc.command "/path/abc" &&
> +
> +	cat >expected <<-\EOF &&
> +	global:	/path/def
> +	local:	/path/ghi
> +	local:	/path/abc
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'git hook list reorders on duplicate commands' '
> +	git config --add --local hook.pre-commit.command "/path/def" &&
> +
> +	cat >expected <<-\EOF &&
> +	local:	/path/ghi
> +	local:	/path/abc
> +	local:	/path/def
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
>  '
>
>  test_done
> --
> 2.27.0.rc0.183.gde8f92d652-goog
>
>
>

^ permalink raw reply related	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 4/4] hook: add --porcelain to list command
  2020-05-21 18:54 ` [PATCH v2 4/4] hook: add --porcelain to " Emily Shaffer
@ 2020-05-24 23:00   ` Johannes Schindelin
  2020-05-25  0:29     ` Johannes Schindelin
  0 siblings, 1 reply; 170+ messages in thread
From: Johannes Schindelin @ 2020-05-24 23:00 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi Emily,

On Thu, 21 May 2020, Emily Shaffer wrote:

> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index 4e46d7dd4e..3296d8af45 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -55,4 +55,15 @@ test_expect_success 'git hook list reorders on duplicate commands' '
>  	test_cmp expected actual
>  '
>
> +test_expect_success 'git hook list --porcelain prints just the command' '
> +	cat >expected <<-\EOF &&
> +	/path/ghi
> +	/path/abc
> +	/path/def
> +	EOF
> +
> +	git hook list --porcelain pre-commit >actual &&
> +	test_cmp expected actual
> +'

As you surely found out from the GitHub workflow running in your fork,
this does not work on Windows. I need this (and strongly suggest you
squash that into your patch):

-- snipsnap --
From 97e3dfa6155785363c881ce2dcaf4f5ddead83ed Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Mon, 25 May 2020 15:04:24 +0200
Subject: [PATCH] fixup??? hook: add --porcelain to list command

This is required to let the test pass on Windows, where Git reports
Windows-style absolute paths and has no idea about the pseudo Unix
absolute paths that the Bash knows about.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t1360-config-based-hooks.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index c862655fd4d9..fce7335e97b9 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -65,10 +65,10 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 '

 test_expect_success 'git hook list --porcelain prints just the command' '
-	cat >expected <<-EOF &&
-	$ROOT/path/ghi
-	$ROOT/path/abc
-	$ROOT/path/def
+	cat >expected <<-\EOF &&
+	/path/ghi
+	/path/abc
+	/path/def
 	EOF

 	git hook list --porcelain pre-commit >actual &&
--
2.27.0.rc1.windows.1


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 4/4] hook: add --porcelain to list command
  2020-05-24 23:00   ` Johannes Schindelin
@ 2020-05-25  0:29     ` Johannes Schindelin
  0 siblings, 0 replies; 170+ messages in thread
From: Johannes Schindelin @ 2020-05-25  0:29 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi Emily,

On Mon, 25 May 2020, Johannes Schindelin wrote:

> Hi Emily,
>
> On Thu, 21 May 2020, Emily Shaffer wrote:
>
> > diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> > index 4e46d7dd4e..3296d8af45 100755
> > --- a/t/t1360-config-based-hooks.sh
> > +++ b/t/t1360-config-based-hooks.sh
> > @@ -55,4 +55,15 @@ test_expect_success 'git hook list reorders on duplicate commands' '
> >  	test_cmp expected actual
> >  '
> >
> > +test_expect_success 'git hook list --porcelain prints just the command' '
> > +	cat >expected <<-\EOF &&
> > +	/path/ghi
> > +	/path/abc
> > +	/path/def
> > +	EOF
> > +
> > +	git hook list --porcelain pre-commit >actual &&
> > +	test_cmp expected actual
> > +'
>
> As you surely found out from the GitHub workflow running in your fork,
> this does not work on Windows. I need this (and strongly suggest you
> squash that into your patch):
>
> -- snipsnap --
> From 97e3dfa6155785363c881ce2dcaf4f5ddead83ed Mon Sep 17 00:00:00 2001
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> Date: Mon, 25 May 2020 15:04:24 +0200
> Subject: [PATCH] fixup??? hook: add --porcelain to list command
>
> This is required to let the test pass on Windows, where Git reports
> Windows-style absolute paths and has no idea about the pseudo Unix
> absolute paths that the Bash knows about.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  t/t1360-config-based-hooks.sh | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index c862655fd4d9..fce7335e97b9 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -65,10 +65,10 @@ test_expect_success 'git hook list reorders on duplicate commands' '
>  '
>
>  test_expect_success 'git hook list --porcelain prints just the command' '
> -	cat >expected <<-EOF &&
> -	$ROOT/path/ghi
> -	$ROOT/path/abc
> -	$ROOT/path/def
> +	cat >expected <<-\EOF &&
> +	/path/ghi
> +	/path/abc
> +	/path/def

Due to an oversight on my part, this is actually the _reverse_ diff, and
the corresponding part in my mail answering your PATCH 3/4 should be
skipped from that fixup. Sorry for that.

Ciao,
Dscho

>  	EOF
>
>  	git hook list --porcelain pre-commit >actual &&
> --
> 2.27.0.rc1.windows.1
>
>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 3/4] hook: add list command
  2020-05-24 23:00   ` Johannes Schindelin
@ 2020-05-27 23:37     ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-05-27 23:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

On Mon, May 25, 2020 at 01:00:03AM +0200, Johannes Schindelin wrote:
> cc: git@vger.kernel.org
> 
> Hi Emily,
> 
> On Thu, 21 May 2020, Emily Shaffer wrote:
> 
> > diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> > index 34b0df5216..4e46d7dd4e 100755
> > --- a/t/t1360-config-based-hooks.sh
> > +++ b/t/t1360-config-based-hooks.sh
> > @@ -4,8 +4,55 @@ test_description='config-managed multihooks, including git-hook command'
> >
> >  . ./test-lib.sh
> >
> > -test_expect_success 'git hook command does not crash' '
> > -	git hook
> > +test_expect_success 'git hook rejects commands without a mode' '
> > +	test_must_fail git hook pre-commit
> > +'
> > +
> > +
> > +test_expect_success 'git hook rejects commands without a hookname' '
> > +	test_must_fail git hook list
> > +'
> > +
> > +test_expect_success 'setup hooks in global, and local' '
> > +	git config --add --local hook.pre-commit.command "/path/ghi" &&
> > +	git config --add --global hook.pre-commit.command "/path/def"
> > +'
> > +
> > +test_expect_success 'git hook list orders by config order' '
> > +	cat >expected <<-\EOF &&
> > +	global:	/path/def
> > +	local:	/path/ghi
> > +	EOF
> > +
> > +	git hook list pre-commit >actual &&
> > +	test_cmp expected actual
> 
> This, as well as the next two test cases, won't work on Windows, as you
> almost certainly realized from looking at the failed GitHub workflow run
> of your branch.

Thanks very much for sending this - to be honest, the failed workflow
run appeared to be because of the earlier SDK download issue, which I
have not rebased on top of a fix for yet, so I missed any actionable
failures when I ran the CI locally. I'll take it into account, much
appreciated.

 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 1/4] doc: propose hooks managed by the config
  2020-05-22 10:13   ` Phillip Wood
@ 2020-06-09 20:26     ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-06-09 20:26 UTC (permalink / raw)
  To: Phillip Wood; +Cc: git

On Fri, May 22, 2020 at 11:13:07AM +0100, Phillip Wood wrote:
> 
> Hi Emily
> 
> Thanks for working on this
> 
> On 21/05/2020 19:54, Emily Shaffer wrote:
> > Begin a design document for config-based hooks, managed via git-hook.
> > Focus on an overview of the implementation and motivation for design
> > decisions. Briefly discuss the alternatives considered before this
> > point. Also, attempt to redefine terms to fit into a multihook world.
> > 
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  Documentation/Makefile                        |   1 +
> >  .../technical/config-based-hooks.txt          | 320 ++++++++++++++++++
> >  2 files changed, 321 insertions(+)
> >  create mode 100644 Documentation/technical/config-based-hooks.txt
> > 
> > diff --git a/Documentation/Makefile b/Documentation/Makefile
> > index 15d9d04f31..5b21f31d31 100644
> > --- a/Documentation/Makefile
> > +++ b/Documentation/Makefile
> > @@ -80,6 +80,7 @@ SP_ARTICLES += $(API_DOCS)
> >  TECH_DOCS += MyFirstContribution
> >  TECH_DOCS += MyFirstObjectWalk
> >  TECH_DOCS += SubmittingPatches
> > +TECH_DOCS += technical/config-based-hooks
> >  TECH_DOCS += technical/hash-function-transition
> >  TECH_DOCS += technical/http-protocol
> >  TECH_DOCS += technical/index-format
> > diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
> > new file mode 100644
> > index 0000000000..59cdc25a47
> > --- /dev/null
> > +++ b/Documentation/technical/config-based-hooks.txt
> > @@ -0,0 +1,320 @@
> > +Configuration-based hook management
> > +===================================
> > +
> > +== Motivation
> > +
> > +Treat hooks as a first-class citizen by replacing the .git/hook/hookname path as
> > +the only source of hooks to execute, in a way which is friendly to users with
> > +multiple repos which have similar needs.
> > +
> > +Redefine "hook" as an event rather than a single script, allowing users to
> > +perform unrelated actions on a single event.
> > +
> > +Take a step closer to safety when copying zipped Git repositories from untrusted
> > +users.
> 
> Having read through this (admittedly fairly quickly) I'm not sure what
> that step is

Ok, I'll try to clarify it a little here.

> 
> > +
> > +Make it easier for users to discover Git's hook feature and automate their
> > +workflows.
> > +
> > +== User interfaces
> > +
> > +=== Config schema
> > +
> > +Hooks can be introduced by editing the configuration manually. There are two new
> > +sections added, `hook` and `hookcmd`.
> > +
> > +==== `hook`
> > +
> > +Primarily contains subsections for each hook event. These subsections define
> > +hook command execution order;
> 
> May be "The order of these subsections define the hook command execution
> order" ?

Nice. Took it verbatim.

> 
> > hook commands can be specified by passing the
> > +command directly if no additional configuration is needed, or by passing the
> > +name of a `hookcmd`.
> 
> I know what you mean by "passing" but as this section is talking about
> config settings perhaps it should refer to the keys and values.

Sure.

> 
> > If Git does not find a `hookcmd` whose subsection matches
> > +the value of the given command string, Git will try to execute the string
> > +directly. Hooks are executed by passing the resolved command string to the
> > +shell.
> 
> Do we really need to invoke the shell just to split a command-line and
> look up the command in $PATH? If we used split_commandline() in alias.c
> then we could avoid invoking this extra process for each hook command.

I'll want to experiment a little bit with this and figure out what works
best - you may be right, and I could also be wrong about platform
compatibility doing it the way I described. I haven't written this bit
yet - so I'd like to update this section of the design doc when I get to
the implementation, so that it matches.

> 
> > Hook event subsections can also contain per-hook-event settings.
> > +
> > +Also contains top-level hook execution settings, for example,
> > +`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`.
> 
> (see sections ...) ? for the forward references to these settings?

Sure. I think the best way to do this is if I use anchors for all the
sections; this works without me specifying it in Asciidoctor but needs
to be explicitly specified in Asciidoc. So I'll make sure to include
that with the next iteration.

> 
> > +
> > +----
> > +[hook "pre-commit"]
> > +  command = perl-linter
> > +  command = /usr/bin/git-secrets --pre-commit
> > +
> > +[hook "pre-applypatch"]
> > +  command = perl-linter
> > +  error = ignore
> > +
> > +[hook]
> > +  runHookDir = interactive
> > +----
> > +
> > +==== `hookcmd`
> > +
> > +Defines a hook command and its attributes, which will be used when a hook event
> > +occurs. Unqualified attributes are assumed to apply to this hook during all hook
> > +events, but event-specific attributes can also be supplied. The example runs
> > +`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
> > +include this config, the hook command will be skipped for all events to which
> > +it's normally subscribed _except_ `pre-commit`.
> > +
> > +----
> > +[hookcmd "perl-linter"]
> > +  command = /usr/bin/lint-it --language=perl
> > +  skip = true
> > +  pre-commit-skip = false
> > +----
> > +
> > +=== Command-line API
> > +
> > +Users should be able to view, reorder, and create hook commands via the command
> > +line. External tools should be able to view a list of hooks in the correct order
> > +to run.
> > +
> > +*`git hook list <hook-event>`*
> > +
> > +*`git hook list (--system|--global|--local|--worktree)`*
> > +
> > +*`git hook edit <hook-event>`*
> > +
> > +*`git hook add <hook-command> <hook-event> <options...>`*
> > +
> > +=== Hook editor
> > +
> > +The tool which is presented by `git hook edit <hook-command>`. Ideally, this
> > +tool should be easier to use than manually editing the config, and then produce
> > +a concise config afterwards. It may take a form similar to `git rebase
> > +--interactive`.
> 
> rebase -i is not necessarily an exemplar of user interface design, what
> sort of thing do you have in mind?

Thanks for patience on this - I didn't really have a clear idea before
when I wrote the doc because I don't have much expertise in user
interfaces. However, since then I worked with some UX experts here, so
I'll make a better writeup in the next iteration - I've got a much
clearer idea of how that should look, now.

> 
> > +
> > +== Implementation
> > +
> > +=== Library
> > +
> > +`hook.c` and `hook.h` are responsible for interacting with the config files. In
> > +the case when the code generating a hook event doesn't have special concerns
> > +about how to run the hooks, the hook library will provide a basic API to call
> > +all hooks in config order with an `argv_array` provided by the code which
> > +generates the hook event:
> > +
> > +*`int run_hooks(const char *hookname, struct argv_array *args)`*
> > +
> > +This call includes the hook command provided by `run-command.h:find_hook()`;
> > +eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
> > +config is checked against a number of cases:
> > +
> > +- "no": the legacy hook will not be run
> > +- "interactive": Git will prompt the user before running the legacy hook
> > +- "warn": Git will print a warning to stderr before running the legacy hook
> > +- "yes" (default): Git will silently run the legacy hook
> > +
> > +In case this list is expanded in the future, if a value for `hook.runHookDir` is
> > +given which Git does not recognize, Git should discard that config entry. For
> > +example, if "warn" was specified at system level and "junk" was specified at
> > +global level, Git would resolve the value to "warn"; if the only time the config
> > +was set was to "junk", Git would use the default value of "yes".
> > +
> > +If the caller wants to do something more complicated, the hook library can also
> > +provide a callback API:
> > +
> > +*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
> > +
> > +Finally, to facilitate the builtin, the library will also provide the following
> > +APIs to interact with the config:
> > +
> > +----
> > +int set_hook_commands(const char *hookname, struct string_list *commands,
> > +	enum config_scope scope);
> > +int set_hookcmd(const char *hookcmd, struct hookcmd options);
> > +
> > +int list_hook_commands(const char *hookname, struct string_list *commands);
> > +int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
> > +----
> > +
> > +`struct hookcmd` is expected to grow in size over time as more functionality is
> > +added to hooks; so that other parts of the code don't need to understand the
> > +config schema, `struct hookcmd` should contain logical values instead of string
> > +pairs.
> > +
> > +----
> > +struct hookcmd {
> > +  const char *name;
> > +  const char *command;
> > +
> > +  /* for illustration only; not planned at present */
> > +  int parallelizable;
> > +  const char *hookcmd_before;
> > +  const char *hookcmd_after;
> > +  enum recovery_action on_fail;
> > +}
> > +----
> > +
> > +=== Builtin
> > +
> > +`builtin/hook.c` is responsible for providing the frontend. It's responsible for
> > +formatting user-provided data and then calling the library API to set the
> > +configs as appropriate. The builtin frontend is not responsible for calling the
> > +config directly, so that other areas of Git can rely on the hook library to
> > +understand the most recent config schema for hooks.
> > +
> > +=== Migration path
> > +
> > +==== Stage 0
> > +
> > +Hooks are called by running `run-command.h:find_hook()` with the hookname and
> > +executing the result. The hook library and builtin do not exist. Hooks only
> > +exist as specially named scripts within `.git/hooks/`.
> > +
> > +==== Stage 1
> > +
> > +`git hook list --porcelain <hook-event>` is implemented. Users can replace their
> > +`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
> > +output. Modifier commands like `git hook add` and `git hook edit` can be
> > +implemented around this time as well.
> > +
> > +==== Stage 2
> > +
> > +`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
> > +end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
> > +opt-in to config-based hooks simply by creating some in their config; otherwise
> > +users should remain unaffected by the change.
> > +
> > +==== Stage 3
> > +
> > +The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
> > +`hook.runHookDir`. Users can opt into managing their hooks completely via the
> > +config this way.
> > +
> > +==== Stage 4
> > +
> > +`.git/hooks` is removed from the template and the hook directory is considered
> > +deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
> > +not changed, and `find_hook()` is not removed.
> > +
> > +== Caveats
> > +
> > +=== Security and repo config
> > +
> > +Part of the motivation behind this refactor is to mitigate hooks as an attack
> > +vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
> > +however, as the design stands, users can still provide hooks in the repo-level
> > +config, which is included when a repo is zipped and sent elsewhere.  The
> > +security of the repo-level config is still under discussion; this design
> > +generally assumes the repo-level config is secure, which is not true yet. The
> > +goal is to avoid an overcomplicated design to work around a problem which has
> > +ceased to exist.
> > +
> > +=== Ease of use
> > +
> > +The config schema is nontrivial; that's why it's important for the `git hook`
> > +modifier commands to be usable.
> 
> That's an important point
> 
> > Contributors with UX expertise are encouraged to
> > +share their suggestions.
> > +
> > +== Alternative approaches
> > +
> > +A previous summary of alternatives exists in the
> > +archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
> > +
> > +=== Status quo
> > +
> > +Today users can implement multihooks themselves by using a "trampoline script"
> > +as their hook, and pointing that script to a directory or list of other scripts
> > +they wish to run.
> > +
> > +=== Hook directories
> > +
> > +Other contributors have suggested Git learn about the existence of a directory
> > +such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
> > +
> > +=== Comparison table
> > +
> > +.Comparison of alternatives
> > +|===
> > +|Feature |Config-based hooks |Hook directories |Status quo
> > +
> > +|Supports multiple hooks
> > +|Natively
> > +|Natively
> > +|With user effort
> > +
> > +|Safer for zipped repos
> > +|A little
> > +|No
> > +|No
> > +
> > +|Previous hooks just work
> > +|If configured
> > +|Yes
> > +|Yes
> > +
> > +|Can install one hook to many repos
> > +|Yes
> > +|No
> > +|No
> > +
> > +|Discoverability
> > +|Better (in `git help git`)
> > +|Same as before
> > +|Same as before
> > +
> > +|Hard to run unexpected hook
> > +|If configured
> > +|No
> > +|No
> > +|===
> > +
> > +== Future work
> > +
> > +=== Execution ordering
> > +
> > +We may find that config order is insufficient for some users; for example,
> > +config order makes it difficult to add a new hook to the system or global config
> > +which runs at the end of the hook list. A new ordering schema should be:
> > +
> > +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> > +their order change;
> > +
> > +2) Either dependency or numerically based.
> > +
> > +Dependency-based ordering is prone to classic linked-list problems, like a
> > +cycles and handling of missing dependencies. But, it paves the way for enabling
> > +parallelization if some tasks truly depend on others.
> > +
> > +Numerical ordering makes it tricky for Git to generate suggested ordering
> > +numbers for each command, but is easy to determine a definitive order.
> > +
> > +=== Parallelization
> > +
> > +Users with many hooks might want to run them simultaneously, if the hooks don't
> > +modify state; if one hook depends on another's output, then users will want to
> > +specify those dependencies. If we decide to solve this problem, we may want to
> > +look to modern build systems for inspiration on how to manage dependencies and
> > +parallel tasks.
> > +
> > +=== Securing hookdir hooks
> > +
> > +With the design as written in this doc, it's still possible for a malicious user
> > +to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
> > +zip their repo and send it to another user. It may be necessary to teach Git to
> > +only allow one-line hooks like this if they were configured outside of the local
> > +scope;
> 
> Does "disabling one-line hooks" mean "disable passing command line
> arguments to the hook"? I'm not sure that gains much security - can't I
> just set 'hook.pre-receive.command = ./delete-everything' and include
> delete-everything in my malicious repo?

No, I meant something more along the lines of:

- hookcmds cannot be specified at the repo/worktree level
- hook.pre-receive.command's value *must* be a hookcmd name

I'll try to make that more clear next round.

Thanks for reading.
 - Emily

> > or another approach, like a list of safe projects, might be useful. It
> > +may also be sufficient (or at least useful) to teach a `hook.disableAll` config
> > +or similar flag to the Git executable.
> > +
> > +=== Submodule inheritance
> > +
> > +It's possible some submodules may want to run the identical set of hooks that
> > +their superrepo runs. While a globally-configured hook set is helpful, it's not
> > +a great solution for users who have multiple repos-with-submodules under the
> > +same user. It would be useful for submodules to learn how to run hooks from
> > +their superrepo's config, or inherit that hook setting.
> > +
> > +== Glossary
> > +
> > +*hook event*
> > +
> > +A point during Git's execution where user scripts may be run, for example,
> > +_prepare-commit-msg_ or _pre-push_.
> > +
> > +*hook command*
> > +
> > +A user script or executable which will be run on one or more hook events.
> > 
> 

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 3/4] hook: add list command
  2020-05-22 10:27   ` Phillip Wood
@ 2020-06-09 21:49     ` Emily Shaffer
  2020-08-17 13:36       ` Phillip Wood
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-06-09 21:49 UTC (permalink / raw)
  To: Phillip Wood; +Cc: git

On Fri, May 22, 2020 at 11:27:44AM +0100, Phillip Wood wrote:
> 
> Hi Emily
> 
> On 21/05/2020 19:54, Emily Shaffer wrote:
> > Teach 'git hook list <hookname>', which checks the known configs in
> > order to create an ordered list of hooks to run on a given hook event.
> > 
> > Multiple commands can be specified for a given hook by providing
> > multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
> > run in config order. If more properties need to be set on a given hook
> > in the future, commands can also be specified by providing
> > "hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
> > <hookcmd-name>]" subsection; at minimum, this subsection must contain a
> > "hookcmd.<hookcmd-name>.command = <path-to-hook>" line.
> > 
> > For example:
> > 
> >   $ git config --list | grep ^hook
> >   hook.pre-commit.command=baz
> >   hook.pre-commit.command=~/bar.sh
> >   hookcmd.baz.command=~/baz/from/hookcmd.sh
> > 
> >   $ git hook list pre-commit
> >   ~/baz/from/hookcmd.sh
> >   ~/bar.sh
> > 
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  Documentation/git-hook.txt    | 37 +++++++++++++-
> >  Makefile                      |  1 +
> >  builtin/hook.c                | 55 +++++++++++++++++++--
> >  hook.c                        | 90 +++++++++++++++++++++++++++++++++++
> >  hook.h                        | 15 ++++++
> >  t/t1360-config-based-hooks.sh | 51 +++++++++++++++++++-
> >  6 files changed, 242 insertions(+), 7 deletions(-)
> >  create mode 100644 hook.c
> >  create mode 100644 hook.h
> > 
> > diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
> > index 2d50c414cc..e458586e96 100644
> > --- a/Documentation/git-hook.txt
> > +++ b/Documentation/git-hook.txt
> > @@ -8,12 +8,47 @@ git-hook - Manage configured hooks
> >  SYNOPSIS
> >  --------
> >  [verse]
> > -'git hook'
> > +'git hook' list <hook-name>
> >  
> >  DESCRIPTION
> >  -----------
> >  You can list, add, and modify hooks with this command.
> >  
> > +This command parses the default configuration files for sections "hook" and
> > +"hookcmd". "hook" is used to describe the commands which will be run during a
> > +particular hook event; commands are run in config order. "hookcmd" is used to
> > +describe attributes of a specific command. If additional attributes don't need
> > +to be specified, a command to run can be specified directly in the "hook"
> > +section; if a "hookcmd" by that name isn't found, Git will attempt to run the
> > +provided value directly. For example:
> > +
> > +Global config
> > +----
> > +  [hook "post-commit"]
> > +    command = "linter"
> > +    command = "~/typocheck.sh"
> > +
> > +  [hookcmd "linter"]
> > +    command = "/bin/linter --c"
> > +----
> > +
> > +Local config
> > +----
> > +  [hook "prepare-commit-msg"]
> > +    command = "linter"
> > +  [hook "post-commit"]
> > +    command = "python ~/run-test-suite.py"
> > +----
> > +
> > +COMMANDS
> > +--------
> > +
> > +list <hook-name>::
> > +
> > +List the hooks which have been configured for <hook-name>. Hooks appear
> > +in the order they should be run, and note the config scope where the relevant
> > +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
> > +
> >  GIT
> >  ---
> >  Part of the linkgit:git[1] suite
> > diff --git a/Makefile b/Makefile
> > index fce6ee154e..b7bbf3be7b 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -894,6 +894,7 @@ LIB_OBJS += grep.o
> >  LIB_OBJS += hashmap.o
> >  LIB_OBJS += help.o
> >  LIB_OBJS += hex.o
> > +LIB_OBJS += hook.o
> >  LIB_OBJS += ident.o
> >  LIB_OBJS += interdiff.o
> >  LIB_OBJS += json-writer.o
> > diff --git a/builtin/hook.c b/builtin/hook.c
> > index b2bbc84d4d..cfd8e388bd 100644
> > --- a/builtin/hook.c
> > +++ b/builtin/hook.c
> > @@ -1,21 +1,68 @@
> >  #include "cache.h"
> >  
> >  #include "builtin.h"
> > +#include "config.h"
> > +#include "hook.h"
> >  #include "parse-options.h"
> > +#include "strbuf.h"
> >  
> >  static const char * const builtin_hook_usage[] = {
> > -	N_("git hook"),
> > +	N_("git hook list <hookname>"),
> >  	NULL
> >  };
> >  
> > -int cmd_hook(int argc, const char **argv, const char *prefix)
> > +static int list(int argc, const char **argv, const char *prefix)
> >  {
> > -	struct option builtin_hook_options[] = {
> > +	struct list_head *head, *pos;
> > +	struct hook *item;
> > +	struct strbuf hookname = STRBUF_INIT;
> > +
> > +	struct option list_options[] = {
> >  		OPT_END(),
> >  	};
> >  
> > -	argc = parse_options(argc, argv, prefix, builtin_hook_options,
> > +	argc = parse_options(argc, argv, prefix, list_options,
> >  			     builtin_hook_usage, 0);
> >  
> > +	if (argc < 1) {
> > +		usage_msg_opt("a hookname must be provided to operate on.",
> > +			      builtin_hook_usage, list_options);
> > +	}
> > +
> > +	strbuf_addstr(&hookname, argv[0]);
> > +
> > +	head = hook_list(&hookname);
> > +
> > +	if (!head) {
> > +		printf(_("no commands configured for hook '%s'\n"),
> > +		       hookname.buf);
> > +		return 0;
> > +	}
> > +
> > +	list_for_each(pos, head) {
> > +		item = list_entry(pos, struct hook, list);
> > +		if (item)
> > +			printf("%s:\t%s\n",
> > +			       config_scope_name(item->origin),
> > +			       item->command.buf);
> > +	}
> > +
> > +	clear_hook_list();
> > +	strbuf_release(&hookname);
> > +
> >  	return 0;
> >  }
> > +
> > +int cmd_hook(int argc, const char **argv, const char *prefix)
> > +{
> > +	struct option builtin_hook_options[] = {
> > +		OPT_END(),
> > +	};
> > +	if (argc < 2)
> > +		usage_with_options(builtin_hook_usage, builtin_hook_options);
> > +
> > +	if (!strcmp(argv[1], "list"))
> > +		return list(argc - 1, argv + 1, prefix);
> > +
> > +	usage_with_options(builtin_hook_usage, builtin_hook_options);
> > +}
> > diff --git a/hook.c b/hook.c
> > new file mode 100644
> > index 0000000000..9dfc1a885e
> > --- /dev/null
> > +++ b/hook.c
> > @@ -0,0 +1,90 @@
> > +#include "cache.h"
> > +
> > +#include "hook.h"
> > +#include "config.h"
> > +
> > +static LIST_HEAD(hook_head);
> > +
> > +void free_hook(struct hook *ptr)
> > +{
> > +	if (ptr) {
> > +		strbuf_release(&ptr->command);
> > +		free(ptr);
> > +	}
> > +}
> > +
> > +static void emplace_hook(struct list_head *pos, const char *command)
> > +{
> > +	struct hook *to_add = malloc(sizeof(struct hook));
> > +	to_add->origin = current_config_scope();
> > +	strbuf_init(&to_add->command, 0);
> > +	strbuf_addstr(&to_add->command, command);
> > +
> > +	list_add_tail(&to_add->list, pos);
> > +}
> > +
> > +static void remove_hook(struct list_head *to_remove)
> > +{
> > +	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
> > +	list_del(to_remove);
> > +	free_hook(hook_to_remove);
> > +}
> > +
> > +void clear_hook_list(void)
> > +{
> > +	struct list_head *pos, *tmp;
> > +	list_for_each_safe(pos, tmp, &hook_head)
> > +		remove_hook(pos);
> > +}
> > +
> > +static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
> > +{
> > +	const char *hook_key = hook_key_cb;
> > +
> > +	if (!strcmp(key, hook_key)) {
> > +		const char *command = value;
> > +		struct strbuf hookcmd_name = STRBUF_INIT;
> > +		struct list_head *pos = NULL, *tmp = NULL;
> > +
> > +		/* Check if a hookcmd with that name exists. */
> > +		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
> > +		git_config_get_value(hookcmd_name.buf, &command);
> 
> This looks dodgy to me. This code is called by git_config() as it parses
> the config files, so it has not had a chance to fully populate the
> config cache used by git_config_get_value(). I think the test below
> passes because the hookcmd setting is set in the global file and the
> hook setting is set in the local file so when we have already parsed the
> hookcmd setting when we come to look it up. The same comment applies to
> the hypothetical ordering config mentioned below. I think it would be
> better to collect the list of hook.<event>.command settings in this
> callback and then look up any hookcmd settings for those hook commands
> after we've finished reading all of the config files.

git_config_get_value() calls repo_read_config(the_repository) if the
config hasn't been fully parsed yet, so I think what you're worrying
about is not an issue. It's ugly, I agree, but since the new hotness
(git_config_get_value() and friends) doesn't offer the same
functionality as the old solution (config origin) this seemed like an
okay approach. As I understand it, moving this hookcmd lookup section
outside of the config callback will save us up to one additional pass
through the configs, at the expense of a more convoluted code path.

> 
> > +
> > +		if (!command)
> > +			BUG("git_config_get_value overwrote a string it shouldn't have");
> > +
> > +		/*
> > +		 * TODO: implement an option-getting callback, e.g.
> > +		 *   get configs by pattern hookcmd.$value.*
> > +		 *   for each key+value, do_callback(key, value, cb_data)
> > +		 */
> > +
> > +		list_for_each_safe(pos, tmp, &hook_head) {
> > +			struct hook *hook = list_entry(pos, struct hook, list);
> > +			/*
> > +			 * The list of hooks to run can be reordered by being redeclared
> > +			 * in the config. Options about hook ordering should be checked
> > +			 * here.
> > +			 */
> > +			if (0 == strcmp(hook->command.buf, command))
> > +				remove_hook(pos);
> > +		}
> > +		emplace_hook(pos, command);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +struct list_head* hook_list(const struct strbuf* hookname)
> > +{
> > +	struct strbuf hook_key = STRBUF_INIT;
> > +
> > +	if (!hookname)
> > +		return NULL;
> > +
> > +	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
> > +
> > +	git_config(hook_config_lookup, (void*)hook_key.buf);
> > +
> > +	return &hook_head;
> > +}
> > diff --git a/hook.h b/hook.h
> > new file mode 100644
> > index 0000000000..aaf6511cff
> > --- /dev/null
> > +++ b/hook.h
> > @@ -0,0 +1,15 @@
> > +#include "config.h"
> > +#include "list.h"
> > +#include "strbuf.h"
> > +
> > +struct hook
> > +{
> > +	struct list_head list;
> > +	enum config_scope origin;
> > +	struct strbuf command;
> > +};
> > +
> > +struct list_head* hook_list(const struct strbuf *hookname);
> > +
> > +void free_hook(struct hook *ptr);
> > +void clear_hook_list(void);
> > diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> > index 34b0df5216..4e46d7dd4e 100755
> > --- a/t/t1360-config-based-hooks.sh
> > +++ b/t/t1360-config-based-hooks.sh
> > @@ -4,8 +4,55 @@ test_description='config-managed multihooks, including git-hook command'
> >  
> >  . ./test-lib.sh
> >  
> > -test_expect_success 'git hook command does not crash' '
> > -	git hook
> > +test_expect_success 'git hook rejects commands without a mode' '
> > +	test_must_fail git hook pre-commit
> > +'
> > +
> > +
> > +test_expect_success 'git hook rejects commands without a hookname' '
> > +	test_must_fail git hook list
> > +'
> > +
> > +test_expect_success 'setup hooks in global, and local' '
> > +	git config --add --local hook.pre-commit.command "/path/ghi" &&
> 
> Can I make a plea for the use of test_config please. Writing tests which
> rely on previous tests for their set-up creates a chain of hidden
> dependencies that make it hard to add/alter tests later or run a subset
> of the tests when developing a new patch. t3404-rebase-interactive.sh is
> a prime example of this and I dread touching it.

Sure. I'll redo them.

> 
> > +	git config --add --global hook.pre-commit.command "/path/def"
> > +'
> > +
> > +test_expect_success 'git hook list orders by config order' '
> > +	cat >expected <<-\EOF &&
> > +	global:	/path/def
> > +	local:	/path/ghi
> > +	EOF
> > +
> > +	git hook list pre-commit >actual &&
> > +	test_cmp expected actual
> > +'
> > +
> > +test_expect_success 'git hook list dereferences a hookcmd' '
> > +	git config --add --local hook.pre-commit.command "abc" &&
> > +	git config --add --global hookcmd.abc.command "/path/abc" &&
> > +
> > +	cat >expected <<-\EOF &&
> > +	global:	/path/def
> > +	local:	/path/ghi
> > +	local:	/path/abc
> 
> We should make it clear in the documentation that the config origin
> applies to the hook setting, even though we display the hookcmd command
> which is set globally here for the last hook.

One of the suggestions from our UX team last week was to make this list
output clearer to indicate the origin of the command plus the origin of
the hookcmd object; I'll try to straighten this out and make sure the
doc agrees.

 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH v3 0/6] propose config-based hooks
  2020-05-21 18:54 [PATCH v2 0/4] propose config-based hooks Emily Shaffer
                   ` (3 preceding siblings ...)
  2020-05-21 18:54 ` [PATCH v2 4/4] hook: add --porcelain to " Emily Shaffer
@ 2020-07-28 22:24 ` Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 1/6] doc: propose hooks managed by the config Emily Shaffer
                     ` (6 more replies)
  4 siblings, 7 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Hi all,

After taking a few weeks to work on other items, I've got another update
to the config-based hook series. Patches 5 and 6 are RFC - a sketch of
how the hook library could run the appropriate set of hooks. There's
more work to do, which I'll outline later in the cover letter.

Since last time, I took into account review comments, including Dscho's
fixups to make the tests work in Windows. It seems those tests are
passing now, according to the GH Actions run:
https://github.com/nasamuffin/git/actions/runs/186242637

One thing I didn't decide on was the benefit of moving the hookcmd
resolution outside of the hook config pass; that code is unchanged. I
still haven't decided quite which approach I like better, but it's still
on my mind.

In the 'run_hook()' implementation I flipped the 'use_shell' bit, which
by my understanding only uses a shell if it can't find the command in
PATH; this seems like a reasonable approach especially because the code
is so brief, but I'm interested in hearing why I'm wrong or it won't
work well :)

There is still some work I've got locally which isn't quite ready:
 - support for hook.runHookDir. This is turning into a yak shave about
   who decides where and when to display or run the hookdir hook. I
   think I've got it mostly figured out and there's a patch locally, but
   it's not polished.
 - Drafts for 'git hook add' and 'git hook edit'. These features are
   probably the most complicated part of the series, but it's possible
   to use config-based hooks without them. In the interest of getting
   something out for people to try on their own, I'll probably leave
   these for later.
 - Support for stdin redirection to hooks. Since this means we want to
   point the same stdin to multiple processes, I'm thinking it will be
   slightly complicated. Maybe someone has a hint for me? :) Without
   having looked at what's available or not yet, I'm planning to do this
   by reading the whole stdin to memory and then streaming it to each
   process in turn, as I can't seek back to the beginning of the stream
   when I start each new process.
 - Conversion of codebase to use the hook library instead. Partly, this
   is gated on the previous point - there are plenty of callers who,
   instead of using run-command's run_hook_*(), just use find_hook() and
   roll their own struct child_process so they can use stdin/stdout. I
   do plan to consider the hook lib's run_hooks() implementation as
   non-final until I start this process - I'm expecting to learn more
   about what I do and don't have to support when I do this.

Thanks, all. Hopefully I can do better than a 2-month wait for the
series after this one... although I imagine I cursed myself by saying
that. :)

 - Emily


Emily Shaffer (6):
  doc: propose hooks managed by the config
  hook: scaffolding for git-hook subcommand
  hook: add list command
  hook: add --porcelain to list command
  parse-options: parse into argv_array
  hook: add 'run' subcommand

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/git-hook.txt                    |  63 ++++
 Documentation/technical/api-parse-options.txt |   5 +
 .../technical/config-based-hooks.txt          | 354 ++++++++++++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/hook.c                                | 107 ++++++
 git.c                                         |   1 +
 hook.c                                        | 132 +++++++
 hook.h                                        |  18 +
 parse-options-cb.c                            |  16 +
 parse-options.h                               |   4 +
 t/t1360-config-based-hooks.sh                 | 115 ++++++
 14 files changed, 820 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 Documentation/technical/config-based-hooks.txt
 create mode 100644 builtin/hook.c
 create mode 100644 hook.c
 create mode 100644 hook.h
 create mode 100755 t/t1360-config-based-hooks.sh

-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH v3 1/6] doc: propose hooks managed by the config
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
@ 2020-07-28 22:24   ` Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 2/6] hook: scaffolding for git-hook subcommand Emily Shaffer
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/Makefile                        |   1 +
 .../technical/config-based-hooks.txt          | 354 ++++++++++++++++++
 2 files changed, 355 insertions(+)
 create mode 100644 Documentation/technical/config-based-hooks.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index ecd0b340b1..5483995113 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -80,6 +80,7 @@ SP_ARTICLES += $(API_DOCS)
 TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
+TECH_DOCS += technical/config-based-hooks
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
new file mode 100644
index 0000000000..c6e762b192
--- /dev/null
+++ b/Documentation/technical/config-based-hooks.txt
@@ -0,0 +1,354 @@
+Configuration-based hook management
+===================================
+:sectanchors:
+
+[[motivation]]
+== Motivation
+
+Treat hooks as a first-class citizen by replacing the .git/hook/hookname path as
+the only source of hooks to execute, in a way which is friendly to users with
+multiple repos which have similar needs.
+
+Redefine "hook" as an event rather than a single script, allowing users to
+perform unrelated actions on a single event.
+
+Take a step closer to safety when copying zipped Git repositories from untrusted
+users by making it more apparent to users which scripts will be run during
+normal Git operations.
+
+Make it easier for users to discover Git's hook feature and automate their
+workflows.
+
+[[user-interfaces]]
+== User interfaces
+
+[[config-schema]]
+=== Config schema
+
+Hooks can be introduced by editing the configuration manually. There are two new
+sections added, `hook` and `hookcmd`.
+
+[[config-schema-hook]]
+==== `hook`
+
+Primarily contains subsections for each hook event. These order of these
+subsections defines the hook command execution order; hook commands can be
+specified by setting the value directly to the command if no additional
+configuration is needed, or by setting the value as the name of a `hookcmd`. If
+Git does not find a `hookcmd` whose subsection matches the value of the given
+command string, Git will try to execute the string directly. Hooks are executed
+by passing the resolved command string to the shell. Hook event subsections can
+also contain per-hook-event settings.
+
+Also contains top-level hook execution settings, for example,
+`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`. (These settings are
+described more in <<library,Library>>.)
+
+----
+[hook "pre-commit"]
+  command = perl-linter
+  command = /usr/bin/git-secrets --pre-commit
+
+[hook "pre-applypatch"]
+  command = perl-linter
+  error = ignore
+
+[hook]
+  runHookDir = interactive
+----
+
+[[config-schema-hookcmd]]
+==== `hookcmd`
+
+Defines a hook command and its attributes, which will be used when a hook event
+occurs. Unqualified attributes are assumed to apply to this hook during all hook
+events, but event-specific attributes can also be supplied. The example runs
+`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
+include this config, the hook command will be skipped for all events to which
+it's normally subscribed _except_ `pre-commit`.
+
+----
+[hookcmd "perl-linter"]
+  command = /usr/bin/lint-it --language=perl
+  skip = true
+  pre-commit-skip = false
+----
+
+[[command-line-api]]
+=== Command-line API
+
+Users should be able to view, reorder, and create hook commands via the command
+line. External tools should be able to view a list of hooks in the correct order
+to run.
+
+*`git hook list <hook-event>`*
+
+*`git hook list (--system|--global|--local|--worktree)`*
+
+*`git hook edit <hook-event>`*
+
+*`git hook add <hook-command> <hook-event> <options...>`*
+
+[[hook-editor]]
+=== Hook editor
+
+The tool which is presented by `git hook edit <hook-command>`. Ideally, this
+tool should be easier to use than manually editing the config, and then produce
+a concise config afterwards. It may take a form similar to `git rebase
+--interactive`.
+
+[[implementation]]
+== Implementation
+
+[[library]]
+=== Library
+
+`hook.c` and `hook.h` are responsible for interacting with the config files. In
+the case when the code generating a hook event doesn't have special concerns
+about how to run the hooks, the hook library will provide a basic API to call
+all hooks in config order with an `argv_array` provided by the code which
+generates the hook event:
+
+*`int run_hooks(const char *hookname, struct argv_array *args)`*
+
+This call includes the hook command provided by `run-command.h:find_hook()`;
+eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
+config is checked against a number of cases:
+
+- "no": the legacy hook will not be run
+- "interactive": Git will prompt the user before running the legacy hook
+- "warn": Git will print a warning to stderr before running the legacy hook
+- "yes" (default): Git will silently run the legacy hook
+
+In case this list is expanded in the future, if a value for `hook.runHookDir` is
+given which Git does not recognize, Git should discard that config entry. For
+example, if "warn" was specified at system level and "junk" was specified at
+global level, Git would resolve the value to "warn"; if the only time the config
+was set was to "junk", Git would use the default value of "yes".
+
+If the caller wants to do something more complicated, the hook library can also
+provide a callback API:
+
+*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
+
+Finally, to facilitate the builtin, the library will also provide the following
+APIs to interact with the config:
+
+----
+int set_hook_commands(const char *hookname, struct string_list *commands,
+	enum config_scope scope);
+int set_hookcmd(const char *hookcmd, struct hookcmd options);
+
+int list_hook_commands(const char *hookname, struct string_list *commands);
+int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
+----
+
+`struct hookcmd` is expected to grow in size over time as more functionality is
+added to hooks; so that other parts of the code don't need to understand the
+config schema, `struct hookcmd` should contain logical values instead of string
+pairs.
+
+----
+struct hookcmd {
+  const char *name;
+  const char *command;
+
+  /* for illustration only; not planned at present */
+  int parallelizable;
+  const char *hookcmd_before;
+  const char *hookcmd_after;
+  enum recovery_action on_fail;
+}
+----
+
+[[builtin]]
+=== Builtin
+
+`builtin/hook.c` is responsible for providing the frontend. It's responsible for
+formatting user-provided data and then calling the library API to set the
+configs as appropriate. The builtin frontend is not responsible for calling the
+config directly, so that other areas of Git can rely on the hook library to
+understand the most recent config schema for hooks.
+
+[[migration]]
+=== Migration path
+
+[[stage-0]]
+==== Stage 0
+
+Hooks are called by running `run-command.h:find_hook()` with the hookname and
+executing the result. The hook library and builtin do not exist. Hooks only
+exist as specially named scripts within `.git/hooks/`.
+
+[[stage-1]]
+==== Stage 1
+
+`git hook list --porcelain <hook-event>` is implemented. Users can replace their
+`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
+output. Modifier commands like `git hook add` and `git hook edit` can be
+implemented around this time as well.
+
+[[stage-2]]
+==== Stage 2
+
+`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
+end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
+opt-in to config-based hooks simply by creating some in their config; otherwise
+users should remain unaffected by the change.
+
+[[stage-3]]
+==== Stage 3
+
+The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
+`hook.runHookDir`. Users can opt into managing their hooks completely via the
+config this way.
+
+[[stage-4]]
+==== Stage 4
+
+`.git/hooks` is removed from the template and the hook directory is considered
+deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
+not changed, and `find_hook()` is not removed.
+
+[[caveats]]
+== Caveats
+
+[[security]]
+=== Security and repo config
+
+Part of the motivation behind this refactor is to mitigate hooks as an attack
+vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
+however, as the design stands, users can still provide hooks in the repo-level
+config, which is included when a repo is zipped and sent elsewhere.  The
+security of the repo-level config is still under discussion; this design
+generally assumes the repo-level config is secure, which is not true yet. The
+goal is to avoid an overcomplicated design to work around a problem which has
+ceased to exist.
+
+[[ease-of-use]]
+=== Ease of use
+
+The config schema is nontrivial; that's why it's important for the `git hook`
+modifier commands to be usable. Contributors with UX expertise are encouraged to
+share their suggestions.
+
+[[alternatives]]
+== Alternative approaches
+
+A previous summary of alternatives exists in the
+archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
+
+[[status-quo]]
+=== Status quo
+
+Today users can implement multihooks themselves by using a "trampoline script"
+as their hook, and pointing that script to a directory or list of other scripts
+they wish to run.
+
+[[hook-directories]]
+=== Hook directories
+
+Other contributors have suggested Git learn about the existence of a directory
+such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
+
+[[comparison]]
+=== Comparison table
+
+.Comparison of alternatives
+|===
+|Feature |Config-based hooks |Hook directories |Status quo
+
+|Supports multiple hooks
+|Natively
+|Natively
+|With user effort
+
+|Safer for zipped repos
+|A little
+|No
+|No
+
+|Previous hooks just work
+|If configured
+|Yes
+|Yes
+
+|Can install one hook to many repos
+|Yes
+|No
+|No
+
+|Discoverability
+|Better (in `git help git`)
+|Same as before
+|Same as before
+
+|Hard to run unexpected hook
+|If configured
+|No
+|No
+|===
+
+[[future-work]]
+== Future work
+
+[[execution-ordering]]
+=== Execution ordering
+
+We may find that config order is insufficient for some users; for example,
+config order makes it difficult to add a new hook to the system or global config
+which runs at the end of the hook list. A new ordering schema should be:
+
+1) Specified by a `hook.order` config, so that users will not unexpectedly see
+their order change;
+
+2) Either dependency or numerically based.
+
+Dependency-based ordering is prone to classic linked-list problems, like a
+cycles and handling of missing dependencies. But, it paves the way for enabling
+parallelization if some tasks truly depend on others.
+
+Numerical ordering makes it tricky for Git to generate suggested ordering
+numbers for each command, but is easy to determine a definitive order.
+
+[[parallelization]]
+=== Parallelization
+
+Users with many hooks might want to run them simultaneously, if the hooks don't
+modify state; if one hook depends on another's output, then users will want to
+specify those dependencies. If we decide to solve this problem, we may want to
+look to modern build systems for inspiration on how to manage dependencies and
+parallel tasks.
+
+[[securing-hookdir-hooks]]
+=== Securing hookdir hooks
+
+With the design as written in this doc, it's still possible for a malicious user
+to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
+zip their repo and send it to another user. It may be necessary to teach Git to
+only allow inlined hooks like this if they were configured outside of the local
+scope (in other words, only run hookcmds, and only allow hookcmds to be
+configured in global or system scope); or another approach, like a list of safe
+projects, might be useful. It may also be sufficient (or at least useful) to
+teach a `hook.disableAll` config or similar flag to the Git executable.
+
+[[submodule-inheritance]]
+=== Submodule inheritance
+
+It's possible some submodules may want to run the identical set of hooks that
+their superrepo runs. While a globally-configured hook set is helpful, it's not
+a great solution for users who have multiple repos-with-submodules under the
+same user. It would be useful for submodules to learn how to run hooks from
+their superrepo's config, or inherit that hook setting.
+
+[[glossary]]
+== Glossary
+
+*hook event*
+
+A point during Git's execution where user scripts may be run, for example,
+_prepare-commit-msg_ or _pre-push_.
+
+*hook command*
+
+A user script or executable which will be run on one or more hook events.
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v3 2/6] hook: scaffolding for git-hook subcommand
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 1/6] doc: propose hooks managed by the config Emily Shaffer
@ 2020-07-28 22:24   ` Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 3/6] hook: add list command Emily Shaffer
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 .gitignore                    |  1 +
 Documentation/git-hook.txt    | 19 +++++++++++++++++++
 Makefile                      |  1 +
 builtin.h                     |  1 +
 builtin/hook.c                | 21 +++++++++++++++++++++
 git.c                         |  1 +
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 7 files changed, 55 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 builtin/hook.c
 create mode 100755 t/t1360-config-based-hooks.sh

diff --git a/.gitignore b/.gitignore
index ee509a2ad2..0694a34884 100644
--- a/.gitignore
+++ b/.gitignore
@@ -75,6 +75,7 @@
 /git-grep
 /git-hash-object
 /git-help
+/git-hook
 /git-http-backend
 /git-http-fetch
 /git-http-push
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
new file mode 100644
index 0000000000..2d50c414cc
--- /dev/null
+++ b/Documentation/git-hook.txt
@@ -0,0 +1,19 @@
+git-hook(1)
+===========
+
+NAME
+----
+git-hook - Manage configured hooks
+
+SYNOPSIS
+--------
+[verse]
+'git hook'
+
+DESCRIPTION
+-----------
+You can list, add, and modify hooks with this command.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 372139f1f2..e13e58e23f 100644
--- a/Makefile
+++ b/Makefile
@@ -1077,6 +1077,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
+BUILTIN_OBJS += builtin/hook.o
 BUILTIN_OBJS += builtin/index-pack.o
 BUILTIN_OBJS += builtin/init-db.o
 BUILTIN_OBJS += builtin/interpret-trailers.o
diff --git a/builtin.h b/builtin.h
index a5ae15bfe5..4e736499c0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -157,6 +157,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
 int cmd_hash_object(int argc, const char **argv, const char *prefix);
 int cmd_help(int argc, const char **argv, const char *prefix);
+int cmd_hook(int argc, const char **argv, const char *prefix);
 int cmd_index_pack(int argc, const char **argv, const char *prefix);
 int cmd_init_db(int argc, const char **argv, const char *prefix);
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);
diff --git a/builtin/hook.c b/builtin/hook.c
new file mode 100644
index 0000000000..b2bbc84d4d
--- /dev/null
+++ b/builtin/hook.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+#include "builtin.h"
+#include "parse-options.h"
+
+static const char * const builtin_hook_usage[] = {
+	N_("git hook"),
+	NULL
+};
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index 2f021b97f3..7f3328c63f 100644
--- a/git.c
+++ b/git.c
@@ -517,6 +517,7 @@ static struct cmd_struct commands[] = {
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
+	{ "hook", cmd_hook, RUN_SETUP },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "init", cmd_init_db },
 	{ "init-db", cmd_init_db },
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
new file mode 100755
index 0000000000..34b0df5216
--- /dev/null
+++ b/t/t1360-config-based-hooks.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+test_description='config-managed multihooks, including git-hook command'
+
+. ./test-lib.sh
+
+test_expect_success 'git hook command does not crash' '
+	git hook
+'
+
+test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v3 3/6] hook: add list command
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 1/6] doc: propose hooks managed by the config Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 2/6] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-07-28 22:24   ` Emily Shaffer
  2020-07-28 22:24   ` [PATCH v3 4/6] hook: add --porcelain to " Emily Shaffer
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.

Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

For example:

  $ git config --list | grep ^hook
  hook.pre-commit.command=baz
  hook.pre-commit.command=~/bar.sh
  hookcmd.baz.command=~/baz/from/hookcmd.sh

  $ git hook list pre-commit
  ~/baz/from/hookcmd.sh
  ~/bar.sh

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 37 +++++++++++++-
 Makefile                      |  1 +
 builtin/hook.c                | 55 +++++++++++++++++++--
 hook.c                        | 90 +++++++++++++++++++++++++++++++++++
 hook.h                        | 15 ++++++
 t/t1360-config-based-hooks.sh | 68 +++++++++++++++++++++++++-
 6 files changed, 259 insertions(+), 7 deletions(-)
 create mode 100644 hook.c
 create mode 100644 hook.h

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 2d50c414cc..e458586e96 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,12 +8,47 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook'
+'git hook' list <hook-name>
 
 DESCRIPTION
 -----------
 You can list, add, and modify hooks with this command.
 
+This command parses the default configuration files for sections "hook" and
+"hookcmd". "hook" is used to describe the commands which will be run during a
+particular hook event; commands are run in config order. "hookcmd" is used to
+describe attributes of a specific command. If additional attributes don't need
+to be specified, a command to run can be specified directly in the "hook"
+section; if a "hookcmd" by that name isn't found, Git will attempt to run the
+provided value directly. For example:
+
+Global config
+----
+  [hook "post-commit"]
+    command = "linter"
+    command = "~/typocheck.sh"
+
+  [hookcmd "linter"]
+    command = "/bin/linter --c"
+----
+
+Local config
+----
+  [hook "prepare-commit-msg"]
+    command = "linter"
+  [hook "post-commit"]
+    command = "python ~/run-test-suite.py"
+----
+
+COMMANDS
+--------
+
+list <hook-name>::
+
+List the hooks which have been configured for <hook-name>. Hooks appear
+in the order they should be run, and note the config scope where the relevant
+`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index e13e58e23f..50e7c911d1 100644
--- a/Makefile
+++ b/Makefile
@@ -891,6 +891,7 @@ LIB_OBJS += grep.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += interdiff.o
 LIB_OBJS += json-writer.o
diff --git a/builtin/hook.c b/builtin/hook.c
index b2bbc84d4d..a0759a4c26 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -1,21 +1,68 @@
 #include "cache.h"
 
 #include "builtin.h"
+#include "config.h"
+#include "hook.h"
 #include "parse-options.h"
+#include "strbuf.h"
 
 static const char * const builtin_hook_usage[] = {
-	N_("git hook"),
+	N_("git hook list <hookname>"),
 	NULL
 };
 
-int cmd_hook(int argc, const char **argv, const char *prefix)
+static int list(int argc, const char **argv, const char *prefix)
 {
-	struct option builtin_hook_options[] = {
+	struct list_head *head, *pos;
+	struct hook *item;
+	struct strbuf hookname = STRBUF_INIT;
+
+	struct option list_options[] = {
 		OPT_END(),
 	};
 
-	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+	argc = parse_options(argc, argv, prefix, list_options,
 			     builtin_hook_usage, 0);
 
+	if (argc < 1) {
+		usage_msg_opt("a hookname must be provided to operate on.",
+			      builtin_hook_usage, list_options);
+	}
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	head = hook_list(&hookname);
+
+	if (list_empty(head)) {
+		printf(_("no commands configured for hook '%s'\n"),
+		       hookname.buf);
+		return 0;
+	}
+
+	list_for_each(pos, head) {
+		item = list_entry(pos, struct hook, list);
+		if (item)
+			printf("%s:\t%s\n",
+			       config_scope_name(item->origin),
+			       item->command.buf);
+	}
+
+	clear_hook_list();
+	strbuf_release(&hookname);
+
 	return 0;
 }
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+	if (argc < 2)
+		usage_with_options(builtin_hook_usage, builtin_hook_options);
+
+	if (!strcmp(argv[1], "list"))
+		return list(argc - 1, argv + 1, prefix);
+
+	usage_with_options(builtin_hook_usage, builtin_hook_options);
+}
diff --git a/hook.c b/hook.c
new file mode 100644
index 0000000000..9dfc1a885e
--- /dev/null
+++ b/hook.c
@@ -0,0 +1,90 @@
+#include "cache.h"
+
+#include "hook.h"
+#include "config.h"
+
+static LIST_HEAD(hook_head);
+
+void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		strbuf_release(&ptr->command);
+		free(ptr);
+	}
+}
+
+static void emplace_hook(struct list_head *pos, const char *command)
+{
+	struct hook *to_add = malloc(sizeof(struct hook));
+	to_add->origin = current_config_scope();
+	strbuf_init(&to_add->command, 0);
+	strbuf_addstr(&to_add->command, command);
+
+	list_add_tail(&to_add->list, pos);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(void)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, &hook_head)
+		remove_hook(pos);
+}
+
+static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
+{
+	const char *hook_key = hook_key_cb;
+
+	if (!strcmp(key, hook_key)) {
+		const char *command = value;
+		struct strbuf hookcmd_name = STRBUF_INIT;
+		struct list_head *pos = NULL, *tmp = NULL;
+
+		/* Check if a hookcmd with that name exists. */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
+		git_config_get_value(hookcmd_name.buf, &command);
+
+		if (!command)
+			BUG("git_config_get_value overwrote a string it shouldn't have");
+
+		/*
+		 * TODO: implement an option-getting callback, e.g.
+		 *   get configs by pattern hookcmd.$value.*
+		 *   for each key+value, do_callback(key, value, cb_data)
+		 */
+
+		list_for_each_safe(pos, tmp, &hook_head) {
+			struct hook *hook = list_entry(pos, struct hook, list);
+			/*
+			 * The list of hooks to run can be reordered by being redeclared
+			 * in the config. Options about hook ordering should be checked
+			 * here.
+			 */
+			if (0 == strcmp(hook->command.buf, command))
+				remove_hook(pos);
+		}
+		emplace_hook(pos, command);
+	}
+
+	return 0;
+}
+
+struct list_head* hook_list(const struct strbuf* hookname)
+{
+	struct strbuf hook_key = STRBUF_INIT;
+
+	if (!hookname)
+		return NULL;
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
+
+	git_config(hook_config_lookup, (void*)hook_key.buf);
+
+	return &hook_head;
+}
diff --git a/hook.h b/hook.h
new file mode 100644
index 0000000000..aaf6511cff
--- /dev/null
+++ b/hook.h
@@ -0,0 +1,15 @@
+#include "config.h"
+#include "list.h"
+#include "strbuf.h"
+
+struct hook
+{
+	struct list_head list;
+	enum config_scope origin;
+	struct strbuf command;
+};
+
+struct list_head* hook_list(const struct strbuf *hookname);
+
+void free_hook(struct hook *ptr);
+void clear_hook_list(void);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 34b0df5216..46d1ed354a 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -4,8 +4,72 @@ test_description='config-managed multihooks, including git-hook command'
 
 . ./test-lib.sh
 
-test_expect_success 'git hook command does not crash' '
-	git hook
+ROOT=
+if test_have_prereq MINGW
+then
+	# In Git for Windows, Unix-like paths work only in shell scripts;
+	# `git.exe`, however, will prefix them with the pseudo root directory
+	# (of the Unix shell). Let's accommodate for that.
+	ROOT="$(cd / && pwd)"
+fi
+
+setup_hooks () {
+	test_config hook.pre-commit.command "/path/ghi" --add
+	test_config_global hook.pre-commit.command "/path/def" --add
+}
+
+setup_hookcmd () {
+	test_config hook.pre-commit.command "abc" --add
+	test_config_global hookcmd.abc.command "/path/abc" --add
+}
+
+test_expect_success 'git hook rejects commands without a mode' '
+	test_must_fail git hook pre-commit
+'
+
+
+test_expect_success 'git hook rejects commands without a hookname' '
+	test_must_fail git hook list
+'
+
+test_expect_success 'git hook list orders by config order' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global:	$ROOT/path/def
+	local:	$ROOT/path/ghi
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list dereferences a hookcmd' '
+	setup_hooks &&
+	setup_hookcmd &&
+
+	cat >expected <<-EOF &&
+	global:	$ROOT/path/def
+	local:	$ROOT/path/ghi
+	local:	$ROOT/path/abc
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list reorders on duplicate commands' '
+	setup_hooks &&
+
+	test_config hook.pre-commit.command "/path/def" --add &&
+
+	cat >expected <<-EOF &&
+	local:	$ROOT/path/ghi
+	local:	$ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v3 4/6] hook: add --porcelain to list command
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
                     ` (2 preceding siblings ...)
  2020-07-28 22:24   ` [PATCH v3 3/6] hook: add list command Emily Shaffer
@ 2020-07-28 22:24   ` Emily Shaffer
  2020-07-28 22:24   ` [RFC PATCH v3 5/6] parse-options: parse into argv_array Emily Shaffer
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list --porcelain <hookname>', which prints simply the
commands to be run in the order suggested by the config. This option is
intended for use by user scripts, wrappers, or out-of-process Git
commands which still want to execute hooks. For example, the following
snippet might be added to git-send-email.perl to introduce a
`pre-send-email` hook:

  sub pre_send_email {
    open(my $fh, 'git hook list --porcelain pre-send-email |');
    chomp(my @hooks = <$fh>);
    close($fh);

    foreach $hook (@hooks) {
            system $hook
    }

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 13 +++++++++++--
 builtin/hook.c                | 17 +++++++++++++----
 t/t1360-config-based-hooks.sh | 12 ++++++++++++
 3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index e458586e96..0854035ce2 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,7 +8,7 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook' list <hook-name>
+'git hook' list [--porcelain] <hook-name>
 
 DESCRIPTION
 -----------
@@ -43,11 +43,20 @@ Local config
 COMMANDS
 --------
 
-list <hook-name>::
+list [--porcelain] <hook-name>::
 
 List the hooks which have been configured for <hook-name>. Hooks appear
 in the order they should be run, and note the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
++
+If `--porcelain` is specified, instead print the commands alone, separated by
+newlines, for easy parsing by a script.
+
+OPTIONS
+-------
+--porcelain::
+	With `list`, print the commands in the order they should be run,
+	separated by newlines, for easy parsing by a script.
 
 GIT
 ---
diff --git a/builtin/hook.c b/builtin/hook.c
index a0759a4c26..0d92124ca6 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -16,8 +16,11 @@ static int list(int argc, const char **argv, const char *prefix)
 	struct list_head *head, *pos;
 	struct hook *item;
 	struct strbuf hookname = STRBUF_INIT;
+	int porcelain = 0;
 
 	struct option list_options[] = {
+		OPT_BOOL(0, "porcelain", &porcelain,
+			 "format for execution by a script"),
 		OPT_END(),
 	};
 
@@ -29,6 +32,8 @@ static int list(int argc, const char **argv, const char *prefix)
 			      builtin_hook_usage, list_options);
 	}
 
+
+
 	strbuf_addstr(&hookname, argv[0]);
 
 	head = hook_list(&hookname);
@@ -41,10 +46,14 @@ static int list(int argc, const char **argv, const char *prefix)
 
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
-		if (item)
-			printf("%s:\t%s\n",
-			       config_scope_name(item->origin),
-			       item->command.buf);
+		if (item) {
+			if (porcelain)
+				printf("%s\n", item->command.buf);
+			else
+				printf("%s:\t%s\n",
+				       config_scope_name(item->origin),
+				       item->command.buf);
+		}
 	}
 
 	clear_hook_list();
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 46d1ed354a..ebf8f38d68 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -72,4 +72,16 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git hook list --porcelain prints just the command' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	$ROOT/path/def
+	$ROOT/path/ghi
+	EOF
+
+	git hook list --porcelain pre-commit >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [RFC PATCH v3 5/6] parse-options: parse into argv_array
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
                     ` (3 preceding siblings ...)
  2020-07-28 22:24   ` [PATCH v3 4/6] hook: add --porcelain to " Emily Shaffer
@ 2020-07-28 22:24   ` Emily Shaffer
  2020-07-29 19:33     ` Junio C Hamano
  2020-07-28 22:24   ` [RFC PATCH v3 6/6] hook: add 'run' subcommand Emily Shaffer
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
  6 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

parse-options already knows how to read into a string_list, and it knows
how to read into an argv_array as a passthrough (that is, including the
argument as well as its value). string_list and argv_array serve similar
purposes but are somewhat painful to convert between; so, let's teach
parse-options to read values of string arguments directly into an
argv_array without preserving the argument name.

This is useful if collecting generic arguments to pass through to
another command, for example, 'git hook run --arg "--quiet" --arg
"--format=pretty" some-hook'. The resulting argv_array would contain
{ "--quiet", "--format=pretty" }.

The implementation is based on that of OPT_STRING_LIST.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/technical/api-parse-options.txt |  5 +++++
 parse-options-cb.c                            | 16 ++++++++++++++++
 parse-options.h                               |  4 ++++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index 2e2e7c10c6..1e97343338 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -173,6 +173,11 @@ There are some macros to easily define options:
 	The string argument is stored as an element in `string_list`.
 	Use of `--no-option` will clear the list of preceding values.
 
+`OPT_ARGV_ARRAY(short, long, &struct argv_array, arg_str, description)`::
+	Introduce an option with a string argument.
+	The string argument is stored as an element in `argv_array`.
+	Use of `--no-option` will clear the list of preceding values.
+
 `OPT_INTEGER(short, long, &int_var, description)`::
 	Introduce an option with integer argument.
 	The integer is put into `int_var`.
diff --git a/parse-options-cb.c b/parse-options-cb.c
index 86cd393013..94c2dd397a 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -205,6 +205,22 @@ int parse_opt_string_list(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
+int parse_opt_argv_array(const struct option *opt, const char *arg, int unset)
+{
+	struct argv_array *v = opt->value;
+
+	if (unset) {
+		argv_array_clear(v);
+		return 0;
+	}
+
+	if (!arg)
+		return -1;
+
+	argv_array_push(v, arg);
+	return 0;
+}
+
 int parse_opt_noop_cb(const struct option *opt, const char *arg, int unset)
 {
 	return 0;
diff --git a/parse-options.h b/parse-options.h
index 46af942093..e2e2de75c8 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -177,6 +177,9 @@ struct option {
 #define OPT_STRING_LIST(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
 				      (h), 0, &parse_opt_string_list }
+#define OPT_ARGV_ARRAY(s, l, v, a, h) \
+				    { OPTION_CALLBACK, (s), (l), (v), (a), \
+				      (h), 0, &parse_opt_argv_array }
 #define OPT_UYN(s, l, v, h)         { OPTION_CALLBACK, (s), (l), (v), NULL, \
 				      (h), PARSE_OPT_NOARG, &parse_opt_tertiary }
 #define OPT_EXPIRY_DATE(s, l, v, h) \
@@ -296,6 +299,7 @@ int parse_opt_commits(const struct option *, const char *, int);
 int parse_opt_commit(const struct option *, const char *, int);
 int parse_opt_tertiary(const struct option *, const char *, int);
 int parse_opt_string_list(const struct option *, const char *, int);
+int parse_opt_argv_array(const struct option *, const char *, int);
 int parse_opt_noop_cb(const struct option *, const char *, int);
 enum parse_opt_result parse_opt_unknown_cb(struct parse_opt_ctx_t *ctx,
 					   const struct option *,
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [RFC PATCH v3 6/6] hook: add 'run' subcommand
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
                     ` (4 preceding siblings ...)
  2020-07-28 22:24   ` [RFC PATCH v3 5/6] parse-options: parse into argv_array Emily Shaffer
@ 2020-07-28 22:24   ` Emily Shaffer
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
  6 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-07-28 22:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In order to enable hooks to be run as an external process, by a
standalone Git command, or by tools which wrap Git, provide an external
means to run all configured hook commands for a given hook event.

For now, the hook commands will in config order, in series. As alternate
ordering or parallelism is supported in the future, we should add knobs
to use those to the command line as well.

As with the legacy hook implementation, all stdout generated by hook
commands is redirected to stderr. Piping from stdin is not yet
supported.

Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
execution list. For now, there is no way to disable them.

Users may wish to provide hook commands like 'git config
hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
first split by space or quotes into an argv_array, then expanded with
'expand_user_path()'.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/hook.c                | 30 +++++++++++++++++++++++++
 hook.c                        | 42 +++++++++++++++++++++++++++++++++++
 hook.h                        |  3 +++
 t/t1360-config-based-hooks.sh | 28 +++++++++++++++++++++++
 4 files changed, 103 insertions(+)

diff --git a/builtin/hook.c b/builtin/hook.c
index 0d92124ca6..cd61fad5fb 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -5,9 +5,11 @@
 #include "hook.h"
 #include "parse-options.h"
 #include "strbuf.h"
+#include "argv-array.h"
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
+	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
 	NULL
 };
 
@@ -62,6 +64,32 @@ static int list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static int run(int argc, const char **argv, const char *prefix)
+{
+	struct strbuf hookname = STRBUF_INIT;
+	struct argv_array env_argv = ARGV_ARRAY_INIT;
+	struct argv_array arg_argv = ARGV_ARRAY_INIT;
+
+	struct option run_options[] = {
+		OPT_ARGV_ARRAY('e', "env", &env_argv, N_("var"),
+			       N_("environment variables for hook to use")),
+		OPT_ARGV_ARRAY('a', "arg", &arg_argv, N_("args"),
+			       N_("argument to pass to hook")),
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, run_options,
+			     builtin_hook_usage, 0);
+
+	if (argc < 1)
+		usage_msg_opt(_("a hookname must be provided to operate on."),
+			      builtin_hook_usage, run_options);
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	return run_hooks(env_argv.argv, &hookname, &arg_argv);
+}
+
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
 	struct option builtin_hook_options[] = {
@@ -72,6 +100,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[1], "list"))
 		return list(argc - 1, argv + 1, prefix);
+	if (!strcmp(argv[1], "run"))
+		return run(argc - 1, argv + 1, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index 9dfc1a885e..902e213173 100644
--- a/hook.c
+++ b/hook.c
@@ -2,6 +2,7 @@
 
 #include "hook.h"
 #include "config.h"
+#include "run-command.h"
 
 static LIST_HEAD(hook_head);
 
@@ -78,6 +79,7 @@ static int hook_config_lookup(const char *key, const char *value, void *hook_key
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
+	const char *legacy_hook_path = NULL;
 
 	if (!hookname)
 		return NULL;
@@ -86,5 +88,45 @@ struct list_head* hook_list(const struct strbuf* hookname)
 
 	git_config(hook_config_lookup, (void*)hook_key.buf);
 
+	legacy_hook_path = find_hook(hookname->buf);
+
+	/* TODO: check hook.runHookDir */
+	if (legacy_hook_path)
+		emplace_hook(&hook_head, legacy_hook_path);
+
 	return &hook_head;
 }
+
+int run_hooks(const char *const *env, const struct strbuf *hookname,
+	      const struct argv_array *args)
+{
+	struct list_head *to_run, *pos = NULL, *tmp = NULL;
+	int rc = 0;
+
+	to_run = hook_list(hookname);
+
+	list_for_each_safe(pos, tmp, to_run) {
+		struct child_process hook_proc = CHILD_PROCESS_INIT;
+		struct hook *hook = list_entry(pos, struct hook, list);
+
+		/* add command */
+		argv_array_push(&hook_proc.args, hook->command.buf);
+
+		/*
+		 * add passed-in argv, without expanding - let the user get back
+		 * exactly what they put in
+		 */
+		if (args)
+			argv_array_pushv(&hook_proc.args, args->argv);
+
+		hook_proc.env = env;
+		hook_proc.no_stdin = 1;
+		hook_proc.stdout_to_stderr = 1;
+		hook_proc.trace2_hook_name = hook->command.buf;
+		hook_proc.use_shell = 1;
+
+		rc |= run_command(&hook_proc);
+	}
+
+	return rc;
+}
diff --git a/hook.h b/hook.h
index aaf6511cff..cf598d6ccb 100644
--- a/hook.h
+++ b/hook.h
@@ -1,6 +1,7 @@
 #include "config.h"
 #include "list.h"
 #include "strbuf.h"
+#include "argv-array.h"
 
 struct hook
 {
@@ -10,6 +11,8 @@ struct hook
 };
 
 struct list_head* hook_list(const struct strbuf *hookname);
+int run_hooks(const char *const *env, const struct strbuf *hookname,
+	      const struct argv_array *args);
 
 void free_hook(struct hook *ptr);
 void clear_hook_list(void);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index ebf8f38d68..ee8114250d 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -84,4 +84,32 @@ test_expect_success 'git hook list --porcelain prints just the command' '
 	test_cmp expected actual
 '
 
+test_expect_success 'inline hook definitions execute oneliners' '
+	test_config hook.pre-commit.command "echo \"Hello World\"" &&
+
+	echo "Hello World" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'inline hook definitions resolve paths' '
+	cat >~/sample-hook.sh <<-EOF &&
+	echo \"Sample Hook\"
+	EOF
+
+	test_when_finished "rm ~/sample-hook.sh" &&
+
+	chmod +x ~/sample-hook.sh &&
+
+	test_config hook.pre-commit.command "~/sample-hook.sh" &&
+
+	echo \"Sample Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* Re: [RFC PATCH v3 5/6] parse-options: parse into argv_array
  2020-07-28 22:24   ` [RFC PATCH v3 5/6] parse-options: parse into argv_array Emily Shaffer
@ 2020-07-29 19:33     ` Junio C Hamano
  2020-07-30 23:41       ` Junio C Hamano
  0 siblings, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2020-07-29 19:33 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer <emilyshaffer@google.com> writes:

> parse-options already knows how to read into a string_list, and it knows
> how to read into an argv_array as a passthrough (that is, including the
> argument as well as its value). string_list and argv_array serve similar
> purposes but are somewhat painful to convert between; so, let's teach
> parse-options to read values of string arguments directly into an
> argv_array without preserving the argument name.
>
> This is useful if collecting generic arguments to pass through to
> another command, for example, 'git hook run --arg "--quiet" --arg
> "--format=pretty" some-hook'. The resulting argv_array would contain
> { "--quiet", "--format=pretty" }.
>
> The implementation is based on that of OPT_STRING_LIST.

Be it argv_array or strvec, I think this is a useful thing to do.

I grepped for the users of OPT_STRING_LIST() to see if some of them
are better served by this, but none of them stood out as candidates
that are particularly good match.

> +int parse_opt_argv_array(const struct option *opt, const char *arg, int unset)
> +{
> +	struct argv_array *v = opt->value;
> +
> +	if (unset) {
> +		argv_array_clear(v);
> +		return 0;
> +	}
> +
> +	if (!arg)
> +		return -1;

I think the calling parse_options() loop would catch this negative
return and raise an error, but is it better for this code to stay
silent or would it be better to say that opt->long_name/short_name 
is not a boolean?

> +	argv_array_push(v, arg);
> +	return 0;
> +}

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [RFC PATCH v3 5/6] parse-options: parse into argv_array
  2020-07-29 19:33     ` Junio C Hamano
@ 2020-07-30 23:41       ` Junio C Hamano
  0 siblings, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-07-30 23:41 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git, Jeff King

Junio C Hamano <gitster@pobox.com> writes:

> Be it argv_array or strvec, I think this is a useful thing to do.
>
> I grepped for the users of OPT_STRING_LIST() to see if some of them
> are better served by this, but none of them stood out as candidates
> that are particularly good match.
>
>> +int parse_opt_argv_array(const struct option *opt, const char *arg, int unset)
>> +{
>> +	struct argv_array *v = opt->value;
>> +
>> +	if (unset) {
>> +		argv_array_clear(v);
>> +		return 0;
>> +	}
>> +
>> +	if (!arg)
>> +		return -1;
>
> I think the calling parse_options() loop would catch this negative
> return and raise an error, but is it better for this code to stay
> silent or would it be better to say that opt->long_name/short_name 
> is not a boolean?

I am still waiting for this to be answered, but I queued the whole
topic, these last two steps included, just to see how bad adjusting
to the strvec API migration would be.  It wasn't too bad.

I would not recommend you, or other contributors who use argv-array
API in their topics, to build on top of jk/strvec, not just yet, as
I expect it to go through at least one more reroll to update the
details.

Thanks.


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 3/4] hook: add list command
  2020-06-09 21:49     ` Emily Shaffer
@ 2020-08-17 13:36       ` Phillip Wood
  0 siblings, 0 replies; 170+ messages in thread
From: Phillip Wood @ 2020-08-17 13:36 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi Emily

sorry it has taken me so long to reply

On 09/06/2020 22:49, Emily Shaffer wrote:
> On Fri, May 22, 2020 at 11:27:44AM +0100, Phillip Wood wrote:
>>
>> Hi Emily
>>
>> On 21/05/2020 19:54, Emily Shaffer wrote:
>>> [...]
>>> +static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
>>> +{
>>> +	const char *hook_key = hook_key_cb;
>>> +
>>> +	if (!strcmp(key, hook_key)) {
>>> +		const char *command = value;
>>> +		struct strbuf hookcmd_name = STRBUF_INIT;
>>> +		struct list_head *pos = NULL, *tmp = NULL;
>>> +
>>> +		/* Check if a hookcmd with that name exists. */
>>> +		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
>>> +		git_config_get_value(hookcmd_name.buf, &command);
>>
>> This looks dodgy to me. This code is called by git_config() as it parses
>> the config files, so it has not had a chance to fully populate the
>> config cache used by git_config_get_value(). I think the test below
>> passes because the hookcmd setting is set in the global file and the
>> hook setting is set in the local file so when we have already parsed the
>> hookcmd setting when we come to look it up. The same comment applies to
>> the hypothetical ordering config mentioned below. I think it would be
>> better to collect the list of hook.<event>.command settings in this
>> callback and then look up any hookcmd settings for those hook commands
>> after we've finished reading all of the config files.
> 
> git_config_get_value() calls repo_read_config(the_repository) if the
> config hasn't been fully parsed yet, so I think what you're worrying
> about is not an issue. It's ugly, I agree, but since the new hotness
> (git_config_get_value() and friends) doesn't offer the same
> functionality as the old solution (config origin) this seemed like an
> okay approach. As I understand it, moving this hookcmd lookup section
> outside of the config callback will save us up to one additional pass
> through the configs, at the expense of a more convoluted code path.

Oh I didn't realize that, thanks for explaining it. Below you mention 
showing the origin for hookcmds as well as the origin of the command 
which would mean having to change this code anyway I think.

>>
>>> +
>>> +		if (!command)
>>> +			BUG("git_config_get_value overwrote a string it shouldn't have");
>>> +
>>> +		/*
>>> +		 * TODO: implement an option-getting callback, e.g.
>>> +		 *   get configs by pattern hookcmd.$value.*
>>> +		 *   for each key+value, do_callback(key, value, cb_data)
>>> +		 */
>>> +
>>> +		list_for_each_safe(pos, tmp, &hook_head) {
>>> +			struct hook *hook = list_entry(pos, struct hook, list);
>>> +			/*
>>> +			 * The list of hooks to run can be reordered by being redeclared
>>> +			 * in the config. Options about hook ordering should be checked
>>> +			 * here.
>>> +			 */
>>> +			if (0 == strcmp(hook->command.buf, command))
>>> +				remove_hook(pos);
>>> +		}
>>> +		emplace_hook(pos, command);
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +struct list_head* hook_list(const struct strbuf* hookname)
>>> +{
>>> +	struct strbuf hook_key = STRBUF_INIT;
>>> +
>>> +	if (!hookname)
>>> +		return NULL;
>>> +
>>> +	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
>>> +
>>> +	git_config(hook_config_lookup, (void*)hook_key.buf);
>>> +
>>> +	return &hook_head;
>>> +}
>>> diff --git a/hook.h b/hook.h
>>> new file mode 100644
>>> index 0000000000..aaf6511cff
>>> --- /dev/null
>>> +++ b/hook.h
>>> @@ -0,0 +1,15 @@
>>> +#include "config.h"
>>> +#include "list.h"
>>> +#include "strbuf.h"
>>> +
>>> +struct hook
>>> +{
>>> +	struct list_head list;
>>> +	enum config_scope origin;
>>> +	struct strbuf command;
>>> +};
>>> +
>>> +struct list_head* hook_list(const struct strbuf *hookname);
>>> +
>>> +void free_hook(struct hook *ptr);
>>> +void clear_hook_list(void);
>>> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
>>> index 34b0df5216..4e46d7dd4e 100755
>>> --- a/t/t1360-config-based-hooks.sh
>>> +++ b/t/t1360-config-based-hooks.sh
>>> @@ -4,8 +4,55 @@ test_description='config-managed multihooks, including git-hook command'
>>>   
>>>   . ./test-lib.sh
>>>   
>>> -test_expect_success 'git hook command does not crash' '
>>> -	git hook
>>> +test_expect_success 'git hook rejects commands without a mode' '
>>> +	test_must_fail git hook pre-commit
>>> +'
>>> +
>>> +
>>> +test_expect_success 'git hook rejects commands without a hookname' '
>>> +	test_must_fail git hook list
>>> +'
>>> +
>>> +test_expect_success 'setup hooks in global, and local' '
>>> +	git config --add --local hook.pre-commit.command "/path/ghi" &&
>>
>> Can I make a plea for the use of test_config please. Writing tests which
>> rely on previous tests for their set-up creates a chain of hidden
>> dependencies that make it hard to add/alter tests later or run a subset
>> of the tests when developing a new patch. t3404-rebase-interactive.sh is
>> a prime example of this and I dread touching it.
> 
> Sure. I'll redo them.

That's great, thanks

Best Wishes

Phillip

>>
>>> +	git config --add --global hook.pre-commit.command "/path/def"
>>> +'
>>> +
>>> +test_expect_success 'git hook list orders by config order' '
>>> +	cat >expected <<-\EOF &&
>>> +	global:	/path/def
>>> +	local:	/path/ghi
>>> +	EOF
>>> +
>>> +	git hook list pre-commit >actual &&
>>> +	test_cmp expected actual
>>> +'
>>> +
>>> +test_expect_success 'git hook list dereferences a hookcmd' '
>>> +	git config --add --local hook.pre-commit.command "abc" &&
>>> +	git config --add --global hookcmd.abc.command "/path/abc" &&
>>> +
>>> +	cat >expected <<-\EOF &&
>>> +	global:	/path/def
>>> +	local:	/path/ghi
>>> +	local:	/path/abc
>>
>> We should make it clear in the documentation that the config origin
>> applies to the hook setting, even though we display the hookcmd command
>> which is set globally here for the last hook.
> 
> One of the suggestions from our UX team last week was to make this list
> output clearer to indicate the origin of the command plus the origin of
> the hookcmd object; I'll try to straighten this out and make sure the
> doc agrees.
> 
>   - Emily
> 

^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH v4 0/9] propose config-based hooks
  2020-07-28 22:24 ` [PATCH v3 0/6] propose config-based hooks Emily Shaffer
                     ` (5 preceding siblings ...)
  2020-07-28 22:24   ` [RFC PATCH v3 6/6] hook: add 'run' subcommand Emily Shaffer
@ 2020-09-09  0:49   ` Emily Shaffer
  2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
                       ` (10 more replies)
  6 siblings, 11 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Since v3, the biggest change is the conversion of commit hooks to use the new
hook machinery. The first change ("commit: use config-based hooks") is the
important part; the second change ("run_commit_hook: take strvec instead of varargs")
is probably subjective, but I thought it was a decent tech debt reduction.

I wanted to send this reroll quickly since I had promised it in standup last
week, but I've got pretty good progress locally on the patch for configuring
"hook.runHookDir"; I'm planning to send that soon, probably this week.

 - Emily

Emily Shaffer (9):
  doc: propose hooks managed by the config
  hook: scaffolding for git-hook subcommand
  hook: add list command
  hook: add --porcelain to list command
  parse-options: parse into strvec
  hook: add 'run' subcommand
  hook: replace run-command.h:find_hook
  commit: use config-based hooks
  run_commit_hook: take strvec instead of varargs

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/git-hook.txt                    |  63 ++++
 Documentation/technical/api-parse-options.txt |   5 +
 .../technical/config-based-hooks.txt          | 354 ++++++++++++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/commit.c                              |  49 +--
 builtin/hook.c                                | 107 ++++++
 builtin/merge.c                               |  23 +-
 commit.c                                      |  12 +-
 commit.h                                      |   5 +-
 git.c                                         |   1 +
 hook.c                                        | 155 ++++++++
 hook.h                                        |  19 +
 parse-options-cb.c                            |  16 +
 parse-options.h                               |   4 +
 sequencer.c                                   |  15 +-
 t/t1360-config-based-hooks.sh                 | 115 ++++++
 ...3-pre-commit-and-pre-merge-commit-hooks.sh |  13 +
 20 files changed, 918 insertions(+), 43 deletions(-)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 Documentation/technical/config-based-hooks.txt
 create mode 100644 builtin/hook.c
 create mode 100644 hook.c
 create mode 100644 hook.h
 create mode 100755 t/t1360-config-based-hooks.sh

-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-23 22:59       ` Jonathan Tan
  2020-10-07  9:23       ` Ævar Arnfjörð Bjarmason
  2020-09-09  0:49     ` [PATCH v4 2/9] hook: scaffolding for git-hook subcommand Emily Shaffer
                       ` (9 subsequent siblings)
  10 siblings, 2 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/Makefile                        |   1 +
 .../technical/config-based-hooks.txt          | 354 ++++++++++++++++++
 2 files changed, 355 insertions(+)
 create mode 100644 Documentation/technical/config-based-hooks.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 80d1908a44..58d6b3acbe 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -81,6 +81,7 @@ SP_ARTICLES += $(API_DOCS)
 TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
+TECH_DOCS += technical/config-based-hooks
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
new file mode 100644
index 0000000000..c6e762b192
--- /dev/null
+++ b/Documentation/technical/config-based-hooks.txt
@@ -0,0 +1,354 @@
+Configuration-based hook management
+===================================
+:sectanchors:
+
+[[motivation]]
+== Motivation
+
+Treat hooks as a first-class citizen by replacing the .git/hook/hookname path as
+the only source of hooks to execute, in a way which is friendly to users with
+multiple repos which have similar needs.
+
+Redefine "hook" as an event rather than a single script, allowing users to
+perform unrelated actions on a single event.
+
+Take a step closer to safety when copying zipped Git repositories from untrusted
+users by making it more apparent to users which scripts will be run during
+normal Git operations.
+
+Make it easier for users to discover Git's hook feature and automate their
+workflows.
+
+[[user-interfaces]]
+== User interfaces
+
+[[config-schema]]
+=== Config schema
+
+Hooks can be introduced by editing the configuration manually. There are two new
+sections added, `hook` and `hookcmd`.
+
+[[config-schema-hook]]
+==== `hook`
+
+Primarily contains subsections for each hook event. These order of these
+subsections defines the hook command execution order; hook commands can be
+specified by setting the value directly to the command if no additional
+configuration is needed, or by setting the value as the name of a `hookcmd`. If
+Git does not find a `hookcmd` whose subsection matches the value of the given
+command string, Git will try to execute the string directly. Hooks are executed
+by passing the resolved command string to the shell. Hook event subsections can
+also contain per-hook-event settings.
+
+Also contains top-level hook execution settings, for example,
+`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`. (These settings are
+described more in <<library,Library>>.)
+
+----
+[hook "pre-commit"]
+  command = perl-linter
+  command = /usr/bin/git-secrets --pre-commit
+
+[hook "pre-applypatch"]
+  command = perl-linter
+  error = ignore
+
+[hook]
+  runHookDir = interactive
+----
+
+[[config-schema-hookcmd]]
+==== `hookcmd`
+
+Defines a hook command and its attributes, which will be used when a hook event
+occurs. Unqualified attributes are assumed to apply to this hook during all hook
+events, but event-specific attributes can also be supplied. The example runs
+`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
+include this config, the hook command will be skipped for all events to which
+it's normally subscribed _except_ `pre-commit`.
+
+----
+[hookcmd "perl-linter"]
+  command = /usr/bin/lint-it --language=perl
+  skip = true
+  pre-commit-skip = false
+----
+
+[[command-line-api]]
+=== Command-line API
+
+Users should be able to view, reorder, and create hook commands via the command
+line. External tools should be able to view a list of hooks in the correct order
+to run.
+
+*`git hook list <hook-event>`*
+
+*`git hook list (--system|--global|--local|--worktree)`*
+
+*`git hook edit <hook-event>`*
+
+*`git hook add <hook-command> <hook-event> <options...>`*
+
+[[hook-editor]]
+=== Hook editor
+
+The tool which is presented by `git hook edit <hook-command>`. Ideally, this
+tool should be easier to use than manually editing the config, and then produce
+a concise config afterwards. It may take a form similar to `git rebase
+--interactive`.
+
+[[implementation]]
+== Implementation
+
+[[library]]
+=== Library
+
+`hook.c` and `hook.h` are responsible for interacting with the config files. In
+the case when the code generating a hook event doesn't have special concerns
+about how to run the hooks, the hook library will provide a basic API to call
+all hooks in config order with an `argv_array` provided by the code which
+generates the hook event:
+
+*`int run_hooks(const char *hookname, struct argv_array *args)`*
+
+This call includes the hook command provided by `run-command.h:find_hook()`;
+eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
+config is checked against a number of cases:
+
+- "no": the legacy hook will not be run
+- "interactive": Git will prompt the user before running the legacy hook
+- "warn": Git will print a warning to stderr before running the legacy hook
+- "yes" (default): Git will silently run the legacy hook
+
+In case this list is expanded in the future, if a value for `hook.runHookDir` is
+given which Git does not recognize, Git should discard that config entry. For
+example, if "warn" was specified at system level and "junk" was specified at
+global level, Git would resolve the value to "warn"; if the only time the config
+was set was to "junk", Git would use the default value of "yes".
+
+If the caller wants to do something more complicated, the hook library can also
+provide a callback API:
+
+*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
+
+Finally, to facilitate the builtin, the library will also provide the following
+APIs to interact with the config:
+
+----
+int set_hook_commands(const char *hookname, struct string_list *commands,
+	enum config_scope scope);
+int set_hookcmd(const char *hookcmd, struct hookcmd options);
+
+int list_hook_commands(const char *hookname, struct string_list *commands);
+int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
+----
+
+`struct hookcmd` is expected to grow in size over time as more functionality is
+added to hooks; so that other parts of the code don't need to understand the
+config schema, `struct hookcmd` should contain logical values instead of string
+pairs.
+
+----
+struct hookcmd {
+  const char *name;
+  const char *command;
+
+  /* for illustration only; not planned at present */
+  int parallelizable;
+  const char *hookcmd_before;
+  const char *hookcmd_after;
+  enum recovery_action on_fail;
+}
+----
+
+[[builtin]]
+=== Builtin
+
+`builtin/hook.c` is responsible for providing the frontend. It's responsible for
+formatting user-provided data and then calling the library API to set the
+configs as appropriate. The builtin frontend is not responsible for calling the
+config directly, so that other areas of Git can rely on the hook library to
+understand the most recent config schema for hooks.
+
+[[migration]]
+=== Migration path
+
+[[stage-0]]
+==== Stage 0
+
+Hooks are called by running `run-command.h:find_hook()` with the hookname and
+executing the result. The hook library and builtin do not exist. Hooks only
+exist as specially named scripts within `.git/hooks/`.
+
+[[stage-1]]
+==== Stage 1
+
+`git hook list --porcelain <hook-event>` is implemented. Users can replace their
+`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
+output. Modifier commands like `git hook add` and `git hook edit` can be
+implemented around this time as well.
+
+[[stage-2]]
+==== Stage 2
+
+`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
+end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
+opt-in to config-based hooks simply by creating some in their config; otherwise
+users should remain unaffected by the change.
+
+[[stage-3]]
+==== Stage 3
+
+The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
+`hook.runHookDir`. Users can opt into managing their hooks completely via the
+config this way.
+
+[[stage-4]]
+==== Stage 4
+
+`.git/hooks` is removed from the template and the hook directory is considered
+deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
+not changed, and `find_hook()` is not removed.
+
+[[caveats]]
+== Caveats
+
+[[security]]
+=== Security and repo config
+
+Part of the motivation behind this refactor is to mitigate hooks as an attack
+vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
+however, as the design stands, users can still provide hooks in the repo-level
+config, which is included when a repo is zipped and sent elsewhere.  The
+security of the repo-level config is still under discussion; this design
+generally assumes the repo-level config is secure, which is not true yet. The
+goal is to avoid an overcomplicated design to work around a problem which has
+ceased to exist.
+
+[[ease-of-use]]
+=== Ease of use
+
+The config schema is nontrivial; that's why it's important for the `git hook`
+modifier commands to be usable. Contributors with UX expertise are encouraged to
+share their suggestions.
+
+[[alternatives]]
+== Alternative approaches
+
+A previous summary of alternatives exists in the
+archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
+
+[[status-quo]]
+=== Status quo
+
+Today users can implement multihooks themselves by using a "trampoline script"
+as their hook, and pointing that script to a directory or list of other scripts
+they wish to run.
+
+[[hook-directories]]
+=== Hook directories
+
+Other contributors have suggested Git learn about the existence of a directory
+such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
+
+[[comparison]]
+=== Comparison table
+
+.Comparison of alternatives
+|===
+|Feature |Config-based hooks |Hook directories |Status quo
+
+|Supports multiple hooks
+|Natively
+|Natively
+|With user effort
+
+|Safer for zipped repos
+|A little
+|No
+|No
+
+|Previous hooks just work
+|If configured
+|Yes
+|Yes
+
+|Can install one hook to many repos
+|Yes
+|No
+|No
+
+|Discoverability
+|Better (in `git help git`)
+|Same as before
+|Same as before
+
+|Hard to run unexpected hook
+|If configured
+|No
+|No
+|===
+
+[[future-work]]
+== Future work
+
+[[execution-ordering]]
+=== Execution ordering
+
+We may find that config order is insufficient for some users; for example,
+config order makes it difficult to add a new hook to the system or global config
+which runs at the end of the hook list. A new ordering schema should be:
+
+1) Specified by a `hook.order` config, so that users will not unexpectedly see
+their order change;
+
+2) Either dependency or numerically based.
+
+Dependency-based ordering is prone to classic linked-list problems, like a
+cycles and handling of missing dependencies. But, it paves the way for enabling
+parallelization if some tasks truly depend on others.
+
+Numerical ordering makes it tricky for Git to generate suggested ordering
+numbers for each command, but is easy to determine a definitive order.
+
+[[parallelization]]
+=== Parallelization
+
+Users with many hooks might want to run them simultaneously, if the hooks don't
+modify state; if one hook depends on another's output, then users will want to
+specify those dependencies. If we decide to solve this problem, we may want to
+look to modern build systems for inspiration on how to manage dependencies and
+parallel tasks.
+
+[[securing-hookdir-hooks]]
+=== Securing hookdir hooks
+
+With the design as written in this doc, it's still possible for a malicious user
+to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
+zip their repo and send it to another user. It may be necessary to teach Git to
+only allow inlined hooks like this if they were configured outside of the local
+scope (in other words, only run hookcmds, and only allow hookcmds to be
+configured in global or system scope); or another approach, like a list of safe
+projects, might be useful. It may also be sufficient (or at least useful) to
+teach a `hook.disableAll` config or similar flag to the Git executable.
+
+[[submodule-inheritance]]
+=== Submodule inheritance
+
+It's possible some submodules may want to run the identical set of hooks that
+their superrepo runs. While a globally-configured hook set is helpful, it's not
+a great solution for users who have multiple repos-with-submodules under the
+same user. It would be useful for submodules to learn how to run hooks from
+their superrepo's config, or inherit that hook setting.
+
+[[glossary]]
+== Glossary
+
+*hook event*
+
+A point during Git's execution where user scripts may be run, for example,
+_prepare-commit-msg_ or _pre-push_.
+
+*hook command*
+
+A user script or executable which will be run on one or more hook events.
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v4 2/9] hook: scaffolding for git-hook subcommand
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
  2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-10-05 23:24       ` Jonathan Nieder
  2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
                       ` (8 subsequent siblings)
  10 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 .gitignore                    |  1 +
 Documentation/git-hook.txt    | 19 +++++++++++++++++++
 Makefile                      |  1 +
 builtin.h                     |  1 +
 builtin/hook.c                | 21 +++++++++++++++++++++
 git.c                         |  1 +
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 7 files changed, 55 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 builtin/hook.c
 create mode 100755 t/t1360-config-based-hooks.sh

diff --git a/.gitignore b/.gitignore
index ee509a2ad2..0694a34884 100644
--- a/.gitignore
+++ b/.gitignore
@@ -75,6 +75,7 @@
 /git-grep
 /git-hash-object
 /git-help
+/git-hook
 /git-http-backend
 /git-http-fetch
 /git-http-push
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
new file mode 100644
index 0000000000..2d50c414cc
--- /dev/null
+++ b/Documentation/git-hook.txt
@@ -0,0 +1,19 @@
+git-hook(1)
+===========
+
+NAME
+----
+git-hook - Manage configured hooks
+
+SYNOPSIS
+--------
+[verse]
+'git hook'
+
+DESCRIPTION
+-----------
+You can list, add, and modify hooks with this command.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 65f8cfb236..6eee75555e 100644
--- a/Makefile
+++ b/Makefile
@@ -1077,6 +1077,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
+BUILTIN_OBJS += builtin/hook.o
 BUILTIN_OBJS += builtin/index-pack.o
 BUILTIN_OBJS += builtin/init-db.o
 BUILTIN_OBJS += builtin/interpret-trailers.o
diff --git a/builtin.h b/builtin.h
index a5ae15bfe5..4e736499c0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -157,6 +157,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
 int cmd_hash_object(int argc, const char **argv, const char *prefix);
 int cmd_help(int argc, const char **argv, const char *prefix);
+int cmd_hook(int argc, const char **argv, const char *prefix);
 int cmd_index_pack(int argc, const char **argv, const char *prefix);
 int cmd_init_db(int argc, const char **argv, const char *prefix);
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);
diff --git a/builtin/hook.c b/builtin/hook.c
new file mode 100644
index 0000000000..b2bbc84d4d
--- /dev/null
+++ b/builtin/hook.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+#include "builtin.h"
+#include "parse-options.h"
+
+static const char * const builtin_hook_usage[] = {
+	N_("git hook"),
+	NULL
+};
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index 8bd1d7551d..1cdb3221a5 100644
--- a/git.c
+++ b/git.c
@@ -519,6 +519,7 @@ static struct cmd_struct commands[] = {
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
+	{ "hook", cmd_hook, RUN_SETUP },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "init", cmd_init_db },
 	{ "init-db", cmd_init_db },
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
new file mode 100755
index 0000000000..34b0df5216
--- /dev/null
+++ b/t/t1360-config-based-hooks.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+test_description='config-managed multihooks, including git-hook command'
+
+. ./test-lib.sh
+
+test_expect_success 'git hook command does not crash' '
+	git hook
+'
+
+test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v4 3/9] hook: add list command
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
  2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
  2020-09-09  0:49     ` [PATCH v4 2/9] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-11 13:27       ` Phillip Wood
                         ` (3 more replies)
  2020-09-09  0:49     ` [PATCH v4 4/9] hook: add --porcelain to " Emily Shaffer
                       ` (7 subsequent siblings)
  10 siblings, 4 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.

Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

For example:

  $ git config --list | grep ^hook
  hook.pre-commit.command=baz
  hook.pre-commit.command=~/bar.sh
  hookcmd.baz.command=~/baz/from/hookcmd.sh

  $ git hook list pre-commit
  ~/baz/from/hookcmd.sh
  ~/bar.sh

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    |  37 +++++++++++-
 Makefile                      |   1 +
 builtin/hook.c                |  55 ++++++++++++++++--
 hook.c                        | 102 ++++++++++++++++++++++++++++++++++
 hook.h                        |  15 +++++
 t/t1360-config-based-hooks.sh |  68 ++++++++++++++++++++++-
 6 files changed, 271 insertions(+), 7 deletions(-)
 create mode 100644 hook.c
 create mode 100644 hook.h

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 2d50c414cc..e458586e96 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,12 +8,47 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook'
+'git hook' list <hook-name>
 
 DESCRIPTION
 -----------
 You can list, add, and modify hooks with this command.
 
+This command parses the default configuration files for sections "hook" and
+"hookcmd". "hook" is used to describe the commands which will be run during a
+particular hook event; commands are run in config order. "hookcmd" is used to
+describe attributes of a specific command. If additional attributes don't need
+to be specified, a command to run can be specified directly in the "hook"
+section; if a "hookcmd" by that name isn't found, Git will attempt to run the
+provided value directly. For example:
+
+Global config
+----
+  [hook "post-commit"]
+    command = "linter"
+    command = "~/typocheck.sh"
+
+  [hookcmd "linter"]
+    command = "/bin/linter --c"
+----
+
+Local config
+----
+  [hook "prepare-commit-msg"]
+    command = "linter"
+  [hook "post-commit"]
+    command = "python ~/run-test-suite.py"
+----
+
+COMMANDS
+--------
+
+list <hook-name>::
+
+List the hooks which have been configured for <hook-name>. Hooks appear
+in the order they should be run, and note the config scope where the relevant
+`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 6eee75555e..804de45b16 100644
--- a/Makefile
+++ b/Makefile
@@ -890,6 +890,7 @@ LIB_OBJS += grep.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += interdiff.o
 LIB_OBJS += json-writer.o
diff --git a/builtin/hook.c b/builtin/hook.c
index b2bbc84d4d..a0759a4c26 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -1,21 +1,68 @@
 #include "cache.h"
 
 #include "builtin.h"
+#include "config.h"
+#include "hook.h"
 #include "parse-options.h"
+#include "strbuf.h"
 
 static const char * const builtin_hook_usage[] = {
-	N_("git hook"),
+	N_("git hook list <hookname>"),
 	NULL
 };
 
-int cmd_hook(int argc, const char **argv, const char *prefix)
+static int list(int argc, const char **argv, const char *prefix)
 {
-	struct option builtin_hook_options[] = {
+	struct list_head *head, *pos;
+	struct hook *item;
+	struct strbuf hookname = STRBUF_INIT;
+
+	struct option list_options[] = {
 		OPT_END(),
 	};
 
-	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+	argc = parse_options(argc, argv, prefix, list_options,
 			     builtin_hook_usage, 0);
 
+	if (argc < 1) {
+		usage_msg_opt("a hookname must be provided to operate on.",
+			      builtin_hook_usage, list_options);
+	}
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	head = hook_list(&hookname);
+
+	if (list_empty(head)) {
+		printf(_("no commands configured for hook '%s'\n"),
+		       hookname.buf);
+		return 0;
+	}
+
+	list_for_each(pos, head) {
+		item = list_entry(pos, struct hook, list);
+		if (item)
+			printf("%s:\t%s\n",
+			       config_scope_name(item->origin),
+			       item->command.buf);
+	}
+
+	clear_hook_list();
+	strbuf_release(&hookname);
+
 	return 0;
 }
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+	if (argc < 2)
+		usage_with_options(builtin_hook_usage, builtin_hook_options);
+
+	if (!strcmp(argv[1], "list"))
+		return list(argc - 1, argv + 1, prefix);
+
+	usage_with_options(builtin_hook_usage, builtin_hook_options);
+}
diff --git a/hook.c b/hook.c
new file mode 100644
index 0000000000..b006950eb8
--- /dev/null
+++ b/hook.c
@@ -0,0 +1,102 @@
+#include "cache.h"
+
+#include "hook.h"
+#include "config.h"
+
+/*
+ * NEEDSWORK: a stateful hook_head means we can't run two hook events in the
+ * background at the same time - which might be ok, or might not.
+ *
+ * Maybe it's better to cache a list head per hookname, since we can probably
+ * guess that the hook list won't change during a user-initiated operation. For
+ * now, within list_hooks, call clear_hook_list() at the outset.
+ */
+static LIST_HEAD(hook_head);
+
+void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		strbuf_release(&ptr->command);
+		free(ptr);
+	}
+}
+
+static void emplace_hook(struct list_head *pos, const char *command)
+{
+	struct hook *to_add = malloc(sizeof(struct hook));
+	to_add->origin = current_config_scope();
+	strbuf_init(&to_add->command, 0);
+	/* even with use_shell, run_command() needs quotes */
+	strbuf_addf(&to_add->command, "'%s'", command);
+
+	list_add_tail(&to_add->list, pos);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(void)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, &hook_head)
+		remove_hook(pos);
+}
+
+static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
+{
+	const char *hook_key = hook_key_cb;
+
+	if (!strcmp(key, hook_key)) {
+		const char *command = value;
+		struct strbuf hookcmd_name = STRBUF_INIT;
+		struct list_head *pos = NULL, *tmp = NULL;
+
+		/* Check if a hookcmd with that name exists. */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
+		git_config_get_value(hookcmd_name.buf, &command);
+
+		if (!command)
+			BUG("git_config_get_value overwrote a string it shouldn't have");
+
+		/*
+		 * TODO: implement an option-getting callback, e.g.
+		 *   get configs by pattern hookcmd.$value.*
+		 *   for each key+value, do_callback(key, value, cb_data)
+		 */
+
+		list_for_each_safe(pos, tmp, &hook_head) {
+			struct hook *hook = list_entry(pos, struct hook, list);
+			/*
+			 * The list of hooks to run can be reordered by being redeclared
+			 * in the config. Options about hook ordering should be checked
+			 * here.
+			 */
+			if (0 == strcmp(hook->command.buf, command))
+				remove_hook(pos);
+		}
+		emplace_hook(pos, command);
+	}
+
+	return 0;
+}
+
+struct list_head* hook_list(const struct strbuf* hookname)
+{
+	struct strbuf hook_key = STRBUF_INIT;
+
+	if (!hookname)
+		return NULL;
+
+	/* hook_head is stateful */
+	clear_hook_list();
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
+
+	git_config(hook_config_lookup, (void*)hook_key.buf);
+
+	return &hook_head;
+}
diff --git a/hook.h b/hook.h
new file mode 100644
index 0000000000..aaf6511cff
--- /dev/null
+++ b/hook.h
@@ -0,0 +1,15 @@
+#include "config.h"
+#include "list.h"
+#include "strbuf.h"
+
+struct hook
+{
+	struct list_head list;
+	enum config_scope origin;
+	struct strbuf command;
+};
+
+struct list_head* hook_list(const struct strbuf *hookname);
+
+void free_hook(struct hook *ptr);
+void clear_hook_list(void);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 34b0df5216..46d1ed354a 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -4,8 +4,72 @@ test_description='config-managed multihooks, including git-hook command'
 
 . ./test-lib.sh
 
-test_expect_success 'git hook command does not crash' '
-	git hook
+ROOT=
+if test_have_prereq MINGW
+then
+	# In Git for Windows, Unix-like paths work only in shell scripts;
+	# `git.exe`, however, will prefix them with the pseudo root directory
+	# (of the Unix shell). Let's accommodate for that.
+	ROOT="$(cd / && pwd)"
+fi
+
+setup_hooks () {
+	test_config hook.pre-commit.command "/path/ghi" --add
+	test_config_global hook.pre-commit.command "/path/def" --add
+}
+
+setup_hookcmd () {
+	test_config hook.pre-commit.command "abc" --add
+	test_config_global hookcmd.abc.command "/path/abc" --add
+}
+
+test_expect_success 'git hook rejects commands without a mode' '
+	test_must_fail git hook pre-commit
+'
+
+
+test_expect_success 'git hook rejects commands without a hookname' '
+	test_must_fail git hook list
+'
+
+test_expect_success 'git hook list orders by config order' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global:	$ROOT/path/def
+	local:	$ROOT/path/ghi
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list dereferences a hookcmd' '
+	setup_hooks &&
+	setup_hookcmd &&
+
+	cat >expected <<-EOF &&
+	global:	$ROOT/path/def
+	local:	$ROOT/path/ghi
+	local:	$ROOT/path/abc
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list reorders on duplicate commands' '
+	setup_hooks &&
+
+	test_config hook.pre-commit.command "/path/def" --add &&
+
+	cat >expected <<-EOF &&
+	local:	$ROOT/path/ghi
+	local:	$ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v4 4/9] hook: add --porcelain to list command
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (2 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-28 19:29       ` Josh Steadmon
  2020-09-09  0:49     ` [PATCH v4 5/9] parse-options: parse into strvec Emily Shaffer
                       ` (6 subsequent siblings)
  10 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list --porcelain <hookname>', which prints simply the
commands to be run in the order suggested by the config. This option is
intended for use by user scripts, wrappers, or out-of-process Git
commands which still want to execute hooks. For example, the following
snippet might be added to git-send-email.perl to introduce a
`pre-send-email` hook:

  sub pre_send_email {
    open(my $fh, 'git hook list --porcelain pre-send-email |');
    chomp(my @hooks = <$fh>);
    close($fh);

    foreach $hook (@hooks) {
            system $hook
    }

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 13 +++++++++++--
 builtin/hook.c                | 17 +++++++++++++----
 t/t1360-config-based-hooks.sh | 12 ++++++++++++
 3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index e458586e96..0854035ce2 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,7 +8,7 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook' list <hook-name>
+'git hook' list [--porcelain] <hook-name>
 
 DESCRIPTION
 -----------
@@ -43,11 +43,20 @@ Local config
 COMMANDS
 --------
 
-list <hook-name>::
+list [--porcelain] <hook-name>::
 
 List the hooks which have been configured for <hook-name>. Hooks appear
 in the order they should be run, and note the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
++
+If `--porcelain` is specified, instead print the commands alone, separated by
+newlines, for easy parsing by a script.
+
+OPTIONS
+-------
+--porcelain::
+	With `list`, print the commands in the order they should be run,
+	separated by newlines, for easy parsing by a script.
 
 GIT
 ---
diff --git a/builtin/hook.c b/builtin/hook.c
index a0759a4c26..0d92124ca6 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -16,8 +16,11 @@ static int list(int argc, const char **argv, const char *prefix)
 	struct list_head *head, *pos;
 	struct hook *item;
 	struct strbuf hookname = STRBUF_INIT;
+	int porcelain = 0;
 
 	struct option list_options[] = {
+		OPT_BOOL(0, "porcelain", &porcelain,
+			 "format for execution by a script"),
 		OPT_END(),
 	};
 
@@ -29,6 +32,8 @@ static int list(int argc, const char **argv, const char *prefix)
 			      builtin_hook_usage, list_options);
 	}
 
+
+
 	strbuf_addstr(&hookname, argv[0]);
 
 	head = hook_list(&hookname);
@@ -41,10 +46,14 @@ static int list(int argc, const char **argv, const char *prefix)
 
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
-		if (item)
-			printf("%s:\t%s\n",
-			       config_scope_name(item->origin),
-			       item->command.buf);
+		if (item) {
+			if (porcelain)
+				printf("%s\n", item->command.buf);
+			else
+				printf("%s:\t%s\n",
+				       config_scope_name(item->origin),
+				       item->command.buf);
+		}
 	}
 
 	clear_hook_list();
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 46d1ed354a..ebf8f38d68 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -72,4 +72,16 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git hook list --porcelain prints just the command' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	$ROOT/path/def
+	$ROOT/path/ghi
+	EOF
+
+	git hook list --porcelain pre-commit >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v4 5/9] parse-options: parse into strvec
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (3 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 4/9] hook: add --porcelain to " Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-10-05 23:30       ` Jonathan Nieder
  2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
                       ` (5 subsequent siblings)
  10 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

parse-options already knows how to read into a string_list, and it knows
how to read into an strvec as a passthrough (that is, including the
argument as well as its value). string_list and strvec serve similar
purposes but are somewhat painful to convert between; so, let's teach
parse-options to read values of string arguments directly into an
strvec without preserving the argument name.

This is useful if collecting generic arguments to pass through to
another command, for example, 'git hook run --arg "--quiet" --arg
"--format=pretty" some-hook'. The resulting strvec would contain
{ "--quiet", "--format=pretty" }.

The implementation is based on that of OPT_STRING_LIST.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/technical/api-parse-options.txt |  5 +++++
 parse-options-cb.c                            | 16 ++++++++++++++++
 parse-options.h                               |  4 ++++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index 5a60bbfa7f..b4f1fc4a1a 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -173,6 +173,11 @@ There are some macros to easily define options:
 	The string argument is stored as an element in `string_list`.
 	Use of `--no-option` will clear the list of preceding values.
 
+`OPT_ARGV_ARRAY(short, long, &struct argv_array, arg_str, description)`::
+	Introduce an option with a string argument.
+	The string argument is stored as an element in `argv_array`.
+	Use of `--no-option` will clear the list of preceding values.
+
 `OPT_INTEGER(short, long, &int_var, description)`::
 	Introduce an option with integer argument.
 	The integer is put into `int_var`.
diff --git a/parse-options-cb.c b/parse-options-cb.c
index d9d3b0819f..d2b8b7b98a 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -205,6 +205,22 @@ int parse_opt_string_list(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
+int parse_opt_strvec(const struct option *opt, const char *arg, int unset)
+{
+	struct strvec *v = opt->value;
+
+	if (unset) {
+		strvec_clear(v);
+		return 0;
+	}
+
+	if (!arg)
+		return -1;
+
+	strvec_push(v, arg);
+	return 0;
+}
+
 int parse_opt_noop_cb(const struct option *opt, const char *arg, int unset)
 {
 	return 0;
diff --git a/parse-options.h b/parse-options.h
index 46af942093..177259488b 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -177,6 +177,9 @@ struct option {
 #define OPT_STRING_LIST(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
 				      (h), 0, &parse_opt_string_list }
+#define OPT_STRVEC(s, l, v, a, h) \
+				    { OPTION_CALLBACK, (s), (l), (v), (a), \
+				      (h), 0, &parse_opt_strvec }
 #define OPT_UYN(s, l, v, h)         { OPTION_CALLBACK, (s), (l), (v), NULL, \
 				      (h), PARSE_OPT_NOARG, &parse_opt_tertiary }
 #define OPT_EXPIRY_DATE(s, l, v, h) \
@@ -296,6 +299,7 @@ int parse_opt_commits(const struct option *, const char *, int);
 int parse_opt_commit(const struct option *, const char *, int);
 int parse_opt_tertiary(const struct option *, const char *, int);
 int parse_opt_string_list(const struct option *, const char *, int);
+int parse_opt_strvec(const struct option *, const char *, int);
 int parse_opt_noop_cb(const struct option *, const char *, int);
 enum parse_opt_result parse_opt_unknown_cb(struct parse_opt_ctx_t *ctx,
 					   const struct option *,
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v4 6/9] hook: add 'run' subcommand
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (4 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 5/9] parse-options: parse into strvec Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-11 13:30       ` Phillip Wood
                         ` (2 more replies)
  2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
                       ` (4 subsequent siblings)
  10 siblings, 3 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In order to enable hooks to be run as an external process, by a
standalone Git command, or by tools which wrap Git, provide an external
means to run all configured hook commands for a given hook event.

For now, the hook commands will in config order, in series. As alternate
ordering or parallelism is supported in the future, we should add knobs
to use those to the command line as well.

As with the legacy hook implementation, all stdout generated by hook
commands is redirected to stderr. Piping from stdin is not yet
supported.

Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
execution list. For now, there is no way to disable them.

Users may wish to provide hook commands like 'git config
hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
first split by space or quotes into an argv_array, then expanded with
'expand_user_path()'.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/hook.c                | 30 ++++++++++++++++++++
 hook.c                        | 52 ++++++++++++++++++++++++++++++++---
 hook.h                        |  3 ++
 t/t1360-config-based-hooks.sh | 28 +++++++++++++++++++
 4 files changed, 109 insertions(+), 4 deletions(-)

diff --git a/builtin/hook.c b/builtin/hook.c
index 0d92124ca6..a8f8b03699 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -5,9 +5,11 @@
 #include "hook.h"
 #include "parse-options.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
+	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
 	NULL
 };
 
@@ -62,6 +64,32 @@ static int list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static int run(int argc, const char **argv, const char *prefix)
+{
+	struct strbuf hookname = STRBUF_INIT;
+	struct strvec envs = STRVEC_INIT;
+	struct strvec args = STRVEC_INIT;
+
+	struct option run_options[] = {
+		OPT_STRVEC('e', "env", &envs, N_("var"),
+			   N_("environment variables for hook to use")),
+		OPT_STRVEC('a', "arg", &args, N_("args"),
+			   N_("argument to pass to hook")),
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, run_options,
+			     builtin_hook_usage, 0);
+
+	if (argc < 1)
+		usage_msg_opt(_("a hookname must be provided to operate on."),
+			      builtin_hook_usage, run_options);
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	return run_hooks(envs.v, &hookname, &args);
+}
+
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
 	struct option builtin_hook_options[] = {
@@ -72,6 +100,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[1], "list"))
 		return list(argc - 1, argv + 1, prefix);
+	if (!strcmp(argv[1], "run"))
+		return run(argc - 1, argv + 1, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index b006950eb8..0dab981681 100644
--- a/hook.c
+++ b/hook.c
@@ -2,6 +2,7 @@
 
 #include "hook.h"
 #include "config.h"
+#include "run-command.h"
 
 /*
  * NEEDSWORK: a stateful hook_head means we can't run two hook events in the
@@ -21,13 +22,15 @@ void free_hook(struct hook *ptr)
 	}
 }
 
-static void emplace_hook(struct list_head *pos, const char *command)
+static void emplace_hook(struct list_head *pos, const char *command, int quoted)
 {
 	struct hook *to_add = malloc(sizeof(struct hook));
 	to_add->origin = current_config_scope();
 	strbuf_init(&to_add->command, 0);
-	/* even with use_shell, run_command() needs quotes */
-	strbuf_addf(&to_add->command, "'%s'", command);
+	if (quoted)
+		strbuf_addf(&to_add->command, "'%s'", command);
+	else
+		strbuf_addstr(&to_add->command, command);
 
 	list_add_tail(&to_add->list, pos);
 }
@@ -78,7 +81,7 @@ static int hook_config_lookup(const char *key, const char *value, void *hook_key
 			if (0 == strcmp(hook->command.buf, command))
 				remove_hook(pos);
 		}
-		emplace_hook(pos, command);
+		emplace_hook(pos, command, 0);
 	}
 
 	return 0;
@@ -87,6 +90,7 @@ static int hook_config_lookup(const char *key, const char *value, void *hook_key
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
+	const char *legacy_hook_path = NULL;
 
 	if (!hookname)
 		return NULL;
@@ -98,5 +102,45 @@ struct list_head* hook_list(const struct strbuf* hookname)
 
 	git_config(hook_config_lookup, (void*)hook_key.buf);
 
+	legacy_hook_path = find_hook(hookname->buf);
+
+	/* TODO: check hook.runHookDir */
+	if (legacy_hook_path)
+		emplace_hook(&hook_head, legacy_hook_path, 1);
+
 	return &hook_head;
 }
+
+int run_hooks(const char *const *env, const struct strbuf *hookname,
+	      const struct strvec *args)
+{
+	struct list_head *to_run, *pos = NULL, *tmp = NULL;
+	int rc = 0;
+
+	to_run = hook_list(hookname);
+
+	list_for_each_safe(pos, tmp, to_run) {
+		struct child_process hook_proc = CHILD_PROCESS_INIT;
+		struct hook *hook = list_entry(pos, struct hook, list);
+
+		/* add command */
+		strvec_push(&hook_proc.args, hook->command.buf);
+
+		/*
+		 * add passed-in argv, without expanding - let the user get back
+		 * exactly what they put in
+		 */
+		if (args)
+			strvec_pushv(&hook_proc.args, args->v);
+
+		hook_proc.env = env;
+		hook_proc.no_stdin = 1;
+		hook_proc.stdout_to_stderr = 1;
+		hook_proc.trace2_hook_name = hook->command.buf;
+		hook_proc.use_shell = 1;
+
+		rc |= run_command(&hook_proc);
+	}
+
+	return rc;
+}
diff --git a/hook.h b/hook.h
index aaf6511cff..d020788a6b 100644
--- a/hook.h
+++ b/hook.h
@@ -1,6 +1,7 @@
 #include "config.h"
 #include "list.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 struct hook
 {
@@ -10,6 +11,8 @@ struct hook
 };
 
 struct list_head* hook_list(const struct strbuf *hookname);
+int run_hooks(const char *const *env, const struct strbuf *hookname,
+	      const struct strvec *args);
 
 void free_hook(struct hook *ptr);
 void clear_hook_list(void);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index ebf8f38d68..ee8114250d 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -84,4 +84,32 @@ test_expect_success 'git hook list --porcelain prints just the command' '
 	test_cmp expected actual
 '
 
+test_expect_success 'inline hook definitions execute oneliners' '
+	test_config hook.pre-commit.command "echo \"Hello World\"" &&
+
+	echo "Hello World" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'inline hook definitions resolve paths' '
+	cat >~/sample-hook.sh <<-EOF &&
+	echo \"Sample Hook\"
+	EOF
+
+	test_when_finished "rm ~/sample-hook.sh" &&
+
+	chmod +x ~/sample-hook.sh &&
+
+	test_config hook.pre-commit.command "~/sample-hook.sh" &&
+
+	echo \"Sample Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v4 7/9] hook: replace run-command.h:find_hook
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (5 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-09 20:32       ` Junio C Hamano
                         ` (2 more replies)
  2020-09-09  0:49     ` [PATCH v4 8/9] commit: use config-based hooks Emily Shaffer
                       ` (3 subsequent siblings)
  10 siblings, 3 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Add a helper to easily determine whether any hooks exist for a given
hook event.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 hook.c | 9 +++++++++
 hook.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/hook.c b/hook.c
index 0dab981681..7c7b922369 100644
--- a/hook.c
+++ b/hook.c
@@ -111,6 +111,15 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	return &hook_head;
 }
 
+int hook_exists(const char *hookname)
+{
+	const char *value = NULL;
+	struct strbuf hook_key = STRBUF_INIT;
+	strbuf_addf(&hook_key, "hook.%s.command", hookname);
+
+	return (!git_config_get_value(hook_key.buf, &value)) || !!find_hook(hookname);
+}
+
 int run_hooks(const char *const *env, const struct strbuf *hookname,
 	      const struct strvec *args)
 {
diff --git a/hook.h b/hook.h
index d020788a6b..d94511b609 100644
--- a/hook.h
+++ b/hook.h
@@ -11,6 +11,7 @@ struct hook
 };
 
 struct list_head* hook_list(const struct strbuf *hookname);
+int hook_exists(const char *hookname);
 int run_hooks(const char *const *env, const struct strbuf *hookname,
 	      const struct strvec *args);
 
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v4 8/9] commit: use config-based hooks
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (6 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-10 13:50       ` Phillip Wood
  2020-09-23 23:47       ` Jonathan Tan
  2020-09-09  0:49     ` [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs Emily Shaffer
                       ` (2 subsequent siblings)
  10 siblings, 2 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

As part of the adoption of config-based hooks, teach run_commit_hook()
to call hook.h instead of run-command.h. This covers 'pre-commit',
'commit-msg', and 'prepare-commit-msg'. Additionally, ask the hook
library - not run-command - whether any hooks will be run, as it's
possible hooks may exist in the config but not the hookdir.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/commit.c                                 |  3 ++-
 builtin/merge.c                                  |  3 ++-
 commit.c                                         | 13 ++++++++++++-
 t/t7503-pre-commit-and-pre-merge-commit-hooks.sh | 13 +++++++++++++
 4 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 69ac78d5e5..a19c6478eb 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -36,6 +36,7 @@
 #include "help.h"
 #include "commit-reach.h"
 #include "commit-graph.h"
+#include "hook.h"
 
 static const char * const builtin_commit_usage[] = {
 	N_("git commit [<options>] [--] <pathspec>..."),
@@ -985,7 +986,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		return 0;
 	}
 
-	if (!no_verify && find_hook("pre-commit")) {
+	if (!no_verify && hook_exists("pre-commit")) {
 		/*
 		 * Re-read the index as pre-commit hook could have updated it,
 		 * and write it out as a tree.  We must do this before we invoke
diff --git a/builtin/merge.c b/builtin/merge.c
index 74829a838e..c1a9d0083d 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -41,6 +41,7 @@
 #include "commit-reach.h"
 #include "wt-status.h"
 #include "commit-graph.h"
+#include "hook.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
@@ -829,7 +830,7 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 	 * and write it out as a tree.  We must do this before we invoke
 	 * the editor and after we invoke run_status above.
 	 */
-	if (find_hook("pre-merge-commit"))
+	if (hook_exists("pre-merge-commit"))
 		discard_cache();
 	read_cache_from(index_file);
 	strbuf_addbuf(&msg, &merge_msg);
diff --git a/commit.c b/commit.c
index 4ce8cb38d5..c7a243e848 100644
--- a/commit.c
+++ b/commit.c
@@ -21,6 +21,7 @@
 #include "commit-reach.h"
 #include "run-command.h"
 #include "shallow.h"
+#include "hook.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
 
@@ -1632,8 +1633,13 @@ int run_commit_hook(int editor_is_used, const char *index_file,
 {
 	struct strvec hook_env = STRVEC_INIT;
 	va_list args;
+	const char *arg;
+	struct strvec hook_args = STRVEC_INIT;
+	struct strbuf hook_name = STRBUF_INIT;
 	int ret;
 
+	strbuf_addstr(&hook_name, name);
+
 	strvec_pushf(&hook_env, "GIT_INDEX_FILE=%s", index_file);
 
 	/*
@@ -1643,9 +1649,14 @@ int run_commit_hook(int editor_is_used, const char *index_file,
 		strvec_push(&hook_env, "GIT_EDITOR=:");
 
 	va_start(args, name);
-	ret = run_hook_ve(hook_env.v, name, args);
+	while ((arg = va_arg(args, const char *)))
+		strvec_push(&hook_args, arg);
 	va_end(args);
+
+	ret = run_hooks(hook_env.v, &hook_name, &hook_args);
 	strvec_clear(&hook_env);
+	strvec_clear(&hook_args);
+	strbuf_release(&hook_name);
 
 	return ret;
 }
diff --git a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
index b3485450a2..cef8085dcc 100755
--- a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
+++ b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
@@ -103,6 +103,19 @@ test_expect_success 'with succeeding hook' '
 	test_cmp expected_hooks actual_hooks
 '
 
+# NEEDSWORK: when 'git hook add' and 'git hook remove' have been added, use that
+# instead
+test_expect_success 'with succeeding hook (config-based)' '
+	test_when_finished "git config --unset hook.pre-commit.command success.sample" &&
+	test_when_finished "rm -f expected_hooks actual_hooks" &&
+	git config hook.pre-commit.command "$HOOKDIR/success.sample" &&
+	echo "$HOOKDIR/success.sample" >expected_hooks &&
+	echo "more" >>file &&
+	git add file &&
+	git commit -m "more" &&
+	test_cmp expected_hooks actual_hooks
+'
+
 test_expect_success 'with succeeding hook (merge)' '
 	test_when_finished "rm -f \"$PREMERGE\" expected_hooks actual_hooks" &&
 	cp "$HOOKDIR/success.sample" "$PREMERGE" &&
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (7 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 8/9] commit: use config-based hooks Emily Shaffer
@ 2020-09-09  0:49     ` Emily Shaffer
  2020-09-10 14:16       ` Phillip Wood
  2020-09-09 21:04     ` [PATCH v4 0/9] propose config-based hooks Junio C Hamano
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
  10 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-09-09  0:49 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Taking varargs in run_commit_hook() led to some bizarre patterns, like
callers using two string variables (which may or may not be filled) to
express different argument lists for the commit hooks. Because
run_commit_hook() no longer needs to call a variadic function for the
hook run itself, we can use strvec to make the calling code more
conventional.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 builtin/commit.c | 46 +++++++++++++++++++++++-----------------------
 builtin/merge.c  | 20 ++++++++++++++++----
 commit.c         | 13 ++-----------
 commit.h         |  5 +++--
 sequencer.c      | 15 ++++++++-------
 5 files changed, 52 insertions(+), 47 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index a19c6478eb..f029d4f5ac 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -691,8 +691,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	struct strbuf committer_ident = STRBUF_INIT;
 	int committable;
 	struct strbuf sb = STRBUF_INIT;
-	const char *hook_arg1 = NULL;
-	const char *hook_arg2 = NULL;
+	struct strvec hook_args = STRVEC_INIT;
 	int clean_message_contents = (cleanup_mode != COMMIT_MSG_CLEANUP_NONE);
 	int old_display_comment_prefix;
 	int merge_contains_scissors = 0;
@@ -700,7 +699,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	/* This checks and barfs if author is badly specified */
 	determine_author_info(author_ident);
 
-	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", NULL))
+	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit",
+					  &hook_args))
 		return 0;
 
 	if (squash_message) {
@@ -722,27 +722,28 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		}
 	}
 
+	strvec_push(&hook_args, git_path_commit_editmsg());
+
 	if (have_option_m && !fixup_message) {
 		strbuf_addbuf(&sb, &message);
-		hook_arg1 = "message";
+		strvec_push(&hook_args, "message");
 	} else if (logfile && !strcmp(logfile, "-")) {
 		if (isatty(0))
 			fprintf(stderr, _("(reading log message from standard input)\n"));
 		if (strbuf_read(&sb, 0, 0) < 0)
 			die_errno(_("could not read log from standard input"));
-		hook_arg1 = "message";
+		strvec_push(&hook_args, "message");
 	} else if (logfile) {
 		if (strbuf_read_file(&sb, logfile, 0) < 0)
 			die_errno(_("could not read log file '%s'"),
 				  logfile);
-		hook_arg1 = "message";
+		strvec_push(&hook_args, "message");
 	} else if (use_message) {
 		char *buffer;
 		buffer = strstr(use_message_buffer, "\n\n");
 		if (buffer)
 			strbuf_addstr(&sb, skip_blank_lines(buffer + 2));
-		hook_arg1 = "commit";
-		hook_arg2 = use_message;
+		strvec_pushl(&hook_args, "commit", use_message, NULL);
 	} else if (fixup_message) {
 		struct pretty_print_context ctx = {0};
 		struct commit *commit;
@@ -754,7 +755,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 				      &sb, &ctx);
 		if (have_option_m)
 			strbuf_addbuf(&sb, &message);
-		hook_arg1 = "message";
+		strvec_push(&hook_args, "message");
 	} else if (!stat(git_path_merge_msg(the_repository), &statbuf)) {
 		size_t merge_msg_start;
 
@@ -765,9 +766,9 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		if (!stat(git_path_squash_msg(the_repository), &statbuf)) {
 			if (strbuf_read_file(&sb, git_path_squash_msg(the_repository), 0) < 0)
 				die_errno(_("could not read SQUASH_MSG"));
-			hook_arg1 = "squash";
+			strvec_push(&hook_args, "squash");
 		} else
-			hook_arg1 = "merge";
+			strvec_push(&hook_args, "merge");
 
 		merge_msg_start = sb.len;
 		if (strbuf_read_file(&sb, git_path_merge_msg(the_repository), 0) < 0)
@@ -781,11 +782,11 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	} else if (!stat(git_path_squash_msg(the_repository), &statbuf)) {
 		if (strbuf_read_file(&sb, git_path_squash_msg(the_repository), 0) < 0)
 			die_errno(_("could not read SQUASH_MSG"));
-		hook_arg1 = "squash";
+		strvec_push(&hook_args, "squash");
 	} else if (template_file) {
 		if (strbuf_read_file(&sb, template_file, 0) < 0)
 			die_errno(_("could not read '%s'"), template_file);
-		hook_arg1 = "template";
+		strvec_push(&hook_args, "template");
 		clean_message_contents = 0;
 	}
 
@@ -794,11 +795,9 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 	 * just set the argument(s) to the prepare-commit-msg hook.
 	 */
 	else if (whence == FROM_MERGE)
-		hook_arg1 = "merge";
-	else if (is_from_cherry_pick(whence) || whence == FROM_REBASE_PICK) {
-		hook_arg1 = "commit";
-		hook_arg2 = "CHERRY_PICK_HEAD";
-	}
+		strvec_push(&hook_args, "merge");
+	else if (is_from_cherry_pick(whence) || whence == FROM_REBASE_PICK)
+		strvec_pushl(&hook_args, "commit", "CHERRY_PICK_HEAD", NULL);
 
 	if (squash_message) {
 		/*
@@ -806,8 +805,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		 * then we're possibly hijacking other commit log options.
 		 * Reset the hook args to tell the real story.
 		 */
-		hook_arg1 = "message";
-		hook_arg2 = "";
+		strvec_clear(&hook_args);
+		strvec_pushl(&hook_args, git_path_commit_editmsg(), "message", NULL);
 	}
 
 	s->fp = fopen_for_writing(git_path_commit_editmsg());
@@ -1001,8 +1000,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		return 0;
 	}
 
-	if (run_commit_hook(use_editor, index_file, "prepare-commit-msg",
-			    git_path_commit_editmsg(), hook_arg1, hook_arg2, NULL))
+	if (run_commit_hook(use_editor, index_file, "prepare-commit-msg", &hook_args))
 		return 0;
 
 	if (use_editor) {
@@ -1017,8 +1015,10 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 		strvec_clear(&env);
 	}
 
+	strvec_clear(&hook_args);
+	strvec_push(&hook_args, git_path_commit_editmsg());
 	if (!no_verify &&
-	    run_commit_hook(use_editor, index_file, "commit-msg", git_path_commit_editmsg(), NULL)) {
+	    run_commit_hook(use_editor, index_file, "commit-msg", &hook_args)) {
 		return 0;
 	}
 
diff --git a/builtin/merge.c b/builtin/merge.c
index c1a9d0083d..863c9039a3 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -821,10 +821,14 @@ static void write_merge_heads(struct commit_list *);
 static void prepare_to_commit(struct commit_list *remoteheads)
 {
 	struct strbuf msg = STRBUF_INIT;
+	struct strvec hook_args = STRVEC_INIT;
+	struct strbuf hook_name = STRBUF_INIT;
 	const char *index_file = get_index_file();
 
-	if (!no_verify && run_commit_hook(0 < option_edit, index_file, "pre-merge-commit", NULL))
+	if (!no_verify && run_commit_hook(0 < option_edit, index_file,
+					  "pre-merge-commit", &hook_args))
 		abort_commit(remoteheads, NULL);
+
 	/*
 	 * Re-read the index as pre-merge-commit hook could have updated it,
 	 * and write it out as a tree.  We must do this before we invoke
@@ -832,6 +836,7 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 	 */
 	if (hook_exists("pre-merge-commit"))
 		discard_cache();
+
 	read_cache_from(index_file);
 	strbuf_addbuf(&msg, &merge_msg);
 	if (squash)
@@ -851,17 +856,22 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 		append_signoff(&msg, ignore_non_trailer(msg.buf, msg.len), 0);
 	write_merge_heads(remoteheads);
 	write_file_buf(git_path_merge_msg(the_repository), msg.buf, msg.len);
+
+	strvec_clear(&hook_args);
+	strvec_pushl(&hook_args, git_path_merge_msg(the_repository), "merge", NULL);
 	if (run_commit_hook(0 < option_edit, get_index_file(), "prepare-commit-msg",
-			    git_path_merge_msg(the_repository), "merge", NULL))
+			    &hook_args))
 		abort_commit(remoteheads, NULL);
+
 	if (0 < option_edit) {
 		if (launch_editor(git_path_merge_msg(the_repository), NULL, NULL))
 			abort_commit(remoteheads, NULL);
 	}
 
+	strvec_clear(&hook_args);
+	strvec_push(&hook_args, git_path_merge_msg(the_repository));
 	if (!no_verify && run_commit_hook(0 < option_edit, get_index_file(),
-					  "commit-msg",
-					  git_path_merge_msg(the_repository), NULL))
+					  "commit-msg", &hook_args))
 		abort_commit(remoteheads, NULL);
 
 	read_merge_msg(&msg);
@@ -871,6 +881,8 @@ static void prepare_to_commit(struct commit_list *remoteheads)
 	strbuf_release(&merge_msg);
 	strbuf_addbuf(&merge_msg, &msg);
 	strbuf_release(&msg);
+	strbuf_release(&hook_name);
+	strvec_clear(&hook_args);
 }
 
 static int merge_trivial(struct commit *head, struct commit_list *remoteheads)
diff --git a/commit.c b/commit.c
index c7a243e848..726407152c 100644
--- a/commit.c
+++ b/commit.c
@@ -1629,12 +1629,9 @@ size_t ignore_non_trailer(const char *buf, size_t len)
 }
 
 int run_commit_hook(int editor_is_used, const char *index_file,
-		    const char *name, ...)
+		    const char *name, struct strvec *args)
 {
 	struct strvec hook_env = STRVEC_INIT;
-	va_list args;
-	const char *arg;
-	struct strvec hook_args = STRVEC_INIT;
 	struct strbuf hook_name = STRBUF_INIT;
 	int ret;
 
@@ -1648,14 +1645,8 @@ int run_commit_hook(int editor_is_used, const char *index_file,
 	if (!editor_is_used)
 		strvec_push(&hook_env, "GIT_EDITOR=:");
 
-	va_start(args, name);
-	while ((arg = va_arg(args, const char *)))
-		strvec_push(&hook_args, arg);
-	va_end(args);
-
-	ret = run_hooks(hook_env.v, &hook_name, &hook_args);
+	ret = run_hooks(hook_env.v, &hook_name, args);
 	strvec_clear(&hook_env);
-	strvec_clear(&hook_args);
 	strbuf_release(&hook_name);
 
 	return ret;
diff --git a/commit.h b/commit.h
index e901538909..978da3c3e0 100644
--- a/commit.h
+++ b/commit.h
@@ -9,6 +9,7 @@
 #include "string-list.h"
 #include "pretty.h"
 #include "commit-slab.h"
+#include "strvec.h"
 
 #define COMMIT_NOT_FROM_GRAPH 0xFFFFFFFF
 #define GENERATION_NUMBER_INFINITY 0xFFFFFFFF
@@ -353,7 +354,7 @@ void verify_merge_signature(struct commit *commit, int verbose,
 int compare_commits_by_commit_date(const void *a_, const void *b_, void *unused);
 int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_, void *unused);
 
-LAST_ARG_MUST_BE_NULL
-int run_commit_hook(int editor_is_used, const char *index_file, const char *name, ...);
+int run_commit_hook(int editor_is_used, const char *index_file,
+		    const char *name, struct strvec *args);
 
 #endif /* COMMIT_H */
diff --git a/sequencer.c b/sequencer.c
index cc3f8fa88e..5dd4b134d6 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1124,22 +1124,23 @@ static int run_prepare_commit_msg_hook(struct repository *r,
 				       const char *commit)
 {
 	int ret = 0;
-	const char *name, *arg1 = NULL, *arg2 = NULL;
+	struct strvec args = STRVEC_INIT;
+	const char *name = git_path_commit_editmsg();
 
-	name = git_path_commit_editmsg();
+	strvec_push(&args, name);
 	if (write_message(msg->buf, msg->len, name, 0))
 		return -1;
 
 	if (commit) {
-		arg1 = "commit";
-		arg2 = commit;
+		strvec_push(&args, "commit");
+		strvec_push(&args, commit);
 	} else {
-		arg1 = "message";
+		strvec_push(&args, "message");
 	}
-	if (run_commit_hook(0, r->index_file, "prepare-commit-msg", name,
-			    arg1, arg2, NULL))
+	if (run_commit_hook(0, r->index_file, "prepare-commit-msg", &args))
 		ret = error(_("'prepare-commit-msg' hook failed"));
 
+	strvec_clear(&args);
 	return ret;
 }
 
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 7/9] hook: replace run-command.h:find_hook
  2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
@ 2020-09-09 20:32       ` Junio C Hamano
  2020-09-10 19:08         ` Emily Shaffer
  2020-09-23 23:20       ` Jonathan Tan
  2020-10-05 23:42       ` Jonathan Nieder
  2 siblings, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2020-09-09 20:32 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer <emilyshaffer@google.com> writes:

> Add a helper to easily determine whether any hooks exist for a given
> hook event.
>
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  hook.c | 9 +++++++++
>  hook.h | 1 +
>  2 files changed, 10 insertions(+)

Should we consider the last three patches still work-in-progress
technology demonstration, or are these meant as a proposal for a new
API element as-is?

It is perfectly fine if it is the former.  I just want to make sure
we share a common understanding on the direction in which we want
these patches to take us.  Here is my take:

 - For now, a hook/event that is aware of the config-based hook
   system is supposed to use hook_exists(), while the traditional
   ones still use find_hook().  We expect more and more will be
   converted to the former over time.

 - Invoking hook scripts under the new world order is done by
   including hook.h and calling run_hooks(), not by driving the
   run-command API yourself (I count run_hook_ve() as part of the
   latter) like the traditional code did.  We expect more and more
   will be converted to the former over time.

 - From the point of view of the end users who have been happily
   using scripts in $GIT_DIR/hooks, everything will stay the same.
   hook_exists() will find them (by calling find_hook() as a
   fallback) and run_hooks() will run them (by relying on
   hook_list() to include them).

I am guessing that the above gives us a high-level description.

The new interface needs to be described in hook.h once the series
graduates from the technology demonstration state, in order to help
others who want to help updating the callsites of traditional hooks
to the new API.  And the above three-bullet point list is my attempt
to figure out what kind of things need to be documented to help
them.

I am not seeing anything in run_hooks() that consumes input from us
over pipe, by the way, without which we cannot do things like the
"pre-receive" hooks under the new world order.  Are they planned to
come in the future, after these "we feed anything they need from the
command line and from the enviornment" hooks are dealt with in this
first pass?

Thanks.

> diff --git a/hook.c b/hook.c
> index 0dab981681..7c7b922369 100644
> --- a/hook.c
> +++ b/hook.c
> @@ -111,6 +111,15 @@ struct list_head* hook_list(const struct strbuf* hookname)
>  	return &hook_head;
>  }
>  
> +int hook_exists(const char *hookname)
> +{
> +	const char *value = NULL;
> +	struct strbuf hook_key = STRBUF_INIT;
> +	strbuf_addf(&hook_key, "hook.%s.command", hookname);
> +
> +	return (!git_config_get_value(hook_key.buf, &value)) || !!find_hook(hookname);
> +}
> +
>  int run_hooks(const char *const *env, const struct strbuf *hookname,
>  	      const struct strvec *args)
>  {
> diff --git a/hook.h b/hook.h
> index d020788a6b..d94511b609 100644
> --- a/hook.h
> +++ b/hook.h
> @@ -11,6 +11,7 @@ struct hook
>  };
>  
>  struct list_head* hook_list(const struct strbuf *hookname);
> +int hook_exists(const char *hookname);
>  int run_hooks(const char *const *env, const struct strbuf *hookname,
>  	      const struct strvec *args);

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 0/9] propose config-based hooks
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (8 preceding siblings ...)
  2020-09-09  0:49     ` [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs Emily Shaffer
@ 2020-09-09 21:04     ` Junio C Hamano
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
  10 siblings, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-09-09 21:04 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: git, Jeff King, James Ramsay, Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Emily Shaffer <emilyshaffer@google.com> writes:

> Since v3, the biggest change is the conversion of commit hooks to use the new
> hook machinery. The first change ("commit: use config-based hooks") is the
> important part; the second change ("run_commit_hook: take strvec instead of varargs")
> is probably subjective, but I thought it was a decent tech debt reduction.
>
> I wanted to send this reroll quickly since I had promised it in standup last
> week, but I've got pretty good progress locally on the patch for configuring
> "hook.runHookDir"; I'm planning to send that soon, probably this week.

I've had the attached merge-fix patch as a way to adjust argv_array
to strvec transition [*1*], but now *most* but not all parts of this
series have been migrated to the strvec API, you should apply some
parts in the merge-fix patch to your copy.  I think the changes in
the old "merge-fix" patch to *.c and *.h are already in your series
that has been rebased on a newer 'master' that has strvec, but
documentation and possibly in-code comments may need to be adjusted.

Another way to sanity check the result would be to run this:

    $ git diff master..es/config-hooks | grep -i argv.array

Thanks.  

[Footnote]

*1* The way I work with a topic that causes conflicts with other
    topics is to merge a new version of topic and letting the rerere
    records I created while resolving the conflicts with the
    previous round.  After textual conflicts are thusly resolved, if
    there are further changes that do not cause textual conflict
    that are necessary, they are written in the form of a
    "merge-fix" patch like the attached.

-- >8 --

 Documentation/technical/api-parse-options.txt  |  4 ++--
 Documentation/technical/config-based-hooks.txt |  4 ++--
 builtin/hook.c                                 | 16 ++++++++--------
 hook.c                                         |  6 +++---
 hook.h                                         |  4 ++--
 parse-options-cb.c                             |  8 ++++----
 parse-options.h                                |  6 +++---
 7 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index b4f1fc4a1a..679bd98629 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -173,9 +173,9 @@ There are some macros to easily define options:
 	The string argument is stored as an element in `string_list`.
 	Use of `--no-option` will clear the list of preceding values.
 
-`OPT_ARGV_ARRAY(short, long, &struct argv_array, arg_str, description)`::
+`OPT_STRVEC(short, long, &struct strvec, arg_str, description)`::
 	Introduce an option with a string argument.
-	The string argument is stored as an element in `argv_array`.
+	The string argument is stored as an element in `strvec`.
 	Use of `--no-option` will clear the list of preceding values.
 
 `OPT_INTEGER(short, long, &int_var, description)`::
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
index c6e762b192..4443f70ded 100644
--- a/Documentation/technical/config-based-hooks.txt
+++ b/Documentation/technical/config-based-hooks.txt
@@ -106,10 +106,10 @@ a concise config afterwards. It may take a form similar to `git rebase
 `hook.c` and `hook.h` are responsible for interacting with the config files. In
 the case when the code generating a hook event doesn't have special concerns
 about how to run the hooks, the hook library will provide a basic API to call
-all hooks in config order with an `argv_array` provided by the code which
+all hooks in config order with an `strvec` provided by the code which
 generates the hook event:
 
-*`int run_hooks(const char *hookname, struct argv_array *args)`*
+*`int run_hooks(const char *hookname, struct strvec *args)`*
 
 This call includes the hook command provided by `run-command.h:find_hook()`;
 eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
diff --git a/builtin/hook.c b/builtin/hook.c
index cd61fad5fb..debcb5a77a 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -5,7 +5,7 @@
 #include "hook.h"
 #include "parse-options.h"
 #include "strbuf.h"
-#include "argv-array.h"
+#include "strvec.h"
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
@@ -67,14 +67,14 @@ static int list(int argc, const char **argv, const char *prefix)
 static int run(int argc, const char **argv, const char *prefix)
 {
 	struct strbuf hookname = STRBUF_INIT;
-	struct argv_array env_argv = ARGV_ARRAY_INIT;
-	struct argv_array arg_argv = ARGV_ARRAY_INIT;
+	struct strvec env_argv = STRVEC_INIT;
+	struct strvec arg_argv = STRVEC_INIT;
 
 	struct option run_options[] = {
-		OPT_ARGV_ARRAY('e', "env", &env_argv, N_("var"),
-			       N_("environment variables for hook to use")),
-		OPT_ARGV_ARRAY('a', "arg", &arg_argv, N_("args"),
-			       N_("argument to pass to hook")),
+		OPT_STRVEC('e', "env", &env_argv, N_("var"),
+			   N_("environment variables for hook to use")),
+		OPT_STRVEC('a', "arg", &arg_argv, N_("args"),
+			   N_("argument to pass to hook")),
 		OPT_END(),
 	};
 
@@ -87,7 +87,7 @@ static int run(int argc, const char **argv, const char *prefix)
 
 	strbuf_addstr(&hookname, argv[0]);
 
-	return run_hooks(env_argv.argv, &hookname, &arg_argv);
+	return run_hooks(env_argv.v, &hookname, &arg_argv);
 }
 
 int cmd_hook(int argc, const char **argv, const char *prefix)
diff --git a/hook.c b/hook.c
index 902e213173..40d319adb1 100644
--- a/hook.c
+++ b/hook.c
@@ -98,7 +98,7 @@ struct list_head* hook_list(const struct strbuf* hookname)
 }
 
 int run_hooks(const char *const *env, const struct strbuf *hookname,
-	      const struct argv_array *args)
+	      const struct strvec *args)
 {
 	struct list_head *to_run, *pos = NULL, *tmp = NULL;
 	int rc = 0;
@@ -110,14 +110,14 @@ int run_hooks(const char *const *env, const struct strbuf *hookname,
 		struct hook *hook = list_entry(pos, struct hook, list);
 
 		/* add command */
-		argv_array_push(&hook_proc.args, hook->command.buf);
+		strvec_push(&hook_proc.args, hook->command.buf);
 
 		/*
 		 * add passed-in argv, without expanding - let the user get back
 		 * exactly what they put in
 		 */
 		if (args)
-			argv_array_pushv(&hook_proc.args, args->argv);
+			strvec_pushv(&hook_proc.args, args->v);
 
 		hook_proc.env = env;
 		hook_proc.no_stdin = 1;
diff --git a/hook.h b/hook.h
index cf598d6ccb..d020788a6b 100644
--- a/hook.h
+++ b/hook.h
@@ -1,7 +1,7 @@
 #include "config.h"
 #include "list.h"
 #include "strbuf.h"
-#include "argv-array.h"
+#include "strvec.h"
 
 struct hook
 {
@@ -12,7 +12,7 @@ struct hook
 
 struct list_head* hook_list(const struct strbuf *hookname);
 int run_hooks(const char *const *env, const struct strbuf *hookname,
-	      const struct argv_array *args);
+	      const struct strvec *args);
 
 void free_hook(struct hook *ptr);
 void clear_hook_list(void);
diff --git a/parse-options-cb.c b/parse-options-cb.c
index 4f993cd734..d2b8b7b98a 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -205,19 +205,19 @@ int parse_opt_string_list(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
-int parse_opt_argv_array(const struct option *opt, const char *arg, int unset)
+int parse_opt_strvec(const struct option *opt, const char *arg, int unset)
 {
-	struct argv_array *v = opt->value;
+	struct strvec *v = opt->value;
 
 	if (unset) {
-		argv_array_clear(v);
+		strvec_clear(v);
 		return 0;
 	}
 
 	if (!arg)
 		return -1;
 
-	argv_array_push(v, arg);
+	strvec_push(v, arg);
 	return 0;
 }
 
diff --git a/parse-options.h b/parse-options.h
index e2e2de75c8..177259488b 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -177,9 +177,9 @@ struct option {
 #define OPT_STRING_LIST(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
 				      (h), 0, &parse_opt_string_list }
-#define OPT_ARGV_ARRAY(s, l, v, a, h) \
+#define OPT_STRVEC(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
-				      (h), 0, &parse_opt_argv_array }
+				      (h), 0, &parse_opt_strvec }
 #define OPT_UYN(s, l, v, h)         { OPTION_CALLBACK, (s), (l), (v), NULL, \
 				      (h), PARSE_OPT_NOARG, &parse_opt_tertiary }
 #define OPT_EXPIRY_DATE(s, l, v, h) \
@@ -299,7 +299,7 @@ int parse_opt_commits(const struct option *, const char *, int);
 int parse_opt_commit(const struct option *, const char *, int);
 int parse_opt_tertiary(const struct option *, const char *, int);
 int parse_opt_string_list(const struct option *, const char *, int);
-int parse_opt_argv_array(const struct option *, const char *, int);
+int parse_opt_strvec(const struct option *, const char *, int);
 int parse_opt_noop_cb(const struct option *, const char *, int);
 enum parse_opt_result parse_opt_unknown_cb(struct parse_opt_ctx_t *ctx,
 					   const struct option *,
-- 
2.28.0-558-g7a0184fd7b


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 8/9] commit: use config-based hooks
  2020-09-09  0:49     ` [PATCH v4 8/9] commit: use config-based hooks Emily Shaffer
@ 2020-09-10 13:50       ` Phillip Wood
  2020-09-10 22:21         ` Junio C Hamano
  2020-09-23 23:47       ` Jonathan Tan
  1 sibling, 1 reply; 170+ messages in thread
From: Phillip Wood @ 2020-09-10 13:50 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

On 09/09/2020 01:49, Emily Shaffer wrote:
> As part of the adoption of config-based hooks, teach run_commit_hook()
> to call hook.h instead of run-command.h. This covers 'pre-commit',
> 'commit-msg', and 'prepare-commit-msg'. Additionally, ask the hook
> library - not run-command - whether any hooks will be run, as it's
> possible hooks may exist in the config but not the hookdir.
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>   builtin/commit.c                                 |  3 ++-
>   builtin/merge.c                                  |  3 ++-
>   commit.c                                         | 13 ++++++++++++-
>   t/t7503-pre-commit-and-pre-merge-commit-hooks.sh | 13 +++++++++++++
>   4 files changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 69ac78d5e5..a19c6478eb 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -36,6 +36,7 @@
>   #include "help.h"
>   #include "commit-reach.h"
>   #include "commit-graph.h"
> +#include "hook.h"
>   
>   static const char * const builtin_commit_usage[] = {
>   	N_("git commit [<options>] [--] <pathspec>..."),
> @@ -985,7 +986,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   		return 0;
>   	}
>   
> -	if (!no_verify && find_hook("pre-commit")) {
> +	if (!no_verify && hook_exists("pre-commit")) {
>   		/*
>   		 * Re-read the index as pre-commit hook could have updated it,
>   		 * and write it out as a tree.  We must do this before we invoke
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 74829a838e..c1a9d0083d 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -41,6 +41,7 @@
>   #include "commit-reach.h"
>   #include "wt-status.h"
>   #include "commit-graph.h"
> +#include "hook.h"
>   
>   #define DEFAULT_TWOHEAD (1<<0)
>   #define DEFAULT_OCTOPUS (1<<1)
> @@ -829,7 +830,7 @@ static void prepare_to_commit(struct commit_list *remoteheads)
>   	 * and write it out as a tree.  We must do this before we invoke
>   	 * the editor and after we invoke run_status above.
>   	 */
> -	if (find_hook("pre-merge-commit"))
> +	if (hook_exists("pre-merge-commit"))
>   		discard_cache();
>   	read_cache_from(index_file);
>   	strbuf_addbuf(&msg, &merge_msg);
> diff --git a/commit.c b/commit.c
> index 4ce8cb38d5..c7a243e848 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -21,6 +21,7 @@
>   #include "commit-reach.h"
>   #include "run-command.h"
>   #include "shallow.h"
> +#include "hook.h"
>   
>   static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
>   
> @@ -1632,8 +1633,13 @@ int run_commit_hook(int editor_is_used, const char *index_file,
>   {
>   	struct strvec hook_env = STRVEC_INIT;
>   	va_list args;
> +	const char *arg;
> +	struct strvec hook_args = STRVEC_INIT;
> +	struct strbuf hook_name = STRBUF_INIT;
>   	int ret;
>   
> +	strbuf_addstr(&hook_name, name);

Seeing this makes me wonder if it would be better for run_hooks() to 
take a string for the name rather than an strbuf, I suspect that 
virtually all callers have a fixed hook name.

Best Wishes

Phillip

>   	strvec_pushf(&hook_env, "GIT_INDEX_FILE=%s", index_file);
>   
>   	/*
> @@ -1643,9 +1649,14 @@ int run_commit_hook(int editor_is_used, const char *index_file,
>   		strvec_push(&hook_env, "GIT_EDITOR=:");
>   
>   	va_start(args, name);
> -	ret = run_hook_ve(hook_env.v, name, args);
> +	while ((arg = va_arg(args, const char *)))
> +		strvec_push(&hook_args, arg);
>   	va_end(args);
> +
> +	ret = run_hooks(hook_env.v, &hook_name, &hook_args);
>   	strvec_clear(&hook_env);
> +	strvec_clear(&hook_args);
> +	strbuf_release(&hook_name);
>   
>   	return ret;
>   }
> diff --git a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
> index b3485450a2..cef8085dcc 100755
> --- a/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
> +++ b/t/t7503-pre-commit-and-pre-merge-commit-hooks.sh
> @@ -103,6 +103,19 @@ test_expect_success 'with succeeding hook' '
>   	test_cmp expected_hooks actual_hooks
>   '
>   
> +# NEEDSWORK: when 'git hook add' and 'git hook remove' have been added, use that
> +# instead
> +test_expect_success 'with succeeding hook (config-based)' '
> +	test_when_finished "git config --unset hook.pre-commit.command success.sample" &&
> +	test_when_finished "rm -f expected_hooks actual_hooks" &&
> +	git config hook.pre-commit.command "$HOOKDIR/success.sample" &&
> +	echo "$HOOKDIR/success.sample" >expected_hooks &&
> +	echo "more" >>file &&
> +	git add file &&
> +	git commit -m "more" &&
> +	test_cmp expected_hooks actual_hooks
> +'
> +
>   test_expect_success 'with succeeding hook (merge)' '
>   	test_when_finished "rm -f \"$PREMERGE\" expected_hooks actual_hooks" &&
>   	cp "$HOOKDIR/success.sample" "$PREMERGE" &&
> 

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs
  2020-09-09  0:49     ` [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs Emily Shaffer
@ 2020-09-10 14:16       ` Phillip Wood
  2020-09-11 13:20         ` Phillip Wood
  0 siblings, 1 reply; 170+ messages in thread
From: Phillip Wood @ 2020-09-10 14:16 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

On 09/09/2020 01:49, Emily Shaffer wrote:
> Taking varargs in run_commit_hook() led to some bizarre patterns, like
> callers using two string variables (which may or may not be filled) to
> express different argument lists for the commit hooks. Because
> run_commit_hook() no longer needs to call a variadic function for the
> hook run itself, we can use strvec to make the calling code more
> conventional.
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>   builtin/commit.c | 46 +++++++++++++++++++++++-----------------------
>   builtin/merge.c  | 20 ++++++++++++++++----
>   commit.c         | 13 ++-----------
>   commit.h         |  5 +++--
>   sequencer.c      | 15 ++++++++-------
>   5 files changed, 52 insertions(+), 47 deletions(-)
> 
> diff --git a/builtin/commit.c b/builtin/commit.c
> index a19c6478eb..f029d4f5ac 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -691,8 +691,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   	struct strbuf committer_ident = STRBUF_INIT;
>   	int committable;
>   	struct strbuf sb = STRBUF_INIT;
> -	const char *hook_arg1 = NULL;
> -	const char *hook_arg2 = NULL;
> +	struct strvec hook_args = STRVEC_INIT;
>   	int clean_message_contents = (cleanup_mode != COMMIT_MSG_CLEANUP_NONE);
>   	int old_display_comment_prefix;
>   	int merge_contains_scissors = 0;
> @@ -700,7 +699,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   	/* This checks and barfs if author is badly specified */
>   	determine_author_info(author_ident);
>   
> -	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit", NULL))
> +	if (!no_verify && run_commit_hook(use_editor, index_file, "pre-commit",
> +					  &hook_args))
>   		return 0;
>   
>   	if (squash_message) {
> @@ -722,27 +722,28 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   		}
>   	}
>   
> +	strvec_push(&hook_args, git_path_commit_editmsg());

This is a long way from the call where we use hook_args. With the 
variadic interface it is clear by looking at the call to 
run_commit_hook() what the first argument is and that is always the same.

>   	if (have_option_m && !fixup_message) {
>   		strbuf_addbuf(&sb, &message);
> -		hook_arg1 = "message";
> +		strvec_push(&hook_args, "message");
>   	} else if (logfile && !strcmp(logfile, "-")) {
>   		if (isatty(0))
>   			fprintf(stderr, _("(reading log message from standard input)\n"));
>   		if (strbuf_read(&sb, 0, 0) < 0)
>   			die_errno(_("could not read log from standard input"));
> -		hook_arg1 = "message";
> +		strvec_push(&hook_args, "message");
>   	} else if (logfile) {
>   		if (strbuf_read_file(&sb, logfile, 0) < 0)
>   			die_errno(_("could not read log file '%s'"),
>   				  logfile);
> -		hook_arg1 = "message";
> +		strvec_push(&hook_args, "message");
>   	} else if (use_message) {
>   		char *buffer;
>   		buffer = strstr(use_message_buffer, "\n\n");
>   		if (buffer)
>   			strbuf_addstr(&sb, skip_blank_lines(buffer + 2));
> -		hook_arg1 = "commit";
> -		hook_arg2 = use_message;
> +		strvec_pushl(&hook_args, "commit", use_message, NULL);
>   	} else if (fixup_message) {
>   		struct pretty_print_context ctx = {0};
>   		struct commit *commit;
> @@ -754,7 +755,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   				      &sb, &ctx);
>   		if (have_option_m)
>   			strbuf_addbuf(&sb, &message);
> -		hook_arg1 = "message";
> +		strvec_push(&hook_args, "message");
>   	} else if (!stat(git_path_merge_msg(the_repository), &statbuf)) {
>   		size_t merge_msg_start;
>   
> @@ -765,9 +766,9 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   		if (!stat(git_path_squash_msg(the_repository), &statbuf)) {
>   			if (strbuf_read_file(&sb, git_path_squash_msg(the_repository), 0) < 0)
>   				die_errno(_("could not read SQUASH_MSG"));
> -			hook_arg1 = "squash";
> +			strvec_push(&hook_args, "squash");
>   		} else
> -			hook_arg1 = "merge";
> +			strvec_push(&hook_args, "merge");
>   
>   		merge_msg_start = sb.len;
>   		if (strbuf_read_file(&sb, git_path_merge_msg(the_repository), 0) < 0)
> @@ -781,11 +782,11 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   	} else if (!stat(git_path_squash_msg(the_repository), &statbuf)) {
>   		if (strbuf_read_file(&sb, git_path_squash_msg(the_repository), 0) < 0)
>   			die_errno(_("could not read SQUASH_MSG"));
> -		hook_arg1 = "squash";
> +		strvec_push(&hook_args, "squash");
>   	} else if (template_file) {
>   		if (strbuf_read_file(&sb, template_file, 0) < 0)
>   			die_errno(_("could not read '%s'"), template_file);
> -		hook_arg1 = "template";
> +		strvec_push(&hook_args, "template");
>   		clean_message_contents = 0;
>   	}
>   
> @@ -794,11 +795,9 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   	 * just set the argument(s) to the prepare-commit-msg hook.
>   	 */
>   	else if (whence == FROM_MERGE)
> -		hook_arg1 = "merge";
> -	else if (is_from_cherry_pick(whence) || whence == FROM_REBASE_PICK) {
> -		hook_arg1 = "commit";
> -		hook_arg2 = "CHERRY_PICK_HEAD";
> -	}
> +		strvec_push(&hook_args, "merge");
> +	else if (is_from_cherry_pick(whence) || whence == FROM_REBASE_PICK)
> +		strvec_pushl(&hook_args, "commit", "CHERRY_PICK_HEAD", NULL);
>   
>   	if (squash_message) {
>   		/*
> @@ -806,8 +805,8 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   		 * then we're possibly hijacking other commit log options.
>   		 * Reset the hook args to tell the real story.
>   		 */
> -		hook_arg1 = "message";
> -		hook_arg2 = "";
> +		strvec_clear(&hook_args);
> +		strvec_pushl(&hook_args, git_path_commit_editmsg(), "message", NULL);

It's a shame we have to clear the strvec and remember to re-add 
git_path_commit_editmsg() here.

>   	}
>   
>   	s->fp = fopen_for_writing(git_path_commit_editmsg());
> @@ -1001,8 +1000,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   		return 0;
>   	}
>   
> -	if (run_commit_hook(use_editor, index_file, "prepare-commit-msg",
> -			    git_path_commit_editmsg(), hook_arg1, hook_arg2, NULL))
> +	if (run_commit_hook(use_editor, index_file, "prepare-commit-msg", &hook_args))
>   		return 0;
>   
>   	if (use_editor) {
> @@ -1017,8 +1015,10 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>   		strvec_clear(&env);
>   	}
>   
> +	strvec_clear(&hook_args);
> +	strvec_push(&hook_args, git_path_commit_editmsg());
>   	if (!no_verify &&
> -	    run_commit_hook(use_editor, index_file, "commit-msg", git_path_commit_editmsg(), NULL)) {
> +	    run_commit_hook(use_editor, index_file, "commit-msg", &hook_args)) {
>   		return 0;
>   	}
>   
> diff --git a/builtin/merge.c b/builtin/merge.c
> index c1a9d0083d..863c9039a3 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -821,10 +821,14 @@ static void write_merge_heads(struct commit_list *);
>   static void prepare_to_commit(struct commit_list *remoteheads)
>   {
>   	struct strbuf msg = STRBUF_INIT;
> +	struct strvec hook_args = STRVEC_INIT;
> +	struct strbuf hook_name = STRBUF_INIT;

As far as I can see hook_name is never used except to free it at the end.

>   	const char *index_file = get_index_file();
>   
> -	if (!no_verify && run_commit_hook(0 < option_edit, index_file, "pre-merge-commit", NULL))
> +	if (!no_verify && run_commit_hook(0 < option_edit, index_file,
> +					  "pre-merge-commit", &hook_args))
>   		abort_commit(remoteheads, NULL);
> +
>   	/*
>   	 * Re-read the index as pre-merge-commit hook could have updated it,
>   	 * and write it out as a tree.  We must do this before we invoke
> @@ -832,6 +836,7 @@ static void prepare_to_commit(struct commit_list *remoteheads)
>   	 */
>   	if (hook_exists("pre-merge-commit"))
>   		discard_cache();
> +
>   	read_cache_from(index_file);
>   	strbuf_addbuf(&msg, &merge_msg);
>   	if (squash)
> @@ -851,17 +856,22 @@ static void prepare_to_commit(struct commit_list *remoteheads)
>   		append_signoff(&msg, ignore_non_trailer(msg.buf, msg.len), 0);
>   	write_merge_heads(remoteheads);
>   	write_file_buf(git_path_merge_msg(the_repository), msg.buf, msg.len);
> +
> +	strvec_clear(&hook_args);
> +	strvec_pushl(&hook_args, git_path_merge_msg(the_repository), "merge", NULL);
>   	if (run_commit_hook(0 < option_edit, get_index_file(), "prepare-commit-msg",
> -			    git_path_merge_msg(the_repository), "merge", NULL))
> +			    &hook_args))
>   		abort_commit(remoteheads, NULL);
> +
>   	if (0 < option_edit) {
>   		if (launch_editor(git_path_merge_msg(the_repository), NULL, NULL))
>   			abort_commit(remoteheads, NULL);
>   	}
>   
> +	strvec_clear(&hook_args);
> +	strvec_push(&hook_args, git_path_merge_msg(the_repository));
>   	if (!no_verify && run_commit_hook(0 < option_edit, get_index_file(),
> -					  "commit-msg",
> -					  git_path_merge_msg(the_repository), NULL))
> +					  "commit-msg", &hook_args))
>   		abort_commit(remoteheads, NULL);
>   
>   	read_merge_msg(&msg);
> @@ -871,6 +881,8 @@ static void prepare_to_commit(struct commit_list *remoteheads)
>   	strbuf_release(&merge_msg);
>   	strbuf_addbuf(&merge_msg, &msg);
>   	strbuf_release(&msg);
> +	strbuf_release(&hook_name);
> +	strvec_clear(&hook_args);
>   }
>   
>   static int merge_trivial(struct commit *head, struct commit_list *remoteheads)
> diff --git a/commit.c b/commit.c
> index c7a243e848..726407152c 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -1629,12 +1629,9 @@ size_t ignore_non_trailer(const char *buf, size_t len)
>   }
>   
>   int run_commit_hook(int editor_is_used, const char *index_file,
> -		    const char *name, ...)
> +		    const char *name, struct strvec *args)
>   {
>   	struct strvec hook_env = STRVEC_INIT;
> -	va_list args;
> -	const char *arg;
> -	struct strvec hook_args = STRVEC_INIT;
>   	struct strbuf hook_name = STRBUF_INIT;
>   	int ret;
>   
> @@ -1648,14 +1645,8 @@ int run_commit_hook(int editor_is_used, const char *index_file,
>   	if (!editor_is_used)
>   		strvec_push(&hook_env, "GIT_EDITOR=:");
>   
> -	va_start(args, name);
> -	while ((arg = va_arg(args, const char *)))
> -		strvec_push(&hook_args, arg);
> -	va_end(args);
> -
> -	ret = run_hooks(hook_env.v, &hook_name, &hook_args);
> +	ret = run_hooks(hook_env.v, &hook_name, args);
>   	strvec_clear(&hook_env);
> -	strvec_clear(&hook_args);
>   	strbuf_release(&hook_name);
>   
>   	return ret;
> diff --git a/commit.h b/commit.h
> index e901538909..978da3c3e0 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -9,6 +9,7 @@
>   #include "string-list.h"
>   #include "pretty.h"
>   #include "commit-slab.h"
> +#include "strvec.h"
>   
>   #define COMMIT_NOT_FROM_GRAPH 0xFFFFFFFF
>   #define GENERATION_NUMBER_INFINITY 0xFFFFFFFF
> @@ -353,7 +354,7 @@ void verify_merge_signature(struct commit *commit, int verbose,
>   int compare_commits_by_commit_date(const void *a_, const void *b_, void *unused);
>   int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_, void *unused);
>   
> -LAST_ARG_MUST_BE_NULL
> -int run_commit_hook(int editor_is_used, const char *index_file, const char *name, ...);
> +int run_commit_hook(int editor_is_used, const char *index_file,
> +		    const char *name, struct strvec *args);
>   
>   #endif /* COMMIT_H */
> diff --git a/sequencer.c b/sequencer.c
> index cc3f8fa88e..5dd4b134d6 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -1124,22 +1124,23 @@ static int run_prepare_commit_msg_hook(struct repository *r,
>   				       const char *commit)
>   {
>   	int ret = 0;
> -	const char *name, *arg1 = NULL, *arg2 = NULL;
> +	struct strvec args = STRVEC_INIT;
> +	const char *name = git_path_commit_editmsg();
>   
> -	name = git_path_commit_editmsg();
> +	strvec_push(&args, name);

I think you could drop name altogether and just pass 
git_path_commit_editmsg() instead.

>   	if (write_message(msg->buf, msg->len, name, 0))
>   		return -1;
>   
>   	if (commit) {
> -		arg1 = "commit";
> -		arg2 = commit;
> +		strvec_push(&args, "commit");
> +		strvec_push(&args, commit);

Complete nit pick but the other conversions all used strvec_pushl() 
rather than two strvec_push() calls.

I don't have a strong opinion about these changes (though I'm not 
particularly enthusiastic). Having to push the arguments in order is not 
particularly convenient and the use of strvec_pushl() means we are 
replacing a small number of variadic calls to run_commit_hook() with a 
larger number of calls to a different variadic interface.

Best Wishes

Phillip

>   	} else {
> -		arg1 = "message";
> +		strvec_push(&args, "message");
>   	}
> -	if (run_commit_hook(0, r->index_file, "prepare-commit-msg", name,
> -			    arg1, arg2, NULL))
> +	if (run_commit_hook(0, r->index_file, "prepare-commit-msg", &args))
>   		ret = error(_("'prepare-commit-msg' hook failed"));
>   
> +	strvec_clear(&args);
>   	return ret;
>   }
>   
> 

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 7/9] hook: replace run-command.h:find_hook
  2020-09-09 20:32       ` Junio C Hamano
@ 2020-09-10 19:08         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-10 19:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Sep 09, 2020 at 01:32:12PM -0700, Junio C Hamano wrote:
> 
> Emily Shaffer <emilyshaffer@google.com> writes:
> 
> > Add a helper to easily determine whether any hooks exist for a given
> > hook event.
> >
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  hook.c | 9 +++++++++
> >  hook.h | 1 +
> >  2 files changed, 10 insertions(+)
> 
> Should we consider the last three patches still work-in-progress
> technology demonstration, or are these meant as a proposal for a new
> API element as-is?

The former. I'm irritated with myself for spending a long time fidgeting
with the wording on this reroll and still forgetting to mark the last
three "RFC" as I had planned to do.

> It is perfectly fine if it is the former.  I just want to make sure
> we share a common understanding on the direction in which we want
> these patches to take us.  Here is my take:
> 
>  - For now, a hook/event that is aware of the config-based hook
>    system is supposed to use hook_exists(), while the traditional
>    ones still use find_hook().  We expect more and more will be
>    converted to the former over time.
> 
>  - Invoking hook scripts under the new world order is done by
>    including hook.h and calling run_hooks(), not by driving the
>    run-command API yourself (I count run_hook_ve() as part of the
>    latter) like the traditional code did.  We expect more and more
>    will be converted to the former over time.
> 
>  - From the point of view of the end users who have been happily
>    using scripts in $GIT_DIR/hooks, everything will stay the same.
>    hook_exists() will find them (by calling find_hook() as a
>    fallback) and run_hooks() will run them (by relying on
>    hook_list() to include them).
> 
> I am guessing that the above gives us a high-level description.

Yes. I am also working on a patch locally to include a config -
optionally users could shut off the $GIT_DIR/hooks, but I don't see us
making that the default behavior any time soon (or ever).

> 
> The new interface needs to be described in hook.h once the series
> graduates from the technology demonstration state, in order to help
> others who want to help updating the callsites of traditional hooks
> to the new API.  And the above three-bullet point list is my attempt
> to figure out what kind of things need to be documented to help
> them.

Sure. Agreed. Thanks for pointing it out - I had planned on updating the
`git help hook` manpage but adding API comments in hook.h had slipped my
mind, so the reminder is useful.

> 
> I am not seeing anything in run_hooks() that consumes input from us
> over pipe, by the way, without which we cannot do things like the
> "pre-receive" hooks under the new world order.  Are they planned to
> come in the future, after these "we feed anything they need from the
> command line and from the enviornment" hooks are dealt with in this
> first pass?

I included this conversion to demonstrate the tech and give people
something to look at (and shout to stop if so needed). I do plan to
include hooks which need piped input; in fact, I'm hoping to target one
such for the next conversion I do. The todo list looks like so:

 1. semantics for checking hook.runHookDir config
 2. convert all the hooks which take input in interesting ways (or, just
 all the hooks)
 3. add user friendliness via 'git hook add', 'git hook edit', etc

 The config semantics are in progress and I'm hoping to send this week.

 As for submission plan, I don't mind including new architecture (if
 unused) except for the code bloat; I'd rather push all the
 "conversions" simultaneously, so users don't have to wonder "is this
 hook a new and supported one, or not?".  I don't mind adding the
 niceties ('git hook add' etc) later as the config is a little annoying
 for a human to write themselves, but not impossible.

  - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 8/9] commit: use config-based hooks
  2020-09-10 13:50       ` Phillip Wood
@ 2020-09-10 22:21         ` Junio C Hamano
  0 siblings, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-09-10 22:21 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Emily Shaffer, git

Phillip Wood <phillip.wood123@gmail.com> writes:

>> +	const char *arg;
>> +	struct strvec hook_args = STRVEC_INIT;
>> +	struct strbuf hook_name = STRBUF_INIT;
>>   	int ret;
>>   +	strbuf_addstr(&hook_name, name);
>
> Seeing this makes me wonder if it would be better for run_hooks() to
> take a string for the name rather than an strbuf, I suspect that
> virtually all callers have a fixed hook name.

Yeah, that is a good point.  It is always a good discipline to keep
the type of the parameters callers need to pass to the minimum.




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 9/9] run_commit_hook: take strvec instead of varargs
  2020-09-10 14:16       ` Phillip Wood
@ 2020-09-11 13:20         ` Phillip Wood
  0 siblings, 0 replies; 170+ messages in thread
From: Phillip Wood @ 2020-09-11 13:20 UTC (permalink / raw)
  To: Emily Shaffer, git

On 10/09/2020 15:16, Phillip Wood wrote:
> Hi Emily
> 
> On 09/09/2020 01:49, Emily Shaffer wrote:
>> Taking varargs in run_commit_hook() led to some bizarre patterns, like
>> callers using two string variables (which may or may not be filled) to
>> express different argument lists for the commit hooks. Because
>> run_commit_hook() no longer needs to call a variadic function for the
>> hook run itself, we can use strvec to make the calling code more
>> conventional.
>>
>> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
>> ---
>>   builtin/commit.c | 46 +++++++++++++++++++++++-----------------------
>>   builtin/merge.c  | 20 ++++++++++++++++----
>>   commit.c         | 13 ++-----------
>>   commit.h         |  5 +++--
>>   sequencer.c      | 15 ++++++++-------
>>   5 files changed, 52 insertions(+), 47 deletions(-)
>>
>> diff --git a/builtin/commit.c b/builtin/commit.c
>> index a19c6478eb..f029d4f5ac 100644
>> --- a/builtin/commit.c
>> +++ b/builtin/commit.c
>> @@ -691,8 +691,7 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>       struct strbuf committer_ident = STRBUF_INIT;
>>       int committable;
>>       struct strbuf sb = STRBUF_INIT;
>> -    const char *hook_arg1 = NULL;
>> -    const char *hook_arg2 = NULL;
>> +    struct strvec hook_args = STRVEC_INIT;
>>       int clean_message_contents = (cleanup_mode != 
>> COMMIT_MSG_CLEANUP_NONE);
>>       int old_display_comment_prefix;
>>       int merge_contains_scissors = 0;
>> @@ -700,7 +699,8 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>       /* This checks and barfs if author is badly specified */
>>       determine_author_info(author_ident);
>> -    if (!no_verify && run_commit_hook(use_editor, index_file, 
>> "pre-commit", NULL))
>> +    if (!no_verify && run_commit_hook(use_editor, index_file, 
>> "pre-commit",
>> +                      &hook_args))
>>           return 0;
>>       if (squash_message) {
>> @@ -722,27 +722,28 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>           }
>>       }
>> +    strvec_push(&hook_args, git_path_commit_editmsg());
> 
> This is a long way from the call where we use hook_args. With the 
> variadic interface it is clear by looking at the call to 
> run_commit_hook() what the first argument is and that is always the same.
> 
>>       if (have_option_m && !fixup_message) {
>>           strbuf_addbuf(&sb, &message);
>> -        hook_arg1 = "message";
>> +        strvec_push(&hook_args, "message");
>>       } else if (logfile && !strcmp(logfile, "-")) {
>>           if (isatty(0))
>>               fprintf(stderr, _("(reading log message from standard 
>> input)\n"));
>>           if (strbuf_read(&sb, 0, 0) < 0)
>>               die_errno(_("could not read log from standard input"));
>> -        hook_arg1 = "message";
>> +        strvec_push(&hook_args, "message");
>>       } else if (logfile) {
>>           if (strbuf_read_file(&sb, logfile, 0) < 0)
>>               die_errno(_("could not read log file '%s'"),
>>                     logfile);
>> -        hook_arg1 = "message";
>> +        strvec_push(&hook_args, "message");
>>       } else if (use_message) {
>>           char *buffer;
>>           buffer = strstr(use_message_buffer, "\n\n");
>>           if (buffer)
>>               strbuf_addstr(&sb, skip_blank_lines(buffer + 2));
>> -        hook_arg1 = "commit";
>> -        hook_arg2 = use_message;
>> +        strvec_pushl(&hook_args, "commit", use_message, NULL);
>>       } else if (fixup_message) {
>>           struct pretty_print_context ctx = {0};
>>           struct commit *commit;
>> @@ -754,7 +755,7 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>                         &sb, &ctx);
>>           if (have_option_m)
>>               strbuf_addbuf(&sb, &message);
>> -        hook_arg1 = "message";
>> +        strvec_push(&hook_args, "message");
>>       } else if (!stat(git_path_merge_msg(the_repository), &statbuf)) {
>>           size_t merge_msg_start;
>> @@ -765,9 +766,9 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>           if (!stat(git_path_squash_msg(the_repository), &statbuf)) {
>>               if (strbuf_read_file(&sb, 
>> git_path_squash_msg(the_repository), 0) < 0)
>>                   die_errno(_("could not read SQUASH_MSG"));
>> -            hook_arg1 = "squash";
>> +            strvec_push(&hook_args, "squash");
>>           } else
>> -            hook_arg1 = "merge";
>> +            strvec_push(&hook_args, "merge");
>>           merge_msg_start = sb.len;
>>           if (strbuf_read_file(&sb, 
>> git_path_merge_msg(the_repository), 0) < 0)
>> @@ -781,11 +782,11 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>       } else if (!stat(git_path_squash_msg(the_repository), &statbuf)) {
>>           if (strbuf_read_file(&sb, 
>> git_path_squash_msg(the_repository), 0) < 0)
>>               die_errno(_("could not read SQUASH_MSG"));
>> -        hook_arg1 = "squash";
>> +        strvec_push(&hook_args, "squash");
>>       } else if (template_file) {
>>           if (strbuf_read_file(&sb, template_file, 0) < 0)
>>               die_errno(_("could not read '%s'"), template_file);
>> -        hook_arg1 = "template";
>> +        strvec_push(&hook_args, "template");
>>           clean_message_contents = 0;
>>       }
>> @@ -794,11 +795,9 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>        * just set the argument(s) to the prepare-commit-msg hook.
>>        */
>>       else if (whence == FROM_MERGE)
>> -        hook_arg1 = "merge";
>> -    else if (is_from_cherry_pick(whence) || whence == 
>> FROM_REBASE_PICK) {
>> -        hook_arg1 = "commit";
>> -        hook_arg2 = "CHERRY_PICK_HEAD";
>> -    }
>> +        strvec_push(&hook_args, "merge");
>> +    else if (is_from_cherry_pick(whence) || whence == FROM_REBASE_PICK)
>> +        strvec_pushl(&hook_args, "commit", "CHERRY_PICK_HEAD", NULL);
>>       if (squash_message) {
>>           /*
>> @@ -806,8 +805,8 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>            * then we're possibly hijacking other commit log options.
>>            * Reset the hook args to tell the real story.
>>            */
>> -        hook_arg1 = "message";
>> -        hook_arg2 = "";
>> +        strvec_clear(&hook_args);
>> +        strvec_pushl(&hook_args, git_path_commit_editmsg(), 
>> "message", NULL);
> 
> It's a shame we have to clear the strvec and remember to re-add 
> git_path_commit_editmsg() here.
> 
>>       }
>>       s->fp = fopen_for_writing(git_path_commit_editmsg());
>> @@ -1001,8 +1000,7 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>           return 0;
>>       }
>> -    if (run_commit_hook(use_editor, index_file, "prepare-commit-msg",
>> -                git_path_commit_editmsg(), hook_arg1, hook_arg2, NULL))
>> +    if (run_commit_hook(use_editor, index_file, "prepare-commit-msg", 
>> &hook_args))
>>           return 0;
>>       if (use_editor) {
>> @@ -1017,8 +1015,10 @@ static int prepare_to_commit(const char 
>> *index_file, const char *prefix,
>>           strvec_clear(&env);
>>       }
>> +    strvec_clear(&hook_args);
>> +    strvec_push(&hook_args, git_path_commit_editmsg());
>>       if (!no_verify &&
>> -        run_commit_hook(use_editor, index_file, "commit-msg", 
>> git_path_commit_editmsg(), NULL)) {
>> +        run_commit_hook(use_editor, index_file, "commit-msg", 
>> &hook_args)) {
>>           return 0;
>>       }
 >[...]
> 
> I don't have a strong opinion about these changes (though I'm not 
> particularly enthusiastic). Having to push the arguments in order is not 
> particularly convenient and the use of strvec_pushl() means we are 
> replacing a small number of variadic calls to run_commit_hook() with a 
> larger number of calls to a different variadic interface.

On reflection I think it is the conversion in builtin/commit.c rather 
than the change in the API that makes me uncomfortable. If it kept 
`hook_arg1` and `hook_arg2` and just did

strvec_push(&hook_args, git_path_commit_editmsg())\
strvec_push(&hook_args, hook_arg1);
if (hook_arg2)
	strvec_push(&hook_args, hook_arg2);
run_commit_hook(..., &hook_args);

It would keep the fixed first argument near the call to 
run_commit_hook() and avoid the problem of having to clear hook_args in 
the hunk at line 806.

Thank you for adding the last couple of patches that show an example 
conversion, it is really helpful to see how the API would be used.

Best Wishes

Phillip

> Best Wishes
> 
> Phillip
> 
>>       } else {
>> -        arg1 = "message";
>> +        strvec_push(&args, "message");
>>       }
>> -    if (run_commit_hook(0, r->index_file, "prepare-commit-msg", name,
>> -                arg1, arg2, NULL))
>> +    if (run_commit_hook(0, r->index_file, "prepare-commit-msg", &args))
>>           ret = error(_("'prepare-commit-msg' hook failed"));
>> +    strvec_clear(&args);
>>       return ret;
>>   }
>>


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 3/9] hook: add list command
  2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
@ 2020-09-11 13:27       ` Phillip Wood
  2020-09-11 16:51         ` Emily Shaffer
  2020-09-23 23:04       ` Jonathan Tan
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 170+ messages in thread
From: Phillip Wood @ 2020-09-11 13:27 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

On 09/09/2020 01:49, Emily Shaffer wrote:
> Teach 'git hook list <hookname>', which checks the known configs in
> order to create an ordered list of hooks to run on a given hook event.
> 
> Multiple commands can be specified for a given hook by providing
> multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
> run in config order. If more properties need to be set on a given hook
> in the future, commands can also be specified by providing
> "hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
> <hookcmd-name>]" subsection; at minimum, this subsection must contain a
> "hookcmd.<hookcmd-name>.command = <path-to-hook>" line.
> 
> For example:
> 
>    $ git config --list | grep ^hook
>    hook.pre-commit.command=baz
>    hook.pre-commit.command=~/bar.sh
>    hookcmd.baz.command=~/baz/from/hookcmd.sh
> 
>    $ git hook list pre-commit
>    ~/baz/from/hookcmd.sh
>    ~/bar.sh
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>   Documentation/git-hook.txt    |  37 +++++++++++-
>   Makefile                      |   1 +
>   builtin/hook.c                |  55 ++++++++++++++++--
>   hook.c                        | 102 ++++++++++++++++++++++++++++++++++
>   hook.h                        |  15 +++++
>   t/t1360-config-based-hooks.sh |  68 ++++++++++++++++++++++-
>   6 files changed, 271 insertions(+), 7 deletions(-)
>   create mode 100644 hook.c
>   create mode 100644 hook.h
> 
> diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
> index 2d50c414cc..e458586e96 100644
> --- a/Documentation/git-hook.txt
> +++ b/Documentation/git-hook.txt
> @@ -8,12 +8,47 @@ git-hook - Manage configured hooks
>   SYNOPSIS
>   --------
>   [verse]
> -'git hook'
> +'git hook' list <hook-name>
>   
>   DESCRIPTION
>   -----------
>   You can list, add, and modify hooks with this command.
>   
> +This command parses the default configuration files for sections "hook" and
> +"hookcmd". "hook" is used to describe the commands which will be run during a
> +particular hook event; commands are run in config order. "hookcmd" is used to
> +describe attributes of a specific command. If additional attributes don't need
> +to be specified, a command to run can be specified directly in the "hook"
> +section; if a "hookcmd" by that name isn't found, Git will attempt to run the
> +provided value directly. For example:
> +
> +Global config
> +----
> +  [hook "post-commit"]
> +    command = "linter"
> +    command = "~/typocheck.sh"
> +
> +  [hookcmd "linter"]
> +    command = "/bin/linter --c"
> +----
> +
> +Local config
> +----
> +  [hook "prepare-commit-msg"]
> +    command = "linter"
> +  [hook "post-commit"]
> +    command = "python ~/run-test-suite.py"
> +----

I think it would be helpful to have a couple of lines explaining what 
the example configuration sets up

> +COMMANDS
> +--------
> +
> +list <hook-name>::
> +
> +List the hooks which have been configured for <hook-name>. Hooks appear
> +in the order they should be run, and note the config scope where the relevant
> +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).

Thanks for clarifying that it is the origin of the 
hook.<hook-name>.command that is printed. An example of the output of 
the config above would be useful I think.

>   GIT
>   ---
>   Part of the linkgit:git[1] suite
> diff --git a/Makefile b/Makefile
> index 6eee75555e..804de45b16 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -890,6 +890,7 @@ LIB_OBJS += grep.o
>   LIB_OBJS += hashmap.o
>   LIB_OBJS += help.o
>   LIB_OBJS += hex.o
> +LIB_OBJS += hook.o
>   LIB_OBJS += ident.o
>   LIB_OBJS += interdiff.o
>   LIB_OBJS += json-writer.o
> diff --git a/builtin/hook.c b/builtin/hook.c
> index b2bbc84d4d..a0759a4c26 100644
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -1,21 +1,68 @@
>   #include "cache.h"
>   
>   #include "builtin.h"
> +#include "config.h"
> +#include "hook.h"
>   #include "parse-options.h"
> +#include "strbuf.h"
>   
>   static const char * const builtin_hook_usage[] = {
> -	N_("git hook"),
> +	N_("git hook list <hookname>"),
>   	NULL
>   };
>   
> -int cmd_hook(int argc, const char **argv, const char *prefix)
> +static int list(int argc, const char **argv, const char *prefix)
>   {
> -	struct option builtin_hook_options[] = {
> +	struct list_head *head, *pos;
> +	struct hook *item;
> +	struct strbuf hookname = STRBUF_INIT;
> +
> +	struct option list_options[] = {
>   		OPT_END(),
>   	};
>   
> -	argc = parse_options(argc, argv, prefix, builtin_hook_options,
> +	argc = parse_options(argc, argv, prefix, list_options,
>   			     builtin_hook_usage, 0);
>   
> +	if (argc < 1) {
> +		usage_msg_opt("a hookname must be provided to operate on.",
> +			      builtin_hook_usage, list_options);
> +	}
> +
> +	strbuf_addstr(&hookname, argv[0]);
> +
> +	head = hook_list(&hookname);
> +
> +	if (list_empty(head)) {
> +		printf(_("no commands configured for hook '%s'\n"),
> +		       hookname.buf);
> +		return 0;
> +	}
> +
> +	list_for_each(pos, head) {
> +		item = list_entry(pos, struct hook, list);
> +		if (item)
> +			printf("%s:\t%s\n",
> +			       config_scope_name(item->origin),
> +			       item->command.buf);
> +	}
> +
> +	clear_hook_list();
> +	strbuf_release(&hookname);
> +
>   	return 0;
>   }
> +
> +int cmd_hook(int argc, const char **argv, const char *prefix)
> +{
> +	struct option builtin_hook_options[] = {
> +		OPT_END(),
> +	};
> +	if (argc < 2)
> +		usage_with_options(builtin_hook_usage, builtin_hook_options);
> +
> +	if (!strcmp(argv[1], "list"))
> +		return list(argc - 1, argv + 1, prefix);
> +
> +	usage_with_options(builtin_hook_usage, builtin_hook_options);
> +}
> diff --git a/hook.c b/hook.c
> new file mode 100644
> index 0000000000..b006950eb8
> --- /dev/null
> +++ b/hook.c
> @@ -0,0 +1,102 @@
> +#include "cache.h"
> +
> +#include "hook.h"
> +#include "config.h"
> +
> +/*
> + * NEEDSWORK: a stateful hook_head means we can't run two hook events in the
> + * background at the same time - which might be ok, or might not.
> + *
> + * Maybe it's better to cache a list head per hookname, since we can probably
> + * guess that the hook list won't change during a user-initiated operation. For
> + * now, within list_hooks, call clear_hook_list() at the outset.
> + */
> +static LIST_HEAD(hook_head);

I can see a cache might be useful for the sequencer which needs to run 
the prepare-msg hook for each commit (it should probably not be running 
the post-commit hook but does at the moment) and for am which runs some 
hooks for each patch but until then I'm not sure why we need a global 
variable here, can't we just declare `hook_head` in `list_hook()`?

> +void free_hook(struct hook *ptr)
> +{
> +	if (ptr) {
> +		strbuf_release(&ptr->command);
> +		free(ptr);
> +	}
> +}
> +
> +static void emplace_hook(struct list_head *pos, const char *command)
> +{
> +	struct hook *to_add = malloc(sizeof(struct hook));
> +	to_add->origin = current_config_scope();
> +	strbuf_init(&to_add->command, 0);
> +	/* even with use_shell, run_command() needs quotes */
> +	strbuf_addf(&to_add->command, "'%s'", command);
> +
> +	list_add_tail(&to_add->list, pos);
> +}
> +
> +static void remove_hook(struct list_head *to_remove)
> +{
> +	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
> +	list_del(to_remove);
> +	free_hook(hook_to_remove);
> +}
> +
> +void clear_hook_list(void)
> +{
> +	struct list_head *pos, *tmp;
> +	list_for_each_safe(pos, tmp, &hook_head)
> +		remove_hook(pos);
> +}
> +
> +static int hook_config_lookup(const char *key, const char *value, void *hook_key_cb)
> +{
> +	const char *hook_key = hook_key_cb;
> +
> +	if (!strcmp(key, hook_key)) {
> +		const char *command = value;
> +		struct strbuf hookcmd_name = STRBUF_INIT;
> +		struct list_head *pos = NULL, *tmp = NULL;
> +
> +		/* Check if a hookcmd with that name exists. */
> +		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
> +		git_config_get_value(hookcmd_name.buf, &command);
> +
> +		if (!command)
> +			BUG("git_config_get_value overwrote a string it shouldn't have");
> +
> +		/*
> +		 * TODO: implement an option-getting callback, e.g.
> +		 *   get configs by pattern hookcmd.$value.*
> +		 *   for each key+value, do_callback(key, value, cb_data)
> +		 */
> +
> +		list_for_each_safe(pos, tmp, &hook_head) {
> +			struct hook *hook = list_entry(pos, struct hook, list);
> +			/*
> +			 * The list of hooks to run can be reordered by being redeclared
> +			 * in the config. Options about hook ordering should be checked
> +			 * here.
> +			 */
> +			if (0 == strcmp(hook->command.buf, command))

We normally write this as !strcmp(...)

> +				remove_hook(pos);
> +		}
> +		emplace_hook(pos, command);
> +	}
> +
> +	return 0;
> +}
> +
> +struct list_head* hook_list(const struct strbuf* hookname)
> +{
> +	struct strbuf hook_key = STRBUF_INIT;
> +
> +	if (!hookname)
> +		return NULL;
> +
> +	/* hook_head is stateful */
> +	clear_hook_list();
> +
> +	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
> +
> +	git_config(hook_config_lookup, (void*)hook_key.buf);
> +
> +	return &hook_head;
> +}
> diff --git a/hook.h b/hook.h
> new file mode 100644
> index 0000000000..aaf6511cff
> --- /dev/null
> +++ b/hook.h
> @@ -0,0 +1,15 @@
> +#include "config.h"
> +#include "list.h"
> +#include "strbuf.h"
> +
> +struct hook
> +{
> +	struct list_head list;
> +	enum config_scope origin;
> +	struct strbuf command;
> +};
> +
> +struct list_head* hook_list(const struct strbuf *hookname);
> +
> +void free_hook(struct hook *ptr);
> +void clear_hook_list(void);
> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index 34b0df5216..46d1ed354a 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -4,8 +4,72 @@ test_description='config-managed multihooks, including git-hook command'
>   
>   . ./test-lib.sh
>   
> -test_expect_success 'git hook command does not crash' '
> -	git hook
> +ROOT=
> +if test_have_prereq MINGW
> +then
> +	# In Git for Windows, Unix-like paths work only in shell scripts;
> +	# `git.exe`, however, will prefix them with the pseudo root directory
> +	# (of the Unix shell). Let's accommodate for that.
> +	ROOT="$(cd / && pwd)"
> +fi
> +
> +setup_hooks () {
> +	test_config hook.pre-commit.command "/path/ghi" --add
> +	test_config_global hook.pre-commit.command "/path/def" --add
> +}
> +
> +setup_hookcmd () {
> +	test_config hook.pre-commit.command "abc" --add
> +	test_config_global hookcmd.abc.command "/path/abc" --add
> +}
> +
> +test_expect_success 'git hook rejects commands without a mode' '
> +	test_must_fail git hook pre-commit
> +'

Thanks for changing the tests to be independent of each other

Best Wishes

Phillip

> +
> +test_expect_success 'git hook rejects commands without a hookname' '
> +	test_must_fail git hook list
> +'
> +
> +test_expect_success 'git hook list orders by config order' '
> +	setup_hooks &&
> +
> +	cat >expected <<-EOF &&
> +	global:	$ROOT/path/def
> +	local:	$ROOT/path/ghi
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'git hook list dereferences a hookcmd' '
> +	setup_hooks &&
> +	setup_hookcmd &&
> +
> +	cat >expected <<-EOF &&
> +	global:	$ROOT/path/def
> +	local:	$ROOT/path/ghi
> +	local:	$ROOT/path/abc
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'git hook list reorders on duplicate commands' '
> +	setup_hooks &&
> +
> +	test_config hook.pre-commit.command "/path/def" --add &&
> +
> +	cat >expected <<-EOF &&
> +	local:	$ROOT/path/ghi
> +	local:	$ROOT/path/def
> +	EOF
> +
> +	git hook list pre-commit >actual &&
> +	test_cmp expected actual
>   '
>   
>   test_done
> 


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 6/9] hook: add 'run' subcommand
  2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
@ 2020-09-11 13:30       ` Phillip Wood
  2020-09-28 19:29       ` Josh Steadmon
  2020-10-05 23:39       ` Jonathan Nieder
  2 siblings, 0 replies; 170+ messages in thread
From: Phillip Wood @ 2020-09-11 13:30 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

On 09/09/2020 01:49, Emily Shaffer wrote:
> In order to enable hooks to be run as an external process, by a
> standalone Git command, or by tools which wrap Git, provide an external
> means to run all configured hook commands for a given hook event.
> 
> For now, the hook commands will in config order, in series. As alternate
> ordering or parallelism is supported in the future, we should add knobs
> to use those to the command line as well.
> 
> As with the legacy hook implementation, all stdout generated by hook
> commands is redirected to stderr. Piping from stdin is not yet
> supported.
> 
> Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
> execution list. For now, there is no way to disable them.
> 
> Users may wish to provide hook commands like 'git config
> hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
> contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
> first split by space or quotes into an argv_array, then expanded with
> 'expand_user_path()'.
> 
 > [...]
> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index ebf8f38d68..ee8114250d 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -84,4 +84,32 @@ test_expect_success 'git hook list --porcelain prints just the command' '
>   	test_cmp expected actual
>   '
>   
> +test_expect_success 'inline hook definitions execute oneliners' '
> +	test_config hook.pre-commit.command "echo \"Hello World\"" &&
> +
> +	echo "Hello World" >expected &&
> +
> +	# hooks are run with stdout_to_stderr = 1
> +	git hook run pre-commit 2>actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'inline hook definitions resolve paths' '
> +	cat >~/sample-hook.sh <<-EOF &&
> +	echo \"Sample Hook\"
> +	EOF

I think this could use `write_script`. I'm rather scared of the '~' in 
the script path, can we write it to the test directory please.

Best Wishes

Phillip

> +	test_when_finished "rm ~/sample-hook.sh" &&
> +
> +	chmod +x ~/sample-hook.sh &&
> +
> +	test_config hook.pre-commit.command "~/sample-hook.sh" &&
> +
> +	echo \"Sample Hook\" >expected &&
> +
> +	# hooks are run with stdout_to_stderr = 1
> +	git hook run pre-commit 2>actual &&
> +	test_cmp expected actual
> +'
> +
>   test_done
> 


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 3/9] hook: add list command
  2020-09-11 13:27       ` Phillip Wood
@ 2020-09-11 16:51         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-11 16:51 UTC (permalink / raw)
  To: Phillip Wood; +Cc: git

On Fri, Sep 11, 2020 at 02:27:42PM +0100, Phillip Wood wrote:
> 
> Hi Emily
> 
> > +Global config
> > +----
> > +  [hook "post-commit"]
> > +    command = "linter"
> > +    command = "~/typocheck.sh"
> > +
> > +  [hookcmd "linter"]
> > +    command = "/bin/linter --c"
> > +----
> > +
> > +Local config
> > +----
> > +  [hook "prepare-commit-msg"]
> > +    command = "linter"
> > +  [hook "post-commit"]
> > +    command = "python ~/run-test-suite.py"
> > +----
> 
> I think it would be helpful to have a couple of lines explaining what the
> example configuration sets up

Sure.

> 
> > +COMMANDS
> > +--------
> > +
> > +list <hook-name>::
> > +
> > +List the hooks which have been configured for <hook-name>. Hooks appear
> > +in the order they should be run, and note the config scope where the relevant
> > +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
> 
> Thanks for clarifying that it is the origin of the hook.<hook-name>.command
> that is printed. An example of the output of the config above would be
> useful I think.

Oh, that's a good idea - you're absolutely right. I'll do that.

> > +/*
> > + * NEEDSWORK: a stateful hook_head means we can't run two hook events in the
> > + * background at the same time - which might be ok, or might not.
> > + *
> > + * Maybe it's better to cache a list head per hookname, since we can probably
> > + * guess that the hook list won't change during a user-initiated operation. For
> > + * now, within list_hooks, call clear_hook_list() at the outset.
> > + */
> > +static LIST_HEAD(hook_head);
> 
> I can see a cache might be useful for the sequencer which needs to run the
> prepare-msg hook for each commit (it should probably not be running the
> post-commit hook but does at the moment) and for am which runs some hooks
> for each patch but until then I'm not sure why we need a global variable
> here, can't we just declare `hook_head` in `list_hook()`?

Yeah, I agree. I'll make that change with the next reroll.

Thanks for reading.
 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
@ 2020-09-23 22:59       ` Jonathan Tan
  2020-09-24 21:54         ` Emily Shaffer
  2020-10-07  9:23       ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2020-09-23 22:59 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

For this review, I'll just concern myself with overall design and
structure.

For this patch, overall I think it's better if there's a clear
distinction between what we are implementing now and what we are
implementing later.

> +[[motivation]]
> +== Motivation
> +
> +Treat hooks as a first-class citizen by replacing the .git/hook/hookname path as
> +the only source of hooks to execute, in a way which is friendly to users with
> +multiple repos which have similar needs.

I don't understand what "first-class citizen" here means - probably
better to just omit that phrase and describe the new way of doing hooks.

> +[[config-schema-hook]]
> +==== `hook`
> +
> +Primarily contains subsections for each hook event. These order of these
> +subsections defines the hook command execution order

The execution order is defined by the order of a multivalue config
variable, I think, not the order of subsections? Besides, I believe that
there's one subsection per hook event (e.g. hook."pre-commit"), not one
subsection per command.

> ; hook commands can be
> +specified by setting the value directly to the command if no additional
> +configuration is needed, or by setting the value as the name of a `hookcmd`. If
> +Git does not find a `hookcmd` whose subsection matches the value of the given
> +command string, Git will try to execute the string directly. Hooks are executed
> +by passing the resolved command string to the shell.

[snip]

> Hook event subsections can
> +also contain per-hook-event settings.

If this is not yet implemented, maybe list under "future work".

> +
> +Also contains top-level hook execution settings, for example,
> +`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`. (These settings are
> +described more in <<library,Library>>.)

I think it's clearer if you list this under "future work" - I didn't see
any implementation of this.

> +[hook "pre-commit"]
> +  command = perl-linter
> +  command = /usr/bin/git-secrets --pre-commit
> +
> +[hook "pre-applypatch"]
> +  command = perl-linter
> +  error = ignore

Is "error" implemented?

> +
> +[hook]
> +  runHookDir = interactive

Same question for "runHookDir".

> +[[config-schema-hookcmd]]
> +==== `hookcmd`
> +
> +Defines a hook command and its attributes, which will be used when a hook event
> +occurs. Unqualified attributes are assumed to apply to this hook during all hook
> +events, but event-specific attributes can also be supplied. The example runs
> +`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
> +include this config, the hook command will be skipped for all events to which
> +it's normally subscribed _except_ `pre-commit`.
> +
> +----
> +[hookcmd "perl-linter"]
> +  command = /usr/bin/lint-it --language=perl
> +  skip = true
> +  pre-commit-skip = false
> +----

And the skips. (And several more below which I will skip.)

> +If the caller wants to do something more complicated, the hook library can also
> +provide a callback API:
> +
> +*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*

Is there a use case that would need such a function?

> +[[migration]]
> +=== Migration path
> +
> +[[stage-0]]
> +==== Stage 0
> +
> +Hooks are called by running `run-command.h:find_hook()` with the hookname and
> +executing the result. The hook library and builtin do not exist. Hooks only
> +exist as specially named scripts within `.git/hooks/`.
> +
> +[[stage-1]]
> +==== Stage 1
> +
> +`git hook list --porcelain <hook-event>` is implemented. Users can replace their
> +`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
> +output. Modifier commands like `git hook add` and `git hook edit` can be
> +implemented around this time as well.

This seems to contradict patch 8, which teaches Git to use the configs
directly without any change to .git/hooks/<hook-event> (at least for
certain commit-related hooks).

> +[[future-work]]
> +== Future work
> +
> +[[execution-ordering]]
> +=== Execution ordering
> +
> +We may find that config order is insufficient for some users; for example,
> +config order makes it difficult to add a new hook to the system or global config
> +which runs at the end of the hook list. A new ordering schema should be:
> +
> +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> +their order change;
> +
> +2) Either dependency or numerically based.
> +
> +Dependency-based ordering is prone to classic linked-list problems, like a
> +cycles and handling of missing dependencies. But, it paves the way for enabling
> +parallelization if some tasks truly depend on others.
> +
> +Numerical ordering makes it tricky for Git to generate suggested ordering
> +numbers for each command, but is easy to determine a definitive order.

With this schema, and with the "skip" behavior described above (but not
implemented in this patch set), rudimentary ordering can already be
done; because a hook is removed and reinserted whenever it appears in
the config, even a hook X in the system config can be made to run after a
hook Y in the worktree config by adding Y then X in the worktree config,
and if we want to disable X instead, we can just add "skip" to X.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 3/9] hook: add list command
  2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
  2020-09-11 13:27       ` Phillip Wood
@ 2020-09-23 23:04       ` Jonathan Tan
  2020-10-06 20:46         ` Emily Shaffer
  2020-09-27 19:23       ` Martin Ågren
  2020-10-05 23:27       ` Jonathan Nieder
  3 siblings, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2020-09-23 23:04 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

>   $ git hook list pre-commit
>   ~/baz/from/hookcmd.sh
>   ~/bar.sh

In the tests below, there is a "local:" prefix (or similar). It's
clearer if the commit message has that too.

Also, looking at a later commit, the "list" command probably should
include the legacy hook if it exists.

> +static void emplace_hook(struct list_head *pos, const char *command)
> +{
> +	struct hook *to_add = malloc(sizeof(struct hook));
> +	to_add->origin = current_config_scope();
> +	strbuf_init(&to_add->command, 0);
> +	/* even with use_shell, run_command() needs quotes */
> +	strbuf_addf(&to_add->command, "'%s'", command);
> +
> +	list_add_tail(&to_add->list, pos);
> +}

It might be odd to a programmer reading this that an existing "struct
hook" with the same name is not reused - the scanning of the list done
in hook_config_lookup() could probably go here instead.

> +test_expect_success 'git hook list orders by config order' '
> +	setup_hooks &&
> +
> +	cat >expected <<-EOF &&
> +	global:	$ROOT/path/def
> +	local:	$ROOT/path/ghi

Will the "global" strings etc. be translated? If yes, it's probably not
worth it to align the paths in this way.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 7/9] hook: replace run-command.h:find_hook
  2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
  2020-09-09 20:32       ` Junio C Hamano
@ 2020-09-23 23:20       ` Jonathan Tan
  2020-10-05 23:42       ` Jonathan Nieder
  2 siblings, 0 replies; 170+ messages in thread
From: Jonathan Tan @ 2020-09-23 23:20 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> +int hook_exists(const char *hookname)
> +{
> +	const char *value = NULL;
> +	struct strbuf hook_key = STRBUF_INIT;
> +	strbuf_addf(&hook_key, "hook.%s.command", hookname);
> +
> +	return (!git_config_get_value(hook_key.buf, &value)) || !!find_hook(hookname);
> +}

I was surprised that this didn't share code with hook_list. Upon further
thought, hook_list might be expensive if hooks are present, but if we
can cache results, I think it's worth it. A caller that calls this
function usually will run hooks if they are present, so it's not wasted
work to construct the hook list.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 8/9] commit: use config-based hooks
  2020-09-09  0:49     ` [PATCH v4 8/9] commit: use config-based hooks Emily Shaffer
  2020-09-10 13:50       ` Phillip Wood
@ 2020-09-23 23:47       ` Jonathan Tan
  2020-10-05 21:27         ` Emily Shaffer
  1 sibling, 1 reply; 170+ messages in thread
From: Jonathan Tan @ 2020-09-23 23:47 UTC (permalink / raw)
  To: emilyshaffer; +Cc: git, Jonathan Tan

> -	if (!no_verify && find_hook("pre-commit")) {
> +	if (!no_verify && hook_exists("pre-commit")) {

A reviewer would probably need to look at all instances of "pre-commit"
(and likewise for the other hooks) but if the plan is to convert all
hooks, then the reviewer wouldn't need to do this since we could just
delete the "find_hook" function.

Overall comments about the design and scope of the patch set:

 - I think that the abilities of the current patch set regarding
   overriding order of globally-set hook commands is sufficient. We
   should also have some way of disabling globally-set hooks, perhaps
   by implementing the "skip" variable mentioned in patch 1 or by
   allowing the redefinition of hookcmd sections (e.g. by redefining a
   command to "/usr/bin/true"). To me, these provide substantial
   user-facing value, and would be sufficient for a first version - and
   other things like parallelization can come later.

 - As for the UI that should be exposed through the "git hook" command,
   I think that "git hook list" and "git hook run" are sufficient.
   Editing the config files are not too difficult, and "git hook add"
   etc. can be added later.

 - As for whether (1) it is OK for none of the hooks to be converted (and
   instead rely on the user to edit their hook scripts to call "git hook
   run ???"), or if (2) we should require some hooks to be
   converted, or if (3) we should require all hooks to be converted: I'd
   rather have (2) or (3) so that we don't have dead code. I prefer (3),
   especially since a reviewer wouldn't have to worry about leftover
   usages of old functions like find_hook() (as I mentioned at the start
   of this email), but I'm not fully opposed to (2) either.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-09-23 22:59       ` Jonathan Tan
@ 2020-09-24 21:54         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-09-24 21:54 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Wed, Sep 23, 2020 at 03:59:10PM -0700, Jonathan Tan wrote:
> 
> For this review, I'll just concern myself with overall design and
> structure.

Thanks - the design doc is now slightly old, so it's nice to have some
fresh eyes on it.

> 
> For this patch, overall I think it's better if there's a clear
> distinction between what we are implementing now and what we are
> implementing later.

I took a light hand when I checked for this - the topic isn't complete
yet, and there's some work in the design doc which I want to include in
this topic, but which hasn't been sent around (or written) yet.

> 
> > +[[motivation]]
> > +== Motivation
> > +
> > +Treat hooks as a first-class citizen by replacing the .git/hook/hookname path as
> > +the only source of hooks to execute, in a way which is friendly to users with
> > +multiple repos which have similar needs.
> 
> I don't understand what "first-class citizen" here means - probably
> better to just omit that phrase and describe the new way of doing hooks.

Sure.

> 
> > +[[config-schema-hook]]
> > +==== `hook`
> > +
> > +Primarily contains subsections for each hook event. These order of these
> > +subsections defines the hook command execution order
> 
> The execution order is defined by the order of a multivalue config
> variable, I think, not the order of subsections? Besides, I believe that
> there's one subsection per hook event (e.g. hook."pre-commit"), not one
> subsection per command.

Ok. Have changed to "The order of variables in these subsections
defines..."

> 
> > ; hook commands can be
> > +specified by setting the value directly to the command if no additional
> > +configuration is needed, or by setting the value as the name of a `hookcmd`. If
> > +Git does not find a `hookcmd` whose subsection matches the value of the given
> > +command string, Git will try to execute the string directly. Hooks are executed
> > +by passing the resolved command string to the shell.
> 
> [snip]
> 
> > Hook event subsections can
> > +also contain per-hook-event settings.
> 
> If this is not yet implemented, maybe list under "future work".

Good idea. Done.

> 
> > +
> > +Also contains top-level hook execution settings, for example,
> > +`hook.warnHookDir`, `hook.runHookDir`, or `hook.disableAll`. (These settings are
> > +described more in <<library,Library>>.)
> 
> I think it's clearer if you list this under "future work" - I didn't see
> any implementation of this.

Yeah, this is out of sync with what the implementation ended up looking
like; disableAll might still be a useful thing to include in the initial
feature topic, so I won't remove it, but warnHookDir is not necessary.

> 
> > +[hook "pre-commit"]
> > +  command = perl-linter
> > +  command = /usr/bin/git-secrets --pre-commit
> > +
> > +[hook "pre-applypatch"]
> > +  command = perl-linter
> > +  error = ignore
> 
> Is "error" implemented?

No, have marked it with a comment.

> 
> > +
> > +[hook]
> > +  runHookDir = interactive
> 
> Same question for "runHookDir".

It is in the reroll I'm trying to get out this week :)

> 
> > +[[config-schema-hookcmd]]
> > +==== `hookcmd`
> > +
> > +Defines a hook command and its attributes, which will be used when a hook event
> > +occurs. Unqualified attributes are assumed to apply to this hook during all hook
> > +events, but event-specific attributes can also be supplied. The example runs
> > +`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
> > +include this config, the hook command will be skipped for all events to which
> > +it's normally subscribed _except_ `pre-commit`.
> > +
> > +----
> > +[hookcmd "perl-linter"]
> > +  command = /usr/bin/lint-it --language=perl
> > +  skip = true
> > +  pre-commit-skip = false
> > +----
> 
> And the skips. (And several more below which I will skip.)

Again, this is in the reroll I'm working on.

> 
> > +If the caller wants to do something more complicated, the hook library can also
> > +provide a callback API:
> > +
> > +*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
> 
> Is there a use case that would need such a function?

I'm not sure yet - but I'm not quite ready to cut it from the design
doc, until I finish migrating the existing hooks and know that it's not
needed. At that point it'll make sense to move it into the future work
section.

> 
> > +[[migration]]
> > +=== Migration path
> > +
> > +[[stage-0]]
> > +==== Stage 0
> > +
> > +Hooks are called by running `run-command.h:find_hook()` with the hookname and
> > +executing the result. The hook library and builtin do not exist. Hooks only
> > +exist as specially named scripts within `.git/hooks/`.
> > +
> > +[[stage-1]]
> > +==== Stage 1
> > +
> > +`git hook list --porcelain <hook-event>` is implemented. Users can replace their
> > +`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
> > +output. Modifier commands like `git hook add` and `git hook edit` can be
> > +implemented around this time as well.
> 
> This seems to contradict patch 8, which teaches Git to use the configs
> directly without any change to .git/hooks/<hook-event> (at least for
> certain commit-related hooks).

Yeah, I think this needs to be rephrased; at this point locally I've
completely removed the --porcelain patch - I'm pretty sure it needs to
be a format string instead.

> 
> > +[[future-work]]
> > +== Future work
> > +
> > +[[execution-ordering]]
> > +=== Execution ordering
> > +
> > +We may find that config order is insufficient for some users; for example,
> > +config order makes it difficult to add a new hook to the system or global config
> > +which runs at the end of the hook list. A new ordering schema should be:
> > +
> > +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> > +their order change;
> > +
> > +2) Either dependency or numerically based.
> > +
> > +Dependency-based ordering is prone to classic linked-list problems, like a
> > +cycles and handling of missing dependencies. But, it paves the way for enabling
> > +parallelization if some tasks truly depend on others.
> > +
> > +Numerical ordering makes it tricky for Git to generate suggested ordering
> > +numbers for each command, but is easy to determine a definitive order.
> 
> With this schema, and with the "skip" behavior described above (but not
> implemented in this patch set), rudimentary ordering can already be
> done; because a hook is removed and reinserted whenever it appears in
> the config, even a hook X in the system config can be made to run after a
> hook Y in the worktree config by adding Y then X in the worktree config,
> and if we want to disable X instead, we can just add "skip" to X.

Yep, that's why reordering is in the future work section :)

The problem with config ordering is like such: if I want everyone in
my enterprise to run 'git-secrets --prepare-commit-msg' as the very last
prepare-commit-msg hook, but I can only ship them an /etc/gitconfig,
then the best I can do is email my users and ask them to run 'git config
hook.prepare-commit-msg.command git-secrets-prepare-commit-msg' in every
new repo and include a 'hookcmd.git-secrets-prepare-commit-msg.command'
config in the /etc/gitconfig I ship. (I mention git-secrets here because
it's possible other hooks could have introduced credential secrets into
my user's commit message after git-secrets already ran.)

 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 3/9] hook: add list command
  2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
  2020-09-11 13:27       ` Phillip Wood
  2020-09-23 23:04       ` Jonathan Tan
@ 2020-09-27 19:23       ` Martin Ågren
  2020-10-06 20:20         ` Emily Shaffer
  2020-10-05 23:27       ` Jonathan Nieder
  3 siblings, 1 reply; 170+ messages in thread
From: Martin Ågren @ 2020-09-27 19:23 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: Git Mailing List

Hi Emily,

On Wed, 9 Sep 2020 at 02:54, Emily Shaffer <emilyshaffer@google.com> wrote:

>  DESCRIPTION
>  -----------
>  You can list, add, and modify hooks with this command.

(BTW, I think this patch could teach this to say "You can list hooks
with this command." If/when we add the other commands, we can expand
on this.)

> +This command parses the default configuration files for sections "hook" and
> +"hookcmd". "hook" is used to describe the commands which will be run during a

I propose s/"hook"/`hook`/ and similar to set this as monospace since we
are discussing configuration sections. If we want to avoid starting
sentences with "hook" (or `hookcmd`; do we?), maybe something like "The
section `hook` ..." would work fine.

> +particular hook event; commands are run in config order. "hookcmd" is used to

"config order" feels a bit too colloquial/vague. You use the same phrase
in the commit message and I think it works well there for the indented
audience. But for this document, I'm not so sure. How about

  Commands are run in the order they are encountered as the Git
  configuration files are processed (see linkgit:git-config[1]).

? It's also quite possible that "config order" hits the exact right tone
-- please trust your judgment.

> +describe attributes of a specific command. If additional attributes don't need
> +to be specified, a command to run can be specified directly in the "hook"
> +section; if a "hookcmd" by that name isn't found, Git will attempt to run the
> +provided value directly. For example:

> +  [hook "post-commit"]
> +    command = "linter"
> +    command = "~/typocheck.sh"
> +
> +  [hookcmd "linter"]
> +    command = "/bin/linter --c"

Hmm. "hook", "command" and "hookcmd". Should that be "cmd", or
"hookcommand"? I'd favour the latter, but the current proposal somehow
feels asymmetric. (If code uses, and is consistent about using,
"hookcmd" that's another thing entirely, I think. It's just that for the
configuration, it looks a bit odd.)

> +List the hooks which have been configured for <hook-name>. Hooks appear

`<hook-name>` with backticks.

> +in the order they should be run, and note the config scope where the relevant
> +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).

I had to read and re-read this a few times. The "and note the" does not
mean "and please observe that", but rather "and they make note of". Not
sure how that can be done clearer. The second thing that tripped me up
was that last part. Maybe end the sentence after "specified", then add
something like "The scope is not affected by if and where
`hookcmd.<hook-name>.command` appears.".

I think you could add

  CONFIGURATION
  -------------
  include::config/hook.txt[]

here and add such a file

  hook.<hook-name>.command::
         ...

  hookcmd.<hook-name>.command::
         ...

where you define/describe those items. And you can include it from
config.txt as well.

Martin

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 4/9] hook: add --porcelain to list command
  2020-09-09  0:49     ` [PATCH v4 4/9] hook: add --porcelain to " Emily Shaffer
@ 2020-09-28 19:29       ` Josh Steadmon
  0 siblings, 0 replies; 170+ messages in thread
From: Josh Steadmon @ 2020-09-28 19:29 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

On 2020.09.08 17:49, Emily Shaffer wrote:
> Teach 'git hook list --porcelain <hookname>', which prints simply the
> commands to be run in the order suggested by the config. This option is
> intended for use by user scripts, wrappers, or out-of-process Git
> commands which still want to execute hooks. For example, the following
> snippet might be added to git-send-email.perl to introduce a
> `pre-send-email` hook:
> 
>   sub pre_send_email {
>     open(my $fh, 'git hook list --porcelain pre-send-email |');
>     chomp(my @hooks = <$fh>);
>     close($fh);
> 
>     foreach $hook (@hooks) {
>             system $hook
>     }
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  Documentation/git-hook.txt    | 13 +++++++++++--
>  builtin/hook.c                | 17 +++++++++++++----
>  t/t1360-config-based-hooks.sh | 12 ++++++++++++
>  3 files changed, 36 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
> index e458586e96..0854035ce2 100644
> --- a/Documentation/git-hook.txt
> +++ b/Documentation/git-hook.txt
> @@ -8,7 +8,7 @@ git-hook - Manage configured hooks
>  SYNOPSIS
>  --------
>  [verse]
> -'git hook' list <hook-name>
> +'git hook' list [--porcelain] <hook-name>
>  
>  DESCRIPTION
>  -----------
> @@ -43,11 +43,20 @@ Local config
>  COMMANDS
>  --------
>  
> -list <hook-name>::
> +list [--porcelain] <hook-name>::
>  
>  List the hooks which have been configured for <hook-name>. Hooks appear
>  in the order they should be run, and note the config scope where the relevant
>  `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
> ++
> +If `--porcelain` is specified, instead print the commands alone, separated by
> +newlines, for easy parsing by a script.
> +
> +OPTIONS
> +-------
> +--porcelain::
> +	With `list`, print the commands in the order they should be run,
> +	separated by newlines, for easy parsing by a script.

Rather than a hard-coded porcelain format, perhaps we could accept a
format string to allow callers to specify which items they want, for
greater forwards-compatibility?

Also, we may want a "-z / --null" option like in `git config` to delimit
by null bytes rather than newlines, in case any commands end up with
embedded newlines.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 6/9] hook: add 'run' subcommand
  2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
  2020-09-11 13:30       ` Phillip Wood
@ 2020-09-28 19:29       ` Josh Steadmon
  2020-10-05 23:39       ` Jonathan Nieder
  2 siblings, 0 replies; 170+ messages in thread
From: Josh Steadmon @ 2020-09-28 19:29 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

On 2020.09.08 17:49, Emily Shaffer wrote:
> In order to enable hooks to be run as an external process, by a
> standalone Git command, or by tools which wrap Git, provide an external
> means to run all configured hook commands for a given hook event.
> 
> For now, the hook commands will in config order, in series. As alternate

Looks like a small typo here:
s/will in config order/will run in config order/

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 8/9] commit: use config-based hooks
  2020-09-23 23:47       ` Jonathan Tan
@ 2020-10-05 21:27         ` Emily Shaffer
  2020-10-05 23:48           ` Jonathan Nieder
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-10-05 21:27 UTC (permalink / raw)
  To: Jonathan Tan, Junio C Hamano; +Cc: git

On Wed, Sep 23, 2020 at 04:47:34PM -0700, Jonathan Tan wrote:
> 
> > -	if (!no_verify && find_hook("pre-commit")) {
> > +	if (!no_verify && hook_exists("pre-commit")) {
> 
> A reviewer would probably need to look at all instances of "pre-commit"
> (and likewise for the other hooks) but if the plan is to convert all
> hooks, then the reviewer wouldn't need to do this since we could just
> delete the "find_hook" function.
> 
> Overall comments about the design and scope of the patch set:
> 
>  - I think that the abilities of the current patch set regarding
>    overriding order of globally-set hook commands is sufficient. We
>    should also have some way of disabling globally-set hooks, perhaps
>    by implementing the "skip" variable mentioned in patch 1 or by
>    allowing the redefinition of hookcmd sections (e.g. by redefining a
>    command to "/usr/bin/true"). To me, these provide substantial
>    user-facing value, and would be sufficient for a first version - and
>    other things like parallelization can come later.

OK. I will send 'skip' in the next reroll. Thanks for pointing it out!

> 
>  - As for the UI that should be exposed through the "git hook" command,
>    I think that "git hook list" and "git hook run" are sufficient.
>    Editing the config files are not too difficult, and "git hook add"
>    etc. can be added later.
> 
>  - As for whether (1) it is OK for none of the hooks to be converted (and
>    instead rely on the user to edit their hook scripts to call "git hook
>    run ???"), or if (2) we should require some hooks to be
>    converted, or if (3) we should require all hooks to be converted: I'd
>    rather have (2) or (3) so that we don't have dead code. I prefer (3),
>    especially since a reviewer wouldn't have to worry about leftover
>    usages of old functions like find_hook() (as I mentioned at the start
>    of this email), but I'm not fully opposed to (2) either.

I personally prefer (3) - I think the user experience with (2) in a
release (or even in 'next', which all Googlers use) is pretty bad. The
downside, of course, is that a large topic gets merged all at once and
makes some pretty nasty reviewer overhead.

Junio, I wonder if you can give any advice here? What would be really
ideal for me would be to do something like Stolee has been doing with
his maintenance series - config-based hooks pt. I containing the library
code and config-based hooks pt. II containing the conversion of
preexisting hooks. Does that make the overhead for you significantly
worse?

 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 2/9] hook: scaffolding for git-hook subcommand
  2020-09-09  0:49     ` [PATCH v4 2/9] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-10-05 23:24       ` Jonathan Nieder
  2020-10-06 19:06         ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Nieder @ 2020-10-05 23:24 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi,

Emily Shaffer wrote:

> Introduce infrastructure for a new subcommand, git-hook, which will be
> used to ease config-based hook management. This command will handle
> parsing configs to compose a list of hooks to run for a given event, as
> well as adding or modifying hook configs in an interactive fashion.
>
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  .gitignore                    |  1 +
>  Documentation/git-hook.txt    | 19 +++++++++++++++++++
>  Makefile                      |  1 +
>  builtin.h                     |  1 +
>  builtin/hook.c                | 21 +++++++++++++++++++++
>  git.c                         |  1 +
>  t/t1360-config-based-hooks.sh | 11 +++++++++++
>  7 files changed, 55 insertions(+)
>  create mode 100644 Documentation/git-hook.txt
>  create mode 100644 builtin/hook.c
>  create mode 100755 t/t1360-config-based-hooks.sh

optional: I could imagine this being squashed into patch 3 --- that way,
the command has functionality as soon as it exists.  Alternatively:

[...]
> --- /dev/null
> +++ b/Documentation/git-hook.txt
> @@ -0,0 +1,19 @@
> +git-hook(1)
> +===========
> +
> +NAME
> +----
> +git-hook - Manage configured hooks
> +
> +SYNOPSIS
> +--------
> +[verse]
> +'git hook'
> +
> +DESCRIPTION
> +-----------
> +You can list, add, and modify hooks with this command.

This could say something like "This is a placeholder command that will
gain functionality in subsequent patches" to make the current state
clear.

[...]
> --- a/git.c
> +++ b/git.c
> @@ -519,6 +519,7 @@ static struct cmd_struct commands[] = {
>  	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
>  	{ "hash-object", cmd_hash_object },
>  	{ "help", cmd_help },
> +	{ "hook", cmd_hook, RUN_SETUP },

This makes the command require that it run within a git repository,
but I can imagine wanting to list hooks outside of any.  Should it use
RUN_SETUP_GENTLY instead?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 3/9] hook: add list command
  2020-09-09  0:49     ` [PATCH v4 3/9] hook: add list command Emily Shaffer
                         ` (2 preceding siblings ...)
  2020-09-27 19:23       ` Martin Ågren
@ 2020-10-05 23:27       ` Jonathan Nieder
  3 siblings, 0 replies; 170+ messages in thread
From: Jonathan Nieder @ 2020-10-05 23:27 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer wrote:

> --- a/Documentation/git-hook.txt
> +++ b/Documentation/git-hook.txt
> @@ -8,12 +8,47 @@ git-hook - Manage configured hooks
[...]
> +COMMANDS
> +--------
> +
> +list <hook-name>::
> +
> +List the hooks which have been configured for <hook-name>. Hooks appear
> +in the order they should be run, and note the config scope where the relevant
> +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).

A little bit of futureproofing: this may want to mention that the
output is intended to be human-readable and is subject to change over
time (scripters beware!).

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 5/9] parse-options: parse into strvec
  2020-09-09  0:49     ` [PATCH v4 5/9] parse-options: parse into strvec Emily Shaffer
@ 2020-10-05 23:30       ` Jonathan Nieder
  2020-10-06  4:49         ` Junio C Hamano
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Nieder @ 2020-10-05 23:30 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer wrote:

> This is useful if collecting generic arguments to pass through to
> another command, for example, 'git hook run --arg "--quiet" --arg
> "--format=pretty" some-hook'. The resulting strvec would contain
> { "--quiet", "--format=pretty" }.

An alternative is to use OPT_STRING_LIST and then convert in the
caller.  One advantage of that is that it would guarantee the behavior
with --no-arg etc is going to match exactly.

I prefer this OPT_STRVEC approach nonetheless.  Can the
parse_opt_strvec and parse_opt_string_list functions get comments
pointing to each other as an alternative way to encourage that kind of
consistency?

[...]
> --- a/Documentation/technical/api-parse-options.txt
> +++ b/Documentation/technical/api-parse-options.txt
> @@ -173,6 +173,11 @@ There are some macros to easily define options:
>  	The string argument is stored as an element in `string_list`.
>  	Use of `--no-option` will clear the list of preceding values.
>  
> +`OPT_ARGV_ARRAY(short, long, &struct argv_array, arg_str, description)`::

nit: this should be OPT_STRVEC

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 6/9] hook: add 'run' subcommand
  2020-09-09  0:49     ` [PATCH v4 6/9] hook: add 'run' subcommand Emily Shaffer
  2020-09-11 13:30       ` Phillip Wood
  2020-09-28 19:29       ` Josh Steadmon
@ 2020-10-05 23:39       ` Jonathan Nieder
  2020-10-06 22:57         ` Emily Shaffer
  2 siblings, 1 reply; 170+ messages in thread
From: Jonathan Nieder @ 2020-10-05 23:39 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Hi,

Emily Shaffer wrote:

> In order to enable hooks to be run as an external process, by a
> standalone Git command, or by tools which wrap Git, provide an external
> means to run all configured hook commands for a given hook event.

Exciting!

I would even be tempted to put this earlier in the series: providing a
"git hook run" command that only supports legacy hooks and then
improving it from there to support config-based hooks.  This ordering is
also fine, though.

[...]
> ---
>  builtin/hook.c                | 30 ++++++++++++++++++++
>  hook.c                        | 52 ++++++++++++++++++++++++++++++++---
>  hook.h                        |  3 ++
>  t/t1360-config-based-hooks.sh | 28 +++++++++++++++++++
>  4 files changed, 109 insertions(+), 4 deletions(-)

Needs docs.

[...]
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -5,9 +5,11 @@
[...]
> +static int run(int argc, const char **argv, const char *prefix)
> +{
> +	struct strbuf hookname = STRBUF_INIT;
> +	struct strvec envs = STRVEC_INIT;
> +	struct strvec args = STRVEC_INIT;
> +
> +	struct option run_options[] = {
> +		OPT_STRVEC('e', "env", &envs, N_("var"),
> +			   N_("environment variables for hook to use")),
> +		OPT_STRVEC('a', "arg", &args, N_("args"),
> +			   N_("argument to pass to hook")),
> +		OPT_END(),
> +	};
> +
> +	argc = parse_options(argc, argv, prefix, run_options,
> +			     builtin_hook_usage, 0);
> +
> +	if (argc < 1)
> +		usage_msg_opt(_("a hookname must be provided to operate on."),
> +			      builtin_hook_usage, run_options);

Error message nit: what does it mean to operate on a hookname?

Perhaps this should allude to the usage string?

	usage_msg_opt(_("missing <hookname> parameter"), ...);

Or to match the conversational approach of commands like "clone":

	usage_msg_opt(_("You must specify a hook to run."), ...);

[...]
> --- a/hook.c
> +++ b/hook.c
> @@ -2,6 +2,7 @@
>  
>  #include "hook.h"
>  #include "config.h"
> +#include "run-command.h"
>  
>  /*
>   * NEEDSWORK: a stateful hook_head means we can't run two hook events in the
> @@ -21,13 +22,15 @@ void free_hook(struct hook *ptr)
>  	}
>  }
>  
> -static void emplace_hook(struct list_head *pos, const char *command)
> +static void emplace_hook(struct list_head *pos, const char *command, int quoted)
>  {
>  	struct hook *to_add = malloc(sizeof(struct hook));
>  	to_add->origin = current_config_scope();
>  	strbuf_init(&to_add->command, 0);
> -	/* even with use_shell, run_command() needs quotes */
> -	strbuf_addf(&to_add->command, "'%s'", command);
> +	if (quoted)
> +		strbuf_addf(&to_add->command, "'%s'", command);
> +	else
> +		strbuf_addstr(&to_add->command, command);
>  
>  	list_add_tail(&to_add->list, pos);
>  }

This would need to use sq_quote_* to be safe, but we can do something
simpler: if we accumulate parameters in an argv_array passed to
run_command, then they will be safely passed to the shell without
triggering expansion.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 7/9] hook: replace run-command.h:find_hook
  2020-09-09  0:49     ` [PATCH v4 7/9] hook: replace run-command.h:find_hook Emily Shaffer
  2020-09-09 20:32       ` Junio C Hamano
  2020-09-23 23:20       ` Jonathan Tan
@ 2020-10-05 23:42       ` Jonathan Nieder
  2 siblings, 0 replies; 170+ messages in thread
From: Jonathan Nieder @ 2020-10-05 23:42 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

Emily Shaffer wrote:

> Subject: hook: replace run-command.h:find_hook

tiny nit: This doesn't remove find_hook, so this may want to be
described as "add replacement for" instead of "replace".

[...]
> --- a/hook.c
> +++ b/hook.c
> @@ -111,6 +111,15 @@ struct list_head* hook_list(const struct strbuf* hookname)
>  	return &hook_head;
>  }
>  
> +int hook_exists(const char *hookname)
> +{
> +	const char *value = NULL;
> +	struct strbuf hook_key = STRBUF_INIT;
> +	strbuf_addf(&hook_key, "hook.%s.command", hookname);
> +
> +	return (!git_config_get_value(hook_key.buf, &value)) || !!find_hook(hookname);
> +}

This feels a bit fragile, since it can go out of sync with run_hooks.
I think I'd prefer if they shared code and this function either
returned a parsed structure that could be used later to run hooks or
cached the result keyed by hookname.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 8/9] commit: use config-based hooks
  2020-10-05 21:27         ` Emily Shaffer
@ 2020-10-05 23:48           ` Jonathan Nieder
  2020-10-06 19:08             ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Jonathan Nieder @ 2020-10-05 23:48 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: Jonathan Tan, Junio C Hamano, git

Emily Shaffer wrote:
> On Wed, Sep 23, 2020 at 04:47:34PM -0700, Jonathan Tan wrote:

>>  - As for whether (1) it is OK for none of the hooks to be converted (and
>>    instead rely on the user to edit their hook scripts to call "git hook
>>    run ???"), or if (2) we should require some hooks to be
>>    converted, or if (3) we should require all hooks to be converted: I'd
>>    rather have (2) or (3) so that we don't have dead code. I prefer (3),
>>    especially since a reviewer wouldn't have to worry about leftover
>>    usages of old functions like find_hook() (as I mentioned at the start
>>    of this email), but I'm not fully opposed to (2) either.
>
> I personally prefer (3) - I think the user experience with (2) in a
> release (or even in 'next', which all Googlers use) is pretty bad. The
> downside, of course, is that a large topic gets merged all at once and
> makes some pretty nasty reviewer overhead.

One approach is to build up a series with "git hook run" and "git hook
list" demonstrating and testing the functionality and [PATCH n+1/n]
extra patches at the end converting existing hooks.  The user
experience from "git hook run" and even "git hook list" supporting a
preview of the future without built-in commands living in that future
yet would not be so bad, methinks.  And then a final series could
update the built-in commands' usage of hooks and would still be fairly
small.

In other words, I think I like (1), except *without* the
recommendation for users to edit their hook scripts to call "git hook
run" --- instead, the recommendation would be "try running this
command if you want to see what hooks will do in the future".

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 5/9] parse-options: parse into strvec
  2020-10-05 23:30       ` Jonathan Nieder
@ 2020-10-06  4:49         ` Junio C Hamano
  0 siblings, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-10-06  4:49 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Emily Shaffer, git

Jonathan Nieder <jrnieder@gmail.com> writes:

> Emily Shaffer wrote:
>
>> This is useful if collecting generic arguments to pass through to
>> another command, for example, 'git hook run --arg "--quiet" --arg
>> "--format=pretty" some-hook'. The resulting strvec would contain
>> { "--quiet", "--format=pretty" }.
>
> An alternative is to use OPT_STRING_LIST and then convert in the
> caller.  One advantage of that is that it would guarantee the behavior
> with --no-arg etc is going to match exactly.
>
> I prefer this OPT_STRVEC approach nonetheless.  Can the
> parse_opt_strvec and parse_opt_string_list functions get comments
> pointing to each other as an alternative way to encourage that kind of
> consistency?
>
> [...]
>> --- a/Documentation/technical/api-parse-options.txt
>> +++ b/Documentation/technical/api-parse-options.txt
>> @@ -173,6 +173,11 @@ There are some macros to easily define options:
>>  	The string argument is stored as an element in `string_list`.
>>  	Use of `--no-option` will clear the list of preceding values.
>>  
>> +`OPT_ARGV_ARRAY(short, long, &struct argv_array, arg_str, description)`::
>
> nit: this should be OPT_STRVEC

Sigh.  I thought I caught all of these with a SQUASH fix-up patch
the last round.  Thanks for being extra careful.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 2/9] hook: scaffolding for git-hook subcommand
  2020-10-05 23:24       ` Jonathan Nieder
@ 2020-10-06 19:06         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-06 19:06 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

On Mon, Oct 05, 2020 at 04:24:18PM -0700, Jonathan Nieder wrote:
> 
> Hi,
> 
> Emily Shaffer wrote:
> 
> > Introduce infrastructure for a new subcommand, git-hook, which will be
> > used to ease config-based hook management. This command will handle
> > parsing configs to compose a list of hooks to run for a given event, as
> > well as adding or modifying hook configs in an interactive fashion.
> >
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  .gitignore                    |  1 +
> >  Documentation/git-hook.txt    | 19 +++++++++++++++++++
> >  Makefile                      |  1 +
> >  builtin.h                     |  1 +
> >  builtin/hook.c                | 21 +++++++++++++++++++++
> >  git.c                         |  1 +
> >  t/t1360-config-based-hooks.sh | 11 +++++++++++
> >  7 files changed, 55 insertions(+)
> >  create mode 100644 Documentation/git-hook.txt
> >  create mode 100644 builtin/hook.c
> >  create mode 100755 t/t1360-config-based-hooks.sh
> 
> optional: I could imagine this being squashed into patch 3 --- that way,
> the command has functionality as soon as it exists.  Alternatively:

I would prefer to leave it on its own. Managing changes like
builtin<->standalone or even the one you mentioned below about
RUN_SETUP_GENTLY is somewhat easier to manage when they aren't in the
same patch as the business logic, IMO.

> 
> [...]
> > --- /dev/null
> > +++ b/Documentation/git-hook.txt
> > @@ -0,0 +1,19 @@
> > +git-hook(1)
> > +===========
> > +
> > +NAME
> > +----
> > +git-hook - Manage configured hooks
> > +
> > +SYNOPSIS
> > +--------
> > +[verse]
> > +'git hook'
> > +
> > +DESCRIPTION
> > +-----------
> > +You can list, add, and modify hooks with this command.
> 
> This could say something like "This is a placeholder command that will
> gain functionality in subsequent patches" to make the current state
> clear.

Done.

> 
> [...]
> > --- a/git.c
> > +++ b/git.c
> > @@ -519,6 +519,7 @@ static struct cmd_struct commands[] = {
> >  	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
> >  	{ "hash-object", cmd_hash_object },
> >  	{ "help", cmd_help },
> > +	{ "hook", cmd_hook, RUN_SETUP },
> 
> This makes the command require that it run within a git repository,
> but I can imagine wanting to list hooks outside of any.  Should it use
> RUN_SETUP_GENTLY instead?

Nice catch. I'll add a test to the list patch to that effect also.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 8/9] commit: use config-based hooks
  2020-10-05 23:48           ` Jonathan Nieder
@ 2020-10-06 19:08             ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-06 19:08 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Jonathan Tan, Junio C Hamano, git

On Mon, Oct 05, 2020 at 04:48:39PM -0700, Jonathan Nieder wrote:
> 
> Emily Shaffer wrote:
> > On Wed, Sep 23, 2020 at 04:47:34PM -0700, Jonathan Tan wrote:
> 
> >>  - As for whether (1) it is OK for none of the hooks to be converted (and
> >>    instead rely on the user to edit their hook scripts to call "git hook
> >>    run ???"), or if (2) we should require some hooks to be
> >>    converted, or if (3) we should require all hooks to be converted: I'd
> >>    rather have (2) or (3) so that we don't have dead code. I prefer (3),
> >>    especially since a reviewer wouldn't have to worry about leftover
> >>    usages of old functions like find_hook() (as I mentioned at the start
> >>    of this email), but I'm not fully opposed to (2) either.
> >
> > I personally prefer (3) - I think the user experience with (2) in a
> > release (or even in 'next', which all Googlers use) is pretty bad. The
> > downside, of course, is that a large topic gets merged all at once and
> > makes some pretty nasty reviewer overhead.
> 
> One approach is to build up a series with "git hook run" and "git hook
> list" demonstrating and testing the functionality and [PATCH n+1/n]
> extra patches at the end converting existing hooks.  The user
> experience from "git hook run" and even "git hook list" supporting a
> preview of the future without built-in commands living in that future
> yet would not be so bad, methinks.  And then a final series could
> update the built-in commands' usage of hooks and would still be fairly
> small.
> 
> In other words, I think I like (1), except *without* the
> recommendation for users to edit their hook scripts to call "git hook
> run" --- instead, the recommendation would be "try running this
> command if you want to see what hooks will do in the future".

Ok. I'll fix up the wording in the design doc and follow through with my
plan to split the series into two parts.

 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 3/9] hook: add list command
  2020-09-27 19:23       ` Martin Ågren
@ 2020-10-06 20:20         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-06 20:20 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Git Mailing List

On Sun, Sep 27, 2020 at 09:23:35PM +0200, Martin Ågren wrote:
> 
> Hi Emily,

Firstly, thanks for the doc review - this is great stuff.

> 
> On Wed, 9 Sep 2020 at 02:54, Emily Shaffer <emilyshaffer@google.com> wrote:
> 
> >  DESCRIPTION
> >  -----------
> >  You can list, add, and modify hooks with this command.
> 
> (BTW, I think this patch could teach this to say "You can list hooks
> with this command." If/when we add the other commands, we can expand
> on this.)

Done. I sort of glued this together with Jonathan Nieder's suggestion in
the setup patch, and ended up saying "later you will be able to blah".

> 
> > +This command parses the default configuration files for sections "hook" and
> > +"hookcmd". "hook" is used to describe the commands which will be run during a
> 
> I propose s/"hook"/`hook`/ and similar to set this as monospace since we
> are discussing configuration sections. If we want to avoid starting
> sentences with "hook" (or `hookcmd`; do we?), maybe something like "The
> section `hook` ..." would work fine.

Nice - done. I don't see much problem with starting a sentence with
monospaced lower-cased section name... someone can disagree with me :)

> 
> > +particular hook event; commands are run in config order. "hookcmd" is used to
> 
> "config order" feels a bit too colloquial/vague. You use the same phrase
> in the commit message and I think it works well there for the indented
> audience. But for this document, I'm not so sure. How about
> 
>   Commands are run in the order they are encountered as the Git
>   configuration files are processed (see linkgit:git-config[1]).

I don't mind colloquial - I think that improves the readability of user
documentation - but you're right that it's vague. "...commands are run
in the order Git encounters them during the configuration parse (see
linkgitblah)" seemed like an okay balance to me.

> 
> ? It's also quite possible that "config order" hits the exact right tone
> -- please trust your judgment.

Nah, I think you're right that "config order" is easily understood by
Git devs, but probably not by Git users. I like that linking out to the
config doc invites users to also learn a little more about how config
files work :)

> 
> > +describe attributes of a specific command. If additional attributes don't need
> > +to be specified, a command to run can be specified directly in the "hook"
> > +section; if a "hookcmd" by that name isn't found, Git will attempt to run the
> > +provided value directly. For example:
> 
> > +  [hook "post-commit"]
> > +    command = "linter"
> > +    command = "~/typocheck.sh"
> > +
> > +  [hookcmd "linter"]
> > +    command = "/bin/linter --c"
> 
> Hmm. "hook", "command" and "hookcmd". Should that be "cmd", or
> "hookcommand"? I'd favour the latter, but the current proposal somehow
> feels asymmetric. (If code uses, and is consistent about using,
> "hookcmd" that's another thing entirely, I think. It's just that for the
> configuration, it looks a bit odd.)

I'm not entirely in love with the name "hookcmd" but somehow I like
"hookcommand" even less - especially since you end up with
"hook.command" referencing a "hookcommand" which also has a
"hookcommand.command" - blech.

Some possible alternatives to "hookcmd":
- hookmodule/hook-module
- reusable-hook
- hook-with-options/hook-options (nah, this sounds like it means
  "options for hook execution")
- hook-details/detailed-hook
- named-hook

I'll think on this more. I like "named-hook" quite a lot. Very
interested in hearing other ideas - "the two hardest problems in
computer science are naming, cache invalidation, and off-by-one errors"
;)

> 
> > +List the hooks which have been configured for <hook-name>. Hooks appear
> 
> `<hook-name>` with backticks.
> 
> > +in the order they should be run, and note the config scope where the relevant
> > +`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
> 
> I had to read and re-read this a few times. The "and note the" does not
> mean "and please observe that", but rather "and they make note of". Not
> sure how that can be done clearer. The second thing that tripped me up
> was that last part. Maybe end the sentence after "specified", then add
> something like "The scope is not affected by if and where
> `hookcmd.<hook-name>.command` appears.".

Occam's Razor suggests "Hooks appear in the order they should be run,
and print the config scope blah". Thanks for pointing out "and note
that" collision - I never use that phrase so it didn't occur to me!

> 
> I think you could add
> 
>   CONFIGURATION
>   -------------
>   include::config/hook.txt[]
> 
> here and add such a file
> 
>   hook.<hook-name>.command::
>          ...
> 
>   hookcmd.<hook-name>.command::
>          ...
> 
> where you define/describe those items. And you can include it from
> config.txt as well.

Yes, totally. Thanks.

 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 3/9] hook: add list command
  2020-09-23 23:04       ` Jonathan Tan
@ 2020-10-06 20:46         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-06 20:46 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Wed, Sep 23, 2020 at 04:04:51PM -0700, Jonathan Tan wrote:
> 
> >   $ git hook list pre-commit
> >   ~/baz/from/hookcmd.sh
> >   ~/bar.sh
> 
> In the tests below, there is a "local:" prefix (or similar). It's
> clearer if the commit message has that too.
> 
> Also, looking at a later commit, the "list" command probably should
> include the legacy hook if it exists.

Have added it as a separate patch for v5, hopefully that will make more
sense.

> 
> > +static void emplace_hook(struct list_head *pos, const char *command)
> > +{
> > +	struct hook *to_add = malloc(sizeof(struct hook));
> > +	to_add->origin = current_config_scope();
> > +	strbuf_init(&to_add->command, 0);
> > +	/* even with use_shell, run_command() needs quotes */
> > +	strbuf_addf(&to_add->command, "'%s'", command);
> > +
> > +	list_add_tail(&to_add->list, pos);
> > +}
> 
> It might be odd to a programmer reading this that an existing "struct
> hook" with the same name is not reused - the scanning of the list done
> in hook_config_lookup() could probably go here instead.

Sure, done.

> 
> > +test_expect_success 'git hook list orders by config order' '
> > +	setup_hooks &&
> > +
> > +	cat >expected <<-EOF &&
> > +	global:	$ROOT/path/def
> > +	local:	$ROOT/path/ghi
> 
> Will the "global" strings etc. be translated? If yes, it's probably not
> worth it to align the paths in this way.

Asked more offline. Jonathan was saying that translation might result in
scope name + tab character leaving the path in different columns
depending on the scope anyways, so there's no point in using a tab
character instead of a space character here. That seems reasonable; I'll
switch.

 - Emily


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 6/9] hook: add 'run' subcommand
  2020-10-05 23:39       ` Jonathan Nieder
@ 2020-10-06 22:57         ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-06 22:57 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

On Mon, Oct 05, 2020 at 04:39:03PM -0700, Jonathan Nieder wrote:
> 
> Hi,
> 
> Emily Shaffer wrote:
> 
> > In order to enable hooks to be run as an external process, by a
> > standalone Git command, or by tools which wrap Git, provide an external
> > means to run all configured hook commands for a given hook event.
> 
> Exciting!
> 
> I would even be tempted to put this earlier in the series: providing a
> "git hook run" command that only supports legacy hooks and then
> improving it from there to support config-based hooks.  This ordering is
> also fine, though.

Oh, interesting! I sort of wish I had started with that ordering... but
now it seems a little unwieldy to switch. I'd probably want to do 100%
of the run_hook_(ve|le) conversions first, in that case, and delete the
old hook API. But at this point I think it would be a pretty large
amount of overhead to switch.

> 
> [...]
> > ---
> >  builtin/hook.c                | 30 ++++++++++++++++++++
> >  hook.c                        | 52 ++++++++++++++++++++++++++++++++---
> >  hook.h                        |  3 ++
> >  t/t1360-config-based-hooks.sh | 28 +++++++++++++++++++
> >  4 files changed, 109 insertions(+), 4 deletions(-)
> 
> Needs docs.

Done

> 
> [...]
> > --- a/builtin/hook.c
> > +++ b/builtin/hook.c
> > @@ -5,9 +5,11 @@
> [...]
> > +static int run(int argc, const char **argv, const char *prefix)
> > +{
> > +	struct strbuf hookname = STRBUF_INIT;
> > +	struct strvec envs = STRVEC_INIT;
> > +	struct strvec args = STRVEC_INIT;
> > +
> > +	struct option run_options[] = {
> > +		OPT_STRVEC('e', "env", &envs, N_("var"),
> > +			   N_("environment variables for hook to use")),
> > +		OPT_STRVEC('a', "arg", &args, N_("args"),
> > +			   N_("argument to pass to hook")),
> > +		OPT_END(),
> > +	};
> > +
> > +	argc = parse_options(argc, argv, prefix, run_options,
> > +			     builtin_hook_usage, 0);
> > +
> > +	if (argc < 1)
> > +		usage_msg_opt(_("a hookname must be provided to operate on."),
> > +			      builtin_hook_usage, run_options);
> 
> Error message nit: what does it mean to operate on a hookname?
> 
> Perhaps this should allude to the usage string?
> 
> 	usage_msg_opt(_("missing <hookname> parameter"), ...);
> 
> Or to match the conversational approach of commands like "clone":
> 
> 	usage_msg_opt(_("You must specify a hook to run."), ...);
> 

Yeah, I like this one. I noticed the same error (untranslated, even!) is
used for list, so I'll fix that too.

> [...]
> > --- a/hook.c
> > +++ b/hook.c
> > @@ -2,6 +2,7 @@
> >  
> >  #include "hook.h"
> >  #include "config.h"
> > +#include "run-command.h"
> >  
> >  /*
> >   * NEEDSWORK: a stateful hook_head means we can't run two hook events in the
> > @@ -21,13 +22,15 @@ void free_hook(struct hook *ptr)
> >  	}
> >  }
> >  
> > -static void emplace_hook(struct list_head *pos, const char *command)
> > +static void emplace_hook(struct list_head *pos, const char *command, int quoted)
> >  {
> >  	struct hook *to_add = malloc(sizeof(struct hook));
> >  	to_add->origin = current_config_scope();
> >  	strbuf_init(&to_add->command, 0);
> > -	/* even with use_shell, run_command() needs quotes */
> > -	strbuf_addf(&to_add->command, "'%s'", command);
> > +	if (quoted)
> > +		strbuf_addf(&to_add->command, "'%s'", command);
> > +	else
> > +		strbuf_addstr(&to_add->command, command);
> >  
> >  	list_add_tail(&to_add->list, pos);
> >  }
> 
> This would need to use sq_quote_* to be safe, but we can do something
> simpler: if we accumulate parameters in an argv_array passed to
> run_command, then they will be safely passed to the shell without
> triggering expansion.

Thanks. I'll do that - no point in duplicating the work :)

 - Emily




^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-09-09  0:49     ` [PATCH v4 1/9] doc: propose hooks managed by the config Emily Shaffer
  2020-09-23 22:59       ` Jonathan Tan
@ 2020-10-07  9:23       ` Ævar Arnfjörð Bjarmason
  2020-10-22  0:58         ` Emily Shaffer
  1 sibling, 1 reply; 170+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-10-07  9:23 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git


On Wed, Sep 09 2020, Emily Shaffer wrote:

First, thanks a lot for working on this. As you may have found I've done
some small amount of actual work in this area before, but mostly just
blathered about it on the ML.

> Begin a design document for config-based hooks, managed via git-hook.
> Focus on an overview of the implementation and motivation for design
> decisions. Briefly discuss the alternatives considered before this
> point. Also, attempt to redefine terms to fit into a multihook world.
> [...]
> +[[status-quo]]
> +=== Status quo
> +
> +Today users can implement multihooks themselves by using a "trampoline script"
> +as their hook, and pointing that script to a directory or list of other scripts
> +they wish to run.

...or by setting core.hooksPath in their local/global/system
config. Granted it doesn't cover the malicious hook injection case
you're also trying to solve, but does address e.g. having a git server
with a lot of centralized hooks.

The "trampoline script" also isn't needed for the common case you
mention, you just symlink the .git/hooks directory (as e.g. GitLab
does). People usually use a trampoline script for e.g. using GNU
parallel or something to execute N hooks.


> +[[hook-directories]]
> +=== Hook directories
> +
> +Other contributors have suggested Git learn about the existence of a directory
> +such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.

...which seems like an easy thing to add later by having a "hookdir" in
addition to "hookcmd", i.e. just specify a glob there instead of a
cmd/path.

You already use "hookdir" for something else though, so that's a bit
confusing, perhaps s/hookcmd/definehookcmd/ would be less confusing, or
perhaps more confusing...

> [...]
> +[[execution-ordering]]
> +=== Execution ordering
> +
> +We may find that config order is insufficient for some users; for example,
> +config order makes it difficult to add a new hook to the system or global config
> +which runs at the end of the hook list. A new ordering schema should be:
> +
> +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> +their order change;
> +
> +2) Either dependency or numerically based.
> +
> +Dependency-based ordering is prone to classic linked-list problems, like a
> +cycles and handling of missing dependencies. But, it paves the way for enabling
> +parallelization if some tasks truly depend on others.
>
> +Numerical ordering makes it tricky for Git to generate suggested ordering
> +numbers for each command, but is easy to determine a definitive order.
> +
> +[[parallelization]]
> +=== Parallelization
> +
> +Users with many hooks might want to run them simultaneously, if the hooks don't
> +modify state; if one hook depends on another's output, then users will want to
> +specify those dependencies. If we decide to solve this problem, we may want to
> +look to modern build systems for inspiration on how to manage dependencies and
> +parallel tasks.

If you're taking requests it would make me very happy if we had
parallelism in this from day one. It's the kind of thing that's hard to
do by default once a feature is shipped since people will implicitly
depend on it not being there, i.e. we won't know what we're breaking.

I think doing it this way is simple, covers most use cases, and solves a
lot of the problems you note:

1. Don't use config order to execute hooks, use glob'd name order
   regardless of origin. I.e. a system-level hook is called "001-first"
   is executed before a local hook called "999-at-the-end" (or the other
   way around, i.e. hook origin doesn't matter).

2. We execute hooks parallel in that glob order, i.e. a pthread for-loop
   that starts the 001-first task first, eventually getting to
   999-at-the-end N at a time. I.e. the same as:

       parallel --jobs N --halt-on-error soon,fail=1" ::: <hooks-in-glob-order>

   This allows for parallelism but guarantees the very useful case of
   having a global log hook being guaranteed to execute.

3. A hook can define "parallel=no" in its config. We'll then run it
   while no other hook is running.

4. We don't attempt to do dependencies etc, if you need that sort of
   complexity you can just make one of the hooks be a hook runner as
   users do now for the common "make it parallel" case.

It's a relatively small change to the code you have already. I.e. the
for_each() in run_hooks() would be called N times for each continuous
glob'd parallel/non-parallel segment, and hook_list()'s config parsing
would learn to spew those out as a list-of-lists.

This also gives you a rudimentary implementation of the dependency
schema you proposed for free. I.e. a definition of (pseudocode):

    hookcmd=000-first
    parallel=no

    hookcmd=250-middle-abc
    hookcmd=250-middle-xyz

    hookcmd=300-gather
    parallel=no

    hookcmd=999-the-end

Would result in the pseudocode execution of;

    segments=[[000-first],
              [250-middle-abc, 250-middle-xyz],
              [300-gather],
              [999-the-end]]
    for each s in segments:
        ok = run_in_parallel(s)
        last if !ok # or, depending on "early exit?" config

I.e.:

 * The common case of people adding N hooks won't take sum(N) time.

 * parallel=no hooks aren't run in parallel with other non-parallel
   hooks

 * We support a rudimentary dependency schema as a side-effect,
   i.e. defining 300-gather as non-parallel allows it to act as the sole
   "reduce" step in a map/reduce in a "map" step started with the 250-*
   hooks.

> +[[securing-hookdir-hooks]]
> +=== Securing hookdir hooks
> +
> +With the design as written in this doc, it's still possible for a malicious user
> +to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
> +zip their repo and send it to another user. It may be necessary to teach Git to
> +only allow inlined hooks like this if they were configured outside of the local
> +scope (in other words, only run hookcmds, and only allow hookcmds to be
> +configured in global or system scope); or another approach, like a list of safe
> +projects, might be useful. It may also be sufficient (or at least useful) to
> +teach a `hook.disableAll` config or similar flag to the Git executable.

I think this part of the doc should note a bit of the context in
https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/

I.e. even if we get a 100% secure hook implementation we've done
practically nothing for overall security, since we'll still run the
pager, aliases etc. from that local repo.

This is a great step in the right direction, but it behooves us to note
that, so some user reading this documentation without context doesn't
think inspecting untrusted repositories like that is safe just because
they set the right hook settings in their config (once what's being
proposed here is implemented).

^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH v5 0/8] propose config-based hooks (part I)
  2020-09-09  0:49   ` [PATCH v4 0/9] propose config-based hooks Emily Shaffer
                       ` (9 preceding siblings ...)
  2020-09-09 21:04     ` [PATCH v4 0/9] propose config-based hooks Junio C Hamano
@ 2020-10-14 23:24     ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 1/8] doc: propose hooks managed by the config Emily Shaffer
                         ` (8 more replies)
  10 siblings, 9 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Since v4:
- Reordered the commits. Hookdir support is added sooner and conversion
  of existing hooks is moved to another branch (part II) for hopefully
  more granular reviewing. If folks hate this, let me know and I'll
  reintegrate the two topics.
- Removed the --porcelain option on 'git hook list'. General consensus
  is that this should use a format string instead, and I didn't want to
  write that new feature while I had been promising v5 "any day now".
- Added functionality for 'skip' to remove hooks from the execution
  list.
- General nits from folks.

Coming soon:
- 'git hook list --format'
- More conversions (in the other topic)
- As required by new conversions, stdin support for hooks

Coming much later:
- 'git hook add'/'git hook edit'. The config isn't too ugly to manually
  edit, for now, so I'd like to get the hooks themselves all figured out
  before adding these convenience tools. I do still think they're a good
  idea, as they'll increase the discoverability of the feature for new
  users.

More detailed notes in each commit. Thanks all for your patience and
reviews.

 - Emily

Emily Shaffer (8):
  doc: propose hooks managed by the config
  hook: scaffolding for git-hook subcommand
  hook: add list command
  hook: include hookdir hook in list
  hook: implement hookcmd.<name>.skip
  parse-options: parse into strvec
  hook: add 'run' subcommand
  hook: replace find_hook() with hook_exists()

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/config/hook.txt                 |  14 +
 Documentation/git-hook.txt                    |  81 ++++
 Documentation/technical/api-parse-options.txt |   5 +
 .../technical/config-based-hooks.txt          | 367 ++++++++++++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/hook.c                                | 163 ++++++++
 git.c                                         |   1 +
 hook.c                                        | 282 ++++++++++++++
 hook.h                                        |  58 +++
 parse-options-cb.c                            |  16 +
 parse-options.h                               |   4 +
 t/t1360-config-based-hooks.sh                 | 232 +++++++++++
 15 files changed, 1228 insertions(+)
 create mode 100644 Documentation/config/hook.txt
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 Documentation/technical/config-based-hooks.txt
 create mode 100644 builtin/hook.c
 create mode 100644 hook.c
 create mode 100644 hook.h
 create mode 100755 t/t1360-config-based-hooks.sh

-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH v5 1/8] doc: propose hooks managed by the config
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-15 16:31         ` Ævar Arnfjörð Bjarmason
  2020-10-14 23:24       ` [PATCH v5 2/8] hook: scaffolding for git-hook subcommand Emily Shaffer
                         ` (7 subsequent siblings)
  8 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, addressed comments from Jonathan Tan about wording.

 Documentation/Makefile                        |   1 +
 .../technical/config-based-hooks.txt          | 367 ++++++++++++++++++
 2 files changed, 368 insertions(+)
 create mode 100644 Documentation/technical/config-based-hooks.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 80d1908a44..58d6b3acbe 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -81,6 +81,7 @@ SP_ARTICLES += $(API_DOCS)
 TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
+TECH_DOCS += technical/config-based-hooks
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
new file mode 100644
index 0000000000..dac391f505
--- /dev/null
+++ b/Documentation/technical/config-based-hooks.txt
@@ -0,0 +1,367 @@
+Configuration-based hook management
+===================================
+:sectanchors:
+
+[[motivation]]
+== Motivation
+
+Replace the .git/hook/hookname path as the only source of hooks to execute;
+allow users to define hooks using config files, in a way which is friendly to
+users with multiple repos which have similar needs.
+
+Redefine "hook" as an event rather than a single script, allowing users to
+perform unrelated actions on a single event.
+
+Take a step closer to safety when copying zipped Git repositories from untrusted
+users by making it more apparent to users which scripts will be run during
+normal Git operations.
+
+Make it easier for users to discover Git's hook feature and automate their
+workflows.
+
+[[user-interfaces]]
+== User interfaces
+
+[[config-schema]]
+=== Config schema
+
+Hooks can be introduced by editing the configuration manually. There are two new
+sections added, `hook` and `hookcmd`.
+
+[[config-schema-hook]]
+==== `hook`
+
+Primarily contains subsections for each hook event. The order of variables in
+these subsections defines the hook command execution order; hook commands can be
+specified by setting the value directly to the command if no additional
+configuration is needed, or by setting the value as the name of a `hookcmd`. If
+Git does not find a `hookcmd` whose subsection matches the value of the given
+command string, Git will try to execute the string directly. Hooks are executed
+by passing the resolved command string to the shell. In the future, hook event
+subsections could also contain per-hook-event settings; see
+<<per-hook-event-settings,the section in Future Work>> for more details.
+
+Also contains top-level hook execution settings, for example, `hook.runHookDir`
+or `hook.disableAll`. (These settings are described more in
+<<library,Library>>.)
+
+----
+[hook "pre-commit"]
+  command = perl-linter
+  command = /usr/bin/git-secrets --pre-commit
+
+[hook "pre-applypatch"]
+  command = perl-linter
+  # for illustration purposes; error behavior isn't planned yet
+  error = ignore
+
+[hook]
+  runHookDir = interactive
+----
+
+[[config-schema-hookcmd]]
+==== `hookcmd`
+
+Defines a hook command and its attributes, which will be used when a hook event
+occurs. Unqualified attributes are assumed to apply to this hook during all hook
+events, but event-specific attributes can also be supplied. The example runs
+`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
+include this config, the hook command will be skipped for all events to which
+it's normally subscribed _except_ `pre-commit`.
+
+----
+[hookcmd "perl-linter"]
+  command = /usr/bin/lint-it --language=perl
+  skip = true
+  # for illustration purposes; below hasn't been defined yet
+  pre-commit-skip = false
+----
+
+[[command-line-api]]
+=== Command-line API
+
+Users should be able to view, reorder, and create hook commands via the command
+line. External tools should be able to view a list of hooks in the correct order
+to run.
+
+*`git hook list <hook-event>`*
+
+*`git hook list (--system|--global|--local|--worktree)`*
+
+*`git hook edit <hook-event>`*
+
+*`git hook add <hook-command> <hook-event> <options...>`*
+
+[[hook-editor]]
+=== Hook editor
+
+The tool which is presented by `git hook edit <hook-command>`. Ideally, this
+tool should be easier to use than manually editing the config, and then produce
+a concise config afterwards. It may take a form similar to `git rebase
+--interactive`.
+
+[[implementation]]
+== Implementation
+
+[[library]]
+=== Library
+
+`hook.c` and `hook.h` are responsible for interacting with the config files. In
+the case when the code generating a hook event doesn't have special concerns
+about how to run the hooks, the hook library will provide a basic API to call
+all hooks in config order with an `strvec` provided by the code which
+generates the hook event:
+
+*`int run_hooks(const char *hookname, struct strvec *args)`*
+
+This call includes the hook command provided by `run-command.h:find_hook()`;
+eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
+config is checked against a number of cases:
+
+- "no": the legacy hook will not be run
+- "interactive": Git will prompt the user before running the legacy hook
+- "warn": Git will print a warning to stderr before running the legacy hook
+- "yes" (default): Git will silently run the legacy hook
+
+In case this list is expanded in the future, if a value for `hook.runHookDir` is
+given which Git does not recognize, Git should discard that config entry. For
+example, if "warn" was specified at system level and "junk" was specified at
+global level, Git would resolve the value to "warn"; if the only time the config
+was set was to "junk", Git would use the default value of "yes".
+
+If the caller wants to do something more complicated, the hook library can also
+provide a callback API:
+
+*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
+
+Finally, to facilitate the builtin, the library will also provide the following
+APIs to interact with the config:
+
+----
+int set_hook_commands(const char *hookname, struct string_list *commands,
+	enum config_scope scope);
+int set_hookcmd(const char *hookcmd, struct hookcmd options);
+
+int list_hook_commands(const char *hookname, struct string_list *commands);
+int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
+----
+
+`struct hookcmd` is expected to grow in size over time as more functionality is
+added to hooks; so that other parts of the code don't need to understand the
+config schema, `struct hookcmd` should contain logical values instead of string
+pairs.
+
+----
+struct hookcmd {
+  const char *name;
+  const char *command;
+
+  /* for illustration only; not planned at present */
+  int parallelizable;
+  const char *hookcmd_before;
+  const char *hookcmd_after;
+  enum recovery_action on_fail;
+}
+----
+
+[[builtin]]
+=== Builtin
+
+`builtin/hook.c` is responsible for providing the frontend. It's responsible for
+formatting user-provided data and then calling the library API to set the
+configs as appropriate. The builtin frontend is not responsible for calling the
+config directly, so that other areas of Git can rely on the hook library to
+understand the most recent config schema for hooks.
+
+[[migration]]
+=== Migration path
+
+[[stage-0]]
+==== Stage 0
+
+Hooks are called by running `run-command.h:find_hook()` with the hookname and
+executing the result. The hook library and builtin do not exist. Hooks only
+exist as specially named scripts within `.git/hooks/`.
+
+[[stage-1]]
+==== Stage 1
+
+`git hook list --porcelain <hook-event>` is implemented. Users can replace their
+`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
+output. Modifier commands like `git hook add` and `git hook edit` can be
+implemented around this time as well.
+
+[[stage-2]]
+==== Stage 2
+
+`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
+end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
+opt-in to config-based hooks simply by creating some in their config; otherwise
+users should remain unaffected by the change.
+
+[[stage-3]]
+==== Stage 3
+
+The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
+`hook.runHookDir`. Users can opt into managing their hooks completely via the
+config this way.
+
+[[stage-4]]
+==== Stage 4
+
+`.git/hooks` is removed from the template and the hook directory is considered
+deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
+not changed, and `find_hook()` is not removed.
+
+[[caveats]]
+== Caveats
+
+[[security]]
+=== Security and repo config
+
+Part of the motivation behind this refactor is to mitigate hooks as an attack
+vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
+however, as the design stands, users can still provide hooks in the repo-level
+config, which is included when a repo is zipped and sent elsewhere.  The
+security of the repo-level config is still under discussion; this design
+generally assumes the repo-level config is secure, which is not true yet. The
+goal is to avoid an overcomplicated design to work around a problem which has
+ceased to exist.
+
+[[ease-of-use]]
+=== Ease of use
+
+The config schema is nontrivial; that's why it's important for the `git hook`
+modifier commands to be usable. Contributors with UX expertise are encouraged to
+share their suggestions.
+
+[[alternatives]]
+== Alternative approaches
+
+A previous summary of alternatives exists in the
+archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
+
+[[status-quo]]
+=== Status quo
+
+Today users can implement multihooks themselves by using a "trampoline script"
+as their hook, and pointing that script to a directory or list of other scripts
+they wish to run.
+
+[[hook-directories]]
+=== Hook directories
+
+Other contributors have suggested Git learn about the existence of a directory
+such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
+
+[[comparison]]
+=== Comparison table
+
+.Comparison of alternatives
+|===
+|Feature |Config-based hooks |Hook directories |Status quo
+
+|Supports multiple hooks
+|Natively
+|Natively
+|With user effort
+
+|Safer for zipped repos
+|A little
+|No
+|No
+
+|Previous hooks just work
+|If configured
+|Yes
+|Yes
+
+|Can install one hook to many repos
+|Yes
+|No
+|No
+
+|Discoverability
+|Better (in `git help git`)
+|Same as before
+|Same as before
+
+|Hard to run unexpected hook
+|If configured
+|No
+|No
+|===
+
+[[future-work]]
+== Future work
+
+[[execution-ordering]]
+=== Execution ordering
+
+We may find that config order is insufficient for some users; for example,
+config order makes it difficult to add a new hook to the system or global config
+which runs at the end of the hook list. A new ordering schema should be:
+
+1) Specified by a `hook.order` config, so that users will not unexpectedly see
+their order change;
+
+2) Either dependency or numerically based.
+
+Dependency-based ordering is prone to classic linked-list problems, like a
+cycles and handling of missing dependencies. But, it paves the way for enabling
+parallelization if some tasks truly depend on others.
+
+Numerical ordering makes it tricky for Git to generate suggested ordering
+numbers for each command, but is easy to determine a definitive order.
+
+[[parallelization]]
+=== Parallelization
+
+Users with many hooks might want to run them simultaneously, if the hooks don't
+modify state; if one hook depends on another's output, then users will want to
+specify those dependencies. If we decide to solve this problem, we may want to
+look to modern build systems for inspiration on how to manage dependencies and
+parallel tasks.
+
+[[securing-hookdir-hooks]]
+=== Securing hookdir hooks
+
+With the design as written in this doc, it's still possible for a malicious user
+to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
+zip their repo and send it to another user. It may be necessary to teach Git to
+only allow inlined hooks like this if they were configured outside of the local
+scope (in other words, only run hookcmds, and only allow hookcmds to be
+configured in global or system scope); or another approach, like a list of safe
+projects, might be useful. It may also be sufficient (or at least useful) to
+teach a `hook.disableAll` config or similar flag to the Git executable.
+
+[[submodule-inheritance]]
+=== Submodule inheritance
+
+It's possible some submodules may want to run the identical set of hooks that
+their superrepo runs. While a globally-configured hook set is helpful, it's not
+a great solution for users who have multiple repos-with-submodules under the
+same user. It would be useful for submodules to learn how to run hooks from
+their superrepo's config, or inherit that hook setting.
+
+[[per-hook-event-settings]]
+=== Per-hook-event settings
+
+It might be desirable to keep settings specifically for some hook events, but
+not for others - for example, a user may wish to disable hookdir hooks for all
+events but pre-commit, which they haven't had time to convert yet; or, a user
+may wish for execution order settings to differ based on hook event. In that
+case, it would be useful to set something like `hook.pre-commit.executionOrder`
+which would not apply to the 'prepare-commit-msg' hook, for example.
+
+[[glossary]]
+== Glossary
+
+*hook event*
+
+A point during Git's execution where user scripts may be run, for example,
+_prepare-commit-msg_ or _pre-push_.
+
+*hook command*
+
+A user script or executable which will be run on one or more hook events.
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v5 2/8] hook: scaffolding for git-hook subcommand
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 1/8] doc: propose hooks managed by the config Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 3/8] hook: add list command Emily Shaffer
                         ` (6 subsequent siblings)
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, mainly changed to RUN_SETUP_GENTLY so that 'git hook list' can
    be executed outside of a repo.

 .gitignore                    |  1 +
 Documentation/git-hook.txt    | 20 ++++++++++++++++++++
 Makefile                      |  1 +
 builtin.h                     |  1 +
 builtin/hook.c                | 21 +++++++++++++++++++++
 git.c                         |  1 +
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 7 files changed, 56 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 builtin/hook.c
 create mode 100755 t/t1360-config-based-hooks.sh

diff --git a/.gitignore b/.gitignore
index 6232d33924..432e0b11cb 100644
--- a/.gitignore
+++ b/.gitignore
@@ -75,6 +75,7 @@
 /git-grep
 /git-hash-object
 /git-help
+/git-hook
 /git-http-backend
 /git-http-fetch
 /git-http-push
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
new file mode 100644
index 0000000000..9eeab0009d
--- /dev/null
+++ b/Documentation/git-hook.txt
@@ -0,0 +1,20 @@
+git-hook(1)
+===========
+
+NAME
+----
+git-hook - Manage configured hooks
+
+SYNOPSIS
+--------
+[verse]
+'git hook'
+
+DESCRIPTION
+-----------
+A placeholder command. Later, you will be able to list, add, and modify hooks
+with this command.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 5311b1d2c4..9152f6d7c8 100644
--- a/Makefile
+++ b/Makefile
@@ -1095,6 +1095,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
+BUILTIN_OBJS += builtin/hook.o
 BUILTIN_OBJS += builtin/index-pack.o
 BUILTIN_OBJS += builtin/init-db.o
 BUILTIN_OBJS += builtin/interpret-trailers.o
diff --git a/builtin.h b/builtin.h
index 53fb290963..3b20689d1a 100644
--- a/builtin.h
+++ b/builtin.h
@@ -162,6 +162,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
 int cmd_hash_object(int argc, const char **argv, const char *prefix);
 int cmd_help(int argc, const char **argv, const char *prefix);
+int cmd_hook(int argc, const char **argv, const char *prefix);
 int cmd_index_pack(int argc, const char **argv, const char *prefix);
 int cmd_init_db(int argc, const char **argv, const char *prefix);
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);
diff --git a/builtin/hook.c b/builtin/hook.c
new file mode 100644
index 0000000000..b2bbc84d4d
--- /dev/null
+++ b/builtin/hook.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+#include "builtin.h"
+#include "parse-options.h"
+
+static const char * const builtin_hook_usage[] = {
+	N_("git hook"),
+	NULL
+};
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index f1e8b56d99..caad1c877f 100644
--- a/git.c
+++ b/git.c
@@ -524,6 +524,7 @@ static struct cmd_struct commands[] = {
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
+	{ "hook", cmd_hook, RUN_SETUP_GENTLY },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "init", cmd_init_db },
 	{ "init-db", cmd_init_db },
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
new file mode 100755
index 0000000000..34b0df5216
--- /dev/null
+++ b/t/t1360-config-based-hooks.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+test_description='config-managed multihooks, including git-hook command'
+
+. ./test-lib.sh
+
+test_expect_success 'git hook command does not crash' '
+	git hook
+'
+
+test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v5 3/8] hook: add list command
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 1/8] doc: propose hooks managed by the config Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 2/8] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 4/8] hook: include hookdir hook in list Emily Shaffer
                         ` (5 subsequent siblings)
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.

Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

For example:

  $ git config --list | grep ^hook
  hook.pre-commit.command=baz
  hook.pre-commit.command=~/bar.sh
  hookcmd.baz.command=~/baz/from/hookcmd.sh

  $ git hook list pre-commit
  global: ~/baz/from/hookcmd.sh
  local: ~/bar.sh

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, more work on the documentation. Also a slight change to the
    output format (space instead of tab).

 Documentation/config/hook.txt |   9 +++
 Documentation/git-hook.txt    |  59 ++++++++++++++++-
 Makefile                      |   1 +
 builtin/hook.c                |  56 +++++++++++++++--
 hook.c                        | 115 ++++++++++++++++++++++++++++++++++
 hook.h                        |  26 ++++++++
 t/t1360-config-based-hooks.sh |  81 +++++++++++++++++++++++-
 7 files changed, 338 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/config/hook.txt
 create mode 100644 hook.c
 create mode 100644 hook.h

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
new file mode 100644
index 0000000000..71449ecbc7
--- /dev/null
+++ b/Documentation/config/hook.txt
@@ -0,0 +1,9 @@
+hook.<command>.command::
+	A command to execute during the <command> hook event. This can be an
+	executable on your device, a oneliner for your shell, or the name of a
+	hookcmd. See linkgit:git-hook[1].
+
+hookcmd.<name>.command::
+	A command to execute during a hook for which <name> has been specified
+	as a command. This can be an executable on your device or a oneliner for
+	your shell. See linkgit:git-hook[1].
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 9eeab0009d..f19875ed68 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,12 +8,65 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook'
+'git hook' list <hook-name>
 
 DESCRIPTION
 -----------
-A placeholder command. Later, you will be able to list, add, and modify hooks
-with this command.
+You can list configured hooks with this command. Later, you will be able to run,
+add, and modify hooks with this command.
+
+This command parses the default configuration files for sections `hook` and
+`hookcmd`. `hook` is used to describe the commands which will be run during a
+particular hook event; commands are run in the order Git encounters them during
+the configuration parse (see linkgit:git-config[1]). `hookcmd` is used to
+describe attributes of a specific command. If additional attributes don't need
+to be specified, a command to run can be specified directly in the `hook`
+section; if a `hookcmd` by that name isn't found, Git will attempt to run the
+provided value directly. For example:
+
+Global config
+----
+  [hook "post-commit"]
+    command = "linter"
+    command = "~/typocheck.sh"
+
+  [hookcmd "linter"]
+    command = "/bin/linter --c"
+----
+
+Local config
+----
+  [hook "prepare-commit-msg"]
+    command = "linter"
+  [hook "post-commit"]
+    command = "python ~/run-test-suite.py"
+----
+
+With these configs, you'd then see:
+
+----
+$ git hook list "post-commit"
+global: /bin/linter --c
+global: ~/typocheck.sh
+local: python ~/run-test-suite.py
+
+$ git hook list "prepare-commit-msg"
+local: /bin/linter --c
+----
+
+COMMANDS
+--------
+
+list `<hook-name>`::
+
+List the hooks which have been configured for `<hook-name>`. Hooks appear
+in the order they should be run, and print the config scope where the relevant
+`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
+This output is human-readable and the format is subject to change over time.
+
+CONFIGURATION
+-------------
+include::config/hook.txt[]
 
 GIT
 ---
diff --git a/Makefile b/Makefile
index 9152f6d7c8..5cd1486e42 100644
--- a/Makefile
+++ b/Makefile
@@ -902,6 +902,7 @@ LIB_OBJS += grep.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
 LIB_OBJS += kwset.o
diff --git a/builtin/hook.c b/builtin/hook.c
index b2bbc84d4d..4d36de52f8 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -1,21 +1,69 @@
 #include "cache.h"
 
 #include "builtin.h"
+#include "config.h"
+#include "hook.h"
 #include "parse-options.h"
+#include "strbuf.h"
 
 static const char * const builtin_hook_usage[] = {
-	N_("git hook"),
+	N_("git hook list <hookname>"),
 	NULL
 };
 
-int cmd_hook(int argc, const char **argv, const char *prefix)
+static int list(int argc, const char **argv, const char *prefix)
 {
-	struct option builtin_hook_options[] = {
+	struct list_head *head, *pos;
+	struct hook *item;
+	struct strbuf hookname = STRBUF_INIT;
+
+	struct option list_options[] = {
 		OPT_END(),
 	};
 
-	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+	argc = parse_options(argc, argv, prefix, list_options,
 			     builtin_hook_usage, 0);
 
+	if (argc < 1) {
+		usage_msg_opt(_("You must specify a hook event name to list."),
+			      builtin_hook_usage, list_options);
+	}
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	head = hook_list(&hookname);
+
+	if (list_empty(head)) {
+		printf(_("no commands configured for hook '%s'\n"),
+		       hookname.buf);
+		strbuf_release(&hookname);
+		return 0;
+	}
+
+	list_for_each(pos, head) {
+		item = list_entry(pos, struct hook, list);
+		if (item)
+			printf("%s: %s\n",
+			       config_scope_name(item->origin),
+			       item->command.buf);
+	}
+
+	clear_hook_list(head);
+	strbuf_release(&hookname);
+
 	return 0;
 }
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+	if (argc < 2)
+		usage_with_options(builtin_hook_usage, builtin_hook_options);
+
+	if (!strcmp(argv[1], "list"))
+		return list(argc - 1, argv + 1, prefix);
+
+	usage_with_options(builtin_hook_usage, builtin_hook_options);
+}
diff --git a/hook.c b/hook.c
new file mode 100644
index 0000000000..937dc768c8
--- /dev/null
+++ b/hook.c
@@ -0,0 +1,115 @@
+#include "cache.h"
+
+#include "hook.h"
+#include "config.h"
+
+void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		strbuf_release(&ptr->command);
+		free(ptr);
+	}
+}
+
+static void append_or_move_hook(struct list_head *head, const char *command)
+{
+	struct list_head *pos = NULL, *tmp = NULL;
+	struct hook *to_add = NULL;
+
+	/*
+	 * remove the prior entry with this command; we'll replace it at the
+	 * end.
+	 */
+	list_for_each_safe(pos, tmp, head) {
+		struct hook *it = list_entry(pos, struct hook, list);
+		if (!strcmp(it->command.buf, command)) {
+		    list_del(pos);
+		    /* we'll simply move the hook to the end */
+		    to_add = it;
+		}
+	}
+
+	if (!to_add) {
+		/* adding a new hook, not moving an old one */
+		to_add = xmalloc(sizeof(struct hook));
+		strbuf_init(&to_add->command, 0);
+		strbuf_addstr(&to_add->command, command);
+	}
+
+	/* re-set the scope so we show where an override was specified */
+	to_add->origin = current_config_scope();
+
+	list_add_tail(&to_add->list, pos);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(struct list_head *head)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, head)
+		remove_hook(pos);
+}
+
+struct hook_config_cb
+{
+	struct strbuf *hookname;
+	struct list_head *list;
+};
+
+static int hook_config_lookup(const char *key, const char *value, void *cb_data)
+{
+	struct hook_config_cb *data = cb_data;
+	const char *hook_key = data->hookname->buf;
+	struct list_head *head = data->list;
+
+	if (!strcmp(key, hook_key)) {
+		const char *command = value;
+		struct strbuf hookcmd_name = STRBUF_INIT;
+
+		/* Check if a hookcmd with that name exists. */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
+		git_config_get_value(hookcmd_name.buf, &command);
+
+		if (!command) {
+			strbuf_release(&hookcmd_name);
+			BUG("git_config_get_value overwrote a string it shouldn't have");
+		}
+
+		/*
+		 * TODO: implement an option-getting callback, e.g.
+		 *   get configs by pattern hookcmd.$value.*
+		 *   for each key+value, do_callback(key, value, cb_data)
+		 */
+
+		append_or_move_hook(head, command);
+
+		strbuf_release(&hookcmd_name);
+	}
+
+	return 0;
+}
+
+struct list_head* hook_list(const struct strbuf* hookname)
+{
+	struct strbuf hook_key = STRBUF_INIT;
+	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
+	struct hook_config_cb cb_data = { &hook_key, hook_head };
+
+	INIT_LIST_HEAD(hook_head);
+
+	if (!hookname)
+		return NULL;
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
+
+	git_config(hook_config_lookup, (void*)&cb_data);
+
+	strbuf_release(&hook_key);
+	return hook_head;
+}
diff --git a/hook.h b/hook.h
new file mode 100644
index 0000000000..8ffc4f14b6
--- /dev/null
+++ b/hook.h
@@ -0,0 +1,26 @@
+#include "config.h"
+#include "list.h"
+#include "strbuf.h"
+
+struct hook
+{
+	struct list_head list;
+	/*
+	 * Config file which holds the hook.*.command definition.
+	 * (This has nothing to do with the hookcmd.<name>.* configs.)
+	 */
+	enum config_scope origin;
+	/* The literal command to run. */
+	struct strbuf command;
+};
+
+/*
+ * Provides a linked list of 'struct hook' detailing commands which should run
+ * in response to the 'hookname' event, in execution order.
+ */
+struct list_head* hook_list(const struct strbuf *hookname);
+
+/* Free memory associated with a 'struct hook' */
+void free_hook(struct hook *ptr);
+/* Empties the list at 'head', calling 'free_hook()' on each entry */
+void clear_hook_list(struct list_head *head);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 34b0df5216..6e4a3e763f 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -4,8 +4,85 @@ test_description='config-managed multihooks, including git-hook command'
 
 . ./test-lib.sh
 
-test_expect_success 'git hook command does not crash' '
-	git hook
+ROOT=
+if test_have_prereq MINGW
+then
+	# In Git for Windows, Unix-like paths work only in shell scripts;
+	# `git.exe`, however, will prefix them with the pseudo root directory
+	# (of the Unix shell). Let's accommodate for that.
+	ROOT="$(cd / && pwd)"
+fi
+
+setup_hooks () {
+	test_config hook.pre-commit.command "/path/ghi" --add
+	test_config_global hook.pre-commit.command "/path/def" --add
+}
+
+setup_hookcmd () {
+	test_config hook.pre-commit.command "abc" --add
+	test_config_global hookcmd.abc.command "/path/abc" --add
+}
+
+test_expect_success 'git hook rejects commands without a mode' '
+	test_must_fail git hook pre-commit
+'
+
+
+test_expect_success 'git hook rejects commands without a hookname' '
+	test_must_fail git hook list
+'
+
+test_expect_success 'git hook runs outside of a repo' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	EOF
+
+	nongit git config --list --global &&
+
+	nongit git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list orders by config order' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	local: $ROOT/path/ghi
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list dereferences a hookcmd' '
+	setup_hooks &&
+	setup_hookcmd &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	local: $ROOT/path/ghi
+	local: $ROOT/path/abc
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list reorders on duplicate commands' '
+	setup_hooks &&
+
+	test_config hook.pre-commit.command "/path/def" --add &&
+
+	cat >expected <<-EOF &&
+	local: $ROOT/path/ghi
+	local: $ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v5 4/8] hook: include hookdir hook in list
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
                         ` (2 preceding siblings ...)
  2020-10-14 23:24       ` [PATCH v5 3/8] hook: add list command Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 5/8] hook: implement hookcmd.<name>.skip Emily Shaffer
                         ` (4 subsequent siblings)
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Historically, hooks are declared by placing an executable into
$GIT_DIR/hooks/$HOOKNAME (or $HOOKDIR/$HOOKNAME). Although hooks taken
from the config are more featureful than hooks placed in the $HOOKDIR,
those hooks should not stop working for users who already have them.

When we add hooks from $HOOKDIR to the list of all hooks to run, to
support paths with spaces in them, quote legacy hook paths.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Newly split into its own commit since v4, and taking place much sooner.
    
    An unfortunate side effect of adding this support *before* the
    hook.runHookDir support is that the labels on the list are not clear -
    because we aren't yet flagging which hooks are from the hookdir versus
    the config. I suppose we could move the addition of that field to the
    struct hook up to this patch, but it didn't make a lot of sense to me to
    do it just for cosmetic purposes.

 Documentation/config/hook.txt |  5 +++
 builtin/hook.c                | 70 +++++++++++++++++++++++++++++++----
 hook.c                        | 36 ++++++++++++++++++
 hook.h                        | 16 ++++++++
 t/t1360-config-based-hooks.sh | 62 +++++++++++++++++++++++++++++++
 5 files changed, 182 insertions(+), 7 deletions(-)

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
index 71449ecbc7..75312754ae 100644
--- a/Documentation/config/hook.txt
+++ b/Documentation/config/hook.txt
@@ -7,3 +7,8 @@ hookcmd.<name>.command::
 	A command to execute during a hook for which <name> has been specified
 	as a command. This can be an executable on your device or a oneliner for
 	your shell. See linkgit:git-hook[1].
+
+hook.runHookDir::
+	Controls how hooks contained in your hookdir are executed. Can be any of
+	"yes", "warn", "interactive", or "no". Defaults to "yes". See
+	linkgit:git-hook[1] and linkgit:git-config[1] "core.hooksPath").
diff --git a/builtin/hook.c b/builtin/hook.c
index 4d36de52f8..16324d4195 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -11,11 +11,14 @@ static const char * const builtin_hook_usage[] = {
 	NULL
 };
 
+static enum hookdir_opt should_run_hookdir;
+
 static int list(int argc, const char **argv, const char *prefix)
 {
 	struct list_head *head, *pos;
 	struct hook *item;
 	struct strbuf hookname = STRBUF_INIT;
+	struct strbuf hookdir_annotation = STRBUF_INIT;
 
 	struct option list_options[] = {
 		OPT_END(),
@@ -40,12 +43,39 @@ static int list(int argc, const char **argv, const char *prefix)
 		return 0;
 	}
 
+	switch (should_run_hookdir) {
+		case hookdir_no:
+			strbuf_addstr(&hookdir_annotation, _(" (will not run)"));
+			break;
+		case hookdir_interactive:
+			strbuf_addstr(&hookdir_annotation, _(" (will prompt)"));
+			break;
+		case hookdir_warn:
+		case hookdir_unknown:
+			strbuf_addstr(&hookdir_annotation, _(" (will warn)"));
+			break;
+		case hookdir_yes:
+		/*
+		 * The default behavior should agree with
+		 * hook.c:configured_hookdir_opt().
+		 */
+		default:
+			break;
+	}
+
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
-		if (item)
-			printf("%s: %s\n",
-			       config_scope_name(item->origin),
-			       item->command.buf);
+		if (item) {
+			/* Don't translate 'hookdir' - it matches the config */
+			printf("%s: %s%s\n",
+			       (item->from_hookdir
+				? "hookdir"
+				: config_scope_name(item->origin)),
+			       item->command.buf,
+			       (item->from_hookdir
+				? hookdir_annotation.buf
+				: ""));
+		}
 	}
 
 	clear_hook_list(head);
@@ -56,14 +86,40 @@ static int list(int argc, const char **argv, const char *prefix)
 
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
+	const char *run_hookdir = NULL;
+
 	struct option builtin_hook_options[] = {
+		OPT_STRING(0, "run-hookdir", &run_hookdir, N_("option"),
+			   N_("what to do with hooks found in the hookdir")),
 		OPT_END(),
 	};
-	if (argc < 2)
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	/* after the parse, we should have "<command> <hookname> <args...>" */
+	if (argc < 1)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
-	if (!strcmp(argv[1], "list"))
-		return list(argc - 1, argv + 1, prefix);
+
+	/* argument > config */
+	if (run_hookdir)
+		if (!strcmp(run_hookdir, "no"))
+			should_run_hookdir = hookdir_no;
+		else if (!strcmp(run_hookdir, "yes"))
+			should_run_hookdir = hookdir_yes;
+		else if (!strcmp(run_hookdir, "warn"))
+			should_run_hookdir = hookdir_warn;
+		else if (!strcmp(run_hookdir, "interactive"))
+			should_run_hookdir = hookdir_interactive;
+		else
+			die(_("'%s' is not a valid option for --run-hookdir "
+			      "(yes, warn, interactive, no)"), run_hookdir);
+	else
+		should_run_hookdir = configured_hookdir_opt();
+
+	if (!strcmp(argv[0], "list"))
+		return list(argc, argv, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index 937dc768c8..340e5a35c8 100644
--- a/hook.c
+++ b/hook.c
@@ -2,6 +2,7 @@
 
 #include "hook.h"
 #include "config.h"
+#include "run-command.h"
 
 void free_hook(struct hook *ptr)
 {
@@ -34,6 +35,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 		to_add = xmalloc(sizeof(struct hook));
 		strbuf_init(&to_add->command, 0);
 		strbuf_addstr(&to_add->command, command);
+		to_add->from_hookdir = 0;
 	}
 
 	/* re-set the scope so we show where an override was specified */
@@ -95,11 +97,33 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 	return 0;
 }
 
+enum hookdir_opt configured_hookdir_opt(void)
+{
+	const char *key;
+	if (git_config_get_value("hook.runhookdir", &key))
+		return hookdir_yes; /* by default, just run it. */
+
+	if (!strcmp(key, "no"))
+		return hookdir_no;
+
+	if (!strcmp(key, "yes"))
+		return hookdir_yes;
+
+	if (!strcmp(key, "warn"))
+		return hookdir_warn;
+
+	if (!strcmp(key, "interactive"))
+		return hookdir_interactive;
+
+	return hookdir_unknown;
+}
+
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
 	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
 	struct hook_config_cb cb_data = { &hook_key, hook_head };
+	const char *legacy_hook_path = NULL;
 
 	INIT_LIST_HEAD(hook_head);
 
@@ -110,6 +134,18 @@ struct list_head* hook_list(const struct strbuf* hookname)
 
 	git_config(hook_config_lookup, (void*)&cb_data);
 
+	if (have_git_dir())
+		legacy_hook_path = find_hook(hookname->buf);
+
+	/* Unconditionally add legacy hook, but annotate it. */
+	if (legacy_hook_path) {
+		struct hook *legacy_hook;
+
+		append_or_move_hook(hook_head, absolute_path(legacy_hook_path));
+		legacy_hook = list_entry(hook_head->prev, struct hook, list);
+		legacy_hook->from_hookdir = 1;
+	}
+
 	strbuf_release(&hook_key);
 	return hook_head;
 }
diff --git a/hook.h b/hook.h
index 8ffc4f14b6..ca45d388d3 100644
--- a/hook.h
+++ b/hook.h
@@ -12,6 +12,7 @@ struct hook
 	enum config_scope origin;
 	/* The literal command to run. */
 	struct strbuf command;
+	int from_hookdir;
 };
 
 /*
@@ -20,6 +21,21 @@ struct hook
  */
 struct list_head* hook_list(const struct strbuf *hookname);
 
+enum hookdir_opt
+{
+	hookdir_no,
+	hookdir_warn,
+	hookdir_interactive,
+	hookdir_yes,
+	hookdir_unknown,
+};
+
+/*
+ * Provides the hookdir_opt specified in the config without consulting any
+ * command line arguments.
+ */
+enum hookdir_opt configured_hookdir_opt(void);
+
 /* Free memory associated with a 'struct hook' */
 void free_hook(struct hook *ptr);
 /* Empties the list at 'head', calling 'free_hook()' on each entry */
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 6e4a3e763f..91127a50a4 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -23,6 +23,14 @@ setup_hookcmd () {
 	test_config_global hookcmd.abc.command "/path/abc" --add
 }
 
+setup_hookdir () {
+	mkdir .git/hooks
+	write_script .git/hooks/pre-commit <<-EOF
+	echo \"Legacy Hook\"
+	EOF
+	test_when_finished rm -rf .git/hooks
+}
+
 test_expect_success 'git hook rejects commands without a mode' '
 	test_must_fail git hook pre-commit
 '
@@ -85,4 +93,58 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git hook list shows hooks from the hookdir' '
+	setup_hookdir &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'hook.runHookDir = no is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "no" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will not run)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
+test_expect_success 'hook.runHookDir = warn is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "warn" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will warn)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
+
+test_expect_success 'hook.runHookDir = interactive is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "interactive" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will prompt)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v5 5/8] hook: implement hookcmd.<name>.skip
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
                         ` (3 preceding siblings ...)
  2020-10-14 23:24       ` [PATCH v5 4/8] hook: include hookdir hook in list Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 6/8] parse-options: parse into strvec Emily Shaffer
                         ` (3 subsequent siblings)
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

If a user wants a specific repo to skip execution of a hook which is set
at a global or system level, they can now do so by specifying 'skip' in
their repo config:

~/.gitconfig
  [hook.pre-commit]
    command = skippable-oneliner
    command = skippable-hookcmd

  [hookcmd.skippable-hookcmd]
    command = foo.sh

$GIT_DIR/.git/config
  [hookcmd.skippable-oneliner]
    skip = true
  [hookcmd.skippable-hookcmd]
    skip = true

Later it may make sense to add an option like
"hookcmd.<name>.<hook-event>-skip" - but for simplicity, let's start
with a universal skip setting like this.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    New since v4.
    
    During the Google team's review club I was reminded about this whole
    'skip' option I never implemented. It's true that it's impossible to
    exclude a given hook without this; however, I think I have some more
    work to do on it, so consider it RFC for now and tell me what you think
    :)
     - Emily

 hook.c                        | 37 +++++++++++++++++++++++++----------
 t/t1360-config-based-hooks.sh | 23 ++++++++++++++++++++++
 2 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/hook.c b/hook.c
index 340e5a35c8..f4084e33c8 100644
--- a/hook.c
+++ b/hook.c
@@ -12,23 +12,24 @@ void free_hook(struct hook *ptr)
 	}
 }
 
-static void append_or_move_hook(struct list_head *head, const char *command)
+static struct hook* find_hook_by_command(struct list_head *head, const char *command)
 {
 	struct list_head *pos = NULL, *tmp = NULL;
-	struct hook *to_add = NULL;
+	struct hook *found = NULL;
 
-	/*
-	 * remove the prior entry with this command; we'll replace it at the
-	 * end.
-	 */
 	list_for_each_safe(pos, tmp, head) {
 		struct hook *it = list_entry(pos, struct hook, list);
 		if (!strcmp(it->command.buf, command)) {
 		    list_del(pos);
-		    /* we'll simply move the hook to the end */
-		    to_add = it;
+		    found = it;
 		}
 	}
+	return found;
+}
+
+static void append_or_move_hook(struct list_head *head, const char *command)
+{
+	struct hook *to_add = find_hook_by_command(head, command);
 
 	if (!to_add) {
 		/* adding a new hook, not moving an old one */
@@ -41,7 +42,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 	/* re-set the scope so we show where an override was specified */
 	to_add->origin = current_config_scope();
 
-	list_add_tail(&to_add->list, pos);
+	list_add_tail(&to_add->list, head);
 }
 
 static void remove_hook(struct list_head *to_remove)
@@ -73,8 +74,18 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 	if (!strcmp(key, hook_key)) {
 		const char *command = value;
 		struct strbuf hookcmd_name = STRBUF_INIT;
+		int skip = 0;
+
+		/*
+		 * Check if we're removing that hook instead. Hookcmds are
+		 * removed by name, and inlined hooks are removed by command
+		 * content.
+		 */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.skip", command);
+		git_config_get_bool(hookcmd_name.buf, &skip);
 
 		/* Check if a hookcmd with that name exists. */
+		strbuf_reset(&hookcmd_name);
 		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
 		git_config_get_value(hookcmd_name.buf, &command);
 
@@ -89,7 +100,13 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 		 *   for each key+value, do_callback(key, value, cb_data)
 		 */
 
-		append_or_move_hook(head, command);
+		if (skip) {
+			struct hook *to_remove = find_hook_by_command(head, command);
+			if (to_remove)
+				remove_hook(&(to_remove->list));
+		} else {
+			append_or_move_hook(head, command);
+		}
 
 		strbuf_release(&hookcmd_name);
 	}
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 91127a50a4..ebd3bc623f 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -132,6 +132,29 @@ test_expect_success 'hook.runHookDir = warn is respected by list' '
 	test_i18ncmp expected actual
 '
 
+test_expect_success 'git hook list removes skipped hookcmd' '
+	setup_hookcmd &&
+	test_config hookcmd.abc.skip "true" --add &&
+
+	cat >expected <<-EOF &&
+	no commands configured for hook '\''pre-commit'\''
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_i18ncmp expected actual
+'
+
+test_expect_success 'git hook list removes skipped inlined hook' '
+	setup_hooks &&
+	test_config hookcmd."$ROOT/path/ghi".skip "true" --add &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
 
 test_expect_success 'hook.runHookDir = interactive is respected by list' '
 	setup_hookdir &&
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v5 6/8] parse-options: parse into strvec
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
                         ` (4 preceding siblings ...)
  2020-10-14 23:24       ` [PATCH v5 5/8] hook: implement hookcmd.<name>.skip Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 7/8] hook: add 'run' subcommand Emily Shaffer
                         ` (2 subsequent siblings)
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

parse-options already knows how to read into a string_list, and it knows
how to read into an strvec as a passthrough (that is, including the
argument as well as its value). string_list and strvec serve similar
purposes but are somewhat painful to convert between; so, let's teach
parse-options to read values of string arguments directly into an
strvec without preserving the argument name.

This is useful if collecting generic arguments to pass through to
another command, for example, 'git hook run --arg "--quiet" --arg
"--format=pretty" some-hook'. The resulting strvec would contain
{ "--quiet", "--format=pretty" }.

The implementation is based on that of OPT_STRING_LIST.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, fixed one or two more places where I missed the argv_array->strvec
    rename.

 Documentation/technical/api-parse-options.txt |  5 +++++
 parse-options-cb.c                            | 16 ++++++++++++++++
 parse-options.h                               |  4 ++++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index 5a60bbfa7f..679bd98629 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -173,6 +173,11 @@ There are some macros to easily define options:
 	The string argument is stored as an element in `string_list`.
 	Use of `--no-option` will clear the list of preceding values.
 
+`OPT_STRVEC(short, long, &struct strvec, arg_str, description)`::
+	Introduce an option with a string argument.
+	The string argument is stored as an element in `strvec`.
+	Use of `--no-option` will clear the list of preceding values.
+
 `OPT_INTEGER(short, long, &int_var, description)`::
 	Introduce an option with integer argument.
 	The integer is put into `int_var`.
diff --git a/parse-options-cb.c b/parse-options-cb.c
index 4542d4d3f9..c2451dfb1b 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -207,6 +207,22 @@ int parse_opt_string_list(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
+int parse_opt_strvec(const struct option *opt, const char *arg, int unset)
+{
+	struct strvec *v = opt->value;
+
+	if (unset) {
+		strvec_clear(v);
+		return 0;
+	}
+
+	if (!arg)
+		return -1;
+
+	strvec_push(v, arg);
+	return 0;
+}
+
 int parse_opt_noop_cb(const struct option *opt, const char *arg, int unset)
 {
 	return 0;
diff --git a/parse-options.h b/parse-options.h
index 7030d8f3da..75cc8c7c96 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -177,6 +177,9 @@ struct option {
 #define OPT_STRING_LIST(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
 				      (h), 0, &parse_opt_string_list }
+#define OPT_STRVEC(s, l, v, a, h) \
+				    { OPTION_CALLBACK, (s), (l), (v), (a), \
+				      (h), 0, &parse_opt_strvec }
 #define OPT_UYN(s, l, v, h)         { OPTION_CALLBACK, (s), (l), (v), NULL, \
 				      (h), PARSE_OPT_NOARG, &parse_opt_tertiary }
 #define OPT_EXPIRY_DATE(s, l, v, h) \
@@ -296,6 +299,7 @@ int parse_opt_commits(const struct option *, const char *, int);
 int parse_opt_commit(const struct option *, const char *, int);
 int parse_opt_tertiary(const struct option *, const char *, int);
 int parse_opt_string_list(const struct option *, const char *, int);
+int parse_opt_strvec(const struct option *, const char *, int);
 int parse_opt_noop_cb(const struct option *, const char *, int);
 enum parse_opt_result parse_opt_unknown_cb(struct parse_opt_ctx_t *ctx,
 					   const struct option *,
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v5 7/8] hook: add 'run' subcommand
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
                         ` (5 preceding siblings ...)
  2020-10-14 23:24       ` [PATCH v5 6/8] parse-options: parse into strvec Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-10-14 23:24       ` [PATCH v5 8/8] hook: replace find_hook() with hook_exists() Emily Shaffer
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In order to enable hooks to be run as an external process, by a
standalone Git command, or by tools which wrap Git, provide an external
means to run all configured hook commands for a given hook event.

For now, the hook commands will run in config order, in series. As
alternate ordering or parallelism is supported in the future, we should
add knobs to use those to the command line as well.

As with the legacy hook implementation, all stdout generated by hook
commands is redirected to stderr. Piping from stdin is not yet
supported.

Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
execution list. For now, there is no way to disable them.

Users may wish to provide hook commands like 'git config
hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
first split by space or quotes into an argv_array, then expanded with
'expand_user_path()'.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, updated the docs, and did less local application of single
    quotes. In order for hookdir hooks to run successfully with a space in
    the path, though, they must not be run with 'sh -c'. So we can treat the
    hookdir hooks specially, and warn users via doc about special
    considerations for configured hooks with spaces in their path.

 Documentation/git-hook.txt    |  12 +++-
 builtin/hook.c                |  40 +++++++++++++-
 hook.c                        | 100 ++++++++++++++++++++++++++++++++++
 hook.h                        |   7 +++
 t/t1360-config-based-hooks.sh |  65 +++++++++++++++++++++-
 5 files changed, 218 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index f19875ed68..95d3687905 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -9,11 +9,12 @@ SYNOPSIS
 --------
 [verse]
 'git hook' list <hook-name>
+'git hook' run <hook-name>
 
 DESCRIPTION
 -----------
-You can list configured hooks with this command. Later, you will be able to run,
-add, and modify hooks with this command.
+You can list and run configured hooks with this command. Later, you will be able
+to add and modify hooks with this command.
 
 This command parses the default configuration files for sections `hook` and
 `hookcmd`. `hook` is used to describe the commands which will be run during a
@@ -64,6 +65,13 @@ in the order they should be run, and print the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
 This output is human-readable and the format is subject to change over time.
 
+run `<hook-name>`::
+
+Runs hooks configured for `<hook-name>`, in the same order displayed by `git
+hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
+containing special characters or spaces should be wrapped in single quotes:
+`command = '/my/path with spaces/script.sh' some args`.
+
 CONFIGURATION
 -------------
 include::config/hook.txt[]
diff --git a/builtin/hook.c b/builtin/hook.c
index 16324d4195..64aad28e54 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -5,9 +5,11 @@
 #include "hook.h"
 #include "parse-options.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
+	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
 	NULL
 };
 
@@ -84,6 +86,40 @@ static int list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static int run(int argc, const char **argv, const char *prefix)
+{
+	struct strbuf hookname = STRBUF_INIT;
+	struct strvec envs = STRVEC_INIT;
+	struct strvec args = STRVEC_INIT;
+
+	struct option run_options[] = {
+		OPT_STRVEC('e', "env", &envs, N_("var"),
+			   N_("environment variables for hook to use")),
+		OPT_STRVEC('a', "arg", &args, N_("args"),
+			   N_("argument to pass to hook")),
+		OPT_END(),
+	};
+
+	/*
+	 * While it makes sense to list hooks out-of-repo, it doesn't make sense
+	 * to execute them. Hooks usually want to look at repository artifacts.
+	 */
+	if (!have_git_dir())
+		usage_msg_opt(_("You must be in a Git repo to execute hooks."),
+			      builtin_hook_usage, run_options);
+
+	argc = parse_options(argc, argv, prefix, run_options,
+			     builtin_hook_usage, 0);
+
+	if (argc < 1)
+		usage_msg_opt(_("You must specify a hook event to run."),
+			      builtin_hook_usage, run_options);
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	return run_hooks(envs.v, hookname.buf, &args, should_run_hookdir);
+}
+
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
 	const char *run_hookdir = NULL;
@@ -98,7 +134,7 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 			     builtin_hook_usage, 0);
 
 	/* after the parse, we should have "<command> <hookname> <args...>" */
-	if (argc < 1)
+	if (argc < 2)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
 
@@ -120,6 +156,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[0], "list"))
 		return list(argc, argv, prefix);
+	if (!strcmp(argv[0], "run"))
+		return run(argc, argv, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index f4084e33c8..1494a32c1a 100644
--- a/hook.c
+++ b/hook.c
@@ -3,6 +3,7 @@
 #include "hook.h"
 #include "config.h"
 #include "run-command.h"
+#include "prompt.h"
 
 void free_hook(struct hook *ptr)
 {
@@ -135,6 +136,56 @@ enum hookdir_opt configured_hookdir_opt(void)
 	return hookdir_unknown;
 }
 
+static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
+{
+	struct strbuf prompt = STRBUF_INIT;
+	/*
+	 * If the path doesn't exist, don't bother adding the empty hook and
+	 * don't bother checking the config or prompting the user.
+	 */
+	if (!path)
+		return 0;
+
+	switch (cfg)
+	{
+		case hookdir_no:
+			return 0;
+		case hookdir_unknown:
+			fprintf(stderr,
+				_("Unrecognized value for 'hook.runHookDir'. "
+				  "Is there a typo? "));
+			/* FALLTHROUGH */
+		case hookdir_warn:
+			fprintf(stderr, _("Running legacy hook at '%s'\n"),
+				path);
+			return 1;
+		case hookdir_interactive:
+			do {
+				/*
+				 * TRANSLATORS: Make sure to include [Y] and [n]
+				 * in your translation. Only English input is
+				 * accepted. Default option is "yes".
+				 */
+				fprintf(stderr, _("Run '%s'? [Yn] "), path);
+				git_read_line_interactively(&prompt);
+				strbuf_tolower(&prompt);
+				if (starts_with(prompt.buf, "n")) {
+					strbuf_release(&prompt);
+					return 0;
+				} else if (starts_with(prompt.buf, "y")) {
+					strbuf_release(&prompt);
+					return 1;
+				}
+				/* otherwise, we didn't understand the input */
+			} while (prompt.len); /* an empty reply means "Yes" */
+			strbuf_release(&prompt);
+			return 1;
+		case hookdir_yes:
+		default:
+			return 1;
+	}
+}
+
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
@@ -166,3 +217,52 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	strbuf_release(&hook_key);
 	return hook_head;
 }
+
+
+int run_hooks(const char *const *env, const char *hookname,
+	      const struct strvec *args, enum hookdir_opt run_hookdir)
+{
+	struct strbuf hookname_str = STRBUF_INIT;
+	struct list_head *to_run, *pos = NULL, *tmp = NULL;
+	int rc = 0;
+
+	strbuf_addstr(&hookname_str, hookname);
+
+	to_run = hook_list(&hookname_str);
+
+	list_for_each_safe(pos, tmp, to_run) {
+		struct child_process hook_proc = CHILD_PROCESS_INIT;
+		struct hook *hook = list_entry(pos, struct hook, list);
+
+		hook_proc.env = env;
+		hook_proc.no_stdin = 1;
+		hook_proc.stdout_to_stderr = 1;
+		hook_proc.trace2_hook_name = hook->command.buf;
+		hook_proc.use_shell = 1;
+
+
+		if (hook->from_hookdir) {
+		    if (!should_include_hookdir(hook->command.buf, run_hookdir))
+			continue;
+		    /*
+		     * Commands from the config could be oneliners, but we know
+		     * for certain that hookdir commands are not.
+		     */
+		    hook_proc.use_shell = 0;
+		}
+
+		/* add command */
+		strvec_push(&hook_proc.args, hook->command.buf);
+
+		/*
+		 * add passed-in argv, without expanding - let the user get back
+		 * exactly what they put in
+		 */
+		if (args)
+			strvec_pushv(&hook_proc.args, args->v);
+
+		rc |= run_command(&hook_proc);
+	}
+
+	return rc;
+}
diff --git a/hook.h b/hook.h
index ca45d388d3..6eb1dc99c4 100644
--- a/hook.h
+++ b/hook.h
@@ -1,6 +1,7 @@
 #include "config.h"
 #include "list.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 struct hook
 {
@@ -35,6 +36,12 @@ enum hookdir_opt
  * command line arguments.
  */
 enum hookdir_opt configured_hookdir_opt(void);
+/*
+ * Runs all hooks associated to the 'hookname' event in order. Each hook will be
+ * passed 'env' and 'args'.
+ */
+int run_hooks(const char *const *env, const char *hookname,
+	      const struct strvec *args, enum hookdir_opt run_hookdir);
 
 /* Free memory associated with a 'struct hook' */
 void free_hook(struct hook *ptr);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index ebd3bc623f..5b3003d59b 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -115,7 +115,10 @@ test_expect_success 'hook.runHookDir = no is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
-	test_i18ncmp expected actual
+	test_i18ncmp expected actual &&
+
+	git hook run pre-commit 2>actual &&
+	test_must_be_empty actual
 '
 
 test_expect_success 'hook.runHookDir = warn is respected by list' '
@@ -129,6 +132,14 @@ test_expect_success 'hook.runHookDir = warn is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
+	test_i18ncmp expected actual &&
+
+	cat >expected <<-EOF &&
+	Running legacy hook at '\''$(pwd)/.git/hooks/pre-commit'\''
+	"Legacy Hook"
+	EOF
+
+	git hook run pre-commit 2>actual &&
 	test_i18ncmp expected actual
 '
 
@@ -156,7 +167,7 @@ test_expect_success 'git hook list removes skipped inlined hook' '
 	test_cmp expected actual
 '
 
-test_expect_success 'hook.runHookDir = interactive is respected by list' '
+test_expect_success 'hook.runHookDir = interactive is respected by list and run' '
 	setup_hookdir &&
 
 	test_config hook.runHookDir "interactive" &&
@@ -167,7 +178,55 @@ test_expect_success 'hook.runHookDir = interactive is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
-	test_i18ncmp expected actual
+	test_i18ncmp expected actual &&
+
+	test_write_lines n | git hook run pre-commit 2>actual &&
+	! grep "Legacy Hook" actual &&
+
+	test_write_lines y | git hook run pre-commit 2>actual &&
+	grep "Legacy Hook" actual
+'
+
+test_expect_success 'inline hook definitions execute oneliners' '
+	test_config hook.pre-commit.command "echo \"Hello World\"" &&
+
+	echo "Hello World" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'inline hook definitions resolve paths' '
+	write_script sample-hook.sh <<-EOF &&
+	echo \"Sample Hook\"
+	EOF
+
+	test_when_finished "rm sample-hook.sh" &&
+
+	test_config hook.pre-commit.command "\"$(pwd)/sample-hook.sh\"" &&
+
+	echo \"Sample Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'hookdir hook included in git hook run' '
+	setup_hookdir &&
+
+	echo \"Legacy Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'out-of-repo runs excluded' '
+	setup_hooks &&
+
+	nongit test_must_fail git hook run pre-commit
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v5 8/8] hook: replace find_hook() with hook_exists()
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
                         ` (6 preceding siblings ...)
  2020-10-14 23:24       ` [PATCH v5 7/8] hook: add 'run' subcommand Emily Shaffer
@ 2020-10-14 23:24       ` Emily Shaffer
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
  8 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-14 23:24 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Add a helper to easily determine whether any hooks exist for a given
hook event.

Many callers want to check whether some state could be modified by a
hook; that check should include the config-based hooks as well. Optimize
by checking the config directly. Since commands which execute hooks
might want to take args to replace 'hook.runHookDir', let
'hook_exists()' mirror the behavior of 'hook.runHookDir'.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, a little more nuance when deciding whether a hookdir hook can happen.

 hook.c | 14 ++++++++++++++
 hook.h |  9 +++++++++
 2 files changed, 23 insertions(+)

diff --git a/hook.c b/hook.c
index 1494a32c1a..e3d289d0e9 100644
--- a/hook.c
+++ b/hook.c
@@ -218,6 +218,20 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	return hook_head;
 }
 
+int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir)
+{
+	const char *value = NULL; /* throwaway */
+	struct strbuf hook_key = STRBUF_INIT;
+
+	int could_run_hookdir = (should_run_hookdir == hookdir_interactive ||
+				should_run_hookdir == hookdir_warn ||
+				should_run_hookdir == hookdir_yes)
+				&& !!find_hook(hookname);
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname);
+
+	return (!git_config_get_value(hook_key.buf, &value)) || could_run_hookdir;
+}
 
 int run_hooks(const char *const *env, const char *hookname,
 	      const struct strvec *args, enum hookdir_opt run_hookdir)
diff --git a/hook.h b/hook.h
index 6eb1dc99c4..bf8ea3ee11 100644
--- a/hook.h
+++ b/hook.h
@@ -36,6 +36,15 @@ enum hookdir_opt
  * command line arguments.
  */
 enum hookdir_opt configured_hookdir_opt(void);
+
+/*
+ * Returns 1 if any hooks are specified in the config or if a hook exists in the
+ * hookdir. Typically, invoke hook_exsts() like:
+ *   hook_exists(hookname, configured_hookdir_opt());
+ * Like with run_hooks, if you take a --run-hookdir flag, reflect that
+ * user-specified behavior here instead.
+ */
+int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir);
 /*
  * Runs all hooks associated to the 'hookname' event in order. Each hook will be
  * passed 'env' and 'args'.
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* Re: [PATCH v5 1/8] doc: propose hooks managed by the config
  2020-10-14 23:24       ` [PATCH v5 1/8] doc: propose hooks managed by the config Emily Shaffer
@ 2020-10-15 16:31         ` Ævar Arnfjörð Bjarmason
  2020-10-16 17:29           ` Junio C Hamano
  2020-10-21 23:37           ` Emily Shaffer
  0 siblings, 2 replies; 170+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-10-15 16:31 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git


On Thu, Oct 15 2020, Emily Shaffer wrote:

> Notes:
>     Since v4, addressed comments from Jonathan Tan about wording.

I had some extensive comments on the v4 here:
https://lore.kernel.org/git/87mu0ygzk1.fsf@evledraar.gmail.com/

Your CL & this patch don't mention it. I'd be interested in
collaborating on this depending on if/how our goals/wants align, but I'd
lke to get your thoughts on that feedback first.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v5 1/8] doc: propose hooks managed by the config
  2020-10-15 16:31         ` Ævar Arnfjörð Bjarmason
@ 2020-10-16 17:29           ` Junio C Hamano
  2020-10-21 23:37           ` Emily Shaffer
  1 sibling, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-10-16 17:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Emily Shaffer, git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Thu, Oct 15 2020, Emily Shaffer wrote:
>
>> Notes:
>>     Since v4, addressed comments from Jonathan Tan about wording.
>
> I had some extensive comments on the v4 here:
> https://lore.kernel.org/git/87mu0ygzk1.fsf@evledraar.gmail.com/
>
> Your CL & this patch don't mention it. I'd be interested in
> collaborating on this depending on if/how our goals/wants align, but I'd
> lke to get your thoughts on that feedback first.

True.

It seems that it wasn't responded (not even a single-liner "Thanks,
I'll get to it later" or "Thanks, but the goal I am aiming is
different from yours and your experience does not translate directly
here") and I can only conclude that it somehow was overlooked?

Emily?

Side note: perhaps it is just me, but after making a review and
giving extensive comments and suggestions, it is often disorienting
to read the next round without getting any hint on which parts of
the comments were heard and which other parts were dismissed (and
why).  I think your earlier review is a kind that deserves a
separate response before the updated patchset.

Thanks.



^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v5 1/8] doc: propose hooks managed by the config
  2020-10-15 16:31         ` Ævar Arnfjörð Bjarmason
  2020-10-16 17:29           ` Junio C Hamano
@ 2020-10-21 23:37           ` Emily Shaffer
  1 sibling, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-10-21 23:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

On Thu, Oct 15, 2020 at 06:31:15PM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> 
> On Thu, Oct 15 2020, Emily Shaffer wrote:
> 
> > Notes:
> >     Since v4, addressed comments from Jonathan Tan about wording.
> 
> I had some extensive comments on the v4 here:
> https://lore.kernel.org/git/87mu0ygzk1.fsf@evledraar.gmail.com/

Hum, it seems I completely missed it. I'm sorry - that was very rude of
me! I'll have a look now and reply there.

 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-10-07  9:23       ` Ævar Arnfjörð Bjarmason
@ 2020-10-22  0:58         ` Emily Shaffer
  2020-10-23 19:10           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-10-22  0:58 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

On Wed, Oct 07, 2020 at 11:23:10AM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> 
> On Wed, Sep 09 2020, Emily Shaffer wrote:
> 
> First, thanks a lot for working on this. As you may have found I've done
> some small amount of actual work in this area before, but mostly just
> blathered about it on the ML.
> 
> > Begin a design document for config-based hooks, managed via git-hook.
> > Focus on an overview of the implementation and motivation for design
> > decisions. Briefly discuss the alternatives considered before this
> > point. Also, attempt to redefine terms to fit into a multihook world.
> > [...]
> > +[[status-quo]]
> > +=== Status quo
> > +
> > +Today users can implement multihooks themselves by using a "trampoline script"
> > +as their hook, and pointing that script to a directory or list of other scripts
> > +they wish to run.
> 
> ...or by setting core.hooksPath in their local/global/system
> config. Granted it doesn't cover the malicious hook injection case
> you're also trying to solve, but does address e.g. having a git server
> with a lot of centralized hooks.

Aha, setting core.hooksPath in the global/system config had not occurred
to me.

> 
> The "trampoline script" also isn't needed for the common case you
> mention, you just symlink the .git/hooks directory (as e.g. GitLab
> does). People usually use a trampoline script for e.g. using GNU
> parallel or something to execute N hooks.

Hm, I don't think that's quite true. Symlinking out .git/hooks doesn't
give me more than one $HOOKDIR/pre-commit - it just gives me a different
one. So if I wanted to run three different hooks, $HOOKDIR/pre-commit
would need to do the work of all three, regardless of where $HOOKDIR
points. That's what I meant when I said "multihooks" in this section.

But I think what you're trying to say is this: the "status quo" section
doesn't fully cover the status quo. There are more tricks than I
mentioned, e.g. 'git config --global core.hooksPath
/home/emily/githook/' to get the same set of hooks to run everywhere.
This approach still has some drawbacks - for example, it doesn't allow
me to use language-specific linters if I have repos in various
languages, without exempting an individual repo from the ~/githook/ by
'git config --local core.hooksPath
/home/emily/my-python-thing/.git/hook'.

It looks like, then, the "status quo" section needs some rework for the
next iteration.

> 
> 
> > +[[hook-directories]]
> > +=== Hook directories
> > +
> > +Other contributors have suggested Git learn about the existence of a directory
> > +such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
> 
> ...which seems like an easy thing to add later by having a "hookdir" in
> addition to "hookcmd", i.e. just specify a glob there instead of a
> cmd/path.

Hum, interesting! Something like so:

[hook.pre-commit]
  command = last-minute-checks

[hookdir.last-minute-checks]
  dir = /home/emily/last-minute-checks/*

And then the hooks library knows to go and run everything in
~/last-minute-checks/. This is easier to keep fresh than:

[hook.pre-commit]
  command = /home/emily/last-minute-checks/c-linter
  command = /home/emily/last-minute-checks/check-for-debug-prints
  command = /home/emily/last-minute-checks/check-for-notes
  ...

I actually like the idea of this for folks who might have a small number
of hooks they wrote for themselves. I wonder if it's applicable for
something like git-secrets, which presumably users would grab with a
'git clone' later.

It doesn't seem at odds with the rest of the design - how would you feel
about me adding it to the "future work" section at the end? Future work,
rather than "Emily will do this in the next couple of rounds", because:
 - I think nobody already has their hooks in $HOOKDIR/hook/pre-commit.d
   without a corresponding trampoline in $HOOKDIR/hook/pre-commit; so
   they could still call that trampoline, for now
 - I think it might be prone to some bikeshedding - e.g. should we
   recurse into ~/last-minute-checks/linters/c/? how far? what if some
   script requires magic options? etc? But as I'm typing those questions
   out they sound mostly trivial or ridiculous, so maybe my assessment
   is wrong here.
 - It sounds like you might be keen to write it, or at the very least,
   more keen than me
 - Practically speaking, I am not sure I have time to do it alongside
   the rest of the series. Again, my bikeshedding assessment could be
   wrong, and this extra feature could be totally trivial.

> You already use "hookdir" for something else though, so that's a bit
> confusing, perhaps s/hookcmd/definehookcmd/ would be less confusing, or
> perhaps more confusing...

"Hookdir" might be the wrong word to use, too - maybe it's better to
mirror "hookspath" there. Eitherway, "hookdir" and "hookspath" are
similar enough that I think it would be confusing, and "hookcmd" is
already getting some side-eye from me for not being a great choice.

Some thoughts for "a path to a directory in which multiple scripts for a
single hook live":
 - hookset
 - hookbatch (ugh, redundant with MS scripting)
 - hook.pre-commit.all-of = ~/last-minute-checks/
 -  "   "  .everything-in = "   "
...?

I think I named a couple silly ideas for "hookcmd" in another mail.

> 
> > [...]
> > +[[execution-ordering]]
> > +=== Execution ordering
> > +
> > +We may find that config order is insufficient for some users; for example,
> > +config order makes it difficult to add a new hook to the system or global config
> > +which runs at the end of the hook list. A new ordering schema should be:
> > +
> > +1) Specified by a `hook.order` config, so that users will not unexpectedly see
> > +their order change;
> > +
> > +2) Either dependency or numerically based.
> > +
> > +Dependency-based ordering is prone to classic linked-list problems, like a
> > +cycles and handling of missing dependencies. But, it paves the way for enabling
> > +parallelization if some tasks truly depend on others.
> >
> > +Numerical ordering makes it tricky for Git to generate suggested ordering
> > +numbers for each command, but is easy to determine a definitive order.
> > +
> > +[[parallelization]]
> > +=== Parallelization
> > +
> > +Users with many hooks might want to run them simultaneously, if the hooks don't
> > +modify state; if one hook depends on another's output, then users will want to
> > +specify those dependencies. If we decide to solve this problem, we may want to
> > +look to modern build systems for inspiration on how to manage dependencies and
> > +parallel tasks.
> 
> If you're taking requests it would make me very happy if we had
> parallelism in this from day one. It's the kind of thing that's hard to
> do by default once a feature is shipped since people will implicitly
> depend on it not being there, i.e. we won't know what we're breaking.

Hm. This might be tricky.

Some hooks are inherently not able to be parallelized - for example,
hooks which modify a given file, like the commit message draft. In
general, based on the handful of hooks I've converted locally, it's hard
to check whether a callsite assumes a hook could have modified state.
Usually this seems to be done with a call to find_hook() ("was there a
hook that might have run?") and then reopening the file. Sometimes a
file is reopened unconditionally. Sometimes the find_hook() call is
very far away from the run_hook_le() call.

The rest, then, which only read a file and say yes or no, probably don't
need to have a strict ordering - at least as far as Git is concerned.
And I think that's what you're worried about:

[hook.theoretical-parallelizable-event]
  command = check-and-mark-a-file-foo
  command = check-file-foo-and-do-something-else
  command = do-something-totally-unrelated

On day 1 of this feature, as written, this is safe. But if we aren't
careful and we start to parallelize *without* setting up dependency
ordering, e.g. 'git config --global hook.parallelize', and turn that on
by default without warning anyone, then the author of this config will
be unhappy.

But as I read further, you're talking about specifically *not* allowing
dependency ordering...

> 
> I think doing it this way is simple, covers most use cases, and solves a
> lot of the problems you note:
> 
> 1. Don't use config order to execute hooks, use glob'd name order
>    regardless of origin. I.e. a system-level hook is called "001-first"
>    is executed before a local hook called "999-at-the-end" (or the other
>    way around, i.e. hook origin doesn't matter).

Can you say a little more about why different ordering schema would
matter, if we effectively don't care which jobs are in parallel with
which, as you describe? I'm not quite following.

> 
> 2. We execute hooks parallel in that glob order, i.e. a pthread for-loop
>    that starts the 001-first task first, eventually getting to
>    999-at-the-end N at a time. I.e. the same as:
> 
>        parallel --jobs N --halt-on-error soon,fail=1" ::: <hooks-in-glob-order>
> 
>    This allows for parallelism but guarantees the very useful case of
>    having a global log hook being guaranteed to execute.

Ah, I think you're suggesting the glob order specifically to make up for
--halt-on-error in this case.

> 
> 3. A hook can define "parallel=no" in its config. We'll then run it
>    while no other hook is running.
> 
> 4. We don't attempt to do dependencies etc, if you need that sort of
>    complexity you can just make one of the hooks be a hook runner as
>    users do now for the common "make it parallel" case.

If we aren't attempting any magical ordering, then I don't really see a
big difference between glob vs. config order - presumably for most users
the effect would be same, e.g. N = $(nproc * hyperthreading), M = (number of scripts I
care to run) probably will often result in M < N, so all jobs would run
simultaneously anyways.

> 
> It's a relatively small change to the code you have already. I.e. the
> for_each() in run_hooks() would be called N times for each continuous
> glob'd parallel/non-parallel segment, and hook_list()'s config parsing
> would learn to spew those out as a list-of-lists.
> 
> This also gives you a rudimentary implementation of the dependency
> schema you proposed for free. I.e. a definition of (pseudocode):
> 
>     hookcmd=000-first
>     parallel=no
> 
>     hookcmd=250-middle-abc
>     hookcmd=250-middle-xyz
> 
>     hookcmd=300-gather
>     parallel=no
> 
>     hookcmd=999-the-end
> 
> Would result in the pseudocode execution of;
> 
>     segments=[[000-first],
>               [250-middle-abc, 250-middle-xyz],

Hum. This seems to say "folks who started their hooks with the same
number agree that their hooks should also run simultaneously" - which
sounds like an even harder problem than "how do I know my ordering
number isn't the same as someone else's in another config file". Or else
I'm misunderstanding your pseudo :)

Ah, I see later you mention it directly as a dependency schema. I think
this offers the same set of problems I saw trying to use this as an
ordering schema, but worse in all the usual ways parallelism provides.
It is still impossible for someone writing a global or system config to
know where in the dependency chain more local hooks reside.

>               [300-gather],
>               [999-the-end]]
>     for each s in segments:
>         ok = run_in_parallel(s)
>         last if !ok # or, depending on "early exit?" config
> 
> I.e.:
> 
>  * The common case of people adding N hooks won't take sum(N) time.
> 
>  * parallel=no hooks aren't run in parallel with other non-parallel
>    hooks
> 
>  * We support a rudimentary dependency schema as a side-effect,
>    i.e. defining 300-gather as non-parallel allows it to act as the sole
>    "reduce" step in a map/reduce in a "map" step started with the 250-*
>    hooks.

As I understand it, the main concerns you have about getting
parallelization to happen on day 1 are like so:

 - keep users from assuming serial execution
 - avoid a messy schema change to deal with dependencies

I see the benefit of the former; I don't like the new schema proposed by
the latter. I do see that not turning it on day 1 would prevent us from
turning it on by default later, in case users did something silly like
assume dependencies.

Hrm.

I think we could turn on parallelization day 1 by providing an
explicitly-parallel API in hook.h (and a similar 'git hook run foo
--parallel' flag), and being more careful when converting hooks to call
run_hooks_parallel() instead of run_hooks(). That way hooks which will
never be parallelizable (e.g. commit-msg) won't get burned later by us
trying to be clever. Everyone else who can be parallelized is, in config
order, with no dependency management whatsoever. That leaves the door
open for us to add dependency management however we want later on, but
users can still roll their own with a launcher script today.

I know I rambled a lot - I was trying to convince myself :) For now, I'd
prefer to add more detail to the "future work" section of the doc and
then not touch this problem with a very long pole... ;) Thoughts
welcome.

> 
> > +[[securing-hookdir-hooks]]
> > +=== Securing hookdir hooks
> > +
> > +With the design as written in this doc, it's still possible for a malicious user
> > +to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
> > +zip their repo and send it to another user. It may be necessary to teach Git to
> > +only allow inlined hooks like this if they were configured outside of the local
> > +scope (in other words, only run hookcmds, and only allow hookcmds to be
> > +configured in global or system scope); or another approach, like a list of safe
> > +projects, might be useful. It may also be sufficient (or at least useful) to
> > +teach a `hook.disableAll` config or similar flag to the Git executable.
> 
> I think this part of the doc should note a bit of the context in
> https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/
> 
> I.e. even if we get a 100% secure hook implementation we've done
> practically nothing for overall security, since we'll still run the
> pager, aliases etc. from that local repo.
> 
> This is a great step in the right direction, but it behooves us to note
> that, so some user reading this documentation without context doesn't
> think inspecting untrusted repositories like that is safe just because
> they set the right hook settings in their config (once what's being
> proposed here is implemented).

Yeah, I agree. I'll try to make that clearer in the doc in the next
reroll.

Very sorry again for having missed this - I think the first weeks of
October I was working from my local todo list instead of from the list
of replies in mutt. Urk.

 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-10-22  0:58         ` Emily Shaffer
@ 2020-10-23 19:10           ` Ævar Arnfjörð Bjarmason
  2020-10-29 15:38             ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-10-23 19:10 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git


On Thu, Oct 22 2020, Emily Shaffer wrote:

> On Wed, Oct 07, 2020 at 11:23:10AM +0200, Ævar Arnfjörð Bjarmason wrote:
>> 
>> 
>> On Wed, Sep 09 2020, Emily Shaffer wrote:
>> 
>> First, thanks a lot for working on this. As you may have found I've done
>> some small amount of actual work in this area before, but mostly just
>> blathered about it on the ML.
>> 
>> > Begin a design document for config-based hooks, managed via git-hook.
>> > Focus on an overview of the implementation and motivation for design
>> > decisions. Briefly discuss the alternatives considered before this
>> > point. Also, attempt to redefine terms to fit into a multihook world.
>> > [...]
>> > +[[status-quo]]
>> > +=== Status quo
>> > +
>> > +Today users can implement multihooks themselves by using a "trampoline script"
>> > +as their hook, and pointing that script to a directory or list of other scripts
>> > +they wish to run.
>> 
>> ...or by setting core.hooksPath in their local/global/system
>> config. Granted it doesn't cover the malicious hook injection case
>> you're also trying to solve, but does address e.g. having a git server
>> with a lot of centralized hooks.
>
> Aha, setting core.hooksPath in the global/system config had not occurred
> to me.

It's a useful hack.

>> 
>> The "trampoline script" also isn't needed for the common case you
>> mention, you just symlink the .git/hooks directory (as e.g. GitLab
>> does). People usually use a trampoline script for e.g. using GNU
>> parallel or something to execute N hooks.
>
> Hm, I don't think that's quite true. Symlinking out .git/hooks doesn't
> give me more than one $HOOKDIR/pre-commit - it just gives me a different
> one. So if I wanted to run three different hooks, $HOOKDIR/pre-commit
> would need to do the work of all three, regardless of where $HOOKDIR
> points. That's what I meant when I said "multihooks" in this section.
>
> But I think what you're trying to say is this: the "status quo" section
> doesn't fully cover the status quo. There are more tricks than I
> mentioned, e.g. 'git config --global core.hooksPath
> /home/emily/githook/' to get the same set of hooks to run everywhere.
> This approach still has some drawbacks - for example, it doesn't allow
> me to use language-specific linters if I have repos in various
> languages, without exempting an individual repo from the ~/githook/ by
> 'git config --local core.hooksPath
> /home/emily/my-python-thing/.git/hook'.
>
> It looks like, then, the "status quo" section needs some rework for the
> next iteration.

Re-reading your original patch I think I just misread that. I thought
you were saying a stub script was needed in the .git to point to a
multi-hook script, but I was pointing out that you can just symlink to
the multi-hook script (as e.g. GitLab does), but reading it again & this
I don't thin you meant that at all. Nevermind.

>> 
>> 
>> > +[[hook-directories]]
>> > +=== Hook directories
>> > +
>> > +Other contributors have suggested Git learn about the existence of a directory
>> > +such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
>> 
>> ...which seems like an easy thing to add later by having a "hookdir" in
>> addition to "hookcmd", i.e. just specify a glob there instead of a
>> cmd/path.
>
> Hum, interesting! Something like so:
>
> [hook.pre-commit]
>   command = last-minute-checks
>
> [hookdir.last-minute-checks]
>   dir = /home/emily/last-minute-checks/*
>
> And then the hooks library knows to go and run everything in
> ~/last-minute-checks/. This is easier to keep fresh than:
>
> [hook.pre-commit]
>   command = /home/emily/last-minute-checks/c-linter
>   command = /home/emily/last-minute-checks/check-for-debug-prints
>   command = /home/emily/last-minute-checks/check-for-notes
>   ...
>
> I actually like the idea of this for folks who might have a small number
> of hooks they wrote for themselves. I wonder if it's applicable for
> something like git-secrets, which presumably users would grab with a
> 'git clone' later.
>
> It doesn't seem at odds with the rest of the design - how would you feel
> about me adding it to the "future work" section at the end? Future work,
> rather than "Emily will do this in the next couple of rounds", because:
>  - I think nobody already has their hooks in $HOOKDIR/hook/pre-commit.d
>    without a corresponding trampoline in $HOOKDIR/hook/pre-commit; so
>    they could still call that trampoline, for now
>  - I think it might be prone to some bikeshedding - e.g. should we
>    recurse into ~/last-minute-checks/linters/c/? how far? what if some
>    script requires magic options? etc? But as I'm typing those questions
>    out they sound mostly trivial or ridiculous, so maybe my assessment
>    is wrong here.
>  - It sounds like you might be keen to write it, or at the very least,
>    more keen than me
>  - Practically speaking, I am not sure I have time to do it alongside
>    the rest of the series. Again, my bikeshedding assessment could be
>    wrong, and this extra feature could be totally trivial.
>
>> You already use "hookdir" for something else though, so that's a bit
>> confusing, perhaps s/hookcmd/definehookcmd/ would be less confusing, or
>> perhaps more confusing...
>
> "Hookdir" might be the wrong word to use, too - maybe it's better to
> mirror "hookspath" there. Eitherway, "hookdir" and "hookspath" are
> similar enough that I think it would be confusing, and "hookcmd" is
> already getting some side-eye from me for not being a great choice.
>
> Some thoughts for "a path to a directory in which multiple scripts for a
> single hook live":
>  - hookset
>  - hookbatch (ugh, redundant with MS scripting)
>  - hook.pre-commit.all-of = ~/last-minute-checks/
>  -  "   "  .everything-in = "   "
> ...?
>
> I think I named a couple silly ideas for "hookcmd" in another mail.

To both of the above: Yeah I'm not saying you need to do the work, just
that I think it would be a useful case to bikeshed now since it seems
inevitable that we'll get a "find hooks in this dir by glob" once we
have this facility. So having a config syntax for that which isn't
overly confusing / extensible to that case would be useful, i.e. as the
current syntax uses "dir" already.

>> 
>> > [...]
>> > +[[execution-ordering]]
>> > +=== Execution ordering
>> > +
>> > +We may find that config order is insufficient for some users; for example,
>> > +config order makes it difficult to add a new hook to the system or global config
>> > +which runs at the end of the hook list. A new ordering schema should be:
>> > +
>> > +1) Specified by a `hook.order` config, so that users will not unexpectedly see
>> > +their order change;
>> > +
>> > +2) Either dependency or numerically based.
>> > +
>> > +Dependency-based ordering is prone to classic linked-list problems, like a
>> > +cycles and handling of missing dependencies. But, it paves the way for enabling
>> > +parallelization if some tasks truly depend on others.
>> >
>> > +Numerical ordering makes it tricky for Git to generate suggested ordering
>> > +numbers for each command, but is easy to determine a definitive order.
>> > +
>> > +[[parallelization]]
>> > +=== Parallelization
>> > +
>> > +Users with many hooks might want to run them simultaneously, if the hooks don't
>> > +modify state; if one hook depends on another's output, then users will want to
>> > +specify those dependencies. If we decide to solve this problem, we may want to
>> > +look to modern build systems for inspiration on how to manage dependencies and
>> > +parallel tasks.
>> 
>> If you're taking requests it would make me very happy if we had
>> parallelism in this from day one. It's the kind of thing that's hard to
>> do by default once a feature is shipped since people will implicitly
>> depend on it not being there, i.e. we won't know what we're breaking.
>
> Hm. This might be tricky.
>
> Some hooks are inherently not able to be parallelized - for example,
> hooks which modify a given file, like the commit message draft. In
> general, based on the handful of hooks I've converted locally, it's hard
> to check whether a callsite assumes a hook could have modified state.
> Usually this seems to be done with a call to find_hook() ("was there a
> hook that might have run?") and then reopening the file. Sometimes a
> file is reopened unconditionally. Sometimes the find_hook() call is
> very far away from the run_hook_le() call.
>
> The rest, then, which only read a file and say yes or no, probably don't
> need to have a strict ordering - at least as far as Git is concerned.
> And I think that's what you're worried about:
>
> [hook.theoretical-parallelizable-event]
>   command = check-and-mark-a-file-foo
>   command = check-file-foo-and-do-something-else
>   command = do-something-totally-unrelated
>
> On day 1 of this feature, as written, this is safe. But if we aren't
> careful and we start to parallelize *without* setting up dependency
> ordering, e.g. 'git config --global hook.parallelize', and turn that on
> by default without warning anyone, then the author of this config will
> be unhappy.
>
> But as I read further, you're talking about specifically *not* allowing
> dependency ordering...
>
>> 
>> I think doing it this way is simple, covers most use cases, and solves a
>> lot of the problems you note:
>> 
>> 1. Don't use config order to execute hooks, use glob'd name order
>>    regardless of origin. I.e. a system-level hook is called "001-first"
>>    is executed before a local hook called "999-at-the-end" (or the other
>>    way around, i.e. hook origin doesn't matter).
>
> Can you say a little more about why different ordering schema would
> matter, if we effectively don't care which jobs are in parallel with
> which, as you describe? I'm not quite following.
>
>> 
>> 2. We execute hooks parallel in that glob order, i.e. a pthread for-loop
>>    that starts the 001-first task first, eventually getting to
>>    999-at-the-end N at a time. I.e. the same as:
>> 
>>        parallel --jobs N --halt-on-error soon,fail=1" ::: <hooks-in-glob-order>
>> 
>>    This allows for parallelism but guarantees the very useful case of
>>    having a global log hook being guaranteed to execute.
>
> Ah, I think you're suggesting the glob order specifically to make up for
> --halt-on-error in this case.
>
>> 
>> 3. A hook can define "parallel=no" in its config. We'll then run it
>>    while no other hook is running.
>> 
>> 4. We don't attempt to do dependencies etc, if you need that sort of
>>    complexity you can just make one of the hooks be a hook runner as
>>    users do now for the common "make it parallel" case.
>
> If we aren't attempting any magical ordering, then I don't really see a
> big difference between glob vs. config order - presumably for most users
> the effect would be same, e.g. N = $(nproc * hyperthreading), M = (number of scripts I
> care to run) probably will often result in M < N, so all jobs would run
> simultaneously anyways.
>
>> 
>> It's a relatively small change to the code you have already. I.e. the
>> for_each() in run_hooks() would be called N times for each continuous
>> glob'd parallel/non-parallel segment, and hook_list()'s config parsing
>> would learn to spew those out as a list-of-lists.
>> 
>> This also gives you a rudimentary implementation of the dependency
>> schema you proposed for free. I.e. a definition of (pseudocode):
>> 
>>     hookcmd=000-first
>>     parallel=no
>> 
>>     hookcmd=250-middle-abc
>>     hookcmd=250-middle-xyz
>> 
>>     hookcmd=300-gather
>>     parallel=no
>> 
>>     hookcmd=999-the-end
>> 
>> Would result in the pseudocode execution of;
>> 
>>     segments=[[000-first],
>>               [250-middle-abc, 250-middle-xyz],
>
> Hum. This seems to say "folks who started their hooks with the same
> number agree that their hooks should also run simultaneously" - which
> sounds like an even harder problem than "how do I know my ordering
> number isn't the same as someone else's in another config file". Or else
> I'm misunderstanding your pseudo :)

The prefix number isn't meaningful in that way, i.e. if you have 10
threads and 5 hooks starting with 250-* they won't all be invoked at the
same time.

> Ah, I see later you mention it directly as a dependency schema. I think
> this offers the same set of problems I saw trying to use this as an
> ordering schema, but worse in all the usual ways parallelism provides.
> It is still impossible for someone writing a global or system config to
> know where in the dependency chain more local hooks reside.
>
>>               [300-gather],
>>               [999-the-end]]
>>     for each s in segments:
>>         ok = run_in_parallel(s)
>>         last if !ok # or, depending on "early exit?" config
>> 
>> I.e.:
>> 
>>  * The common case of people adding N hooks won't take sum(N) time.
>> 
>>  * parallel=no hooks aren't run in parallel with other non-parallel
>>    hooks
>> 
>>  * We support a rudimentary dependency schema as a side-effect,
>>    i.e. defining 300-gather as non-parallel allows it to act as the sole
>>    "reduce" step in a map/reduce in a "map" step started with the 250-*
>>    hooks.
>
> As I understand it, the main concerns you have about getting
> parallelization to happen on day 1 are like so:
>
>  - keep users from assuming serial execution
>  - avoid a messy schema change to deal with dependencies
>
> I see the benefit of the former; I don't like the new schema proposed by
> the latter. I do see that not turning it on day 1 would prevent us from
> turning it on by default later, in case users did something silly like
> assume dependencies.
>
> Hrm.
>
> I think we could turn on parallelization day 1 by providing an
> explicitly-parallel API in hook.h (and a similar 'git hook run foo
> --parallel' flag), and being more careful when converting hooks to call
> run_hooks_parallel() instead of run_hooks(). That way hooks which will
> never be parallelizable (e.g. commit-msg) won't get burned later by us
> trying to be clever. Everyone else who can be parallelized is, in config
> order, with no dependency management whatsoever. That leaves the door
> open for us to add dependency management however we want later on, but
> users can still roll their own with a launcher script today.
>
> I know I rambled a lot - I was trying to convince myself :) For now, I'd
> prefer to add more detail to the "future work" section of the doc and
> then not touch this problem with a very long pole... ;) Thoughts
> welcome.

I'm replying to much of the above in general here, particularly since
much of it was in the form of a question you answered yourself later :)

Yes as you point out the reason I'm raising the parallel thing now is
"keep users from assuming serial execution", i.e. any implementation
that isn't like that from day 1 will need more verbose syntax to opt-in
to that.

I think parallel is the sane default, although there's a really strong
case as you point out with the "commit-msg" hook for treating that on a
hook-type basis. E.g. commit-msg (in-place editing of as single file)
being non-parallel by default, but e.g. post-commit, pre-applypatch,
pre-receive and other "should we proceed?" hooks being parallel.

But I'm also raising a general concern with the design of the API /
command around this.

I don't see the need for having a git hook list/edit/add command at
all. We should just keep this simpler and be able to point to "git
config --add/--get-regexp" etc.

It seems the reason to introduce this command API around it is because
you're imagining that git needs to manage hooks whose relative execution
order is important, and to later on once this lands aim to implement a
much more complex dependency management schema.

I just can't imagine a case that needs that where say those 10 hooks
need to execute in exact order 1/2/3/4 where the author of that tight
coupling wouldn't also desire to roll that all into one script, or at
least that it's an obscure enough case that we can just say "do that".

Whereas I do think "run a bunch of independent checks, if all pass
proceed" is *the* common case, e.g. adding a bunch of pre-receive
hooks. If we tell the user we'll treat those as independent programs we
can run them in parallel. The vast majority of users will benefit from
the default faster execution.

The "glob order" case I mentioned is extra complexity on top of that,
yes, but I think that concession is sane for the common case of "yes
parallel, but I want to always run the always-exit-0 log
hook". E.g. I've used this to setup a hook to run push
attempts/successes in a hook framework that runs N pre-receive hooks.

All that being said I'm open to being convinced, I just don't see what
the target user is, and the submitted docs don't really make a case for
it. I.e. there's plenty of "what" not "why would someone want this...".

>> 
>> > +[[securing-hookdir-hooks]]
>> > +=== Securing hookdir hooks
>> > +
>> > +With the design as written in this doc, it's still possible for a malicious user
>> > +to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
>> > +zip their repo and send it to another user. It may be necessary to teach Git to
>> > +only allow inlined hooks like this if they were configured outside of the local
>> > +scope (in other words, only run hookcmds, and only allow hookcmds to be
>> > +configured in global or system scope); or another approach, like a list of safe
>> > +projects, might be useful. It may also be sufficient (or at least useful) to
>> > +teach a `hook.disableAll` config or similar flag to the Git executable.
>> 
>> I think this part of the doc should note a bit of the context in
>> https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/
>> 
>> I.e. even if we get a 100% secure hook implementation we've done
>> practically nothing for overall security, since we'll still run the
>> pager, aliases etc. from that local repo.
>> 
>> This is a great step in the right direction, but it behooves us to note
>> that, so some user reading this documentation without context doesn't
>> think inspecting untrusted repositories like that is safe just because
>> they set the right hook settings in their config (once what's being
>> proposed here is implemented).
>
> Yeah, I agree. I'll try to make that clearer in the doc in the next
> reroll.
>
> Very sorry again for having missed this - I think the first weeks of
> October I was working from my local todo list instead of from the list
> of replies in mutt. Urk.

*nod*

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-10-23 19:10           ` Ævar Arnfjörð Bjarmason
@ 2020-10-29 15:38             ` Emily Shaffer
  2020-10-29 20:04               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-10-29 15:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, h; +Cc: git

On Fri, Oct 23, 2020 at 09:10:24PM +0200, Ævar Arnfjörð Bjarmason wrote:

> >> You already use "hookdir" for something else though, so that's a bit
> >> confusing, perhaps s/hookcmd/definehookcmd/ would be less confusing, or
> >> perhaps more confusing...
> >
> > "Hookdir" might be the wrong word to use, too - maybe it's better to
> > mirror "hookspath" there. Eitherway, "hookdir" and "hookspath" are
> > similar enough that I think it would be confusing, and "hookcmd" is
> > already getting some side-eye from me for not being a great choice.
> >
> > Some thoughts for "a path to a directory in which multiple scripts for a
> > single hook live":
> >  - hookset
> >  - hookbatch (ugh, redundant with MS scripting)
> >  - hook.pre-commit.all-of = ~/last-minute-checks/
> >  -  "   "  .everything-in = "   "
> > ...?
> >
> > I think I named a couple silly ideas for "hookcmd" in another mail.
> 
> To both of the above: Yeah I'm not saying you need to do the work, just
> that I think it would be a useful case to bikeshed now since it seems
> inevitable that we'll get a "find hooks in this dir by glob" once we
> have this facility. So having a config syntax for that which isn't
> overly confusing / extensible to that case would be useful, i.e. as the
> current syntax uses "dir" already.

Yeah. I'm not sure that it needs to happen right away. Because
hook.*.command // hookcommand.*.command gets passed right into
run_command()-with-shell, it's possible for a user who's keen to also
set `hook.*.command = find -type f /some/path | xargs` in the meantime.
And also because it's passed right into run_command()-with-shell, it's
hard to do some smart wildcarding on the .command config and try to
figure out the right syntax. I'd just as soon see something explicit
like the configs I mentioned above, which can be added pretty easily
after the fact. I think what you're mostly saying, though, is "Leave
some words for glob execution!" and that I can appreciate.

> > Hum. This seems to say "folks who started their hooks with the same
> > number agree that their hooks should also run simultaneously" - which
> > sounds like an even harder problem than "how do I know my ordering
> > number isn't the same as someone else's in another config file". Or else
> > I'm misunderstanding your pseudo :)
> 
> The prefix number isn't meaningful in that way, i.e. if you have 10
> threads and 5 hooks starting with 250-* they won't all be invoked at the
> same time.

Ok. I misunderstood, then.

> > I know I rambled a lot - I was trying to convince myself :) For now, I'd
> > prefer to add more detail to the "future work" section of the doc and
> > then not touch this problem with a very long pole... ;) Thoughts
> > welcome.
> 
> I'm replying to much of the above in general here, particularly since
> much of it was in the form of a question you answered yourself later :)
> 
> Yes as you point out the reason I'm raising the parallel thing now is
> "keep users from assuming serial execution", i.e. any implementation
> that isn't like that from day 1 will need more verbose syntax to opt-in
> to that.
> 
> I think parallel is the sane default, although there's a really strong
> case as you point out with the "commit-msg" hook for treating that on a
> hook-type basis. E.g. commit-msg (in-place editing of as single file)
> being non-parallel by default, but e.g. post-commit, pre-applypatch,
> pre-receive and other "should we proceed?" hooks being parallel.

Yeah. I think you've sold me. So what I will do is thus: before I send
the next reroll (as I'm pretty much done, locally, and hope to be ready
for nits next time) I'll take a look in 'git help githooks' and see
which ones expect writes to occur. I think there are more than just
"commit-msg". I'll add a bit to run_hooks() and a corresponding flag to
'git hook run', plus relevant documentation. I'll also plan to add
explicit documentation to 'git help githooks' mentioning parallel vs.
serial execution.

But I will plan on writing it stupidly - user configurable job number
but no dependency checking; and let the user turn off parallel execution
for everyone (hook.jobs=1) or for just one hook
(hook.pre-commit.parallel = false (?)). Like you and Jonathan N say, we
can add more sugar like hookcmd.*.depends later on when we need it.

> 
> But I'm also raising a general concern with the design of the API /
> command around this.
> 
> I don't see the need for having a git hook list/edit/add command at
> all. We should just keep this simpler and be able to point to "git
> config --add/--get-regexp" etc.
> 
> It seems the reason to introduce this command API around it is because
> you're imagining that git needs to manage hooks whose relative execution
> order is important, and to later on once this lands aim to implement a
> much more complex dependency management schema.

No, I don't think that's the reason to have list/edit/add. The reason is
more for discoverability (if I 'git help git' or 'git^TAB', do I see
something handy in the command list that I didn't know about before?)
and user friendliness ("I can't remember the right config options to set
this up every dang time"). And 'list', I think, is handy for giving
users a dry run of what they can expect to see happen (and where to fix
them, since it lists the origin). Yes, a user could put it all together
from invocations of 'git config', but I personally think it's more
useful for Git to tell me what Git is going to do/what Git wants than
for my meat brain to try and guess :)

> 
> I just can't imagine a case that needs that where say those 10 hooks
> need to execute in exact order 1/2/3/4 where the author of that tight
> coupling wouldn't also desire to roll that all into one script, or at
> least that it's an obscure enough case that we can just say "do that".
> 
> Whereas I do think "run a bunch of independent checks, if all pass
> proceed" is *the* common case, e.g. adding a bunch of pre-receive
> hooks. If we tell the user we'll treat those as independent programs we
> can run them in parallel. The vast majority of users will benefit from
> the default faster execution.
> 
> The "glob order" case I mentioned is extra complexity on top of that,
> yes, but I think that concession is sane for the common case of "yes
> parallel, but I want to always run the always-exit-0 log
> hook". E.g. I've used this to setup a hook to run push
> attempts/successes in a hook framework that runs N pre-receive hooks.

Reading this, I think I'm still missing something key about what you
think glob ordering provides. I'm not following why having the log hook
set early requires glob ordering over config ordering (since the config
ordering schema allows reordering via replacement), and I'm not
following why it's required to halt on failure.

> 
> All that being said I'm open to being convinced, I just don't see what
> the target user is, and the submitted docs don't really make a case for
> it. I.e. there's plenty of "what" not "why would someone want this...".

ACK. I'll try and go over the doc again before I reroll.

 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v4 1/9] doc: propose hooks managed by the config
  2020-10-29 15:38             ` Emily Shaffer
@ 2020-10-29 20:04               ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 170+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-10-29 20:04 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: h, git


On Thu, Oct 29 2020, Emily Shaffer wrote:

> On Fri, Oct 23, 2020 at 09:10:24PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> >> You already use "hookdir" for something else though, so that's a bit
>> >> confusing, perhaps s/hookcmd/definehookcmd/ would be less confusing, or
>> >> perhaps more confusing...
>> >
>> > "Hookdir" might be the wrong word to use, too - maybe it's better to
>> > mirror "hookspath" there. Eitherway, "hookdir" and "hookspath" are
>> > similar enough that I think it would be confusing, and "hookcmd" is
>> > already getting some side-eye from me for not being a great choice.
>> >
>> > Some thoughts for "a path to a directory in which multiple scripts for a
>> > single hook live":
>> >  - hookset
>> >  - hookbatch (ugh, redundant with MS scripting)
>> >  - hook.pre-commit.all-of = ~/last-minute-checks/
>> >  -  "   "  .everything-in = "   "
>> > ...?
>> >
>> > I think I named a couple silly ideas for "hookcmd" in another mail.
>> 
>> To both of the above: Yeah I'm not saying you need to do the work, just
>> that I think it would be a useful case to bikeshed now since it seems
>> inevitable that we'll get a "find hooks in this dir by glob" once we
>> have this facility. So having a config syntax for that which isn't
>> overly confusing / extensible to that case would be useful, i.e. as the
>> current syntax uses "dir" already.
>
> Yeah. I'm not sure that it needs to happen right away. Because
> hook.*.command // hookcommand.*.command gets passed right into
> run_command()-with-shell, it's possible for a user who's keen to also
> set `hook.*.command = find -type f /some/path | xargs` in the meantime.
> And also because it's passed right into run_command()-with-shell, it's
> hard to do some smart wildcarding on the .command config and try to
> figure out the right syntax. I'd just as soon see something explicit
> like the configs I mentioned above, which can be added pretty easily
> after the fact. I think what you're mostly saying, though, is "Leave
> some words for glob execution!" and that I can appreciate.

Yeah, or rather, just now in config key naming think about if the key
naming makes sense if it's expanded to support such glob inclusion,
which seems like a desired addition. But I won't belabor that point.

Just one thing to add: We don't really need to come up with a syntax &
semantics for glob inclusion special to this, we'd use the sort of glob
patterns "Conditional includes" use, as documented in  git-config(1).

>> > Hum. This seems to say "folks who started their hooks with the same
>> > number agree that their hooks should also run simultaneously" - which
>> > sounds like an even harder problem than "how do I know my ordering
>> > number isn't the same as someone else's in another config file". Or else
>> > I'm misunderstanding your pseudo :)
>> 
>> The prefix number isn't meaningful in that way, i.e. if you have 10
>> threads and 5 hooks starting with 250-* they won't all be invoked at the
>> same time.
>
> Ok. I misunderstood, then.
>
>> > I know I rambled a lot - I was trying to convince myself :) For now, I'd
>> > prefer to add more detail to the "future work" section of the doc and
>> > then not touch this problem with a very long pole... ;) Thoughts
>> > welcome.
>> 
>> I'm replying to much of the above in general here, particularly since
>> much of it was in the form of a question you answered yourself later :)
>> 
>> Yes as you point out the reason I'm raising the parallel thing now is
>> "keep users from assuming serial execution", i.e. any implementation
>> that isn't like that from day 1 will need more verbose syntax to opt-in
>> to that.
>> 
>> I think parallel is the sane default, although there's a really strong
>> case as you point out with the "commit-msg" hook for treating that on a
>> hook-type basis. E.g. commit-msg (in-place editing of as single file)
>> being non-parallel by default, but e.g. post-commit, pre-applypatch,
>> pre-receive and other "should we proceed?" hooks being parallel.
>
> Yeah. I think you've sold me. So what I will do is thus: before I send
> the next reroll (as I'm pretty much done, locally, and hope to be ready
> for nits next time) I'll take a look in 'git help githooks' and see
> which ones expect writes to occur. I think there are more than just
> "commit-msg". I'll add a bit to run_hooks() and a corresponding flag to
> 'git hook run', plus relevant documentation. I'll also plan to add
> explicit documentation to 'git help githooks' mentioning parallel vs.
> serial execution.

Sounds good.

> But I will plan on writing it stupidly - user configurable job number
> but no dependency checking; and let the user turn off parallel execution
> for everyone (hook.jobs=1) or for just one hook
> (hook.pre-commit.parallel = false (?)). Like you and Jonathan N say, we
> can add more sugar like hookcmd.*.depends later on when we need it.

Yeah, that sounds great. As long as there's parallelism that stuff can
always be tweaked later.

>> 
>> But I'm also raising a general concern with the design of the API /
>> command around this.
>> 
>> I don't see the need for having a git hook list/edit/add command at
>> all. We should just keep this simpler and be able to point to "git
>> config --add/--get-regexp" etc.
>> 
>> It seems the reason to introduce this command API around it is because
>> you're imagining that git needs to manage hooks whose relative execution
>> order is important, and to later on once this lands aim to implement a
>> much more complex dependency management schema.
>
> No, I don't think that's the reason to have list/edit/add. The reason is
> more for discoverability (if I 'git help git' or 'git^TAB', do I see
> something handy in the command list that I didn't know about before?)
> and user friendliness ("I can't remember the right config options to set
> this up every dang time"). And 'list', I think, is handy for giving
> users a dry run of what they can expect to see happen (and where to fix
> them, since it lists the origin). Yes, a user could put it all together
> from invocations of 'git config', but I personally think it's more
> useful for Git to tell me what Git is going to do/what Git wants than
> for my meat brain to try and guess :)

Okey, that makes sense & I've got nothing against that, just clarifying
since it *looked* like it was the first step in some future addition of
complexity around this.

It would be nice if the docs for the new command were modified to state
that clearly, even to the point of saying "this is really just sugar for
this similar git-config invocation".

>> 
>> I just can't imagine a case that needs that where say those 10 hooks
>> need to execute in exact order 1/2/3/4 where the author of that tight
>> coupling wouldn't also desire to roll that all into one script, or at
>> least that it's an obscure enough case that we can just say "do that".
>> 
>> Whereas I do think "run a bunch of independent checks, if all pass
>> proceed" is *the* common case, e.g. adding a bunch of pre-receive
>> hooks. If we tell the user we'll treat those as independent programs we
>> can run them in parallel. The vast majority of users will benefit from
>> the default faster execution.
>> 
>> The "glob order" case I mentioned is extra complexity on top of that,
>> yes, but I think that concession is sane for the common case of "yes
>> parallel, but I want to always run the always-exit-0 log
>> hook". E.g. I've used this to setup a hook to run push
>> attempts/successes in a hook framework that runs N pre-receive hooks.
>
> Reading this, I think I'm still missing something key about what you
> think glob ordering provides. 

For context, I feel strongly that we should do parallel by default for
implementing something like this, it's great that per the above
discussion you're open to that.

This "glob ordering" is an entirely separate idea I'm not strongly
advocating, there's pros & cons of doing that v.s. config ordering.

 * Con: less obvious than config order, you write hooks "a c b" in the
   config and we execute in "a b c" order.

 * Pro: Sidesteps the issues you noted in "Execution ordering" in the
   docs you're adding, i.e. now it'll be impossible to execute a
   repo-local hook before a system-wide one, you can override that with
   having a local one called "000-something".

   I.e. now we'd read the config in the normal config order, and thus if
   there's a system hook there's no way to define a local hook to run
   first, until we get some sort of override for that.

> I'm not following why having the log hook set early requires glob
> ordering over config ordering (since the config ordering schema allows
> reordering via replacement)
> [...]
>  and I'm not following why it's required to halt on failure.

I realize I didn't elaborate on this, there's some past discussion[1][2]
about this. 

I.e. when running N hooks sometimes you'd want to run them all (e.g. to
send notifications), but for others such as pre-receive.d guard checks
you don't have to run all N, if one check (say one checks commit format
validity, another code syntax) fails you'd like to abort early.

So halting on failure is just saving CPU, you might have 10 hooks that
each take 1 second, no point in making the user wait on all 10 checks
for 10 seconds if a failure of any fails the push.

But OTOH you have other use-cases where users want to run them all
(talked about in the [1][2] discussion above), so it's been anticipated
as something we'd grow config for with multi-hook support.

The glob ordering allows common cases for things that aren't possible
with config-order with such early abort.

E.g. consider a server with some common system-wide pre-receive.d hook
(e.g. author e-mail envelope check), and a SOX/PCI controlled repository
where some compliance thing says all push attempts must be logged.

You could then do:

    /etc/git/hooks/pre-receive.d/email-check
    /path/to/repo/hooks/pre-receive.d/000-log-push-attempt-to-db
    /path/to/repo/hooks/pre-receive.d/some-other-check

And we'd always run the 000-* hook first, whereas in the current schema
you can't do that without editing the system-wide config.

>> 
>> All that being said I'm open to being convinced, I just don't see what
>> the target user is, and the submitted docs don't really make a case for
>> it. I.e. there's plenty of "what" not "why would someone want this...".
>
> ACK. I'll try and go over the doc again before I reroll.
>
>  - Emily

1. https://lore.kernel.org/git/87wojjsv9p.fsf@evledraar.gmail.com/
2. https://public-inbox.org/git/CACBZZX6j6q2DUN_Z-Pnent1u714dVNPFBrL_PiEQyLmCzLUVxg@mail.gmail.com/

^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH v6 00/17] propose config-based hooks (part I)
  2020-10-14 23:24     ` [PATCH v5 0/8] propose config-based hooks (part I) Emily Shaffer
                         ` (7 preceding siblings ...)
  2020-10-14 23:24       ` [PATCH v5 8/8] hook: replace find_hook() with hook_exists() Emily Shaffer
@ 2020-12-05  1:45       ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 01/17] doc: propose hooks managed by the config Emily Shaffer
                           ` (18 more replies)
  8 siblings, 19 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Hi folks, and thanks for the patience - I ran into many, many last-mile
challenges.

I haven't addressed many comments on the design doc yet - I was keen to get the
"functionally complete" implementation and conversion to the list.

Next on my plate:
 - Update the design doc to make sense with what's in the implementation.
 - A blog post! How to set up new hooks, why they're neat, etc.
 - We seem to have some Googlers interested in trying it out internally, so
   I'm hoping we'll gather and collate feedback from that soon too.
 - And of course addressing comments on this series.

Thanks!
 - Emily

Emily Shaffer (17):
  doc: propose hooks managed by the config
  hook: scaffolding for git-hook subcommand
  hook: add list command
  hook: include hookdir hook in list
  hook: respect hook.runHookDir
  hook: implement hookcmd.<name>.skip
  parse-options: parse into strvec
  hook: add 'run' subcommand
  hook: replace find_hook() with hook_exists()
  hook: support passing stdin to hooks
  run-command: allow stdin for run_processes_parallel
  hook: allow parallel hook execution
  hook: allow specifying working directory for hooks
  run-command: add stdin callback for parallelization
  hook: provide stdin by string_list or callback
  run-command: allow capturing of collated output
  hooks: allow callers to capture output

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/config/hook.txt                 |  19 +
 Documentation/git-hook.txt                    | 118 +++++
 Documentation/technical/api-parse-options.txt |   5 +
 .../technical/config-based-hooks.txt          | 367 +++++++++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/bugreport.c                           |   4 +-
 builtin/fetch.c                               |   1 +
 builtin/hook.c                                | 174 ++++++++
 builtin/submodule--helper.c                   |   2 +-
 git.c                                         |   1 +
 hook.c                                        | 417 ++++++++++++++++++
 hook.h                                        | 154 +++++++
 parse-options-cb.c                            |  16 +
 parse-options.h                               |   4 +
 run-command.c                                 |  85 +++-
 run-command.h                                 |  31 ++
 submodule.c                                   |   1 +
 t/helper/test-run-command.c                   |  46 +-
 t/t0061-run-command.sh                        |  37 ++
 t/t1360-config-based-hooks.sh                 | 256 +++++++++++
 23 files changed, 1728 insertions(+), 15 deletions(-)
 create mode 100644 Documentation/config/hook.txt
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 Documentation/technical/config-based-hooks.txt
 create mode 100644 builtin/hook.c
 create mode 100644 hook.c
 create mode 100644 hook.h
 create mode 100755 t/t1360-config-based-hooks.sh

-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH 01/17] doc: propose hooks managed by the config
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
                           ` (17 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, addressed comments from Jonathan Tan about wording. However, I have
    not addressed AEvar's comments or done a full re-review of this document.
    I wanted to get the rest of the series out for initial review first.
    
     - Emily
    
    Since v4, addressed comments from Jonathan Tan about wording.

 Documentation/Makefile                        |   1 +
 .../technical/config-based-hooks.txt          | 367 ++++++++++++++++++
 2 files changed, 368 insertions(+)
 create mode 100644 Documentation/technical/config-based-hooks.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 80d1908a44..58d6b3acbe 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -81,6 +81,7 @@ SP_ARTICLES += $(API_DOCS)
 TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
+TECH_DOCS += technical/config-based-hooks
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
new file mode 100644
index 0000000000..dac391f505
--- /dev/null
+++ b/Documentation/technical/config-based-hooks.txt
@@ -0,0 +1,367 @@
+Configuration-based hook management
+===================================
+:sectanchors:
+
+[[motivation]]
+== Motivation
+
+Replace the .git/hook/hookname path as the only source of hooks to execute;
+allow users to define hooks using config files, in a way which is friendly to
+users with multiple repos which have similar needs.
+
+Redefine "hook" as an event rather than a single script, allowing users to
+perform unrelated actions on a single event.
+
+Take a step closer to safety when copying zipped Git repositories from untrusted
+users by making it more apparent to users which scripts will be run during
+normal Git operations.
+
+Make it easier for users to discover Git's hook feature and automate their
+workflows.
+
+[[user-interfaces]]
+== User interfaces
+
+[[config-schema]]
+=== Config schema
+
+Hooks can be introduced by editing the configuration manually. There are two new
+sections added, `hook` and `hookcmd`.
+
+[[config-schema-hook]]
+==== `hook`
+
+Primarily contains subsections for each hook event. The order of variables in
+these subsections defines the hook command execution order; hook commands can be
+specified by setting the value directly to the command if no additional
+configuration is needed, or by setting the value as the name of a `hookcmd`. If
+Git does not find a `hookcmd` whose subsection matches the value of the given
+command string, Git will try to execute the string directly. Hooks are executed
+by passing the resolved command string to the shell. In the future, hook event
+subsections could also contain per-hook-event settings; see
+<<per-hook-event-settings,the section in Future Work>> for more details.
+
+Also contains top-level hook execution settings, for example, `hook.runHookDir`
+or `hook.disableAll`. (These settings are described more in
+<<library,Library>>.)
+
+----
+[hook "pre-commit"]
+  command = perl-linter
+  command = /usr/bin/git-secrets --pre-commit
+
+[hook "pre-applypatch"]
+  command = perl-linter
+  # for illustration purposes; error behavior isn't planned yet
+  error = ignore
+
+[hook]
+  runHookDir = interactive
+----
+
+[[config-schema-hookcmd]]
+==== `hookcmd`
+
+Defines a hook command and its attributes, which will be used when a hook event
+occurs. Unqualified attributes are assumed to apply to this hook during all hook
+events, but event-specific attributes can also be supplied. The example runs
+`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
+include this config, the hook command will be skipped for all events to which
+it's normally subscribed _except_ `pre-commit`.
+
+----
+[hookcmd "perl-linter"]
+  command = /usr/bin/lint-it --language=perl
+  skip = true
+  # for illustration purposes; below hasn't been defined yet
+  pre-commit-skip = false
+----
+
+[[command-line-api]]
+=== Command-line API
+
+Users should be able to view, reorder, and create hook commands via the command
+line. External tools should be able to view a list of hooks in the correct order
+to run.
+
+*`git hook list <hook-event>`*
+
+*`git hook list (--system|--global|--local|--worktree)`*
+
+*`git hook edit <hook-event>`*
+
+*`git hook add <hook-command> <hook-event> <options...>`*
+
+[[hook-editor]]
+=== Hook editor
+
+The tool which is presented by `git hook edit <hook-command>`. Ideally, this
+tool should be easier to use than manually editing the config, and then produce
+a concise config afterwards. It may take a form similar to `git rebase
+--interactive`.
+
+[[implementation]]
+== Implementation
+
+[[library]]
+=== Library
+
+`hook.c` and `hook.h` are responsible for interacting with the config files. In
+the case when the code generating a hook event doesn't have special concerns
+about how to run the hooks, the hook library will provide a basic API to call
+all hooks in config order with an `strvec` provided by the code which
+generates the hook event:
+
+*`int run_hooks(const char *hookname, struct strvec *args)`*
+
+This call includes the hook command provided by `run-command.h:find_hook()`;
+eventually, this legacy hook will be gated by a config `hook.runHookDir`. The
+config is checked against a number of cases:
+
+- "no": the legacy hook will not be run
+- "interactive": Git will prompt the user before running the legacy hook
+- "warn": Git will print a warning to stderr before running the legacy hook
+- "yes" (default): Git will silently run the legacy hook
+
+In case this list is expanded in the future, if a value for `hook.runHookDir` is
+given which Git does not recognize, Git should discard that config entry. For
+example, if "warn" was specified at system level and "junk" was specified at
+global level, Git would resolve the value to "warn"; if the only time the config
+was set was to "junk", Git would use the default value of "yes".
+
+If the caller wants to do something more complicated, the hook library can also
+provide a callback API:
+
+*`int for_each_hookcmd(const char *hookname, hookcmd_function *cb)`*
+
+Finally, to facilitate the builtin, the library will also provide the following
+APIs to interact with the config:
+
+----
+int set_hook_commands(const char *hookname, struct string_list *commands,
+	enum config_scope scope);
+int set_hookcmd(const char *hookcmd, struct hookcmd options);
+
+int list_hook_commands(const char *hookname, struct string_list *commands);
+int list_hooks_in_scope(enum config_scope scope, struct string_list *commands);
+----
+
+`struct hookcmd` is expected to grow in size over time as more functionality is
+added to hooks; so that other parts of the code don't need to understand the
+config schema, `struct hookcmd` should contain logical values instead of string
+pairs.
+
+----
+struct hookcmd {
+  const char *name;
+  const char *command;
+
+  /* for illustration only; not planned at present */
+  int parallelizable;
+  const char *hookcmd_before;
+  const char *hookcmd_after;
+  enum recovery_action on_fail;
+}
+----
+
+[[builtin]]
+=== Builtin
+
+`builtin/hook.c` is responsible for providing the frontend. It's responsible for
+formatting user-provided data and then calling the library API to set the
+configs as appropriate. The builtin frontend is not responsible for calling the
+config directly, so that other areas of Git can rely on the hook library to
+understand the most recent config schema for hooks.
+
+[[migration]]
+=== Migration path
+
+[[stage-0]]
+==== Stage 0
+
+Hooks are called by running `run-command.h:find_hook()` with the hookname and
+executing the result. The hook library and builtin do not exist. Hooks only
+exist as specially named scripts within `.git/hooks/`.
+
+[[stage-1]]
+==== Stage 1
+
+`git hook list --porcelain <hook-event>` is implemented. Users can replace their
+`.git/hooks/<hook-event>` scripts with a trampoline based on `git hook list`'s
+output. Modifier commands like `git hook add` and `git hook edit` can be
+implemented around this time as well.
+
+[[stage-2]]
+==== Stage 2
+
+`hook.h:run_hooks()` is taught to include `run-command.h:find_hook()` at the
+end; calls to `find_hook()` are replaced with calls to `run_hooks()`. Users can
+opt-in to config-based hooks simply by creating some in their config; otherwise
+users should remain unaffected by the change.
+
+[[stage-3]]
+==== Stage 3
+
+The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
+`hook.runHookDir`. Users can opt into managing their hooks completely via the
+config this way.
+
+[[stage-4]]
+==== Stage 4
+
+`.git/hooks` is removed from the template and the hook directory is considered
+deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
+not changed, and `find_hook()` is not removed.
+
+[[caveats]]
+== Caveats
+
+[[security]]
+=== Security and repo config
+
+Part of the motivation behind this refactor is to mitigate hooks as an attack
+vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
+however, as the design stands, users can still provide hooks in the repo-level
+config, which is included when a repo is zipped and sent elsewhere.  The
+security of the repo-level config is still under discussion; this design
+generally assumes the repo-level config is secure, which is not true yet. The
+goal is to avoid an overcomplicated design to work around a problem which has
+ceased to exist.
+
+[[ease-of-use]]
+=== Ease of use
+
+The config schema is nontrivial; that's why it's important for the `git hook`
+modifier commands to be usable. Contributors with UX expertise are encouraged to
+share their suggestions.
+
+[[alternatives]]
+== Alternative approaches
+
+A previous summary of alternatives exists in the
+archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
+
+[[status-quo]]
+=== Status quo
+
+Today users can implement multihooks themselves by using a "trampoline script"
+as their hook, and pointing that script to a directory or list of other scripts
+they wish to run.
+
+[[hook-directories]]
+=== Hook directories
+
+Other contributors have suggested Git learn about the existence of a directory
+such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
+
+[[comparison]]
+=== Comparison table
+
+.Comparison of alternatives
+|===
+|Feature |Config-based hooks |Hook directories |Status quo
+
+|Supports multiple hooks
+|Natively
+|Natively
+|With user effort
+
+|Safer for zipped repos
+|A little
+|No
+|No
+
+|Previous hooks just work
+|If configured
+|Yes
+|Yes
+
+|Can install one hook to many repos
+|Yes
+|No
+|No
+
+|Discoverability
+|Better (in `git help git`)
+|Same as before
+|Same as before
+
+|Hard to run unexpected hook
+|If configured
+|No
+|No
+|===
+
+[[future-work]]
+== Future work
+
+[[execution-ordering]]
+=== Execution ordering
+
+We may find that config order is insufficient for some users; for example,
+config order makes it difficult to add a new hook to the system or global config
+which runs at the end of the hook list. A new ordering schema should be:
+
+1) Specified by a `hook.order` config, so that users will not unexpectedly see
+their order change;
+
+2) Either dependency or numerically based.
+
+Dependency-based ordering is prone to classic linked-list problems, like a
+cycles and handling of missing dependencies. But, it paves the way for enabling
+parallelization if some tasks truly depend on others.
+
+Numerical ordering makes it tricky for Git to generate suggested ordering
+numbers for each command, but is easy to determine a definitive order.
+
+[[parallelization]]
+=== Parallelization
+
+Users with many hooks might want to run them simultaneously, if the hooks don't
+modify state; if one hook depends on another's output, then users will want to
+specify those dependencies. If we decide to solve this problem, we may want to
+look to modern build systems for inspiration on how to manage dependencies and
+parallel tasks.
+
+[[securing-hookdir-hooks]]
+=== Securing hookdir hooks
+
+With the design as written in this doc, it's still possible for a malicious user
+to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
+zip their repo and send it to another user. It may be necessary to teach Git to
+only allow inlined hooks like this if they were configured outside of the local
+scope (in other words, only run hookcmds, and only allow hookcmds to be
+configured in global or system scope); or another approach, like a list of safe
+projects, might be useful. It may also be sufficient (or at least useful) to
+teach a `hook.disableAll` config or similar flag to the Git executable.
+
+[[submodule-inheritance]]
+=== Submodule inheritance
+
+It's possible some submodules may want to run the identical set of hooks that
+their superrepo runs. While a globally-configured hook set is helpful, it's not
+a great solution for users who have multiple repos-with-submodules under the
+same user. It would be useful for submodules to learn how to run hooks from
+their superrepo's config, or inherit that hook setting.
+
+[[per-hook-event-settings]]
+=== Per-hook-event settings
+
+It might be desirable to keep settings specifically for some hook events, but
+not for others - for example, a user may wish to disable hookdir hooks for all
+events but pre-commit, which they haven't had time to convert yet; or, a user
+may wish for execution order settings to differ based on hook event. In that
+case, it would be useful to set something like `hook.pre-commit.executionOrder`
+which would not apply to the 'prepare-commit-msg' hook, for example.
+
+[[glossary]]
+== Glossary
+
+*hook event*
+
+A point during Git's execution where user scripts may be run, for example,
+_prepare-commit-msg_ or _pre-push_.
+
+*hook command*
+
+A user script or executable which will be run on one or more hook events.
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 02/17] hook: scaffolding for git-hook subcommand
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
  2020-12-05  1:45         ` [PATCH 01/17] doc: propose hooks managed by the config Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 03/17] hook: add list command Emily Shaffer
                           ` (16 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, mainly changed to RUN_SETUP_GENTLY so that 'git hook list' can
    be executed outside of a repo.

 .gitignore                    |  1 +
 Documentation/git-hook.txt    | 20 ++++++++++++++++++++
 Makefile                      |  1 +
 builtin.h                     |  1 +
 builtin/hook.c                | 21 +++++++++++++++++++++
 git.c                         |  1 +
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 7 files changed, 56 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 builtin/hook.c
 create mode 100755 t/t1360-config-based-hooks.sh

diff --git a/.gitignore b/.gitignore
index f22b7a4cf1..094f58a175 100644
--- a/.gitignore
+++ b/.gitignore
@@ -76,6 +76,7 @@
 /git-grep
 /git-hash-object
 /git-help
+/git-hook
 /git-http-backend
 /git-http-fetch
 /git-http-push
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
new file mode 100644
index 0000000000..9eeab0009d
--- /dev/null
+++ b/Documentation/git-hook.txt
@@ -0,0 +1,20 @@
+git-hook(1)
+===========
+
+NAME
+----
+git-hook - Manage configured hooks
+
+SYNOPSIS
+--------
+[verse]
+'git hook'
+
+DESCRIPTION
+-----------
+A placeholder command. Later, you will be able to list, add, and modify hooks
+with this command.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 45bce31016..6ef9c0ee4e 100644
--- a/Makefile
+++ b/Makefile
@@ -1100,6 +1100,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
+BUILTIN_OBJS += builtin/hook.o
 BUILTIN_OBJS += builtin/index-pack.o
 BUILTIN_OBJS += builtin/init-db.o
 BUILTIN_OBJS += builtin/interpret-trailers.o
diff --git a/builtin.h b/builtin.h
index b6ce981b73..8df1d36a7a 100644
--- a/builtin.h
+++ b/builtin.h
@@ -163,6 +163,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
 int cmd_hash_object(int argc, const char **argv, const char *prefix);
 int cmd_help(int argc, const char **argv, const char *prefix);
+int cmd_hook(int argc, const char **argv, const char *prefix);
 int cmd_index_pack(int argc, const char **argv, const char *prefix);
 int cmd_init_db(int argc, const char **argv, const char *prefix);
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);
diff --git a/builtin/hook.c b/builtin/hook.c
new file mode 100644
index 0000000000..b2bbc84d4d
--- /dev/null
+++ b/builtin/hook.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+#include "builtin.h"
+#include "parse-options.h"
+
+static const char * const builtin_hook_usage[] = {
+	N_("git hook"),
+	NULL
+};
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index 4b7bd77b80..8e92b5d3f6 100644
--- a/git.c
+++ b/git.c
@@ -525,6 +525,7 @@ static struct cmd_struct commands[] = {
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
+	{ "hook", cmd_hook, RUN_SETUP_GENTLY },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "init", cmd_init_db },
 	{ "init-db", cmd_init_db },
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
new file mode 100755
index 0000000000..34b0df5216
--- /dev/null
+++ b/t/t1360-config-based-hooks.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+test_description='config-managed multihooks, including git-hook command'
+
+. ./test-lib.sh
+
+test_expect_success 'git hook command does not crash' '
+	git hook
+'
+
+test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 03/17] hook: add list command
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
  2020-12-05  1:45         ` [PATCH 01/17] doc: propose hooks managed by the config Emily Shaffer
  2020-12-05  1:45         ` [PATCH 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 04/17] hook: include hookdir hook in list Emily Shaffer
                           ` (15 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.

Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

For example:

  $ git config --list | grep ^hook
  hook.pre-commit.command=baz
  hook.pre-commit.command=~/bar.sh
  hookcmd.baz.command=~/baz/from/hookcmd.sh

  $ git hook list pre-commit
  global: ~/baz/from/hookcmd.sh
  local: ~/bar.sh

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, updated the sample in the commit message to reflect reality better.
    
    Since v4, more work on the documentation. Also a slight change to the
    output format (space instead of tab).

 Documentation/config/hook.txt |   9 +++
 Documentation/git-hook.txt    |  59 ++++++++++++++++-
 Makefile                      |   1 +
 builtin/hook.c                |  56 +++++++++++++++--
 hook.c                        | 115 ++++++++++++++++++++++++++++++++++
 hook.h                        |  26 ++++++++
 t/t1360-config-based-hooks.sh |  81 +++++++++++++++++++++++-
 7 files changed, 338 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/config/hook.txt
 create mode 100644 hook.c
 create mode 100644 hook.h

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
new file mode 100644
index 0000000000..71449ecbc7
--- /dev/null
+++ b/Documentation/config/hook.txt
@@ -0,0 +1,9 @@
+hook.<command>.command::
+	A command to execute during the <command> hook event. This can be an
+	executable on your device, a oneliner for your shell, or the name of a
+	hookcmd. See linkgit:git-hook[1].
+
+hookcmd.<name>.command::
+	A command to execute during a hook for which <name> has been specified
+	as a command. This can be an executable on your device or a oneliner for
+	your shell. See linkgit:git-hook[1].
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 9eeab0009d..f19875ed68 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,12 +8,65 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook'
+'git hook' list <hook-name>
 
 DESCRIPTION
 -----------
-A placeholder command. Later, you will be able to list, add, and modify hooks
-with this command.
+You can list configured hooks with this command. Later, you will be able to run,
+add, and modify hooks with this command.
+
+This command parses the default configuration files for sections `hook` and
+`hookcmd`. `hook` is used to describe the commands which will be run during a
+particular hook event; commands are run in the order Git encounters them during
+the configuration parse (see linkgit:git-config[1]). `hookcmd` is used to
+describe attributes of a specific command. If additional attributes don't need
+to be specified, a command to run can be specified directly in the `hook`
+section; if a `hookcmd` by that name isn't found, Git will attempt to run the
+provided value directly. For example:
+
+Global config
+----
+  [hook "post-commit"]
+    command = "linter"
+    command = "~/typocheck.sh"
+
+  [hookcmd "linter"]
+    command = "/bin/linter --c"
+----
+
+Local config
+----
+  [hook "prepare-commit-msg"]
+    command = "linter"
+  [hook "post-commit"]
+    command = "python ~/run-test-suite.py"
+----
+
+With these configs, you'd then see:
+
+----
+$ git hook list "post-commit"
+global: /bin/linter --c
+global: ~/typocheck.sh
+local: python ~/run-test-suite.py
+
+$ git hook list "prepare-commit-msg"
+local: /bin/linter --c
+----
+
+COMMANDS
+--------
+
+list `<hook-name>`::
+
+List the hooks which have been configured for `<hook-name>`. Hooks appear
+in the order they should be run, and print the config scope where the relevant
+`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
+This output is human-readable and the format is subject to change over time.
+
+CONFIGURATION
+-------------
+include::config/hook.txt[]
 
 GIT
 ---
diff --git a/Makefile b/Makefile
index 6ef9c0ee4e..4bf158c4f8 100644
--- a/Makefile
+++ b/Makefile
@@ -903,6 +903,7 @@ LIB_OBJS += grep.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
 LIB_OBJS += kwset.o
diff --git a/builtin/hook.c b/builtin/hook.c
index b2bbc84d4d..4d36de52f8 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -1,21 +1,69 @@
 #include "cache.h"
 
 #include "builtin.h"
+#include "config.h"
+#include "hook.h"
 #include "parse-options.h"
+#include "strbuf.h"
 
 static const char * const builtin_hook_usage[] = {
-	N_("git hook"),
+	N_("git hook list <hookname>"),
 	NULL
 };
 
-int cmd_hook(int argc, const char **argv, const char *prefix)
+static int list(int argc, const char **argv, const char *prefix)
 {
-	struct option builtin_hook_options[] = {
+	struct list_head *head, *pos;
+	struct hook *item;
+	struct strbuf hookname = STRBUF_INIT;
+
+	struct option list_options[] = {
 		OPT_END(),
 	};
 
-	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+	argc = parse_options(argc, argv, prefix, list_options,
 			     builtin_hook_usage, 0);
 
+	if (argc < 1) {
+		usage_msg_opt(_("You must specify a hook event name to list."),
+			      builtin_hook_usage, list_options);
+	}
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	head = hook_list(&hookname);
+
+	if (list_empty(head)) {
+		printf(_("no commands configured for hook '%s'\n"),
+		       hookname.buf);
+		strbuf_release(&hookname);
+		return 0;
+	}
+
+	list_for_each(pos, head) {
+		item = list_entry(pos, struct hook, list);
+		if (item)
+			printf("%s: %s\n",
+			       config_scope_name(item->origin),
+			       item->command.buf);
+	}
+
+	clear_hook_list(head);
+	strbuf_release(&hookname);
+
 	return 0;
 }
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+	if (argc < 2)
+		usage_with_options(builtin_hook_usage, builtin_hook_options);
+
+	if (!strcmp(argv[1], "list"))
+		return list(argc - 1, argv + 1, prefix);
+
+	usage_with_options(builtin_hook_usage, builtin_hook_options);
+}
diff --git a/hook.c b/hook.c
new file mode 100644
index 0000000000..937dc768c8
--- /dev/null
+++ b/hook.c
@@ -0,0 +1,115 @@
+#include "cache.h"
+
+#include "hook.h"
+#include "config.h"
+
+void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		strbuf_release(&ptr->command);
+		free(ptr);
+	}
+}
+
+static void append_or_move_hook(struct list_head *head, const char *command)
+{
+	struct list_head *pos = NULL, *tmp = NULL;
+	struct hook *to_add = NULL;
+
+	/*
+	 * remove the prior entry with this command; we'll replace it at the
+	 * end.
+	 */
+	list_for_each_safe(pos, tmp, head) {
+		struct hook *it = list_entry(pos, struct hook, list);
+		if (!strcmp(it->command.buf, command)) {
+		    list_del(pos);
+		    /* we'll simply move the hook to the end */
+		    to_add = it;
+		}
+	}
+
+	if (!to_add) {
+		/* adding a new hook, not moving an old one */
+		to_add = xmalloc(sizeof(struct hook));
+		strbuf_init(&to_add->command, 0);
+		strbuf_addstr(&to_add->command, command);
+	}
+
+	/* re-set the scope so we show where an override was specified */
+	to_add->origin = current_config_scope();
+
+	list_add_tail(&to_add->list, pos);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(struct list_head *head)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, head)
+		remove_hook(pos);
+}
+
+struct hook_config_cb
+{
+	struct strbuf *hookname;
+	struct list_head *list;
+};
+
+static int hook_config_lookup(const char *key, const char *value, void *cb_data)
+{
+	struct hook_config_cb *data = cb_data;
+	const char *hook_key = data->hookname->buf;
+	struct list_head *head = data->list;
+
+	if (!strcmp(key, hook_key)) {
+		const char *command = value;
+		struct strbuf hookcmd_name = STRBUF_INIT;
+
+		/* Check if a hookcmd with that name exists. */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
+		git_config_get_value(hookcmd_name.buf, &command);
+
+		if (!command) {
+			strbuf_release(&hookcmd_name);
+			BUG("git_config_get_value overwrote a string it shouldn't have");
+		}
+
+		/*
+		 * TODO: implement an option-getting callback, e.g.
+		 *   get configs by pattern hookcmd.$value.*
+		 *   for each key+value, do_callback(key, value, cb_data)
+		 */
+
+		append_or_move_hook(head, command);
+
+		strbuf_release(&hookcmd_name);
+	}
+
+	return 0;
+}
+
+struct list_head* hook_list(const struct strbuf* hookname)
+{
+	struct strbuf hook_key = STRBUF_INIT;
+	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
+	struct hook_config_cb cb_data = { &hook_key, hook_head };
+
+	INIT_LIST_HEAD(hook_head);
+
+	if (!hookname)
+		return NULL;
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
+
+	git_config(hook_config_lookup, (void*)&cb_data);
+
+	strbuf_release(&hook_key);
+	return hook_head;
+}
diff --git a/hook.h b/hook.h
new file mode 100644
index 0000000000..8ffc4f14b6
--- /dev/null
+++ b/hook.h
@@ -0,0 +1,26 @@
+#include "config.h"
+#include "list.h"
+#include "strbuf.h"
+
+struct hook
+{
+	struct list_head list;
+	/*
+	 * Config file which holds the hook.*.command definition.
+	 * (This has nothing to do with the hookcmd.<name>.* configs.)
+	 */
+	enum config_scope origin;
+	/* The literal command to run. */
+	struct strbuf command;
+};
+
+/*
+ * Provides a linked list of 'struct hook' detailing commands which should run
+ * in response to the 'hookname' event, in execution order.
+ */
+struct list_head* hook_list(const struct strbuf *hookname);
+
+/* Free memory associated with a 'struct hook' */
+void free_hook(struct hook *ptr);
+/* Empties the list at 'head', calling 'free_hook()' on each entry */
+void clear_hook_list(struct list_head *head);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 34b0df5216..6e4a3e763f 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -4,8 +4,85 @@ test_description='config-managed multihooks, including git-hook command'
 
 . ./test-lib.sh
 
-test_expect_success 'git hook command does not crash' '
-	git hook
+ROOT=
+if test_have_prereq MINGW
+then
+	# In Git for Windows, Unix-like paths work only in shell scripts;
+	# `git.exe`, however, will prefix them with the pseudo root directory
+	# (of the Unix shell). Let's accommodate for that.
+	ROOT="$(cd / && pwd)"
+fi
+
+setup_hooks () {
+	test_config hook.pre-commit.command "/path/ghi" --add
+	test_config_global hook.pre-commit.command "/path/def" --add
+}
+
+setup_hookcmd () {
+	test_config hook.pre-commit.command "abc" --add
+	test_config_global hookcmd.abc.command "/path/abc" --add
+}
+
+test_expect_success 'git hook rejects commands without a mode' '
+	test_must_fail git hook pre-commit
+'
+
+
+test_expect_success 'git hook rejects commands without a hookname' '
+	test_must_fail git hook list
+'
+
+test_expect_success 'git hook runs outside of a repo' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	EOF
+
+	nongit git config --list --global &&
+
+	nongit git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list orders by config order' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	local: $ROOT/path/ghi
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list dereferences a hookcmd' '
+	setup_hooks &&
+	setup_hookcmd &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	local: $ROOT/path/ghi
+	local: $ROOT/path/abc
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list reorders on duplicate commands' '
+	setup_hooks &&
+
+	test_config hook.pre-commit.command "/path/def" --add &&
+
+	cat >expected <<-EOF &&
+	local: $ROOT/path/ghi
+	local: $ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 04/17] hook: include hookdir hook in list
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (2 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 03/17] hook: add list command Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 05/17] hook: respect hook.runHookDir Emily Shaffer
                           ` (14 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Historically, hooks are declared by placing an executable into
$GIT_DIR/hooks/$HOOKNAME (or $HOOKDIR/$HOOKNAME). Although hooks taken
from the config are more featureful than hooks placed in the $HOOKDIR,
those hooks should not stop working for users who already have them.

Legacy hooks should be run directly, not in shell. We know that they are
a path to an executable, not a oneliner script - and running them
directly takes care of path quoting concerns for us for free.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Newly split into its own commit since v4, and taking place much sooner.
    
    An unfortunate side effect of adding this support *before* the
    hook.runHookDir support is that the labels on the list are not clear -
    because we aren't yet flagging which hooks are from the hookdir versus
    the config. I suppose we could move the addition of that field to the
    struct hook up to this patch, but it didn't make a lot of sense to me to
    do it just for cosmetic purposes.

 builtin/hook.c                | 16 ++++++++++++----
 hook.c                        | 15 +++++++++++++++
 hook.h                        |  1 +
 t/t1360-config-based-hooks.sh | 19 +++++++++++++++++++
 4 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/builtin/hook.c b/builtin/hook.c
index 4d36de52f8..45bbc83b2b 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -16,6 +16,7 @@ static int list(int argc, const char **argv, const char *prefix)
 	struct list_head *head, *pos;
 	struct hook *item;
 	struct strbuf hookname = STRBUF_INIT;
+	struct strbuf hookdir_annotation = STRBUF_INIT;
 
 	struct option list_options[] = {
 		OPT_END(),
@@ -42,10 +43,17 @@ static int list(int argc, const char **argv, const char *prefix)
 
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
-		if (item)
-			printf("%s: %s\n",
-			       config_scope_name(item->origin),
-			       item->command.buf);
+		if (item) {
+			/* Don't translate 'hookdir' - it matches the config */
+			printf("%s: %s%s\n",
+			       (item->from_hookdir
+				? "hookdir"
+				: config_scope_name(item->origin)),
+			       item->command.buf,
+			       (item->from_hookdir
+				? hookdir_annotation.buf
+				: ""));
+		}
 	}
 
 	clear_hook_list(head);
diff --git a/hook.c b/hook.c
index 937dc768c8..ffbdcfd987 100644
--- a/hook.c
+++ b/hook.c
@@ -2,6 +2,7 @@
 
 #include "hook.h"
 #include "config.h"
+#include "run-command.h"
 
 void free_hook(struct hook *ptr)
 {
@@ -34,6 +35,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 		to_add = xmalloc(sizeof(struct hook));
 		strbuf_init(&to_add->command, 0);
 		strbuf_addstr(&to_add->command, command);
+		to_add->from_hookdir = 0;
 	}
 
 	/* re-set the scope so we show where an override was specified */
@@ -100,6 +102,7 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	struct strbuf hook_key = STRBUF_INIT;
 	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
 	struct hook_config_cb cb_data = { &hook_key, hook_head };
+	const char *legacy_hook_path = NULL;
 
 	INIT_LIST_HEAD(hook_head);
 
@@ -110,6 +113,18 @@ struct list_head* hook_list(const struct strbuf* hookname)
 
 	git_config(hook_config_lookup, (void*)&cb_data);
 
+	if (have_git_dir())
+		legacy_hook_path = find_hook(hookname->buf);
+
+	/* Unconditionally add legacy hook, but annotate it. */
+	if (legacy_hook_path) {
+		struct hook *legacy_hook;
+
+		append_or_move_hook(hook_head, absolute_path(legacy_hook_path));
+		legacy_hook = list_entry(hook_head->prev, struct hook, list);
+		legacy_hook->from_hookdir = 1;
+	}
+
 	strbuf_release(&hook_key);
 	return hook_head;
 }
diff --git a/hook.h b/hook.h
index 8ffc4f14b6..5750634c83 100644
--- a/hook.h
+++ b/hook.h
@@ -12,6 +12,7 @@ struct hook
 	enum config_scope origin;
 	/* The literal command to run. */
 	struct strbuf command;
+	int from_hookdir;
 };
 
 /*
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 6e4a3e763f..0f12af4659 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -23,6 +23,14 @@ setup_hookcmd () {
 	test_config_global hookcmd.abc.command "/path/abc" --add
 }
 
+setup_hookdir () {
+	mkdir .git/hooks
+	write_script .git/hooks/pre-commit <<-EOF
+	echo \"Legacy Hook\"
+	EOF
+	test_when_finished rm -rf .git/hooks
+}
+
 test_expect_success 'git hook rejects commands without a mode' '
 	test_must_fail git hook pre-commit
 '
@@ -85,4 +93,15 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git hook list shows hooks from the hookdir' '
+	setup_hookdir &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 05/17] hook: respect hook.runHookDir
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (3 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 04/17] hook: include hookdir hook in list Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
                           ` (13 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Include hooks specified in the hook directory in the list of hooks to
run. These hooks do need to be treated differently from config-specified
ones - they do not need to run in a shell, and later on may be disabled
or warned about based on a config setting.

Because they are at least as local as the local config, we'll run them
last - to keep the hook execution order from global to local.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Newly split into its own commit since v4, and taking place much sooner.
    
    An unfortunate side effect of adding this support *before* the
    hook.runHookDir support is that the labels on the list are not clear -
    because we aren't yet flagging which hooks are from the hookdir versus
    the config. I suppose we could move the addition of that field to the
    struct hook up to this patch, but it didn't make a lot of sense to me to
    do it just for cosmetic purposes.

 Documentation/config/hook.txt |  5 ++++
 builtin/hook.c                | 54 +++++++++++++++++++++++++++++++++--
 hook.c                        | 21 ++++++++++++++
 hook.h                        | 15 ++++++++++
 t/t1360-config-based-hooks.sh | 43 ++++++++++++++++++++++++++++
 5 files changed, 135 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
index 71449ecbc7..75312754ae 100644
--- a/Documentation/config/hook.txt
+++ b/Documentation/config/hook.txt
@@ -7,3 +7,8 @@ hookcmd.<name>.command::
 	A command to execute during a hook for which <name> has been specified
 	as a command. This can be an executable on your device or a oneliner for
 	your shell. See linkgit:git-hook[1].
+
+hook.runHookDir::
+	Controls how hooks contained in your hookdir are executed. Can be any of
+	"yes", "warn", "interactive", or "no". Defaults to "yes". See
+	linkgit:git-hook[1] and linkgit:git-config[1] "core.hooksPath").
diff --git a/builtin/hook.c b/builtin/hook.c
index 45bbc83b2b..16324d4195 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -11,6 +11,8 @@ static const char * const builtin_hook_usage[] = {
 	NULL
 };
 
+static enum hookdir_opt should_run_hookdir;
+
 static int list(int argc, const char **argv, const char *prefix)
 {
 	struct list_head *head, *pos;
@@ -41,6 +43,26 @@ static int list(int argc, const char **argv, const char *prefix)
 		return 0;
 	}
 
+	switch (should_run_hookdir) {
+		case hookdir_no:
+			strbuf_addstr(&hookdir_annotation, _(" (will not run)"));
+			break;
+		case hookdir_interactive:
+			strbuf_addstr(&hookdir_annotation, _(" (will prompt)"));
+			break;
+		case hookdir_warn:
+		case hookdir_unknown:
+			strbuf_addstr(&hookdir_annotation, _(" (will warn)"));
+			break;
+		case hookdir_yes:
+		/*
+		 * The default behavior should agree with
+		 * hook.c:configured_hookdir_opt().
+		 */
+		default:
+			break;
+	}
+
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
 		if (item) {
@@ -64,14 +86,40 @@ static int list(int argc, const char **argv, const char *prefix)
 
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
+	const char *run_hookdir = NULL;
+
 	struct option builtin_hook_options[] = {
+		OPT_STRING(0, "run-hookdir", &run_hookdir, N_("option"),
+			   N_("what to do with hooks found in the hookdir")),
 		OPT_END(),
 	};
-	if (argc < 2)
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	/* after the parse, we should have "<command> <hookname> <args...>" */
+	if (argc < 1)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
-	if (!strcmp(argv[1], "list"))
-		return list(argc - 1, argv + 1, prefix);
+
+	/* argument > config */
+	if (run_hookdir)
+		if (!strcmp(run_hookdir, "no"))
+			should_run_hookdir = hookdir_no;
+		else if (!strcmp(run_hookdir, "yes"))
+			should_run_hookdir = hookdir_yes;
+		else if (!strcmp(run_hookdir, "warn"))
+			should_run_hookdir = hookdir_warn;
+		else if (!strcmp(run_hookdir, "interactive"))
+			should_run_hookdir = hookdir_interactive;
+		else
+			die(_("'%s' is not a valid option for --run-hookdir "
+			      "(yes, warn, interactive, no)"), run_hookdir);
+	else
+		should_run_hookdir = configured_hookdir_opt();
+
+	if (!strcmp(argv[0], "list"))
+		return list(argc, argv, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index ffbdcfd987..340e5a35c8 100644
--- a/hook.c
+++ b/hook.c
@@ -97,6 +97,27 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 	return 0;
 }
 
+enum hookdir_opt configured_hookdir_opt(void)
+{
+	const char *key;
+	if (git_config_get_value("hook.runhookdir", &key))
+		return hookdir_yes; /* by default, just run it. */
+
+	if (!strcmp(key, "no"))
+		return hookdir_no;
+
+	if (!strcmp(key, "yes"))
+		return hookdir_yes;
+
+	if (!strcmp(key, "warn"))
+		return hookdir_warn;
+
+	if (!strcmp(key, "interactive"))
+		return hookdir_interactive;
+
+	return hookdir_unknown;
+}
+
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
diff --git a/hook.h b/hook.h
index 5750634c83..ca45d388d3 100644
--- a/hook.h
+++ b/hook.h
@@ -21,6 +21,21 @@ struct hook
  */
 struct list_head* hook_list(const struct strbuf *hookname);
 
+enum hookdir_opt
+{
+	hookdir_no,
+	hookdir_warn,
+	hookdir_interactive,
+	hookdir_yes,
+	hookdir_unknown,
+};
+
+/*
+ * Provides the hookdir_opt specified in the config without consulting any
+ * command line arguments.
+ */
+enum hookdir_opt configured_hookdir_opt(void);
+
 /* Free memory associated with a 'struct hook' */
 void free_hook(struct hook *ptr);
 /* Empties the list at 'head', calling 'free_hook()' on each entry */
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 0f12af4659..91127a50a4 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -104,4 +104,47 @@ test_expect_success 'git hook list shows hooks from the hookdir' '
 	test_cmp expected actual
 '
 
+test_expect_success 'hook.runHookDir = no is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "no" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will not run)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
+test_expect_success 'hook.runHookDir = warn is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "warn" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will warn)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
+
+test_expect_success 'hook.runHookDir = interactive is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "interactive" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will prompt)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 06/17] hook: implement hookcmd.<name>.skip
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (4 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 05/17] hook: respect hook.runHookDir Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 07/17] parse-options: parse into strvec Emily Shaffer
                           ` (12 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

If a user wants a specific repo to skip execution of a hook which is set
at a global or system level, they can now do so by specifying 'skip' in
their repo config:

~/.gitconfig
  [hook.pre-commit]
    command = skippable-oneliner
    command = skippable-hookcmd

  [hookcmd.skippable-hookcmd]
    command = foo.sh

$GIT_DIR/.git/config
  [hookcmd.skippable-oneliner]
    skip = true
  [hookcmd.skippable-hookcmd]
    skip = true

Later it may make sense to add an option like
"hookcmd.<name>.<hook-event>-skip" - but for simplicity, let's start
with a universal skip setting like this.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    In addition to being handy for turning off global hooks one project doesn't
    care about, this setting will be necessary much later for the 'proc-receive'
    hook, which can only cope with up to one hook being specified.
    
    New since v4.
    
    During the Google team's review club I was reminded about this whole
    'skip' option I never implemented. It's true that it's impossible to
    exclude a given hook without this; however, I think I have some more
    work to do on it, so consider it RFC for now and tell me what you think
    :)
     - Emily
    
    During the Google team's review club this week I was reminded about this whole
    'skip' option I never implemented. It's true that it's impossible to exclude
    a given hook without this; however, I think we have some more work to do on it,
    so consider it RFC for now and tell me what you think :)
    
     - Emily

 hook.c                        | 37 +++++++++++++++++++++++++----------
 t/t1360-config-based-hooks.sh | 23 ++++++++++++++++++++++
 2 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/hook.c b/hook.c
index 340e5a35c8..f4084e33c8 100644
--- a/hook.c
+++ b/hook.c
@@ -12,23 +12,24 @@ void free_hook(struct hook *ptr)
 	}
 }
 
-static void append_or_move_hook(struct list_head *head, const char *command)
+static struct hook* find_hook_by_command(struct list_head *head, const char *command)
 {
 	struct list_head *pos = NULL, *tmp = NULL;
-	struct hook *to_add = NULL;
+	struct hook *found = NULL;
 
-	/*
-	 * remove the prior entry with this command; we'll replace it at the
-	 * end.
-	 */
 	list_for_each_safe(pos, tmp, head) {
 		struct hook *it = list_entry(pos, struct hook, list);
 		if (!strcmp(it->command.buf, command)) {
 		    list_del(pos);
-		    /* we'll simply move the hook to the end */
-		    to_add = it;
+		    found = it;
 		}
 	}
+	return found;
+}
+
+static void append_or_move_hook(struct list_head *head, const char *command)
+{
+	struct hook *to_add = find_hook_by_command(head, command);
 
 	if (!to_add) {
 		/* adding a new hook, not moving an old one */
@@ -41,7 +42,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 	/* re-set the scope so we show where an override was specified */
 	to_add->origin = current_config_scope();
 
-	list_add_tail(&to_add->list, pos);
+	list_add_tail(&to_add->list, head);
 }
 
 static void remove_hook(struct list_head *to_remove)
@@ -73,8 +74,18 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 	if (!strcmp(key, hook_key)) {
 		const char *command = value;
 		struct strbuf hookcmd_name = STRBUF_INIT;
+		int skip = 0;
+
+		/*
+		 * Check if we're removing that hook instead. Hookcmds are
+		 * removed by name, and inlined hooks are removed by command
+		 * content.
+		 */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.skip", command);
+		git_config_get_bool(hookcmd_name.buf, &skip);
 
 		/* Check if a hookcmd with that name exists. */
+		strbuf_reset(&hookcmd_name);
 		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
 		git_config_get_value(hookcmd_name.buf, &command);
 
@@ -89,7 +100,13 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 		 *   for each key+value, do_callback(key, value, cb_data)
 		 */
 
-		append_or_move_hook(head, command);
+		if (skip) {
+			struct hook *to_remove = find_hook_by_command(head, command);
+			if (to_remove)
+				remove_hook(&(to_remove->list));
+		} else {
+			append_or_move_hook(head, command);
+		}
 
 		strbuf_release(&hookcmd_name);
 	}
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 91127a50a4..ebd3bc623f 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -132,6 +132,29 @@ test_expect_success 'hook.runHookDir = warn is respected by list' '
 	test_i18ncmp expected actual
 '
 
+test_expect_success 'git hook list removes skipped hookcmd' '
+	setup_hookcmd &&
+	test_config hookcmd.abc.skip "true" --add &&
+
+	cat >expected <<-EOF &&
+	no commands configured for hook '\''pre-commit'\''
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_i18ncmp expected actual
+'
+
+test_expect_success 'git hook list removes skipped inlined hook' '
+	setup_hooks &&
+	test_config hookcmd."$ROOT/path/ghi".skip "true" --add &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
 
 test_expect_success 'hook.runHookDir = interactive is respected by list' '
 	setup_hookdir &&
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 07/17] parse-options: parse into strvec
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (5 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:45         ` [PATCH 08/17] hook: add 'run' subcommand Emily Shaffer
                           ` (11 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

parse-options already knows how to read into a string_list, and it knows
how to read into an strvec as a passthrough (that is, including the
argument as well as its value). string_list and strvec serve similar
purposes but are somewhat painful to convert between; so, let's teach
parse-options to read values of string arguments directly into an
strvec without preserving the argument name.

This is useful if collecting generic arguments to pass through to
another command, for example, 'git hook run --arg "--quiet" --arg
"--format=pretty" some-hook'. The resulting strvec would contain
{ "--quiet", "--format=pretty" }.

The implementation is based on that of OPT_STRING_LIST.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, fixed one or two more places where I missed the argv_array->strvec
    rename.

 Documentation/technical/api-parse-options.txt |  5 +++++
 parse-options-cb.c                            | 16 ++++++++++++++++
 parse-options.h                               |  4 ++++
 3 files changed, 25 insertions(+)

diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt
index 5a60bbfa7f..679bd98629 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -173,6 +173,11 @@ There are some macros to easily define options:
 	The string argument is stored as an element in `string_list`.
 	Use of `--no-option` will clear the list of preceding values.
 
+`OPT_STRVEC(short, long, &struct strvec, arg_str, description)`::
+	Introduce an option with a string argument.
+	The string argument is stored as an element in `strvec`.
+	Use of `--no-option` will clear the list of preceding values.
+
 `OPT_INTEGER(short, long, &int_var, description)`::
 	Introduce an option with integer argument.
 	The integer is put into `int_var`.
diff --git a/parse-options-cb.c b/parse-options-cb.c
index 4542d4d3f9..c2451dfb1b 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -207,6 +207,22 @@ int parse_opt_string_list(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
+int parse_opt_strvec(const struct option *opt, const char *arg, int unset)
+{
+	struct strvec *v = opt->value;
+
+	if (unset) {
+		strvec_clear(v);
+		return 0;
+	}
+
+	if (!arg)
+		return -1;
+
+	strvec_push(v, arg);
+	return 0;
+}
+
 int parse_opt_noop_cb(const struct option *opt, const char *arg, int unset)
 {
 	return 0;
diff --git a/parse-options.h b/parse-options.h
index 7030d8f3da..75cc8c7c96 100644
--- a/parse-options.h
+++ b/parse-options.h
@@ -177,6 +177,9 @@ struct option {
 #define OPT_STRING_LIST(s, l, v, a, h) \
 				    { OPTION_CALLBACK, (s), (l), (v), (a), \
 				      (h), 0, &parse_opt_string_list }
+#define OPT_STRVEC(s, l, v, a, h) \
+				    { OPTION_CALLBACK, (s), (l), (v), (a), \
+				      (h), 0, &parse_opt_strvec }
 #define OPT_UYN(s, l, v, h)         { OPTION_CALLBACK, (s), (l), (v), NULL, \
 				      (h), PARSE_OPT_NOARG, &parse_opt_tertiary }
 #define OPT_EXPIRY_DATE(s, l, v, h) \
@@ -296,6 +299,7 @@ int parse_opt_commits(const struct option *, const char *, int);
 int parse_opt_commit(const struct option *, const char *, int);
 int parse_opt_tertiary(const struct option *, const char *, int);
 int parse_opt_string_list(const struct option *, const char *, int);
+int parse_opt_strvec(const struct option *, const char *, int);
 int parse_opt_noop_cb(const struct option *, const char *, int);
 enum parse_opt_result parse_opt_unknown_cb(struct parse_opt_ctx_t *ctx,
 					   const struct option *,
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 08/17] hook: add 'run' subcommand
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (6 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 07/17] parse-options: parse into strvec Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-11 10:15           ` Phillip Wood
  2020-12-05  1:45         ` [PATCH 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
                           ` (10 subsequent siblings)
  18 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In order to enable hooks to be run as an external process, by a
standalone Git command, or by tools which wrap Git, provide an external
means to run all configured hook commands for a given hook event.

For now, the hook commands will run in config order, in series. As
alternate ordering or parallelism is supported in the future, we should
add knobs to use those to the command line as well.

As with the legacy hook implementation, all stdout generated by hook
commands is redirected to stderr. Piping from stdin is not yet
supported.

Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
execution list. For now, there is no way to disable them.

Users may wish to provide hook commands like 'git config
hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
first split by space or quotes into an argv_array, then expanded with
'expand_user_path()'.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, updated the docs, and did less local application of single
    quotes. In order for hookdir hooks to run successfully with a space in
    the path, though, they must not be run with 'sh -c'. So we can treat the
    hookdir hooks specially, and warn users via doc about special
    considerations for configured hooks with spaces in their path.

 Documentation/git-hook.txt    |  31 +++++++++-
 builtin/hook.c                |  48 ++++++++++++++-
 hook.c                        | 112 ++++++++++++++++++++++++++++++++++
 hook.h                        |  32 ++++++++++
 t/t1360-config-based-hooks.sh |  65 +++++++++++++++++++-
 5 files changed, 281 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index f19875ed68..18a817d832 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -9,11 +9,12 @@ SYNOPSIS
 --------
 [verse]
 'git hook' list <hook-name>
+'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hook-name>
 
 DESCRIPTION
 -----------
-You can list configured hooks with this command. Later, you will be able to run,
-add, and modify hooks with this command.
+You can list and run configured hooks with this command. Later, you will be able
+to add and modify hooks with this command.
 
 This command parses the default configuration files for sections `hook` and
 `hookcmd`. `hook` is used to describe the commands which will be run during a
@@ -64,6 +65,32 @@ in the order they should be run, and print the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
 This output is human-readable and the format is subject to change over time.
 
+run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] `<hook-name>`::
+
+Runs hooks configured for `<hook-name>`, in the same order displayed by `git
+hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
+containing special characters or spaces should be wrapped in single quotes:
+`command = '/my/path with spaces/script.sh' some args`.
+
+OPTIONS
+-------
+--run-hookdir::
+	Overrides the hook.runHookDir config. Must be 'yes', 'warn',
+	'interactive', or 'no'. Specifies how to handle hooks located in the Git
+	hook directory (core.hooksPath).
+
+-a::
+--arg::
+	Only valid for `run`.
++
+Specify arguments to pass to every hook that is run.
+
+-e::
+--env::
+	Only valid for `run`.
++
+Specify environment variables to set for every hook that is run.
+
 CONFIGURATION
 -------------
 include::config/hook.txt[]
diff --git a/builtin/hook.c b/builtin/hook.c
index 16324d4195..26f7050387 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -5,9 +5,11 @@
 #include "hook.h"
 #include "parse-options.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
+	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
 	NULL
 };
 
@@ -84,6 +86,46 @@ static int list(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static int run(int argc, const char **argv, const char *prefix)
+{
+	struct strbuf hookname = STRBUF_INIT;
+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
+	int rc = 0;
+
+	struct option run_options[] = {
+		OPT_STRVEC('e', "env", &opt.env, N_("var"),
+			   N_("environment variables for hook to use")),
+		OPT_STRVEC('a', "arg", &opt.args, N_("args"),
+			   N_("argument to pass to hook")),
+		OPT_END(),
+	};
+
+	/*
+	 * While it makes sense to list hooks out-of-repo, it doesn't make sense
+	 * to execute them. Hooks usually want to look at repository artifacts.
+	 */
+	if (!have_git_dir())
+		usage_msg_opt(_("You must be in a Git repo to execute hooks."),
+			      builtin_hook_usage, run_options);
+
+	argc = parse_options(argc, argv, prefix, run_options,
+			     builtin_hook_usage, 0);
+
+	if (argc < 1)
+		usage_msg_opt(_("You must specify a hook event to run."),
+			      builtin_hook_usage, run_options);
+
+	strbuf_addstr(&hookname, argv[0]);
+	opt.run_hookdir = should_run_hookdir;
+
+	rc = run_hooks(hookname.buf, &opt);
+
+	strbuf_release(&hookname);
+	run_hooks_opt_clear(&opt);
+
+	return rc;
+}
+
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
 	const char *run_hookdir = NULL;
@@ -95,10 +137,10 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 	};
 
 	argc = parse_options(argc, argv, prefix, builtin_hook_options,
-			     builtin_hook_usage, 0);
+			     builtin_hook_usage, PARSE_OPT_KEEP_UNKNOWN);
 
 	/* after the parse, we should have "<command> <hookname> <args...>" */
-	if (argc < 1)
+	if (argc < 2)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
 
@@ -120,6 +162,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 
 	if (!strcmp(argv[0], "list"))
 		return list(argc, argv, prefix);
+	if (!strcmp(argv[0], "run"))
+		return run(argc, argv, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index f4084e33c8..c4595a2324 100644
--- a/hook.c
+++ b/hook.c
@@ -3,6 +3,7 @@
 #include "hook.h"
 #include "config.h"
 #include "run-command.h"
+#include "prompt.h"
 
 void free_hook(struct hook *ptr)
 {
@@ -135,6 +136,56 @@ enum hookdir_opt configured_hookdir_opt(void)
 	return hookdir_unknown;
 }
 
+static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
+{
+	struct strbuf prompt = STRBUF_INIT;
+	/*
+	 * If the path doesn't exist, don't bother adding the empty hook and
+	 * don't bother checking the config or prompting the user.
+	 */
+	if (!path)
+		return 0;
+
+	switch (cfg)
+	{
+		case hookdir_no:
+			return 0;
+		case hookdir_unknown:
+			fprintf(stderr,
+				_("Unrecognized value for 'hook.runHookDir'. "
+				  "Is there a typo? "));
+			/* FALLTHROUGH */
+		case hookdir_warn:
+			fprintf(stderr, _("Running legacy hook at '%s'\n"),
+				path);
+			return 1;
+		case hookdir_interactive:
+			do {
+				/*
+				 * TRANSLATORS: Make sure to include [Y] and [n]
+				 * in your translation. Only English input is
+				 * accepted. Default option is "yes".
+				 */
+				fprintf(stderr, _("Run '%s'? [Yn] "), path);
+				git_read_line_interactively(&prompt);
+				strbuf_tolower(&prompt);
+				if (starts_with(prompt.buf, "n")) {
+					strbuf_release(&prompt);
+					return 0;
+				} else if (starts_with(prompt.buf, "y")) {
+					strbuf_release(&prompt);
+					return 1;
+				}
+				/* otherwise, we didn't understand the input */
+			} while (prompt.len); /* an empty reply means "Yes" */
+			strbuf_release(&prompt);
+			return 1;
+		case hookdir_yes:
+		default:
+			return 1;
+	}
+}
+
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
@@ -166,3 +217,64 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	strbuf_release(&hook_key);
 	return hook_head;
 }
+
+void run_hooks_opt_init(struct run_hooks_opt *o)
+{
+	strvec_init(&o->env);
+	strvec_init(&o->args);
+	o->run_hookdir = configured_hookdir_opt();
+}
+
+void run_hooks_opt_clear(struct run_hooks_opt *o)
+{
+	strvec_clear(&o->env);
+	strvec_clear(&o->args);
+}
+
+int run_hooks(const char *hookname, struct run_hooks_opt *options)
+{
+	struct strbuf hookname_str = STRBUF_INIT;
+	struct list_head *to_run, *pos = NULL, *tmp = NULL;
+	int rc = 0;
+
+	if (!options)
+		BUG("a struct run_hooks_opt must be provided to run_hooks");
+
+	strbuf_addstr(&hookname_str, hookname);
+
+	to_run = hook_list(&hookname_str);
+
+	list_for_each_safe(pos, tmp, to_run) {
+		struct child_process hook_proc = CHILD_PROCESS_INIT;
+		struct hook *hook = list_entry(pos, struct hook, list);
+
+		hook_proc.env = options->env.v;
+		hook_proc.no_stdin = 1;
+		hook_proc.stdout_to_stderr = 1;
+		hook_proc.trace2_hook_name = hook->command.buf;
+		hook_proc.use_shell = 1;
+
+		if (hook->from_hookdir) {
+		    if (!should_include_hookdir(hook->command.buf, options->run_hookdir))
+			continue;
+		    /*
+		     * Commands from the config could be oneliners, but we know
+		     * for certain that hookdir commands are not.
+		     */
+		    hook_proc.use_shell = 0;
+		}
+
+		/* add command */
+		strvec_push(&hook_proc.args, hook->command.buf);
+
+		/*
+		 * add passed-in argv, without expanding - let the user get back
+		 * exactly what they put in
+		 */
+		strvec_pushv(&hook_proc.args, options->args.v);
+
+		rc |= run_command(&hook_proc);
+	}
+
+	return rc;
+}
diff --git a/hook.h b/hook.h
index ca45d388d3..d1c3d71e82 100644
--- a/hook.h
+++ b/hook.h
@@ -1,6 +1,7 @@
 #include "config.h"
 #include "list.h"
 #include "strbuf.h"
+#include "strvec.h"
 
 struct hook
 {
@@ -36,6 +37,37 @@ enum hookdir_opt
  */
 enum hookdir_opt configured_hookdir_opt(void);
 
+struct run_hooks_opt
+{
+	/* Environment vars to be set for each hook */
+	struct strvec env;
+
+	/* Args to be passed to each hook */
+	struct strvec args;
+
+	/*
+	 * How should the hookdir be handled?
+	 * Leave the RUN_HOOKS_OPT_INIT default in most cases; this only needs
+	 * to be overridden if the user can override it at the command line.
+	 */
+	enum hookdir_opt run_hookdir;
+};
+
+#define RUN_HOOKS_OPT_INIT  {   		\
+	.env = STRVEC_INIT, 				\
+	.args = STRVEC_INIT, 			\
+	.run_hookdir = configured_hookdir_opt()	\
+}
+
+void run_hooks_opt_init(struct run_hooks_opt *o);
+void run_hooks_opt_clear(struct run_hooks_opt *o);
+
+/*
+ * Runs all hooks associated to the 'hookname' event in order. Each hook will be
+ * passed 'env' and 'args'.
+ */
+int run_hooks(const char *hookname, struct run_hooks_opt *options);
+
 /* Free memory associated with a 'struct hook' */
 void free_hook(struct hook *ptr);
 /* Empties the list at 'head', calling 'free_hook()' on each entry */
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index ebd3bc623f..5b3003d59b 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -115,7 +115,10 @@ test_expect_success 'hook.runHookDir = no is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
-	test_i18ncmp expected actual
+	test_i18ncmp expected actual &&
+
+	git hook run pre-commit 2>actual &&
+	test_must_be_empty actual
 '
 
 test_expect_success 'hook.runHookDir = warn is respected by list' '
@@ -129,6 +132,14 @@ test_expect_success 'hook.runHookDir = warn is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
+	test_i18ncmp expected actual &&
+
+	cat >expected <<-EOF &&
+	Running legacy hook at '\''$(pwd)/.git/hooks/pre-commit'\''
+	"Legacy Hook"
+	EOF
+
+	git hook run pre-commit 2>actual &&
 	test_i18ncmp expected actual
 '
 
@@ -156,7 +167,7 @@ test_expect_success 'git hook list removes skipped inlined hook' '
 	test_cmp expected actual
 '
 
-test_expect_success 'hook.runHookDir = interactive is respected by list' '
+test_expect_success 'hook.runHookDir = interactive is respected by list and run' '
 	setup_hookdir &&
 
 	test_config hook.runHookDir "interactive" &&
@@ -167,7 +178,55 @@ test_expect_success 'hook.runHookDir = interactive is respected by list' '
 
 	git hook list pre-commit >actual &&
 	# the hookdir annotation is translated
-	test_i18ncmp expected actual
+	test_i18ncmp expected actual &&
+
+	test_write_lines n | git hook run pre-commit 2>actual &&
+	! grep "Legacy Hook" actual &&
+
+	test_write_lines y | git hook run pre-commit 2>actual &&
+	grep "Legacy Hook" actual
+'
+
+test_expect_success 'inline hook definitions execute oneliners' '
+	test_config hook.pre-commit.command "echo \"Hello World\"" &&
+
+	echo "Hello World" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'inline hook definitions resolve paths' '
+	write_script sample-hook.sh <<-EOF &&
+	echo \"Sample Hook\"
+	EOF
+
+	test_when_finished "rm sample-hook.sh" &&
+
+	test_config hook.pre-commit.command "\"$(pwd)/sample-hook.sh\"" &&
+
+	echo \"Sample Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'hookdir hook included in git hook run' '
+	setup_hookdir &&
+
+	echo \"Legacy Hook\" >expected &&
+
+	# hooks are run with stdout_to_stderr = 1
+	git hook run pre-commit 2>actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'out-of-repo runs excluded' '
+	setup_hooks &&
+
+	nongit test_must_fail git hook run pre-commit
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 09/17] hook: replace find_hook() with hook_exists()
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (7 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 08/17] hook: add 'run' subcommand Emily Shaffer
@ 2020-12-05  1:45         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 10/17] hook: support passing stdin to hooks Emily Shaffer
                           ` (9 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:45 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Add a helper to easily determine whether any hooks exist for a given
hook event.

Many callers want to check whether some state could be modified by a
hook; that check should include the config-based hooks as well. Optimize
by checking the config directly. Since commands which execute hooks
might want to take args to replace 'hook.runHookDir', let
'hook_exists()' mirror the behavior of 'hook.runHookDir'.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, updated this commit to include bugreport as a builtin instead of
    as a standalone.
    
    Since v4, a little more nuance when deciding whether a hookdir hook can happen.

 builtin/bugreport.c |  4 ++--
 hook.c              | 15 +++++++++++++++
 hook.h              |  9 +++++++++
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 3ad4b9b62e..11043f4a22 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -3,7 +3,7 @@
 #include "strbuf.h"
 #include "help.h"
 #include "compat/compiler.h"
-#include "run-command.h"
+#include "hook.h"
 
 
 static void get_system_info(struct strbuf *sys_info)
@@ -82,7 +82,7 @@ static void get_populated_hooks(struct strbuf *hook_info, int nongit)
 	}
 
 	for (i = 0; i < ARRAY_SIZE(hook); i++)
-		if (find_hook(hook[i]))
+		if (hook_exists(hook[i], configured_hookdir_opt()))
 			strbuf_addf(hook_info, "%s\n", hook[i]);
 }
 
diff --git a/hook.c b/hook.c
index c4595a2324..a7a4abdcac 100644
--- a/hook.c
+++ b/hook.c
@@ -225,6 +225,21 @@ void run_hooks_opt_init(struct run_hooks_opt *o)
 	o->run_hookdir = configured_hookdir_opt();
 }
 
+int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir)
+{
+	const char *value = NULL; /* throwaway */
+	struct strbuf hook_key = STRBUF_INIT;
+
+	int could_run_hookdir = (should_run_hookdir == hookdir_interactive ||
+				should_run_hookdir == hookdir_warn ||
+				should_run_hookdir == hookdir_yes)
+				&& !!find_hook(hookname);
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname);
+
+	return (!git_config_get_value(hook_key.buf, &value)) || could_run_hookdir;
+}
+
 void run_hooks_opt_clear(struct run_hooks_opt *o)
 {
 	strvec_clear(&o->env);
diff --git a/hook.h b/hook.h
index d1c3d71e82..94a25c7cd0 100644
--- a/hook.h
+++ b/hook.h
@@ -62,6 +62,15 @@ struct run_hooks_opt
 void run_hooks_opt_init(struct run_hooks_opt *o);
 void run_hooks_opt_clear(struct run_hooks_opt *o);
 
+/*
+ * Returns 1 if any hooks are specified in the config or if a hook exists in the
+ * hookdir. Typically, invoke hook_exsts() like:
+ *   hook_exists(hookname, configured_hookdir_opt());
+ * Like with run_hooks, if you take a --run-hookdir flag, reflect that
+ * user-specified behavior here instead.
+ */
+int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir);
+
 /*
  * Runs all hooks associated to the 'hookname' event in order. Each hook will be
  * passed 'env' and 'args'.
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 10/17] hook: support passing stdin to hooks
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (8 preceding siblings ...)
  2020-12-05  1:45         ` [PATCH 09/17] hook: replace find_hook() with hook_exists() Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
                           ` (8 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Some hooks (such as post-rewrite) need to take input via stdin.
Previously, callers provided stdin to hooks by setting
run-command.h:child_process.in, which takes a FD. Callers would open the
file in question themselves before calling run-command(). However, since
we will now need to seek to the front of the file and read it again for
every hook which runs, hook.h:run_command() takes a path and handles FD
management itself. Since this file is opened for read only, it should
not prevent later parallel execution support.

On the frontend, this is supported by asking for a file path, rather
than by reading stdin. Reading directly from stdin would involve caching
the entire stdin (to memory or to disk) and reading it back from the
beginning to each hook. We'd want to support cases like insufficient
memory or storage for the file. While this may prove useful later, for
now the path of least resistance is to just ask the user to make this
interim file themselves.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt    | 11 +++++++++--
 builtin/hook.c                |  5 ++++-
 hook.c                        |  7 ++++++-
 hook.h                        |  9 +++++++--
 t/t1360-config-based-hooks.sh | 24 ++++++++++++++++++++++++
 5 files changed, 50 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 18a817d832..cce30a80d0 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -9,7 +9,8 @@ SYNOPSIS
 --------
 [verse]
 'git hook' list <hook-name>
-'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hook-name>
+'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>]
+	<hook-name>
 
 DESCRIPTION
 -----------
@@ -65,7 +66,7 @@ in the order they should be run, and print the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
 This output is human-readable and the format is subject to change over time.
 
-run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] `<hook-name>`::
+run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>] `<hook-name>`::
 
 Runs hooks configured for `<hook-name>`, in the same order displayed by `git
 hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
@@ -91,6 +92,12 @@ Specify arguments to pass to every hook that is run.
 +
 Specify environment variables to set for every hook that is run.
 
+--to-stdin::
+	Only valid for `run`.
++
+Specify a file which will be streamed into stdin for every hook that is run.
+Each hook will receive the entire file from beginning to EOF.
+
 CONFIGURATION
 -------------
 include::config/hook.txt[]
diff --git a/builtin/hook.c b/builtin/hook.c
index 26f7050387..e45831e01d 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -9,7 +9,8 @@
 
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
-	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
+	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...]"
+	   "[--to-stdin=<path>] <hookname>"),
 	NULL
 };
 
@@ -97,6 +98,8 @@ static int run(int argc, const char **argv, const char *prefix)
 			   N_("environment variables for hook to use")),
 		OPT_STRVEC('a', "arg", &opt.args, N_("args"),
 			   N_("argument to pass to hook")),
+		OPT_STRING(0, "to-stdin", &opt.path_to_stdin, N_("path"),
+			   N_("file to read into hooks' stdin")),
 		OPT_END(),
 	};
 
diff --git a/hook.c b/hook.c
index a7a4abdcac..c7fdf556fe 100644
--- a/hook.c
+++ b/hook.c
@@ -263,8 +263,13 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 		struct child_process hook_proc = CHILD_PROCESS_INIT;
 		struct hook *hook = list_entry(pos, struct hook, list);
 
+		/* reopen the file for stdin; run_command closes it. */
+		if (options->path_to_stdin)
+			hook_proc.in = xopen(options->path_to_stdin, O_RDONLY);
+		else
+			hook_proc.no_stdin = 1;
+
 		hook_proc.env = options->env.v;
-		hook_proc.no_stdin = 1;
 		hook_proc.stdout_to_stderr = 1;
 		hook_proc.trace2_hook_name = hook->command.buf;
 		hook_proc.use_shell = 1;
diff --git a/hook.h b/hook.h
index 94a25c7cd0..5184dcaa5a 100644
--- a/hook.h
+++ b/hook.h
@@ -51,11 +51,15 @@ struct run_hooks_opt
 	 * to be overridden if the user can override it at the command line.
 	 */
 	enum hookdir_opt run_hookdir;
+
+	/* Path to file which should be piped to stdin for each hook */
+	const char *path_to_stdin;
 };
 
 #define RUN_HOOKS_OPT_INIT  {   		\
-	.env = STRVEC_INIT, 				\
+	.env = STRVEC_INIT, 			\
 	.args = STRVEC_INIT, 			\
+	.path_to_stdin = NULL,			\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
@@ -73,7 +77,8 @@ int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir);
 
 /*
  * Runs all hooks associated to the 'hookname' event in order. Each hook will be
- * passed 'env' and 'args'.
+ * passed 'env' and 'args'. The file at 'stdin_path' will be closed and reopened
+ * for each hook that runs.
  */
 int run_hooks(const char *hookname, struct run_hooks_opt *options);
 
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 5b3003d59b..c672269ee4 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -229,4 +229,28 @@ test_expect_success 'out-of-repo runs excluded' '
 	nongit test_must_fail git hook run pre-commit
 '
 
+test_expect_success 'stdin to multiple hooks' '
+	git config --add hook.test.command "xargs -P1 -I% echo a%" &&
+	git config --add hook.test.command "xargs -P1 -I% echo b%" &&
+	test_when_finished "test_unconfig hook.test.command" &&
+
+	cat >input <<-EOF &&
+	1
+	2
+	3
+	EOF
+
+	cat >expected <<-EOF &&
+	a1
+	a2
+	a3
+	b1
+	b2
+	b3
+	EOF
+
+	git hook run --to-stdin=input test 2>actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 11/17] run-command: allow stdin for run_processes_parallel
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (9 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 10/17] hook: support passing stdin to hooks Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 12/17] hook: allow parallel hook execution Emily Shaffer
                           ` (7 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

While it makes sense not to inherit stdin from the parent process to
avoid deadlocking, it's not necessary to completely ban stdin to
children. An informed user should be able to configure stdin safely. By
setting `some_child.process.no_stdin=1` before calling `get_next_task()`
we provide a reasonable default behavior but enable users to set up
stdin streaming for themselves during the callback.

`some_child.process.stdout_to_stderr`, however, remains unmodifiable by
`get_next_task()` - the rest of the run_processes_parallel() API depends
on child output in stderr.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 run-command.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/run-command.c b/run-command.c
index ea4d0fb4b1..80c8c97bc1 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1683,6 +1683,9 @@ static int pp_start_one(struct parallel_processes *pp)
 	if (i == pp->max_processes)
 		BUG("bookkeeping is hard");
 
+	/* disallow by default, but allow users to set up stdin if they wish */
+	pp->children[i].process.no_stdin = 1;
+
 	code = pp->get_next_task(&pp->children[i].process,
 				 &pp->children[i].err,
 				 pp->data,
@@ -1694,7 +1697,6 @@ static int pp_start_one(struct parallel_processes *pp)
 	}
 	pp->children[i].process.err = -1;
 	pp->children[i].process.stdout_to_stderr = 1;
-	pp->children[i].process.no_stdin = 1;
 
 	if (start_command(&pp->children[i].process)) {
 		code = pp->start_failure(&pp->children[i].err,
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 12/17] hook: allow parallel hook execution
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (10 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 11/17] run-command: allow stdin for run_processes_parallel Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 13/17] hook: allow specifying working directory for hooks Emily Shaffer
                           ` (6 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In many cases, there's no reason not to allow hooks to execute in
parallel. run_processes_parallel() is well-suited - it's a task queue
that runs its housekeeping in series, which means users don't
need to worry about thread safety on their callback data. True
multithreaded execution with the async_* functions isn't necessary here.
Synchronous hook execution can be achieved by only allowing 1 job to run
at a time.

Teach run_hooks() to use that function for simple hooks which don't
require stdin or capture of stderr.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Per AEvar's request - parallel hook execution on day zero.
    
    In most ways run_processes_parallel() worked great for me - but it didn't
    have great support for hooks where we pipe to and from. I had to add this
    support later in the series.
    
    Since I modified an existing and in-use library I'd appreciate a keen look on
    these patches.
    
     - Emily

 Documentation/config/hook.txt |   5 ++
 Documentation/git-hook.txt    |  15 +++-
 builtin/hook.c                |   6 +-
 hook.c                        | 142 ++++++++++++++++++++++++++--------
 hook.h                        |  28 ++++++-
 5 files changed, 158 insertions(+), 38 deletions(-)

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
index 75312754ae..a423d13781 100644
--- a/Documentation/config/hook.txt
+++ b/Documentation/config/hook.txt
@@ -12,3 +12,8 @@ hook.runHookDir::
 	Controls how hooks contained in your hookdir are executed. Can be any of
 	"yes", "warn", "interactive", or "no". Defaults to "yes". See
 	linkgit:git-hook[1] and linkgit:git-config[1] "core.hooksPath").
+
+hook.jobs::
+	Specifies how many hooks can be run simultaneously during parallelized
+	hook execution. If unspecified, defaults to the number of processors on
+	the current system.
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index cce30a80d0..c2678c61b2 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 [verse]
 'git hook' list <hook-name>
 'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>]
-	<hook-name>
+	[(-j|--jobs) <n>] <hook-name>
 
 DESCRIPTION
 -----------
@@ -66,7 +66,8 @@ in the order they should be run, and print the config scope where the relevant
 `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
 This output is human-readable and the format is subject to change over time.
 
-run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>] `<hook-name>`::
+run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] [--to-stdin=<path>]
+	[(-j|--jobs)<n>]`<hook-name>`::
 
 Runs hooks configured for `<hook-name>`, in the same order displayed by `git
 hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
@@ -98,6 +99,16 @@ Specify environment variables to set for every hook that is run.
 Specify a file which will be streamed into stdin for every hook that is run.
 Each hook will receive the entire file from beginning to EOF.
 
+-j::
+--jobs::
+	Only valid for `run`.
++
+Specify how many hooks to run simultaneously. If this flag is not specified, use
+the value of the `hook.jobs` config. If the config is not specified, use the
+number of CPUs on the current system. Some hooks may be ineligible for
+parallelization: for example, 'commit-msg' intends hooks modify the commit
+message body and cannot be parallelized.
+
 CONFIGURATION
 -------------
 include::config/hook.txt[]
diff --git a/builtin/hook.c b/builtin/hook.c
index e45831e01d..064a0fea29 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -10,7 +10,7 @@
 static const char * const builtin_hook_usage[] = {
 	N_("git hook list <hookname>"),
 	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...]"
-	   "[--to-stdin=<path>] <hookname>"),
+	   "[--to-stdin=<path>] [(-j|--jobs) <count>] <hookname>"),
 	NULL
 };
 
@@ -90,7 +90,7 @@ static int list(int argc, const char **argv, const char *prefix)
 static int run(int argc, const char **argv, const char *prefix)
 {
 	struct strbuf hookname = STRBUF_INIT;
-	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
+	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT_ASYNC;
 	int rc = 0;
 
 	struct option run_options[] = {
@@ -100,6 +100,8 @@ static int run(int argc, const char **argv, const char *prefix)
 			   N_("argument to pass to hook")),
 		OPT_STRING(0, "to-stdin", &opt.path_to_stdin, N_("path"),
 			   N_("file to read into hooks' stdin")),
+		OPT_INTEGER('j', "jobs", &opt.jobs,
+			    N_("run up to <n> hooks simultaneously")),
 		OPT_END(),
 	};
 
diff --git a/hook.c b/hook.c
index c7fdf556fe..edea54f95c 100644
--- a/hook.c
+++ b/hook.c
@@ -136,6 +136,14 @@ enum hookdir_opt configured_hookdir_opt(void)
 	return hookdir_unknown;
 }
 
+int configured_hook_jobs(void)
+{
+	int n = online_cpus();
+	git_config_get_int("hook.jobs", &n);
+
+	return n;
+}
+
 static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
 {
 	struct strbuf prompt = STRBUF_INIT;
@@ -223,6 +231,7 @@ void run_hooks_opt_init(struct run_hooks_opt *o)
 	strvec_init(&o->env);
 	strvec_init(&o->args);
 	o->run_hookdir = configured_hookdir_opt();
+	o->jobs = configured_hook_jobs();
 }
 
 int hook_exists(const char *hookname, enum hookdir_opt should_run_hookdir)
@@ -246,11 +255,96 @@ void run_hooks_opt_clear(struct run_hooks_opt *o)
 	strvec_clear(&o->args);
 }
 
+
+static int pick_next_hook(struct child_process *cp,
+			  struct strbuf *out,
+			  void *pp_cb,
+			  void **pp_task_cb)
+{
+	struct hook_cb_data *hook_cb = pp_cb;
+
+	struct hook *hook = list_entry(hook_cb->run_me, struct hook, list);
+
+	if (hook_cb->head == hook_cb->run_me)
+		return 0;
+
+	cp->env = hook_cb->options->env.v;
+	cp->stdout_to_stderr = 1;
+	cp->trace2_hook_name = hook->command.buf;
+
+	/* reopen the file for stdin; run_command closes it. */
+	if (hook_cb->options->path_to_stdin) {
+		cp->no_stdin = 0;
+		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
+	} else {
+		cp->no_stdin = 1;
+	}
+
+	/*
+	 * Commands from the config could be oneliners, but we know
+	 * for certain that hookdir commands are not.
+	 */
+	if (hook->from_hookdir)
+		cp->use_shell = 0;
+	else
+		cp->use_shell = 1;
+
+	/* add command */
+	strvec_push(&cp->args, hook->command.buf);
+
+	/*
+	 * add passed-in argv, without expanding - let the user get back
+	 * exactly what they put in
+	 */
+	strvec_pushv(&cp->args, hook_cb->options->args.v);
+
+	/* Provide context for errors if necessary */
+	*pp_task_cb = hook;
+
+	/* Get the next entry ready */
+	hook_cb->run_me = hook_cb->run_me->next;
+
+	return 1;
+}
+
+static int notify_start_failure(struct strbuf *out,
+				void *pp_cb,
+				void *pp_task_cp)
+{
+	struct hook_cb_data *hook_cb = pp_cb;
+	struct hook *attempted = pp_task_cp;
+
+	/* |= rc in cb */
+	hook_cb->rc |= 1;
+
+	strbuf_addf(out, _("Couldn't start '%s', configured in '%s'\n"),
+		    attempted->command.buf,
+		    attempted->from_hookdir ? "hookdir"
+		    	: config_scope_name(attempted->origin));
+
+	/* NEEDSWORK: if halt_on_error is desired, do it here. */
+	return 0;
+}
+
+static int notify_hook_finished(int result,
+				struct strbuf *out,
+				void *pp_cb,
+				void *pp_task_cb)
+{
+	struct hook_cb_data *hook_cb = pp_cb;
+
+	/* |= rc in cb */
+	hook_cb->rc |= result;
+
+	/* NEEDSWORK: if halt_on_error is desired, do it here. */
+	return 0;
+}
+
 int run_hooks(const char *hookname, struct run_hooks_opt *options)
 {
 	struct strbuf hookname_str = STRBUF_INIT;
 	struct list_head *to_run, *pos = NULL, *tmp = NULL;
-	int rc = 0;
+	struct hook_cb_data cb_data = { 0, NULL, NULL, options };
 
 	if (!options)
 		BUG("a struct run_hooks_opt must be provided to run_hooks");
@@ -260,41 +354,23 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 	to_run = hook_list(&hookname_str);
 
 	list_for_each_safe(pos, tmp, to_run) {
-		struct child_process hook_proc = CHILD_PROCESS_INIT;
 		struct hook *hook = list_entry(pos, struct hook, list);
 
-		/* reopen the file for stdin; run_command closes it. */
-		if (options->path_to_stdin)
-			hook_proc.in = xopen(options->path_to_stdin, O_RDONLY);
-		else
-			hook_proc.no_stdin = 1;
-
-		hook_proc.env = options->env.v;
-		hook_proc.stdout_to_stderr = 1;
-		hook_proc.trace2_hook_name = hook->command.buf;
-		hook_proc.use_shell = 1;
-
-		if (hook->from_hookdir) {
-		    if (!should_include_hookdir(hook->command.buf, options->run_hookdir))
-			continue;
-		    /*
-		     * Commands from the config could be oneliners, but we know
-		     * for certain that hookdir commands are not.
-		     */
-		    hook_proc.use_shell = 0;
-		}
-
-		/* add command */
-		strvec_push(&hook_proc.args, hook->command.buf);
+		if (hook->from_hookdir &&
+		    !should_include_hookdir(hook->command.buf, options->run_hookdir))
+			    list_del(pos);
+	}
 
-		/*
-		 * add passed-in argv, without expanding - let the user get back
-		 * exactly what they put in
-		 */
-		strvec_pushv(&hook_proc.args, options->args.v);
+	cb_data.head = to_run;
+	cb_data.run_me = to_run->next;
 
-		rc |= run_command(&hook_proc);
-	}
+	run_processes_parallel_tr2(options->jobs,
+				   pick_next_hook,
+				   notify_start_failure,
+				   notify_hook_finished,
+				   &cb_data,
+				   "hook",
+				   hookname);
 
-	return rc;
+	return cb_data.rc;
 }
diff --git a/hook.h b/hook.h
index 5184dcaa5a..f54568afe3 100644
--- a/hook.h
+++ b/hook.h
@@ -37,6 +37,9 @@ enum hookdir_opt
  */
 enum hookdir_opt configured_hookdir_opt(void);
 
+/* Provides the number of threads to use for parallel hook execution. */
+int configured_hook_jobs(void);
+
 struct run_hooks_opt
 {
 	/* Environment vars to be set for each hook */
@@ -54,15 +57,38 @@ struct run_hooks_opt
 
 	/* Path to file which should be piped to stdin for each hook */
 	const char *path_to_stdin;
+
+	/* Number of threads to parallelize across */
+	int jobs;
 };
 
-#define RUN_HOOKS_OPT_INIT  {   		\
+/*
+ * Callback provided to feed_pipe_fn and consume_sideband_fn.
+ */
+struct hook_cb_data {
+	int rc;
+	struct list_head *head;
+	struct list_head *run_me;
+	struct run_hooks_opt *options;
+};
+
+#define RUN_HOOKS_OPT_INIT_SYNC  {   		\
 	.env = STRVEC_INIT, 			\
 	.args = STRVEC_INIT, 			\
 	.path_to_stdin = NULL,			\
+	.jobs = 1,				\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
+#define RUN_HOOKS_OPT_INIT_ASYNC {		\
+	.env = STRVEC_INIT, 			\
+	.args = STRVEC_INIT, 			\
+	.path_to_stdin = NULL,			\
+	.jobs = configured_hook_jobs(),		\
+	.run_hookdir = configured_hookdir_opt()	\
+}
+
+
 void run_hooks_opt_init(struct run_hooks_opt *o);
 void run_hooks_opt_clear(struct run_hooks_opt *o);
 
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 13/17] hook: allow specifying working directory for hooks
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (11 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 12/17] hook: allow parallel hook execution Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 14/17] run-command: add stdin callback for parallelization Emily Shaffer
                           ` (5 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Hooks like "post-checkout" require that hooks have a different working
directory than the initial process. Pipe that directly through to struct
child_process.

Because we can just run 'git -C <some-dir> hook run ...' it shouldn't be
necessary to pipe this option through the frontend. In fact, this
reduces the possibility of users running hooks which affect some part of
the filesystem outside of the repo in question.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Needed later for "post-checkout" conversion.

 hook.c | 1 +
 hook.h | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/hook.c b/hook.c
index edea54f95c..f0c052d847 100644
--- a/hook.c
+++ b/hook.c
@@ -271,6 +271,7 @@ static int pick_next_hook(struct child_process *cp,
 	cp->env = hook_cb->options->env.v;
 	cp->stdout_to_stderr = 1;
 	cp->trace2_hook_name = hook->command.buf;
+	cp->dir = hook_cb->options->dir;
 
 	/* reopen the file for stdin; run_command closes it. */
 	if (hook_cb->options->path_to_stdin) {
diff --git a/hook.h b/hook.h
index f54568afe3..4aae8e2dbb 100644
--- a/hook.h
+++ b/hook.h
@@ -60,6 +60,9 @@ struct run_hooks_opt
 
 	/* Number of threads to parallelize across */
 	int jobs;
+
+	/* Path to initial working directory for subprocess */
+	const char *dir;
 };
 
 /*
@@ -77,6 +80,7 @@ struct hook_cb_data {
 	.args = STRVEC_INIT, 			\
 	.path_to_stdin = NULL,			\
 	.jobs = 1,				\
+	.dir = NULL,				\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
@@ -85,6 +89,7 @@ struct hook_cb_data {
 	.args = STRVEC_INIT, 			\
 	.path_to_stdin = NULL,			\
 	.jobs = configured_hook_jobs(),		\
+	.dir = NULL,				\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 14/17] run-command: add stdin callback for parallelization
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (12 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 13/17] hook: allow specifying working directory for hooks Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 15/17] hook: provide stdin by string_list or callback Emily Shaffer
                           ` (4 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

If a user of the run_processes_parallel() API wants to pipe a large
amount of information to stdin of each parallel command, that
information could exceed the buffer of the pipe allocated for that
process's stdin.  Generally this is solved by repeatedly writing to
child_process.in between calls to start_command() and finish_command();
run_processes_parallel() did not provide users an opportunity to access
child_process at that time.

Because the data might be extremely large (for example, a list of all
refs received during a push from a client) simply taking a string_list
or strbuf is not as scalable as using a callback; the rest of the
run_processes_parallel() API also uses callbacks, so making this feature
match the rest of the API reduces mental load on the user.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since run_processes_parallel() is used elsewhere, I'd appreciate a close
    look on this patch which modifies it. Thanks :)

 builtin/fetch.c             |  1 +
 builtin/submodule--helper.c |  2 +-
 run-command.c               | 54 +++++++++++++++++++++++++++++++++++--
 run-command.h               | 17 +++++++++++-
 submodule.c                 |  1 +
 t/helper/test-run-command.c | 31 ++++++++++++++++++---
 t/t0061-run-command.sh      | 30 +++++++++++++++++++++
 7 files changed, 128 insertions(+), 8 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ecf8537605..5e153b5193 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1647,6 +1647,7 @@ static int fetch_multiple(struct string_list *list, int max_children)
 		result = run_processes_parallel_tr2(max_children,
 						    &fetch_next_remote,
 						    &fetch_failed_to_start,
+						    NULL,
 						    &fetch_finished,
 						    &state,
 						    "fetch", "parallel/fetch");
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index c30896c897..bb623c1852 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2294,7 +2294,7 @@ static int update_submodules(struct submodule_update_clone *suc)
 	int i;
 
 	run_processes_parallel_tr2(suc->max_jobs, update_clone_get_next_task,
-				   update_clone_start_failure,
+				   update_clone_start_failure, NULL,
 				   update_clone_task_finished, suc, "submodule",
 				   "parallel/update");
 
diff --git a/run-command.c b/run-command.c
index 80c8c97bc1..7b65c087f8 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1548,6 +1548,7 @@ struct parallel_processes {
 
 	get_next_task_fn get_next_task;
 	start_failure_fn start_failure;
+	feed_pipe_fn feed_pipe;
 	task_finished_fn task_finished;
 
 	struct {
@@ -1575,6 +1576,13 @@ static int default_start_failure(struct strbuf *out,
 	return 0;
 }
 
+static int default_feed_pipe(struct strbuf *pipe,
+			     void *pp_cb,
+			     void *pp_task_cb)
+{
+	return 1;
+}
+
 static int default_task_finished(int result,
 				 struct strbuf *out,
 				 void *pp_cb,
@@ -1605,6 +1613,7 @@ static void pp_init(struct parallel_processes *pp,
 		    int n,
 		    get_next_task_fn get_next_task,
 		    start_failure_fn start_failure,
+		    feed_pipe_fn feed_pipe,
 		    task_finished_fn task_finished,
 		    void *data)
 {
@@ -1623,6 +1632,7 @@ static void pp_init(struct parallel_processes *pp,
 	pp->get_next_task = get_next_task;
 
 	pp->start_failure = start_failure ? start_failure : default_start_failure;
+	pp->feed_pipe = feed_pipe ? feed_pipe : default_feed_pipe;
 	pp->task_finished = task_finished ? task_finished : default_task_finished;
 
 	pp->nr_processes = 0;
@@ -1715,6 +1725,37 @@ static int pp_start_one(struct parallel_processes *pp)
 	return 0;
 }
 
+static void pp_buffer_stdin(struct parallel_processes *pp)
+{
+	int i;
+	struct strbuf sb = STRBUF_INIT;
+
+	/* Buffer stdin for each pipe. */
+	for (i = 0; i < pp->max_processes; i++) {
+		if (pp->children[i].state == GIT_CP_WORKING &&
+		    pp->children[i].process.in > 0) {
+			int done;
+			strbuf_reset(&sb);
+			done = pp->feed_pipe(&sb, pp->data,
+					      pp->children[i].data);
+			if (sb.len) {
+				if (write_in_full(pp->children[i].process.in,
+					      sb.buf, sb.len) < 0) {
+					if (errno != EPIPE)
+						die_errno("write");
+					done = 1;
+				}
+			}
+			if (done) {
+				close(pp->children[i].process.in);
+				pp->children[i].process.in = 0;
+			}
+		}
+	}
+
+	strbuf_release(&sb);
+}
+
 static void pp_buffer_stderr(struct parallel_processes *pp, int output_timeout)
 {
 	int i;
@@ -1779,6 +1820,7 @@ static int pp_collect_finished(struct parallel_processes *pp)
 		pp->nr_processes--;
 		pp->children[i].state = GIT_CP_FREE;
 		pp->pfd[i].fd = -1;
+		pp->children[i].process.in = 0;
 		child_process_init(&pp->children[i].process);
 
 		if (i != pp->output_owner) {
@@ -1812,6 +1854,7 @@ static int pp_collect_finished(struct parallel_processes *pp)
 int run_processes_parallel(int n,
 			   get_next_task_fn get_next_task,
 			   start_failure_fn start_failure,
+			   feed_pipe_fn feed_pipe,
 			   task_finished_fn task_finished,
 			   void *pp_cb)
 {
@@ -1820,7 +1863,9 @@ int run_processes_parallel(int n,
 	int spawn_cap = 4;
 	struct parallel_processes pp;
 
-	pp_init(&pp, n, get_next_task, start_failure, task_finished, pp_cb);
+	sigchain_push(SIGPIPE, SIG_IGN);
+
+	pp_init(&pp, n, get_next_task, start_failure, feed_pipe, task_finished, pp_cb);
 	while (1) {
 		for (i = 0;
 		    i < spawn_cap && !pp.shutdown &&
@@ -1837,6 +1882,7 @@ int run_processes_parallel(int n,
 		}
 		if (!pp.nr_processes)
 			break;
+		pp_buffer_stdin(&pp);
 		pp_buffer_stderr(&pp, output_timeout);
 		pp_output(&pp);
 		code = pp_collect_finished(&pp);
@@ -1848,11 +1894,15 @@ int run_processes_parallel(int n,
 	}
 
 	pp_cleanup(&pp);
+
+	sigchain_pop(SIGPIPE);
+
 	return 0;
 }
 
 int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 			       start_failure_fn start_failure,
+			       feed_pipe_fn feed_pipe,
 			       task_finished_fn task_finished, void *pp_cb,
 			       const char *tr2_category, const char *tr2_label)
 {
@@ -1862,7 +1912,7 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 				   ((n < 1) ? online_cpus() : n));
 
 	result = run_processes_parallel(n, get_next_task, start_failure,
-					task_finished, pp_cb);
+					feed_pipe, task_finished, pp_cb);
 
 	trace2_region_leave(tr2_category, tr2_label, NULL);
 
diff --git a/run-command.h b/run-command.h
index 6472b38bde..e058c0e2c8 100644
--- a/run-command.h
+++ b/run-command.h
@@ -436,6 +436,20 @@ typedef int (*start_failure_fn)(struct strbuf *out,
 				void *pp_cb,
 				void *pp_task_cb);
 
+/**
+ * This callback is called repeatedly on every child process who requests
+ * start_command() to create a pipe by setting child_process.in < 0.
+ *
+ * pp_cb is the callback cookie as passed into run_processes_parallel, and
+ * pp_task_cb is the callback cookie as passed into get_next_task_fn.
+ * The contents of 'send' will be read into the pipe and passed to the pipe.
+ *
+ * Return nonzero to close the pipe.
+ */
+typedef int (*feed_pipe_fn)(struct strbuf *pipe,
+			    void *pp_cb,
+			    void *pp_task_cb);
+
 /**
  * This callback is called on every child process that finished processing.
  *
@@ -470,10 +484,11 @@ typedef int (*task_finished_fn)(int result,
 int run_processes_parallel(int n,
 			   get_next_task_fn,
 			   start_failure_fn,
+			   feed_pipe_fn,
 			   task_finished_fn,
 			   void *pp_cb);
 int run_processes_parallel_tr2(int n, get_next_task_fn, start_failure_fn,
-			       task_finished_fn, void *pp_cb,
+			       feed_pipe_fn, task_finished_fn, void *pp_cb,
 			       const char *tr2_category, const char *tr2_label);
 
 #endif
diff --git a/submodule.c b/submodule.c
index b3bb59f066..953f41818c 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1638,6 +1638,7 @@ int fetch_populated_submodules(struct repository *r,
 	run_processes_parallel_tr2(max_parallel_jobs,
 				   get_next_submodule,
 				   fetch_start_failure,
+				   NULL,
 				   fetch_finish,
 				   &spf,
 				   "submodule", "parallel/fetch");
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 7ae03dc712..9348184d30 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -32,8 +32,13 @@ static int parallel_next(struct child_process *cp,
 		return 0;
 
 	strvec_pushv(&cp->args, d->argv);
+	cp->in = d->in;
+	cp->no_stdin = d->no_stdin;
 	strbuf_addstr(err, "preloaded output of a child\n");
 	number_callbacks++;
+
+	*task_cb = xmalloc(sizeof(int));
+	*(int*)(*task_cb) = 2;
 	return 1;
 }
 
@@ -55,6 +60,17 @@ static int task_finished(int result,
 	return 1;
 }
 
+static int test_stdin(struct strbuf *pipe, void *cb, void *task_cb)
+{
+	int *lines_remaining = task_cb;
+
+	if (*lines_remaining)
+		strbuf_addf(pipe, "sample stdin %d\n", --(*lines_remaining));
+
+	return !(*lines_remaining);
+}
+
+
 struct testsuite {
 	struct string_list tests, failed;
 	int next;
@@ -185,7 +201,7 @@ static int testsuite(int argc, const char **argv)
 		suite.tests.nr, max_jobs);
 
 	ret = run_processes_parallel(max_jobs, next_test, test_failed,
-				     test_finished, &suite);
+				     test_stdin, test_finished, &suite);
 
 	if (suite.failed.nr > 0) {
 		ret = 1;
@@ -413,15 +429,22 @@ int cmd__run_command(int argc, const char **argv)
 
 	if (!strcmp(argv[1], "run-command-parallel"))
 		exit(run_processes_parallel(jobs, parallel_next,
-					    NULL, NULL, &proc));
+					    NULL, NULL, NULL, &proc));
 
 	if (!strcmp(argv[1], "run-command-abort"))
 		exit(run_processes_parallel(jobs, parallel_next,
-					    NULL, task_finished, &proc));
+					    NULL, NULL, task_finished, &proc));
 
 	if (!strcmp(argv[1], "run-command-no-jobs"))
 		exit(run_processes_parallel(jobs, no_job,
-					    NULL, task_finished, &proc));
+					    NULL, NULL, task_finished, &proc));
+
+	if (!strcmp(argv[1], "run-command-stdin")) {
+		proc.in = -1;
+		proc.no_stdin = 0;
+		exit (run_processes_parallel(jobs, parallel_next, NULL,
+					     test_stdin, NULL, &proc));
+	}
 
 	fprintf(stderr, "check usage\n");
 	return 1;
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 7d599675e3..3eb572e6cd 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -143,6 +143,36 @@ test_expect_success 'run_command runs in parallel with more tasks than jobs avai
 	test_cmp expect actual
 '
 
+cat >expect <<-EOF
+preloaded output of a child
+listening for stdin:
+sample stdin 1
+sample stdin 0
+preloaded output of a child
+listening for stdin:
+sample stdin 1
+sample stdin 0
+preloaded output of a child
+listening for stdin:
+sample stdin 1
+sample stdin 0
+preloaded output of a child
+listening for stdin:
+sample stdin 1
+sample stdin 0
+EOF
+
+test_expect_success 'run_command listens to stdin' '
+	write_script stdin-script <<-\EOF &&
+	echo "listening for stdin:"
+	while read line; do
+		echo "$line"
+	done </dev/stdin
+	EOF
+	test-tool run-command run-command-stdin 2 ./stdin-script 2>actual &&
+	test_cmp expect actual
+'
+
 cat >expect <<-EOF
 preloaded output of a child
 asking for a quick stop
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 15/17] hook: provide stdin by string_list or callback
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (13 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 14/17] run-command: add stdin callback for parallelization Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-08 21:09           ` SZEDER Gábor
  2020-12-05  1:46         ` [PATCH 16/17] run-command: allow capturing of collated output Emily Shaffer
                           ` (3 subsequent siblings)
  18 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

In cases where a hook requires only a small amount of information via
stdin, it should be simple for users to provide a string_list alone. But
in more complicated cases where the stdin is too large to hold in
memory, let's provide a callback the users can populate line after line
with instead.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 hook.c | 39 +++++++++++++++++++++++++++++++++++++++
 hook.h | 25 +++++++++++++++++++++++++
 2 files changed, 64 insertions(+)

diff --git a/hook.c b/hook.c
index f0c052d847..fbb49f241d 100644
--- a/hook.c
+++ b/hook.c
@@ -9,6 +9,8 @@ void free_hook(struct hook *ptr)
 {
 	if (ptr) {
 		strbuf_release(&ptr->command);
+		if (ptr->feed_pipe_cb_data)
+			free(ptr->feed_pipe_cb_data);
 		free(ptr);
 	}
 }
@@ -38,6 +40,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 		strbuf_init(&to_add->command, 0);
 		strbuf_addstr(&to_add->command, command);
 		to_add->from_hookdir = 0;
+		to_add->feed_pipe_cb_data = NULL;
 	}
 
 	/* re-set the scope so we show where an override was specified */
@@ -253,9 +256,32 @@ void run_hooks_opt_clear(struct run_hooks_opt *o)
 {
 	strvec_clear(&o->env);
 	strvec_clear(&o->args);
+	string_list_clear(&o->str_stdin, 0);
 }
 
 
+static int pipe_from_string_list(struct strbuf *pipe, void *pp_cb, void *pp_task_cb)
+{
+	int *item_idx;
+	struct hook *ctx = pp_task_cb;
+	struct string_list *to_pipe = &((struct hook_cb_data*)pp_cb)->options->str_stdin;
+
+	/* Bootstrap the state manager if necessary. */
+	if (!ctx->feed_pipe_cb_data) {
+		ctx->feed_pipe_cb_data = xmalloc(sizeof(unsigned int));
+		*(int*)ctx->feed_pipe_cb_data = 0;
+	}
+
+	item_idx = ctx->feed_pipe_cb_data;
+
+	if (*item_idx < to_pipe->nr) {
+		strbuf_addf(pipe, "%s\n", to_pipe->items[*item_idx].string);
+		(*item_idx)++;
+		return 0;
+	}
+	return 1;
+}
+
 static int pick_next_hook(struct child_process *cp,
 			  struct strbuf *out,
 			  void *pp_cb,
@@ -277,6 +303,10 @@ static int pick_next_hook(struct child_process *cp,
 	if (hook_cb->options->path_to_stdin) {
 		cp->no_stdin = 0;
 		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
+	} else if (hook_cb->options->feed_pipe) {
+		/* ask for start_command() to make a pipe for us */
+		cp->in = -1;
+		cp->no_stdin = 0;
 	} else {
 		cp->no_stdin = 1;
 	}
@@ -350,6 +380,14 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 	if (!options)
 		BUG("a struct run_hooks_opt must be provided to run_hooks");
 
+	if ((options->path_to_stdin && options->str_stdin.nr) ||
+	    (options->path_to_stdin && options->feed_pipe) ||
+	    (options->str_stdin.nr && options->feed_pipe))
+		BUG("choose only one method to populate stdin");
+
+	if (options->str_stdin.nr)
+		options->feed_pipe = &pipe_from_string_list;
+
 	strbuf_addstr(&hookname_str, hookname);
 
 	to_run = hook_list(&hookname_str);
@@ -368,6 +406,7 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 	run_processes_parallel_tr2(options->jobs,
 				   pick_next_hook,
 				   notify_start_failure,
+				   options->feed_pipe,
 				   notify_hook_finished,
 				   &cb_data,
 				   "hook",
diff --git a/hook.h b/hook.h
index 4aae8e2dbb..ace26c637e 100644
--- a/hook.h
+++ b/hook.h
@@ -2,6 +2,7 @@
 #include "list.h"
 #include "strbuf.h"
 #include "strvec.h"
+#include "run-command.h"
 
 struct hook
 {
@@ -14,6 +15,12 @@ struct hook
 	/* The literal command to run. */
 	struct strbuf command;
 	int from_hookdir;
+
+	/*
+	 * Use this to keep state for your feed_pipe_fn if you are using
+	 * run_hooks_opt.feed_pipe. Otherwise, do not touch it.
+	 */
+	void *feed_pipe_cb_data;
 };
 
 /*
@@ -57,12 +64,24 @@ struct run_hooks_opt
 
 	/* Path to file which should be piped to stdin for each hook */
 	const char *path_to_stdin;
+	/* Pipe each string to stdin, separated by newlines */
+	struct string_list str_stdin;
+	/*
+	 * Callback and state pointer to ask for more content to pipe to stdin.
+	 * Will be called repeatedly, for each hook. See
+	 * hook.c:pipe_from_stdin() for an example. Keep per-hook state in
+	 * hook.feed_pipe_cb_data (per process). Keep initialization context in
+	 * feed_pipe_ctx (shared by all processes).
+	 */
+	feed_pipe_fn feed_pipe;
+	void *feed_pipe_ctx;
 
 	/* Number of threads to parallelize across */
 	int jobs;
 
 	/* Path to initial working directory for subprocess */
 	const char *dir;
+
 };
 
 /*
@@ -81,6 +100,9 @@ struct hook_cb_data {
 	.path_to_stdin = NULL,			\
 	.jobs = 1,				\
 	.dir = NULL,				\
+	.str_stdin = STRING_LIST_INIT_DUP,	\
+	.feed_pipe = NULL,			\
+	.feed_pipe_ctx = NULL,			\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
@@ -90,6 +112,9 @@ struct hook_cb_data {
 	.path_to_stdin = NULL,			\
 	.jobs = configured_hook_jobs(),		\
 	.dir = NULL,				\
+	.str_stdin = STRING_LIST_INIT_DUP,	\
+	.feed_pipe = NULL,			\
+	.feed_pipe_ctx = NULL,			\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 16/17] run-command: allow capturing of collated output
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (14 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 15/17] hook: provide stdin by string_list or callback Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-05  1:46         ` [PATCH 17/17] hooks: allow callers to capture output Emily Shaffer
                           ` (2 subsequent siblings)
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Some callers, for example server-side hooks which wish to relay hook
output to clients across a transport, want to capture what would
normally print to stderr and do something else with it. Allow that via a
callback.

By calling the callback regardless of whether there's output available,
we allow clients to send e.g. a keepalive if necessary.

Because we expose a strbuf, not a fd or FILE*, there's no need to create
a temporary pipe or similar - we can just skip the print to stderr and
instead hand it to the caller.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Originally when writing this patch I attempted to use a pipe in memory -
    but managing its lifetime was actually pretty tricky, and I found I could
    achieve the same thing with less code by doing it this way. Critique welcome,
    including "no, you really need to do it with a pipe".

 builtin/fetch.c             |  2 +-
 builtin/submodule--helper.c |  2 +-
 hook.c                      |  1 +
 run-command.c               | 33 +++++++++++++++++++++++++--------
 run-command.h               | 18 +++++++++++++++++-
 submodule.c                 |  2 +-
 t/helper/test-run-command.c | 25 ++++++++++++++++++++-----
 t/t0061-run-command.sh      |  7 +++++++
 8 files changed, 73 insertions(+), 17 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 5e153b5193..6a634085d9 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1647,7 +1647,7 @@ static int fetch_multiple(struct string_list *list, int max_children)
 		result = run_processes_parallel_tr2(max_children,
 						    &fetch_next_remote,
 						    &fetch_failed_to_start,
-						    NULL,
+						    NULL, NULL,
 						    &fetch_finished,
 						    &state,
 						    "fetch", "parallel/fetch");
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index bb623c1852..8c543d33fd 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2294,7 +2294,7 @@ static int update_submodules(struct submodule_update_clone *suc)
 	int i;
 
 	run_processes_parallel_tr2(suc->max_jobs, update_clone_get_next_task,
-				   update_clone_start_failure, NULL,
+				   update_clone_start_failure, NULL, NULL,
 				   update_clone_task_finished, suc, "submodule",
 				   "parallel/update");
 
diff --git a/hook.c b/hook.c
index fbb49f241d..1186ee41b3 100644
--- a/hook.c
+++ b/hook.c
@@ -407,6 +407,7 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 				   pick_next_hook,
 				   notify_start_failure,
 				   options->feed_pipe,
+				   NULL,
 				   notify_hook_finished,
 				   &cb_data,
 				   "hook",
diff --git a/run-command.c b/run-command.c
index 7b65c087f8..0dce6bec83 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1549,6 +1549,7 @@ struct parallel_processes {
 	get_next_task_fn get_next_task;
 	start_failure_fn start_failure;
 	feed_pipe_fn feed_pipe;
+	consume_sideband_fn consume_sideband;
 	task_finished_fn task_finished;
 
 	struct {
@@ -1614,6 +1615,7 @@ static void pp_init(struct parallel_processes *pp,
 		    get_next_task_fn get_next_task,
 		    start_failure_fn start_failure,
 		    feed_pipe_fn feed_pipe,
+		    consume_sideband_fn consume_sideband,
 		    task_finished_fn task_finished,
 		    void *data)
 {
@@ -1634,6 +1636,7 @@ static void pp_init(struct parallel_processes *pp,
 	pp->start_failure = start_failure ? start_failure : default_start_failure;
 	pp->feed_pipe = feed_pipe ? feed_pipe : default_feed_pipe;
 	pp->task_finished = task_finished ? task_finished : default_task_finished;
+	pp->consume_sideband = consume_sideband;
 
 	pp->nr_processes = 0;
 	pp->output_owner = 0;
@@ -1670,7 +1673,10 @@ static void pp_cleanup(struct parallel_processes *pp)
 	 * When get_next_task added messages to the buffer in its last
 	 * iteration, the buffered output is non empty.
 	 */
-	strbuf_write(&pp->buffered_output, stderr);
+	if (pp->consume_sideband)
+		pp->consume_sideband(&pp->buffered_output, pp->data);
+	else
+		strbuf_write(&pp->buffered_output, stderr);
 	strbuf_release(&pp->buffered_output);
 
 	sigchain_pop_common();
@@ -1786,9 +1792,13 @@ static void pp_buffer_stderr(struct parallel_processes *pp, int output_timeout)
 static void pp_output(struct parallel_processes *pp)
 {
 	int i = pp->output_owner;
+
 	if (pp->children[i].state == GIT_CP_WORKING &&
 	    pp->children[i].err.len) {
-		strbuf_write(&pp->children[i].err, stderr);
+		if (pp->consume_sideband)
+			pp->consume_sideband(&pp->children[i].err, pp->data);
+		else
+			strbuf_write(&pp->children[i].err, stderr);
 		strbuf_reset(&pp->children[i].err);
 	}
 }
@@ -1827,11 +1837,15 @@ static int pp_collect_finished(struct parallel_processes *pp)
 			strbuf_addbuf(&pp->buffered_output, &pp->children[i].err);
 			strbuf_reset(&pp->children[i].err);
 		} else {
-			strbuf_write(&pp->children[i].err, stderr);
+			/* Output errors, then all other finished child processes */
+			if (pp->consume_sideband) {
+				pp->consume_sideband(&pp->children[i].err, pp->data);
+				pp->consume_sideband(&pp->buffered_output, pp->data);
+			} else {
+				strbuf_write(&pp->children[i].err, stderr);
+				strbuf_write(&pp->buffered_output, stderr);
+			}
 			strbuf_reset(&pp->children[i].err);
-
-			/* Output all other finished child processes */
-			strbuf_write(&pp->buffered_output, stderr);
 			strbuf_reset(&pp->buffered_output);
 
 			/*
@@ -1855,6 +1869,7 @@ int run_processes_parallel(int n,
 			   get_next_task_fn get_next_task,
 			   start_failure_fn start_failure,
 			   feed_pipe_fn feed_pipe,
+			   consume_sideband_fn consume_sideband,
 			   task_finished_fn task_finished,
 			   void *pp_cb)
 {
@@ -1865,7 +1880,7 @@ int run_processes_parallel(int n,
 
 	sigchain_push(SIGPIPE, SIG_IGN);
 
-	pp_init(&pp, n, get_next_task, start_failure, feed_pipe, task_finished, pp_cb);
+	pp_init(&pp, n, get_next_task, start_failure, feed_pipe, consume_sideband, task_finished, pp_cb);
 	while (1) {
 		for (i = 0;
 		    i < spawn_cap && !pp.shutdown &&
@@ -1903,6 +1918,7 @@ int run_processes_parallel(int n,
 int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 			       start_failure_fn start_failure,
 			       feed_pipe_fn feed_pipe,
+			       consume_sideband_fn consume_sideband,
 			       task_finished_fn task_finished, void *pp_cb,
 			       const char *tr2_category, const char *tr2_label)
 {
@@ -1912,7 +1928,8 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 				   ((n < 1) ? online_cpus() : n));
 
 	result = run_processes_parallel(n, get_next_task, start_failure,
-					feed_pipe, task_finished, pp_cb);
+					feed_pipe, consume_sideband,
+					task_finished, pp_cb);
 
 	trace2_region_leave(tr2_category, tr2_label, NULL);
 
diff --git a/run-command.h b/run-command.h
index e058c0e2c8..2ad8271f56 100644
--- a/run-command.h
+++ b/run-command.h
@@ -450,6 +450,20 @@ typedef int (*feed_pipe_fn)(struct strbuf *pipe,
 			    void *pp_cb,
 			    void *pp_task_cb);
 
+/**
+ * If this callback is provided, instead of collating process output to stderr,
+ * they will be collated into a new pipe. consume_sideband_fn will be called
+ * repeatedly. When output is available on that pipe, it will be contained in
+ * 'output'. But it will be called with an empty 'output' too, to allow for
+ * keepalives or similar operations if necessary.
+ *
+ * pp_cb is the callback cookie as passed into run_processes_parallel.
+ *
+ * Since this callback is provided with the collated output, no task cookie is
+ * provided.
+ */
+typedef void (*consume_sideband_fn)(struct strbuf *output, void *pp_cb);
+
 /**
  * This callback is called on every child process that finished processing.
  *
@@ -485,10 +499,12 @@ int run_processes_parallel(int n,
 			   get_next_task_fn,
 			   start_failure_fn,
 			   feed_pipe_fn,
+			   consume_sideband_fn,
 			   task_finished_fn,
 			   void *pp_cb);
 int run_processes_parallel_tr2(int n, get_next_task_fn, start_failure_fn,
-			       feed_pipe_fn, task_finished_fn, void *pp_cb,
+			       feed_pipe_fn, consume_sideband_fn,
+			       task_finished_fn, void *pp_cb,
 			       const char *tr2_category, const char *tr2_label);
 
 #endif
diff --git a/submodule.c b/submodule.c
index 953f41818c..215bff22d9 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1638,7 +1638,7 @@ int fetch_populated_submodules(struct repository *r,
 	run_processes_parallel_tr2(max_parallel_jobs,
 				   get_next_submodule,
 				   fetch_start_failure,
-				   NULL,
+				   NULL, NULL,
 				   fetch_finish,
 				   &spf,
 				   "submodule", "parallel/fetch");
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 9348184d30..d53db6d11c 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -51,6 +51,16 @@ static int no_job(struct child_process *cp,
 	return 0;
 }
 
+static void test_consume_sideband(struct strbuf *output, void *cb)
+{
+	FILE *sideband;
+
+	sideband = fopen("./sideband", "a");
+
+	strbuf_write(output, sideband);
+	fclose(sideband);
+}
+
 static int task_finished(int result,
 			 struct strbuf *err,
 			 void *pp_cb,
@@ -201,7 +211,7 @@ static int testsuite(int argc, const char **argv)
 		suite.tests.nr, max_jobs);
 
 	ret = run_processes_parallel(max_jobs, next_test, test_failed,
-				     test_stdin, test_finished, &suite);
+				     test_stdin, NULL, test_finished, &suite);
 
 	if (suite.failed.nr > 0) {
 		ret = 1;
@@ -429,23 +439,28 @@ int cmd__run_command(int argc, const char **argv)
 
 	if (!strcmp(argv[1], "run-command-parallel"))
 		exit(run_processes_parallel(jobs, parallel_next,
-					    NULL, NULL, NULL, &proc));
+					    NULL, NULL, NULL, NULL, &proc));
 
 	if (!strcmp(argv[1], "run-command-abort"))
 		exit(run_processes_parallel(jobs, parallel_next,
-					    NULL, NULL, task_finished, &proc));
+					    NULL, NULL, NULL, task_finished, &proc));
 
 	if (!strcmp(argv[1], "run-command-no-jobs"))
 		exit(run_processes_parallel(jobs, no_job,
-					    NULL, NULL, task_finished, &proc));
+					    NULL, NULL, NULL, task_finished, &proc));
 
 	if (!strcmp(argv[1], "run-command-stdin")) {
 		proc.in = -1;
 		proc.no_stdin = 0;
 		exit (run_processes_parallel(jobs, parallel_next, NULL,
-					     test_stdin, NULL, &proc));
+					     test_stdin, NULL, NULL, &proc));
 	}
 
+	if (!strcmp(argv[1], "run-command-sideband"))
+		exit(run_processes_parallel(jobs, parallel_next, NULL, NULL,
+					    test_consume_sideband, NULL,
+					    &proc));
+
 	fprintf(stderr, "check usage\n");
 	return 1;
 }
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 3eb572e6cd..c5a5b6df6c 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -143,6 +143,13 @@ test_expect_success 'run_command runs in parallel with more tasks than jobs avai
 	test_cmp expect actual
 '
 
+test_expect_success 'run_command can divert output' '
+	test_when_finished rm sideband &&
+	test-tool run-command run-command-sideband 3 sh -c "printf \"%s\n%s\n\" Hello World" 2>actual &&
+	test_must_be_empty actual &&
+	test_cmp expect sideband
+'
+
 cat >expect <<-EOF
 preloaded output of a child
 listening for stdin:
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH 17/17] hooks: allow callers to capture output
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (15 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 16/17] run-command: allow capturing of collated output Emily Shaffer
@ 2020-12-05  1:46         ` Emily Shaffer
  2020-12-16  0:34         ` [PATCH v6 00/17] propose config-based hooks (part I) Josh Steadmon
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
  18 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-05  1:46 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Some server-side hooks will require capturing output to send over
sideband instead of printing directly to stderr. Expose that capability.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    You can see this in practice in the conversions for some of the push hooks,
    like 'receive-pack'.

 hook.c |  2 +-
 hook.h | 10 ++++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/hook.c b/hook.c
index 1186ee41b3..78d7721b74 100644
--- a/hook.c
+++ b/hook.c
@@ -407,7 +407,7 @@ int run_hooks(const char *hookname, struct run_hooks_opt *options)
 				   pick_next_hook,
 				   notify_start_failure,
 				   options->feed_pipe,
-				   NULL,
+				   options->consume_sideband,
 				   notify_hook_finished,
 				   &cb_data,
 				   "hook",
diff --git a/hook.h b/hook.h
index ace26c637e..7059e0db77 100644
--- a/hook.h
+++ b/hook.h
@@ -76,6 +76,14 @@ struct run_hooks_opt
 	feed_pipe_fn feed_pipe;
 	void *feed_pipe_ctx;
 
+	/*
+	 * Populate this to capture output and prevent it from being printed to
+	 * stderr. This will be passed directly through to
+	 * run_command:run_parallel_processes(). See t/helper/test-run-command.c
+	 * for an example.
+	 */
+	consume_sideband_fn consume_sideband;
+
 	/* Number of threads to parallelize across */
 	int jobs;
 
@@ -103,6 +111,7 @@ struct hook_cb_data {
 	.str_stdin = STRING_LIST_INIT_DUP,	\
 	.feed_pipe = NULL,			\
 	.feed_pipe_ctx = NULL,			\
+	.consume_sideband = NULL,		\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
@@ -115,6 +124,7 @@ struct hook_cb_data {
 	.str_stdin = STRING_LIST_INIT_DUP,	\
 	.feed_pipe = NULL,			\
 	.feed_pipe_ctx = NULL,			\
+	.consume_sideband = NULL,		\
 	.run_hookdir = configured_hookdir_opt()	\
 }
 
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* Re: [PATCH 15/17] hook: provide stdin by string_list or callback
  2020-12-05  1:46         ` [PATCH 15/17] hook: provide stdin by string_list or callback Emily Shaffer
@ 2020-12-08 21:09           ` SZEDER Gábor
  2020-12-08 22:11             ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: SZEDER Gábor @ 2020-12-08 21:09 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git

On Fri, Dec 04, 2020 at 05:46:05PM -0800, Emily Shaffer wrote:
> diff --git a/hook.c b/hook.c
> index f0c052d847..fbb49f241d 100644
> --- a/hook.c
> +++ b/hook.c
> @@ -9,6 +9,8 @@ void free_hook(struct hook *ptr)
>  {
>  	if (ptr) {
>  		strbuf_release(&ptr->command);
> +		if (ptr->feed_pipe_cb_data)

Coccinelle suggests to drop this condition, because free() can handle
a NULL pointer just fine.

> +			free(ptr->feed_pipe_cb_data);
>  		free(ptr);
>  	}
>  }

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH 15/17] hook: provide stdin by string_list or callback
  2020-12-08 21:09           ` SZEDER Gábor
@ 2020-12-08 22:11             ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-08 22:11 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git

On Tue, Dec 08, 2020 at 10:09:25PM +0100, SZEDER Gábor wrote:
> 
> On Fri, Dec 04, 2020 at 05:46:05PM -0800, Emily Shaffer wrote:
> > diff --git a/hook.c b/hook.c
> > index f0c052d847..fbb49f241d 100644
> > --- a/hook.c
> > +++ b/hook.c
> > @@ -9,6 +9,8 @@ void free_hook(struct hook *ptr)
> >  {
> >  	if (ptr) {
> >  		strbuf_release(&ptr->command);
> > +		if (ptr->feed_pipe_cb_data)
> 
> Coccinelle suggests to drop this condition, because free() can handle
> a NULL pointer just fine.

Done (locally). Thanks (and thanks for checking the coccinelle output
too).

> 
> > +			free(ptr->feed_pipe_cb_data);
> >  		free(ptr);
> >  	}
> >  }

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH 08/17] hook: add 'run' subcommand
  2020-12-05  1:45         ` [PATCH 08/17] hook: add 'run' subcommand Emily Shaffer
@ 2020-12-11 10:15           ` Phillip Wood
  2020-12-15 21:41             ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Phillip Wood @ 2020-12-11 10:15 UTC (permalink / raw)
  To: Emily Shaffer, git

Hi Emily

On 05/12/2020 01:45, Emily Shaffer wrote:
> In order to enable hooks to be run as an external process, by a
> standalone Git command, or by tools which wrap Git, provide an external
> means to run all configured hook commands for a given hook event.
> 
> For now, the hook commands will run in config order, in series. As
> alternate ordering or parallelism is supported in the future, we should
> add knobs to use those to the command line as well.
> 
> As with the legacy hook implementation, all stdout generated by hook
> commands is redirected to stderr. Piping from stdin is not yet
> supported.
> 
> Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
> execution list. For now, there is no way to disable them.
> 
> Users may wish to provide hook commands like 'git config
> hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
> contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
> first split by space or quotes into an argv_array, then expanded with
> 'expand_user_path()'.

I'm a bit confused by this last paragraph, the docs below say we pass 
the string to the shell and that's what the implementation seems to do. 
If we're running a lot of hooks then maybe it would be worth using 
split_cmdline() and expand_user_path() rather than invoking the shell 
for each hook we run.

I'm afraid I've only had time to skip the patch, there are a couple of 
minor comments below.

> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
> 
> Notes:
>      Since v4, updated the docs, and did less local application of single
>      quotes. In order for hookdir hooks to run successfully with a space in
>      the path, though, they must not be run with 'sh -c'. So we can treat the
>      hookdir hooks specially, and warn users via doc about special
>      considerations for configured hooks with spaces in their path.
> 
>   Documentation/git-hook.txt    |  31 +++++++++-
>   builtin/hook.c                |  48 ++++++++++++++-
>   hook.c                        | 112 ++++++++++++++++++++++++++++++++++
>   hook.h                        |  32 ++++++++++
>   t/t1360-config-based-hooks.sh |  65 +++++++++++++++++++-
>   5 files changed, 281 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
> index f19875ed68..18a817d832 100644
> --- a/Documentation/git-hook.txt
> +++ b/Documentation/git-hook.txt
> @@ -9,11 +9,12 @@ SYNOPSIS
>   --------
>   [verse]
>   'git hook' list <hook-name>
> +'git hook' run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hook-name>
>   
>   DESCRIPTION
>   -----------
> -You can list configured hooks with this command. Later, you will be able to run,
> -add, and modify hooks with this command.
> +You can list and run configured hooks with this command. Later, you will be able
> +to add and modify hooks with this command.
>   
>   This command parses the default configuration files for sections `hook` and
>   `hookcmd`. `hook` is used to describe the commands which will be run during a
> @@ -64,6 +65,32 @@ in the order they should be run, and print the config scope where the relevant
>   `hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
>   This output is human-readable and the format is subject to change over time.
>   
> +run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] `<hook-name>`::
> +
> +Runs hooks configured for `<hook-name>`, in the same order displayed by `git
> +hook list`. Hooks configured this way are run prepended with `sh -c`, so paths
> +containing special characters or spaces should be wrapped in single quotes:
> +`command = '/my/path with spaces/script.sh' some args`.
> +
> +OPTIONS
> +-------
> +--run-hookdir::
> +	Overrides the hook.runHookDir config. Must be 'yes', 'warn',
> +	'interactive', or 'no'. Specifies how to handle hooks located in the Git
> +	hook directory (core.hooksPath).
> +
> +-a::
> +--arg::
> +	Only valid for `run`.
> ++
> +Specify arguments to pass to every hook that is run.
> +
> +-e::
> +--env::
> +	Only valid for `run`.
> ++
> +Specify environment variables to set for every hook that is run.
> +
>   CONFIGURATION
>   -------------
>   include::config/hook.txt[]
> diff --git a/builtin/hook.c b/builtin/hook.c
> index 16324d4195..26f7050387 100644
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -5,9 +5,11 @@
>   #include "hook.h"
>   #include "parse-options.h"
>   #include "strbuf.h"
> +#include "strvec.h"
>   
>   static const char * const builtin_hook_usage[] = {
>   	N_("git hook list <hookname>"),
> +	N_("git hook run [(-e|--env)=<var>...] [(-a|--arg)=<arg>...] <hookname>"),
>   	NULL
>   };
>   
> @@ -84,6 +86,46 @@ static int list(int argc, const char **argv, const char *prefix)
>   	return 0;
>   }
>   
> +static int run(int argc, const char **argv, const char *prefix)
> +{
> +	struct strbuf hookname = STRBUF_INIT;
> +	struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT;
> +	int rc = 0;
> +
> +	struct option run_options[] = {
> +		OPT_STRVEC('e', "env", &opt.env, N_("var"),
> +			   N_("environment variables for hook to use")),
> +		OPT_STRVEC('a', "arg", &opt.args, N_("args"),
> +			   N_("argument to pass to hook")),
> +		OPT_END(),
> +	};
> +
> +	/*
> +	 * While it makes sense to list hooks out-of-repo, it doesn't make sense
> +	 * to execute them. Hooks usually want to look at repository artifacts.
> +	 */
> +	if (!have_git_dir())
> +		usage_msg_opt(_("You must be in a Git repo to execute hooks."),
> +			      builtin_hook_usage, run_options);
> +
> +	argc = parse_options(argc, argv, prefix, run_options,
> +			     builtin_hook_usage, 0);
> +
> +	if (argc < 1)
> +		usage_msg_opt(_("You must specify a hook event to run."),
> +			      builtin_hook_usage, run_options);
> +
> +	strbuf_addstr(&hookname, argv[0]);
> +	opt.run_hookdir = should_run_hookdir;
> +
> +	rc = run_hooks(hookname.buf, &opt);
> +
> +	strbuf_release(&hookname);
> +	run_hooks_opt_clear(&opt);
> +
> +	return rc;
> +}
> +
>   int cmd_hook(int argc, const char **argv, const char *prefix)
>   {
>   	const char *run_hookdir = NULL;
> @@ -95,10 +137,10 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
>   	};
>   
>   	argc = parse_options(argc, argv, prefix, builtin_hook_options,
> -			     builtin_hook_usage, 0);
> +			     builtin_hook_usage, PARSE_OPT_KEEP_UNKNOWN);
>   
>   	/* after the parse, we should have "<command> <hookname> <args...>" */
> -	if (argc < 1)
> +	if (argc < 2)
>   		usage_with_options(builtin_hook_usage, builtin_hook_options);
>   
>   
> @@ -120,6 +162,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
>   
>   	if (!strcmp(argv[0], "list"))
>   		return list(argc, argv, prefix);
> +	if (!strcmp(argv[0], "run"))
> +		return run(argc, argv, prefix);
>   
>   	usage_with_options(builtin_hook_usage, builtin_hook_options);
>   }
> diff --git a/hook.c b/hook.c
> index f4084e33c8..c4595a2324 100644
> --- a/hook.c
> +++ b/hook.c
> @@ -3,6 +3,7 @@
>   #include "hook.h"
>   #include "config.h"
>   #include "run-command.h"
> +#include "prompt.h"
>   
>   void free_hook(struct hook *ptr)
>   {
> @@ -135,6 +136,56 @@ enum hookdir_opt configured_hookdir_opt(void)
>   	return hookdir_unknown;
>   }
>   
> +static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
> +{
> +	struct strbuf prompt = STRBUF_INIT;
> +	/*
> +	 * If the path doesn't exist, don't bother adding the empty hook and
> +	 * don't bother checking the config or prompting the user.
> +	 */
> +	if (!path)
> +		return 0;
> +
> +	switch (cfg)
> +	{
> +		case hookdir_no:

Style nit: we normally use uppercase for constants and enums.

> +			return 0;
> +		case hookdir_unknown:
> +			fprintf(stderr,
> +				_("Unrecognized value for 'hook.runHookDir'. "
> +				  "Is there a typo? "));

What happens at the moment if core.hooksPath does not exist?

Best Wishes

Phillip

> +			/* FALLTHROUGH */
> +		case hookdir_warn:
> +			fprintf(stderr, _("Running legacy hook at '%s'\n"),
> +				path);
> +			return 1;
> +		case hookdir_interactive:
> +			do {
> +				/*
> +				 * TRANSLATORS: Make sure to include [Y] and [n]
> +				 * in your translation. Only English input is
> +				 * accepted. Default option is "yes".
> +				 */
> +				fprintf(stderr, _("Run '%s'? [Yn] "), path);
> +				git_read_line_interactively(&prompt);
> +				strbuf_tolower(&prompt);
> +				if (starts_with(prompt.buf, "n")) {
> +					strbuf_release(&prompt);
> +					return 0;
> +				} else if (starts_with(prompt.buf, "y")) {
> +					strbuf_release(&prompt);
> +					return 1;
> +				}
> +				/* otherwise, we didn't understand the input */
> +			} while (prompt.len); /* an empty reply means "Yes" */
> +			strbuf_release(&prompt);
> +			return 1;
> +		case hookdir_yes:
> +		default:
> +			return 1;
> +	}
> +}
> +
>   struct list_head* hook_list(const struct strbuf* hookname)
>   {
>   	struct strbuf hook_key = STRBUF_INIT;
> @@ -166,3 +217,64 @@ struct list_head* hook_list(const struct strbuf* hookname)
>   	strbuf_release(&hook_key);
>   	return hook_head;
>   }
> +
> +void run_hooks_opt_init(struct run_hooks_opt *o)
> +{
> +	strvec_init(&o->env);
> +	strvec_init(&o->args);
> +	o->run_hookdir = configured_hookdir_opt();
> +}
> +
> +void run_hooks_opt_clear(struct run_hooks_opt *o)
> +{
> +	strvec_clear(&o->env);
> +	strvec_clear(&o->args);
> +}
> +
> +int run_hooks(const char *hookname, struct run_hooks_opt *options)
> +{
> +	struct strbuf hookname_str = STRBUF_INIT;
> +	struct list_head *to_run, *pos = NULL, *tmp = NULL;
> +	int rc = 0;
> +
> +	if (!options)
> +		BUG("a struct run_hooks_opt must be provided to run_hooks");
> +
> +	strbuf_addstr(&hookname_str, hookname);
> +
> +	to_run = hook_list(&hookname_str);
> +
> +	list_for_each_safe(pos, tmp, to_run) {
> +		struct child_process hook_proc = CHILD_PROCESS_INIT;
> +		struct hook *hook = list_entry(pos, struct hook, list);
> +
> +		hook_proc.env = options->env.v;
> +		hook_proc.no_stdin = 1;
> +		hook_proc.stdout_to_stderr = 1;
> +		hook_proc.trace2_hook_name = hook->command.buf;
> +		hook_proc.use_shell = 1;
> +
> +		if (hook->from_hookdir) {
> +		    if (!should_include_hookdir(hook->command.buf, options->run_hookdir))
> +			continue;
> +		    /*
> +		     * Commands from the config could be oneliners, but we know
> +		     * for certain that hookdir commands are not.
> +		     */
> +		    hook_proc.use_shell = 0;
> +		}
> +
> +		/* add command */
> +		strvec_push(&hook_proc.args, hook->command.buf);
> +
> +		/*
> +		 * add passed-in argv, without expanding - let the user get back
> +		 * exactly what they put in
> +		 */
> +		strvec_pushv(&hook_proc.args, options->args.v);
> +
> +		rc |= run_command(&hook_proc);
> +	}
> +
> +	return rc;
> +}
> diff --git a/hook.h b/hook.h
> index ca45d388d3..d1c3d71e82 100644
> --- a/hook.h
> +++ b/hook.h
> @@ -1,6 +1,7 @@
>   #include "config.h"
>   #include "list.h"
>   #include "strbuf.h"
> +#include "strvec.h"
>   
>   struct hook
>   {
> @@ -36,6 +37,37 @@ enum hookdir_opt
>    */
>   enum hookdir_opt configured_hookdir_opt(void);
>   
> +struct run_hooks_opt
> +{
> +	/* Environment vars to be set for each hook */
> +	struct strvec env;
> +
> +	/* Args to be passed to each hook */
> +	struct strvec args;
> +
> +	/*
> +	 * How should the hookdir be handled?
> +	 * Leave the RUN_HOOKS_OPT_INIT default in most cases; this only needs
> +	 * to be overridden if the user can override it at the command line.
> +	 */
> +	enum hookdir_opt run_hookdir;
> +};
> +
> +#define RUN_HOOKS_OPT_INIT  {   		\
> +	.env = STRVEC_INIT, 				\
> +	.args = STRVEC_INIT, 			\
> +	.run_hookdir = configured_hookdir_opt()	\
> +}
> +
> +void run_hooks_opt_init(struct run_hooks_opt *o);
> +void run_hooks_opt_clear(struct run_hooks_opt *o);
> +
> +/*
> + * Runs all hooks associated to the 'hookname' event in order. Each hook will be
> + * passed 'env' and 'args'.
> + */
> +int run_hooks(const char *hookname, struct run_hooks_opt *options);
> +
>   /* Free memory associated with a 'struct hook' */
>   void free_hook(struct hook *ptr);
>   /* Empties the list at 'head', calling 'free_hook()' on each entry */
> diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
> index ebd3bc623f..5b3003d59b 100755
> --- a/t/t1360-config-based-hooks.sh
> +++ b/t/t1360-config-based-hooks.sh
> @@ -115,7 +115,10 @@ test_expect_success 'hook.runHookDir = no is respected by list' '
>   
>   	git hook list pre-commit >actual &&
>   	# the hookdir annotation is translated
> -	test_i18ncmp expected actual
> +	test_i18ncmp expected actual &&
> +
> +	git hook run pre-commit 2>actual &&
> +	test_must_be_empty actual
>   '
>   
>   test_expect_success 'hook.runHookDir = warn is respected by list' '
> @@ -129,6 +132,14 @@ test_expect_success 'hook.runHookDir = warn is respected by list' '
>   
>   	git hook list pre-commit >actual &&
>   	# the hookdir annotation is translated
> +	test_i18ncmp expected actual &&
> +
> +	cat >expected <<-EOF &&
> +	Running legacy hook at '\''$(pwd)/.git/hooks/pre-commit'\''
> +	"Legacy Hook"
> +	EOF
> +
> +	git hook run pre-commit 2>actual &&
>   	test_i18ncmp expected actual
>   '
>   
> @@ -156,7 +167,7 @@ test_expect_success 'git hook list removes skipped inlined hook' '
>   	test_cmp expected actual
>   '
>   
> -test_expect_success 'hook.runHookDir = interactive is respected by list' '
> +test_expect_success 'hook.runHookDir = interactive is respected by list and run' '
>   	setup_hookdir &&
>   
>   	test_config hook.runHookDir "interactive" &&
> @@ -167,7 +178,55 @@ test_expect_success 'hook.runHookDir = interactive is respected by list' '
>   
>   	git hook list pre-commit >actual &&
>   	# the hookdir annotation is translated
> -	test_i18ncmp expected actual
> +	test_i18ncmp expected actual &&
> +
> +	test_write_lines n | git hook run pre-commit 2>actual &&
> +	! grep "Legacy Hook" actual &&
> +
> +	test_write_lines y | git hook run pre-commit 2>actual &&
> +	grep "Legacy Hook" actual
> +'
> +
> +test_expect_success 'inline hook definitions execute oneliners' '
> +	test_config hook.pre-commit.command "echo \"Hello World\"" &&
> +
> +	echo "Hello World" >expected &&
> +
> +	# hooks are run with stdout_to_stderr = 1
> +	git hook run pre-commit 2>actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'inline hook definitions resolve paths' '
> +	write_script sample-hook.sh <<-EOF &&
> +	echo \"Sample Hook\"
> +	EOF
> +
> +	test_when_finished "rm sample-hook.sh" &&
> +
> +	test_config hook.pre-commit.command "\"$(pwd)/sample-hook.sh\"" &&
> +
> +	echo \"Sample Hook\" >expected &&
> +
> +	# hooks are run with stdout_to_stderr = 1
> +	git hook run pre-commit 2>actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'hookdir hook included in git hook run' '
> +	setup_hookdir &&
> +
> +	echo \"Legacy Hook\" >expected &&
> +
> +	# hooks are run with stdout_to_stderr = 1
> +	git hook run pre-commit 2>actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'out-of-repo runs excluded' '
> +	setup_hooks &&
> +
> +	nongit test_must_fail git hook run pre-commit
>   '
>   
>   test_done
> 

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH 08/17] hook: add 'run' subcommand
  2020-12-11 10:15           ` Phillip Wood
@ 2020-12-15 21:41             ` Emily Shaffer
  0 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-15 21:41 UTC (permalink / raw)
  To: phillip.wood; +Cc: git

On Fri, Dec 11, 2020 at 10:15:26AM +0000, Phillip Wood wrote:
> 
> Hi Emily
> 
> On 05/12/2020 01:45, Emily Shaffer wrote:
> > In order to enable hooks to be run as an external process, by a
> > standalone Git command, or by tools which wrap Git, provide an external
> > means to run all configured hook commands for a given hook event.
> > 
> > For now, the hook commands will run in config order, in series. As
> > alternate ordering or parallelism is supported in the future, we should
> > add knobs to use those to the command line as well.
> > 
> > As with the legacy hook implementation, all stdout generated by hook
> > commands is redirected to stderr. Piping from stdin is not yet
> > supported.
> > 
> > Legacy hooks (those present in $GITDIR/hooks) are run at the end of the
> > execution list. For now, there is no way to disable them.
> > 
> > Users may wish to provide hook commands like 'git config
> > hook.pre-commit.command "~/linter.sh --pre-commit"'. To enable this, the
> > contents of the 'hook.*.command' and 'hookcmd.*.command' strings are
> > first split by space or quotes into an argv_array, then expanded with
> > 'expand_user_path()'.
> 
> I'm a bit confused by this last paragraph, the docs below say we pass the
> string to the shell and that's what the implementation seems to do. If we're
> running a lot of hooks then maybe it would be worth using split_cmdline()
> and expand_user_path() rather than invoking the shell for each hook we run.

Yeah, I think you are right that the commit message is stale. I had some
trouble getting things to work correctly with split_cmdline() and
expand_user_path(), so I'd prefer to run with shell.

> 
> I'm afraid I've only had time to skip the patch, there are a couple of minor
> comments below.

No problem. Thanks for having a look.

> > +static int should_include_hookdir(const char *path, enum hookdir_opt cfg)
> > +{
> > +	struct strbuf prompt = STRBUF_INIT;
> > +	/*
> > +	 * If the path doesn't exist, don't bother adding the empty hook and
> > +	 * don't bother checking the config or prompting the user.
> > +	 */
> > +	if (!path)
> > +		return 0;
> > +
> > +	switch (cfg)
> > +	{
> > +		case hookdir_no:
> 
> Style nit: we normally use uppercase for constants and enums.

OK. Thanks - will fix where it's introduced and update subsequent
patches.

> 
> > +			return 0;
> > +		case hookdir_unknown:
> > +			fprintf(stderr,
> > +				_("Unrecognized value for 'hook.runHookDir'. "
> > +				  "Is there a typo? "));
> 
> What happens at the moment if core.hooksPath does not exist?

When core.hooksPath does not exist then $GIT_DIR/hooks/ is used instead.
My setup currently doesn't have $GIT_DIR/hooks/ and runs happily. That
bit of logic (core.hooksPath or $GIT_DIR/hooks) is done in
run-command.h:find_hook() so I don't worry about it manually here.

However, your comment caused me to investigate what happens when
core.hooksPath DOES exist - and I found a bug. Because the 'git hook'
builtin doesn't call the default configuration callback, I miss
core.hooksPath hooks during 'git hook list' - but not during hooks
invoked during regular Git process runs. Very confusing :) So thanks for
the hint.

 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v6 00/17] propose config-based hooks (part I)
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (16 preceding siblings ...)
  2020-12-05  1:46         ` [PATCH 17/17] hooks: allow callers to capture output Emily Shaffer
@ 2020-12-16  0:34         ` Josh Steadmon
  2020-12-16  0:56           ` Junio C Hamano
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
  18 siblings, 1 reply; 170+ messages in thread
From: Josh Steadmon @ 2020-12-16  0:34 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: git, Jeff King, Junio C Hamano, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

On 2020.12.04 17:45, Emily Shaffer wrote:
> Hi folks, and thanks for the patience - I ran into many, many last-mile
> challenges.
> 
> I haven't addressed many comments on the design doc yet - I was keen to get the
> "functionally complete" implementation and conversion to the list.
> 
> Next on my plate:
>  - Update the design doc to make sense with what's in the implementation.
>  - A blog post! How to set up new hooks, why they're neat, etc.
>  - We seem to have some Googlers interested in trying it out internally, so
>    I'm hoping we'll gather and collate feedback from that soon too.
>  - And of course addressing comments on this series.
> 
> Thanks!
>  - Emily

This approach looks good to me. I'll look forward to seeing the updated
design and the feedback from the internal tests.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v6 00/17] propose config-based hooks (part I)
  2020-12-16  0:34         ` [PATCH v6 00/17] propose config-based hooks (part I) Josh Steadmon
@ 2020-12-16  0:56           ` Junio C Hamano
  2020-12-16 20:16             ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2020-12-16  0:56 UTC (permalink / raw)
  To: Josh Steadmon
  Cc: Emily Shaffer, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

Josh Steadmon <steadmon@google.com> writes:

> On 2020.12.04 17:45, Emily Shaffer wrote:
>> Hi folks, and thanks for the patience - I ran into many, many last-mile
>> challenges.
>> 
>> I haven't addressed many comments on the design doc yet - I was keen to get the
>> "functionally complete" implementation and conversion to the list.
>> 
>> Next on my plate:
>>  - Update the design doc to make sense with what's in the implementation.
>>  - A blog post! How to set up new hooks, why they're neat, etc.
>>  - We seem to have some Googlers interested in trying it out internally, so
>>    I'm hoping we'll gather and collate feedback from that soon too.
>>  - And of course addressing comments on this series.
>> 
>> Thanks!
>>  - Emily
>
> This approach looks good to me. I'll look forward to seeing the updated
> design and the feedback from the internal tests.

Thanks.

By the way, es/config-hooks does not seem to pass 5411 (at least)
even as a standalone topic, and has been kicked out of 'seen' for
some time.  Has anybody took a look into the issue?



^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v6 00/17] propose config-based hooks (part I)
  2020-12-16  0:56           ` Junio C Hamano
@ 2020-12-16 20:16             ` Emily Shaffer
  2020-12-16 23:32               ` Junio C Hamano
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-16 20:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Josh Steadmon, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

On Tue, Dec 15, 2020 at 04:56:18PM -0800, Junio C Hamano wrote:
> 
> Josh Steadmon <steadmon@google.com> writes:
> 
> > On 2020.12.04 17:45, Emily Shaffer wrote:
> >> Hi folks, and thanks for the patience - I ran into many, many last-mile
> >> challenges.
> >> 
> >> I haven't addressed many comments on the design doc yet - I was keen to get the
> >> "functionally complete" implementation and conversion to the list.
> >> 
> >> Next on my plate:
> >>  - Update the design doc to make sense with what's in the implementation.
> >>  - A blog post! How to set up new hooks, why they're neat, etc.
> >>  - We seem to have some Googlers interested in trying it out internally, so
> >>    I'm hoping we'll gather and collate feedback from that soon too.
> >>  - And of course addressing comments on this series.
> >> 
> >> Thanks!
> >>  - Emily
> >
> > This approach looks good to me. I'll look forward to seeing the updated
> > design and the feedback from the internal tests.
> 
> Thanks.
> 
> By the way, es/config-hooks does not seem to pass 5411 (at least)
> even as a standalone topic, and has been kicked out of 'seen' for
> some time.  Has anybody took a look into the issue?

Yeah, I looked at it today. Looks like an issue with not paying
attention to master->main default config, since I added a new test to
the 5411 suite (which means it wouldn't have made a conflict for someone
to say "ah yes, s/master/main/g"). I am tracking down couple of other CI
errors today and will send a reroll today or tomorrow.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v6 00/17] propose config-based hooks (part I)
  2020-12-16 20:16             ` Emily Shaffer
@ 2020-12-16 23:32               ` Junio C Hamano
  2020-12-18  2:07                 ` Emily Shaffer
  0 siblings, 1 reply; 170+ messages in thread
From: Junio C Hamano @ 2020-12-16 23:32 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: Josh Steadmon, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

Emily Shaffer <emilyshaffer@google.com> writes:

>> By the way, es/config-hooks does not seem to pass 5411 (at least)
>> even as a standalone topic, and has been kicked out of 'seen' for
>> some time.  Has anybody took a look into the issue?
>
> Yeah, I looked at it today. Looks like an issue with not paying
> attention to master->main default config, since I added a new test to
> the 5411 suite (which means it wouldn't have made a conflict for someone
> to say "ah yes, s/master/main/g"). I am tracking down couple of other CI
> errors today and will send a reroll today or tomorrow.

Thanks.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v6 00/17] propose config-based hooks (part I)
  2020-12-16 23:32               ` Junio C Hamano
@ 2020-12-18  2:07                 ` Emily Shaffer
  2020-12-18  5:29                   ` Junio C Hamano
  0 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-18  2:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Josh Steadmon, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

On Wed, Dec 16, 2020 at 03:32:46PM -0800, Junio C Hamano wrote:
> 
> Emily Shaffer <emilyshaffer@google.com> writes:
> 
> >> By the way, es/config-hooks does not seem to pass 5411 (at least)
> >> even as a standalone topic, and has been kicked out of 'seen' for
> >> some time.  Has anybody took a look into the issue?
> >
> > Yeah, I looked at it today. Looks like an issue with not paying
> > attention to master->main default config, since I added a new test to
> > the 5411 suite (which means it wouldn't have made a conflict for someone
> > to say "ah yes, s/master/main/g"). I am tracking down couple of other CI
> > errors today and will send a reroll today or tomorrow.
> 
> Thanks.

I don't have a reroll today. I have been trying to get to the bottom of
a test which fails when built with clang but passes when built with gcc
(t6030-bisect-porcelain.sh after patch 12 of the part II series) and
have not made progress on that, let alone on the other tasks I wanted to
do before sending the next version.

Next week I will only work one day, so I'd anticipate a reroll sometime
the week following. Sorry for the wait - but I think even if I sent it
with the fix for this t5411 failure, it would still break 'seen' because
of whatever this clang vs. gcc problem is.

Hope you enjoy your holidays.

 - Emily

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v6 00/17] propose config-based hooks (part I)
  2020-12-18  2:07                 ` Emily Shaffer
@ 2020-12-18  5:29                   ` Junio C Hamano
  0 siblings, 0 replies; 170+ messages in thread
From: Junio C Hamano @ 2020-12-18  5:29 UTC (permalink / raw)
  To: Emily Shaffer
  Cc: Josh Steadmon, git, Jeff King, James Ramsay, Jonathan Nieder,
	brian m. carlson, Ævar Arnfjörð Bjarmason,
	Phillip Wood, Johannes Schindelin

Emily Shaffer <emilyshaffer@google.com> writes:

> I don't have a reroll today. I have been trying to get to the bottom of
> a test which fails when built with clang but passes when built with gcc
> (t6030-bisect-porcelain.sh after patch 12 of the part II series) and
> have not made progress on that, let alone on the other tasks I wanted to
> do before sending the next version.

Thanks for an interim report.  No need to rush.

> Hope you enjoy your holidays.

You too, and have fun.

^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH v7 00/17] propose config-based hooks (part I)
  2020-12-05  1:45       ` [PATCH v6 00/17] propose config-based hooks (part I) Emily Shaffer
                           ` (17 preceding siblings ...)
  2020-12-16  0:34         ` [PATCH v6 00/17] propose config-based hooks (part I) Josh Steadmon
@ 2020-12-22  0:02         ` Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 01/17] doc: propose hooks managed by the config Emily Shaffer
                             ` (20 more replies)
  18 siblings, 21 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git
  Cc: Emily Shaffer, Jeff King, Junio C Hamano, James Ramsay,
	Jonathan Nieder, brian m. carlson,
	Ævar Arnfjörð Bjarmason, Phillip Wood,
	Josh Steadmon, Johannes Schindelin

Since v6:

 - Converted 'enum hookdir_opt' to UPPER_SNAKE
 - Coccinelle fix in the hook destructor
 - Fixed a bug where builtin/hook.c wasn't running the default git config setup
   and therefore missed hooks in core.hooksPath when it was set. (These hooks
   would still run except when invoked by 'git hook run' as the config was
   called by the processes which invoked the hook library.)

CI run: https://github.com/nasamuffin/git/actions/runs/436864964

Thanks!
 - Emily

Emily Shaffer (17):
  doc: propose hooks managed by the config
  hook: scaffolding for git-hook subcommand
  hook: add list command
  hook: include hookdir hook in list
  hook: respect hook.runHookDir
  hook: implement hookcmd.<name>.skip
  parse-options: parse into strvec
  hook: add 'run' subcommand
  hook: replace find_hook() with hook_exists()
  hook: support passing stdin to hooks
  run-command: allow stdin for run_processes_parallel
  hook: allow parallel hook execution
  hook: allow specifying working directory for hooks
  run-command: add stdin callback for parallelization
  hook: provide stdin by string_list or callback
  run-command: allow capturing of collated output
  hooks: allow callers to capture output

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/config/hook.txt                 |  19 +
 Documentation/git-hook.txt                    | 117 +++++
 Documentation/technical/api-parse-options.txt |   5 +
 .../technical/config-based-hooks.txt          | 355 +++++++++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/bugreport.c                           |   4 +-
 builtin/fetch.c                               |   1 +
 builtin/hook.c                                | 176 ++++++++
 builtin/submodule--helper.c                   |   2 +-
 command-list.txt                              |   1 +
 git.c                                         |   1 +
 hook.c                                        | 416 ++++++++++++++++++
 hook.h                                        | 154 +++++++
 parse-options-cb.c                            |  16 +
 parse-options.h                               |   4 +
 run-command.c                                 |  85 +++-
 run-command.h                                 |  31 ++
 submodule.c                                   |   1 +
 t/helper/test-run-command.c                   |  46 +-
 t/t0061-run-command.sh                        |  37 ++
 t/t1360-config-based-hooks.sh                 | 256 +++++++++++
 24 files changed, 1717 insertions(+), 15 deletions(-)
 create mode 100644 Documentation/config/hook.txt
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 Documentation/technical/config-based-hooks.txt
 create mode 100644 builtin/hook.c
 create mode 100644 hook.c
 create mode 100644 hook.h
 create mode 100755 t/t1360-config-based-hooks.sh

-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply	[flat|nested] 170+ messages in thread

* [PATCH v7 01/17] doc: propose hooks managed by the config
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-01-23 15:38             ` Ævar Arnfjörð Bjarmason
  2021-02-01 22:11             ` Junio C Hamano
  2020-12-22  0:02           ` [PATCH v7 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
                             ` (19 subsequent siblings)
  20 siblings, 2 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v6, checked for inconsistencies with implementation and added lots of
    caveats about whether 'git hook add' and 'git hook edit' will ever materialize.
    
    Hopefully this reflects reality now; please review accordingly.
    
    Since v6, checked for inconsistencies with implementation and added lots of
    caveats about whether 'git hook add' and 'git hook edit' will ever materialize.
    
    Hopefully this reflects reality now; please review accordingly.
    
    Since v4, addressed comments from Jonathan Tan about wording. However, I have
    not addressed AEvar's comments or done a full re-review of this document.
    I wanted to get the rest of the series out for initial review first.
    
     - Emily
    
    Since v4, addressed comments from Jonathan Tan about wording.

 Documentation/Makefile                        |   1 +
 .../technical/config-based-hooks.txt          | 355 ++++++++++++++++++
 2 files changed, 356 insertions(+)
 create mode 100644 Documentation/technical/config-based-hooks.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 69dbe4bb0b..505d318da1 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -81,6 +81,7 @@ SP_ARTICLES += $(API_DOCS)
 TECH_DOCS += MyFirstContribution
 TECH_DOCS += MyFirstObjectWalk
 TECH_DOCS += SubmittingPatches
+TECH_DOCS += technical/config-based-hooks
 TECH_DOCS += technical/hash-function-transition
 TECH_DOCS += technical/http-protocol
 TECH_DOCS += technical/index-format
diff --git a/Documentation/technical/config-based-hooks.txt b/Documentation/technical/config-based-hooks.txt
new file mode 100644
index 0000000000..3217faba47
--- /dev/null
+++ b/Documentation/technical/config-based-hooks.txt
@@ -0,0 +1,355 @@
+Configuration-based hook management
+===================================
+:sectanchors:
+
+[[motivation]]
+== Motivation
+
+Replace the .git/hook/hookname path as the only source of hooks to execute;
+allow users to define hooks using config files, in a way which is friendly to
+users with multiple repos which have similar needs.
+
+Redefine "hook" as an event rather than a single script, allowing users to
+perform multiple unrelated actions on a single event.
+
+Take a step closer to safety when copying zipped Git repositories from untrusted
+users by making it more apparent to users which scripts will be run during
+normal Git operations.
+
+Make it easier for users to discover Git's hook feature and automate their
+workflows.
+
+[[user-interfaces]]
+== User interfaces
+
+[[config-schema]]
+=== Config schema
+
+Hooks can be introduced by editing the configuration manually. There are two new
+sections added, `hook` and `hookcmd`.
+
+[[config-schema-hook]]
+==== `hook`
+
+Primarily contains subsections for each hook event. The order of variables in
+these subsections defines the hook command execution order; hook commands can be
+specified by setting the value directly to the command if no additional
+configuration is needed, or by setting the value as the name of a `hookcmd`. If
+Git does not find a `hookcmd` whose subsection matches the value of the given
+command string, Git will try to execute the string directly. Hooks are executed
+by passing the resolved command string to the shell. In the future, hook event
+subsections could also contain per-hook-event settings; see
+<<per-hook-event-settings,the section in Future Work>> for more details.
+
+Also contains top-level hook execution settings, for example, `hook.runHookDir`.
+(These settings are described more in <<library,Library>>.)
+
+----
+[hook "pre-commit"]
+  command = perl-linter
+  command = /usr/bin/git-secrets --pre-commit
+
+[hook "pre-applypatch"]
+  command = perl-linter
+  # for illustration purposes; error behavior isn't planned yet
+  error = ignore
+
+[hook]
+  runHookDir = interactive
+----
+
+[[config-schema-hookcmd]]
+==== `hookcmd`
+
+Defines a hook command and its attributes, which will be used when a hook event
+occurs. Unqualified attributes are assumed to apply to this hook during all hook
+events, but event-specific attributes can also be supplied. The example runs
+`/usr/bin/lint-it --language=perl <args passed by Git>`, but for repos which
+include this config, the hook command will be skipped for all events.
+Theoretically, the last line could be used to "un-skip" the hook command for
+`pre-commit` hooks, but this hasn't been scoped or implemented yet.
+
+----
+[hookcmd "perl-linter"]
+  command = /usr/bin/lint-it --language=perl
+  skip = true
+  # for illustration purposes; below hasn't been defined yet
+  pre-commit-skip = false
+----
+
+[[command-line-api]]
+=== Command-line API
+
+Users should be able to view, run, reorder, and create hook commands via the
+command line. External tools should be able to view a list of hooks in the
+correct order to run. Modifier commands (`edit` and `add`) have not been
+implemented yet and may not be if manually editing the config proves usable
+enough.
+
+*`git hook list <hook-event>`*
+
+*`git hook run <hook-event> [-a <arg>]... [-e <env-var>]...`*
+
+*`git hook edit <hook-event>`*
+
+*`git hook add <hook-command> <hook-event> <options...>`*
+
+[[hook-editor]]
+=== Hook editor
+
+The tool which is presented by `git hook edit <hook-command>`. Ideally, this
+tool should be easier to use than manually editing the config, and then produce
+a concise config afterwards. It may take a form similar to `git rebase
+--interactive`. This has not been designed or implemented yet and may not be if
+the config proves usable enough.
+
+[[implementation]]
+== Implementation
+
+[[library]]
+=== Library
+
+`hook.c` and `hook.h` are responsible for interacting with the config files. The
+hook library provides a basic API to call all hooks in config order with more
+complex options passed via `struct run_hooks_opt`:
+
+*`int run_hooks(const char *hookname, struct run_hooks_opt *options)`*
+
+`struct run_hooks_opt` allows callers to set:
+
+- environment variables
+- command-line arguments
+- behavior for the hook command provided by `run-command.h:find_hook()` (see
+  below)
+- a method to provide stdin to each hook, either via a file containing stdin, a
+  `struct string_list` containing a list of lines to print, or a callback
+  function to allow the caller to populate stdin manually
+- a method to process stdout from each hook, e.g. for printing to sideband
+  during a network operation
+- parallelism
+- a custom working directory for hooks to execute in
+
+And this struct can be extended with more options as necessary in the future.
+
+The "legacy" hook provided by `run-command.h:find_hook()` - that is, the hook
+present in `.git/hooks/<hookname>` or
+`$(git config --get core.hooksPath)/<hookname>` - can be handled in a number of
+ways, providing an avenue to deprecate these "legacy" hooks if desired. The
+handling is based on a config `hook.runHookDir`, which is checked against a
+number of cases:
+
+- "no": the legacy hook will not be run
+- "interactive": Git will prompt the user before running the legacy hook
+- "warn": Git will print a warning to stderr before running the legacy hook
+- "yes" (default): Git will silently run the legacy hook
+
+In case this list is expanded in the future, if a value for `hook.runHookDir` is
+given which Git does not recognize, Git should discard that config entry. For
+example, if "warn" was specified at system level and "junk" was specified at
+global level, Git would resolve the value to "warn"; if the only time the config
+was set was to "junk", Git would use the default value of "yes".
+
+`struct hookcmd` is expected to grow in size over time as more functionality is
+added to hooks; so that other parts of the code don't need to understand the
+config schema, `struct hookcmd` should contain logical values instead of string
+pairs.
+
+By default, hook parallelism is chosen based on the semantics of each hook;
+callsites initialize their `struct run_hooks_opt` via one of two macros,
+`RUN_HOOKS_OPT_INIT_SYNC` or `RUN_HOOKS_OPT_INIT_ASYNC`. The default number of
+jobs can be configured in `hook.jobs`; this config applies across all hook
+events. If unset, the value of `online_cpus()` (equivalent to `nproc`) is used.
+
+[[builtin]]
+=== Builtin
+
+`builtin/hook.c` is responsible for providing the frontend. It's responsible for
+formatting user-provided data and then calling the library API to set the
+configs as appropriate. The builtin frontend is not responsible for calling the
+config directly, so that other areas of Git can rely on the hook library to
+understand the most recent config schema for hooks.
+
+[[migration]]
+=== Migration path
+
+[[stage-0]]
+==== Stage 0
+
+Hooks are called by running `run-command.h:find_hook()` with the hookname and
+executing the result. The hook library and builtin do not exist. Hooks only
+exist as specially named scripts within `.git/hooks/`.
+
+[[stage-1]]
+==== Stage 1
+
+`git hook list --porcelain <hook-event>` is implemented. `hook.h:run_hooks()` is
+taught to include `run-command.h:find_hook()` at the end; calls to `find_hook()`
+are replaced with calls to `run_hooks()`. Users can opt-in to config-based hooks
+simply by creating some in their config; otherwise users should remain
+unaffected by the change.
+
+[[stage-2]]
+==== Stage 2
+
+The call to `find_hook()` inside of `run_hooks()` learns to check for a config,
+`hook.runHookDir`. Users can opt into managing their hooks completely via the
+config this way.
+
+[[stage-3]]
+==== Stage 3
+
+`.git/hooks` is removed from the template and the hook directory is considered
+deprecated. To avoid breaking older repos, the default of `hook.runHookDir` is
+not changed, and `find_hook()` is not removed.
+
+[[caveats]]
+== Caveats
+
+[[security]]
+=== Security and repo config
+
+Part of the motivation behind this refactor is to mitigate hooks as an attack
+vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
+however, as the design stands, users can still provide hooks in the repo-level
+config, which is included when a repo is zipped and sent elsewhere.  The
+security of the repo-level config is still under discussion; this design
+generally assumes the repo-level config is secure, which is not true yet. The
+goal is to avoid an overcomplicated design to work around a problem which has
+ceased to exist.
+
+[[ease-of-use]]
+=== Ease of use
+
+The config schema is nontrivial; that's why it's important for the `git hook`
+modifier commands to be usable. Contributors with UX expertise are encouraged to
+share their suggestions.
+
+[[alternatives]]
+== Alternative approaches
+
+A previous summary of alternatives exists in the
+archives.footnote:[https://lore.kernel.org/git/20191116011125.GG22855@google.com]
+
+[[status-quo]]
+=== Status quo
+
+Today users can implement multihooks themselves by using a "trampoline script"
+as their hook, and pointing that script to a directory or list of other scripts
+they wish to run.
+
+[[hook-directories]]
+=== Hook directories
+
+Other contributors have suggested Git learn about the existence of a directory
+such as `.git/hooks/<hookname>.d` and execute those hooks in alphabetical order.
+
+[[comparison]]
+=== Comparison table
+
+.Comparison of alternatives
+|===
+|Feature |Config-based hooks |Hook directories |Status quo
+
+|Supports multiple hooks
+|Natively
+|Natively
+|With user effort
+
+|Safer for zipped repos
+|A little
+|No
+|No
+
+|Previous hooks just work
+|If configured
+|Yes
+|Yes
+
+|Can install one hook to many repos
+|Yes
+|No
+|No
+
+|Discoverability
+|Better (in `git help git`)
+|Same as before
+|Same as before
+
+|Hard to run unexpected hook
+|If configured
+|No
+|No
+|===
+
+[[future-work]]
+== Future work
+
+[[execution-ordering]]
+=== Execution ordering
+
+We may find that config order is insufficient for some users; for example,
+config order makes it difficult to add a new hook to the system or global config
+which runs at the end of the hook list. A new ordering schema should be:
+
+1) Specified by a `hook.order` config, so that users will not unexpectedly see
+their order change;
+
+2) Either dependency or numerically based.
+
+Dependency-based ordering is prone to classic linked-list problems, like a
+cycles and handling of missing dependencies. But, it paves the way for enabling
+parallelization if some tasks truly depend on others.
+
+Numerical ordering makes it tricky for Git to generate suggested ordering
+numbers for each command, but is easy to determine a definitive order.
+
+[[parallelization]]
+=== Parallelization with dependencies
+
+Currently hooks use a naive parallelization scheme or are run in series.  But if
+one hook depends on another's output, then users will want to specify those
+dependencies. If we decide to solve this problem, we may want to look to modern
+build systems for inspiration on how to manage dependencies and parallel tasks.
+
+[[securing-hookdir-hooks]]
+=== Securing hookdir hooks
+
+With the design as written in this doc, it's still possible for a malicious user
+to modify `.git/config` to include `hook.pre-receive.command = rm -rf /`, then
+zip their repo and send it to another user. It may be necessary to teach Git to
+only allow inlined hooks like this if they were configured outside of the local
+scope (in other words, only run hookcmds, and only allow hookcmds to be
+configured in global or system scope); or another approach, like a list of safe
+projects, might be useful. It may also be sufficient (or at least useful) to
+teach a `hook.disableAll` config or similar flag to the Git executable.
+
+[[submodule-inheritance]]
+=== Submodule inheritance
+
+It's possible some submodules may want to run the identical set of hooks that
+their superrepo runs. While a globally-configured hook set is helpful, it's not
+a great solution for users who have multiple repos-with-submodules under the
+same user. It would be useful for submodules to learn how to run hooks from
+their superrepo's config, or inherit that hook setting.
+
+[[per-hook-event-settings]]
+=== Per-hook-event settings
+
+It might be desirable to keep settings specifically for some hook events, but
+not for others - for example, a user may wish to disable hookdir hooks for all
+events but pre-commit, which they haven't had time to convert yet; or, a user
+may wish for execution order settings to differ based on hook event. In that
+case, it would be useful to set something like `hook.pre-commit.executionOrder`
+which would not apply to the 'prepare-commit-msg' hook, for example.
+
+[[glossary]]
+== Glossary
+
+*hook event*
+
+A point during Git's execution where user scripts may be run, for example,
+_prepare-commit-msg_ or _pre-push_.
+
+*hook command*
+
+A user script or executable which will be run on one or more hook events.
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v7 02/17] hook: scaffolding for git-hook subcommand
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 01/17] doc: propose hooks managed by the config Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 03/17] hook: add list command Emily Shaffer
                             ` (18 subsequent siblings)
  20 siblings, 0 replies; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, mainly changed to RUN_SETUP_GENTLY so that 'git hook list' can
    be executed outside of a repo.

 .gitignore                    |  1 +
 Documentation/git-hook.txt    | 20 ++++++++++++++++++++
 Makefile                      |  1 +
 builtin.h                     |  1 +
 builtin/hook.c                | 21 +++++++++++++++++++++
 command-list.txt              |  1 +
 git.c                         |  1 +
 t/t1360-config-based-hooks.sh | 11 +++++++++++
 8 files changed, 57 insertions(+)
 create mode 100644 Documentation/git-hook.txt
 create mode 100644 builtin/hook.c
 create mode 100755 t/t1360-config-based-hooks.sh

diff --git a/.gitignore b/.gitignore
index 3dcdb6bb5a..3608c35b73 100644
--- a/.gitignore
+++ b/.gitignore
@@ -76,6 +76,7 @@
 /git-grep
 /git-hash-object
 /git-help
+/git-hook
 /git-http-backend
 /git-http-fetch
 /git-http-push
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
new file mode 100644
index 0000000000..9eeab0009d
--- /dev/null
+++ b/Documentation/git-hook.txt
@@ -0,0 +1,20 @@
+git-hook(1)
+===========
+
+NAME
+----
+git-hook - Manage configured hooks
+
+SYNOPSIS
+--------
+[verse]
+'git hook'
+
+DESCRIPTION
+-----------
+A placeholder command. Later, you will be able to list, add, and modify hooks
+with this command.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 6fb86c5862..24cee44400 100644
--- a/Makefile
+++ b/Makefile
@@ -1101,6 +1101,7 @@ BUILTIN_OBJS += builtin/get-tar-commit-id.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
+BUILTIN_OBJS += builtin/hook.o
 BUILTIN_OBJS += builtin/index-pack.o
 BUILTIN_OBJS += builtin/init-db.o
 BUILTIN_OBJS += builtin/interpret-trailers.o
diff --git a/builtin.h b/builtin.h
index b6ce981b73..8df1d36a7a 100644
--- a/builtin.h
+++ b/builtin.h
@@ -163,6 +163,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
 int cmd_grep(int argc, const char **argv, const char *prefix);
 int cmd_hash_object(int argc, const char **argv, const char *prefix);
 int cmd_help(int argc, const char **argv, const char *prefix);
+int cmd_hook(int argc, const char **argv, const char *prefix);
 int cmd_index_pack(int argc, const char **argv, const char *prefix);
 int cmd_init_db(int argc, const char **argv, const char *prefix);
 int cmd_interpret_trailers(int argc, const char **argv, const char *prefix);
diff --git a/builtin/hook.c b/builtin/hook.c
new file mode 100644
index 0000000000..b2bbc84d4d
--- /dev/null
+++ b/builtin/hook.c
@@ -0,0 +1,21 @@
+#include "cache.h"
+
+#include "builtin.h"
+#include "parse-options.h"
+
+static const char * const builtin_hook_usage[] = {
+	N_("git hook"),
+	NULL
+};
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	return 0;
+}
diff --git a/command-list.txt b/command-list.txt
index 9379b02e5e..75909bf602 100644
--- a/command-list.txt
+++ b/command-list.txt
@@ -103,6 +103,7 @@ git-grep                                mainporcelain           info
 git-gui                                 mainporcelain
 git-hash-object                         plumbingmanipulators
 git-help                                ancillaryinterrogators          complete
+git-hook                                mainporcelain
 git-http-backend                        synchingrepositories
 git-http-fetch                          synchelpers
 git-http-push                           synchelpers
diff --git a/git.c b/git.c
index a00a0a4d94..9d1768b8e8 100644
--- a/git.c
+++ b/git.c
@@ -525,6 +525,7 @@ static struct cmd_struct commands[] = {
 	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
+	{ "hook", cmd_hook, RUN_SETUP_GENTLY },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "init", cmd_init_db },
 	{ "init-db", cmd_init_db },
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
new file mode 100755
index 0000000000..34b0df5216
--- /dev/null
+++ b/t/t1360-config-based-hooks.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+test_description='config-managed multihooks, including git-hook command'
+
+. ./test-lib.sh
+
+test_expect_success 'git hook command does not crash' '
+	git hook
+'
+
+test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v7 03/17] hook: add list command
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 01/17] doc: propose hooks managed by the config Emily Shaffer
  2020-12-22  0:02           ` [PATCH v7 02/17] hook: scaffolding for git-hook subcommand Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-01-31  3:10             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 04/17] hook: include hookdir hook in list Emily Shaffer
                             ` (17 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.

Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.

For example:

  $ git config --list | grep ^hook
  hook.pre-commit.command=baz
  hook.pre-commit.command=~/bar.sh
  hookcmd.baz.command=~/baz/from/hookcmd.sh

  $ git hook list pre-commit
  global: ~/baz/from/hookcmd.sh
  local: ~/bar.sh

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Since v4, updated the sample in the commit message to reflect reality better.
    
    Since v4, more work on the documentation. Also a slight change to the
    output format (space instead of tab).

 Documentation/config/hook.txt |   9 +++
 Documentation/git-hook.txt    |  59 ++++++++++++++++-
 Makefile                      |   1 +
 builtin/hook.c                |  56 +++++++++++++++--
 hook.c                        | 115 ++++++++++++++++++++++++++++++++++
 hook.h                        |  26 ++++++++
 t/t1360-config-based-hooks.sh |  81 +++++++++++++++++++++++-
 7 files changed, 338 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/config/hook.txt
 create mode 100644 hook.c
 create mode 100644 hook.h

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
new file mode 100644
index 0000000000..71449ecbc7
--- /dev/null
+++ b/Documentation/config/hook.txt
@@ -0,0 +1,9 @@
+hook.<command>.command::
+	A command to execute during the <command> hook event. This can be an
+	executable on your device, a oneliner for your shell, or the name of a
+	hookcmd. See linkgit:git-hook[1].
+
+hookcmd.<name>.command::
+	A command to execute during a hook for which <name> has been specified
+	as a command. This can be an executable on your device or a oneliner for
+	your shell. See linkgit:git-hook[1].
diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 9eeab0009d..f19875ed68 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -8,12 +8,65 @@ git-hook - Manage configured hooks
 SYNOPSIS
 --------
 [verse]
-'git hook'
+'git hook' list <hook-name>
 
 DESCRIPTION
 -----------
-A placeholder command. Later, you will be able to list, add, and modify hooks
-with this command.
+You can list configured hooks with this command. Later, you will be able to run,
+add, and modify hooks with this command.
+
+This command parses the default configuration files for sections `hook` and
+`hookcmd`. `hook` is used to describe the commands which will be run during a
+particular hook event; commands are run in the order Git encounters them during
+the configuration parse (see linkgit:git-config[1]). `hookcmd` is used to
+describe attributes of a specific command. If additional attributes don't need
+to be specified, a command to run can be specified directly in the `hook`
+section; if a `hookcmd` by that name isn't found, Git will attempt to run the
+provided value directly. For example:
+
+Global config
+----
+  [hook "post-commit"]
+    command = "linter"
+    command = "~/typocheck.sh"
+
+  [hookcmd "linter"]
+    command = "/bin/linter --c"
+----
+
+Local config
+----
+  [hook "prepare-commit-msg"]
+    command = "linter"
+  [hook "post-commit"]
+    command = "python ~/run-test-suite.py"
+----
+
+With these configs, you'd then see:
+
+----
+$ git hook list "post-commit"
+global: /bin/linter --c
+global: ~/typocheck.sh
+local: python ~/run-test-suite.py
+
+$ git hook list "prepare-commit-msg"
+local: /bin/linter --c
+----
+
+COMMANDS
+--------
+
+list `<hook-name>`::
+
+List the hooks which have been configured for `<hook-name>`. Hooks appear
+in the order they should be run, and print the config scope where the relevant
+`hook.<hook-name>.command` was specified, not the `hookcmd` (if applicable).
+This output is human-readable and the format is subject to change over time.
+
+CONFIGURATION
+-------------
+include::config/hook.txt[]
 
 GIT
 ---
diff --git a/Makefile b/Makefile
index 24cee44400..d9f43dc8fe 100644
--- a/Makefile
+++ b/Makefile
@@ -904,6 +904,7 @@ LIB_OBJS += grep.o
 LIB_OBJS += hashmap.o
 LIB_OBJS += help.o
 LIB_OBJS += hex.o
+LIB_OBJS += hook.o
 LIB_OBJS += ident.o
 LIB_OBJS += json-writer.o
 LIB_OBJS += kwset.o
diff --git a/builtin/hook.c b/builtin/hook.c
index b2bbc84d4d..4d36de52f8 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -1,21 +1,69 @@
 #include "cache.h"
 
 #include "builtin.h"
+#include "config.h"
+#include "hook.h"
 #include "parse-options.h"
+#include "strbuf.h"
 
 static const char * const builtin_hook_usage[] = {
-	N_("git hook"),
+	N_("git hook list <hookname>"),
 	NULL
 };
 
-int cmd_hook(int argc, const char **argv, const char *prefix)
+static int list(int argc, const char **argv, const char *prefix)
 {
-	struct option builtin_hook_options[] = {
+	struct list_head *head, *pos;
+	struct hook *item;
+	struct strbuf hookname = STRBUF_INIT;
+
+	struct option list_options[] = {
 		OPT_END(),
 	};
 
-	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+	argc = parse_options(argc, argv, prefix, list_options,
 			     builtin_hook_usage, 0);
 
+	if (argc < 1) {
+		usage_msg_opt(_("You must specify a hook event name to list."),
+			      builtin_hook_usage, list_options);
+	}
+
+	strbuf_addstr(&hookname, argv[0]);
+
+	head = hook_list(&hookname);
+
+	if (list_empty(head)) {
+		printf(_("no commands configured for hook '%s'\n"),
+		       hookname.buf);
+		strbuf_release(&hookname);
+		return 0;
+	}
+
+	list_for_each(pos, head) {
+		item = list_entry(pos, struct hook, list);
+		if (item)
+			printf("%s: %s\n",
+			       config_scope_name(item->origin),
+			       item->command.buf);
+	}
+
+	clear_hook_list(head);
+	strbuf_release(&hookname);
+
 	return 0;
 }
+
+int cmd_hook(int argc, const char **argv, const char *prefix)
+{
+	struct option builtin_hook_options[] = {
+		OPT_END(),
+	};
+	if (argc < 2)
+		usage_with_options(builtin_hook_usage, builtin_hook_options);
+
+	if (!strcmp(argv[1], "list"))
+		return list(argc - 1, argv + 1, prefix);
+
+	usage_with_options(builtin_hook_usage, builtin_hook_options);
+}
diff --git a/hook.c b/hook.c
new file mode 100644
index 0000000000..937dc768c8
--- /dev/null
+++ b/hook.c
@@ -0,0 +1,115 @@
+#include "cache.h"
+
+#include "hook.h"
+#include "config.h"
+
+void free_hook(struct hook *ptr)
+{
+	if (ptr) {
+		strbuf_release(&ptr->command);
+		free(ptr);
+	}
+}
+
+static void append_or_move_hook(struct list_head *head, const char *command)
+{
+	struct list_head *pos = NULL, *tmp = NULL;
+	struct hook *to_add = NULL;
+
+	/*
+	 * remove the prior entry with this command; we'll replace it at the
+	 * end.
+	 */
+	list_for_each_safe(pos, tmp, head) {
+		struct hook *it = list_entry(pos, struct hook, list);
+		if (!strcmp(it->command.buf, command)) {
+		    list_del(pos);
+		    /* we'll simply move the hook to the end */
+		    to_add = it;
+		}
+	}
+
+	if (!to_add) {
+		/* adding a new hook, not moving an old one */
+		to_add = xmalloc(sizeof(struct hook));
+		strbuf_init(&to_add->command, 0);
+		strbuf_addstr(&to_add->command, command);
+	}
+
+	/* re-set the scope so we show where an override was specified */
+	to_add->origin = current_config_scope();
+
+	list_add_tail(&to_add->list, pos);
+}
+
+static void remove_hook(struct list_head *to_remove)
+{
+	struct hook *hook_to_remove = list_entry(to_remove, struct hook, list);
+	list_del(to_remove);
+	free_hook(hook_to_remove);
+}
+
+void clear_hook_list(struct list_head *head)
+{
+	struct list_head *pos, *tmp;
+	list_for_each_safe(pos, tmp, head)
+		remove_hook(pos);
+}
+
+struct hook_config_cb
+{
+	struct strbuf *hookname;
+	struct list_head *list;
+};
+
+static int hook_config_lookup(const char *key, const char *value, void *cb_data)
+{
+	struct hook_config_cb *data = cb_data;
+	const char *hook_key = data->hookname->buf;
+	struct list_head *head = data->list;
+
+	if (!strcmp(key, hook_key)) {
+		const char *command = value;
+		struct strbuf hookcmd_name = STRBUF_INIT;
+
+		/* Check if a hookcmd with that name exists. */
+		strbuf_addf(&hookcmd_name, "hookcmd.%s.command", command);
+		git_config_get_value(hookcmd_name.buf, &command);
+
+		if (!command) {
+			strbuf_release(&hookcmd_name);
+			BUG("git_config_get_value overwrote a string it shouldn't have");
+		}
+
+		/*
+		 * TODO: implement an option-getting callback, e.g.
+		 *   get configs by pattern hookcmd.$value.*
+		 *   for each key+value, do_callback(key, value, cb_data)
+		 */
+
+		append_or_move_hook(head, command);
+
+		strbuf_release(&hookcmd_name);
+	}
+
+	return 0;
+}
+
+struct list_head* hook_list(const struct strbuf* hookname)
+{
+	struct strbuf hook_key = STRBUF_INIT;
+	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
+	struct hook_config_cb cb_data = { &hook_key, hook_head };
+
+	INIT_LIST_HEAD(hook_head);
+
+	if (!hookname)
+		return NULL;
+
+	strbuf_addf(&hook_key, "hook.%s.command", hookname->buf);
+
+	git_config(hook_config_lookup, (void*)&cb_data);
+
+	strbuf_release(&hook_key);
+	return hook_head;
+}
diff --git a/hook.h b/hook.h
new file mode 100644
index 0000000000..8ffc4f14b6
--- /dev/null
+++ b/hook.h
@@ -0,0 +1,26 @@
+#include "config.h"
+#include "list.h"
+#include "strbuf.h"
+
+struct hook
+{
+	struct list_head list;
+	/*
+	 * Config file which holds the hook.*.command definition.
+	 * (This has nothing to do with the hookcmd.<name>.* configs.)
+	 */
+	enum config_scope origin;
+	/* The literal command to run. */
+	struct strbuf command;
+};
+
+/*
+ * Provides a linked list of 'struct hook' detailing commands which should run
+ * in response to the 'hookname' event, in execution order.
+ */
+struct list_head* hook_list(const struct strbuf *hookname);
+
+/* Free memory associated with a 'struct hook' */
+void free_hook(struct hook *ptr);
+/* Empties the list at 'head', calling 'free_hook()' on each entry */
+void clear_hook_list(struct list_head *head);
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 34b0df5216..6e4a3e763f 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -4,8 +4,85 @@ test_description='config-managed multihooks, including git-hook command'
 
 . ./test-lib.sh
 
-test_expect_success 'git hook command does not crash' '
-	git hook
+ROOT=
+if test_have_prereq MINGW
+then
+	# In Git for Windows, Unix-like paths work only in shell scripts;
+	# `git.exe`, however, will prefix them with the pseudo root directory
+	# (of the Unix shell). Let's accommodate for that.
+	ROOT="$(cd / && pwd)"
+fi
+
+setup_hooks () {
+	test_config hook.pre-commit.command "/path/ghi" --add
+	test_config_global hook.pre-commit.command "/path/def" --add
+}
+
+setup_hookcmd () {
+	test_config hook.pre-commit.command "abc" --add
+	test_config_global hookcmd.abc.command "/path/abc" --add
+}
+
+test_expect_success 'git hook rejects commands without a mode' '
+	test_must_fail git hook pre-commit
+'
+
+
+test_expect_success 'git hook rejects commands without a hookname' '
+	test_must_fail git hook list
+'
+
+test_expect_success 'git hook runs outside of a repo' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	EOF
+
+	nongit git config --list --global &&
+
+	nongit git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list orders by config order' '
+	setup_hooks &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	local: $ROOT/path/ghi
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list dereferences a hookcmd' '
+	setup_hooks &&
+	setup_hookcmd &&
+
+	cat >expected <<-EOF &&
+	global: $ROOT/path/def
+	local: $ROOT/path/ghi
+	local: $ROOT/path/abc
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'git hook list reorders on duplicate commands' '
+	setup_hooks &&
+
+	test_config hook.pre-commit.command "/path/def" --add &&
+
+	cat >expected <<-EOF &&
+	local: $ROOT/path/ghi
+	local: $ROOT/path/def
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
 '
 
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v7 04/17] hook: include hookdir hook in list
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (2 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 03/17] hook: add list command Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-01-31  3:20             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 05/17] hook: respect hook.runHookDir Emily Shaffer
                             ` (16 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Historically, hooks are declared by placing an executable into
$GIT_DIR/hooks/$HOOKNAME (or $HOOKDIR/$HOOKNAME). Although hooks taken
from the config are more featureful than hooks placed in the $HOOKDIR,
those hooks should not stop working for users who already have them.

Legacy hooks should be run directly, not in shell. We know that they are
a path to an executable, not a oneliner script - and running them
directly takes care of path quoting concerns for us for free.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Newly split into its own commit since v4, and taking place much sooner.
    
    An unfortunate side effect of adding this support *before* the
    hook.runHookDir support is that the labels on the list are not clear -
    because we aren't yet flagging which hooks are from the hookdir versus
    the config. I suppose we could move the addition of that field to the
    struct hook up to this patch, but it didn't make a lot of sense to me to
    do it just for cosmetic purposes.

 builtin/hook.c                | 18 ++++++++++++++----
 hook.c                        | 15 +++++++++++++++
 hook.h                        |  1 +
 t/t1360-config-based-hooks.sh | 19 +++++++++++++++++++
 4 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/builtin/hook.c b/builtin/hook.c
index 4d36de52f8..a0013ae4d7 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -16,6 +16,7 @@ static int list(int argc, const char **argv, const char *prefix)
 	struct list_head *head, *pos;
 	struct hook *item;
 	struct strbuf hookname = STRBUF_INIT;
+	struct strbuf hookdir_annotation = STRBUF_INIT;
 
 	struct option list_options[] = {
 		OPT_END(),
@@ -42,10 +43,17 @@ static int list(int argc, const char **argv, const char *prefix)
 
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
-		if (item)
-			printf("%s: %s\n",
-			       config_scope_name(item->origin),
-			       item->command.buf);
+		if (item) {
+			/* Don't translate 'hookdir' - it matches the config */
+			printf("%s: %s%s\n",
+			       (item->from_hookdir
+				? "hookdir"
+				: config_scope_name(item->origin)),
+			       item->command.buf,
+			       (item->from_hookdir
+				? hookdir_annotation.buf
+				: ""));
+		}
 	}
 
 	clear_hook_list(head);
@@ -62,6 +70,8 @@ int cmd_hook(int argc, const char **argv, const char *prefix)
 	if (argc < 2)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
+	git_config(git_default_config, NULL);
+
 	if (!strcmp(argv[1], "list"))
 		return list(argc - 1, argv + 1, prefix);
 
diff --git a/hook.c b/hook.c
index 937dc768c8..ffbdcfd987 100644
--- a/hook.c
+++ b/hook.c
@@ -2,6 +2,7 @@
 
 #include "hook.h"
 #include "config.h"
+#include "run-command.h"
 
 void free_hook(struct hook *ptr)
 {
@@ -34,6 +35,7 @@ static void append_or_move_hook(struct list_head *head, const char *command)
 		to_add = xmalloc(sizeof(struct hook));
 		strbuf_init(&to_add->command, 0);
 		strbuf_addstr(&to_add->command, command);
+		to_add->from_hookdir = 0;
 	}
 
 	/* re-set the scope so we show where an override was specified */
@@ -100,6 +102,7 @@ struct list_head* hook_list(const struct strbuf* hookname)
 	struct strbuf hook_key = STRBUF_INIT;
 	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
 	struct hook_config_cb cb_data = { &hook_key, hook_head };
+	const char *legacy_hook_path = NULL;
 
 	INIT_LIST_HEAD(hook_head);
 
@@ -110,6 +113,18 @@ struct list_head* hook_list(const struct strbuf* hookname)
 
 	git_config(hook_config_lookup, (void*)&cb_data);
 
+	if (have_git_dir())
+		legacy_hook_path = find_hook(hookname->buf);
+
+	/* Unconditionally add legacy hook, but annotate it. */
+	if (legacy_hook_path) {
+		struct hook *legacy_hook;
+
+		append_or_move_hook(hook_head, absolute_path(legacy_hook_path));
+		legacy_hook = list_entry(hook_head->prev, struct hook, list);
+		legacy_hook->from_hookdir = 1;
+	}
+
 	strbuf_release(&hook_key);
 	return hook_head;
 }
diff --git a/hook.h b/hook.h
index 8ffc4f14b6..5750634c83 100644
--- a/hook.h
+++ b/hook.h
@@ -12,6 +12,7 @@ struct hook
 	enum config_scope origin;
 	/* The literal command to run. */
 	struct strbuf command;
+	int from_hookdir;
 };
 
 /*
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 6e4a3e763f..0f12af4659 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -23,6 +23,14 @@ setup_hookcmd () {
 	test_config_global hookcmd.abc.command "/path/abc" --add
 }
 
+setup_hookdir () {
+	mkdir .git/hooks
+	write_script .git/hooks/pre-commit <<-EOF
+	echo \"Legacy Hook\"
+	EOF
+	test_when_finished rm -rf .git/hooks
+}
+
 test_expect_success 'git hook rejects commands without a mode' '
 	test_must_fail git hook pre-commit
 '
@@ -85,4 +93,15 @@ test_expect_success 'git hook list reorders on duplicate commands' '
 	test_cmp expected actual
 '
 
+test_expect_success 'git hook list shows hooks from the hookdir' '
+	setup_hookdir &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit
+	EOF
+
+	git hook list pre-commit >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v7 05/17] hook: respect hook.runHookDir
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (3 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 04/17] hook: include hookdir hook in list Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-01-31  3:35             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 06/17] hook: implement hookcmd.<name>.skip Emily Shaffer
                             ` (15 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02 UTC (permalink / raw)
  To: git; +Cc: Emily Shaffer

Include hooks specified in the hook directory in the list of hooks to
run. These hooks do need to be treated differently from config-specified
ones - they do not need to run in a shell, and later on may be disabled
or warned about based on a config setting.

Because they are at least as local as the local config, we'll run them
last - to keep the hook execution order from global to local.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---

Notes:
    Newly split into its own commit since v4, and taking place much sooner.
    
    An unfortunate side effect of adding this support *before* the
    hook.runHookDir support is that the labels on the list are not clear -
    because we aren't yet flagging which hooks are from the hookdir versus
    the config. I suppose we could move the addition of that field to the
    struct hook up to this patch, but it didn't make a lot of sense to me to
    do it just for cosmetic purposes.

 Documentation/config/hook.txt |  5 ++++
 builtin/hook.c                | 54 +++++++++++++++++++++++++++++++++--
 hook.c                        | 21 ++++++++++++++
 hook.h                        | 15 ++++++++++
 t/t1360-config-based-hooks.sh | 43 ++++++++++++++++++++++++++++
 5 files changed, 135 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/hook.txt b/Documentation/config/hook.txt
index 71449ecbc7..75312754ae 100644
--- a/Documentation/config/hook.txt
+++ b/Documentation/config/hook.txt
@@ -7,3 +7,8 @@ hookcmd.<name>.command::
 	A command to execute during a hook for which <name> has been specified
 	as a command. This can be an executable on your device or a oneliner for
 	your shell. See linkgit:git-hook[1].
+
+hook.runHookDir::
+	Controls how hooks contained in your hookdir are executed. Can be any of
+	"yes", "warn", "interactive", or "no". Defaults to "yes". See
+	linkgit:git-hook[1] and linkgit:git-config[1] "core.hooksPath").
diff --git a/builtin/hook.c b/builtin/hook.c
index a0013ae4d7..d087e6f5b0 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -11,6 +11,8 @@ static const char * const builtin_hook_usage[] = {
 	NULL
 };
 
+static enum hookdir_opt should_run_hookdir;
+
 static int list(int argc, const char **argv, const char *prefix)
 {
 	struct list_head *head, *pos;
@@ -41,6 +43,26 @@ static int list(int argc, const char **argv, const char *prefix)
 		return 0;
 	}
 
+	switch (should_run_hookdir) {
+		case HOOKDIR_NO:
+			strbuf_addstr(&hookdir_annotation, _(" (will not run)"));
+			break;
+		case HOOKDIR_INTERACTIVE:
+			strbuf_addstr(&hookdir_annotation, _(" (will prompt)"));
+			break;
+		case HOOKDIR_WARN:
+		case HOOKDIR_UNKNOWN:
+			strbuf_addstr(&hookdir_annotation, _(" (will warn)"));
+			break;
+		case HOOKDIR_YES:
+		/*
+		 * The default behavior should agree with
+		 * hook.c:configured_hookdir_opt().
+		 */
+		default:
+			break;
+	}
+
 	list_for_each(pos, head) {
 		item = list_entry(pos, struct hook, list);
 		if (item) {
@@ -64,16 +86,42 @@ static int list(int argc, const char **argv, const char *prefix)
 
 int cmd_hook(int argc, const char **argv, const char *prefix)
 {
+	const char *run_hookdir = NULL;
+
 	struct option builtin_hook_options[] = {
+		OPT_STRING(0, "run-hookdir", &run_hookdir, N_("option"),
+			   N_("what to do with hooks found in the hookdir")),
 		OPT_END(),
 	};
-	if (argc < 2)
+
+	argc = parse_options(argc, argv, prefix, builtin_hook_options,
+			     builtin_hook_usage, 0);
+
+	/* after the parse, we should have "<command> <hookname> <args...>" */
+	if (argc < 1)
 		usage_with_options(builtin_hook_usage, builtin_hook_options);
 
 	git_config(git_default_config, NULL);
 
-	if (!strcmp(argv[1], "list"))
-		return list(argc - 1, argv + 1, prefix);
+
+	/* argument > config */
+	if (run_hookdir)
+		if (!strcmp(run_hookdir, "no"))
+			should_run_hookdir = HOOKDIR_NO;
+		else if (!strcmp(run_hookdir, "yes"))
+			should_run_hookdir = HOOKDIR_YES;
+		else if (!strcmp(run_hookdir, "warn"))
+			should_run_hookdir = HOOKDIR_WARN;
+		else if (!strcmp(run_hookdir, "interactive"))
+			should_run_hookdir = HOOKDIR_INTERACTIVE;
+		else
+			die(_("'%s' is not a valid option for --run-hookdir "
+			      "(yes, warn, interactive, no)"), run_hookdir);
+	else
+		should_run_hookdir = configured_hookdir_opt();
+
+	if (!strcmp(argv[0], "list"))
+		return list(argc, argv, prefix);
 
 	usage_with_options(builtin_hook_usage, builtin_hook_options);
 }
diff --git a/hook.c b/hook.c
index ffbdcfd987..ed52e85159 100644
--- a/hook.c
+++ b/hook.c
@@ -97,6 +97,27 @@ static int hook_config_lookup(const char *key, const char *value, void *cb_data)
 	return 0;
 }
 
+enum hookdir_opt configured_hookdir_opt(void)
+{
+	const char *key;
+	if (git_config_get_value("hook.runhookdir", &key))
+		return HOOKDIR_YES; /* by default, just run it. */
+
+	if (!strcmp(key, "no"))
+		return HOOKDIR_NO;
+
+	if (!strcmp(key, "yes"))
+		return HOOKDIR_YES;
+
+	if (!strcmp(key, "warn"))
+		return HOOKDIR_WARN;
+
+	if (!strcmp(key, "interactive"))
+		return HOOKDIR_INTERACTIVE;
+
+	return HOOKDIR_UNKNOWN;
+}
+
 struct list_head* hook_list(const struct strbuf* hookname)
 {
 	struct strbuf hook_key = STRBUF_INIT;
diff --git a/hook.h b/hook.h
index 5750634c83..ccdf6272f2 100644
--- a/hook.h
+++ b/hook.h
@@ -21,6 +21,21 @@ struct hook
  */
 struct list_head* hook_list(const struct strbuf *hookname);
 
+enum hookdir_opt
+{
+	HOOKDIR_NO,
+	HOOKDIR_WARN,
+	HOOKDIR_INTERACTIVE,
+	HOOKDIR_YES,
+	HOOKDIR_UNKNOWN,
+};
+
+/*
+ * Provides the hookdir_opt specified in the config without consulting any
+ * command line arguments.
+ */
+enum hookdir_opt configured_hookdir_opt(void);
+
 /* Free memory associated with a 'struct hook' */
 void free_hook(struct hook *ptr);
 /* Empties the list at 'head', calling 'free_hook()' on each entry */
diff --git a/t/t1360-config-based-hooks.sh b/t/t1360-config-based-hooks.sh
index 0f12af4659..91127a50a4 100755
--- a/t/t1360-config-based-hooks.sh
+++ b/t/t1360-config-based-hooks.sh
@@ -104,4 +104,47 @@ test_expect_success 'git hook list shows hooks from the hookdir' '
 	test_cmp expected actual
 '
 
+test_expect_success 'hook.runHookDir = no is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "no" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will not run)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
+test_expect_success 'hook.runHookDir = warn is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "warn" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will warn)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
+
+test_expect_success 'hook.runHookDir = interactive is respected by list' '
+	setup_hookdir &&
+
+	test_config hook.runHookDir "interactive" &&
+
+	cat >expected <<-EOF &&
+	hookdir: $(pwd)/.git/hooks/pre-commit (will prompt)
+	EOF
+
+	git hook list pre-commit >actual &&
+	# the hookdir annotation is translated
+	test_i18ncmp expected actual
+'
+
 test_done
-- 
2.28.0.rc0.142.g3c755180ce-goog


^ permalink raw reply related	[flat|nested] 170+ messages in thread

* [PATCH v7 06/17] hook: implement hookcmd.<name>.skip
  2020-12-22  0:02         ` [PATCH v7 " Emily Shaffer
                             ` (4 preceding siblings ...)
  2020-12-22  0:02           ` [PATCH v7 05/17] hook: respect hook.runHookDir Emily Shaffer
@ 2020-12-22  0:02           ` Emily Shaffer
  2021-01-31  3:40             ` Jonathan Tan
  2020-12-22  0:02           ` [PATCH v7 07/17] parse-options: parse into strvec Emily Shaffer
                             ` (14 subsequent siblings)
  20 siblings, 1 reply; 170+ messages in thread
From: Emily Shaffer @ 2020-12-22  0:02