* [RFC] Hook management via 'git hooks' command @ 2019-11-16 1:11 Emily Shaffer 2019-11-16 5:45 ` brian m. carlson 0 siblings, 1 reply; 12+ messages in thread From: Emily Shaffer @ 2019-11-16 1:11 UTC (permalink / raw) To: git Cc: brian m. carlson, Jonathan Nieder, Ævar Arnfjörð Bjarmason Hi all. The topic of multihook support has come up before; here are a handful of conversations I could find on it: brian m. carlson's proposal for multiple hooks via .git/hooks/pre-commit.d/: https://public-inbox.org/git/20190424004948.728326-1-sandals@crustytoothpaste.net/ Jonathan Nieder's discussion of hooks as an attack vector: https://public-inbox.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/ Ævar Arnfjörð Bjarmason's RFC for multihooks in a similar manner to brian's: https://public-inbox.org/git/CACBZZX6j6q2DUN_Z-Pnent1u714dVNPFBrL_PiEQyLmCzLUVxg@mail.gmail.com/ As I consider these conversations, I think about what goals we are aiming for: 1. Let a user run more than one hook on a single event. 2. Let a user determine the order of those hooks. 3. Let a user version-control their hooks without kludging. (This is actually largely solved by core.hookName, thanks to Ævar.) 4. Make it hard for a user to run a hook they did not expect to run. 5. Don't break users who currently have hooks in $REPO/.git/hooks, but provide a path for them to stop using hooks from that directory. It seems that brian's thread stalled out in v2, but as I understand it, the method was: - For each codepath which executes a hook: - Check whether .git/hooks/${hookname} is valid (exists, executable) - If not, check whether .git/hooks/${hookname}.d exists, and run each script within, in alphabetical order. This approach seems similar to one that Ævar suggested in their RFC linked above. Benefits of that approach include: - A familiar Unixlike interface - Previously configured hooks will Just Work Drawbacks of that approach include: - Still an attack vector, unless core.hooksDirectory is set - Still need to configure a hook for each repo, unless core.hooksDirectory is set - Hooks are still not version controlled, unless core.hooksDirectory is set This meets goals 1, 2, and 5. Using core.hookPath makes this approach meet 3 as well. An alternate suggestion from Jonathan Nieder seems to look something like: - Learn a core.${hookname} config. - Teach all hook executing codepaths to check core.${hookname} - Either add multiple core.${hookname} entries and order ...TBD? or put lots of hooks on a single config entry and run in order on that line. Benefits of the config approach include: - Easy to configure global, repo-wide, worktree-wide hooks - Long-term, the decision of which hooks to run is left entirely to the user - The hooks themselves can be version controlled, even distributed with the repo (a user could specify ${repopath}/hooks/pre-push-lint if they wanted, and we chose to allow it) Drawbacks of the config approach include: - Different enough from the current approach to be unusual. Coming away from the transition period will require some thought. - The ordering of hooks may be confusing. - It may be difficult to choose _not_ to run a global hook from a local project. This meets goals 3, 4, and 5; it could meet 1 and 2 but needs to be fleshed out more. Meanwhile, I have some other concerns when I read through the source: - I don't think I saw a codified list of all the possible hooks. Meaning, find_hook() is called with a freeform string, not from some enum, and it's hard to programmatically list all places where a hook could be used. This came up when I was looking at bugreport; even the template ${hookname}.sample files prepopulated in .git/hooks aren't a comprehensive list of all hooks mentioned in `git help hooks`. This feels like a lack of assurance that all hooks are well-documented. Please do let me know if I missed something in my attempt to frame the "story so far". Here's my suggestion. - Let configs handle the hook list and ordering, via 'hook.{hookname}' which can be specified more than once. - global config: hook.update = /path/to/firsthook user config: hook.update = /path/to/secondhook worktree config: hook.update = -/path/to/firsthook #eliminate the firsthook call - global config: hook.update = /path/to/firsthook repo config: hook.update = /path/to/secondhook repo config: hook.update = ^/path/to/firsthook #move firsthook execution after secondhook for this project - Let users decide whether they want to check core.hookDir in addition to their config, instead of their config, or not at all, via a config 'hook.runHookDir' "hookdir-first", "hookdir-only", "config-only", etc. - Let users ask to be notified if running a hook from .git/hooks via a config 'hook.warnHookDir'. (Mentioning .git/hooks rather than core.hookDir is intentional here.) Alternatively, ask for 'hook.warn' with values "hookdir", "all" depending on how trusting the user feels. - If we want to phase out .git/hooks, we can make this notification opt-out instead of opt-in, someday. - between runHookDir and warnHookDir, users are able to phase out .git/hooks scripts at their own pace. - Abstract most of this via a 'git hooks' command. - 'git hooks add <hookname> [--<scope>] <path>' to append another hook; let the scope setting match 'git config'. - 'git hooks show [<hookname>]' to indicate which hooks will be run, either on a specific event or for all events, and in which order, as well as each hook's config origin - 'git hooks edit <hookname>' to modify the hook order, or choose not to run a hook locally - By managing it through a command, we can reorder the contents of config files if it makes sense to do so. - Add a hook library. - Optionally specify all hook events via enum, so we aren't string-typing calls to find_hook() anymore. - Resolve configs into a list of hooks to run by concatenating them together in config precedence order. - Also allow configs formatted like "-<path>" to remove an earlier-configured invocation of that hook, or "^<path>" to move the earlier-configured invocation to this point in the execution order - Move the find_hook() implementation to here, to account for the multihook ordering I think it reaches the goals I mentioned: 1. User can specify more than one hook per event by adding more configs or running 'git hooks add <event>'. 2. User can determine the order by reordering their configs in file, using "^/hookpath" notation, or using 'git hooks edit'. 3. User can point to a hook script which is checked into the repo they want it to run on, another repo in their filesystem, or anywhere at all. Contrary to the behavior of core.hookDir, users can manage each hook in a different repo if they wish (maybe the git-secrets hooks all live in the git-secrets repo, and the Gerrit Change-Id hook lives in the gerrit repo, and each project's pre-commit formatting hooks live in their own projects' repos). 4. Users who have "bought in" to the config-based hook management can ask to be warned if a hook is run from .git/hooks, and can turn off running hooks from .git/hooks entirely if they wish. 5. By defaulting to hook.runHookDir=hookdir-first or hookdir-only, we can avoid breaking users who are using .git/hooks/ today. The former gives value-add - the user can start using config hooks seamlessly - while the latter gives old functionality only. The other benefits I can think of: - Having a friendly porcelain command to manage hooks makes life easier for the Git and Unix uninitiated. - If 'git hooks edit' UI looks like 'git rebase --interactive' UI, then users may already feel at home using it. - If organizations are already distributing global configs, they can also distribute hooks such as git-secrets or other internal hooks if they wish. - Users who want to use a hook that comes with a repo may still be subject to attack if the hook shipped by the repo changes under their feet (although they have explicitly asked Git to run that hook, so the threat doesn't feel different from running a shell script from a repo manually). The drawbacks I can think of: - It is a much more complicated approach than adding .git/hooks/${hookname}.d. - The config syntax can clutter the output if users want to examine their config with 'git config --list'. - It is still possible for someone to attack via a zip file if they manipulate .git/config to use a hook they specified in the repo. The issue with .git/config is still a pretty big one, but my own thought is that we should treat it as a separate piece. Yes, those kinds of attacks are still possible as long as either .git/config or .git/hooks are present in the repo directory. But coding around .git/config will lead to some cruft that won't make sense once repo-local config files are made more secure (for example, one solution might be to ignore local and worktree configs when evaluating the hooks list, which would be inconvenient and no longer make sense if those config scopes are secured). I look forward to the discussion. Thanks for your thoughts. - Emily ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] Hook management via 'git hooks' command 2019-11-16 1:11 [RFC] Hook management via 'git hooks' command Emily Shaffer @ 2019-11-16 5:45 ` brian m. carlson 2019-11-18 22:38 ` Emily Shaffer 0 siblings, 1 reply; 12+ messages in thread From: brian m. carlson @ 2019-11-16 5:45 UTC (permalink / raw) To: Emily Shaffer Cc: git, Jonathan Nieder, Ævar Arnfjörð Bjarmason [-- Attachment #1: Type: text/plain, Size: 4230 bytes --] On 2019-11-16 at 01:11:25, Emily Shaffer wrote: > Here's my suggestion. > > - Let configs handle the hook list and ordering, via 'hook.{hookname}' which > can be specified more than once. > - global config: hook.update = /path/to/firsthook > user config: hook.update = /path/to/secondhook > worktree config: hook.update = -/path/to/firsthook #eliminate the firsthook > call > - global config: hook.update = /path/to/firsthook > repo config: hook.update = /path/to/secondhook > repo config: hook.update = ^/path/to/firsthook #move firsthook execution > after secondhook for this project I'd like to hear more about how we handle multiple hooks that are repo-specific and don't live in the PATH. This is a common situation for existing tools that handle multiple hooks, and it's one I think we should support. Perhaps we add a "git hook execute" subcommand that executes scripts from the hook directory. > - Let users decide whether they want to check core.hookDir in addition to their > config, instead of their config, or not at all, via a config > 'hook.runHookDir' "hookdir-first", "hookdir-only", "config-only", etc. > - Let users ask to be notified if running a hook from .git/hooks via a config > 'hook.warnHookDir'. (Mentioning .git/hooks rather than core.hookDir is > intentional here.) Alternatively, ask for 'hook.warn' with values "hookdir", > "all" depending on how trusting the user feels. > - If we want to phase out .git/hooks, we can make this notification opt-out > instead of opt-in, someday. > - between runHookDir and warnHookDir, users are able to phase out > .git/hooks scripts at their own pace. > - Abstract most of this via a 'git hooks' command. > - 'git hooks add <hookname> [--<scope>] <path>' to append another hook; > let the scope setting match 'git config'. > - 'git hooks show [<hookname>]' to indicate which hooks will be run, either > on a specific event or for all events, and in which order, as well as each > hook's config origin > - 'git hooks edit <hookname>' to modify the hook order, or choose not to run > a hook locally > - By managing it through a command, we can reorder the contents of config > files if it makes sense to do so. > - Add a hook library. > - Optionally specify all hook events via enum, so we aren't string-typing > calls to find_hook() anymore. > - Resolve configs into a list of hooks to run by concatenating them together > in config precedence order. > - Also allow configs formatted like "-<path>" to remove an > earlier-configured invocation of that hook, or "^<path>" to move the > earlier-configured invocation to this point in the execution order > - Move the find_hook() implementation to here, to account for the multihook > ordering I think this addresses most of the concerns that I had about ordering. It is still a little suboptimal that we're relying on the ordering of the config file, since it makes things equivalent to numbered files in .d directories hard. Possibly as an alternative to the ^ syntax, we could make the hook value be of the form "key program", where key is a sort key (e.g., a number) and program is the program to run. We pick normal config file ordering if the keys are identical. Then if the system config wants to have a hook that always runs at the end, it can do so easily. In addition, we should be sure that attempting to remove a hook which doesn't exist isn't an error, since a user might want to set their ~/.gitconfig file to always unset a global hook that may or may not exist. We also need to address exit codes with multiple hooks and whether we fail fast or not. My series may provide some valuable options here, or we may choose to go with a single, simpler approach. Whatever we do, we should document the behavior clearly. Overall, though, I think this approach has a lot of potential and I feel positive about it. Thanks for bringing this up again in a productive and helpful way. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 868 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] Hook management via 'git hooks' command 2019-11-16 5:45 ` brian m. carlson @ 2019-11-18 22:38 ` Emily Shaffer 2019-11-19 0:51 ` brian m. carlson 0 siblings, 1 reply; 12+ messages in thread From: Emily Shaffer @ 2019-11-18 22:38 UTC (permalink / raw) To: brian m. carlson, git, Jonathan Nieder, Ævar Arnfjörð Bjarmason On Sat, Nov 16, 2019 at 05:45:01AM +0000, brian m. carlson wrote: > On 2019-11-16 at 01:11:25, Emily Shaffer wrote: > > Here's my suggestion. > > > > - Let configs handle the hook list and ordering, via 'hook.{hookname}' which > > can be specified more than once. > > - global config: hook.update = /path/to/firsthook > > user config: hook.update = /path/to/secondhook > > worktree config: hook.update = -/path/to/firsthook #eliminate the firsthook > > call > > - global config: hook.update = /path/to/firsthook > > repo config: hook.update = /path/to/secondhook > > repo config: hook.update = ^/path/to/firsthook #move firsthook execution > > after secondhook for this project > > I'd like to hear more about how we handle multiple hooks that are > repo-specific and don't live in the PATH. This is a common situation > for existing tools that handle multiple hooks, and it's one I think we > should support. I guess I'm confused about where PATH comes into play. Do you mean that the hook being run relies on PATH to be set appropriately? I had envisioned absolute paths in the config. > Perhaps we add a "git hook execute" subcommand that executes scripts > from the hook directory. Can you give an example of when you'd use it? I'm not understanding your concern and I think an example use case would help me see what you mean. > > > - Let users decide whether they want to check core.hookDir in addition to their > > config, instead of their config, or not at all, via a config > > 'hook.runHookDir' "hookdir-first", "hookdir-only", "config-only", etc. > > - Let users ask to be notified if running a hook from .git/hooks via a config > > 'hook.warnHookDir'. (Mentioning .git/hooks rather than core.hookDir is > > intentional here.) Alternatively, ask for 'hook.warn' with values "hookdir", > > "all" depending on how trusting the user feels. > > - If we want to phase out .git/hooks, we can make this notification opt-out > > instead of opt-in, someday. > > - between runHookDir and warnHookDir, users are able to phase out > > .git/hooks scripts at their own pace. > > - Abstract most of this via a 'git hooks' command. > > - 'git hooks add <hookname> [--<scope>] <path>' to append another hook; > > let the scope setting match 'git config'. > > - 'git hooks show [<hookname>]' to indicate which hooks will be run, either > > on a specific event or for all events, and in which order, as well as each > > hook's config origin > > - 'git hooks edit <hookname>' to modify the hook order, or choose not to run > > a hook locally > > - By managing it through a command, we can reorder the contents of config > > files if it makes sense to do so. > > - Add a hook library. > > - Optionally specify all hook events via enum, so we aren't string-typing > > calls to find_hook() anymore. > > - Resolve configs into a list of hooks to run by concatenating them together > > in config precedence order. > > - Also allow configs formatted like "-<path>" to remove an > > earlier-configured invocation of that hook, or "^<path>" to move the > > earlier-configured invocation to this point in the execution order > > - Move the find_hook() implementation to here, to account for the multihook > > ordering > > I think this addresses most of the concerns that I had about ordering. > It is still a little suboptimal that we're relying on the ordering of > the config file, since it makes things equivalent to numbered files in > .d directories hard. Hm, I suppose I don't see why, if the final ordering is determined by the .git/config (or future replacement for that). Can you explain what you mean? I want to understand where you're coming from. > > Possibly as an alternative to the ^ syntax, we could make the hook value > be of the form "key program", where key is a sort key (e.g., a number) > and program is the program to run. We pick normal config file ordering > if the keys are identical. Then if the system config wants to have a > hook that always runs at the end, it can do so easily. Interesting. This way if you decide after you've set up all your configs just so that you really want something to run at the end of the update event, you can change one place, not n=number of Git repos. (I do still want to be able to say "don't run that global hook in this project" though.) > > In addition, we should be sure that attempting to remove a hook which > doesn't exist isn't an error, since a user might want to set their > ~/.gitconfig file to always unset a global hook that may or may not > exist. I'd be comfortable with a warning when exiting 'git hook edit' mode and silence when actually running the hook list. > > We also need to address exit codes with multiple hooks and whether we > fail fast or not. My series may provide some valuable options here, or > we may choose to go with a single, simpler approach. Whatever we do, we > should document the behavior clearly. Agree. I'll take a look at your series this week to see how you solved it; this feels like a place where it'd be easy to give too many knobs that users would never pull. I was going to write my gut feeling about what we should do, but my gut came up with reasonable arguments for lots of approaches. So I'll read what came first before I say anything more :) > > Overall, though, I think this approach has a lot of potential and I feel > positive about it. Thanks for bringing this up again in a productive > and helpful way. Thanks for being open minded about it. - Emily ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] Hook management via 'git hooks' command 2019-11-18 22:38 ` Emily Shaffer @ 2019-11-19 0:51 ` brian m. carlson 2019-11-23 1:19 ` Emily Shaffer 0 siblings, 1 reply; 12+ messages in thread From: brian m. carlson @ 2019-11-19 0:51 UTC (permalink / raw) To: Emily Shaffer Cc: git, Jonathan Nieder, Ævar Arnfjörð Bjarmason [-- Attachment #1: Type: text/plain, Size: 5778 bytes --] On 2019-11-18 at 22:38:19, Emily Shaffer wrote: > On Sat, Nov 16, 2019 at 05:45:01AM +0000, brian m. carlson wrote: > > On 2019-11-16 at 01:11:25, Emily Shaffer wrote: > > > Here's my suggestion. > > > > > > - Let configs handle the hook list and ordering, via 'hook.{hookname}' which > > > can be specified more than once. > > > - global config: hook.update = /path/to/firsthook > > > user config: hook.update = /path/to/secondhook > > > worktree config: hook.update = -/path/to/firsthook #eliminate the firsthook > > > call > > > - global config: hook.update = /path/to/firsthook > > > repo config: hook.update = /path/to/secondhook > > > repo config: hook.update = ^/path/to/firsthook #move firsthook execution > > > after secondhook for this project > > > > I'd like to hear more about how we handle multiple hooks that are > > repo-specific and don't live in the PATH. This is a common situation > > for existing tools that handle multiple hooks, and it's one I think we > > should support. > > I guess I'm confused about where PATH comes into play. Do you mean that > the hook being run relies on PATH to be set appropriately? I had > envisioned absolute paths in the config. In past discussions, there's been an assumption that hooks in the config will be found in PATH if they're not specified explicitly, and I assumed (apparently incorrectly) that the same would be true here. I do expect folks are going to want to use non-absolute paths, though. If I'm invoking the git binary in a hook, I don't care whether it exists in /usr/bin, /usr/local/bin, ~/bin, or somewhere else entirely. That's my shell's problem to figure out. It's also common for folks to use something like "bundle exec" in a hook to run a linter that's installed by the local package manager, and in order to do that, you have to honor PATH to find the package manager's binary. That could be in a variety of places, and it could end up changing dynamically in a session due to a tool like RVM. > > Perhaps we add a "git hook execute" subcommand that executes scripts > > from the hook directory. > > Can you give an example of when you'd use it? I'm not understanding your > concern and I think an example use case would help me see what you mean. Sure. Currently, if I have pre-push hook, it lives in .git/hooks/pre-push. Now, I want to have multiple hooks for that which are specific to my repo. Maybe I've stuffed them into .git/hooks/pre-push.d/hook1 and .git/hooks/pre-push.d/hook2, since that's where my previous hook management system put them, but I now want to use those same hooks with the config style and drop the old tool. I'd like to use "git hook execute pre-push.d/hook1" and "git hook execute pre-push.d/hook2" to automatically find the right hooks and invoke them. Similarly, I could use "git hook execute pre-push" to execute the old pre-push hook. I suppose if we continue to keep the existing behavior of changing the directory and we pass the config options to the shell, then we could just write "$(git config core.hooksPath || echo .git/hooks)/pre-push.d/hook1" instead, which, while ugly, gets the job done. Then we wouldn't need such a command. > > I think this addresses most of the concerns that I had about ordering. > > It is still a little suboptimal that we're relying on the ordering of > > the config file, since it makes things equivalent to numbered files in > > .d directories hard. > > Hm, I suppose I don't see why, if the final ordering is determined by > the .git/config (or future replacement for that). Can you explain what > you mean? I want to understand where you're coming from. One of the benefits to using numbered files in a .d directory is that you can explicitly control ordering of operations. For example, maybe I have a per-repo pre-push hook that performs some checks and rejects a push if something is off. I also have a pre-push hook for Git LFS that pushes the Git LFS objects to the remote server if Git LFS is in use. In this case, I'd always want my sanity-check hook to run first, and so I'd number it first. This is fine if both are per-repo, but if the LFS hook is global, then it's in the wrong order and my LFS objects are pushed even though my sanity check failed. > > Possibly as an alternative to the ^ syntax, we could make the hook value > > be of the form "key program", where key is a sort key (e.g., a number) > > and program is the program to run. We pick normal config file ordering > > if the keys are identical. Then if the system config wants to have a > > hook that always runs at the end, it can do so easily. > > Interesting. This way if you decide after you've set up all your configs > just so that you really want something to run at the end of the update > event, you can change one place, not n=number of Git repos. (I do still > want to be able to say "don't run that global hook in this project" > though.) Exactly. A global or per-user commit-msg hook may want to see the final message before approving or rejecting it, and that wouldn't be possible without some sort of ordering. I strongly agree that we should still allow removing higher-level hooks. > > In addition, we should be sure that attempting to remove a hook which > > doesn't exist isn't an error, since a user might want to set their > > ~/.gitconfig file to always unset a global hook that may or may not > > exist. > > I'd be comfortable with a warning when exiting 'git hook edit' mode and > silence when actually running the hook list. Yeah, that's what I'm going for. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 868 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] Hook management via 'git hooks' command 2019-11-19 0:51 ` brian m. carlson @ 2019-11-23 1:19 ` Emily Shaffer 2019-11-25 3:04 ` brian m. carlson 0 siblings, 1 reply; 12+ messages in thread From: Emily Shaffer @ 2019-11-23 1:19 UTC (permalink / raw) To: brian m. carlson, git, Jonathan Nieder, Ævar Arnfjörð Bjarmason On Tue, Nov 19, 2019 at 12:51:36AM +0000, brian m. carlson wrote: > On 2019-11-18 at 22:38:19, Emily Shaffer wrote: > > On Sat, Nov 16, 2019 at 05:45:01AM +0000, brian m. carlson wrote: > > > On 2019-11-16 at 01:11:25, Emily Shaffer wrote: > > > > Here's my suggestion. > > > > > > > > - Let configs handle the hook list and ordering, via 'hook.{hookname}' which > > > > can be specified more than once. > > > > - global config: hook.update = /path/to/firsthook > > > > user config: hook.update = /path/to/secondhook > > > > worktree config: hook.update = -/path/to/firsthook #eliminate the firsthook > > > > call > > > > - global config: hook.update = /path/to/firsthook > > > > repo config: hook.update = /path/to/secondhook > > > > repo config: hook.update = ^/path/to/firsthook #move firsthook execution > > > > after secondhook for this project > > > > > > I'd like to hear more about how we handle multiple hooks that are > > > repo-specific and don't live in the PATH. This is a common situation > > > for existing tools that handle multiple hooks, and it's one I think we > > > should support. > > > > I guess I'm confused about where PATH comes into play. Do you mean that > > the hook being run relies on PATH to be set appropriately? I had > > envisioned absolute paths in the config. > > In past discussions, there's been an assumption that hooks in the config > will be found in PATH if they're not specified explicitly, and I assumed > (apparently incorrectly) that the same would be true here. Ah, I think I see what you mean. hook.update = security-heuristic-runner where "security-heuristic-runner" is some compiled binary your employer purchased from some vendor and distributed directly to your `/bin/`. No, I had imagined users would achieve that by writing: hook.update = /bin/security-heuristic-runner > I do expect folks are going to want to use non-absolute paths, though. > If I'm invoking the git binary in a hook, I don't care whether it exists > in /usr/bin, /usr/local/bin, ~/bin, or somewhere else entirely. That's > my shell's problem to figure out. Hm. Do you mean: hook.update = git grep "something" Or do you mean: hook.update = ~/grephook.sh grephook.sh: #!/bin/bash git grep "something" >output ... do something with output ... I suppose I need to understand better how $PATH works with the latter scenario, but my gut says "if you didn't worry about where to find the Git binary from your script before, why are you going to start caring now". This led me to wonder: "Should we allow someone to pass arguments via the hook config?" Or to put it another way, "Should we allow 'hook.update = grep -Rin blahblah >audit.log'?" I think the answer is no - some hooks do expect to be given arguments, for example sequencer.c:run_prepare_commit_msg_hook(). > > It's also common for folks to use something like "bundle exec" in a hook > to run a linter that's installed by the local package manager, and in > order to do that, you have to honor PATH to find the package manager's > binary. That could be in a variety of places, and it could end up > changing dynamically in a session due to a tool like RVM. I imagine I'm missing something crucial here about why this isn't an issue in today's hook implementation, but it would be an issue with a config-based hook lookup. I don't think I see why the invocation would be any different, except today find_hook() says "is it in $GITDIR/hooks/" and tomorrow find_hook() would say "is it in config". > > > > Perhaps we add a "git hook execute" subcommand that executes scripts > > > from the hook directory. > > > > Can you give an example of when you'd use it? I'm not understanding your > > concern and I think an example use case would help me see what you mean. > > Sure. Currently, if I have pre-push hook, it lives in > .git/hooks/pre-push. Now, I want to have multiple hooks for that which > are specific to my repo. Maybe I've stuffed them into > .git/hooks/pre-push.d/hook1 and .git/hooks/pre-push.d/hook2, since > that's where my previous hook management system put them, but I now want > to use those same hooks with the config style and drop the old tool. > > I'd like to use "git hook execute pre-push.d/hook1" and "git hook > execute pre-push.d/hook2" to automatically find the right hooks and > invoke them. Similarly, I could use "git hook execute pre-push" to > execute the old pre-push hook. Hmmm. I think you're describing a scenario like this: 1. I make .git/hooks/pre-push.d/hook1, .git/hooks/pre-push.d/hook2, and .git/hooks/pre-push. 2. Hook magic happens upstream, and now instead of living in .git/hooks/, my hooks live in ~/.githooks/<reponame>/. 3. I just want to run my hooks now, but where are they? I don't envision a Git-facilitated "mv .git/hooks/ ~/.githooks/<reponame>/" (or whatever). In fact, I think it could break your scenario, because maybe you write "pre-push" to reference hook1 and hook2 by absolute path, or by relative path to the repo directory. I don't like the idea of doing that very much at all, which is why I proposed hook.warnHookDir (or whatever it is I called it), which we can later turn on by default if/when we want to chase people off using .git/hooks/. To your point, though. The most hands-off way to keep your previous setup would be to add: hook.pre-push = /path/to/myrepo/.git/hooks/pre-push which would invoke your springboard script and let you do whatever you want with the contents of .git/hooks/pre-push.d/. The second most hands-off (but more correct) way would be to add: hook.pre-push = /path/to/myrepo/.git/hooks/pre-push.d/hook1 hook.pre-push = /path/to/myrepo/.git/hooks/pre-push.d/hook2 I think we may be talking past each other, because when I hear you describe what you want from 'git hook execute <...>', it sounds like you are asking for 'git hook' to just do: $ $(git config core.hookdir)/$3 and that confuses me because I'm having trouble figuring out why you couldn't just do that yourself. So I'm sure I'm still not understanding something correctly. This is kind of the use case I was hoping to account for by providing 'hook.runHookDir' - hooks in .git/hooks/ still "just work", by taking the contents of that dir into account while composing the list of hooks to run. > > I suppose if we continue to keep the existing behavior of changing the > directory and we pass the config options to the shell, then we could > just write "$(git config core.hooksPath || echo > .git/hooks)/pre-push.d/hook1" instead, which, while ugly, gets the job > done. Then we wouldn't need such a command. Yeah, I am wondering about when you want to run a hook generically (i.e. from a noninteractive script) but outside of the context of something in the Git binary invoking a hook. Are you thinking of Git commands implemented as scripts? I wonder if the script use case is better served by something like 'git hook list pre-push --porcelain' which could compose the full list of hooks to run, taking into account hook.runHookDir and provide them in a script-readable way. Or, a simplification on your suggestion: 'git hook execute pre-push" which does the same thing, but runs them all for you too. > > > I think this addresses most of the concerns that I had about ordering. > > > It is still a little suboptimal that we're relying on the ordering of > > > the config file, since it makes things equivalent to numbered files in > > > .d directories hard. > > > > Hm, I suppose I don't see why, if the final ordering is determined by > > the .git/config (or future replacement for that). Can you explain what > > you mean? I want to understand where you're coming from. > > One of the benefits to using numbered files in a .d directory is that > you can explicitly control ordering of operations. For example, maybe I > have a per-repo pre-push hook that performs some checks and rejects a > push if something is off. I also have a pre-push hook for Git LFS that > pushes the Git LFS objects to the remote server if Git LFS is in use. > > In this case, I'd always want my sanity-check hook to run first, and so > I'd number it first. This is fine if both are per-repo, but if the LFS > hook is global, then it's in the wrong order and my LFS objects are > pushed even though my sanity check failed. Yeah, this is really compelling, and also removes the somewhat wonky ^ proposed just below here. I like this idea quite a lot: hook.pre-push = 001:/path/to/sanity-checker I'll have to ponder on the UX of a 'git hook'-facilitated interactive edit of the hook numbering, though. UX is not my strong point :) > > > > Possibly as an alternative to the ^ syntax, we could make the hook value > > > be of the form "key program", where key is a sort key (e.g., a number) > > > and program is the program to run. We pick normal config file ordering > > > if the keys are identical. Then if the system config wants to have a > > > hook that always runs at the end, it can do so easily. > > > > Interesting. This way if you decide after you've set up all your configs > > just so that you really want something to run at the end of the update > > event, you can change one place, not n=number of Git repos. (I do still > > want to be able to say "don't run that global hook in this project" > > though.) > > Exactly. A global or per-user commit-msg hook may want to see the final > message before approving or rejecting it, and that wouldn't be possible > without some sort of ordering. > > I strongly agree that we should still allow removing higher-level hooks. > > > > In addition, we should be sure that attempting to remove a hook which > > > doesn't exist isn't an error, since a user might want to set their > > > ~/.gitconfig file to always unset a global hook that may or may not > > > exist. > > > > I'd be comfortable with a warning when exiting 'git hook edit' mode and > > silence when actually running the hook list. > > Yeah, that's what I'm going for. > -- > brian m. carlson: Houston, Texas, US > OpenPGP: https://keybase.io/bk2204 Thanks for being patient as I wrapped my head around this enough to reply. - Emily ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] Hook management via 'git hooks' command 2019-11-23 1:19 ` Emily Shaffer @ 2019-11-25 3:04 ` brian m. carlson 2019-11-25 22:21 ` Emily Shaffer 0 siblings, 1 reply; 12+ messages in thread From: brian m. carlson @ 2019-11-25 3:04 UTC (permalink / raw) To: Emily Shaffer Cc: git, Jonathan Nieder, Ævar Arnfjörð Bjarmason [-- Attachment #1: Type: text/plain, Size: 4834 bytes --] On 2019-11-23 at 01:19:24, Emily Shaffer wrote: > Ah, I think I see what you mean. > > hook.update = security-heuristic-runner > > where "security-heuristic-runner" is some compiled binary your employer > purchased from some vendor and distributed directly to your `/bin/`. > > No, I had imagined users would achieve that by writing: > > hook.update = /bin/security-heuristic-runner Yeah, that's what I'm looking for. The problem is that, for example, Debian does not guarantee where in PATH a file is. Having a newer Git on RHEL or CentOS systems often involves hacking PATH and LD_LIBRARY_PATH. > Hm. Do you mean: > > hook.update = git grep "something" > > Or do you mean: > > hook.update = ~/grephook.sh > > grephook.sh: > #!/bin/bash > > git grep "something" >output > ... do something with output ... I had intended to include the latter case, but also allow valid hooks with multiple argument support. For example, you could invoke "git lfs pre-push" directly in your hook, and that is a fully functioning pre-push hook, and would require a suitable PATH lookup to find your Git binary. It accepts the additional arguments that pre-push hooks accept; right now we basically do the following instead: ---- #!/bin/sh exec git lfs pre-push "$@" ---- > I suppose I need to understand better how $PATH works with the latter > scenario, but my gut says "if you didn't worry about where to find the > Git binary from your script before, why are you going to start caring > now". > > This led me to wonder: "Should we allow someone to pass arguments via > the hook config?" Or to put it another way, "Should we allow 'hook.update > = grep -Rin blahblah >audit.log'?" I think the answer is no - some hooks > do expect to be given arguments, for example > sequencer.c:run_prepare_commit_msg_hook(). I think what we want to do in this case is just invoke things in the shell with extra arguments, like we do with editors. This means we don't have to handle PATH or anything else; we just invoke the shell and let it handle it. That lets people provide multi-call binaries (like git lfs) that include hooks inside them. I do, however, think we should require folks to have a suitable hook that accepts the right arguments. So "git grep blahblah" isn't a valid hook in most cases, because it doesn't take the right arguments and read the right data from stdin if necessary. > > I suppose if we continue to keep the existing behavior of changing the > > directory and we pass the config options to the shell, then we could > > just write "$(git config core.hooksPath || echo > > .git/hooks)/pre-push.d/hook1" instead, which, while ugly, gets the job > > done. Then we wouldn't need such a command. > > Yeah, I am wondering about when you want to run a hook generically (i.e. > from a noninteractive script) but outside of the context of something in > the Git binary invoking a hook. Are you thinking of Git commands > implemented as scripts? I'm just thinking about existing hook wrappers that invoke multiple scripts at the moment, something like how https://gist.github.com/mjackson/7e602a7aa357cfe37dadcc016710931b works at the moment and how we'd replace that with a config-based model. I think using the shell avoids the entire proposal, because it then becomes trivial to script that in the command and port it over, since we can use my ugly hack above. I think I like that better than "git hook execute" because it's a little more flexible. > > One of the benefits to using numbered files in a .d directory is that > > you can explicitly control ordering of operations. For example, maybe I > > have a per-repo pre-push hook that performs some checks and rejects a > > push if something is off. I also have a pre-push hook for Git LFS that > > pushes the Git LFS objects to the remote server if Git LFS is in use. > > > > In this case, I'd always want my sanity-check hook to run first, and so > > I'd number it first. This is fine if both are per-repo, but if the LFS > > hook is global, then it's in the wrong order and my LFS objects are > > pushed even though my sanity check failed. > > Yeah, this is really compelling, and also removes the somewhat wonky ^ > proposed just below here. I like this idea quite a lot: > > hook.pre-push = 001:/path/to/sanity-checker I think a colon is actually better than my proposal for a space in this regard, but I'm not picky: anything unambiguous is fine. > I'll have to ponder on the UX of a 'git hook'-facilitated interactive > edit of the hook numbering, though. UX is not my strong point :) I also find I'm not great at UX, so I can't be of much help. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 868 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] Hook management via 'git hooks' command 2019-11-25 3:04 ` brian m. carlson @ 2019-11-25 22:21 ` Emily Shaffer 2019-11-25 22:45 ` Emily Shaffer 2019-11-26 0:28 ` brian m. carlson 0 siblings, 2 replies; 12+ messages in thread From: Emily Shaffer @ 2019-11-25 22:21 UTC (permalink / raw) To: brian m. carlson, git, Jonathan Nieder, Ævar Arnfjörð Bjarmason On Mon, Nov 25, 2019 at 03:04:45AM +0000, brian m. carlson wrote: > On 2019-11-23 at 01:19:24, Emily Shaffer wrote: > > Ah, I think I see what you mean. > > > > hook.update = security-heuristic-runner > > > > where "security-heuristic-runner" is some compiled binary your employer > > purchased from some vendor and distributed directly to your `/bin/`. > > > > No, I had imagined users would achieve that by writing: > > > > hook.update = /bin/security-heuristic-runner > > Yeah, that's what I'm looking for. The problem is that, for example, > Debian does not guarantee where in PATH a file is. Having a newer Git > on RHEL or CentOS systems often involves hacking PATH and > LD_LIBRARY_PATH. > > > Hm. Do you mean: > > > > hook.update = git grep "something" > > > > Or do you mean: > > > > hook.update = ~/grephook.sh > > > > grephook.sh: > > #!/bin/bash > > > > git grep "something" >output > > ... do something with output ... > > I had intended to include the latter case, but also allow valid hooks > with multiple argument support. For example, you could invoke "git lfs > pre-push" directly in your hook, and that is a fully functioning > pre-push hook, and would require a suitable PATH lookup to find your Git > binary. It accepts the additional arguments that pre-push hooks accept; > right now we basically do the following instead: > > ---- > #!/bin/sh > > exec git lfs pre-push "$@" > ---- > > > I suppose I need to understand better how $PATH works with the latter > > scenario, but my gut says "if you didn't worry about where to find the > > Git binary from your script before, why are you going to start caring > > now". > > > > This led me to wonder: "Should we allow someone to pass arguments via > > the hook config?" Or to put it another way, "Should we allow 'hook.update > > = grep -Rin blahblah >audit.log'?" I think the answer is no - some hooks > > do expect to be given arguments, for example > > sequencer.c:run_prepare_commit_msg_hook(). > > I think what we want to do in this case is just invoke things in the > shell with extra arguments, like we do with editors. This means we > don't have to handle PATH or anything else; we just invoke the shell and > let it handle it. That lets people provide multi-call binaries (like > git lfs) that include hooks inside them. I think that you are saying we should do the nice equivalent of: system("git lfs pre-push"); and tack the args onto the end. (I suppose that's run-command.h, but I'm trying to use a shorthand that is easy to understand.) It seems like this would solve the PATH issue, yes. However, I feel tentative because of pushback on that approach in the bugreport review. Hmmm. I think this is different, because the user understands that the hook they configure themselves will be invoked on the shell of their choosing. That is, I think run-command.h with "C:\myhook.exe" would still work, no? Will this approach "just work" for Windows, et al.? > I do, however, think we should require folks to have a suitable hook > that accepts the right arguments. So "git grep blahblah" isn't a valid > hook in most cases, because it doesn't take the right arguments and read > the right data from stdin if necessary. I'm not sure how we would guarantee that. Are you suggesting we should try dry running a hook somehow when it's added with 'git hook add'? Even doing that won't stop someone from popping open ~/.gitconfig with nano and adding their hook that doesn't take the right args that way. > > > > I suppose if we continue to keep the existing behavior of changing the > > > directory and we pass the config options to the shell, then we could > > > just write "$(git config core.hooksPath || echo > > > .git/hooks)/pre-push.d/hook1" instead, which, while ugly, gets the job > > > done. Then we wouldn't need such a command. > > > > Yeah, I am wondering about when you want to run a hook generically (i.e. > > from a noninteractive script) but outside of the context of something in > > the Git binary invoking a hook. Are you thinking of Git commands > > implemented as scripts? > > I'm just thinking about existing hook wrappers that invoke multiple > scripts at the moment, something like how > https://gist.github.com/mjackson/7e602a7aa357cfe37dadcc016710931b works > at the moment and how we'd replace that with a config-based model. > > I think using the shell avoids the entire proposal, because it then > becomes trivial to script that in the command and port it over, since we > can use my ugly hack above. Still not sure I understand what the issue was before. Is it that the trampoline script needs to know $PATH to be able to invoke the child hooks in hookname.d? Or it's because the current working directory isn't clear, so a relative path in the trampoline script may not be resolved well? > I think I like that better than "git hook > execute" because it's a little more flexible. > > > > One of the benefits to using numbered files in a .d directory is that > > > you can explicitly control ordering of operations. For example, maybe I > > > have a per-repo pre-push hook that performs some checks and rejects a > > > push if something is off. I also have a pre-push hook for Git LFS that > > > pushes the Git LFS objects to the remote server if Git LFS is in use. > > > > > > In this case, I'd always want my sanity-check hook to run first, and so > > > I'd number it first. This is fine if both are per-repo, but if the LFS > > > hook is global, then it's in the wrong order and my LFS objects are > > > pushed even though my sanity check failed. > > > > Yeah, this is really compelling, and also removes the somewhat wonky ^ > > proposed just below here. I like this idea quite a lot: > > > > hook.pre-push = 001:/path/to/sanity-checker > > I think a colon is actually better than my proposal for a space in this > regard, but I'm not picky: anything unambiguous is fine. Yeah, same. I'm a little worried about a colon because there was some conversation about being able to pick a hook in a repo from a specific ref (i.e. I want to run 'update' hook from 'master' all the time even when I'm examining 'third-party-topic' branch or bisecting 'nasty-bug-branch') and the colon seems to be used handily for ref + path, but I think these details will work themselves out during implementation, so I'm not so very worried. > > > I'll have to ponder on the UX of a 'git hook'-facilitated interactive > > edit of the hook numbering, though. UX is not my strong point :) > > I also find I'm not great at UX, so I can't be of much help. We're in luck on my end as it seems we have a UX team with open office hours who can give us an opinion. I'll try to meet with them next week Thursday; thanks Jonathan Nieder for the pointer. This sounds like we both are pretty close on the same page, so I think I will get started in the coming weeks and see if we can get a mockup to pick at with the implementation details in front of us. - Emily ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] Hook management via 'git hooks' command 2019-11-25 22:21 ` Emily Shaffer @ 2019-11-25 22:45 ` Emily Shaffer 2019-11-26 0:28 ` brian m. carlson 1 sibling, 0 replies; 12+ messages in thread From: Emily Shaffer @ 2019-11-25 22:45 UTC (permalink / raw) To: brian m. carlson, git, Jonathan Nieder, Ævar Arnfjörð Bjarmason > This sounds like we both are pretty close on the same page, so I think I > will get started in the coming weeks and see if we can get a mockup to > pick at with the implementation details in front of us. Hm. To elaborate (and partially as a reminder to myself) I will try to get it done in the following order: 1. Implement 'git hook list <hookname>' which reads all the configs. (User would need to manually add the configs at this stage) 2. (maybe) Implement 'git hook execute <hookname> <arg...>'. This may or may not be useful; I suppose it would be pretty equivalent to: $ git hook list <hookname> | xargs -I% sh % <arg...> 3. Implement config modifiers like 'git hook add', 'git hook edit' etc. My thinking is that we will have a lot of time with 1. in front of us to nitpick how we want the config format to look, how the ordering should go, etc. and it will be a fairly simple implementation. It'll also be "usable" although not in a particularly friendly way in case someone wants to try it and see, in a way that the config modifiers by themselves wouldn't be. - Emily ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] Hook management via 'git hooks' command 2019-11-25 22:21 ` Emily Shaffer 2019-11-25 22:45 ` Emily Shaffer @ 2019-11-26 0:28 ` brian m. carlson 2019-11-26 0:56 ` Emily Shaffer 1 sibling, 1 reply; 12+ messages in thread From: brian m. carlson @ 2019-11-26 0:28 UTC (permalink / raw) To: Emily Shaffer Cc: git, Jonathan Nieder, Ævar Arnfjörð Bjarmason [-- Attachment #1: Type: text/plain, Size: 3825 bytes --] On 2019-11-25 at 22:21:13, Emily Shaffer wrote: > I think that you are saying we should do the nice equivalent of: > > system("git lfs pre-push"); > > and tack the args onto the end. (I suppose that's run-command.h, but I'm > trying to use a shorthand that is easy to understand.) Yeah, I'm proposing we run the command using run-command.c with the use_shell flag, which does something very much like that, except a little bit saner. > It seems like this would solve the PATH issue, yes. However, I feel > tentative because of pushback on that approach in the bugreport review. > Hmmm. I think this is different, because the user understands that the > hook they configure themselves will be invoked on the shell of their > choosing. That is, I think run-command.h with "C:\myhook.exe" would > still work, no? This will always use /bin/sh, as we do for editors. bash on Windows does understand the Windows paths and works correctly in this case, AIUI, so that should be fine. > Will this approach "just work" for Windows, et al.? Yes. Windows ships with bash (or in Portable Git, busybox sh), which is not only required to run the testsuite, but required to invoke editors. > > I do, however, think we should require folks to have a suitable hook > > that accepts the right arguments. So "git grep blahblah" isn't a valid > > hook in most cases, because it doesn't take the right arguments and read > > the right data from stdin if necessary. > > I'm not sure how we would guarantee that. Are you suggesting we should > try dry running a hook somehow when it's added with 'git hook add'? Even > doing that won't stop someone from popping open ~/.gitconfig with nano > and adding their hook that doesn't take the right args that way. We don't need to guarantee that. We just need to document it, so that (a) people write hooks in the expected way and (b) if people don't, we can point them to the documentation and explain why their hooks don't work. I can see people thinking of this as a set of commands that just checks exit statuses, and there's likely to be confusion. > > I'm just thinking about existing hook wrappers that invoke multiple > > scripts at the moment, something like how > > https://gist.github.com/mjackson/7e602a7aa357cfe37dadcc016710931b works > > at the moment and how we'd replace that with a config-based model. > > > > I think using the shell avoids the entire proposal, because it then > > becomes trivial to script that in the command and port it over, since we > > can use my ugly hack above. > > Still not sure I understand what the issue was before. Is it that the > trampoline script needs to know $PATH to be able to invoke the child > hooks in hookname.d? Or it's because the current working directory > isn't clear, so a relative path in the trampoline script may not be > resolved well? I think we need to have either (a) a way to explicitly invoke hooks underneath the hook directory or (b) a shell invocation to allow looking up the hook directory. People want to have per-repository hooks that are not a part of the project, and they need a place to store them, which is logically the hook directory. If we want to allow people to have multiple hooks under something like .git/hooks/pre-push.d, then we need to have a way to wire them up in the configuration using the correct location of the hook directory instead of using a helper script like the one I linked above. We don't know that the hook directory will necessarily be under .git/hooks, so if the user has moved it elsewhere, we'd want to follow that. A relative path would be wrong if the user changed the hook directory to a different location. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 868 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] Hook management via 'git hooks' command 2019-11-26 0:28 ` brian m. carlson @ 2019-11-26 0:56 ` Emily Shaffer 2019-11-26 2:41 ` brian m. carlson 0 siblings, 1 reply; 12+ messages in thread From: Emily Shaffer @ 2019-11-26 0:56 UTC (permalink / raw) To: brian m. carlson, git, Jonathan Nieder, Ævar Arnfjörð Bjarmason On Tue, Nov 26, 2019 at 12:28:06AM +0000, brian m. carlson wrote: > On 2019-11-25 at 22:21:13, Emily Shaffer wrote: > > I think that you are saying we should do the nice equivalent of: > > > > system("git lfs pre-push"); > > > > and tack the args onto the end. (I suppose that's run-command.h, but I'm > > trying to use a shorthand that is easy to understand.) > > Yeah, I'm proposing we run the command using run-command.c with the > use_shell flag, which does something very much like that, except a > little bit saner. > > > It seems like this would solve the PATH issue, yes. However, I feel > > tentative because of pushback on that approach in the bugreport review. > > Hmmm. I think this is different, because the user understands that the > > hook they configure themselves will be invoked on the shell of their > > choosing. That is, I think run-command.h with "C:\myhook.exe" would > > still work, no? > > This will always use /bin/sh, as we do for editors. > > bash on Windows does understand the Windows paths and works correctly in > this case, AIUI, so that should be fine. > > > Will this approach "just work" for Windows, et al.? > > Yes. Windows ships with bash (or in Portable Git, busybox sh), which is > not only required to run the testsuite, but required to invoke editors. > > > > I do, however, think we should require folks to have a suitable hook > > > that accepts the right arguments. So "git grep blahblah" isn't a valid > > > hook in most cases, because it doesn't take the right arguments and read > > > the right data from stdin if necessary. > > > > I'm not sure how we would guarantee that. Are you suggesting we should > > try dry running a hook somehow when it's added with 'git hook add'? Even > > doing that won't stop someone from popping open ~/.gitconfig with nano > > and adding their hook that doesn't take the right args that way. > > We don't need to guarantee that. We just need to document it, so that > (a) people write hooks in the expected way and (b) if people don't, we > can point them to the documentation and explain why their hooks don't > work. I can see people thinking of this as a set of commands that just > checks exit statuses, and there's likely to be confusion. > > > > I'm just thinking about existing hook wrappers that invoke multiple > > > scripts at the moment, something like how > > > https://gist.github.com/mjackson/7e602a7aa357cfe37dadcc016710931b works > > > at the moment and how we'd replace that with a config-based model. > > > > > > I think using the shell avoids the entire proposal, because it then > > > becomes trivial to script that in the command and port it over, since we > > > can use my ugly hack above. > > > > Still not sure I understand what the issue was before. Is it that the > > trampoline script needs to know $PATH to be able to invoke the child > > hooks in hookname.d? Or it's because the current working directory > > isn't clear, so a relative path in the trampoline script may not be > > resolved well? > > I think we need to have either (a) a way to explicitly invoke hooks > underneath the hook directory or (b) a shell invocation to allow looking > up the hook directory. People want to have per-repository hooks that > are not a part of the project, and they need a place to store them, > which is logically the hook directory. > > If we want to allow people to have multiple hooks under something like > .git/hooks/pre-push.d, then we need to have a way to wire them up in the > configuration using the correct location of the hook directory instead > of using a helper script like the one I linked above. > > We don't know that the hook directory will necessarily be under > .git/hooks, so if the user has moved it elsewhere, we'd want to follow > that. A relative path would be wrong if the user changed the hook > directory to a different location. Hopefully I am not beating a dead horse here but I really want to understand. Let me take another guess with examples at what you mean; please correct me! For our purposes, let's assume: .git/hooks/ update update.d/ update-1 update-2 update: #!/bin/bash ./git/hooks/update.d/update-1 && ./git/hooks/update.d/update-2 The goal is to make sure update-1 and update-2 are run when other update hooks happen. With my proposal as is, I see two options: 1) .git/config: hook.runHookDir = true This will run whatever else is in hook.update, and then it will run .git/hooks/update. This is not ideal because hookDir could change, and then the content of update will be incorrect: git config --add core.hookdir=~/hooks/ mv .git/hooks/update* ~/hooks/ # call something which invokes update hooks # ~/hooks/update fails, .git/hooks/update-1 is gone :( 2) .git/config: hook.update = 001:/project/path/.git/hooks/update.d/update-1 hook.update = 002:/project/path/.git/hooks/update.d/update-2 This will run each update hook individually and have no notion of whether they're in a "hook dir" or not. It sees a path, it hands it to 'sh' to run, it checks the return code. Now I run 'mv .git/hooks/update* ~/hooks/'. Next time I invoke something which would run the 'update' hook, it fails, because the path in the config isn't pointing to anything. But this way is unrelated to hookdir. It sounds like you might be asking for something like: .git/config: hook.update = 001:__HOOKDIR__/update.d/update-1 I'm not sure that I like the idea of this. My own dream is to eliminate the need for a hookdir entirely, so it's logically easy for my Gerrit hook to live in ~/gerrit-tools/ and my linter to live in ~/clang-tidy/ and so on. I could see option 1 being alleviated by writing it as something like: update: $GIT_HOOKDIR/update.d/update-1 && $GIT_HOOKDIR/update.d/update-2 or update: $(git config core.hookdir)/update.d/update-1 && $(git config core.hookdir)/update.d/update-2 But, I think once you "buy in" to the idea of providing a full path to the final target - not to a trampoline script - in your config, you should forget about the idea of "core.hookdir" having anything to do with it. To quote you out-of-order now: > If we want to allow people to have multiple hooks under something like > .git/hooks/pre-push.d, then we need to have a way to wire them up in the > configuration using the correct location of the hook directory instead > of using a helper script like the one I linked above. I think I may spot the misunderstanding. It sounds like you hope someone can provide "hook.update=001:~/update.d/" and have all the contents of update.d executed. I'll be clear and say that I didn't have the intention to support that at all; instead I was hoping to support a case like 2. above. Recursing through directories like that sounds scary to order. Hmm. Maybe it makes slightly more sense to support something like 'git hook add update ~/update.d', which provides an editor where you can add ordering numbers and config scope to each, but resolves to a config which looks like option 2. above. Are you pushing hard for this update.d case in the hopes that someone can 'set and forget' a hook directory like that, and be able to add new hooks there without changing the configuration? It sounds tricky to me, as we then get to deal with more questions like: - how do we order hooks in that directory? - how should 'git hook list' display those hooks? - how would you use 'git hook' to reorder those hooks? - do we really want a user to 'git pull' and be running an entirely new script they didn't mean to, without any additional interaction? (and maybe we do - 'git pull' in a repo where you have asked to run hooks from, and then hook behavior changes, might not actually be surprising.) - Emily ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] Hook management via 'git hooks' command 2019-11-26 0:56 ` Emily Shaffer @ 2019-11-26 2:41 ` brian m. carlson 2019-12-02 23:46 ` Emily Shaffer 0 siblings, 1 reply; 12+ messages in thread From: brian m. carlson @ 2019-11-26 2:41 UTC (permalink / raw) To: Emily Shaffer Cc: git, Jonathan Nieder, Ævar Arnfjörð Bjarmason [-- Attachment #1: Type: text/plain, Size: 7138 bytes --] On 2019-11-26 at 00:56:14, Emily Shaffer wrote: > Hopefully I am not beating a dead horse here but I really want to > understand. Let me take another guess with examples at what you mean; > please correct me! > > For our purposes, let's assume: > > .git/hooks/ > update > update.d/ > update-1 > update-2 > > update: > #!/bin/bash > > ./git/hooks/update.d/update-1 && > ./git/hooks/update.d/update-2 > > The goal is to make sure update-1 and update-2 are run when other update > hooks happen. > With my proposal as is, I see two options: > > 1) > .git/config: > hook.runHookDir = true > > This will run whatever else is in hook.update, and then it will run > .git/hooks/update. This is not ideal because hookDir could change, and > then the content of update will be incorrect: > > git config --add core.hookdir=~/hooks/ > mv .git/hooks/update* ~/hooks/ > # call something which invokes update hooks > # ~/hooks/update fails, .git/hooks/update-1 is gone :( > This is actually fine. We assume the user knows where they want to store their hooks. If they change that, then that's intentional. > .git/config: > hook.update = 001:/project/path/.git/hooks/update.d/update-1 > hook.update = 002:/project/path/.git/hooks/update.d/update-2 > > This will run each update hook individually and have no notion of > whether they're in a "hook dir" or not. It sees a path, it hands it to > 'sh' to run, it checks the return code. Correct. This is the logical porting of the above shell script to the config syntax if you use an absolute path. > Now I run 'mv .git/hooks/update* ~/hooks/'. Next time I invoke something > which would run the 'update' hook, it fails, because the path in the > config isn't pointing to anything. But this way is unrelated to hookdir. Yes, that's correct. It sounds like you're thinking of the config approach as completely orthogonal to the hook directory. However, I want to have multiple per-repository hooks that do not live within the repository but move with it. The logical place to store those hooks is in the hook directory, even if it's not being invoked by Git explicitly. To use that with the config approach so I can have multiple hooks in a useful way that's compatible with other tools, I need some way to refer to whatever the hook directory is at a given point in time. > It sounds like you might be asking for something like: > > .git/config: > hook.update = 001:__HOOKDIR__/update.d/update-1 > > I'm not sure that I like the idea of this. My own dream is to eliminate > the need for a hookdir entirely, so it's logically easy for my Gerrit > hook to live in ~/gerrit-tools/ and my linter to live in ~/clang-tidy/ > and so on. Using the config syntax eliminates per-repository storage for hooks. I, personally, want to store multiple one-off hooks in my hooks directory. I want to use tools that install one-off hooks into my repository without needing to depend on the location of those tools in the future. I don't want those hooks to live elsewhere on my file system, since that makes my repository no longer self contained. I want to store those hooks in the hook directory, wherever I've configured that, and whatever that happens to be at this point in time. I may additionally have tools that live in other locations as well and may use other syntaxes to invoke them. For example, I may install a hook that's provided by a Debian package and refer to it using a bare program name. If your goal is to eliminate the hook directory entirely, then I can't say that I'm in support of that. A design which does that won't meet my needs, and it likely won't meet the needs of other users. The benefit of your proposed config syntax for me is that it provides a standard way to configure multiple hooks. I still want many of those hooks to live in the hook directory, although others may live elsewhere. > I could see option 1 being alleviated by writing it as something like: > > update: > $GIT_HOOKDIR/update.d/update-1 && > $GIT_HOOKDIR/update.d/update-2 > > or > update: > $(git config core.hookdir)/update.d/update-1 && > $(git config core.hookdir)/update.d/update-2 This is similar to what I want, and why I want to pass it to the shell. I can write the following, and everything just works: .git/config: hook.update = 001:$(git config core.hookdir || echo .git/hooks)/update.d/update-1 hook.update = 002:$(git config core.hookdir || echo .git/hooks)/update.d/update-2 Wherever I have placed my hooks for this repository, I can refer to them with a shell script. I can also add the following line as well: .git/config: hook.update = debian-package-hook update …and everything just works. > But, I think once you "buy in" to the idea of providing a full path to > the final target - not to a trampoline script - in your config, you > should forget about the idea of "core.hookdir" having anything to do > with it. I have no intention of providing a full path to any hook, since that's quite brittle. There are very few paths on my system which can be guaranteed not to change, and all of them are things like /bin/sh or /usr/bin/env. If my hooks are in the hook directory (or even in the working tree) with a full path and I move that repository, they become broken. If they're shipped by Debian and Debian decides to move the command, they're broken. I'm very interested in learning more about why you seem to want to specify full paths. It seems very at odds with the way the rest of Git works. If the goal is to support other syntaxes in the future, then let's use a prefix character (e.g. !) to denote commands vs. non-commands or something like that. If the goal is security, then I'd like to hear more about the security model you're trying to achieve with this design. > To quote you out-of-order now: > > > If we want to allow people to have multiple hooks under something like > > .git/hooks/pre-push.d, then we need to have a way to wire them up in the > > configuration using the correct location of the hook directory instead > > of using a helper script like the one I linked above. > > I think I may spot the misunderstanding. It sounds like you hope someone > can provide "hook.update=001:~/update.d/" and have all the contents of > update.d executed. I'll be clear and say that I didn't have the > intention to support that at all; instead I was hoping to support a case > like 2. above. Recursing through directories like that sounds scary to > order. I don't need a particular way to do that, since I can do it already, but I do need a way to wire up multiple hooks that are per-repository and move with the repository but aren't in the repository. In other words, I need a functional replacement for that approach if we're not going to use that approach itself. Hopefully I've done a better job at explaining myself here. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 868 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] Hook management via 'git hooks' command 2019-11-26 2:41 ` brian m. carlson @ 2019-12-02 23:46 ` Emily Shaffer 0 siblings, 0 replies; 12+ messages in thread From: Emily Shaffer @ 2019-12-02 23:46 UTC (permalink / raw) To: brian m. carlson, git, Jonathan Nieder, Ævar Arnfjörð Bjarmason Thanks for waiting as I took half of last week off to host family for holiday. :) On Tue, Nov 26, 2019 at 02:41:47AM +0000, brian m. carlson wrote: > On 2019-11-26 at 00:56:14, Emily Shaffer wrote: > > Hopefully I am not beating a dead horse here but I really want to > > understand. Let me take another guess with examples at what you mean; > > please correct me! > > > > For our purposes, let's assume: > > > > .git/hooks/ > > update > > update.d/ > > update-1 > > update-2 > > > > update: > > #!/bin/bash > > > > ./git/hooks/update.d/update-1 && > > ./git/hooks/update.d/update-2 > > > > The goal is to make sure update-1 and update-2 are run when other update > > hooks happen. > > > > > With my proposal as is, I see two options: > > > > 1) > > .git/config: > > hook.runHookDir = true > > > > This will run whatever else is in hook.update, and then it will run > > .git/hooks/update. This is not ideal because hookDir could change, and > > then the content of update will be incorrect: > > > > git config --add core.hookdir=~/hooks/ > > mv .git/hooks/update* ~/hooks/ > > # call something which invokes update hooks > > # ~/hooks/update fails, .git/hooks/update-1 is gone :( > > > > This is actually fine. We assume the user knows where they want to > store their hooks. If they change that, then that's intentional. > > > .git/config: > > hook.update = 001:/project/path/.git/hooks/update.d/update-1 > > hook.update = 002:/project/path/.git/hooks/update.d/update-2 > > > > This will run each update hook individually and have no notion of > > whether they're in a "hook dir" or not. It sees a path, it hands it to > > 'sh' to run, it checks the return code. > > Correct. This is the logical porting of the above shell script to the > config syntax if you use an absolute path. > > > Now I run 'mv .git/hooks/update* ~/hooks/'. Next time I invoke something > > which would run the 'update' hook, it fails, because the path in the > > config isn't pointing to anything. But this way is unrelated to hookdir. > > Yes, that's correct. > > It sounds like you're thinking of the config approach as completely > orthogonal to the hook directory. However, I want to have multiple > per-repository hooks that do not live within the repository but move > with it. The logical place to store those hooks is in the hook > directory, even if it's not being invoked by Git explicitly. To use > that with the config approach so I can have multiple hooks in a useful > way that's compatible with other tools, I need some way to refer to > whatever the hook directory is at a given point in time. This sounds like a use case we can expect other users to want to work. It is very different from the way I think about and use hooks, so I'm glad you explained it to me. (For the record, the way I think of hooks is something like this: I want all repositories to run git-secrets before I push; I want a handful of repositories to run the Gerrit Change-Id hook after I commit; I want one repo to run linter A, and three or four other repos to run linter B. So I tend to think "how can I apply hooks I have downloaded a la carte to my repositories?" You seem to be thinking "how can I write hooks for one repository?" They are both totally valid, of course.) > > > It sounds like you might be asking for something like: > > > > .git/config: > > hook.update = 001:__HOOKDIR__/update.d/update-1 > > > > I'm not sure that I like the idea of this. My own dream is to eliminate > > the need for a hookdir entirely, so it's logically easy for my Gerrit > > hook to live in ~/gerrit-tools/ and my linter to live in ~/clang-tidy/ > > and so on. > > Using the config syntax eliminates per-repository storage for hooks. I, > personally, want to store multiple one-off hooks in my hooks directory. > I want to use tools that install one-off hooks into my repository > without needing to depend on the location of those tools in the future. > I don't want those hooks to live elsewhere on my file system, since that > makes my repository no longer self contained. > > I want to store those hooks in the hook directory, wherever I've > configured that, and whatever that happens to be at this point in time. > > I may additionally have tools that live in other locations as well and > may use other syntaxes to invoke them. For example, I may install a > hook that's provided by a Debian package and refer to it using a bare > program name. > > If your goal is to eliminate the hook directory entirely, then I can't > say that I'm in support of that. A design which does that won't meet my > needs, and it likely won't meet the needs of other users. Internally, that is our goal, absolutely. However, you yourself are proving it's not a reasonable goal for the entire world. That was somewhat expected, which is why I hoped "hook.runHookDir" and "hook.warnHookDir" would provide a path for users to decide by themselves that they don't want to use the hook directory anymore. A self-paced hook directoy phase out seems like a good fit to make us both happy, where your pace can be "over my dead body" if you so choose ;) > > The benefit of your proposed config syntax for me is that it provides a > standard way to configure multiple hooks. I still want many of those > hooks to live in the hook directory, although others may live elsewhere. > > > I could see option 1 being alleviated by writing it as something like: > > > > update: > > $GIT_HOOKDIR/update.d/update-1 && > > $GIT_HOOKDIR/update.d/update-2 > > > > or > > update: > > $(git config core.hookdir)/update.d/update-1 && > > $(git config core.hookdir)/update.d/update-2 > > This is similar to what I want, and why I want to pass it to the shell. > I can write the following, and everything just works: > > .git/config: > hook.update = 001:$(git config core.hookdir || echo .git/hooks)/update.d/update-1 > hook.update = 002:$(git config core.hookdir || echo .git/hooks)/update.d/update-2 > > Wherever I have placed my hooks for this repository, I can refer to > them with a shell script. I can also add the following line as well: > > .git/config: > hook.update = debian-package-hook update > > …and everything just works. > > > But, I think once you "buy in" to the idea of providing a full path to > > the final target - not to a trampoline script - in your config, you > > should forget about the idea of "core.hookdir" having anything to do > > with it. > > I have no intention of providing a full path to any hook, since that's > quite brittle. There are very few paths on my system which can be > guaranteed not to change, and all of them are things like /bin/sh or > /usr/bin/env. If my hooks are in the hook directory (or even in the > working tree) with a full path and I move that repository, they become > broken. If they're shipped by Debian and Debian decides to move the > command, they're broken. I think this is again stemming from the different ways you and I are thinking about our hooks, and that's fine. I'm open to the idea of relative-path hooks, and a blurb in the interactive "git hook edit" mode (or whatever it ends up being) explaining where relative paths will be relative to. Is that OK for you? > I'm very interested in learning more about why you seem to want to > specify full paths. It seems very at odds with the way the rest of Git > works. I hope my description closer to the top of this mail helps to explain. > If the goal is to support other syntaxes in the future, then > let's use a prefix character (e.g. !) to denote commands vs. > non-commands or something like that. If the goal is security, then I'd > like to hear more about the security model you're trying to achieve with > this design. As for the security model, I think Jonathan explained it quite well in the link I posted at the top of this thread. The tl;dr is that Malicious User can hand Victim Sysadmin a .zip with a repo that is "broken", and when Victim Sysadmin starts to explore it, she inadvertently triggers hooks which were packaged in that .zip and compromises her workstation. Eliminating the use of .git/hooks/ alone doesn't protect from this - .git/config is also a vector, as one can include an alias there - but I believe it's heading towards the right direction. > > > To quote you out-of-order now: > > > > > If we want to allow people to have multiple hooks under something like > > > .git/hooks/pre-push.d, then we need to have a way to wire them up in the > > > configuration using the correct location of the hook directory instead > > > of using a helper script like the one I linked above. > > > > I think I may spot the misunderstanding. It sounds like you hope someone > > can provide "hook.update=001:~/update.d/" and have all the contents of > > update.d executed. I'll be clear and say that I didn't have the > > intention to support that at all; instead I was hoping to support a case > > like 2. above. Recursing through directories like that sounds scary to > > order. > > I don't need a particular way to do that, since I can do it already, but > I do need a way to wire up multiple hooks that are per-repository and > move with the repository but aren't in the repository. In other words, > I need a functional replacement for that approach if we're not going to > use that approach itself. It sounds like simply supporting relative paths in the config, with adequate explanation to the user about where the paths are relative to, will solve this for you. What do you think? > > Hopefully I've done a better job at explaining myself here. Yeah, I think I made a breakthrough understanding you from this mail. Thanks :) :) - Emily ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2019-12-02 23:46 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-11-16 1:11 [RFC] Hook management via 'git hooks' command Emily Shaffer 2019-11-16 5:45 ` brian m. carlson 2019-11-18 22:38 ` Emily Shaffer 2019-11-19 0:51 ` brian m. carlson 2019-11-23 1:19 ` Emily Shaffer 2019-11-25 3:04 ` brian m. carlson 2019-11-25 22:21 ` Emily Shaffer 2019-11-25 22:45 ` Emily Shaffer 2019-11-26 0:28 ` brian m. carlson 2019-11-26 0:56 ` Emily Shaffer 2019-11-26 2:41 ` brian m. carlson 2019-12-02 23:46 ` Emily Shaffer
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).