Re: [RFC] Hook management via 'git hooks' command

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

From: Emily Shaffer <emilyshaffer@google.com>
To: "brian m. carlson" <sandals@crustytoothpaste.net>,
	git@vger.kernel.org, "Jonathan Nieder" <jrnieder@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [RFC] Hook management via 'git hooks' command
Date: Mon, 2 Dec 2019 15:46:05 -0800	[thread overview]
Message-ID: <20191202234605.GA17687@google.com> (raw)
In-Reply-To: <20191126024147.GF2404748@camp.crustytoothpaste.net>

Thanks for waiting as I took half of last week off to host family for
holiday. :)

On Tue, Nov 26, 2019 at 02:41:47AM +0000, brian m. carlson wrote:
> On 2019-11-26 at 00:56:14, Emily Shaffer wrote:
> > Hopefully I am not beating a dead horse here but I really want to
> > understand. Let me take another guess with examples at what you mean;
> > please correct me!
> > 
> > For our purposes, let's assume:
> > 
> > .git/hooks/
> >   update
> >   update.d/
> >     update-1
> >     update-2
> > 
> > update:
> >   #!/bin/bash
> > 
> >   ./git/hooks/update.d/update-1 &&
> >   ./git/hooks/update.d/update-2
> > 
> > The goal is to make sure update-1 and update-2 are run when other update
> > hooks happen.
> 
> 
> 
> > With my proposal as is, I see two options:
> > 
> > 1)
> > .git/config:
> >   hook.runHookDir = true
> > 
> > This will run whatever else is in hook.update, and then it will run
> > .git/hooks/update. This is not ideal because hookDir could change, and
> > then the content of update will be incorrect:
> > 
> >   git config --add core.hookdir=~/hooks/
> >   mv .git/hooks/update* ~/hooks/
> >   # call something which invokes update hooks
> >   # ~/hooks/update fails, .git/hooks/update-1 is gone :(
> > 
> 
> This is actually fine.  We assume the user knows where they want to
> store their hooks.  If they change that, then that's intentional.
> 
> > .git/config:
> >   hook.update = 001:/project/path/.git/hooks/update.d/update-1
> >   hook.update = 002:/project/path/.git/hooks/update.d/update-2
> > 
> > This will run each update hook individually and have no notion of
> > whether they're in a "hook dir" or not. It sees a path, it hands it to
> > 'sh' to run, it checks the return code.
> 
> Correct.  This is the logical porting of the above shell script to the
> config syntax if you use an absolute path.
> 
> > Now I run 'mv .git/hooks/update* ~/hooks/'. Next time I invoke something
> > which would run the 'update' hook, it fails, because the path in the
> > config isn't pointing to anything. But this way is unrelated to hookdir.
> 
> Yes, that's correct.
> 
> It sounds like you're thinking of the config approach as completely
> orthogonal to the hook directory.  However, I want to have multiple
> per-repository hooks that do not live within the repository but move
> with it.  The logical place to store those hooks is in the hook
> directory, even if it's not being invoked by Git explicitly.  To use
> that with the config approach so I can have multiple hooks in a useful
> way that's compatible with other tools, I need some way to refer to
> whatever the hook directory is at a given point in time.

This sounds like a use case we can expect other users to want to work.
It is very different from the way I think about and use hooks, so I'm
glad you explained it to me.

(For the record, the way I think of hooks is something like this: I want
all repositories to run git-secrets before I push; I want a handful of
repositories to run the Gerrit Change-Id hook after I commit; I want one
repo to run linter A, and three or four other repos to run linter B. So
I tend to think "how can I apply hooks I have downloaded a la carte to
my repositories?" You seem to be thinking "how can I write hooks for one
repository?" They are both totally valid, of course.)

> 
> > It sounds like you might be asking for something like:
> > 
> > .git/config:
> >   hook.update = 001:__HOOKDIR__/update.d/update-1
> > 
> > I'm not sure that I like the idea of this. My own dream is to eliminate
> > the need for a hookdir entirely, so it's logically easy for my Gerrit
> > hook to live in ~/gerrit-tools/ and my linter to live in ~/clang-tidy/
> > and so on.
> 
> Using the config syntax eliminates per-repository storage for hooks.  I,
> personally, want to store multiple one-off hooks in my hooks directory.
> I want to use tools that install one-off hooks into my repository
> without needing to depend on the location of those tools in the future.
> I don't want those hooks to live elsewhere on my file system, since that
> makes my repository no longer self contained.
> 
> I want to store those hooks in the hook directory, wherever I've
> configured that, and whatever that happens to be at this point in time.
> 
> I may additionally have tools that live in other locations as well and
> may use other syntaxes to invoke them.  For example, I may install a
> hook that's provided by a Debian package and refer to it using a bare
> program name.
> 
> If your goal is to eliminate the hook directory entirely, then I can't
> say that I'm in support of that.  A design which does that won't meet my
> needs, and it likely won't meet the needs of other users.

Internally, that is our goal, absolutely. However, you yourself are
proving it's not a reasonable goal for the entire world. That was
somewhat expected, which is why I hoped "hook.runHookDir" and
"hook.warnHookDir" would provide a path for users to decide by
themselves that they don't want to use the hook directory anymore. A
self-paced hook directoy phase out seems like a good fit to make us both
happy, where your pace can be "over my dead body" if you so choose ;)

> 
> The benefit of your proposed config syntax for me is that it provides a
> standard way to configure multiple hooks.  I still want many of those
> hooks to live in the hook directory, although others may live elsewhere.
> 
> > I could see option 1 being alleviated by writing it as something like:
> > 
> > update:
> >   $GIT_HOOKDIR/update.d/update-1 &&
> >   $GIT_HOOKDIR/update.d/update-2
> > 
> > or
> > update:
> >   $(git config core.hookdir)/update.d/update-1 &&
> >   $(git config core.hookdir)/update.d/update-2
> 
> This is similar to what I want, and why I want to pass it to the shell.
> I can write the following, and everything just works:
> 
> .git/config:
>   hook.update = 001:$(git config core.hookdir || echo .git/hooks)/update.d/update-1
>   hook.update = 002:$(git config core.hookdir || echo .git/hooks)/update.d/update-2
> 
> Wherever I have placed my hooks for this repository, I can refer to
> them with a shell script.  I can also add the following line as well:
> 
> .git/config:
>   hook.update = debian-package-hook update
> 
> …and everything just works.
> 
> > But, I think once you "buy in" to the idea of providing a full path to
> > the final target - not to a trampoline script - in your config, you
> > should forget about the idea of "core.hookdir" having anything to do
> > with it.
> 
> I have no intention of providing a full path to any hook, since that's
> quite brittle.  There are very few paths on my system which can be
> guaranteed not to change, and all of them are things like /bin/sh or
> /usr/bin/env.  If my hooks are in the hook directory (or even in the
> working tree) with a full path and I move that repository, they become
> broken.  If they're shipped by Debian and Debian decides to move the
> command, they're broken.

I think this is again stemming from the different ways you and I are
thinking about our hooks, and that's fine. I'm open to the idea of
relative-path hooks, and a blurb in the interactive "git hook edit" mode
(or whatever it ends up being) explaining where relative paths will be
relative to. Is that OK for you?

> I'm very interested in learning more about why you seem to want to
> specify full paths.  It seems very at odds with the way the rest of Git
> works.

I hope my description closer to the top of this mail helps to explain.

> If the goal is to support other syntaxes in the future, then
> let's use a prefix character (e.g. !) to denote commands vs.
> non-commands or something like that.  If the goal is security, then I'd
> like to hear more about the security model you're trying to achieve with
> this design.

As for the security model, I think Jonathan explained it quite well in
the link I posted at the top of this thread. The tl;dr is that Malicious
User can hand Victim Sysadmin a .zip with a repo that is "broken",
and when Victim Sysadmin starts to explore it, she inadvertently
triggers hooks which were packaged in that .zip and compromises her
workstation. Eliminating the use of .git/hooks/ alone doesn't protect
from this - .git/config is also a vector, as one can include an alias
there - but I believe it's heading towards the right direction.

> 
> > To quote you out-of-order now:
> > 
> > > If we want to allow people to have multiple hooks under something like
> > > .git/hooks/pre-push.d, then we need to have a way to wire them up in the
> > > configuration using the correct location of the hook directory instead
> > > of using a helper script like the one I linked above.
> > 
> > I think I may spot the misunderstanding. It sounds like you hope someone
> > can provide "hook.update=001:~/update.d/" and have all the contents of
> > update.d executed. I'll be clear and say that I didn't have the
> > intention to support that at all; instead I was hoping to support a case
> > like 2. above. Recursing through directories like that sounds scary to
> > order.
> 
> I don't need a particular way to do that, since I can do it already, but
> I do need a way to wire up multiple hooks that are per-repository and
> move with the repository but aren't in the repository.  In other words,
> I need a functional replacement for that approach if we're not going to
> use that approach itself.

It sounds like simply supporting relative paths in the config, with
adequate explanation to the user about where the paths are relative to,
will solve this for you. What do you think?

> 
> Hopefully I've done a better job at explaining myself here.

Yeah, I think I made a breakthrough understanding you from this mail.
Thanks :) :)

 - Emily

     prev parent reply	other threads:[~2019-12-02 23:46 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-16  1:11 [RFC] Hook management via 'git hooks' command Emily Shaffer
2019-11-16  5:45 ` brian m. carlson
2019-11-18 22:38   ` Emily Shaffer
2019-11-19  0:51     ` brian m. carlson
2019-11-23  1:19       ` Emily Shaffer
2019-11-25  3:04         ` brian m. carlson
2019-11-25 22:21           ` Emily Shaffer
2019-11-25 22:45             ` Emily Shaffer
2019-11-26  0:28             ` brian m. carlson
2019-11-26  0:56               ` Emily Shaffer
2019-11-26  2:41                 ` brian m. carlson
2019-12-02 23:46                   ` Emily Shaffer [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191202234605.GA17687@google.com \
    --to=emilyshaffer@google.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jrnieder@gmail.com \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).