From: "Ævar Arnfjörð Bjarmason" <email@example.com> To: Albert Cui <firstname.lastname@example.org> Cc: Albert Cui via GitGitGadget <email@example.com>, firstname.lastname@example.org Subject: Re: [PATCH] hooks: propose repository owner configured hooks Date: Thu, 08 Apr 2021 00:47:05 +0200 [thread overview] Message-ID: <email@example.com> (raw) In-Reply-To: <CAMbkP-SX2PvjWaNGfO4YUVaWHhAr_KHb170sb_pp8_CiSydQFg@mail.gmail.com> On Tue, Apr 06 2021, Albert Cui wrote: > On Fri, Mar 19, 2021 at 3:27 AM Ævar Arnfjörð Bjarmason > <firstname.lastname@example.org> wrote: > >> On Thu, Mar 18 2021, Albert Cui via GitGitGadget wrote: Not replying to everything here for lack of time, might need to loop back to this, please headsup me if there's outstanding things you'd like >> As noted in the proposal I linked I think anyone thinking about this >> would do well to examine the trade-off Emacs's implementation of this >> uses, since it manages to safely open arbitrary files that'll >> potentially run arbitrary code on-open, but only if the user opts-in. >> > > In , I proposed only allowing pre-clone opt-in to suggested hooks > if allowlisted in the config for the given remote, which seems similar > to your previous proposal. Extending this idea to any config settings > seems very reasonable. I'd love other people's thoughts about this. > > In your proposal, you wrote: > >> It would work really well with includeIf, e.g. I could clone all my work repos to a "safe" area in ~/work which is allowed to set more options, e.g. aliases. > > It seems much safer to do this for a given remote, opposed to a local > file path, no? Yes, probably. I don't think I thought much about it at the time. >> > +Later, we might want to do this before the initial clone is performed; that >> > +workflow looks like: >> > + >> > +* During clone, perform ls-refs as normal >> > + >> > +* If the server has a "magic" config branch, fetch only that config branch. >> >> ...the reason for the magic branch is this before-clone use-case? >> >> Unless there's a really strong use-case for that I'd think the >> per-branch .gitconfig would be a better trade-off, but even then there >> would be ways to make that work obviously in the many-many case, and >> still e.g. have a branch advertise a config blob for its "main" branch >> as a special case or something. >> >> I also suspect an unstated constraint of having this in a magic branch >> is the limitation of some git hosting provider's ACL >> implementations. More on that later... >> > > Mentioned in , the primary motivation for a magic config branch > that lived outside of the worktree was "since the configuration is > separate from the code base, it allows you to go back in history or to > older branches while preserving "improvements" to the hooks > configuration e.g. maybe the project has shifted to using a newer > version of a linter." It also means that your hooks will forever need to be aware of the union of all revisions in the project to work properly, or more likely they'll simply break on older revisions as e.g. a hook that needs to handle a test directory handles just "t", but it used to be called "tests". It's also just un-git-y in *requiring* a remote. A .mailmap, .gitattributes etc. all work with a repo you find on-disk, why does config & hooks need to be different? How would a user of such a repo suggest changes to a hook? Now it's fairly easy for e.g. .gitattributes, you change it, push a branch, ask for it to be merged etc. If you want the same hook for all revisions ever having some light logic in the hook itself to check/cache that (it's executing arbitrary code after all) seems like a saner thing for those who have this "magic branch" use-case than to make it the default. > [...] >> > +Future Work >> > +~~~~~~~~~~~ >> > + >> > +* Extending this to allow repository owners to specify specific configurations >> > +in general e.g. this repository should use partial-clone with these parameters. >> >> I don't see why such a proposal should narrowly confine itself to hooks. >> >> Once we have config-based hooks then hooks are just configuration. >> >> If we're going to pick up arbitrary configuration from remotes/repos on >> request then we'd be starting with the most dangerous configuration. >> > > Summarizing from : Based on this feedback, I'm hearing "we should > have a design for suggested configuration in general," but I don't > think that necessitates that we should pursue generic configuration > before hooks configuration. It's the principle that if we think you might stab your eye accidentally, maybe we should start with a plastic fork instead of going directly to a lightsaber :) I.e. hooks are a *lot* more dangerous to get wrong. I think this sort of proposal would be better received if it started with a small whitelist of our least dangerous (but still useful to set in-repo .gitconfig) config variables. Then the worst thing that'll happen is you'll confuse git-diff a bit, not an RCE. >> I think a better way to start such an effort incrementally would be: >> >> 1. Audit git's config parser for the safety of parsing arbitrary config, >> we parse .gitmodules now, do we feel it's safe enough to parse >> arbitrary config (probably, but worth checking). >> >> 2. Add reflection to git's own config variables. Right now we have this >> in the binary generated via a grep from the documentation. Similar to >> Emacs's implementation we should/could categorize all our known >> config variables by safety. >> > > To clarify, are you saying, today, git's config variables are pulled > from the documentation? I.e. the documentation is the source of truth > for what config variables are supported? o.0 Yes, see generate-configlist.sh. But we don't hard round-trip it, and we couldn't have an exhaustive list anyway due to things like userdiff, sendemail.identity etc. But I'm mentioning it because the next logical step from there is to for some/most pre-declare them, and at that point it could be some struct that lists the danger level of each one, and from there whitelisting them for in-repo .gitconfig would be a logical next step... >> Hooks are the least safe, something like a diff.context=N setting the >> repo wants to suggest a -U<n> setting much safer (just tweaking our >> existing diff algorithm) etc. But even in those cases we'd want to >> proceed slowly, e.g. is that config parsing for that -U<n> defensive >> enough to be safe for arbitrary data? >> > > To clarify, this proposal is just to set the hooks config that > config-based hooks enabled e.g. running `git config --add > hook.pre-commit.command pylint` on behalf of the user, so I'm not sure > what "arbitrary data" you're referring to. At least, any problems I'd > think we'd already address with config-based hooks. As noted by brian m. carlson etc. in the side-thread in <YGzrfaSC4xd75j2U@camp.crustytoothpaste.net> the danger is that by making this a supported feature git becomes the social-engineering vector to fool users into trusting a command like that which they otherwise might not have. >> 3. Users should be able to e.g. configure "yes, for any repo I clone >> they can tweak 'safe'" variables. This would reduce user dialog >> fatigue, and thus increase safety. I.e. a repo who wants to set a >> thing like hooks would stand out, but something that e.g. wants to >> change the diff order would pass on existing global configuration. >> >> 4. Smarter ACL that magic remote+magic branch: It seems like an obvious >> thing to me to want that if I clone e.g. a random clone of git.git >> that I'd want to setup config for it IFF the .gitconfig in it is >> reachable from a tag GPG signed by <approved key>. >> >> Git's a distributed system, so while I don't think it's bad to have >> some feature like "I always trust config from this remote" (e.g. a >> corporate environment where you know its .gitconfig is >> guarded/audited) we should think about more distributed-friendly >> solutions first and if possible guide users towards those. > > This seems like an OK alternative to allow-listing based on remote, > but are you expecting users to add a GPG key to their .gitconfig? That instead of saying you trust https://github.com/git/git your primary means of interaction with this feature would be saying you, as an example, trust Junio's GPG key. And if you clone a git.git from *anywhere* and have already opted-in to whatever scary config we'd expose for tihs we'll trust and setup the hooks, if and only if they're identical to hooks we can find that were last signed by the trusted GPG key. > Remote URLs seem much more user friendly (think IP address vs URL). Git is content-addressable, something that caters directly to that is more consistent and user friendly than something that makes URLs magical as a model for trust. If I want a trusted kernel I don't really care what URL I clone linux.git from, I care that I can resolve the content to something Linus signed. So if I trust Junio's key and would like this to Just Work it would be better if I clone a mirror of git.git that it works, and I don't have to maintain a list of all mirrors myself. Trusting based on content and GPG keys gives you that, magic URLs don't. It seems to me that in the BigCorp use-case this would also be useful. Presumably you just want blessed reviewed hooks to execute arbitrary code on your laptop, but if your co-worker forks a repo and has control of your magic branch you don't want to setup the potential harmful RCE. They might have malicious hooks. But that's also annoying because in your BigCorp repo you'll presumably want the hooks to check/validate/whatever. Just because you cloned a fork you'd still like that. If you trust (signed) content instead of URLs you can have your cake and eat it too. As long as your co-worker didn't modify the hook config in his fork it'll Just Work.
next prev parent reply other threads:[~2021-04-07 22:47 UTC|newest] Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-03-18 22:03 Albert Cui via GitGitGadget 2021-03-18 22:29 ` Junio C Hamano 2021-03-18 23:45 ` Albert Cui 2021-03-19 1:28 ` brian m. carlson 2021-03-19 10:27 ` Ævar Arnfjörð Bjarmason 2021-04-06 0:35 ` Albert Cui 2021-04-07 22:47 ` Ævar Arnfjörð Bjarmason [this message] 2021-06-21 19:36 ` Jonathan Tan 2021-06-21 20:35 ` Ævar Arnfjörð Bjarmason 2021-03-26 1:43 ` [PATCH v2] hooks: propose project " Albert Cui via GitGitGadget 2021-03-29 23:20 ` Emily Shaffer 2021-04-01 20:02 ` Albert Cui 2021-03-30 15:24 ` Derrick Stolee 2021-04-05 22:45 ` Albert Cui 2021-04-05 23:09 ` Junio C Hamano 2021-04-05 23:40 ` Albert Cui 2021-04-06 0:13 ` Junio C Hamano 2021-04-06 0:27 ` Albert Cui 2021-04-06 23:15 ` brian m. carlson 2021-04-07 7:53 ` Ævar Arnfjörð Bjarmason 2021-04-07 13:09 ` Derrick Stolee 2021-04-07 18:40 ` Albert Cui 2021-04-07 20:02 ` Junio C Hamano 2021-04-07 22:23 ` Ævar Arnfjörð Bjarmason 2021-04-15 16:52 ` Ed Maste 2021-04-15 19:41 ` Junio C Hamano 2021-04-15 20:37 ` Ed Maste 2021-04-15 20:50 ` Junio C Hamano 2021-04-15 22:28 ` brian m. carlson 2021-04-02 9:59 ` Ævar Arnfjörð Bjarmason 2021-04-05 23:42 ` Albert Cui 2021-04-02 10:30 ` Ævar Arnfjörð Bjarmason 2021-04-03 0:58 ` Albert Cui 2021-04-24 1:38 ` [PATCH v3] " Albert Cui via GitGitGadget 2021-04-28 2:48 ` Junio C Hamano 2021-05-05 19:11 ` [PATCH v4] " Albert Cui via GitGitGadget 2021-06-03 3:31 ` Jonathan Tan 2021-06-03 20:16 ` Albert Cui 2021-06-03 22:10 ` Jonathan Tan
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style List information: http://vger.kernel.org/majordomo-info.html * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --subject='Re: [PATCH] hooks: propose repository owner configured hooks' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Code repositories for project(s) associated with this inbox: https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).