On 2020-05-06 at 21:33:54, Emily Shaffer wrote:
> On Sat, Apr 25, 2020 at 08:57:27PM +0000, brian m. carlson wrote:
> >
> > On 2020-04-20 at 23:53:10, Emily Shaffer wrote:
> > > +== Caveats
> > > +
> > > +=== Security and repo config
> > > +
> > > +Part of the motivation behind this refactor is to mitigate hooks as an attack
> > > +vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
> > > +however, as the design stands, users can still provide hooks in the repo-level
> > > +config, which is included when a repo is zipped and sent elsewhere. The
> > > +security of the repo-level config is still under discussion; this design
> > > +generally assumes the repo-level config is secure, which is not true yet. The
> > > +goal is to avoid an overcomplicated design to work around a problem which has
> > > +ceased to exist.
> >
> > I want to be clear that I'm very much opposed to trying to "secure" the
> > config as a whole. I believe that it's going to ultimately lead to a
> > variety of new and interesting attack vectors and will lead to Git
> > becoming a CVE factory. Vim has this problem with modelines, for
> > example.
>
> I'm really interested to hear more - it seems like security and config
> efforts will end up on my plate before the end of the year, so I'd like
> to know what is on your mind.

In general, having untrusted configuration is enormously difficult and
is typically only possible as a designed-in feature with extremely
limited options. We have not designed that feature in from the
beginning and our config parsing is far too ad-hoc to support any
reasonable security posture. We've also written a program entirely in
C, which has all of the fun memory safety problems.

If we try to secure the config and allow people to use untrusted
repositories securely, we've changed the security posture of the project
very significantly.
The number of keys we can safely trust comes down to probably
core.repositoryformatversion and extensions.objectformat, and I'm not
even sure that the latter can be trusted because there are all sorts of
fun behaviors one can produce by setting the wrong hash algorithm.

That's just one example of a potential source of security problems, but
I anticipate people can use other options as well. Setting the rename
limit can be a DoS. Changing the colors of diff or log output could be
used to hide malicious code from inspection. We obviously can't trust
anything containing a URL, since an attacker could try to make "git pull
origin" point to their server instead, which means having remotes is out
of the question.

Most of our recent security issues have involved the .gitmodules file,
which, despite being extremely limited, is indeed an untrusted config
file. The scope of potential vulnerabilities explodes as you allow
users to have untrusted config.

I don't think there's any reasonable set of useful configuration we can
have on a per-repo basis that doesn't open us up to a whole set of
security vulnerabilities. It seems to me that we're setting ourselves
up to either have a feature so limited nobody uses it or a massive,
never-ending set of CVEs as everybody finds new ways to attack things.

I just don't think promising that feature to users is honest because I
don't think we can practically achieve it in Git. Most projects don't
even try it as an option.

On the other hand, what we promise now, which is to restrict untrusted
repositories to cloning and fetching, while surprising to users,
dramatically reduces the scope because it's basically what we promise
over the network. The interface is highly restricted, well known, and
reasonably secure. We've also limited attack surface to a much smaller
number of binaries.

So while I think the intention is good and the idea, if implementable,
would be beneficial to users, I think it's practically going to be
unachievable.
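
To make the "so limited nobody uses it" outcome concrete, here is a
rough sketch of what an allowlist over untrusted repo config would look
like. This is an illustration only and not anything Git implements: the
function name, the allowlist constant, and the sample values are all
hypothetical, and the allowlist contains just the two keys named above
(one of which may not be safe to trust either).

```python
# Hypothetical allowlist: only the two keys named in this message.
# Whether extensions.objectformat really belongs here is doubtful.
TRUSTED_KEYS = {"core.repositoryformatversion", "extensions.objectformat"}


def split_untrusted_config(entries):
    """Partition (key, value) pairs into (kept, dropped) lists.

    Section and variable names in Git config are case-insensitive. The
    allowlisted keys have no subsection, so lowercasing the whole key
    before the lookup is sufficient for this sketch.
    """
    kept, dropped = [], []
    for key, value in entries:
        if key.lower() in TRUSTED_KEYS:
            kept.append((key, value))
        else:
            dropped.append((key, value))
    return kept, dropped


# Sample repo-level config an attacker might ship (values are made up).
sample = [
    ("core.repositoryformatversion", "0"),
    ("extensions.objectformat", "sha256"),
    ("remote.origin.url", "https://attacker.example/repo.git"),
    ("diff.renameLimit", "999999999"),
    ("color.diff.new", "black"),
]

kept, dropped = split_untrusted_config(sample)
print("kept:   ", [k for k, _ in kept])
print("dropped:", [k for k, _ in dropped])
```

Everything a user would actually want from a per-repo config (remotes,
diff settings, colors) lands in the dropped list, which is the tradeoff
described above.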

-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204