On 2020-05-06 at 21:33:54, Emily Shaffer wrote:
> On Sat, Apr 25, 2020 at 08:57:27PM +0000, brian m. carlson wrote:
> > 
> > On 2020-04-20 at 23:53:10, Emily Shaffer wrote:
> > > +== Caveats
> > > +
> > > +=== Security and repo config
> > > +
> > > +Part of the motivation behind this refactor is to mitigate hooks as an attack
> > > +vector;footnote:[https://lore.kernel.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/]
> > > +however, as the design stands, users can still provide hooks in the repo-level
> > > +config, which is included when a repo is zipped and sent elsewhere.  The
> > > +security of the repo-level config is still under discussion; this design
> > > +generally assumes the repo-level config is secure, which is not true yet. The
> > > +goal is to avoid an overcomplicated design to work around a problem which has
> > > +ceased to exist.
> > 
> > I want to be clear that I'm very much opposed to trying to "secure" the
> > config as a whole.  I believe that it's going to ultimately lead to a
> > variety of new and interesting attack vectors and will lead to Git
> > becoming a CVE factory.  Vim has this problem with modelines, for
> > example.
> 
> I'm really interested to hear more - it seems like security and config
> efforts will end up on my plate before the end of the year, so I'd like
> to know what is on your mind.

In general, having untrusted configuration is enormously difficult and
is typically only possible as a designed-in feature with extremely
limited options.  We have not designed that feature in from the
beginning and our config parsing is far too ad-hoc to support any
reasonable security posture.  We've also written a program entirely in
C, which has all of the fun memory safety problems.

If we try to secure the config and allow people to use untrusted
repositories securely, we've changed the security posture of the project
very significantly.  The number of keys we can safely trust come down to
probably core.repositoryformatversion and extensions.objectformat, and
I'm not even sure that the latter can be trusted because there are all
sorts of fun behaviors one can produce by setting the wrong hash
algorithm.

That's just one example of a potential source of security problems, but
I anticipate people can use other options as well.  Setting the rename
limit can be a DoS.  Changing the colors of diff or log output could be
used to hide malicious code from inspection.  We obviously can't trust
anything containing a URL, since an attacker could try to make "git pull
origin" point to their server instead, which means having remotes is out
of the question.  Most of our recent security issues have involved the
.gitmodules file, which, despite being extremely limited, is indeed an
untrusted config file.

The scope of potential vulnerabilities explodes as you allow users to
have untrusted config.  I don't think there's any reasonable set of
useful configuration we can have on a per-repo basis that doesn't open
us up to a whole set of security vulnerabilities.  It seems to me that
we're setting ourselves up to either have a feature so limited nobody
uses it or a massive, never-ending set of CVEs as everybody finds new
ways to attack things.  I just don't think promising that feature to
users is honest because I don't think we can practically achieve it in
Git.  Most projects don't even try it as an option.

On the other hand, what we promise now, which is to restrict untrusted
repositories to cloning and fetching, while surprising to users,
dramatically reduces the scope because it's basically what we promise
over the network.  The interface is highly restricted, well known, and
reasonably secure.  We've also limited attack surface to a much smaller
number of binaries.

So while I think the intention is good and the idea, if implementable,
would be beneficial to users, I think it's practically going to be
unachievable.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204