On Mon, Nov 16, 2020 at 09:34:54PM -0500, Jeff King wrote:
> On Mon, Nov 16, 2020 at 11:39:35AM -0800, Junio C Hamano wrote:
> 
> > >> While not documented, it is currently possible to specify config entries
> > >> [in GIT_CONFIG_PARAMETERS]
> > [...]
> >
> > "While not documented" yes, for sure, but we do not document it for
> > a good reason---it is a pure implementation detail between Git
> > process that runs another one as its internal implementation detail.
> 
> I have actually been quite tempted to document and promise that it will
> continue to work. Because it really is useful in some instances. The
> thing that has held me back is that the documentation would reveal how
> unforgiving the parser is. ;)
> 
> It insists that key/value pairs are shell-quoted as a whole. I think if
> we made it accept some reasonable inputs:
> 
>   - do not require quoting for values that do not need it
> 
>   - allow any amount of shell-style single-quoting (whole parameters,
>     just values, etc).
> 
>   - do not bother allowing other quoting, like double-quotes with
>     backslashes. However, document backslash and double-quote as
>     meta-characters that must not appear outside of single-quotes, to
>     allow for future expansion.
> 
> then I'd feel comfortable making it a public-facing feature. And for
> most cases it would be pretty pleasant to use (and for the unpleasant
> ones, I'm not sure that a little quoting is any worse than the paired
> environment variables found here).

I tend to disagree there. As long as you control keys and values
yourself it's not too hard, that's true. But as soon as you start
processing untrusted keys or values, it gets a lot harder.

E.g. suppose you create a fetch mirror for a user, where the source is
protected by a password. We don't want to write the password into the
gitconfig of the mirror repository. Passing it via `-c` will show up in
ps(1). Using GIT_CONFIG_PARAMETERS requires you to quote the value,
which contains arbitrary data, and if you fail to do that correctly you
now have an avenue for arbitrary config injection.
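
To make the quoting burden concrete, here is a minimal sketch of what a
caller would have to get right. The helper name sq_quote and the example
key are made up for illustration. It only assumes what is said above,
namely that each key/value pair must be shell-quoted as a whole:

```shell
# Hypothetical helper: wrap a string in single quotes and turn every
# embedded ' into '\'' so that a shell-style parser reads it back as
# exactly one word.
sq_quote() {
    sq_s=$1
    sq_out=
    sq_q="'"
    while :; do
        case $sq_s in
        *"$sq_q"*)
            sq_head=${sq_s%%"$sq_q"*}          # text before the first quote
            sq_s=${sq_s#*"$sq_q"}              # text after the first quote
            sq_out=$sq_out$sq_head$sq_q\\$sq_q$sq_q   # append head and '\''
            ;;
        *)
            break
            ;;
        esac
    done
    printf "'%s'" "$sq_out$sq_s"
}

# An untrusted value that tries to smuggle in a second config entry.
value="x' 'core.sshCommand=evil"
entry=$(sq_quote "http.extraHeader=$value")
```

Anything less careful than this (say, wrapping the value in plain single
quotes) lets the embedded quote end the parameter early, and the rest of
the value is parsed as a separate entry.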

That scenario is roughly why I came up with the _KEY/_VALUE scheme. It
requires no quoting, is trivial to set up (at least for its target
audience, which is mostly scripts or programs) and wouldn't show up in
ps(1).
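
As a rough sketch of what I mean, assuming the GIT_CONFIG_KEY_<N> and
GIT_CONFIG_VALUE_<N> names from this patch (the key and value below are
only examples, and a plain sh child stands in for git):

```shell
# Hypothetical untrusted value; note the embedded quote and space.
value="Authorization: Basic pa'ss word"

# The value is handed over byte for byte through the environment, so no
# escaping step is needed. The child here just echoes back what it got.
seen=$(GIT_CONFIG_KEY_1=http.extraHeader GIT_CONFIG_VALUE_1="$value" \
    sh -c 'printf "%s" "$GIT_CONFIG_VALUE_1"')
```

The child sees the value unchanged, and nothing sensitive ends up in its
argument list.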

> > I especially do not think we want to read from unbounded number of
> > GIT_CONFIG_KEY_<N> variables like this patch does.  How would a
> > script cleanse its environment to protect itself from stray such
> > environment variable pair?  Count up from 1 to forever?  Run "env"
> > and grep for "GIT_CONFIG_KEY_[0-9]*=" (the answer is No.  What if
> > some environment variables have newline in its values?)
> 
> Yeah, scripts can currently assume that:
> 
>   unset $(git rev-parse --local-env-vars)
> 
> will drop any config from the environment. In some cases, having
> rev-parse enumerate the GIT_CONFIG_KEY_* variables that are set would be
> sufficient. But that implies that rev-parse is seeing the same
> environment we're planning to clear. As it is now, a script is free to
> use rev-parse to generate that list, hold on to it, and then use it
> later.

Good point. Adjusting it would be trivial, though: unset all consecutive
GIT_CONFIG_KEY_$n keys and potentially also GIT_CONFIG_VALUE_$n until we
hit a gap. The parser would stop on the first gap anyway.
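
Something along these lines, as an untested sketch. The function name is
made up, and I'm assuming the numbering starts at 1 as in the patch:

```shell
# Hypothetical helper: drop consecutive GIT_CONFIG_KEY_<n> and
# GIT_CONFIG_VALUE_<n> pairs, stopping at the first unset key.
clear_git_config_pairs() {
    cgp_n=1
    # ${VAR+set} expands to "set" only if VAR is set, even if it is empty.
    while eval "[ -n \"\${GIT_CONFIG_KEY_$cgp_n+set}\" ]"; do
        unset "GIT_CONFIG_KEY_$cgp_n" "GIT_CONFIG_VALUE_$cgp_n"
        cgp_n=$((cgp_n + 1))
    done
}
```

Variables beyond the first gap are left alone, which should be harmless
given that the parser would never reach them.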

Patrick

> -Peff