git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Brandon Williams <bmwill@google.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/2] submodule: munge paths to submodule git directories
Date: Tue, 14 Aug 2018 11:04:06 -0700	[thread overview]
Message-ID: <20180814180406.GA86804@google.com> (raw)
In-Reply-To: <20180809212602.GA11342@sigill.intra.peff.net>

On 08/09, Jeff King wrote:
> On Wed, Aug 08, 2018 at 03:33:23PM -0700, Brandon Williams wrote:
> 
> > Commit 0383bbb901 (submodule-config: verify submodule names as paths,
> > 2018-04-30) introduced some checks to ensure that submodule names don't
> > include directory traversal components (e.g. "../").
> > 
> > This addresses the vulnerability identified in 0383bbb901 but the root
> > cause is that we use submodule names to construct paths to the
> > submodule's git directory.  What we really should do is munge the
> > submodule name before using it to construct a path.
> > 
> > Teach "submodule_name_to_gitdir()" to munge a submodule's name (by url
> > encoding it) before using it to build a path to the submodule's gitdir.
> 
> I like this approach very much, and I think using url encoding is much
> better than an opaque hash (purely because it makes debugging and
> inspection saner).
> 
> Two thoughts, though:
> 
> > +	modules_len = buf->len;
> >  	strbuf_addstr(buf, submodule_name);
> > +
> > +	/*
> > +	 * If the submodule gitdir already exists using the old-fashioned
> > +	 * location (which uses the submodule name as-is, without munging it)
> > +	 * then return that.
> > +	 */
> > +	if (!access(buf->buf, F_OK))
> > +		return;
> 
> I think this backwards-compatibility is necessary to avoid pain. But
> until it goes away, I don't think this is helping the vulnerability from
> 0383bbb901. Because there the issue was that the submodule name pointed
> back into the working tree, so this access() would find the untrusted
> working tree code and say "ah, an old-fashioned name!".
> 
> In theory a fresh clone could set a config option for "I only speak
> use new-style modules". And there could even be a conversion program
> that moves the modules as appropriate, fixes up the .git files in the
> working tree, and then sets that config.
> 
> In fact, I think that config option _could_ be done by bumping
> core.repositoryformatversion and then setting extensions.submodulenames
> to "url" or something. Then you could never run into the confusing case
> where you have a clone done by a new version of git (using new-style
> names), but using an old-style version gets confused because it can't
> find the module directories (instead, it would barf and say "I don't
> know about that extension").
> 
> I don't know if any of that is worth it, though. We already fixed the
> problem from 0383bbb901. There may be a _different_ "break out of the
> modules directory" vulnerability, but since we disallow ".." it's hard
> to see what it would be (the best I could come up with is maybe pointing
> one module into the interior of another module, but I think you'd have
> to trouble overwriting anything useful).
> 
> And while an old-style version of Git being confused might be annoying,
> I suspect that bumping the repository version would be even _more_
> annoying, because it would hit every command, not just ones that try to
> touch those submodules.

Oh I know that this doesn't help with that vulnerability.  As you've
said we fix it and now disallow ".." at the submodule-config level so
really this path is simply about using what we get out of
submodule-config in a more sane manor.

> 
> > diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
> > index 2c2c97e144..963693332c 100755
> > --- a/t/t7400-submodule-basic.sh
> > +++ b/t/t7400-submodule-basic.sh
> > @@ -933,7 +933,7 @@ test_expect_success 'recursive relative submodules stay relative' '
> >  		cd clone2 &&
> >  		git submodule update --init --recursive &&
> >  		echo "gitdir: ../.git/modules/sub3" >./sub3/.git_expect &&
> > -		echo "gitdir: ../../../.git/modules/sub3/modules/dirdir/subsub" >./sub3/dirdir/subsub/.git_expect
> > +		echo "gitdir: ../../../.git/modules/sub3/modules/dirdir%2fsubsub" >./sub3/dirdir/subsub/.git_expect
> 
> One interesting thing about url-encoding is that it's not one-to-one.
> This case could also be %2F, which is a different file (on a
> case-sensitive filesystem). I think "%20" and "+" are similarly
> interchangeable.
> 
> If we were decoding the filenames, that's fine. The round-trip is
> lossless.
> 
> But that's not quite how the new code behaves. We encode the input and
> then check to see if it matches an encoding we previously performed. So
> if our urlencode routines ever change, this will subtly break.
> 
> I don't know how much it's worth caring about. We're not that likely to
> change the routines ourself (though certainly a third-party
> implementation would need to know our exact url-encoding decisions).

This is exactly the reason why I wanted to get some opinions on what the
best thing to do here would be.  I _think_ the best thing would probably
be to write a specific routine to do the conversion, and it wouldn't
even have to be all that complex.  Basically I'm just interested in
converting '/' characters so that things no longer behave like
nested directories.

> 
> Some possible actions:
> 
>  0. Do nothing, and cross our fingers. ;)
> 
>  1. Don't use strbuf_addstr_urlencode(), but rather our own munging
>     function which we know will remain stable (or alternatively, a flag
>     to strbuf_addstr_urlencode to get the consistent behavior).
> 
>  2. Make sure we have tests which cover this, so at least somebody
>     changing the urlencode decisions will see a breakage. Your test here
>     covers the upper/lowercase one, but we might want one that covers
>     "+". (There may be more ambiguous cases, but those are the ones I
>     know about).
> 
>  3. Rather than check for the existence of names, decode what's actually
>     in the modules/ directory to create an in-memory index of names.
> 
>     I hesitate to suggest that, because it's obviously way more
>     complicated, and may perform worse if you have a lot of modules
>     (since you have to readdir() and decode the whole directory just to
>     look up one module).
> 
>     But I think it also gives a more elegant solution to the
>     backwards-compatibility problem, since we could recognize both new
>     and old-style names. There's some ambiguity (e.g., is "foo%2fbar"
>     "foo/bar", or did somebody really have a name with a percent in
>     it?),. but in theory you could respect either name (giving
>     preference to new-style in case of a conflict).
> 
>     And I think the result would be immune to any directory-escape
>     vulnerabilities, because we'd always start with what actually exists
>     in $GIT_DIR/modules/, which we know _we_ will have written.
> 
>     Again, I'm not sure if it's worth the effort, but I thought I'd
>     throw it out there.
> 
> -Peff

-- 
Brandon Williams

  reply	other threads:[~2018-08-14 18:04 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-07 23:06 [RFC] submodule: munge paths to submodule git directories Brandon Williams
2018-08-07 23:25 ` Jonathan Nieder
2018-08-08  0:14 ` Junio C Hamano
2018-08-08 22:33 ` [PATCH 0/2] munge submodule names Brandon Williams
2018-08-08 22:33   ` [PATCH 1/2] submodule: create helper to build paths to submodule gitdirs Brandon Williams
2018-08-08 23:21     ` Stefan Beller
2018-08-09  0:45       ` Brandon Williams
2018-08-10 21:27     ` Junio C Hamano
2018-08-10 21:45       ` Brandon Williams
2018-08-08 22:33   ` [PATCH 2/2] submodule: munge paths to submodule git directories Brandon Williams
2018-08-09 21:26     ` Jeff King
2018-08-14 18:04       ` Brandon Williams [this message]
2018-08-14 18:57         ` Jonathan Nieder
2018-08-14 21:08           ` Stefan Beller
2018-08-14 21:12             ` Jonathan Nieder
2018-08-14 22:34               ` Stefan Beller
2018-08-16  2:34                 ` Jonathan Nieder
2018-08-16  2:39                   ` Stefan Beller
2018-08-16  2:47                     ` Jonathan Nieder
2018-08-16 17:34                       ` Brandon Williams
2018-08-16 18:19                       ` [PATCH] submodule: add config for where gitdirs are located Brandon Williams
2018-08-20 22:03                         ` Junio C Hamano
2018-08-16 15:07                   ` [PATCH 2/2] submodule: munge paths to submodule git directories Junio C Hamano
2018-08-14 18:58         ` Jeff King
2018-08-28 21:35         ` Stefan Beller
2018-08-29  5:25           ` Jeff King
2018-08-29 18:10             ` Stefan Beller
2018-08-29 21:03               ` Jeff King
2018-08-29 21:10                 ` Stefan Beller
2018-08-29 21:18                   ` Jonathan Nieder
2018-08-29 21:27                     ` Stefan Beller
2018-08-29 21:30                   ` Jeff King
2018-08-29 21:09             ` Jonathan Nieder
2018-08-29 21:14               ` Stefan Beller
2018-08-29 21:25                 ` Brandon Williams
2018-08-29 21:32               ` Jeff King
2018-08-16  0:19     ` Aaron Schrab
2019-01-15  1:25 ` [RFC] " Jonathan Nieder
2019-01-17 17:32   ` Jeff King
2019-01-17 17:57     ` Stefan Beller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180814180406.GA86804@google.com \
    --to=bmwill@google.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).