list mirror (unofficial, one of many)
 help / color / mirror / Atom feed
From: Jeff King <>
To: Brandon Williams <>
Subject: Re: [PATCH 2/2] submodule: munge paths to submodule git directories
Date: Thu, 9 Aug 2018 17:26:02 -0400
Message-ID: <> (raw)
In-Reply-To: <>

On Wed, Aug 08, 2018 at 03:33:23PM -0700, Brandon Williams wrote:

> Commit 0383bbb901 (submodule-config: verify submodule names as paths,
> 2018-04-30) introduced some checks to ensure that submodule names don't
> include directory traversal components (e.g. "../").
> This addresses the vulnerability identified in 0383bbb901 but the root
> cause is that we use submodule names to construct paths to the
> submodule's git directory.  What we really should do is munge the
> submodule name before using it to construct a path.
> Teach "submodule_name_to_gitdir()" to munge a submodule's name (by url
> encoding it) before using it to build a path to the submodule's gitdir.

I like this approach very much, and I think using url encoding is much
better than an opaque hash (purely because it makes debugging and
inspection saner).

Two thoughts, though:

> +	modules_len = buf->len;
>  	strbuf_addstr(buf, submodule_name);
> +
> +	/*
> +	 * If the submodule gitdir already exists using the old-fashioned
> +	 * location (which uses the submodule name as-is, without munging it)
> +	 * then return that.
> +	 */
> +	if (!access(buf->buf, F_OK))
> +		return;

I think this backwards-compatibility is necessary to avoid pain. But
until it goes away, I don't think this is helping the vulnerability from
0383bbb901. Because there the issue was that the submodule name pointed
back into the working tree, so this access() would find the untrusted
working tree code and say "ah, an old-fashioned name!".

In theory a fresh clone could set a config option for "I only speak
use new-style modules". And there could even be a conversion program
that moves the modules as appropriate, fixes up the .git files in the
working tree, and then sets that config.

In fact, I think that config option _could_ be done by bumping
core.repositoryformatversion and then setting extensions.submodulenames
to "url" or something. Then you could never run into the confusing case
where you have a clone done by a new version of git (using new-style
names), but using an old-style version gets confused because it can't
find the module directories (instead, it would barf and say "I don't
know about that extension").

I don't know if any of that is worth it, though. We already fixed the
problem from 0383bbb901. There may be a _different_ "break out of the
modules directory" vulnerability, but since we disallow ".." it's hard
to see what it would be (the best I could come up with is maybe pointing
one module into the interior of another module, but I think you'd have
to trouble overwriting anything useful).

And while an old-style version of Git being confused might be annoying,
I suspect that bumping the repository version would be even _more_
annoying, because it would hit every command, not just ones that try to
touch those submodules.

> diff --git a/t/ b/t/
> index 2c2c97e144..963693332c 100755
> --- a/t/
> +++ b/t/
> @@ -933,7 +933,7 @@ test_expect_success 'recursive relative submodules stay relative' '
>  		cd clone2 &&
>  		git submodule update --init --recursive &&
>  		echo "gitdir: ../.git/modules/sub3" >./sub3/.git_expect &&
> -		echo "gitdir: ../../../.git/modules/sub3/modules/dirdir/subsub" >./sub3/dirdir/subsub/.git_expect
> +		echo "gitdir: ../../../.git/modules/sub3/modules/dirdir%2fsubsub" >./sub3/dirdir/subsub/.git_expect

One interesting thing about url-encoding is that it's not one-to-one.
This case could also be %2F, which is a different file (on a
case-sensitive filesystem). I think "%20" and "+" are similarly

If we were decoding the filenames, that's fine. The round-trip is

But that's not quite how the new code behaves. We encode the input and
then check to see if it matches an encoding we previously performed. So
if our urlencode routines ever change, this will subtly break.

I don't know how much it's worth caring about. We're not that likely to
change the routines ourself (though certainly a third-party
implementation would need to know our exact url-encoding decisions).

Some possible actions:

 0. Do nothing, and cross our fingers. ;)

 1. Don't use strbuf_addstr_urlencode(), but rather our own munging
    function which we know will remain stable (or alternatively, a flag
    to strbuf_addstr_urlencode to get the consistent behavior).

 2. Make sure we have tests which cover this, so at least somebody
    changing the urlencode decisions will see a breakage. Your test here
    covers the upper/lowercase one, but we might want one that covers
    "+". (There may be more ambiguous cases, but those are the ones I
    know about).

 3. Rather than check for the existence of names, decode what's actually
    in the modules/ directory to create an in-memory index of names.

    I hesitate to suggest that, because it's obviously way more
    complicated, and may perform worse if you have a lot of modules
    (since you have to readdir() and decode the whole directory just to
    look up one module).

    But I think it also gives a more elegant solution to the
    backwards-compatibility problem, since we could recognize both new
    and old-style names. There's some ambiguity (e.g., is "foo%2fbar"
    "foo/bar", or did somebody really have a name with a percent in
    it?),. but in theory you could respect either name (giving
    preference to new-style in case of a conflict).

    And I think the result would be immune to any directory-escape
    vulnerabilities, because we'd always start with what actually exists
    in $GIT_DIR/modules/, which we know _we_ will have written.

    Again, I'm not sure if it's worth the effort, but I thought I'd
    throw it out there.


  reply	other threads:[~2018-08-09 21:26 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-07 23:06 [RFC] " Brandon Williams
2018-08-07 23:25 ` Jonathan Nieder
2018-08-08  0:14 ` Junio C Hamano
2018-08-08 22:33 ` [PATCH 0/2] munge submodule names Brandon Williams
2018-08-08 22:33   ` [PATCH 1/2] submodule: create helper to build paths to submodule gitdirs Brandon Williams
2018-08-08 23:21     ` Stefan Beller
2018-08-09  0:45       ` Brandon Williams
2018-08-10 21:27     ` Junio C Hamano
2018-08-10 21:45       ` Brandon Williams
2018-08-08 22:33   ` [PATCH 2/2] submodule: munge paths to submodule git directories Brandon Williams
2018-08-09 21:26     ` Jeff King [this message]
2018-08-14 18:04       ` Brandon Williams
2018-08-14 18:57         ` Jonathan Nieder
2018-08-14 21:08           ` Stefan Beller
2018-08-14 21:12             ` Jonathan Nieder
2018-08-14 22:34               ` Stefan Beller
2018-08-16  2:34                 ` Jonathan Nieder
2018-08-16  2:39                   ` Stefan Beller
2018-08-16  2:47                     ` Jonathan Nieder
2018-08-16 17:34                       ` Brandon Williams
2018-08-16 18:19                       ` [PATCH] submodule: add config for where gitdirs are located Brandon Williams
2018-08-20 22:03                         ` Junio C Hamano
2018-08-16 15:07                   ` [PATCH 2/2] submodule: munge paths to submodule git directories Junio C Hamano
2018-08-14 18:58         ` Jeff King
2018-08-28 21:35         ` Stefan Beller
2018-08-29  5:25           ` Jeff King
2018-08-29 18:10             ` Stefan Beller
2018-08-29 21:03               ` Jeff King
2018-08-29 21:10                 ` Stefan Beller
2018-08-29 21:18                   ` Jonathan Nieder
2018-08-29 21:27                     ` Stefan Beller
2018-08-29 21:30                   ` Jeff King
2018-08-29 21:09             ` Jonathan Nieder
2018-08-29 21:14               ` Stefan Beller
2018-08-29 21:25                 ` Brandon Williams
2018-08-29 21:32               ` Jeff King
2018-08-16  0:19     ` Aaron Schrab
2019-01-15  1:25 ` [RFC] " Jonathan Nieder
2019-01-17 17:32   ` Jeff King
2019-01-17 17:57     ` Stefan Beller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

  List information:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link list mirror (unofficial, one of many)

This inbox may be cloned and mirrored by anyone:

	git clone --mirror
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 git git/ \
	public-inbox-index git

Example config snippet for mirrors.
Newsgroups are available over NNTP:
 note: .onion URLs require Tor:

code repositories for the project(s) associated with this inbox:

AGPL code for this site: git clone