git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: NAKAYAMA DAISUKE <nakyamad@icloud.com>
Subject: [PATCH 4/4] gitweb: escape URLs generated by href()
Date: Fri, 15 Nov 2019 04:06:07 -0500	[thread overview]
Message-ID: <20191115090607.GD21987@sigill.intra.peff.net> (raw)
In-Reply-To: <20191115090545.GA30971@sigill.intra.peff.net>

There's a cross-site scripting problem in gitweb, where it will print
URLs generated by its href() helper without further quoting. This allows
an attacker to point a victim to a specially crafted gitweb URL and
inject arbitrary HTML into the resulting page (which the victim sees as
coming from gitweb).

The base of the URL comes from evaluate_uri(), which pulls the value of
$REQUEST_URI via the CGI module. It tries to strip off $PATH_INFO, but
fails to do so in some cases (including ones that contain special
characters, like "+"). Most of the uses of the URL end up being passed
to "$cgi->a(-href = href())", which will get quoted properly by the CGI
module. But in a few places, we output them ourselves as part of
manually-generated HTML, and whatever was in the original URL will
appear unquoted in the output.

Given that all of the nearby variables placed into this manual HTML
_are_ quoted, it seems like the authors assumed that these URLs would
not need quoting. So it's possible that the bug is actually in
evaluate_uri(), which should be doing a more careful job of stripping
$PATH_INFO. There's some discussion in a comment in that function, as
well as the commit message in 81d3fe9f48 (gitweb: fix wrong base URL
when non-root DirectoryIndex, 2009-02-15). But I'm not sure I understand
it.

Regardless, it's a good idea to quote these values at the point of
insertion into the HTML output:

  1. Even if there is a bug in evaluate_uri(), this would give us
     belt-and-suspenders protection.

  2. evaluate_uri() is only handling the base. Some generated URLs will
     also mention arbitrary refs or filenames in the repositories, and
     these should be quoted anyway.

  3. It should never _hurt_ to quote (and that's what all of the
     $cgi->a() calls are doing already).

So there may be further work here, but this patch at least prevents the
XSS vulnerability, and shouldn't make anything worse.

The test here covers the calls in print_feed_meta(), but I manually
audited every call to href() to see how its output was used, and quoted
appropriately. Most of them are esc_attr(), as they're used in tag
attributes, but I used esc_html() when the URLs were printed bare. The
distinction is largely academic, as one is implemented as a wrapper for
the other.

Reported-by: NAKAYAMA DAISUKE <nakyamad@icloud.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 gitweb/gitweb.perl                        | 31 +++++++++++++----------
 t/t9502-gitweb-standalone-parse-output.sh |  3 ++-
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 7fef19fe59..a2cc4d9fb0 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -4048,7 +4048,7 @@ sub print_feed_meta {
 
 			$href_params{'extra_options'} = undef;
 			$href_params{'action'} = $type;
-			$link_attr{'-href'} = href(%href_params);
+			$link_attr{'-href'} = esc_attr(href(%href_params));
 			print "<link ".
 			      "rel=\"$link_attr{'-rel'}\" ".
 			      "title=\"$link_attr{'-title'}\" ".
@@ -4057,7 +4057,7 @@ sub print_feed_meta {
 			      "/>\n";
 
 			$href_params{'extra_options'} = '--no-merges';
-			$link_attr{'-href'} = href(%href_params);
+			$link_attr{'-href'} = esc_attr(href(%href_params));
 			$link_attr{'-title'} .= ' (no merges)';
 			print "<link ".
 			      "rel=\"$link_attr{'-rel'}\" ".
@@ -4070,10 +4070,12 @@ sub print_feed_meta {
 	} else {
 		printf('<link rel="alternate" title="%s projects list" '.
 		       'href="%s" type="text/plain; charset=utf-8" />'."\n",
-		       esc_attr($site_name), href(project=>undef, action=>"project_index"));
+		       esc_attr($site_name),
+		       esc_attr(href(project=>undef, action=>"project_index")));
 		printf('<link rel="alternate" title="%s projects feeds" '.
 		       'href="%s" type="text/x-opml" />'."\n",
-		       esc_attr($site_name), href(project=>undef, action=>"opml"));
+		       esc_attr($site_name),
+		       esc_attr(href(project=>undef, action=>"opml")));
 	}
 }
 
@@ -4287,8 +4289,8 @@ sub git_footer_html {
 	if (defined $action &&
 	    $action eq 'blame_incremental') {
 		print qq!<script type="text/javascript">\n!.
-		      qq!startBlame("!. href(action=>"blame_data", -replay=>1) .qq!",\n!.
-		      qq!           "!. href() .qq!");\n!.
+		      qq!startBlame("!. esc_attr(href(action=>"blame_data", -replay=>1)) .qq!",\n!.
+		      qq!           "!. esc_attr(href()) .qq!");\n!.
 		      qq!</script>\n!;
 	} else {
 		my ($jstimezone, $tz_cookie, $datetime_class) =
@@ -7155,8 +7157,8 @@ sub git_blob {
 			print qq! alt="!.esc_attr($file_name).qq!" title="!.esc_attr($file_name).qq!"!;
 		}
 		print qq! src="! .
-		      href(action=>"blob_plain", hash=>$hash,
-		           hash_base=>$hash_base, file_name=>$file_name) .
+		      esc_attr(href(action=>"blob_plain", hash=>$hash,
+		           hash_base=>$hash_base, file_name=>$file_name)) .
 		      qq!" />\n!;
 	} else {
 		my $nr;
@@ -8239,6 +8241,7 @@ sub git_feed {
 	} else {
 		$alt_url = href(-full=>1, action=>"summary");
 	}
+	$alt_url = esc_attr($alt_url);
 	print qq!<?xml version="1.0" encoding="utf-8"?>\n!;
 	if ($format eq 'rss') {
 		print <<XML;
@@ -8276,7 +8279,7 @@ sub git_feed {
 		      $alt_url . '" />' . "\n" .
 		      '<link rel="self" type="' . $content_type . '" href="' .
 		      $cgi->self_url() . '" />' . "\n" .
-		      "<id>" . href(-full=>1) . "</id>\n" .
+		      "<id>" . esc_url(href(-full=>1)) . "</id>\n" .
 		      # use project owner for feed author
 		      "<author><name>$owner</name></author>\n";
 		if (defined $favicon) {
@@ -8322,7 +8325,7 @@ sub git_feed {
 			      "<author>" . esc_html($co{'author'}) . "</author>\n" .
 			      "<pubDate>$cd{'rfc2822'}</pubDate>\n" .
 			      "<guid isPermaLink=\"true\">$co_url</guid>\n" .
-			      "<link>$co_url</link>\n" .
+			      "<link>" . esc_html($co_url) . "</link>\n" .
 			      "<description>" . esc_html($co{'title'}) . "</description>\n" .
 			      "<content:encoded>" .
 			      "<![CDATA[\n";
@@ -8344,8 +8347,8 @@ sub git_feed {
 			}
 			print "</contributor>\n" .
 			      "<published>$cd{'iso-8601'}</published>\n" .
-			      "<link rel=\"alternate\" type=\"text/html\" href=\"$co_url\" />\n" .
-			      "<id>$co_url</id>\n" .
+			      "<link rel=\"alternate\" type=\"text/html\" href=\"" . esc_attr($co_url) . "\" />\n" .
+			      "<id>" . esc_html($co_url) . "</id>\n" .
 			      "<content type=\"xhtml\" xml:base=\"" . esc_url($my_url) . "\">\n" .
 			      "<div xmlns=\"http://www.w3.org/1999/xhtml\">\n";
 		}
@@ -8452,8 +8455,8 @@ sub git_opml {
 		}
 
 		my $path = esc_html(chop_str($proj{'path'}, 25, 5));
-		my $rss  = href('project' => $proj{'path'}, 'action' => 'rss', -full => 1);
-		my $html = href('project' => $proj{'path'}, 'action' => 'summary', -full => 1);
+		my $rss  = esc_attr(href('project' => $proj{'path'}, 'action' => 'rss', -full => 1));
+		my $html = esc_attr(href('project' => $proj{'path'}, 'action' => 'summary', -full => 1));
 		print "<outline type=\"rss\" text=\"$path\" title=\"$path\" xmlUrl=\"$rss\" htmlUrl=\"$html\"/>\n";
 	}
 	print <<XML;
diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh
index 1b04c29037..e38cbc97d3 100755
--- a/t/t9502-gitweb-standalone-parse-output.sh
+++ b/t/t9502-gitweb-standalone-parse-output.sh
@@ -200,7 +200,8 @@ xss() {
 test_expect_success 'xss checks' '
 	TAG="<magic-xss-tag>" &&
 	xss "a=rss&p=$TAG" &&
-	xss "a=rss&p=foo.git&f=$TAG"
+	xss "a=rss&p=foo.git&f=$TAG" &&
+	xss "" "$TAG+"
 '
 
 test_done
-- 
2.24.0.739.gb5632e4929

  parent reply	other threads:[~2019-11-15  9:06 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-15  9:05 [PATCH 0/4] gitweb: quote base url more consistently Jeff King
2019-11-15  9:05 ` [PATCH 1/4] t9502: pass along all arguments in xss helper Jeff King
2019-11-15  9:06 ` [PATCH 2/4] t/gitweb-lib.sh: drop confusing quotes Jeff King
2019-11-15  9:06 ` [PATCH 3/4] t/gitweb-lib.sh: set $REQUEST_URI Jeff King
2019-11-15  9:06 ` Jeff King [this message]
2019-11-18  1:45 ` [PATCH 0/4] gitweb: quote base url more consistently Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191115090607.GD21987@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=nakyamad@icloud.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).