git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: Michael Haggerty <mhagger@alum.mit.edu>
Subject: [PATCH] sq_dequote: fix extra consumption of source string
Date: Tue, 13 Feb 2018 18:41:49 -0500	[thread overview]
Message-ID: <20180213234149.GA21964@sigill.intra.peff.net> (raw)

This fixes a (probably harmless) parsing problem in
sq_dequote_step(), in which we parse some bogus input
incorrectly rather than complaining that it's bogus.

Our shell-dequoting function is very strict: it can unquote
everything generated by sq_quote(), but not arbitrary
strings. In particular, it only allows characters outside of
the single-quoted string if they are immediately backslashed
and then the single-quoted string is resumed. So:

  'foo'\''bar'

is OK. But these are not:

  'foo'\'bar
  'foo'\'
  'foo'\'\''bar'

even though they are all valid shell. The parser has a funny
corner case here. When we see a backslashed character, we
keep incrementing the "src" pointer as we parse it. For a
single sq_dequote() call, that's OK; our next step is to
bail with an error, and we don't care where "src" points.

But if we're parsing multiple strings with sq_dequote_to_argv(),
then our next step is to see if the string is followed by
whitespace. Because we erroneously incremented the "src"
pointer, we don't barf on the bogus backslash that we
skipped. Instead, we may find whitespace that immediately
follows it, and continue as if all is well (skipping the
backslashed character completely!).

In practice, this shouldn't be a big deal. The input is
bogus, and our sq_quote() would never generate this bogus
input. In all but one callers, we are parsing input created
by an earlier call to sq_quote(). That final case is "git
shell", which parses shell-quoting generated by the client.
And in that case we use the singular sq_quote(), which has
always behaved correctly.

One might also wonder if you could provoke a read past the
end of the string. But the answer is no; we still parse
character by character, and would never advance past a NUL.

This patch implements the minimal fix, along with
documenting the restriction (which confused at least me
while reading the code). We should possibly consider
being more liberal in accepting valid shell-quoted words. I
suspect the code may actually be simpler, and it would be
more friendly to anybody generating or editing input by
hand. But I wanted to fix just the immediate bug in this
patch.

We don't have a direct way to unit-test the sq_dequote()
functions, but we can do this by feeding input to
GIT_CONFIG_PARAMETERS (which is not normally a user-facing
interface, but serves here as it expects to see sq_quote()
input from "git -c"). I've included both a bogus example,
and a related "good" one to confirm that we still parse it
correctly.

Noticed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
---
Phew. That was a lot of explanation for a tiny bug. But it
really took me a while to convince myself that there were no
other lurking problems.

 quote.c                | 12 +++++++++---
 t/t1300-repo-config.sh | 23 +++++++++++++++++++++++
 2 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/quote.c b/quote.c
index de2922ddd6..44f47bd3dc 100644
--- a/quote.c
+++ b/quote.c
@@ -94,9 +94,15 @@ static char *sq_dequote_step(char *arg, char **next)
 				*next = NULL;
 			return arg;
 		case '\\':
-			c = *++src;
-			if (need_bs_quote(c) && *++src == '\'') {
-				*dst++ = c;
+			/*
+			 * Allow backslashed characters outside of
+			 * single-quotes only if they need escaping,
+			 * and only if we resume the single-quoted part
+			 * afterward.
+			 */
+			if (need_bs_quote(src[1]) && src[2] == '\'') {
+				*dst++ = src[1];
+				src += 2;
 				continue;
 			}
 		/* Fallthrough */
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index cbeb9bebee..4f8e6f5fde 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1206,6 +1206,29 @@ test_expect_success 'git -c is not confused by empty environment' '
 	GIT_CONFIG_PARAMETERS="" git -c x.one=1 config --list
 '
 
+sq="'"
+test_expect_success 'detect bogus GIT_CONFIG_PARAMETERS' '
+	cat >expect <<-\EOF &&
+	env.one one
+	env.two two
+	EOF
+	GIT_CONFIG_PARAMETERS="${sq}env.one=one${sq} ${sq}env.two=two${sq}" \
+		git config --get-regexp "env.*" >actual &&
+	test_cmp expect actual &&
+
+	cat >expect <<-EOF &&
+	env.one one${sq}
+	env.two two
+	EOF
+	GIT_CONFIG_PARAMETERS="${sq}env.one=one${sq}\\$sq$sq$sq ${sq}env.two=two${sq}" \
+		git config --get-regexp "env.*" >actual &&
+	test_cmp expect actual &&
+
+	test_must_fail env \
+		GIT_CONFIG_PARAMETERS="${sq}env.one=one${sq}\\$sq ${sq}env.two=two${sq}" \
+		git config --get-regexp "env.*"
+'
+
 test_expect_success 'git config --edit works' '
 	git config -f tmp test.value no &&
 	echo test.value=yes >expect &&
-- 
2.16.1.464.gc4bae515b7

             reply	other threads:[~2018-02-13 23:41 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-13 23:41 Jeff King [this message]
2018-02-19  3:45 ` [PATCH] sq_dequote: fix extra consumption of source string Michael Haggerty

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180213234149.GA21964@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=mhagger@alum.mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).