git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: nori <semtlenori@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] http: fix charset detection of extract_content_type()
Date: Tue, 17 Jun 2014 05:31:07 -0400	[thread overview]
Message-ID: <20140617093106.GA5922@sigill.intra.peff.net> (raw)
In-Reply-To: <539caff7.e7bc420a.76b9.fffff853@mx.google.com>

On Sun, Jun 15, 2014 at 03:49:34AM +0900, nori wrote:

> extract_content_type() could not extract a charset parameter if the
> parameter is not the first one and there is a whitespace and a following
> semicolon just before the parameter. For example:
> 
>     text/plain; format=fixed ;charset=utf-8

Thanks, I think your patch does the right thing. We also have a similar
situation going the other way. If we have:

   text/plain; charset=utf-8; format=fixed

we will parse the charset as "utf-8;". My version of iconv actually
seems to accept that, but we should probably be more careful.

I think parameter values can actually be fully quoted. So you could
have something as nasty as:

  text/plain; some-param="a long value with ;semicolons;"; charset=utf-8

I'd rather not get into parsing that level, as I don't expect it to
happen in the real world. And besides, we actually _would_ find the
charset here with the current code; the only thing we might do wrong is
to parse:

  text/plain; tricky="param; charset=foo"; charset=bar

with charset "foo" rather than "bar". But that's highly unlikely, and
the stakes are fairly low (we just show the error message in the wrong
charset).

Anyway, when you re-roll with the correct "From:" header, do you mind
squashing in the extra change and the tests below?

diff --git a/http.c b/http.c
index 05e8b91..3a28b21 100644
--- a/http.c
+++ b/http.c
@@ -927,7 +927,7 @@ static int extract_param(const char *raw, const char *name,
 		return -1;
 	raw++;
 
-	while (*raw && !isspace(*raw))
+	while (*raw && !isspace(*raw) && *raw != ';')
 		strbuf_addch(out, *raw++);
 	return 0;
 }
diff --git a/t/lib-httpd/error.sh b/t/lib-httpd/error.sh
index eafc9d2..a77b8e5 100755
--- a/t/lib-httpd/error.sh
+++ b/t/lib-httpd/error.sh
@@ -19,6 +19,10 @@ case "$PATH_INFO" in
 	printf "text/plain; charset=utf-16"
 	charset=utf-16
 	;;
+*odd-spacing*)
+	printf "text/plain; foo=bar ;charset=utf-16; other=nonsense"
+	charset=utf-16
+	;;
 esac
 printf "\n"
 
diff --git a/t/t5550-http-fetch-dumb.sh b/t/t5550-http-fetch-dumb.sh
index 01b8aae..ac71418 100755
--- a/t/t5550-http-fetch-dumb.sh
+++ b/t/t5550-http-fetch-dumb.sh
@@ -191,5 +191,10 @@ test_expect_success 'http error messages are reencoded' '
 	grep "this is the error message" stderr
 '
 
+test_expect_success 'reencoding is robust to whitespace oddities' '
+	test_must_fail git clone "$HTTPD_URL/error/odd-spacing" 2>stderr &&
+	grep "this is the error message" stderr
+'
+
 stop_httpd
 test_done

      parent reply	other threads:[~2014-06-17  9:31 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-14 18:49 [PATCH] http: fix charset detection of extract_content_type() nori
2014-06-15  8:21 ` Yi EungJun
2014-06-16 18:29 ` Junio C Hamano
2014-06-16 19:20   ` Jeff King
2014-06-17  9:31 ` Jeff King [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140617093106.GA5922@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=semtlenori@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).