[PATCH v6 0/4] config: add support for http.<url>.* settings

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH v6 0/4] config: add support for http.<url>.* settings
@ 2013-07-19 12:48 Kyle J. McKay
  2013-07-19 12:48 ` [PATCH v6 1/4] " Kyle J. McKay
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Kyle J. McKay @ 2013-07-19 12:48 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

NOTE: This patch requires the following preparatory change:

 f1ff763 http.c: fix parsing of http.sslCertPasswordProtected variable

which is currently in pu.

This patch series adds support for http.<url>.* settings.  The patch is
organized as a series of improvements on the functionality:

1/4 - adds basic textual matching support
2/4 - adds URL normalization before matching
3/4 - adds a test for the URL normalization function
4/4 - adds any user matching

With-Feedback-From-jh: Junio C Hamano <gitster@pobox.com>
With-Feedback-From-jk: Jeff King <peff@peff.net>
With-Feedback-From-es: Eric Sunshine <sunshine@sunshineco.com>

Differences from v5:

Dropped the GIT_SSL_CERT_PASSWORD_PROTECTED patch.  It's related to the
http.sslCertPasswordProtected patch, but not required as it touches different
sections of the http.c file.

1/4 - this is identical to v5's 3/5.  Updated from v4's 1/4:

* Added cover comments to log message (feedback-jh)
* Updated help text (feedback-jk)
* Uppercased enum values (feedback-jk)
* Moved #ifdef enum values to end (feedback-jh)
* Renamed check_matched_len to new_match_is_shorter (feedback-jh)
* Removed 1 unnecessary cast (feedback-jk,feedback-jh)
* Based on preparatory http.sslCertPasswordProtected fix (feedback-jh)

2/4 - Updated from v5's 4/5:

* Altered append_normalized_escapes descriptive comments (feedback-es)
* Improved normalization (keep empty user name, strip port leading 0s and
  resolve . and .. in path)

3/4 - Updated from v5's 5/5:

* Add more tests and adjust to improved normalization behavior

4/4 - New code

* Allow a http.<url>.* config item without a user name to match a url passed
  to git that has a user name

Applicable comments from v5 cover:

To better support matching URLs that are equivalent but spelled differently, a
url_normalize function has been added.  Currently this patch leaves it in
http.c as http_options_url_normalize as I am unclear whether it should go into
url.{h,c} at this time since only http.c uses it.

Since the url_normalize function's behavior is non-trivial, it is presented as
a separate patch on top of the basic http.<url>.* settings support.  A new test
for it has also been included as a separate patch.  I am unclear on the proper
number for this test, but have gone ahead and put it with the other http tests
since this patch series places the url_normalize function into http.c.

Kyle J. McKay (4):
  config: add support for http.<url>.* settings
  config: improve support for http.<url>.* settings
  tests: add new test for the url_normalize function
  config: allow http.<url>.* any user matching

 .gitignore               |   1 +
 Documentation/config.txt |  19 ++
 Makefile                 |   5 +
 http.c                   | 675 +++++++++++++++++++++++++++++++++++++++++++++--
 t/t5200-url-normalize.sh | 161 +++++++++++
 test-url-normalize.c     |  62 +++++
 6 files changed, 906 insertions(+), 17 deletions(-)
 create mode 100755 t/t5200-url-normalize.sh
 create mode 100644 test-url-normalize.c

-- 
1.8.3

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v6 1/4] config: add support for http.<url>.* settings
  2013-07-19 12:48 [PATCH v6 0/4] config: add support for http.<url>.* settings Kyle J. McKay
@ 2013-07-19 12:48 ` Kyle J. McKay
  2013-07-19 20:08   ` Junio C Hamano
  2013-07-19 12:48 ` [PATCH v6 2/4] config: improve " Kyle J. McKay
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Kyle J. McKay @ 2013-07-19 12:48 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

The credentials configuration values already support url-specific
configuration items in the form credential.<url>.*.  This patch
adds similar support for http configuration values.

The <url> value is considered a match to a url if the <url> value
is either an exact match or a prefix of the url which ends on a
path component boundary ('/').  So "https://example.com/test" will
match "https://example.com/test" and "https://example.com/test/too"
but not "https://example.com/testextra".

Longer matches take precedence over shorter matches with
environment variable settings taking precedence over all.

With this configuration:

[http]
        useragent = other-agent
        noEPSV = true
[http "https://example.com"]
        useragent = example-agent
        sslVerify = false
[http "https://example.com/path"]
        useragent = path-agent

The "https://other.example.com/" url will have useragent
"other-agent" and sslVerify will be on.

The "https://example.com/" url will have useragent
"example-agent" and sslVerify will be off.

The "https://example.com/path/sub" url will have useragent
"path-agent" and sslVerify will be off.

All three of the examples will have noEPSV enabled.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---
 Documentation/config.txt |  15 +++++
 http.c                   | 169 ++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 167 insertions(+), 17 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6e53fc5..41cab91 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1513,6 +1513,21 @@ http.useragent::
 	of common USER_AGENT strings (but not including those like git/1.7.1).
 	Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
 
+http.<url>.*::
+	Any of the http.* options above can be applied selectively to some urls.
+	For example "http.https://example.com.useragent" would set the user
+	agent only for https connections to example.com.  The <url> value
+	matches a url if it is an exact match or a prefix of the url matching
+	at a "/" boundary.  Longer <url> matches take precedence over shorter
+	ones with the environment variable settings taking precedence over all.
+	Note that <url> must match the url passed to git exactly (other than
+	possibly being a prefix).  This means any user, password and/or port
+	setting that appears in a url as well as any %XX escapes that are
+	present must also appear in <url> to have a successful match.  The urls
+	that are matched against are those given directly to git commands.  In
+	other words, use exactly the same url that was passed to git (possibly
+	shortened) for the <url> value of the config setting.
+
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
 	does not care per se, but this information is necessary e.g. when
diff --git a/http.c b/http.c
index 37986f8..f61a79c 100644
--- a/http.c
+++ b/http.c
@@ -30,6 +30,34 @@ static CURL *curl_default;
 
 char curl_errorstr[CURL_ERROR_SIZE];
 
+enum http_option_type {
+	OPT_POST_BUFFER,
+	OPT_MIN_SESSIONS,
+	OPT_SSL_VERIFY,
+	OPT_SSL_TRY,
+	OPT_SSL_CERT,
+	OPT_SSL_CAINFO,
+	OPT_LOW_SPEED,
+	OPT_LOW_TIME,
+	OPT_NO_EPSV,
+	OPT_HTTP_PROXY,
+	OPT_COOKIE_FILE,
+	OPT_USER_AGENT,
+	OPT_PASSWD_REQ,
+#ifdef USE_CURL_MULTI
+	OPT_MAX_REQUESTS,
+#endif
+#if LIBCURL_VERSION_NUM >= 0x070903
+	OPT_SSL_KEY,
+#endif
+#if LIBCURL_VERSION_NUM >= 0x070908
+	OPT_SSL_CAPATH,
+#endif
+	OPT_MAX
+};
+
+static size_t http_option_max_matched_len[OPT_MAX];
+
 static int curl_ssl_verify = -1;
 static int curl_ssl_try;
 static const char *ssl_cert;
@@ -141,33 +169,121 @@ static void process_curl_messages(void)
 }
 #endif
 
+static size_t http_options_url_match_prefix(const char *url,
+					    const char *url_prefix,
+					    size_t url_prefix_len)
+{
+	/*
+	 * url_prefix matches url if url_prefix is an exact match for url or it
+	 * is a prefix of url and the match ends on a path component boundary.
+	 * Both url and url_prefix are considered to have an implicit '/' on the
+	 * end for matching purposes if they do not already.
+	 *
+	 * url must be NUL terminated.  url_prefix_len is the length of
+	 * url_prefix which need not be NUL terminated.
+	 *
+	 * The return value is the length of the match in characters (excluding
+	 * any final '/') or 0 for no match.  Passing "/" as url_prefix will
+	 * always cause 0 to be returned.
+	 */
+	size_t url_len;
+	if (url_prefix_len && url_prefix[url_prefix_len - 1] == '/')
+		url_prefix_len--;
+	if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
+		return 0;
+	url_len = strlen(url);
+	if ((url_len == url_prefix_len) || (url[url_prefix_len] == '/'))
+		return url_prefix_len;
+	return 0;
+}
+
+static int new_match_is_shorter(size_t matchlen, enum http_option_type opt)
+{
+	/*
+	 * Compare matchlen to the last matched length of option opt and
+	 * return true if matchlen is shorter than the last matched length
+	 * (meaning the config setting should be ignored).  Upon seeing the
+	 * _same_ key (i.e. new key has the same match length which is therefore
+	 * not shorter) the new setting will override the previous setting.
+	 * Otherwise return false and record matchlen as the current last
+	 * matched length of option opt.
+	 */
+	if (matchlen < http_option_max_matched_len[opt])
+		return 1;
+	http_option_max_matched_len[opt] = matchlen;
+	return 0;
+}
+
 static int http_options(const char *var, const char *value, void *cb)
 {
-	if (!strcmp("http.sslverify", var)) {
+	const char *url = cb;
+	const char *key, *dot;
+	size_t matchlen = 0;
+
+	key = skip_prefix(var, "http.");
+	if (!key)
+		return git_default_config(var, value, cb);
+
+	/*
+	 * If the part following the leading "http." contains a '.' then check
+	 * to see if the part between "http." and the last '.' is a match to
+	 * url.  If it's not then ignore the setting.  Otherwise set key to
+	 * point to the option which is the part after the final '.' and
+	 * use key in subsequent comparisons to determine the option type.
+	 */
+	dot = strrchr(key, '.');
+	if (dot) {
+		matchlen = http_options_url_match_prefix(url, key, dot - key);
+		if (!matchlen)
+			return 0;
+		key = dot + 1;
+	}
+
+	if (!strcmp("sslverify", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_VERIFY))
+			return 0;
 		curl_ssl_verify = git_config_bool(var, value);
 		return 0;
 	}
-	if (!strcmp("http.sslcert", var))
+	if (!strcmp("sslcert", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_CERT))
+			return 0;
 		return git_config_string(&ssl_cert, var, value);
+	}
 #if LIBCURL_VERSION_NUM >= 0x070903
-	if (!strcmp("http.sslkey", var))
+	if (!strcmp("sslkey", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_KEY))
+			return 0;
 		return git_config_string(&ssl_key, var, value);
+	}
 #endif
 #if LIBCURL_VERSION_NUM >= 0x070908
-	if (!strcmp("http.sslcapath", var))
+	if (!strcmp("sslcapath", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_CAPATH))
+			return 0;
 		return git_config_string(&ssl_capath, var, value);
+	}
 #endif
-	if (!strcmp("http.sslcainfo", var))
+	if (!strcmp("sslcainfo", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_CAINFO))
+			return 0;
 		return git_config_string(&ssl_cainfo, var, value);
-	if (!strcmp("http.sslcertpasswordprotected", var)) {
+	}
+	if (!strcmp("sslcertpasswordprotected", key)) {
+		if (new_match_is_shorter(matchlen, OPT_PASSWD_REQ))
+			return 0;
 		ssl_cert_password_required = git_config_bool(var, value);
 		return 0;
 	}
-	if (!strcmp("http.ssltry", var)) {
+	if (!strcmp("ssltry", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_TRY))
+			return 0;
 		curl_ssl_try = git_config_bool(var, value);
 		return 0;
 	}
-	if (!strcmp("http.minsessions", var)) {
+	if (!strcmp("minsessions", key)) {
+		if (new_match_is_shorter(matchlen, OPT_MIN_SESSIONS))
+			return 0;
 		min_curl_sessions = git_config_int(var, value);
 #ifndef USE_CURL_MULTI
 		if (min_curl_sessions > 1)
@@ -176,39 +292,58 @@ static int http_options(const char *var, const char *value, void *cb)
 		return 0;
 	}
 #ifdef USE_CURL_MULTI
-	if (!strcmp("http.maxrequests", var)) {
+	if (!strcmp("maxrequests", key)) {
+		if (new_match_is_shorter(matchlen, OPT_MAX_REQUESTS))
+			return 0;
 		max_requests = git_config_int(var, value);
 		return 0;
 	}
 #endif
-	if (!strcmp("http.lowspeedlimit", var)) {
+	if (!strcmp("lowspeedlimit", key)) {
+		if (new_match_is_shorter(matchlen, OPT_LOW_SPEED))
+			return 0;
 		curl_low_speed_limit = (long)git_config_int(var, value);
 		return 0;
 	}
-	if (!strcmp("http.lowspeedtime", var)) {
+	if (!strcmp("lowspeedtime", key)) {
+		if (new_match_is_shorter(matchlen, OPT_LOW_TIME))
+			return 0;
 		curl_low_speed_time = (long)git_config_int(var, value);
 		return 0;
 	}
 
-	if (!strcmp("http.noepsv", var)) {
+	if (!strcmp("noepsv", key)) {
+		if (new_match_is_shorter(matchlen, OPT_NO_EPSV))
+			return 0;
 		curl_ftp_no_epsv = git_config_bool(var, value);
 		return 0;
 	}
-	if (!strcmp("http.proxy", var))
+	if (!strcmp("proxy", key)) {
+		if (new_match_is_shorter(matchlen, OPT_HTTP_PROXY))
+			return 0;
 		return git_config_string(&curl_http_proxy, var, value);
+	}
 
-	if (!strcmp("http.cookiefile", var))
+	if (!strcmp("cookiefile", key)) {
+		if (new_match_is_shorter(matchlen, OPT_COOKIE_FILE))
+			return 0;
 		return git_config_string(&curl_cookie_file, var, value);
+	}
 
-	if (!strcmp("http.postbuffer", var)) {
+	if (!strcmp("postbuffer", key)) {
+		if (new_match_is_shorter(matchlen, OPT_POST_BUFFER))
+			return 0;
 		http_post_buffer = git_config_int(var, value);
 		if (http_post_buffer < LARGE_PACKET_MAX)
 			http_post_buffer = LARGE_PACKET_MAX;
 		return 0;
 	}
 
-	if (!strcmp("http.useragent", var))
+	if (!strcmp("useragent", key)) {
+		if (new_match_is_shorter(matchlen, OPT_USER_AGENT))
+			return 0;
 		return git_config_string(&user_agent, var, value);
+	}
 
 	/* Fall back on the default ones */
 	return git_default_config(var, value, cb);
@@ -337,7 +472,7 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 
 	http_is_verbose = 0;
 
-	git_config(http_options, NULL);
+	git_config(http_options, (void *)url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v6 2/4] config: improve support for http.<url>.* settings
  2013-07-19 12:48 [PATCH v6 0/4] config: add support for http.<url>.* settings Kyle J. McKay
  2013-07-19 12:48 ` [PATCH v6 1/4] " Kyle J. McKay
@ 2013-07-19 12:48 ` Kyle J. McKay
  2013-07-19 19:59   ` Junio C Hamano
  2013-07-19 12:48 ` [PATCH v6 3/4] tests: add new test for the url_normalize function Kyle J. McKay
  2013-07-19 12:48 ` [PATCH v6 4/4] config: allow http.<url>.* any user matching Kyle J. McKay
  3 siblings, 1 reply; 10+ messages in thread
From: Kyle J. McKay @ 2013-07-19 12:48 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

Improve on the http.<url>.* url matching behavior by first
normalizing the urls before they are compared.

With this change, for example, the following configuration
section:

[http "https://example.com/path"]
        useragent = example-agent
        sslVerify = false

will properly match a "HTTPS://example.COM/p%61th" url which
is equivalent.

The normalization rules are based on RFC 3986 and should
result in any two equivalent urls being a match.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---
 Documentation/config.txt |  19 ++-
 http.c                   | 322 ++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 329 insertions(+), 12 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 41cab91..e461f32 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1517,16 +1517,15 @@ http.<url>.*::
 	Any of the http.* options above can be applied selectively to some urls.
 	For example "http.https://example.com.useragent" would set the user
 	agent only for https connections to example.com.  The <url> value
-	matches a url if it is an exact match or a prefix of the url matching
-	at a "/" boundary.  Longer <url> matches take precedence over shorter
-	ones with the environment variable settings taking precedence over all.
-	Note that <url> must match the url passed to git exactly (other than
-	possibly being a prefix).  This means any user, password and/or port
-	setting that appears in a url as well as any %XX escapes that are
-	present must also appear in <url> to have a successful match.  The urls
-	that are matched against are those given directly to git commands.  In
-	other words, use exactly the same url that was passed to git (possibly
-	shortened) for the <url> value of the config setting.
+	matches a url if it is an exact match or if it is a prefix of the url
+	matching at a "/" boundary.  Longer <url> matches take precedence over
+	shorter ones with the environment variable settings taking precedence
+	over all.  The urls are normalized before testing for a match.  Note,
+	however, that any user, password and/or port setting that appears in a
+	url must also match that part of <url> to have a successful match.  The
+	urls that are matched against are those given directly to git commands.
+	This means any urls visited as a result of a redirection do not
+	participate in matching.
 
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
diff --git a/http.c b/http.c
index f61a79c..3c5c2ef 100644
--- a/http.c
+++ b/http.c
@@ -169,6 +169,311 @@ static void process_curl_messages(void)
 }
 #endif
 
+#define URL_ALPHA "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
+#define URL_DIGIT "0123456789"
+#define URL_HEXDIGIT URL_DIGIT "ABCDEFabcdef"
+#define URL_ALPHADIGIT URL_ALPHA URL_DIGIT
+#define URL_SCHEME_CHARS URL_ALPHADIGIT "+.-"
+#define URL_HOST_CHARS URL_ALPHADIGIT ".-[:]" /* IPv6 literals need [:] */
+#define URL_UNSAFE_CHARS " <>\"%{}|\\^`" /* plus 0x00-0x1F,0x7F-0xFF */
+#define URL_GEN_RESERVED ":/?#[]@"
+#define URL_SUB_RESERVED "!$&'()*+,;="
+#define URL_RESERVED URL_GEN_RESERVED URL_SUB_RESERVED /* only allowed delims */
+
+static int hex_digit_value(int ch)
+{
+	/*
+	 * Returns the hexadecimal digit value (0x00..0x0F) of character ch.
+	 * If ch is not a hexadecimal digit, the return value is undefined.
+	 */
+
+	if ('0' <= ch && ch <= '9')
+		return ch - '0';
+	return toupper(ch) - 'A' + 10;
+}
+
+static int append_normalized_escapes(struct strbuf *buf,
+				     const char *from,
+				     size_t from_len,
+				     const char *esc_extra,
+				     const char *esc_ok)
+{
+	/*
+	 * Append to strbuf 'buf' characters from string 'from' with length
+	 * 'from_len' while unescaping characters that do not need to be escaped
+	 * and escaping characters that do.  The set of characters to escape
+	 * (the complement of which is unescaped) starts out as the RFC 3986
+	 * unsafe characters (0x00-0x1F,0x7F-0xFF," <>\"#%{}|\\^`").  If
+	 * 'esc_extra' is not NULL, those additional characters will also always
+	 * be escaped.  If 'esc_ok' is not NULL, those characters will be left
+	 * escaped if found that way, but will not be unescaped otherwise (used
+	 * for delimiters).  If a %-escape sequence is encountered that is not
+	 * followed by 2 hexadecimal digits, the sequence is invalid and
+	 * false (0) will be returned.  Otherwise true (1) will be returned for
+	 * success.
+	 *
+	 * Note that all %-escape sequences will be normalized to UPPERCASE
+	 * as indicated in RFC 3986.  Unless included in esc_extra or esc_ok
+	 * alphanumerics and "-._~" will always be unescaped as per RFC 3986.
+	 */
+
+	while (from_len) {
+		int ch = *from++;
+		int was_esc = 0;
+
+		from_len--;
+		if (ch == '%') {
+			if (from_len < 2 ||
+			    !strchr(URL_HEXDIGIT, from[0]) ||
+			    !strchr(URL_HEXDIGIT, from[1]))
+				return 0;
+			ch = hex_digit_value(*from++) << 4;
+			ch |= hex_digit_value(*from++);
+			from_len -= 2;
+			was_esc = 1;
+		}
+		if (ch <= 0x1F || ch >= 0x7F || strchr(URL_UNSAFE_CHARS, ch) ||
+		    (esc_extra && strchr(esc_extra, ch)) ||
+		    (was_esc && strchr(esc_ok, ch)))
+			strbuf_addf(buf, "%%%02X", ch);
+		else
+			strbuf_addch(buf, ch);
+	}
+
+	return 1;
+}
+
+static char *http_options_url_normalize(const char *url)
+{
+	/*
+	 * Normalize NUL-terminated url using the following rules:
+	 *
+	 * 1. Case-insensitive parts of url will be converted to lower case
+	 * 2. %-encoded characters that do not need to be will be unencoded
+	 * 3. Characters that are not %-encoded and must be will be encoded
+	 * 4. All %-encodings will be converted to upper case hexadecimal
+	 * 5. Leading 0s are removed from port numbers
+	 * 6. If the default port for the scheme is given it will be removed
+	 * 7. A path part (including empty) not starting with '/' has one added
+	 * 8. Any dot segments (. or ..) in the path are resolved and removed
+	 * 9. IPv6 host literals are allowed (but not normalized or validated)
+	 *
+	 * The rules are based on information in RFC 3986.
+	 *
+	 * Please note this function requires a full URL including a scheme
+	 * and host part (except for file: URLs which may have an empty host).
+	 *
+	 * The return value is a newly allocated string that must be freed
+	 * or NULL if the url is not valid.
+	 *
+	 * This is NOT a URL validation function.  Full URL validation is NOT
+	 * performed.  Some invalid host names are passed through this function
+	 * undetected.  However, most all other problems that make a URL invalid
+	 * will be detected (including a missing host for non file: URLs).
+	 */
+
+	size_t url_len = strlen(url);
+	struct strbuf norm;
+	size_t spanned;
+	const char *slash_ptr, *at_ptr, *colon_ptr, *path_start;
+	int found_host = 0;
+
+
+	/*
+	 * Copy lowercased scheme and :// suffix, %-escapes are not allowed
+	 */
+	spanned = strspn(url, URL_SCHEME_CHARS);
+	if (!spanned || spanned + 3 > url_len || url[spanned] != ':' ||
+	    url[spanned+1] != '/' || url[spanned+2] != '/')
+		return NULL; /* Bad scheme and/or missing "://" part */
+	strbuf_init(&norm, url_len);
+	spanned += 3;
+	url_len -= spanned;
+	while (spanned--)
+		strbuf_addch(&norm, tolower(*url++));
+
+
+	/*
+	 * Copy any username:password if present normalizing %-escapes
+	 */
+	at_ptr = strchr(url, '@');
+	slash_ptr = url + strcspn(url, "/?#");
+	if (at_ptr && at_ptr < slash_ptr) {
+		if (at_ptr > url) {
+			if (!append_normalized_escapes(&norm, url, at_ptr - url,
+						       "", URL_RESERVED)) {
+				strbuf_release(&norm);
+				return NULL;
+			}
+		}
+		strbuf_addch(&norm, '@');
+		url_len -= (++at_ptr - url);
+		url = at_ptr;
+	}
+
+
+	/*
+	 * Copy the host part excluding any port part, no %-escapes allowed
+	 */
+	if (!url_len || strchr(":/?#", *url)) {
+		/* Missing host invalid for all URL schemes except file */
+		if (strncmp(norm.buf, "file:", 5)) {
+			strbuf_release(&norm);
+			return NULL;
+		}
+	} else {
+		found_host = 1;
+	}
+	colon_ptr = slash_ptr - 1;
+	while (colon_ptr > url && *colon_ptr != ':' && *colon_ptr != ']')
+		colon_ptr--;
+	if (*colon_ptr != ':') {
+		colon_ptr = slash_ptr;
+	} else if (!found_host && colon_ptr < slash_ptr && colon_ptr + 1 != slash_ptr) {
+		/* file: URLs may not have a port number */
+		strbuf_release(&norm);
+		return NULL;
+	}
+	spanned = strspn(url, URL_HOST_CHARS);
+	if (spanned < colon_ptr - url) {
+		/* Host name has invalid characters */
+		strbuf_release(&norm);
+		return NULL;
+	}
+	while (url < colon_ptr) {
+		strbuf_addch(&norm, tolower(*url++));
+		url_len--;
+	}
+
+
+	/*
+	 * Check the port part and copy if not the default (after removing any
+	 * leading 0s); no %-escapes allowed
+	 */
+	if (colon_ptr < slash_ptr) {
+		/* skip the ':' and leading 0s but not the last one if all 0s */
+		url++;
+		url += strspn(url, "0");
+		if (url == slash_ptr && url[-1] == '0')
+			url--;
+		if (url == slash_ptr) {
+			/* Skip ":" port with no number, it's same as default */
+		} else if (slash_ptr - url == 2 &&
+			   !strncmp(norm.buf, "http:", 5) &&
+			   !strncmp(url, "80", 2)) {
+			/* Skip http :80 as it's the default */
+		} else if (slash_ptr - url == 3 &&
+			   !strncmp(norm.buf, "https:", 6) &&
+			   !strncmp(url, "443", 3)) {
+			/* Skip https :443 as it's the default */
+		} else {
+			/*
+			 * Port number must be all digits with leading 0s removed
+			 * and since all the protocols we deal with have a 16-bit
+			 * port number it must also be in the range 1..65535
+			 * 0 is not allowed because that means "next available"
+			 * on just about every system and therefore cannot be used
+			 */
+			unsigned long pnum = 0;
+			spanned = strspn(url, URL_DIGIT);
+			if (spanned < slash_ptr - url) {
+				/* port number has invalid characters */
+				strbuf_release(&norm);
+				return NULL;
+			}
+			if (slash_ptr - url <= 5)
+				pnum = strtoul(url, NULL, 10);
+			if (pnum == 0 || pnum > 65535) {
+				/* port number not in range 1..65535 */
+				strbuf_release(&norm);
+				return NULL;
+			}
+			strbuf_addch(&norm, ':');
+			strbuf_add(&norm, url, slash_ptr - url);
+		}
+		url_len -= slash_ptr - colon_ptr;
+		url = slash_ptr;
+	}
+
+
+	/*
+	 * Now copy the path resolving any . and .. segments being careful not
+	 * to corrupt the URL by unescaping any delimiters, but do add an
+	 * initial '/' if it's missing and do normalize any %-escape sequences.
+	 */
+	path_start = norm.buf + norm.len;
+	strbuf_addch(&norm, '/');
+	if (*url == '/') {
+		url++;
+		url_len--;
+	}
+	for (;;) {
+		const char *seg_start = norm.buf + norm.len;
+		const char *next_slash = url + strcspn(url, "/?#");
+		int skip_add_slash = 0;
+		/*
+		 * RFC 3689 indicates that any . or .. segments should be
+		 * unescaped before being checked for.
+		 */
+		if (!append_normalized_escapes(&norm, url, next_slash - url, "",
+					       URL_RESERVED)) {
+			strbuf_release(&norm);
+			return NULL;
+		}
+		if (!strcmp(seg_start, ".")) {
+			/* ignore a . segment; be careful not to remove initial '/' */
+			if (seg_start == path_start + 1) {
+				strbuf_setlen(&norm, norm.len - 1);
+				skip_add_slash = 1;
+			} else {
+				strbuf_setlen(&norm, norm.len - 2);
+			}
+		} else if (!strcmp(seg_start, "..")) {
+			/*
+			 * ignore a .. segment and remove the previous segment;
+			 * be careful not to remove initial '/' from path
+			 */
+			const char *prev_slash = norm.buf + norm.len - 3;
+			if (prev_slash == path_start) {
+				/* invalid .. because no previous segment to remove */
+				strbuf_release(&norm);
+				return NULL;
+			}
+			while (*--prev_slash != '/') {}
+			if (prev_slash == path_start) {
+				strbuf_setlen(&norm, prev_slash - norm.buf + 1);
+				skip_add_slash = 1;
+			} else {
+				strbuf_setlen(&norm, prev_slash - norm.buf);
+			}
+		}
+		url_len -= next_slash - url;
+		url = next_slash;
+		/* if the next char is not '/' done with the path */
+		if (*url != '/')
+			break;
+		url++;
+		url_len--;
+		if (!skip_add_slash)
+			strbuf_addch(&norm, '/');
+	}
+
+
+	/*
+	 * Now simply copy the rest, if any, only normalizing %-escapes and
+	 * being careful not to corrupt the URL by unescaping any delimiters.
+	 */
+	if (*url) {
+		if (!append_normalized_escapes(&norm, url, url_len, "", URL_RESERVED)) {
+			strbuf_release(&norm);
+			return NULL;
+		}
+	}
+
+
+	return strbuf_detach(&norm, NULL);
+}
+
 static size_t http_options_url_match_prefix(const char *url,
 					    const char *url_prefix,
 					    size_t url_prefix_len)
@@ -185,8 +490,13 @@ static size_t http_options_url_match_prefix(const char *url,
 	 * The return value is the length of the match in characters (excluding
 	 * any final '/') or 0 for no match.  Passing "/" as url_prefix will
 	 * always cause 0 to be returned.
+	 *
+	 * Passing NULL as url and/or url_prefix will always cause 0 to be
+	 * returned without causing any faults.
 	 */
 	size_t url_len;
+	if (!url || !url_prefix)
+		return 0;
 	if (url_prefix_len && url_prefix[url_prefix_len - 1] == '/')
 		url_prefix_len--;
 	if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
@@ -233,7 +543,13 @@ static int http_options(const char *var, const char *value, void *cb)
 	 */
 	dot = strrchr(key, '.');
 	if (dot) {
-		matchlen = http_options_url_match_prefix(url, key, dot - key);
+		char *config_url = xmemdupz(key, dot - key);
+		char *norm_url = http_options_url_normalize(config_url);
+		free(config_url);
+		if (!norm_url)
+			return 0;
+		matchlen = http_options_url_match_prefix(url, norm_url, strlen(norm_url));
+		free(norm_url);
 		if (!matchlen)
 			return 0;
 		key = dot + 1;
@@ -469,10 +785,12 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
 	char *low_speed_time;
+	char *norm_url = http_options_url_normalize(url);
 
 	http_is_verbose = 0;
 
-	git_config(http_options, (void *)url);
+	git_config(http_options, norm_url);
+	free(norm_url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v6 3/4] tests: add new test for the url_normalize function
  2013-07-19 12:48 [PATCH v6 0/4] config: add support for http.<url>.* settings Kyle J. McKay
  2013-07-19 12:48 ` [PATCH v6 1/4] " Kyle J. McKay
  2013-07-19 12:48 ` [PATCH v6 2/4] config: improve " Kyle J. McKay
@ 2013-07-19 12:48 ` Kyle J. McKay
  2013-07-19 12:48 ` [PATCH v6 4/4] config: allow http.<url>.* any user matching Kyle J. McKay
  3 siblings, 0 replies; 10+ messages in thread
From: Kyle J. McKay @ 2013-07-19 12:48 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

In order to perform sane URL matching for http.<url>.* options,
http.c normalizes URLs before performing matches.

A new test-url-normalize test program is introduced along with
a new t5200-url-normalize.sh script to run the tests.

Since the url_normalize function currently lives in http.c this
test will be skipped if NO_CURL is defined since http.c is skipped
in that case.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---
 .gitignore               |   1 +
 Makefile                 |   5 ++
 t/t5200-url-normalize.sh | 161 +++++++++++++++++++++++++++++++++++++++++++++++
 test-url-normalize.c     |  61 ++++++++++++++++++
 4 files changed, 228 insertions(+)
 create mode 100755 t/t5200-url-normalize.sh
 create mode 100644 test-url-normalize.c

diff --git a/.gitignore b/.gitignore
index 6669bf0..cd97e16 100644
--- a/.gitignore
+++ b/.gitignore
@@ -198,6 +198,7 @@
 /test-string-list
 /test-subprocess
 /test-svn-fe
+/test-url-normalize
 /test-wildmatch
 /common-cmds.h
 *.tar.gz
diff --git a/Makefile b/Makefile
index 0f931a2..f83879c 100644
--- a/Makefile
+++ b/Makefile
@@ -567,6 +567,7 @@ TEST_PROGRAMS_NEED_X += test-sigchain
 TEST_PROGRAMS_NEED_X += test-string-list
 TEST_PROGRAMS_NEED_X += test-subprocess
 TEST_PROGRAMS_NEED_X += test-svn-fe
+TEST_PROGRAMS_NEED_X += test-url-normalize
 TEST_PROGRAMS_NEED_X += test-wildmatch
 
 TEST_PROGRAMS = $(patsubst %,%$X,$(TEST_PROGRAMS_NEED_X))
@@ -2235,6 +2236,10 @@ test-parse-options$X: parse-options.o parse-options-cb.o
 
 test-svn-fe$X: vcs-svn/lib.a
 
+test-url-normalize$X: test-url-normalize.o GIT-LDFLAGS $(GITLIBS)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
+		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
+
 .PRECIOUS: $(TEST_OBJS)
 
 test-%$X: test-%.o GIT-LDFLAGS $(GITLIBS)
diff --git a/t/t5200-url-normalize.sh b/t/t5200-url-normalize.sh
new file mode 100755
index 0000000..82d78ce
--- /dev/null
+++ b/t/t5200-url-normalize.sh
@@ -0,0 +1,161 @@
+#!/bin/sh
+
+test_description='url normalization'
+. ./test-lib.sh
+
+if test -n "$NO_CURL"; then
+	skip_all='skipping test, git built without http support'
+	test_done
+fi
+
+# Note that only file: URLs should be allowed without a host
+
+test_expect_success 'url scheme' '
+	! test-url-normalize "" &&
+	! test-url-normalize "_" &&
+	! test-url-normalize "scheme" &&
+	! test-url-normalize "scheme:" &&
+	! test-url-normalize "scheme:/" &&
+	! test-url-normalize "scheme://" &&
+	! test-url-normalize "file" &&
+	! test-url-normalize "file:" &&
+	! test-url-normalize "file:/" &&
+	test-url-normalize "file://" &&
+	! test-url-normalize "://acme.co" &&
+	! test-url-normalize "x_test://acme.co" &&
+	! test-url-normalize "schem%6e://" &&
+	test-url-normalize "x-Test+v1.0://acme.co" &&
+	test "$(test-url-normalize -p "AbCdeF://x.Y")" = "abcdef://x.y/"
+'
+
+test_expect_success 'url authority' '
+	! test-url-normalize "scheme://user:pass@" &&
+	! test-url-normalize "scheme://?" &&
+	! test-url-normalize "scheme://#" &&
+	! test-url-normalize "scheme:///" &&
+	! test-url-normalize "scheme://:" &&
+	! test-url-normalize "scheme://:555" &&
+	test-url-normalize "file://user:pass@" &&
+	test-url-normalize "file://?" &&
+	test-url-normalize "file://#" &&
+	test-url-normalize "file:///" &&
+	test-url-normalize "file://:" &&
+	! test-url-normalize "file://:555" &&
+	test-url-normalize "scheme://user:pass@host" &&
+	test-url-normalize "scheme://@host" &&
+	test-url-normalize "scheme://%00@host" &&
+	! test-url-normalize "scheme://%%@host" &&
+	! test-url-normalize "scheme://host_" &&
+	test-url-normalize "scheme://user:pass@host/" &&
+	test-url-normalize "scheme://@host/" &&
+	test-url-normalize "scheme://host/" &&
+	test-url-normalize "scheme://host?x" &&
+	test-url-normalize "scheme://host#x" &&
+	test-url-normalize "scheme://host/@" &&
+	test-url-normalize "scheme://host?@x" &&
+	test-url-normalize "scheme://host#@x" &&
+	test-url-normalize "scheme://[::1]" &&
+	test-url-normalize "scheme://[::1]/" &&
+	! test-url-normalize "scheme://hos%41/" &&
+	test-url-normalize "scheme://[invalid....:/" &&
+	test-url-normalize "scheme://invalid....:]/" &&
+	! test-url-normalize "scheme://invalid....:[/" &&
+	! test-url-normalize "scheme://invalid....:["
+'
+
+test_expect_success 'url port checks' '
+	test-url-normalize "xyz://q@some.host:" &&
+	test-url-normalize "xyz://q@some.host:456/" &&
+	! test-url-normalize "xyz://q@some.host:0" &&
+	! test-url-normalize "xyz://q@some.host:0000000" &&
+	test-url-normalize "xyz://q@some.host:0000001?" &&
+	test-url-normalize "xyz://q@some.host:065535#" &&
+	test-url-normalize "xyz://q@some.host:65535" &&
+	! test-url-normalize "xyz://q@some.host:65536" &&
+	! test-url-normalize "xyz://q@some.host:99999" &&
+	! test-url-normalize "xyz://q@some.host:100000" &&
+	! test-url-normalize "xyz://q@some.host:100001" &&
+	test-url-normalize "http://q@some.host:80" &&
+	test-url-normalize "https://q@some.host:443" &&
+	test-url-normalize "http://q@some.host:80/" &&
+	test-url-normalize "https://q@some.host:443?" &&
+	! test-url-normalize "http://q@:8008" &&
+	! test-url-normalize "http://:8080" &&
+	! test-url-normalize "http://:" &&
+	test-url-normalize "xyz://q@some.host:456/" &&
+	test-url-normalize "xyz://[::1]:456/" &&
+	test-url-normalize "xyz://[::1]:/" &&
+	! test-url-normalize "xyz://[::1]:000/" &&
+	! test-url-normalize "xyz://[::1]:0%300/" &&
+	! test-url-normalize "xyz://[::1]:0x80/" &&
+	! test-url-normalize "xyz://[::1]:4294967297/" &&
+	! test-url-normalize "xyz://[::1]:030f/"
+'
+
+test_expect_success 'url port normalization' '
+	test "$(test-url-normalize -p "http://x:800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:0800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:00000800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:065535")" = "http://x:65535/" &&
+	test "$(test-url-normalize -p "http://x:1")" = "http://x:1/" &&
+	test "$(test-url-normalize -p "http://x:80")" = "http://x/" &&
+	test "$(test-url-normalize -p "http://x:080")" = "http://x/" &&
+	test "$(test-url-normalize -p "http://x:000000080")" = "http://x/" &&
+	test "$(test-url-normalize -p "https://x:443")" = "https://x/" &&
+	test "$(test-url-normalize -p "https://x:0443")" = "https://x/" &&
+	test "$(test-url-normalize -p "https://x:000000443")" = "https://x/"
+'
+
+test_expect_success 'url general escapes' '
+	! test-url-normalize "http://x.y?%fg" &&
+	test "$(test-url-normalize -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
+	test "$(test-url-normalize -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
+	test "$(test-url-normalize -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
+	test "$(test-url-normalize -p "X://W/'\''")" = "x://w/'\''" &&
+	test "$(test-url-normalize -p "X://W?'\!'")" = "x://w/?'\!'"
+';#'
+
+test_expect_success 'url username/password escapes' '
+	test "$(test-url-normalize -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
+'
+
+test_expect_success 'url normalized lengths' '
+	test "$(test-url-normalize -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
+	test "$(test-url-normalize -l "http://%41:%42@x.y/%61/")" = 17 &&
+	test "$(test-url-normalize -l "http://@x.y/^")" = 15
+'
+
+test_expect_success 'url . and .. segments' '
+	test "$(test-url-normalize -p "x://y/.")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/./")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/a/.")" = "x://y/a" &&
+	test "$(test-url-normalize -p "x://y/a/./")" = "x://y/a/" &&
+	test "$(test-url-normalize -p "x://y/.?")" = "x://y/?" &&
+	test "$(test-url-normalize -p "x://y/./?")" = "x://y/?" &&
+	test "$(test-url-normalize -p "x://y/a/.?")" = "x://y/a?" &&
+	test "$(test-url-normalize -p "x://y/a/./?")" = "x://y/a/?" &&
+	test "$(test-url-normalize -p "x://y/a/./b/.././../c")" = "x://y/c" &&
+	test "$(test-url-normalize -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
+	test "$(test-url-normalize -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
+	! test-url-normalize "x://y/a/./b/.././../c/././.././.." &&
+	test "$(test-url-normalize -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
+	test "$(test-url-normalize -p "x://y/%2e/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/%2E/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/a/%2e./")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/b/.%2E/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/c/%2e%2E/")" = "x://y/"
+'
+
+# http://@foo specifies an empty user name but does not specify a password
+# http://foo  specifies neither a user name nor a password
+# So they should not be equivalent
+test_expect_success 'url equivalents' '
+	test-url-normalize "httP://x" "Http://X/" &&
+	test-url-normalize "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
+	! test-url-normalize "https://@x.y/^" "httpS://x.y:443/^" &&
+	test-url-normalize "https://@x.y/^" "httpS://@x.y:0443/^" &&
+	test-url-normalize "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
+	test-url-normalize "https://@x.y/^/.." "httpS://@x.y:0443/"
+'
+
+test_done
diff --git a/test-url-normalize.c b/test-url-normalize.c
new file mode 100644
index 0000000..d68312d
--- /dev/null
+++ b/test-url-normalize.c
@@ -0,0 +1,61 @@
+#ifdef NO_CURL
+
+int main()
+{
+	return 125;
+}
+
+#else /* !NO_CURL */
+
+#include "http.c"
+
+#define url_normalize(u) http_options_url_normalize(u)
+
+int main(int argc, char **argv)
+{
+	const char *usage = "test-url-normalize [-p | -l] <url1> | <url1> <url2>";
+	char *url1, *url2;
+	int opt_p = 0, opt_l = 0;
+
+	/*
+	 * For one url, succeed if url_normalize succeeds on it, fail otherwise.
+	 * For two urls, succeed only if url_normalize succeeds on both and
+	 * the results compare equal with strcmp.  If -p is given (one url only)
+	 * and url_normalize succeeds, print the result followed by "\n".  If
+	 * -l is given (one url only) and url_normalize succeeds, print the
+	 * returned length in decimal followed by "\n".
+	 */
+
+	if (argc > 1 && !strcmp(argv[1], "-p")) {
+		opt_p = 1;
+		argc--;
+		argv++;
+	} else if (argc > 1 && !strcmp(argv[1], "-l")) {
+		opt_l = 1;
+		argc--;
+		argv++;
+	}
+
+	if (argc < 2 || argc > 3)
+		die(usage);
+
+	if (argc == 2) {
+		url1 = url_normalize(argv[1]);
+		if (!url1)
+			return 1;
+		if (opt_p)
+			printf("%s\n", url1);
+		if (opt_l)
+			printf("%u\n", (unsigned)strlen(url1));
+		return 0;
+	}
+
+	if (opt_p || opt_l)
+		die(usage);
+
+	url1 = url_normalize(argv[1]);
+	url2 = url_normalize(argv[2]);
+	return (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
+}
+
+#endif /* !NO_CURL */
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v6 4/4] config: allow http.<url>.* any user matching
  2013-07-19 12:48 [PATCH v6 0/4] config: add support for http.<url>.* settings Kyle J. McKay
                   ` (2 preceding siblings ...)
  2013-07-19 12:48 ` [PATCH v6 3/4] tests: add new test for the url_normalize function Kyle J. McKay
@ 2013-07-19 12:48 ` Kyle J. McKay
  3 siblings, 0 replies; 10+ messages in thread
From: Kyle J. McKay @ 2013-07-19 12:48 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

Previously the <url> had to specify an exactly matching user name
and password if those were present in the url being matched against.

Now the password portion is always ignored and omitting the user
name from <url> allows it to match against any user name.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---
 Documentation/config.txt |  23 ++--
 http.c                   | 280 +++++++++++++++++++++++++++++++++++++++--------
 test-url-normalize.c     |  11 +-
 3 files changed, 254 insertions(+), 60 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e461f32..8b32a15 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1517,15 +1517,20 @@ http.<url>.*::
 	Any of the http.* options above can be applied selectively to some urls.
 	For example "http.https://example.com.useragent" would set the user
 	agent only for https connections to example.com.  The <url> value
-	matches a url if it is an exact match or if it is a prefix of the url
-	matching at a "/" boundary.  Longer <url> matches take precedence over
-	shorter ones with the environment variable settings taking precedence
-	over all.  The urls are normalized before testing for a match.  Note,
-	however, that any user, password and/or port setting that appears in a
-	url must also match that part of <url> to have a successful match.  The
-	urls that are matched against are those given directly to git commands.
-	This means any urls visited as a result of a redirection do not
-	participate in matching.
+	matches a url if it refers to the same scheme, host and port and the
+	path portion is an exact match or a prefix that matches at a "/"
+	boundary.  If <url> does not include a user name, it will match a url
+	with any username otherwise the user name must match as well (the
+	password part, if present in the url, is always ignored).  Longer <url>
+	path matches take precedence over shorter matches no matter what order
+	they occur in.  For same length matches, the last one wins except that a
+	same-length <url> match that includes a user name will be preferred over
+	a same-length <url> match that does not.  The urls are normalized before
+	matching so that equivalent urls that are simply spelled differently
+	will match properly.  Environment variable settings always override any
+	matches.  The urls that are matched against are those given directly to
+	git commands.  This means any urls visited as a result of a redirection
+	do not participate in matching.
 
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
diff --git a/http.c b/http.c
index 3c5c2ef..3096a1d 100644
--- a/http.c
+++ b/http.c
@@ -56,7 +56,35 @@ enum http_option_type {
 	OPT_MAX
 };
 
+struct url_info {
+	char *url;		/* normalized url on success, must be freed, otherwise NULL */
+	const char *err;	/* if !url, a brief reason for the failure, otherwise NULL */
+
+	/* the rest of the fields are only set if url != NULL */
+
+	size_t url_len;		/* total length of url (which is now normalized) */
+	size_t scheme_len;	/* length of scheme name (excluding final :) */
+	size_t user_off;	/* offset into url to start of user name (0 => none) */
+	size_t user_len;	/* length of user name; if user_off != 0 but
+				   user_len == 0, an empty user name was given */
+	size_t passwd_off;	/* offset into url to start of passwd (0 => none) */
+	size_t passwd_len;	/* length of passwd; if passwd_off != 0 but
+				   passwd_len == 0, an empty passwd was given */
+	size_t host_off;	/* offset into url to start of host name (0 => none) */
+	size_t host_len;	/* length of host name; this INCLUDES any ':portnum';
+				 * file urls may have host_len == 0 */
+	size_t port_len;	/* if a portnum is present (port_len != 0), it has
+				 * this length (excluding the leading ':') at the
+				 * end of the host name (always 0 for file urls) */
+	size_t path_off;	/* offset into url to the start of the url path;
+				 * this will always point to a '/' character
+				 * after the url has been normalized */
+	size_t path_len;	/* length of path portion excluding any trailing
+				 * '?...' and '#...' portion; will always be >= 1 */
+};
+
 static size_t http_option_max_matched_len[OPT_MAX];
+static int http_option_user_matched[OPT_MAX];
 
 static int curl_ssl_verify = -1;
 static int curl_ssl_try;
@@ -243,7 +271,7 @@ static int append_normalized_escapes(struct strbuf *buf,
 	return 1;
 }
 
-static char *http_options_url_normalize(const char *url)
+static char *http_options_url_normalize(const char *url, struct url_info *out_info)
 {
 	/*
 	 * Normalize NUL-terminated url using the following rules:
@@ -266,6 +294,14 @@ static char *http_options_url_normalize(const char *url)
 	 * The return value is a newly allocated string that must be freed
 	 * or NULL if the url is not valid.
 	 *
+	 * If out_info is non-NULL, the url and err fields therein will always
+	 * be set.  If a non-NULL value is returned, it will be stored in
+	 * out_info->url as well, out_info->err will be set to NULL and the
+	 * other fields of *out_info will also be filled in.  If a NULL value
+	 * is returned, NULL will be stored in out_info->url and out_info->err
+	 * will be set to a brief, translated, error message, but no other
+	 * fields will be filled in.
+	 *
 	 * This is NOT a URL validation function.  Full URL validation is NOT
 	 * performed.  Some invalid host names are passed through this function
 	 * undetected.  However, most all other problems that make a URL invalid
@@ -275,18 +311,25 @@ static char *http_options_url_normalize(const char *url)
 	size_t url_len = strlen(url);
 	struct strbuf norm;
 	size_t spanned;
+	size_t scheme_len, user_off=0, user_len=0, passwd_off=0, passwd_len=0;
+	size_t host_off=0, host_len=0, port_len=0, path_off, path_len, result_len;
 	const char *slash_ptr, *at_ptr, *colon_ptr, *path_start;
-	int found_host = 0;
-
+	char *result;
 
 	/*
 	 * Copy lowercased scheme and :// suffix, %-escapes are not allowed
 	 */
 	spanned = strspn(url, URL_SCHEME_CHARS);
 	if (!spanned || spanned + 3 > url_len || url[spanned] != ':' ||
-	    url[spanned+1] != '/' || url[spanned+2] != '/')
+	    url[spanned+1] != '/' || url[spanned+2] != '/') {
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("invalid URL scheme name or missing '://' suffix");
+		}
 		return NULL; /* Bad scheme and/or missing "://" part */
+	}
 	strbuf_init(&norm, url_len);
+	scheme_len = spanned;
 	spanned += 3;
 	url_len -= spanned;
 	while (spanned--)
@@ -299,12 +342,25 @@ static char *http_options_url_normalize(const char *url)
 	at_ptr = strchr(url, '@');
 	slash_ptr = url + strcspn(url, "/?#");
 	if (at_ptr && at_ptr < slash_ptr) {
+		user_off = norm.len;
 		if (at_ptr > url) {
 			if (!append_normalized_escapes(&norm, url, at_ptr - url,
 						       "", URL_RESERVED)) {
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid %XX escape sequence");
+				}
 				strbuf_release(&norm);
 				return NULL;
 			}
+			colon_ptr = strchr(norm.buf + scheme_len + 3, ':');
+			if (colon_ptr) {
+				passwd_off = (colon_ptr + 1) - norm.buf;
+				passwd_len = norm.len - passwd_off;
+				user_len = (passwd_off - 1) - (scheme_len + 3);
+			} else {
+				user_len = norm.len - (scheme_len + 3);
+			}
 		}
 		strbuf_addch(&norm, '@');
 		url_len -= (++at_ptr - url);
@@ -318,25 +374,37 @@ static char *http_options_url_normalize(const char *url)
 	if (!url_len || strchr(":/?#", *url)) {
 		/* Missing host invalid for all URL schemes except file */
 		if (strncmp(norm.buf, "file:", 5)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("missing host and scheme is not 'file:'");
+			}
 			strbuf_release(&norm);
 			return NULL;
 		}
 	} else {
-		found_host = 1;
+		host_off = norm.len;
 	}
 	colon_ptr = slash_ptr - 1;
 	while (colon_ptr > url && *colon_ptr != ':' && *colon_ptr != ']')
 		colon_ptr--;
 	if (*colon_ptr != ':') {
 		colon_ptr = slash_ptr;
-	} else if (!found_host && colon_ptr < slash_ptr && colon_ptr + 1 != slash_ptr) {
+	} else if (!host_off && colon_ptr < slash_ptr && colon_ptr + 1 != slash_ptr) {
 		/* file: URLs may not have a port number */
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("a 'file:' URL may not have a port number");
+		}
 		strbuf_release(&norm);
 		return NULL;
 	}
 	spanned = strspn(url, URL_HOST_CHARS);
 	if (spanned < colon_ptr - url) {
 		/* Host name has invalid characters */
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("invalid characters in host name");
+		}
 		strbuf_release(&norm);
 		return NULL;
 	}
@@ -378,6 +446,10 @@ static char *http_options_url_normalize(const char *url)
 			spanned = strspn(url, URL_DIGIT);
 			if (spanned < slash_ptr - url) {
 				/* port number has invalid characters */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid port number");
+				}
 				strbuf_release(&norm);
 				return NULL;
 			}
@@ -385,15 +457,22 @@ static char *http_options_url_normalize(const char *url)
 				pnum = strtoul(url, NULL, 10);
 			if (pnum == 0 || pnum > 65535) {
 				/* port number not in range 1..65535 */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid port number");
+				}
 				strbuf_release(&norm);
 				return NULL;
 			}
 			strbuf_addch(&norm, ':');
 			strbuf_add(&norm, url, slash_ptr - url);
+			port_len = slash_ptr - url;
 		}
 		url_len -= slash_ptr - colon_ptr;
 		url = slash_ptr;
 	}
+	if (host_off)
+		host_len = norm.len - host_off;
 
 
 	/*
@@ -401,7 +480,8 @@ static char *http_options_url_normalize(const char *url)
 	 * to corrupt the URL by unescaping any delimiters, but do add an
 	 * initial '/' if it's missing and do normalize any %-escape sequences.
 	 */
-	path_start = norm.buf + norm.len;
+	path_off = norm.len;
+	path_start = norm.buf + path_off;
 	strbuf_addch(&norm, '/');
 	if (*url == '/') {
 		url++;
@@ -417,6 +497,10 @@ static char *http_options_url_normalize(const char *url)
 		 */
 		if (!append_normalized_escapes(&norm, url, next_slash - url, "",
 					       URL_RESERVED)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("invalid %XX escape sequence");
+			}
 			strbuf_release(&norm);
 			return NULL;
 		}
@@ -436,6 +520,10 @@ static char *http_options_url_normalize(const char *url)
 			const char *prev_slash = norm.buf + norm.len - 3;
 			if (prev_slash == path_start) {
 				/* invalid .. because no previous segment to remove */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid '..' path segment");
+				}
 				strbuf_release(&norm);
 				return NULL;
 			}
@@ -457,6 +545,7 @@ static char *http_options_url_normalize(const char *url)
 		if (!skip_add_slash)
 			strbuf_addch(&norm, '/');
 	}
+	path_len = norm.len - path_off;
 
 
 	/*
@@ -465,13 +554,33 @@ static char *http_options_url_normalize(const char *url)
 	 */
 	if (*url) {
 		if (!append_normalized_escapes(&norm, url, url_len, "", URL_RESERVED)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("invalid %XX escape sequence");
+			}
 			strbuf_release(&norm);
 			return NULL;
 		}
 	}
 
 
-	return strbuf_detach(&norm, NULL);
+	result = strbuf_detach(&norm, &result_len);
+	if (out_info) {
+		out_info->url = result;
+		out_info->err = NULL;
+		out_info->url_len = result_len;
+		out_info->scheme_len = scheme_len;
+		out_info->user_off = user_off;
+		out_info->user_len = user_len;
+		out_info->passwd_off = passwd_off;
+		out_info->passwd_len = passwd_len;
+		out_info->host_off = host_off;
+		out_info->host_len = host_len;
+		out_info->port_len = port_len;
+		out_info->path_off = path_off;
+		out_info->path_len = path_len;
+	}
+	return result;
 }
 
 static size_t http_options_url_match_prefix(const char *url,
@@ -487,48 +596,120 @@ static size_t http_options_url_match_prefix(const char *url,
 	 * url must be NUL terminated.  url_prefix_len is the length of
 	 * url_prefix which need not be NUL terminated.
 	 *
-	 * The return value is the length of the match in characters (excluding
-	 * any final '/') or 0 for no match.  Passing "/" as url_prefix will
-	 * always cause 0 to be returned.
+	 * The return value is the length of the match in characters (including
+	 * the final '/' even if it's implicit) or 0 for no match.
 	 *
 	 * Passing NULL as url and/or url_prefix will always cause 0 to be
 	 * returned without causing any faults.
 	 */
-	size_t url_len;
 	if (!url || !url_prefix)
 		return 0;
-	if (url_prefix_len && url_prefix[url_prefix_len - 1] == '/')
+	if (!url_prefix_len || (url_prefix_len == 1 && *url_prefix == '/'))
+		return (!*url || *url == '/') ? 1 : 0;
+	if (url_prefix[url_prefix_len - 1] == '/')
 		url_prefix_len--;
-	if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
+	if (strncmp(url, url_prefix, url_prefix_len))
 		return 0;
-	url_len = strlen(url);
-	if ((url_len == url_prefix_len) || (url[url_prefix_len] == '/'))
-		return url_prefix_len;
+	if ((strlen(url) == url_prefix_len) || (url[url_prefix_len] == '/'))
+		return url_prefix_len + 1;
 	return 0;
 }
 
-static int new_match_is_shorter(size_t matchlen, enum http_option_type opt)
+static int http_options_match_urls(const struct url_info *url,
+				   const struct url_info *url_prefix,
+				   int *exactusermatch)
+{
+	/*
+	 * url_prefix matches url if the scheme, host and port of url_prefix
+	 * are the same as those of url and the path portion of url_prefix
+	 * is the same as the path portion of url or it is a prefix that
+	 * matches at a '/' boundary.  If url_prefix contains a user name,
+	 * that must also exactly match the user name in url.
+	 *
+	 * If the user, host, port and path match in this fashion, the returned
+	 * value is the length of the path match including any implicit
+	 * final '/'.  For example, "http://me@example.com/path" is matched by
+	 * "http://example.com" with a path length of 1.
+	 *
+	 * If there is a match and exactusermatch is not NULL, then
+	 * *exactusermatch will be set to true if both url and url_prefix
+	 * contained a user name or false if url_prefix did not have a
+	 * user name.  If there is no match *exactusermatch is left untouched.
+	 */
+	int usermatched = 0;
+	int pathmatchlen;
+
+	if (!url || !url_prefix || !url->url || !url_prefix->url)
+		return 0;
+
+	/* check the scheme */
+	if (url_prefix->scheme_len != url->scheme_len ||
+	    strncmp(url->url, url_prefix->url, url->scheme_len))
+		return 0; /* schemes do not match */
+
+	/* check the user name if url_prefix has one */
+	if (url_prefix->user_off) {
+		if (!url->user_off || url->user_len != url_prefix->user_len ||
+		    strncmp(url->url + url->user_off,
+			    url_prefix->url + url_prefix->user_off,
+			    url->user_len))
+			return 0; /* url_prefix has a user but it's not a match */
+		usermatched = 1;
+	}
+
+	/* check the host and port */
+	if (url_prefix->host_len != url->host_len ||
+	    strncmp(url->url + url->host_off,
+		    url_prefix->url + url_prefix->host_off, url->host_len))
+		return 0; /* host names and/or ports do not match */
+
+	/* check the path */
+	pathmatchlen = http_options_url_match_prefix(
+		url->url + url->path_off,
+		url_prefix->url + url_prefix->path_off,
+		url_prefix->url_len - url_prefix->path_off);
+
+	if (pathmatchlen && exactusermatch)
+		*exactusermatch = usermatched;
+	return pathmatchlen;
+}
+
+static int match_is_ignored(size_t matchlen, int usermatch, enum http_option_type opt)
 {
 	/*
-	 * Compare matchlen to the last matched length of option opt and
+	 * Compare matchlen to the last matched path length of option opt and
 	 * return true if matchlen is shorter than the last matched length
 	 * (meaning the config setting should be ignored).  Upon seeing the
 	 * _same_ key (i.e. new key has the same match length which is therefore
-	 * not shorter) the new setting will override the previous setting.
-	 * Otherwise return false and record matchlen as the current last
-	 * matched length of option opt.
+	 * not shorter) the new setting will override the previous setting
+	 * unless the new setting did not match the user and the previous match
+	 * did.  Otherwise return false and record matchlen as the current last
+	 * matched path length of option opt and usermatch as the last user
+	 * matching state for option opt.
 	 */
 	if (matchlen < http_option_max_matched_len[opt])
 		return 1;
-	http_option_max_matched_len[opt] = matchlen;
+	if (matchlen > http_option_max_matched_len[opt]) {
+		http_option_max_matched_len[opt] = matchlen;
+		http_option_user_matched[opt] = usermatch;
+		return 0;
+	}
+	/*
+	 * If a previous match of the same length explicitly matched the user
+	 * name, but the current match matched on any user, ignore it.
+	 */
+	if (!usermatch && http_option_user_matched[opt])
+		return 1;
+	http_option_user_matched[opt] = usermatch;
 	return 0;
 }
 
 static int http_options(const char *var, const char *value, void *cb)
 {
-	const char *url = cb;
+	const struct url_info *info = cb;
 	const char *key, *dot;
 	size_t matchlen = 0;
+	int usermatch = 0;
 
 	key = skip_prefix(var, "http.");
 	if (!key)
@@ -543,12 +724,18 @@ static int http_options(const char *var, const char *value, void *cb)
 	 */
 	dot = strrchr(key, '.');
 	if (dot) {
-		char *config_url = xmemdupz(key, dot - key);
-		char *norm_url = http_options_url_normalize(config_url);
+		char *config_url;
+		struct url_info norm_info;
+		char *norm_url;
+
+		if (!info || !info->url)
+			return 0;
+		config_url = xmemdupz(key, dot - key);
+		norm_url = http_options_url_normalize(config_url, &norm_info);
 		free(config_url);
 		if (!norm_url)
 			return 0;
-		matchlen = http_options_url_match_prefix(url, norm_url, strlen(norm_url));
+		matchlen = http_options_match_urls(info, &norm_info, &usermatch);
 		free(norm_url);
 		if (!matchlen)
 			return 0;
@@ -556,49 +743,49 @@ static int http_options(const char *var, const char *value, void *cb)
 	}
 
 	if (!strcmp("sslverify", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_VERIFY))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_VERIFY))
 			return 0;
 		curl_ssl_verify = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("sslcert", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_CERT))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_CERT))
 			return 0;
 		return git_config_string(&ssl_cert, var, value);
 	}
 #if LIBCURL_VERSION_NUM >= 0x070903
 	if (!strcmp("sslkey", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_KEY))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_KEY))
 			return 0;
 		return git_config_string(&ssl_key, var, value);
 	}
 #endif
 #if LIBCURL_VERSION_NUM >= 0x070908
 	if (!strcmp("sslcapath", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_CAPATH))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_CAPATH))
 			return 0;
 		return git_config_string(&ssl_capath, var, value);
 	}
 #endif
 	if (!strcmp("sslcainfo", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_CAINFO))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_CAINFO))
 			return 0;
 		return git_config_string(&ssl_cainfo, var, value);
 	}
 	if (!strcmp("sslcertpasswordprotected", key)) {
-		if (new_match_is_shorter(matchlen, OPT_PASSWD_REQ))
+		if (match_is_ignored(matchlen, usermatch, OPT_PASSWD_REQ))
 			return 0;
 		ssl_cert_password_required = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("ssltry", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_TRY))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_TRY))
 			return 0;
 		curl_ssl_try = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("minsessions", key)) {
-		if (new_match_is_shorter(matchlen, OPT_MIN_SESSIONS))
+		if (match_is_ignored(matchlen, usermatch, OPT_MIN_SESSIONS))
 			return 0;
 		min_curl_sessions = git_config_int(var, value);
 #ifndef USE_CURL_MULTI
@@ -609,45 +796,45 @@ static int http_options(const char *var, const char *value, void *cb)
 	}
 #ifdef USE_CURL_MULTI
 	if (!strcmp("maxrequests", key)) {
-		if (new_match_is_shorter(matchlen, OPT_MAX_REQUESTS))
+		if (match_is_ignored(matchlen, usermatch, OPT_MAX_REQUESTS))
 			return 0;
 		max_requests = git_config_int(var, value);
 		return 0;
 	}
 #endif
 	if (!strcmp("lowspeedlimit", key)) {
-		if (new_match_is_shorter(matchlen, OPT_LOW_SPEED))
+		if (match_is_ignored(matchlen, usermatch, OPT_LOW_SPEED))
 			return 0;
 		curl_low_speed_limit = (long)git_config_int(var, value);
 		return 0;
 	}
 	if (!strcmp("lowspeedtime", key)) {
-		if (new_match_is_shorter(matchlen, OPT_LOW_TIME))
+		if (match_is_ignored(matchlen, usermatch, OPT_LOW_TIME))
 			return 0;
 		curl_low_speed_time = (long)git_config_int(var, value);
 		return 0;
 	}
 
 	if (!strcmp("noepsv", key)) {
-		if (new_match_is_shorter(matchlen, OPT_NO_EPSV))
+		if (match_is_ignored(matchlen, usermatch, OPT_NO_EPSV))
 			return 0;
 		curl_ftp_no_epsv = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("proxy", key)) {
-		if (new_match_is_shorter(matchlen, OPT_HTTP_PROXY))
+		if (match_is_ignored(matchlen, usermatch, OPT_HTTP_PROXY))
 			return 0;
 		return git_config_string(&curl_http_proxy, var, value);
 	}
 
 	if (!strcmp("cookiefile", key)) {
-		if (new_match_is_shorter(matchlen, OPT_COOKIE_FILE))
+		if (match_is_ignored(matchlen, usermatch, OPT_COOKIE_FILE))
 			return 0;
 		return git_config_string(&curl_cookie_file, var, value);
 	}
 
 	if (!strcmp("postbuffer", key)) {
-		if (new_match_is_shorter(matchlen, OPT_POST_BUFFER))
+		if (match_is_ignored(matchlen, usermatch, OPT_POST_BUFFER))
 			return 0;
 		http_post_buffer = git_config_int(var, value);
 		if (http_post_buffer < LARGE_PACKET_MAX)
@@ -656,7 +843,7 @@ static int http_options(const char *var, const char *value, void *cb)
 	}
 
 	if (!strcmp("useragent", key)) {
-		if (new_match_is_shorter(matchlen, OPT_USER_AGENT))
+		if (match_is_ignored(matchlen, usermatch, OPT_USER_AGENT))
 			return 0;
 		return git_config_string(&user_agent, var, value);
 	}
@@ -785,12 +972,13 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
 	char *low_speed_time;
-	char *norm_url = http_options_url_normalize(url);
+	struct url_info info;
 
 	http_is_verbose = 0;
 
-	git_config(http_options, norm_url);
-	free(norm_url);
+	http_options_url_normalize(url, &info);
+	git_config(http_options, &info);
+	free(info.url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
diff --git a/test-url-normalize.c b/test-url-normalize.c
index d68312d..f325571 100644
--- a/test-url-normalize.c
+++ b/test-url-normalize.c
@@ -9,7 +9,7 @@ int main()
 
 #include "http.c"
 
-#define url_normalize(u) http_options_url_normalize(u)
+#define url_normalize(u,i) http_options_url_normalize(u,i)
 
 int main(int argc, char **argv)
 {
@@ -40,21 +40,22 @@ int main(int argc, char **argv)
 		die(usage);
 
 	if (argc == 2) {
-		url1 = url_normalize(argv[1]);
+		struct url_info info;
+		url1 = url_normalize(argv[1], &info);
 		if (!url1)
 			return 1;
 		if (opt_p)
 			printf("%s\n", url1);
 		if (opt_l)
-			printf("%u\n", (unsigned)strlen(url1));
+			printf("%u\n", (unsigned)info.url_len);
 		return 0;
 	}
 
 	if (opt_p || opt_l)
 		die(usage);
 
-	url1 = url_normalize(argv[1]);
-	url2 = url_normalize(argv[2]);
+	url1 = url_normalize(argv[1], NULL);
+	url2 = url_normalize(argv[2], NULL);
 	return (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
 }
 
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v6 2/4] config: improve support for http.<url>.* settings
  2013-07-19 12:48 ` [PATCH v6 2/4] config: improve " Kyle J. McKay
@ 2013-07-19 19:59   ` Junio C Hamano
  2013-07-19 23:37     ` Kyle J. McKay
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2013-07-19 19:59 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

"Kyle J. McKay" <mackyle@gmail.com> writes:

> +#define URL_ALPHA "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
> +#define URL_DIGIT "0123456789"
> +#define URL_HEXDIGIT URL_DIGIT "ABCDEFabcdef"
> +#define URL_ALPHADIGIT URL_ALPHA URL_DIGIT
> +#define URL_SCHEME_CHARS URL_ALPHADIGIT "+.-"
> +#define URL_HOST_CHARS URL_ALPHADIGIT ".-[:]" /* IPv6 literals need [:] */
> +#define URL_UNSAFE_CHARS " <>\"%{}|\\^`" /* plus 0x00-0x1F,0x7F-0xFF */
> +#define URL_GEN_RESERVED ":/?#[]@"
> +#define URL_SUB_RESERVED "!$&'()*+,;="
> +#define URL_RESERVED URL_GEN_RESERVED URL_SUB_RESERVED /* only allowed delims */
> + ...
> +	while (from_len) {
> +		int ch = *from++;
> +		int was_esc = 0;
> +
> +		from_len--;
> +		if (ch == '%') {
> +			if (from_len < 2 ||
> +			    !strchr(URL_HEXDIGIT, from[0]) ||
> +			    !strchr(URL_HEXDIGIT, from[1]))

I actually do like the readability of the approach in this patch,
but these repeated strchrs() in a loop may want to be optimized,
using a trick similar to what is used in ctype.c::sane_ctype[].

A small build-time-only program or script gen-http-ctype.perl that
defines and uses these URL_* cpp macros and generates a C source
file http-ctype-gen.c that can be #included from http.c, with
something like this in the Makefile:

	http-ctype-gen.c: gen-http-ctype.perl
		rm -f $@ $@+
                $(PERL_PATH) gen-http-ctype.perl >$@+
                mv $@+ $@
	http.o: http.c http-ctype-gen.c

would give us both readability and efficiency, perhaps?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v6 1/4] config: add support for http.<url>.* settings
  2013-07-19 12:48 ` [PATCH v6 1/4] " Kyle J. McKay
@ 2013-07-19 20:08   ` Junio C Hamano
  2013-07-19 23:40     ` Kyle J. McKay
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2013-07-19 20:08 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

"Kyle J. McKay" <mackyle@gmail.com> writes:

> +static size_t http_option_max_matched_len[OPT_MAX];
>  ...
> +static int new_match_is_shorter(size_t matchlen, enum http_option_type opt)
> +{
> +	/*
> +	 * Compare matchlen to the last matched length of option opt and
> +	 * return true if matchlen is shorter than the last matched length
> +	 * (meaning the config setting should be ignored).  Upon seeing the
> +	 * _same_ key (i.e. new key has the same match length which is therefore
> +	 * not shorter) the new setting will override the previous setting.
> +	 * Otherwise return false and record matchlen as the current last
> +	 * matched length of option opt.
> +	 */
> +	if (matchlen < http_option_max_matched_len[opt])
> +		return 1;
> +	http_option_max_matched_len[opt] = matchlen;
> +	return 0;
> +}
> + ...
> @@ -337,7 +472,7 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
>  
>  	http_is_verbose = 0;
>  
> -	git_config(http_options, NULL);
> +	git_config(http_options, (void *)url);

Can http_init() be called more than once?  max-matched-len (and
leter user-matched as well) is initialized to zero at the link time,
and never reset after it is used for matching the configuration file
entries with a single URL.

If this function is called more than once, the code needs to
memset(0) the array(s), don't it?

Another possibility, which might be better, is to package that array
and the url into a structure, have it on the stackframe of this
function, i.e.

	struct match_url_state {
        	const char *url;

		size_t http_option_max_matched_len[OPT_MAX];
		int http_option_user_matched[OPT_MAX];
	} match_url_state = {NULL};

	git_config(http_options, &match_url_state);

or something.  In any case, you no longer have to cast the second
parameter of git_config to (void *) only to defeat constness ;-)

>  	curl_global_init(CURL_GLOBAL_ALL);

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v6 2/4] config: improve support for http.<url>.* settings
  2013-07-19 19:59   ` Junio C Hamano
@ 2013-07-19 23:37     ` Kyle J. McKay
  2013-07-20  0:35       ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Kyle J. McKay @ 2013-07-19 23:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

On Jul 19, 2013, at 12:59, Junio C Hamano wrote:
> "Kyle J. McKay" <mackyle@gmail.com> writes:
>
>> +#define URL_ALPHA  
>> "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
>> +#define URL_DIGIT "0123456789"
>> +#define URL_HEXDIGIT URL_DIGIT "ABCDEFabcdef"
>> +#define URL_ALPHADIGIT URL_ALPHA URL_DIGIT
>> +#define URL_SCHEME_CHARS URL_ALPHADIGIT "+.-"
>> +#define URL_HOST_CHARS URL_ALPHADIGIT ".-[:]" /* IPv6 literals  
>> need [:] */
>> +#define URL_UNSAFE_CHARS " <>\"%{}|\\^`" /* plus 0x00-0x1F, 
>> 0x7F-0xFF */
>> +#define URL_GEN_RESERVED ":/?#[]@"
>> +#define URL_SUB_RESERVED "!$&'()*+,;="
>> +#define URL_RESERVED URL_GEN_RESERVED URL_SUB_RESERVED /* only  
>> allowed delims */
>> + ...
>> +	while (from_len) {
>> +		int ch = *from++;
>> +		int was_esc = 0;
>> +
>> +		from_len--;
>> +		if (ch == '%') {
>> +			if (from_len < 2 ||
>> +			    !strchr(URL_HEXDIGIT, from[0]) ||
>> +			    !strchr(URL_HEXDIGIT, from[1]))
>
> I actually do like the readability of the approach in this patch,
> but these repeated strchrs() in a loop may want to be optimized,
> using a trick similar to what is used in ctype.c::sane_ctype[].
>
> A small build-time-only program or script gen-http-ctype.perl that
> defines and uses these URL_* cpp macros and generates a C source
> file http-ctype-gen.c that can be #included from http.c, with
> something like this in the Makefile:
>
> 	http-ctype-gen.c: gen-http-ctype.perl
> 		rm -f $@ $@+
>                $(PERL_PATH) gen-http-ctype.perl >$@+
>                mv $@+ $@
> 	http.o: http.c http-ctype-gen.c
>
> would give us both readability and efficiency, perhaps?

Hmmm.  That's a very fast technique.  However something like:

#define IS_HEX_DIGIT(c) \
   (('0'<=(c)&&(c)<='9')||('a'<=(c)&&(c)<='f')||('A'<=(c)&&(c)<='F'))

I would think would be suitably fast without needing any added build  
files.

However, looks like there is a ctype.h isxdigit() function and it  
looks like there's a version of that in git-compat-util.h as well as a  
convenient hexval_table to use for the conversion, so I will alter the  
code to use those instead which will also do away with the  
hex_digit_value() function.

If you mean for all the strchr etc. calls, multiple tables would be  
required since URL_SCHEME_CHARS and URL_HOST_CHARS partially overlap,  
but it could be done.  Is the speed of strchr that much of a concern?   
The code will only be invoked for http.<url>.* option settings in any  
case and I expect the user would have to set an awfully large number  
of those to even begin to notice.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v6 1/4] config: add support for http.<url>.* settings
  2013-07-19 20:08   ` Junio C Hamano
@ 2013-07-19 23:40     ` Kyle J. McKay
  0 siblings, 0 replies; 10+ messages in thread
From: Kyle J. McKay @ 2013-07-19 23:40 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

On Jul 19, 2013, at 13:08, Junio C Hamano wrote:
> "Kyle J. McKay" <mackyle@gmail.com> writes:
>
>> @@ -337,7 +472,7 @@ void http_init(struct remote *remote, const  
>> char *url, int proactive_auth)
>>
>> 	http_is_verbose = 0;
>>
>> -	git_config(http_options, NULL);
>> +	git_config(http_options, (void *)url);
>
> Can http_init() be called more than once?  max-matched-len (and
> leter user-matched as well) is initialized to zero at the link time,
> and never reset after it is used for matching the configuration file
> entries with a single URL.
>
> If this function is called more than once, the code needs to
> memset(0) the array(s), don't it?

The http_init() function is never called more than once.  It's called  
from http_fetch.c, http_push.c, and remote-curl.c exactly once in the  
main() function, so no worries.  However, I do like your idea of  
bundling them all up into an on-stack variable, I just didn't want to  
stick those arrays on the stack if it's not necessary it and it seems  
like it isn't.

I think I will just go ahead and add the memset to http_init to avoid  
unexpected breakage if something in the future actually does make  
multiple http_init()/http_cleanup() calls during its lifetime.

> [...] In any case, you no longer have to cast the second
> parameter of git_config to (void *) only to defeat constness ;-)

That goes away in part 4/4 anyhow. :)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v6 2/4] config: improve support for http.<url>.* settings
  2013-07-19 23:37     ` Kyle J. McKay
@ 2013-07-20  0:35       ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2013-07-20  0:35 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

"Kyle J. McKay" <mackyle@gmail.com> writes:

> If you mean for all the strchr etc. calls, multiple tables would be
> required since URL_SCHEME_CHARS and URL_HOST_CHARS partially overlap,

The entries of the table could be at least 8-bit wide, so there
shouldn't be any problem, should there?

> but it could be done.  Is the speed of strchr that much of a concern?
> The code will only be invoked for http.<url>.* option settings in any
> case and I expect the user would have to set an awfully large number
> of those to even begin to notice.

We can start slow and then optimize, but I suspect that there are
sufficient number of people who are into micro-optimizing, and such
a slow implementation may quickly be replaced.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2013-07-20  0:36 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-19 12:48 [PATCH v6 0/4] config: add support for http.<url>.* settings Kyle J. McKay
2013-07-19 12:48 ` [PATCH v6 1/4] " Kyle J. McKay
2013-07-19 20:08   ` Junio C Hamano
2013-07-19 23:40     ` Kyle J. McKay
2013-07-19 12:48 ` [PATCH v6 2/4] config: improve " Kyle J. McKay
2013-07-19 19:59   ` Junio C Hamano
2013-07-19 23:37     ` Kyle J. McKay
2013-07-20  0:35       ` Junio C Hamano
2013-07-19 12:48 ` [PATCH v6 3/4] tests: add new test for the url_normalize function Kyle J. McKay
2013-07-19 12:48 ` [PATCH v6 4/4] config: allow http.<url>.* any user matching Kyle J. McKay

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).