git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v8 0/4] config: add support for http.<url>.* settings
@ 2013-07-22 12:56 Kyle J. McKay
  2013-07-22 12:56 ` [PATCH v8 1/4] " Kyle J. McKay
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Kyle J. McKay @ 2013-07-22 12:56 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

NOTE: This patch requires the following preparatory change:

 f1ff763 http.c: fix parsing of http.sslCertPasswordProtected variable

which is currently in pu.


This patch series adds support for http.<url>.* settings.  The patch is
organized as a series of improvements on the functionality:

1/4 - adds basic textual matching support
2/4 - adds URL normalization before matching
3/4 - adds a test for the URL normalization function
4/4 - adds any user matching


With-Feedback-From-jh: Junio C Hamano <gitster@pobox.com>

Differences from v7:

1/4 - No changes since v7's 1/4

2/4 - No changes since v7's 2/4

3/4 - Updated from v7's 3/4:

* Add a binary attribute for the url-* files (feedback-jh)
* Make test-url-normalize.c able to run http_options (feedback-jh)
* Add additional tests and corresponding config files (as t/t5200/config-*)
* Remove extraneous comment from t5200-url-normalize.sh (feedback-jh)

4/4 - Updated from v7's 4/4:

* Update http.<url>.* documentation with another example (feedback-jh)
* Add another url normalization config test to match the new example


Applicable comments from earlier cover:

To better support matching URLs that are equivalent but spelled differently, a
url_normalize function has been added.  Currently this patch leaves it in
http.c as http_options_url_normalize as I am unclear whether it should go into
url.{h,c} at this time since only http.c uses it.

Since the url_normalize function's behavior is non-trivial, it is presented as
a separate patch on top of the basic http.<url>.* settings support.  A new test
for it has also been included as a separate patch.  I am unclear on the proper
number for this test, but have gone ahead and put it with the other http tests
since this patch series places the url_normalize function into http.c.


Kyle J. McKay (4):
  config: add support for http.<url>.* settings
  config: improve support for http.<url>.* settings
  tests: add new test for the url_normalize function
  config: allow http.<url>.* any user matching

 .gitignore               |   1 +
 Documentation/config.txt |  25 ++
 Makefile                 |   5 +
 http.c                   | 666 +++++++++++++++++++++++++++++++++++++++++++++--
 t/.gitattributes         |   1 +
 t/t5200-url-normalize.sh | 199 ++++++++++++++
 t/t5200/README           |  18 ++
 t/t5200/config-1         |   8 +
 t/t5200/config-2         |   3 +
 t/t5200/config-3         |   4 +
 t/t5200/url-1            | Bin 0 -> 20 bytes
 t/t5200/url-10           | Bin 0 -> 23 bytes
 t/t5200/url-11           | Bin 0 -> 25 bytes
 t/t5200/url-2            | Bin 0 -> 20 bytes
 t/t5200/url-3            | Bin 0 -> 23 bytes
 t/t5200/url-4            | Bin 0 -> 23 bytes
 t/t5200/url-5            | Bin 0 -> 23 bytes
 t/t5200/url-6            | Bin 0 -> 23 bytes
 t/t5200/url-7            | Bin 0 -> 23 bytes
 t/t5200/url-8            | Bin 0 -> 23 bytes
 t/t5200/url-9            | Bin 0 -> 23 bytes
 test-url-normalize.c     | 132 ++++++++++
 22 files changed, 1045 insertions(+), 17 deletions(-)
 create mode 100755 t/t5200-url-normalize.sh
 create mode 100644 t/t5200/README
 create mode 100644 t/t5200/config-1
 create mode 100644 t/t5200/config-2
 create mode 100644 t/t5200/config-3
 create mode 100644 t/t5200/url-1
 create mode 100644 t/t5200/url-10
 create mode 100644 t/t5200/url-11
 create mode 100644 t/t5200/url-2
 create mode 100644 t/t5200/url-3
 create mode 100644 t/t5200/url-4
 create mode 100644 t/t5200/url-5
 create mode 100644 t/t5200/url-6
 create mode 100644 t/t5200/url-7
 create mode 100644 t/t5200/url-8
 create mode 100644 t/t5200/url-9
 create mode 100644 test-url-normalize.c

-- 
1.8.3

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v8 1/4] config: add support for http.<url>.* settings
  2013-07-22 12:56 [PATCH v8 0/4] config: add support for http.<url>.* settings Kyle J. McKay
@ 2013-07-22 12:56 ` Kyle J. McKay
  2013-07-24  7:12   ` Jeff King
  2013-07-22 12:56 ` [PATCH v8 2/4] config: improve " Kyle J. McKay
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 19+ messages in thread
From: Kyle J. McKay @ 2013-07-22 12:56 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

The credentials configuration values already support url-specific
configuration items in the form credential.<url>.*.  This patch
adds similar support for http configuration values.

The <url> value is considered a match to a url if the <url> value
is either an exact match or a prefix of the url which ends on a
path component boundary ('/').  So "https://example.com/test" will
match "https://example.com/test" and "https://example.com/test/too"
but not "https://example.com/testextra".

Longer matches take precedence over shorter matches with
environment variable settings taking precedence over all.

With this configuration:

[http]
        useragent = other-agent
        noEPSV = true
[http "https://example.com"]
        useragent = example-agent
        sslVerify = false
[http "https://example.com/path"]
        useragent = path-agent

The "https://other.example.com/" url will have useragent
"other-agent" and sslVerify will be on.

The "https://example.com/" url will have useragent
"example-agent" and sslVerify will be off.

The "https://example.com/path/sub" url will have useragent
"path-agent" and sslVerify will be off.

All three of the examples will have noEPSV enabled.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---
 Documentation/config.txt |  15 +++++
 http.c                   | 170 ++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 168 insertions(+), 17 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6e53fc5..41cab91 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1513,6 +1513,21 @@ http.useragent::
 	of common USER_AGENT strings (but not including those like git/1.7.1).
 	Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
 
+http.<url>.*::
+	Any of the http.* options above can be applied selectively to some urls.
+	For example "http.https://example.com.useragent" would set the user
+	agent only for https connections to example.com.  The <url> value
+	matches a url if it is an exact match or a prefix of the url matching
+	at a "/" boundary.  Longer <url> matches take precedence over shorter
+	ones with the environment variable settings taking precedence over all.
+	Note that <url> must match the url passed to git exactly (other than
+	possibly being a prefix).  This means any user, password and/or port
+	setting that appears in a url as well as any %XX escapes that are
+	present must also appear in <url> to have a successful match.  The urls
+	that are matched against are those given directly to git commands.  In
+	other words, use exactly the same url that was passed to git (possibly
+	shortened) for the <url> value of the config setting.
+
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
 	does not care per se, but this information is necessary e.g. when
diff --git a/http.c b/http.c
index 37986f8..1531ffa 100644
--- a/http.c
+++ b/http.c
@@ -30,6 +30,34 @@ static CURL *curl_default;
 
 char curl_errorstr[CURL_ERROR_SIZE];
 
+enum http_option_type {
+	OPT_POST_BUFFER,
+	OPT_MIN_SESSIONS,
+	OPT_SSL_VERIFY,
+	OPT_SSL_TRY,
+	OPT_SSL_CERT,
+	OPT_SSL_CAINFO,
+	OPT_LOW_SPEED,
+	OPT_LOW_TIME,
+	OPT_NO_EPSV,
+	OPT_HTTP_PROXY,
+	OPT_COOKIE_FILE,
+	OPT_USER_AGENT,
+	OPT_PASSWD_REQ,
+#ifdef USE_CURL_MULTI
+	OPT_MAX_REQUESTS,
+#endif
+#if LIBCURL_VERSION_NUM >= 0x070903
+	OPT_SSL_KEY,
+#endif
+#if LIBCURL_VERSION_NUM >= 0x070908
+	OPT_SSL_CAPATH,
+#endif
+	OPT_MAX
+};
+
+static size_t http_option_max_matched_len[OPT_MAX];
+
 static int curl_ssl_verify = -1;
 static int curl_ssl_try;
 static const char *ssl_cert;
@@ -141,33 +169,121 @@ static void process_curl_messages(void)
 }
 #endif
 
+static size_t http_options_url_match_prefix(const char *url,
+					    const char *url_prefix,
+					    size_t url_prefix_len)
+{
+	/*
+	 * url_prefix matches url if url_prefix is an exact match for url or it
+	 * is a prefix of url and the match ends on a path component boundary.
+	 * Both url and url_prefix are considered to have an implicit '/' on the
+	 * end for matching purposes if they do not already.
+	 *
+	 * url must be NUL terminated.  url_prefix_len is the length of
+	 * url_prefix which need not be NUL terminated.
+	 *
+	 * The return value is the length of the match in characters (excluding
+	 * any final '/') or 0 for no match.  Passing "/" as url_prefix will
+	 * always cause 0 to be returned.
+	 */
+	size_t url_len;
+	if (url_prefix_len && url_prefix[url_prefix_len - 1] == '/')
+		url_prefix_len--;
+	if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
+		return 0;
+	url_len = strlen(url);
+	if ((url_len == url_prefix_len) || (url[url_prefix_len] == '/'))
+		return url_prefix_len;
+	return 0;
+}
+
+static int new_match_is_shorter(size_t matchlen, enum http_option_type opt)
+{
+	/*
+	 * Compare matchlen to the last matched length of option opt and
+	 * return true if matchlen is shorter than the last matched length
+	 * (meaning the config setting should be ignored).  Upon seeing the
+	 * _same_ key (i.e. new key has the same match length which is therefore
+	 * not shorter) the new setting will override the previous setting.
+	 * Otherwise return false and record matchlen as the current last
+	 * matched length of option opt.
+	 */
+	if (matchlen < http_option_max_matched_len[opt])
+		return 1;
+	http_option_max_matched_len[opt] = matchlen;
+	return 0;
+}
+
 static int http_options(const char *var, const char *value, void *cb)
 {
-	if (!strcmp("http.sslverify", var)) {
+	const char *url = cb;
+	const char *key, *dot;
+	size_t matchlen = 0;
+
+	key = skip_prefix(var, "http.");
+	if (!key)
+		return git_default_config(var, value, cb);
+
+	/*
+	 * If the part following the leading "http." contains a '.' then check
+	 * to see if the part between "http." and the last '.' is a match to
+	 * url.  If it's not then ignore the setting.  Otherwise set key to
+	 * point to the option which is the part after the final '.' and
+	 * use key in subsequent comparisons to determine the option type.
+	 */
+	dot = strrchr(key, '.');
+	if (dot) {
+		matchlen = http_options_url_match_prefix(url, key, dot - key);
+		if (!matchlen)
+			return 0;
+		key = dot + 1;
+	}
+
+	if (!strcmp("sslverify", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_VERIFY))
+			return 0;
 		curl_ssl_verify = git_config_bool(var, value);
 		return 0;
 	}
-	if (!strcmp("http.sslcert", var))
+	if (!strcmp("sslcert", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_CERT))
+			return 0;
 		return git_config_string(&ssl_cert, var, value);
+	}
 #if LIBCURL_VERSION_NUM >= 0x070903
-	if (!strcmp("http.sslkey", var))
+	if (!strcmp("sslkey", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_KEY))
+			return 0;
 		return git_config_string(&ssl_key, var, value);
+	}
 #endif
 #if LIBCURL_VERSION_NUM >= 0x070908
-	if (!strcmp("http.sslcapath", var))
+	if (!strcmp("sslcapath", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_CAPATH))
+			return 0;
 		return git_config_string(&ssl_capath, var, value);
+	}
 #endif
-	if (!strcmp("http.sslcainfo", var))
+	if (!strcmp("sslcainfo", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_CAINFO))
+			return 0;
 		return git_config_string(&ssl_cainfo, var, value);
-	if (!strcmp("http.sslcertpasswordprotected", var)) {
+	}
+	if (!strcmp("sslcertpasswordprotected", key)) {
+		if (new_match_is_shorter(matchlen, OPT_PASSWD_REQ))
+			return 0;
 		ssl_cert_password_required = git_config_bool(var, value);
 		return 0;
 	}
-	if (!strcmp("http.ssltry", var)) {
+	if (!strcmp("ssltry", key)) {
+		if (new_match_is_shorter(matchlen, OPT_SSL_TRY))
+			return 0;
 		curl_ssl_try = git_config_bool(var, value);
 		return 0;
 	}
-	if (!strcmp("http.minsessions", var)) {
+	if (!strcmp("minsessions", key)) {
+		if (new_match_is_shorter(matchlen, OPT_MIN_SESSIONS))
+			return 0;
 		min_curl_sessions = git_config_int(var, value);
 #ifndef USE_CURL_MULTI
 		if (min_curl_sessions > 1)
@@ -176,39 +292,58 @@ static int http_options(const char *var, const char *value, void *cb)
 		return 0;
 	}
 #ifdef USE_CURL_MULTI
-	if (!strcmp("http.maxrequests", var)) {
+	if (!strcmp("maxrequests", key)) {
+		if (new_match_is_shorter(matchlen, OPT_MAX_REQUESTS))
+			return 0;
 		max_requests = git_config_int(var, value);
 		return 0;
 	}
 #endif
-	if (!strcmp("http.lowspeedlimit", var)) {
+	if (!strcmp("lowspeedlimit", key)) {
+		if (new_match_is_shorter(matchlen, OPT_LOW_SPEED))
+			return 0;
 		curl_low_speed_limit = (long)git_config_int(var, value);
 		return 0;
 	}
-	if (!strcmp("http.lowspeedtime", var)) {
+	if (!strcmp("lowspeedtime", key)) {
+		if (new_match_is_shorter(matchlen, OPT_LOW_TIME))
+			return 0;
 		curl_low_speed_time = (long)git_config_int(var, value);
 		return 0;
 	}
 
-	if (!strcmp("http.noepsv", var)) {
+	if (!strcmp("noepsv", key)) {
+		if (new_match_is_shorter(matchlen, OPT_NO_EPSV))
+			return 0;
 		curl_ftp_no_epsv = git_config_bool(var, value);
 		return 0;
 	}
-	if (!strcmp("http.proxy", var))
+	if (!strcmp("proxy", key)) {
+		if (new_match_is_shorter(matchlen, OPT_HTTP_PROXY))
+			return 0;
 		return git_config_string(&curl_http_proxy, var, value);
+	}
 
-	if (!strcmp("http.cookiefile", var))
+	if (!strcmp("cookiefile", key)) {
+		if (new_match_is_shorter(matchlen, OPT_COOKIE_FILE))
+			return 0;
 		return git_config_string(&curl_cookie_file, var, value);
+	}
 
-	if (!strcmp("http.postbuffer", var)) {
+	if (!strcmp("postbuffer", key)) {
+		if (new_match_is_shorter(matchlen, OPT_POST_BUFFER))
+			return 0;
 		http_post_buffer = git_config_int(var, value);
 		if (http_post_buffer < LARGE_PACKET_MAX)
 			http_post_buffer = LARGE_PACKET_MAX;
 		return 0;
 	}
 
-	if (!strcmp("http.useragent", var))
+	if (!strcmp("useragent", key)) {
+		if (new_match_is_shorter(matchlen, OPT_USER_AGENT))
+			return 0;
 		return git_config_string(&user_agent, var, value);
+	}
 
 	/* Fall back on the default ones */
 	return git_default_config(var, value, cb);
@@ -337,7 +472,8 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 
 	http_is_verbose = 0;
 
-	git_config(http_options, NULL);
+	memset(http_option_max_matched_len, 0, sizeof(http_option_max_matched_len));
+	git_config(http_options, (void *)url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v8 2/4] config: improve support for http.<url>.* settings
  2013-07-22 12:56 [PATCH v8 0/4] config: add support for http.<url>.* settings Kyle J. McKay
  2013-07-22 12:56 ` [PATCH v8 1/4] " Kyle J. McKay
@ 2013-07-22 12:56 ` Kyle J. McKay
  2013-07-22 12:56 ` [PATCH v8 3/4] tests: add new test for the url_normalize function Kyle J. McKay
  2013-07-22 12:56 ` [PATCH v8 4/4] config: allow http.<url>.* any user matching Kyle J. McKay
  3 siblings, 0 replies; 19+ messages in thread
From: Kyle J. McKay @ 2013-07-22 12:56 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

Improve on the http.<url>.* url matching behavior by first
normalizing the urls before they are compared.

With this change, for example, the following configuration
section:

[http "https://example.com/path"]
        useragent = example-agent
        sslVerify = false

will properly match a "HTTPS://example.COM/p%61th" url which
is equivalent.

The normalization rules are based on RFC 3986 and should
result in any two equivalent urls being a match.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---
 Documentation/config.txt |  19 ++-
 http.c                   | 311 ++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 318 insertions(+), 12 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 41cab91..e461f32 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1517,16 +1517,15 @@ http.<url>.*::
 	Any of the http.* options above can be applied selectively to some urls.
 	For example "http.https://example.com.useragent" would set the user
 	agent only for https connections to example.com.  The <url> value
-	matches a url if it is an exact match or a prefix of the url matching
-	at a "/" boundary.  Longer <url> matches take precedence over shorter
-	ones with the environment variable settings taking precedence over all.
-	Note that <url> must match the url passed to git exactly (other than
-	possibly being a prefix).  This means any user, password and/or port
-	setting that appears in a url as well as any %XX escapes that are
-	present must also appear in <url> to have a successful match.  The urls
-	that are matched against are those given directly to git commands.  In
-	other words, use exactly the same url that was passed to git (possibly
-	shortened) for the <url> value of the config setting.
+	matches a url if it is an exact match or if it is a prefix of the url
+	matching at a "/" boundary.  Longer <url> matches take precedence over
+	shorter ones with the environment variable settings taking precedence
+	over all.  The urls are normalized before testing for a match.  Note,
+	however, that any user, password and/or port setting that appears in a
+	url must also match that part of <url> to have a successful match.  The
+	urls that are matched against are those given directly to git commands.
+	This means any urls visited as a result of a redirection do not
+	participate in matching.
 
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
diff --git a/http.c b/http.c
index 1531ffa..29e119c 100644
--- a/http.c
+++ b/http.c
@@ -169,6 +169,300 @@ static void process_curl_messages(void)
 }
 #endif
 
+#define URL_ALPHA "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
+#define URL_DIGIT "0123456789"
+#define URL_ALPHADIGIT URL_ALPHA URL_DIGIT
+#define URL_SCHEME_CHARS URL_ALPHADIGIT "+.-"
+#define URL_HOST_CHARS URL_ALPHADIGIT ".-[:]" /* IPv6 literals need [:] */
+#define URL_UNSAFE_CHARS " <>\"%{}|\\^`" /* plus 0x00-0x1F,0x7F-0xFF */
+#define URL_GEN_RESERVED ":/?#[]@"
+#define URL_SUB_RESERVED "!$&'()*+,;="
+#define URL_RESERVED URL_GEN_RESERVED URL_SUB_RESERVED /* only allowed delims */
+
+static int append_normalized_escapes(struct strbuf *buf,
+				     const char *from,
+				     size_t from_len,
+				     const char *esc_extra,
+				     const char *esc_ok)
+{
+	/*
+	 * Append to strbuf 'buf' characters from string 'from' with length
+	 * 'from_len' while unescaping characters that do not need to be escaped
+	 * and escaping characters that do.  The set of characters to escape
+	 * (the complement of which is unescaped) starts out as the RFC 3986
+	 * unsafe characters (0x00-0x1F,0x7F-0xFF," <>\"#%{}|\\^`").  If
+	 * 'esc_extra' is not NULL, those additional characters will also always
+	 * be escaped.  If 'esc_ok' is not NULL, those characters will be left
+	 * escaped if found that way, but will not be unescaped otherwise (used
+	 * for delimiters).  If a %-escape sequence is encountered that is not
+	 * followed by 2 hexadecimal digits, the sequence is invalid and
+	 * false (0) will be returned.  Otherwise true (1) will be returned for
+	 * success.
+	 *
+	 * Note that all %-escape sequences will be normalized to UPPERCASE
+	 * as indicated in RFC 3986.  Unless included in esc_extra or esc_ok
+	 * alphanumerics and "-._~" will always be unescaped as per RFC 3986.
+	 */
+
+	while (from_len) {
+		int ch = *from++;
+		int was_esc = 0;
+
+		from_len--;
+		if (ch == '%') {
+			if (from_len < 2 ||
+			    !isxdigit((unsigned char)from[0]) ||
+			    !isxdigit((unsigned char)from[1]))
+				return 0;
+			ch = hexval_table[(unsigned char)*from++] << 4;
+			ch |= hexval_table[(unsigned char)*from++];
+			from_len -= 2;
+			was_esc = 1;
+		}
+		if ((unsigned char)ch <= 0x1F || (unsigned char)ch >= 0x7F ||
+		    strchr(URL_UNSAFE_CHARS, ch) ||
+		    (esc_extra && strchr(esc_extra, ch)) ||
+		    (was_esc && strchr(esc_ok, ch)))
+			strbuf_addf(buf, "%%%02X", (unsigned char)ch);
+		else
+			strbuf_addch(buf, ch);
+	}
+
+	return 1;
+}
+
+static char *http_options_url_normalize(const char *url)
+{
+	/*
+	 * Normalize NUL-terminated url using the following rules:
+	 *
+	 * 1. Case-insensitive parts of url will be converted to lower case
+	 * 2. %-encoded characters that do not need to be will be unencoded
+	 * 3. Characters that are not %-encoded and must be will be encoded
+	 * 4. All %-encodings will be converted to upper case hexadecimal
+	 * 5. Leading 0s are removed from port numbers
+	 * 6. If the default port for the scheme is given it will be removed
+	 * 7. A path part (including empty) not starting with '/' has one added
+	 * 8. Any dot segments (. or ..) in the path are resolved and removed
+	 * 9. IPv6 host literals are allowed (but not normalized or validated)
+	 *
+	 * The rules are based on information in RFC 3986.
+	 *
+	 * Please note this function requires a full URL including a scheme
+	 * and host part (except for file: URLs which may have an empty host).
+	 *
+	 * The return value is a newly allocated string that must be freed
+	 * or NULL if the url is not valid.
+	 *
+	 * This is NOT a URL validation function.  Full URL validation is NOT
+	 * performed.  Some invalid host names are passed through this function
+	 * undetected.  However, most all other problems that make a URL invalid
+	 * will be detected (including a missing host for non file: URLs).
+	 */
+
+	size_t url_len = strlen(url);
+	struct strbuf norm;
+	size_t spanned;
+	const char *slash_ptr, *at_ptr, *colon_ptr, *path_start;
+	int found_host = 0;
+
+
+	/*
+	 * Copy lowercased scheme and :// suffix, %-escapes are not allowed
+	 * First character of scheme must be URL_ALPHA
+	 */
+	spanned = strspn(url, URL_SCHEME_CHARS);
+	if (!spanned || !isalpha(url[0]) || spanned + 3 > url_len ||
+	    url[spanned] != ':' || url[spanned+1] != '/' || url[spanned+2] != '/')
+		return NULL; /* Bad scheme and/or missing "://" part */
+	strbuf_init(&norm, url_len);
+	spanned += 3;
+	url_len -= spanned;
+	while (spanned--)
+		strbuf_addch(&norm, tolower(*url++));
+
+
+	/*
+	 * Copy any username:password if present normalizing %-escapes
+	 */
+	at_ptr = strchr(url, '@');
+	slash_ptr = url + strcspn(url, "/?#");
+	if (at_ptr && at_ptr < slash_ptr) {
+		if (at_ptr > url) {
+			if (!append_normalized_escapes(&norm, url, at_ptr - url,
+						       "", URL_RESERVED)) {
+				strbuf_release(&norm);
+				return NULL;
+			}
+		}
+		strbuf_addch(&norm, '@');
+		url_len -= (++at_ptr - url);
+		url = at_ptr;
+	}
+
+
+	/*
+	 * Copy the host part excluding any port part, no %-escapes allowed
+	 */
+	if (!url_len || strchr(":/?#", *url)) {
+		/* Missing host invalid for all URL schemes except file */
+		if (strncmp(norm.buf, "file:", 5)) {
+			strbuf_release(&norm);
+			return NULL;
+		}
+	} else {
+		found_host = 1;
+	}
+	colon_ptr = slash_ptr - 1;
+	while (colon_ptr > url && *colon_ptr != ':' && *colon_ptr != ']')
+		colon_ptr--;
+	if (*colon_ptr != ':') {
+		colon_ptr = slash_ptr;
+	} else if (!found_host && colon_ptr < slash_ptr && colon_ptr + 1 != slash_ptr) {
+		/* file: URLs may not have a port number */
+		strbuf_release(&norm);
+		return NULL;
+	}
+	spanned = strspn(url, URL_HOST_CHARS);
+	if (spanned < colon_ptr - url) {
+		/* Host name has invalid characters */
+		strbuf_release(&norm);
+		return NULL;
+	}
+	while (url < colon_ptr) {
+		strbuf_addch(&norm, tolower(*url++));
+		url_len--;
+	}
+
+
+	/*
+	 * Check the port part and copy if not the default (after removing any
+	 * leading 0s); no %-escapes allowed
+	 */
+	if (colon_ptr < slash_ptr) {
+		/* skip the ':' and leading 0s but not the last one if all 0s */
+		url++;
+		url += strspn(url, "0");
+		if (url == slash_ptr && url[-1] == '0')
+			url--;
+		if (url == slash_ptr) {
+			/* Skip ":" port with no number, it's same as default */
+		} else if (slash_ptr - url == 2 &&
+			   !strncmp(norm.buf, "http:", 5) &&
+			   !strncmp(url, "80", 2)) {
+			/* Skip http :80 as it's the default */
+		} else if (slash_ptr - url == 3 &&
+			   !strncmp(norm.buf, "https:", 6) &&
+			   !strncmp(url, "443", 3)) {
+			/* Skip https :443 as it's the default */
+		} else {
+			/*
+			 * Port number must be all digits with leading 0s removed
+			 * and since all the protocols we deal with have a 16-bit
+			 * port number it must also be in the range 1..65535
+			 * 0 is not allowed because that means "next available"
+			 * on just about every system and therefore cannot be used
+			 */
+			unsigned long pnum = 0;
+			spanned = strspn(url, URL_DIGIT);
+			if (spanned < slash_ptr - url) {
+				/* port number has invalid characters */
+				strbuf_release(&norm);
+				return NULL;
+			}
+			if (slash_ptr - url <= 5)
+				pnum = strtoul(url, NULL, 10);
+			if (pnum == 0 || pnum > 65535) {
+				/* port number not in range 1..65535 */
+				strbuf_release(&norm);
+				return NULL;
+			}
+			strbuf_addch(&norm, ':');
+			strbuf_add(&norm, url, slash_ptr - url);
+		}
+		url_len -= slash_ptr - colon_ptr;
+		url = slash_ptr;
+	}
+
+
+	/*
+	 * Now copy the path resolving any . and .. segments being careful not
+	 * to corrupt the URL by unescaping any delimiters, but do add an
+	 * initial '/' if it's missing and do normalize any %-escape sequences.
+	 */
+	path_start = norm.buf + norm.len;
+	strbuf_addch(&norm, '/');
+	if (*url == '/') {
+		url++;
+		url_len--;
+	}
+	for (;;) {
+		const char *seg_start = norm.buf + norm.len;
+		const char *next_slash = url + strcspn(url, "/?#");
+		int skip_add_slash = 0;
+		/*
+		 * RFC 3689 indicates that any . or .. segments should be
+		 * unescaped before being checked for.
+		 */
+		if (!append_normalized_escapes(&norm, url, next_slash - url, "",
+					       URL_RESERVED)) {
+			strbuf_release(&norm);
+			return NULL;
+		}
+		if (!strcmp(seg_start, ".")) {
+			/* ignore a . segment; be careful not to remove initial '/' */
+			if (seg_start == path_start + 1) {
+				strbuf_setlen(&norm, norm.len - 1);
+				skip_add_slash = 1;
+			} else {
+				strbuf_setlen(&norm, norm.len - 2);
+			}
+		} else if (!strcmp(seg_start, "..")) {
+			/*
+			 * ignore a .. segment and remove the previous segment;
+			 * be careful not to remove initial '/' from path
+			 */
+			const char *prev_slash = norm.buf + norm.len - 3;
+			if (prev_slash == path_start) {
+				/* invalid .. because no previous segment to remove */
+				strbuf_release(&norm);
+				return NULL;
+			}
+			while (*--prev_slash != '/') {}
+			if (prev_slash == path_start) {
+				strbuf_setlen(&norm, prev_slash - norm.buf + 1);
+				skip_add_slash = 1;
+			} else {
+				strbuf_setlen(&norm, prev_slash - norm.buf);
+			}
+		}
+		url_len -= next_slash - url;
+		url = next_slash;
+		/* if the next char is not '/' done with the path */
+		if (*url != '/')
+			break;
+		url++;
+		url_len--;
+		if (!skip_add_slash)
+			strbuf_addch(&norm, '/');
+	}
+
+
+	/*
+	 * Now simply copy the rest, if any, only normalizing %-escapes and
+	 * being careful not to corrupt the URL by unescaping any delimiters.
+	 */
+	if (*url) {
+		if (!append_normalized_escapes(&norm, url, url_len, "", URL_RESERVED)) {
+			strbuf_release(&norm);
+			return NULL;
+		}
+	}
+
+
+	return strbuf_detach(&norm, NULL);
+}
+
 static size_t http_options_url_match_prefix(const char *url,
 					    const char *url_prefix,
 					    size_t url_prefix_len)
@@ -185,8 +479,13 @@ static size_t http_options_url_match_prefix(const char *url,
 	 * The return value is the length of the match in characters (excluding
 	 * any final '/') or 0 for no match.  Passing "/" as url_prefix will
 	 * always cause 0 to be returned.
+	 *
+	 * Passing NULL as url and/or url_prefix will always cause 0 to be
+	 * returned without causing any faults.
 	 */
 	size_t url_len;
+	if (!url || !url_prefix)
+		return 0;
 	if (url_prefix_len && url_prefix[url_prefix_len - 1] == '/')
 		url_prefix_len--;
 	if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
@@ -233,7 +532,13 @@ static int http_options(const char *var, const char *value, void *cb)
 	 */
 	dot = strrchr(key, '.');
 	if (dot) {
-		matchlen = http_options_url_match_prefix(url, key, dot - key);
+		char *config_url = xmemdupz(key, dot - key);
+		char *norm_url = http_options_url_normalize(config_url);
+		free(config_url);
+		if (!norm_url)
+			return 0;
+		matchlen = http_options_url_match_prefix(url, norm_url, strlen(norm_url));
+		free(norm_url);
 		if (!matchlen)
 			return 0;
 		key = dot + 1;
@@ -469,11 +774,13 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
 	char *low_speed_time;
+	char *norm_url = http_options_url_normalize(url);
 
 	http_is_verbose = 0;
 
 	memset(http_option_max_matched_len, 0, sizeof(http_option_max_matched_len));
-	git_config(http_options, (void *)url);
+	git_config(http_options, norm_url);
+	free(norm_url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v8 3/4] tests: add new test for the url_normalize function
  2013-07-22 12:56 [PATCH v8 0/4] config: add support for http.<url>.* settings Kyle J. McKay
  2013-07-22 12:56 ` [PATCH v8 1/4] " Kyle J. McKay
  2013-07-22 12:56 ` [PATCH v8 2/4] config: improve " Kyle J. McKay
@ 2013-07-22 12:56 ` Kyle J. McKay
  2013-07-24  6:59   ` Jeff King
  2013-07-22 12:56 ` [PATCH v8 4/4] config: allow http.<url>.* any user matching Kyle J. McKay
  3 siblings, 1 reply; 19+ messages in thread
From: Kyle J. McKay @ 2013-07-22 12:56 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

In order to perform sane URL matching for http.<url>.* options,
http.c normalizes URLs before performing matches.

A new test-url-normalize test program is introduced along with
a new t5200-url-normalize.sh script to run the tests.

Since the url_normalize function currently lives in http.c this
test will be skipped if NO_CURL is defined since http.c is skipped
in that case.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---
 .gitignore               |   1 +
 Makefile                 |   5 ++
 t/.gitattributes         |   1 +
 t/t5200-url-normalize.sh | 198 +++++++++++++++++++++++++++++++++++++++++++++++
 t/t5200/README           |  18 +++++
 t/t5200/config-1         |   8 ++
 t/t5200/config-2         |   3 +
 t/t5200/url-1            | Bin 0 -> 20 bytes
 t/t5200/url-10           | Bin 0 -> 23 bytes
 t/t5200/url-11           | Bin 0 -> 25 bytes
 t/t5200/url-2            | Bin 0 -> 20 bytes
 t/t5200/url-3            | Bin 0 -> 23 bytes
 t/t5200/url-4            | Bin 0 -> 23 bytes
 t/t5200/url-5            | Bin 0 -> 23 bytes
 t/t5200/url-6            | Bin 0 -> 23 bytes
 t/t5200/url-7            | Bin 0 -> 23 bytes
 t/t5200/url-8            | Bin 0 -> 23 bytes
 t/t5200/url-9            | Bin 0 -> 23 bytes
 test-url-normalize.c     | 131 +++++++++++++++++++++++++++++++
 19 files changed, 365 insertions(+)
 create mode 100755 t/t5200-url-normalize.sh
 create mode 100644 t/t5200/README
 create mode 100644 t/t5200/config-1
 create mode 100644 t/t5200/config-2
 create mode 100644 t/t5200/url-1
 create mode 100644 t/t5200/url-10
 create mode 100644 t/t5200/url-11
 create mode 100644 t/t5200/url-2
 create mode 100644 t/t5200/url-3
 create mode 100644 t/t5200/url-4
 create mode 100644 t/t5200/url-5
 create mode 100644 t/t5200/url-6
 create mode 100644 t/t5200/url-7
 create mode 100644 t/t5200/url-8
 create mode 100644 t/t5200/url-9
 create mode 100644 test-url-normalize.c

diff --git a/.gitignore b/.gitignore
index 6669bf0..cd97e16 100644
--- a/.gitignore
+++ b/.gitignore
@@ -198,6 +198,7 @@
 /test-string-list
 /test-subprocess
 /test-svn-fe
+/test-url-normalize
 /test-wildmatch
 /common-cmds.h
 *.tar.gz
diff --git a/Makefile b/Makefile
index 0f931a2..f83879c 100644
--- a/Makefile
+++ b/Makefile
@@ -567,6 +567,7 @@ TEST_PROGRAMS_NEED_X += test-sigchain
 TEST_PROGRAMS_NEED_X += test-string-list
 TEST_PROGRAMS_NEED_X += test-subprocess
 TEST_PROGRAMS_NEED_X += test-svn-fe
+TEST_PROGRAMS_NEED_X += test-url-normalize
 TEST_PROGRAMS_NEED_X += test-wildmatch
 
 TEST_PROGRAMS = $(patsubst %,%$X,$(TEST_PROGRAMS_NEED_X))
@@ -2235,6 +2236,10 @@ test-parse-options$X: parse-options.o parse-options-cb.o
 
 test-svn-fe$X: vcs-svn/lib.a
 
+test-url-normalize$X: test-url-normalize.o GIT-LDFLAGS $(GITLIBS)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
+		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
+
 .PRECIOUS: $(TEST_OBJS)
 
 test-%$X: test-%.o GIT-LDFLAGS $(GITLIBS)
diff --git a/t/.gitattributes b/t/.gitattributes
index 1b97c54..f6f1df3 100644
--- a/t/.gitattributes
+++ b/t/.gitattributes
@@ -1 +1,2 @@
 t[0-9][0-9][0-9][0-9]/* -whitespace
+t5200/url-* binary
diff --git a/t/t5200-url-normalize.sh b/t/t5200-url-normalize.sh
new file mode 100755
index 0000000..d2dd886
--- /dev/null
+++ b/t/t5200-url-normalize.sh
@@ -0,0 +1,198 @@
+#!/bin/sh
+
+test_description='url normalization'
+. ./test-lib.sh
+
+if test -n "$NO_CURL"; then
+	skip_all='skipping test, git built without http support'
+	test_done
+fi
+
+# The base name of the test url files
+tu="$TEST_DIRECTORY/t5200/url"
+
+# The base name of the test config files
+tc="$TEST_DIRECTORY/t5200/config"
+
+# Note that only file: URLs should be allowed without a host
+
+test_expect_success 'url scheme' '
+	! test-url-normalize "" &&
+	! test-url-normalize "_" &&
+	! test-url-normalize "scheme" &&
+	! test-url-normalize "scheme:" &&
+	! test-url-normalize "scheme:/" &&
+	! test-url-normalize "scheme://" &&
+	! test-url-normalize "file" &&
+	! test-url-normalize "file:" &&
+	! test-url-normalize "file:/" &&
+	test-url-normalize "file://" &&
+	! test-url-normalize "://acme.co" &&
+	! test-url-normalize "x_test://acme.co" &&
+	! test-url-normalize "-test://acme.co" &&
+	! test-url-normalize "0test://acme.co" &&
+	! test-url-normalize "+test://acme.co" &&
+	! test-url-normalize ".test://acme.co" &&
+	! test-url-normalize "schem%6e://" &&
+	test-url-normalize "x-Test+v1.0://acme.co" &&
+	test "$(test-url-normalize -p "AbCdeF://x.Y")" = "abcdef://x.y/"
+'
+
+test_expect_success 'url authority' '
+	! test-url-normalize "scheme://user:pass@" &&
+	! test-url-normalize "scheme://?" &&
+	! test-url-normalize "scheme://#" &&
+	! test-url-normalize "scheme:///" &&
+	! test-url-normalize "scheme://:" &&
+	! test-url-normalize "scheme://:555" &&
+	test-url-normalize "file://user:pass@" &&
+	test-url-normalize "file://?" &&
+	test-url-normalize "file://#" &&
+	test-url-normalize "file:///" &&
+	test-url-normalize "file://:" &&
+	! test-url-normalize "file://:555" &&
+	test-url-normalize "scheme://user:pass@host" &&
+	test-url-normalize "scheme://@host" &&
+	test-url-normalize "scheme://%00@host" &&
+	! test-url-normalize "scheme://%%@host" &&
+	! test-url-normalize "scheme://host_" &&
+	test-url-normalize "scheme://user:pass@host/" &&
+	test-url-normalize "scheme://@host/" &&
+	test-url-normalize "scheme://host/" &&
+	test-url-normalize "scheme://host?x" &&
+	test-url-normalize "scheme://host#x" &&
+	test-url-normalize "scheme://host/@" &&
+	test-url-normalize "scheme://host?@x" &&
+	test-url-normalize "scheme://host#@x" &&
+	test-url-normalize "scheme://[::1]" &&
+	test-url-normalize "scheme://[::1]/" &&
+	! test-url-normalize "scheme://hos%41/" &&
+	test-url-normalize "scheme://[invalid....:/" &&
+	test-url-normalize "scheme://invalid....:]/" &&
+	! test-url-normalize "scheme://invalid....:[/" &&
+	! test-url-normalize "scheme://invalid....:["
+'
+
+test_expect_success 'url port checks' '
+	test-url-normalize "xyz://q@some.host:" &&
+	test-url-normalize "xyz://q@some.host:456/" &&
+	! test-url-normalize "xyz://q@some.host:0" &&
+	! test-url-normalize "xyz://q@some.host:0000000" &&
+	test-url-normalize "xyz://q@some.host:0000001?" &&
+	test-url-normalize "xyz://q@some.host:065535#" &&
+	test-url-normalize "xyz://q@some.host:65535" &&
+	! test-url-normalize "xyz://q@some.host:65536" &&
+	! test-url-normalize "xyz://q@some.host:99999" &&
+	! test-url-normalize "xyz://q@some.host:100000" &&
+	! test-url-normalize "xyz://q@some.host:100001" &&
+	test-url-normalize "http://q@some.host:80" &&
+	test-url-normalize "https://q@some.host:443" &&
+	test-url-normalize "http://q@some.host:80/" &&
+	test-url-normalize "https://q@some.host:443?" &&
+	! test-url-normalize "http://q@:8008" &&
+	! test-url-normalize "http://:8080" &&
+	! test-url-normalize "http://:" &&
+	test-url-normalize "xyz://q@some.host:456/" &&
+	test-url-normalize "xyz://[::1]:456/" &&
+	test-url-normalize "xyz://[::1]:/" &&
+	! test-url-normalize "xyz://[::1]:000/" &&
+	! test-url-normalize "xyz://[::1]:0%300/" &&
+	! test-url-normalize "xyz://[::1]:0x80/" &&
+	! test-url-normalize "xyz://[::1]:4294967297/" &&
+	! test-url-normalize "xyz://[::1]:030f/"
+'
+
+test_expect_success 'url port normalization' '
+	test "$(test-url-normalize -p "http://x:800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:0800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:00000800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:065535")" = "http://x:65535/" &&
+	test "$(test-url-normalize -p "http://x:1")" = "http://x:1/" &&
+	test "$(test-url-normalize -p "http://x:80")" = "http://x/" &&
+	test "$(test-url-normalize -p "http://x:080")" = "http://x/" &&
+	test "$(test-url-normalize -p "http://x:000000080")" = "http://x/" &&
+	test "$(test-url-normalize -p "https://x:443")" = "https://x/" &&
+	test "$(test-url-normalize -p "https://x:0443")" = "https://x/" &&
+	test "$(test-url-normalize -p "https://x:000000443")" = "https://x/"
+'
+
+test_expect_success 'url general escapes' '
+	! test-url-normalize "http://x.y?%fg" &&
+	test "$(test-url-normalize -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
+	test "$(test-url-normalize -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
+	test "$(test-url-normalize -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
+	test "$(test-url-normalize -p "X://W/'\''")" = "x://w/'\''" &&
+	test "$(test-url-normalize -p "X://W?'\!'")" = "x://w/?'\!'"
+'
+
+test_expect_success 'url high-bit escapes' '
+	test "$(test-url-normalize -p "$(cat "$tu-1")")" = "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12" &&
+	test "$(test-url-normalize -p "$(cat "$tu-2")")" = "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-3")")" = "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-4")")" = "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-5")")" = "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-6")")" = "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-7")")" = "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-8")")" = "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-9")")" = "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-10")")" = "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-11")")" = "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD"
+'
+
+test_expect_success 'url username/password escapes' '
+	test "$(test-url-normalize -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
+'
+
+test_expect_success 'url normalized lengths' '
+	test "$(test-url-normalize -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
+	test "$(test-url-normalize -l "http://%41:%42@x.y/%61/")" = 17 &&
+	test "$(test-url-normalize -l "http://@x.y/^")" = 15
+'
+
+test_expect_success 'url . and .. segments' '
+	test "$(test-url-normalize -p "x://y/.")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/./")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/a/.")" = "x://y/a" &&
+	test "$(test-url-normalize -p "x://y/a/./")" = "x://y/a/" &&
+	test "$(test-url-normalize -p "x://y/.?")" = "x://y/?" &&
+	test "$(test-url-normalize -p "x://y/./?")" = "x://y/?" &&
+	test "$(test-url-normalize -p "x://y/a/.?")" = "x://y/a?" &&
+	test "$(test-url-normalize -p "x://y/a/./?")" = "x://y/a/?" &&
+	test "$(test-url-normalize -p "x://y/a/./b/.././../c")" = "x://y/c" &&
+	test "$(test-url-normalize -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
+	test "$(test-url-normalize -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
+	! test-url-normalize "x://y/a/./b/.././../c/././.././.." &&
+	test "$(test-url-normalize -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
+	test "$(test-url-normalize -p "x://y/%2e/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/%2E/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/a/%2e./")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/b/.%2E/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/c/%2e%2E/")" = "x://y/"
+'
+
+# http://@foo specifies an empty user name but does not specify a password
+# http://foo  specifies neither a user name nor a password
+# So they should not be equivalent
+test_expect_success 'url equivalents' '
+	test-url-normalize "httP://x" "Http://X/" &&
+	test-url-normalize "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
+	! test-url-normalize "https://@x.y/^" "httpS://x.y:443/^" &&
+	test-url-normalize "https://@x.y/^" "httpS://@x.y:0443/^" &&
+	test-url-normalize "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
+	test-url-normalize "https://@x.y/^/.." "httpS://@x.y:0443/"
+'
+
+test_expect_success 'url config normalization matching' '
+	test "$(test-url-normalize -c "$tc-1" "useragent" "https://other.example.com/")" = "other-agent" &&
+	test "$(test-url-normalize -c "$tc-1" "useragent" "https://example.com/")" = "example-agent" &&
+	test "$(test-url-normalize -c "$tc-1" "sslVerify" "https://example.com/")" = "false" &&
+	test "$(test-url-normalize -c "$tc-1" "useragent" "https://example.com/path/sub")" = "path-agent" &&
+	test "$(test-url-normalize -c "$tc-1" "sslVerify" "https://example.com/path/sub")" = "false" &&
+	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://elsewhere.com/")" = "true" &&
+	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://example.com")" = "true" &&
+	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://example.com/path")" = "true" &&
+	test "$(test-url-normalize -c "$tc-2" "useragent" "HTTPS://example.COM/p%61th")" = "example-agent" &&
+	test "$(test-url-normalize -c "$tc-2" "sslVerify" "HTTPS://example.COM/p%61th")" = "false"
+'
+
+test_done
diff --git a/t/t5200/README b/t/t5200/README
new file mode 100644
index 0000000..e3a67d9
--- /dev/null
+++ b/t/t5200/README
@@ -0,0 +1,18 @@
+The url data files in this directory contain URLs with characters
+in the range 0x01-0x1f and 0x7f-0xff to test the proper normalization
+of unprintable characters.
+
+A select few characters in the 0x01-0x1f range are skipped to help
+avoid problems running the test itself.
+
+The urls are in test files in this directory rather than being
+embedded in the test script for portability.
+
+The config data files in this directory represent configurations
+to be parsed by http_options so that the final option value can be
+tested.
+
+The config files may contain more than one same-named section to
+simulate having a system, global and .git config file.
+
+
diff --git a/t/t5200/config-1 b/t/t5200/config-1
new file mode 100644
index 0000000..8aaf23c
--- /dev/null
+++ b/t/t5200/config-1
@@ -0,0 +1,8 @@
+[http]
+	useragent = other-agent
+	noEPSV = true
+[http "https://example.com"]
+	useragent = example-agent
+	sslVerify = false
+[http "https://example.com/path"]
+	useragent = path-agent
diff --git a/t/t5200/config-2 b/t/t5200/config-2
new file mode 100644
index 0000000..749f4bd
--- /dev/null
+++ b/t/t5200/config-2
@@ -0,0 +1,3 @@
+[http "https://example.com/path"]
+	useragent = example-agent
+	sslVerify = false
diff --git a/t/t5200/url-1 b/t/t5200/url-1
new file mode 100644
index 0000000000000000000000000000000000000000..519019c5ce6c58478f048a2f39e2321370d318c6
GIT binary patch
literal 20
bcmb=h($_E4XJle#VP#|I;Nuq%6ygE^Admtt

literal 0
HcmV?d00001

diff --git a/t/t5200/url-10 b/t/t5200/url-10
new file mode 100644
index 0000000000000000000000000000000000000000..b9965de6a5d74b122179821212b2c27c8ae03e80
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFYxj5^Yr!h_xSnx`~3a>{|dCd5i<Y)

literal 0
HcmV?d00001

diff --git a/t/t5200/url-11 b/t/t5200/url-11
new file mode 100644
index 0000000000000000000000000000000000000000..f0a50f10096a20d597f40c775f09a71276e0050a
GIT binary patch
literal 25
hcmb=h($_E4Kh$u4|APe$@AvQhFrlI0!}|Suxd5(W4xs=5

literal 0
HcmV?d00001

diff --git a/t/t5200/url-2 b/t/t5200/url-2
new file mode 100644
index 0000000000000000000000000000000000000000..43334b05b2de3794d6020abd96e634a4e9e49cb0
GIT binary patch
literal 20
bcmb=h($_E47Zwo}6PJ*bmXVc{ujc{)C{+Vx

literal 0
HcmV?d00001

diff --git a/t/t5200/url-3 b/t/t5200/url-3
new file mode 100644
index 0000000000000000000000000000000000000000..7378c7bec247b996bc67b00a05ed89cf47d4b7a7
GIT binary patch
literal 23
ecmb=h($_E4Z)j|4ZfR|6@96C6?&<C8=K=t7Jqj}b

literal 0
HcmV?d00001

diff --git a/t/t5200/url-4 b/t/t5200/url-4
new file mode 100644
index 0000000000000000000000000000000000000000..220b198c97f942fea4960f51a2105cc42261061a
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFOZRvla!T~mzbHFo1C4Vp9*`u3o`%!

literal 0
HcmV?d00001

diff --git a/t/t5200/url-5 b/t/t5200/url-5
new file mode 100644
index 0000000000000000000000000000000000000000..1ccd9277792840955bb124bdde21f4b08bcccb63
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFQB2Kqok##r>Lo_tE{cAuL^}d3^M=#

literal 0
HcmV?d00001

diff --git a/t/t5200/url-6 b/t/t5200/url-6
new file mode 100644
index 0000000000000000000000000000000000000000..e8283aac6dff049d3e02454db6e684c5790a5996
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFR-z)v$VCgx45~wyS%-=zY31M4Kn}$

literal 0
HcmV?d00001

diff --git a/t/t5200/url-7 b/t/t5200/url-7
new file mode 100644
index 0000000000000000000000000000000000000000..fa7c10b615259deefd15b638b021da7c60eba1b2
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFTlaV!^FkL$H>Xb%goKr&kC454l@7%

literal 0
HcmV?d00001

diff --git a/t/t5200/url-8 b/t/t5200/url-8
new file mode 100644
index 0000000000000000000000000000000000000000..79a0ba836f5b8886b0a73f161eb292af2b105e65
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFVNA_)6~`0*Vx(G+uYsW-wL6<4>JG&

literal 0
HcmV?d00001

diff --git a/t/t5200/url-9 b/t/t5200/url-9
new file mode 100644
index 0000000000000000000000000000000000000000..8b44bec48b94467c63e8e1ad18162e465da6d6dd
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFW}+g<K*S$=jiF`>+J3B?+U9u5HkP(

literal 0
HcmV?d00001

diff --git a/test-url-normalize.c b/test-url-normalize.c
new file mode 100644
index 0000000..f18bd88
--- /dev/null
+++ b/test-url-normalize.c
@@ -0,0 +1,131 @@
+#ifdef NO_CURL
+
+int main()
+{
+	return 125;
+}
+
+#else /* !NO_CURL */
+
+#include "http.c"
+
+static int run_http_options(const char *file,
+			    const char *opt,
+			    const char *url)
+{
+	struct strbuf opt_lc;
+	size_t i, len;
+
+	if (git_config_with_options(http_options, (void *)url, file, 0))
+		return 1;
+
+	len = strlen(opt);
+	strbuf_init(&opt_lc, len);
+	for (i = 0; i < len; ++i) {
+		strbuf_addch(&opt_lc, tolower(opt[i]));
+	}
+
+	if (!strcmp("sslverify", opt_lc.buf))
+		printf("%s\n", curl_ssl_verify ? "true" : "false");
+	else if (!strcmp("sslcert", opt_lc.buf))
+		printf("%s\n", ssl_cert);
+#if LIBCURL_VERSION_NUM >= 0x070903
+	else if (!strcmp("sslkey", opt_lc.buf))
+		printf("%s\n", ssl_key);
+#endif
+#if LIBCURL_VERSION_NUM >= 0x070908
+	else if (!strcmp("sslcapath", opt_lc.buf))
+		printf("%s\n", ssl_capath);
+#endif
+	else if (!strcmp("sslcainfo", opt_lc.buf))
+		printf("%s\n", ssl_cainfo);
+	else if (!strcmp("sslcertpasswordprotected", opt_lc.buf))
+		printf("%s\n", ssl_cert_password_required ? "true" : "false");
+	else if (!strcmp("ssltry", opt_lc.buf))
+		printf("%s\n", curl_ssl_try ? "true" : "false");
+	else if (!strcmp("minsessions", opt_lc.buf))
+		printf("%d\n", min_curl_sessions);
+	else if (!strcmp("maxrequests", opt_lc.buf))
+		printf("%d\n", max_requests);
+	else if (!strcmp("lowspeedlimit", opt_lc.buf))
+		printf("%ld\n", curl_low_speed_limit);
+	else if (!strcmp("lowspeedtime", opt_lc.buf))
+		printf("%ld\n", curl_low_speed_time);
+	else if (!strcmp("noepsv", opt_lc.buf))
+		printf("%s\n", curl_ftp_no_epsv ? "true" : "false");
+	else if (!strcmp("proxy", opt_lc.buf))
+		printf("%s\n", curl_http_proxy);
+	else if (!strcmp("cookiefile", opt_lc.buf))
+		printf("%s\n", curl_cookie_file);
+	else if (!strcmp("postbuffer", opt_lc.buf))
+		printf("%u\n", (unsigned)http_post_buffer);
+	else if (!strcmp("useragent", opt_lc.buf))
+		printf("%s\n", user_agent);
+
+	return 0;
+}
+
+#define url_normalize(u) http_options_url_normalize(u)
+
+int main(int argc, char **argv)
+{
+	const char *usage = "test-url-normalize [-p | -l] <url1> | <url1> <url2>"
+		" | -c file option <url1>";
+	char *url1, *url2;
+	int opt_p = 0, opt_l = 0, opt_c = 0;
+	char *file = NULL, *optname = NULL;
+
+	/*
+	 * For one url, succeed if url_normalize succeeds on it, fail otherwise.
+	 * For two urls, succeed only if url_normalize succeeds on both and
+	 * the results compare equal with strcmp.  If -p is given (one url only)
+	 * and url_normalize succeeds, print the result followed by "\n".  If
+	 * -l is given (one url only) and url_normalize succeeds, print the
+	 * returned length in decimal followed by "\n".
+	 * If -c is given, call git_config_with_options using the specified file
+	 * and http_options and passing the normalized value of the url.  Then
+	 * print the value of 'option' afterwards.  'option' must be one of the
+	 * valid 'http.*' options.
+	 */
+
+	if (argc > 1 && !strcmp(argv[1], "-p")) {
+		opt_p = 1;
+		argc--;
+		argv++;
+	} else if (argc > 1 && !strcmp(argv[1], "-l")) {
+		opt_l = 1;
+		argc--;
+		argv++;
+	} else if (argc > 3 && !strcmp(argv[1], "-c")) {
+		opt_c = 1;
+		file = argv[2];
+		optname = argv[3];
+		argc -= 3;
+		argv += 3;
+	}
+
+	if (argc < 2 || argc > 3)
+		die(usage);
+
+	if (argc == 2) {
+		url1 = url_normalize(argv[1]);
+		if (!url1)
+			return 1;
+		if (opt_p)
+			printf("%s\n", url1);
+		if (opt_l)
+			printf("%u\n", (unsigned)strlen(url1));
+		if (opt_c)
+			return run_http_options(file, optname, url1);
+		return 0;
+	}
+
+	if (opt_p || opt_l || opt_c)
+		die(usage);
+
+	url1 = url_normalize(argv[1]);
+	url2 = url_normalize(argv[2]);
+	return (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
+}
+
+#endif /* !NO_CURL */
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v8 4/4] config: allow http.<url>.* any user matching
  2013-07-22 12:56 [PATCH v8 0/4] config: add support for http.<url>.* settings Kyle J. McKay
                   ` (2 preceding siblings ...)
  2013-07-22 12:56 ` [PATCH v8 3/4] tests: add new test for the url_normalize function Kyle J. McKay
@ 2013-07-22 12:56 ` Kyle J. McKay
  2013-07-22 18:00   ` Junio C Hamano
  2013-07-24  6:44   ` Jeff King
  3 siblings, 2 replies; 19+ messages in thread
From: Kyle J. McKay @ 2013-07-22 12:56 UTC (permalink / raw)
  To: git
  Cc: David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Jeff King, Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

Previously the <url> had to specify an exactly matching user name
and password if those were present in the url being matched against.

Now the password portion is always ignored and omitting the user
name from <url> allows it to match against any user name.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
---
 Documentation/config.txt |  29 +++--
 http.c                   | 281 +++++++++++++++++++++++++++++++++++++++--------
 t/t5200-url-normalize.sh |   3 +-
 t/t5200/config-3         |   4 +
 test-url-normalize.c     |  17 +--
 5 files changed, 270 insertions(+), 64 deletions(-)
 create mode 100644 t/t5200/config-3

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e461f32..c418adf 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1517,15 +1517,26 @@ http.<url>.*::
 	Any of the http.* options above can be applied selectively to some urls.
 	For example "http.https://example.com.useragent" would set the user
 	agent only for https connections to example.com.  The <url> value
-	matches a url if it is an exact match or if it is a prefix of the url
-	matching at a "/" boundary.  Longer <url> matches take precedence over
-	shorter ones with the environment variable settings taking precedence
-	over all.  The urls are normalized before testing for a match.  Note,
-	however, that any user, password and/or port setting that appears in a
-	url must also match that part of <url> to have a successful match.  The
-	urls that are matched against are those given directly to git commands.
-	This means any urls visited as a result of a redirection do not
-	participate in matching.
+	matches a url if it refers to the same scheme, host and port and the
+	path portion is an exact match or a prefix that matches at a "/"
+	boundary.  If <url> does not include a user name, it will match a url
+	with any username otherwise the user name must match as well (the
+	password part, if present in the url, is always ignored).  Longer <url>
+	path matches take precedence over shorter matches no matter what order
+	they occur in.  For example, if both "https://user@example.com/path" and
+	"https://example.com/path/name" are used as a config <url> value and
+	then "https://user@example.com/path/name/here" is passed to a git
+	command, the settings in the "https://example.com/path/name" section
+	will be preferred because that <url> has a longer path length match than
+	"https://user@example.com/path" even though the latter did match the
+	user.  For same length matches, the last one wins except that a same
+	length <url> match that includes a user name will be preferred over a
+	same length <url> match that does not.  The urls are normalized before
+	matching so that equivalent urls that are simply spelled differently
+	will match properly.  Environment variable settings always override any
+	matches.  The urls that are matched against are those given directly to
+	git commands.  This means any urls visited as a result of a redirection
+	do not participate in matching.
 
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
diff --git a/http.c b/http.c
index 29e119c..c636d3c 100644
--- a/http.c
+++ b/http.c
@@ -56,7 +56,35 @@ enum http_option_type {
 	OPT_MAX
 };
 
+struct url_info {
+	char *url;		/* normalized url on success, must be freed, otherwise NULL */
+	const char *err;	/* if !url, a brief reason for the failure, otherwise NULL */
+
+	/* the rest of the fields are only set if url != NULL */
+
+	size_t url_len;		/* total length of url (which is now normalized) */
+	size_t scheme_len;	/* length of scheme name (excluding final :) */
+	size_t user_off;	/* offset into url to start of user name (0 => none) */
+	size_t user_len;	/* length of user name; if user_off != 0 but
+				   user_len == 0, an empty user name was given */
+	size_t passwd_off;	/* offset into url to start of passwd (0 => none) */
+	size_t passwd_len;	/* length of passwd; if passwd_off != 0 but
+				   passwd_len == 0, an empty passwd was given */
+	size_t host_off;	/* offset into url to start of host name (0 => none) */
+	size_t host_len;	/* length of host name; this INCLUDES any ':portnum';
+				 * file urls may have host_len == 0 */
+	size_t port_len;	/* if a portnum is present (port_len != 0), it has
+				 * this length (excluding the leading ':') at the
+				 * end of the host name (always 0 for file urls) */
+	size_t path_off;	/* offset into url to the start of the url path;
+				 * this will always point to a '/' character
+				 * after the url has been normalized */
+	size_t path_len;	/* length of path portion excluding any trailing
+				 * '?...' and '#...' portion; will always be >= 1 */
+};
+
 static size_t http_option_max_matched_len[OPT_MAX];
+static int http_option_user_matched[OPT_MAX];
 
 static int curl_ssl_verify = -1;
 static int curl_ssl_try;
@@ -231,7 +259,7 @@ static int append_normalized_escapes(struct strbuf *buf,
 	return 1;
 }
 
-static char *http_options_url_normalize(const char *url)
+static char *http_options_url_normalize(const char *url, struct url_info *out_info)
 {
 	/*
 	 * Normalize NUL-terminated url using the following rules:
@@ -254,6 +282,14 @@ static char *http_options_url_normalize(const char *url)
 	 * The return value is a newly allocated string that must be freed
 	 * or NULL if the url is not valid.
 	 *
+	 * If out_info is non-NULL, the url and err fields therein will always
+	 * be set.  If a non-NULL value is returned, it will be stored in
+	 * out_info->url as well, out_info->err will be set to NULL and the
+	 * other fields of *out_info will also be filled in.  If a NULL value
+	 * is returned, NULL will be stored in out_info->url and out_info->err
+	 * will be set to a brief, translated, error message, but no other
+	 * fields will be filled in.
+	 *
 	 * This is NOT a URL validation function.  Full URL validation is NOT
 	 * performed.  Some invalid host names are passed through this function
 	 * undetected.  However, most all other problems that make a URL invalid
@@ -263,9 +299,10 @@ static char *http_options_url_normalize(const char *url)
 	size_t url_len = strlen(url);
 	struct strbuf norm;
 	size_t spanned;
+	size_t scheme_len, user_off=0, user_len=0, passwd_off=0, passwd_len=0;
+	size_t host_off=0, host_len=0, port_len=0, path_off, path_len, result_len;
 	const char *slash_ptr, *at_ptr, *colon_ptr, *path_start;
-	int found_host = 0;
-
+	char *result;
 
 	/*
 	 * Copy lowercased scheme and :// suffix, %-escapes are not allowed
@@ -273,9 +310,15 @@ static char *http_options_url_normalize(const char *url)
 	 */
 	spanned = strspn(url, URL_SCHEME_CHARS);
 	if (!spanned || !isalpha(url[0]) || spanned + 3 > url_len ||
-	    url[spanned] != ':' || url[spanned+1] != '/' || url[spanned+2] != '/')
+	    url[spanned] != ':' || url[spanned+1] != '/' || url[spanned+2] != '/') {
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("invalid URL scheme name or missing '://' suffix");
+		}
 		return NULL; /* Bad scheme and/or missing "://" part */
+	}
 	strbuf_init(&norm, url_len);
+	scheme_len = spanned;
 	spanned += 3;
 	url_len -= spanned;
 	while (spanned--)
@@ -288,12 +331,25 @@ static char *http_options_url_normalize(const char *url)
 	at_ptr = strchr(url, '@');
 	slash_ptr = url + strcspn(url, "/?#");
 	if (at_ptr && at_ptr < slash_ptr) {
+		user_off = norm.len;
 		if (at_ptr > url) {
 			if (!append_normalized_escapes(&norm, url, at_ptr - url,
 						       "", URL_RESERVED)) {
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid %XX escape sequence");
+				}
 				strbuf_release(&norm);
 				return NULL;
 			}
+			colon_ptr = strchr(norm.buf + scheme_len + 3, ':');
+			if (colon_ptr) {
+				passwd_off = (colon_ptr + 1) - norm.buf;
+				passwd_len = norm.len - passwd_off;
+				user_len = (passwd_off - 1) - (scheme_len + 3);
+			} else {
+				user_len = norm.len - (scheme_len + 3);
+			}
 		}
 		strbuf_addch(&norm, '@');
 		url_len -= (++at_ptr - url);
@@ -307,25 +363,37 @@ static char *http_options_url_normalize(const char *url)
 	if (!url_len || strchr(":/?#", *url)) {
 		/* Missing host invalid for all URL schemes except file */
 		if (strncmp(norm.buf, "file:", 5)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("missing host and scheme is not 'file:'");
+			}
 			strbuf_release(&norm);
 			return NULL;
 		}
 	} else {
-		found_host = 1;
+		host_off = norm.len;
 	}
 	colon_ptr = slash_ptr - 1;
 	while (colon_ptr > url && *colon_ptr != ':' && *colon_ptr != ']')
 		colon_ptr--;
 	if (*colon_ptr != ':') {
 		colon_ptr = slash_ptr;
-	} else if (!found_host && colon_ptr < slash_ptr && colon_ptr + 1 != slash_ptr) {
+	} else if (!host_off && colon_ptr < slash_ptr && colon_ptr + 1 != slash_ptr) {
 		/* file: URLs may not have a port number */
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("a 'file:' URL may not have a port number");
+		}
 		strbuf_release(&norm);
 		return NULL;
 	}
 	spanned = strspn(url, URL_HOST_CHARS);
 	if (spanned < colon_ptr - url) {
 		/* Host name has invalid characters */
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("invalid characters in host name");
+		}
 		strbuf_release(&norm);
 		return NULL;
 	}
@@ -367,6 +435,10 @@ static char *http_options_url_normalize(const char *url)
 			spanned = strspn(url, URL_DIGIT);
 			if (spanned < slash_ptr - url) {
 				/* port number has invalid characters */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid port number");
+				}
 				strbuf_release(&norm);
 				return NULL;
 			}
@@ -374,15 +446,22 @@ static char *http_options_url_normalize(const char *url)
 				pnum = strtoul(url, NULL, 10);
 			if (pnum == 0 || pnum > 65535) {
 				/* port number not in range 1..65535 */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid port number");
+				}
 				strbuf_release(&norm);
 				return NULL;
 			}
 			strbuf_addch(&norm, ':');
 			strbuf_add(&norm, url, slash_ptr - url);
+			port_len = slash_ptr - url;
 		}
 		url_len -= slash_ptr - colon_ptr;
 		url = slash_ptr;
 	}
+	if (host_off)
+		host_len = norm.len - host_off;
 
 
 	/*
@@ -390,7 +469,8 @@ static char *http_options_url_normalize(const char *url)
 	 * to corrupt the URL by unescaping any delimiters, but do add an
 	 * initial '/' if it's missing and do normalize any %-escape sequences.
 	 */
-	path_start = norm.buf + norm.len;
+	path_off = norm.len;
+	path_start = norm.buf + path_off;
 	strbuf_addch(&norm, '/');
 	if (*url == '/') {
 		url++;
@@ -406,6 +486,10 @@ static char *http_options_url_normalize(const char *url)
 		 */
 		if (!append_normalized_escapes(&norm, url, next_slash - url, "",
 					       URL_RESERVED)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("invalid %XX escape sequence");
+			}
 			strbuf_release(&norm);
 			return NULL;
 		}
@@ -425,6 +509,10 @@ static char *http_options_url_normalize(const char *url)
 			const char *prev_slash = norm.buf + norm.len - 3;
 			if (prev_slash == path_start) {
 				/* invalid .. because no previous segment to remove */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid '..' path segment");
+				}
 				strbuf_release(&norm);
 				return NULL;
 			}
@@ -446,6 +534,7 @@ static char *http_options_url_normalize(const char *url)
 		if (!skip_add_slash)
 			strbuf_addch(&norm, '/');
 	}
+	path_len = norm.len - path_off;
 
 
 	/*
@@ -454,13 +543,33 @@ static char *http_options_url_normalize(const char *url)
 	 */
 	if (*url) {
 		if (!append_normalized_escapes(&norm, url, url_len, "", URL_RESERVED)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("invalid %XX escape sequence");
+			}
 			strbuf_release(&norm);
 			return NULL;
 		}
 	}
 
 
-	return strbuf_detach(&norm, NULL);
+	result = strbuf_detach(&norm, &result_len);
+	if (out_info) {
+		out_info->url = result;
+		out_info->err = NULL;
+		out_info->url_len = result_len;
+		out_info->scheme_len = scheme_len;
+		out_info->user_off = user_off;
+		out_info->user_len = user_len;
+		out_info->passwd_off = passwd_off;
+		out_info->passwd_len = passwd_len;
+		out_info->host_off = host_off;
+		out_info->host_len = host_len;
+		out_info->port_len = port_len;
+		out_info->path_off = path_off;
+		out_info->path_len = path_len;
+	}
+	return result;
 }
 
 static size_t http_options_url_match_prefix(const char *url,
@@ -476,48 +585,120 @@ static size_t http_options_url_match_prefix(const char *url,
 	 * url must be NUL terminated.  url_prefix_len is the length of
 	 * url_prefix which need not be NUL terminated.
 	 *
-	 * The return value is the length of the match in characters (excluding
-	 * any final '/') or 0 for no match.  Passing "/" as url_prefix will
-	 * always cause 0 to be returned.
+	 * The return value is the length of the match in characters (including
+	 * the final '/' even if it's implicit) or 0 for no match.
 	 *
 	 * Passing NULL as url and/or url_prefix will always cause 0 to be
 	 * returned without causing any faults.
 	 */
-	size_t url_len;
 	if (!url || !url_prefix)
 		return 0;
-	if (url_prefix_len && url_prefix[url_prefix_len - 1] == '/')
+	if (!url_prefix_len || (url_prefix_len == 1 && *url_prefix == '/'))
+		return (!*url || *url == '/') ? 1 : 0;
+	if (url_prefix[url_prefix_len - 1] == '/')
 		url_prefix_len--;
-	if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
+	if (strncmp(url, url_prefix, url_prefix_len))
 		return 0;
-	url_len = strlen(url);
-	if ((url_len == url_prefix_len) || (url[url_prefix_len] == '/'))
-		return url_prefix_len;
+	if ((strlen(url) == url_prefix_len) || (url[url_prefix_len] == '/'))
+		return url_prefix_len + 1;
 	return 0;
 }
 
-static int new_match_is_shorter(size_t matchlen, enum http_option_type opt)
+static int http_options_match_urls(const struct url_info *url,
+				   const struct url_info *url_prefix,
+				   int *exactusermatch)
+{
+	/*
+	 * url_prefix matches url if the scheme, host and port of url_prefix
+	 * are the same as those of url and the path portion of url_prefix
+	 * is the same as the path portion of url or it is a prefix that
+	 * matches at a '/' boundary.  If url_prefix contains a user name,
+	 * that must also exactly match the user name in url.
+	 *
+	 * If the user, host, port and path match in this fashion, the returned
+	 * value is the length of the path match including any implicit
+	 * final '/'.  For example, "http://me@example.com/path" is matched by
+	 * "http://example.com" with a path length of 1.
+	 *
+	 * If there is a match and exactusermatch is not NULL, then
+	 * *exactusermatch will be set to true if both url and url_prefix
+	 * contained a user name or false if url_prefix did not have a
+	 * user name.  If there is no match *exactusermatch is left untouched.
+	 */
+	int usermatched = 0;
+	int pathmatchlen;
+
+	if (!url || !url_prefix || !url->url || !url_prefix->url)
+		return 0;
+
+	/* check the scheme */
+	if (url_prefix->scheme_len != url->scheme_len ||
+	    strncmp(url->url, url_prefix->url, url->scheme_len))
+		return 0; /* schemes do not match */
+
+	/* check the user name if url_prefix has one */
+	if (url_prefix->user_off) {
+		if (!url->user_off || url->user_len != url_prefix->user_len ||
+		    strncmp(url->url + url->user_off,
+			    url_prefix->url + url_prefix->user_off,
+			    url->user_len))
+			return 0; /* url_prefix has a user but it's not a match */
+		usermatched = 1;
+	}
+
+	/* check the host and port */
+	if (url_prefix->host_len != url->host_len ||
+	    strncmp(url->url + url->host_off,
+		    url_prefix->url + url_prefix->host_off, url->host_len))
+		return 0; /* host names and/or ports do not match */
+
+	/* check the path */
+	pathmatchlen = http_options_url_match_prefix(
+		url->url + url->path_off,
+		url_prefix->url + url_prefix->path_off,
+		url_prefix->url_len - url_prefix->path_off);
+
+	if (pathmatchlen && exactusermatch)
+		*exactusermatch = usermatched;
+	return pathmatchlen;
+}
+
+static int match_is_ignored(size_t matchlen, int usermatch, enum http_option_type opt)
 {
 	/*
-	 * Compare matchlen to the last matched length of option opt and
+	 * Compare matchlen to the last matched path length of option opt and
 	 * return true if matchlen is shorter than the last matched length
 	 * (meaning the config setting should be ignored).  Upon seeing the
 	 * _same_ key (i.e. new key has the same match length which is therefore
-	 * not shorter) the new setting will override the previous setting.
-	 * Otherwise return false and record matchlen as the current last
-	 * matched length of option opt.
+	 * not shorter) the new setting will override the previous setting
+	 * unless the new setting did not match the user and the previous match
+	 * did.  Otherwise return false and record matchlen as the current last
+	 * matched path length of option opt and usermatch as the last user
+	 * matching state for option opt.
 	 */
 	if (matchlen < http_option_max_matched_len[opt])
 		return 1;
-	http_option_max_matched_len[opt] = matchlen;
+	if (matchlen > http_option_max_matched_len[opt]) {
+		http_option_max_matched_len[opt] = matchlen;
+		http_option_user_matched[opt] = usermatch;
+		return 0;
+	}
+	/*
+	 * If a previous match of the same length explicitly matched the user
+	 * name, but the current match matched on any user, ignore it.
+	 */
+	if (!usermatch && http_option_user_matched[opt])
+		return 1;
+	http_option_user_matched[opt] = usermatch;
 	return 0;
 }
 
 static int http_options(const char *var, const char *value, void *cb)
 {
-	const char *url = cb;
+	const struct url_info *info = cb;
 	const char *key, *dot;
 	size_t matchlen = 0;
+	int usermatch = 0;
 
 	key = skip_prefix(var, "http.");
 	if (!key)
@@ -532,12 +713,18 @@ static int http_options(const char *var, const char *value, void *cb)
 	 */
 	dot = strrchr(key, '.');
 	if (dot) {
-		char *config_url = xmemdupz(key, dot - key);
-		char *norm_url = http_options_url_normalize(config_url);
+		char *config_url;
+		struct url_info norm_info;
+		char *norm_url;
+
+		if (!info || !info->url)
+			return 0;
+		config_url = xmemdupz(key, dot - key);
+		norm_url = http_options_url_normalize(config_url, &norm_info);
 		free(config_url);
 		if (!norm_url)
 			return 0;
-		matchlen = http_options_url_match_prefix(url, norm_url, strlen(norm_url));
+		matchlen = http_options_match_urls(info, &norm_info, &usermatch);
 		free(norm_url);
 		if (!matchlen)
 			return 0;
@@ -545,49 +732,49 @@ static int http_options(const char *var, const char *value, void *cb)
 	}
 
 	if (!strcmp("sslverify", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_VERIFY))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_VERIFY))
 			return 0;
 		curl_ssl_verify = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("sslcert", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_CERT))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_CERT))
 			return 0;
 		return git_config_string(&ssl_cert, var, value);
 	}
 #if LIBCURL_VERSION_NUM >= 0x070903
 	if (!strcmp("sslkey", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_KEY))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_KEY))
 			return 0;
 		return git_config_string(&ssl_key, var, value);
 	}
 #endif
 #if LIBCURL_VERSION_NUM >= 0x070908
 	if (!strcmp("sslcapath", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_CAPATH))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_CAPATH))
 			return 0;
 		return git_config_string(&ssl_capath, var, value);
 	}
 #endif
 	if (!strcmp("sslcainfo", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_CAINFO))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_CAINFO))
 			return 0;
 		return git_config_string(&ssl_cainfo, var, value);
 	}
 	if (!strcmp("sslcertpasswordprotected", key)) {
-		if (new_match_is_shorter(matchlen, OPT_PASSWD_REQ))
+		if (match_is_ignored(matchlen, usermatch, OPT_PASSWD_REQ))
 			return 0;
 		ssl_cert_password_required = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("ssltry", key)) {
-		if (new_match_is_shorter(matchlen, OPT_SSL_TRY))
+		if (match_is_ignored(matchlen, usermatch, OPT_SSL_TRY))
 			return 0;
 		curl_ssl_try = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("minsessions", key)) {
-		if (new_match_is_shorter(matchlen, OPT_MIN_SESSIONS))
+		if (match_is_ignored(matchlen, usermatch, OPT_MIN_SESSIONS))
 			return 0;
 		min_curl_sessions = git_config_int(var, value);
 #ifndef USE_CURL_MULTI
@@ -598,45 +785,45 @@ static int http_options(const char *var, const char *value, void *cb)
 	}
 #ifdef USE_CURL_MULTI
 	if (!strcmp("maxrequests", key)) {
-		if (new_match_is_shorter(matchlen, OPT_MAX_REQUESTS))
+		if (match_is_ignored(matchlen, usermatch, OPT_MAX_REQUESTS))
 			return 0;
 		max_requests = git_config_int(var, value);
 		return 0;
 	}
 #endif
 	if (!strcmp("lowspeedlimit", key)) {
-		if (new_match_is_shorter(matchlen, OPT_LOW_SPEED))
+		if (match_is_ignored(matchlen, usermatch, OPT_LOW_SPEED))
 			return 0;
 		curl_low_speed_limit = (long)git_config_int(var, value);
 		return 0;
 	}
 	if (!strcmp("lowspeedtime", key)) {
-		if (new_match_is_shorter(matchlen, OPT_LOW_TIME))
+		if (match_is_ignored(matchlen, usermatch, OPT_LOW_TIME))
 			return 0;
 		curl_low_speed_time = (long)git_config_int(var, value);
 		return 0;
 	}
 
 	if (!strcmp("noepsv", key)) {
-		if (new_match_is_shorter(matchlen, OPT_NO_EPSV))
+		if (match_is_ignored(matchlen, usermatch, OPT_NO_EPSV))
 			return 0;
 		curl_ftp_no_epsv = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("proxy", key)) {
-		if (new_match_is_shorter(matchlen, OPT_HTTP_PROXY))
+		if (match_is_ignored(matchlen, usermatch, OPT_HTTP_PROXY))
 			return 0;
 		return git_config_string(&curl_http_proxy, var, value);
 	}
 
 	if (!strcmp("cookiefile", key)) {
-		if (new_match_is_shorter(matchlen, OPT_COOKIE_FILE))
+		if (match_is_ignored(matchlen, usermatch, OPT_COOKIE_FILE))
 			return 0;
 		return git_config_string(&curl_cookie_file, var, value);
 	}
 
 	if (!strcmp("postbuffer", key)) {
-		if (new_match_is_shorter(matchlen, OPT_POST_BUFFER))
+		if (match_is_ignored(matchlen, usermatch, OPT_POST_BUFFER))
 			return 0;
 		http_post_buffer = git_config_int(var, value);
 		if (http_post_buffer < LARGE_PACKET_MAX)
@@ -645,7 +832,7 @@ static int http_options(const char *var, const char *value, void *cb)
 	}
 
 	if (!strcmp("useragent", key)) {
-		if (new_match_is_shorter(matchlen, OPT_USER_AGENT))
+		if (match_is_ignored(matchlen, usermatch, OPT_USER_AGENT))
 			return 0;
 		return git_config_string(&user_agent, var, value);
 	}
@@ -774,13 +961,15 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
 	char *low_speed_time;
-	char *norm_url = http_options_url_normalize(url);
+	struct url_info info;
 
 	http_is_verbose = 0;
 
 	memset(http_option_max_matched_len, 0, sizeof(http_option_max_matched_len));
-	git_config(http_options, norm_url);
-	free(norm_url);
+	memset(http_option_user_matched, 0, sizeof(http_option_user_matched));
+	http_options_url_normalize(url, &info);
+	git_config(http_options, &info);
+	free(info.url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
diff --git a/t/t5200-url-normalize.sh b/t/t5200-url-normalize.sh
index d2dd886..f79bb0f 100755
--- a/t/t5200-url-normalize.sh
+++ b/t/t5200-url-normalize.sh
@@ -192,7 +192,8 @@ test_expect_success 'url config normalization matching' '
 	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://example.com")" = "true" &&
 	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://example.com/path")" = "true" &&
 	test "$(test-url-normalize -c "$tc-2" "useragent" "HTTPS://example.COM/p%61th")" = "example-agent" &&
-	test "$(test-url-normalize -c "$tc-2" "sslVerify" "HTTPS://example.COM/p%61th")" = "false"
+	test "$(test-url-normalize -c "$tc-2" "sslVerify" "HTTPS://example.COM/p%61th")" = "false" &&
+	test "$(test-url-normalize -c "$tc-3" "sslcainfo" "https://user@example.com/path/name/here")" = "file-1"
 '
 
 test_done
diff --git a/t/t5200/config-3 b/t/t5200/config-3
new file mode 100644
index 0000000..5c8d3e8
--- /dev/null
+++ b/t/t5200/config-3
@@ -0,0 +1,4 @@
+[http "https://example.com/path/name"]
+	sslcainfo = file-1
+[http "https://user@example.com/path"]
+	sslcainfo = file-2
diff --git a/test-url-normalize.c b/test-url-normalize.c
index f18bd88..4f870dc 100644
--- a/test-url-normalize.c
+++ b/test-url-normalize.c
@@ -11,12 +11,12 @@ int main()
 
 static int run_http_options(const char *file,
 			    const char *opt,
-			    const char *url)
+			    const struct url_info *info)
 {
 	struct strbuf opt_lc;
 	size_t i, len;
 
-	if (git_config_with_options(http_options, (void *)url, file, 0))
+	if (git_config_with_options(http_options, (void *)info, file, 0))
 		return 1;
 
 	len = strlen(opt);
@@ -65,7 +65,7 @@ static int run_http_options(const char *file,
 	return 0;
 }
 
-#define url_normalize(u) http_options_url_normalize(u)
+#define url_normalize(u,i) http_options_url_normalize(u,i)
 
 int main(int argc, char **argv)
 {
@@ -108,23 +108,24 @@ int main(int argc, char **argv)
 		die(usage);
 
 	if (argc == 2) {
-		url1 = url_normalize(argv[1]);
+		struct url_info info;
+		url1 = url_normalize(argv[1], &info);
 		if (!url1)
 			return 1;
 		if (opt_p)
 			printf("%s\n", url1);
 		if (opt_l)
-			printf("%u\n", (unsigned)strlen(url1));
+			printf("%u\n", (unsigned)info.url_len);
 		if (opt_c)
-			return run_http_options(file, optname, url1);
+			return run_http_options(file, optname, &info);
 		return 0;
 	}
 
 	if (opt_p || opt_l || opt_c)
 		die(usage);
 
-	url1 = url_normalize(argv[1]);
-	url2 = url_normalize(argv[2]);
+	url1 = url_normalize(argv[1], NULL);
+	url2 = url_normalize(argv[2], NULL);
 	return (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
 }
 
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 4/4] config: allow http.<url>.* any user matching
  2013-07-22 12:56 ` [PATCH v8 4/4] config: allow http.<url>.* any user matching Kyle J. McKay
@ 2013-07-22 18:00   ` Junio C Hamano
  2013-07-22 20:24     ` Kyle J. McKay
  2013-07-24  6:44   ` Jeff King
  1 sibling, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2013-07-22 18:00 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

"Kyle J. McKay" <mackyle@gmail.com> writes:

> Previously the <url> had to specify an exactly matching user name
> and password if those were present in the url being matched against.
>
> Now the password portion is always ignored and omitting the user
> name from <url> allows it to match against any user name.
>
> Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
> ---
>  Documentation/config.txt |  29 +++--

Thanks.

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index e461f32..c418adf 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1517,15 +1517,26 @@ http.<url>.*::
>  	Any of the http.* options above can be applied selectively to some urls.
>  	For example "http.https://example.com.useragent" would set the user
>  	agent only for https connections to example.com.  The <url> value
> +	matches a url if it refers to the same scheme, host and port and the
> +	path portion is an exact match or a prefix that matches at a "/"
> +	boundary.  If <url> does not include a user name, it will match a url
> +	with any username otherwise the user name must match as well (the
> +	password part, if present in the url, is always ignored).  Longer <url>
> +	path matches take precedence over shorter matches no matter what order
> +	they occur in.  For example, if both "https://user@example.com/path" and
> +	"https://example.com/path/name" are used as a config <url> value and
> +	then "https://user@example.com/path/name/here" is passed to a git
> +	command, the settings in the "https://example.com/path/name" section
> +	will be preferred because that <url> has a longer path length match than
> +	"https://user@example.com/path" even though the latter did match the
> +	user.  For same length matches, the last one wins except that a same
> +	length <url> match that includes a user name will be preferred over a
> +	same length <url> match that does not.  The urls are normalized before
> +	matching so that equivalent urls that are simply spelled differently
> +	will match properly.  Environment variable settings always override any
> +	matches.  The urls that are matched against are those given directly to
> +	git commands.  This means any urls visited as a result of a redirection
> +	do not participate in matching.

A solid wall of text is somewhat hard to read, so I'd queue the
equivalent of the following "git diff -w" output on top.  I also was
trying to see if we can clarify the "length comparison" only refers
to the length of the path part, excluding the length of "user@"
(i.e. when comparing "https://user@example.com/path" with
"https://example.com/path", they are of the same length), which you
can see in the first three lines below.

diff --git a/Documentation/config.txt b/Documentation/config.txt
index c418adf..635ed5d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1521,9 +1521,11 @@ http.<url>.*::
 	path portion is an exact match or a prefix that matches at a "/"
 	boundary.  If <url> does not include a user name, it will match a url
 	with any username otherwise the user name must match as well (the
-	password part, if present in the url, is always ignored).  Longer <url>
-	path matches take precedence over shorter matches no matter what order
-	they occur in.  For example, if both "https://user@example.com/path" and
+	password part, if present in the url, is always ignored).  A <url>
+	with longer path matches take precedence over shorter matches no matter
+	what order they occur in the configuration file.
++
+For example, if both "https://user@example.com/path" and
 "https://example.com/path/name" are used as a config <url> value and
 then "https://user@example.com/path/name/here" is passed to a git
 command, the settings in the "https://example.com/path/name" section

I am not yet convinced that the precedence rule specified in this
what we want (I do not have an example why it is *not* what we want,
either).  Another definition could be "if user@ is present in the
request, give lower precedence to config entries for the site
without user@ than entries with user@", and I do not have a strong
opinion myself which one between the two is better (and there may be
third and other possible rule).

Comments?

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 4/4] config: allow http.<url>.* any user matching
  2013-07-22 18:00   ` Junio C Hamano
@ 2013-07-22 20:24     ` Kyle J. McKay
  2013-07-22 21:51       ` Junio C Hamano
  2013-07-24  6:42       ` Jeff King
  0 siblings, 2 replies; 19+ messages in thread
From: Kyle J. McKay @ 2013-07-22 20:24 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

On Jul 22, 2013, at 11:00, Junio C Hamano wrote:
> "Kyle J. McKay" <mackyle@gmail.com> writes:
>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index e461f32..c418adf 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -1517,15 +1517,26 @@ http.<url>.*::
>> 	Any of the http.* options above can be applied selectively to some  
>> urls.
>> 	For example "http.https://example.com.useragent" would set the user
>> 	agent only for https connections to example.com.  The <url> value
>> +	matches a url if it refers to the same scheme, host and port and  
>> the
>> +	path portion is an exact match or a prefix that matches at a "/"
>> +	boundary.  If <url> does not include a user name, it will match a  
>> url
>> +	with any username otherwise the user name must match as well (the
>> +	password part, if present in the url, is always ignored).  Longer  
>> <url>
>> +	path matches take precedence over shorter matches no matter what  
>> order
>> +	they occur in.  For example, if both "https://user@example.com/ 
>> path" and
>> +	"https://example.com/path/name" are used as a config <url> value  
>> and
>> +	then "https://user@example.com/path/name/here" is passed to a git
>> +	command, the settings in the "https://example.com/path/name"  
>> section
>> +	will be preferred because that <url> has a longer path length  
>> match than
>> +	"https://user@example.com/path" even though the latter did match  
>> the
>> +	user.  For same length matches, the last one wins except that a  
>> same
>> +	length <url> match that includes a user name will be preferred  
>> over a
>> +	same length <url> match that does not.  The urls are normalized  
>> before
>> +	matching so that equivalent urls that are simply spelled  
>> differently
>> +	will match properly.  Environment variable settings always  
>> override any
>> +	matches.  The urls that are matched against are those given  
>> directly to
>> +	git commands.  This means any urls visited as a result of a  
>> redirection
>> +	do not participate in matching.
>
> A solid wall of text is somewhat hard to read, so I'd queue the
> equivalent of the following "git diff -w" output on top.

Can I send out the change as a 'fixup!' patch?  Or do I need to send a  
new v9 patch series with the documentation update?

> I also was
> trying to see if we can clarify the "length comparison" only refers
> to the length of the path part, excluding the length of "user@"
> (i.e. when comparing "https://user@example.com/path" with
> "https://example.com/path", they are of the same length), which you
> can see in the first three lines below.
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index c418adf..635ed5d 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1521,9 +1521,11 @@ http.<url>.*::
> 	path portion is an exact match or a prefix that matches at a "/"
> 	boundary.  If <url> does not include a user name, it will match a url
> 	with any username otherwise the user name must match as well (the
> -	password part, if present in the url, is always ignored).  Longer  
> <url>
> -	path matches take precedence over shorter matches no matter what  
> order
> -	they occur in.  For example, if both "https://user@example.com/ 
> path" and
> +	password part, if present in the url, is always ignored).  A <url>
> +	with longer path matches take precedence over shorter matches no  
> matter
> +	what order they occur in the configuration file.
> ++
> +For example, if both "https://user@example.com/path" and
> "https://example.com/path/name" are used as a config <url> value and
> then "https://user@example.com/path/name/here" is passed to a git
> command, the settings in the "https://example.com/path/name" section

OK.

> I am not yet convinced that the precedence rule specified in this
> what we want (I do not have an example why it is *not* what we want,
> either).  Another definition could be "if user@ is present in the
> request, give lower precedence to config entries for the site
> without user@ than entries with user@", and I do not have a strong
> opinion myself which one between the two is better (and there may be
> third and other possible rule).
>
> Comments?

Consider this site:

example.com/
example.com/dir
example.com/dir/sub
example.com/dir/sub/public

Suppose I want to configure a particular key for example.com/dir/sub  
for a particular user, say "contractor".

I can configure:

[http "https://contractor@example.com/dir/sub"]
   sslkey=contractor_key

But then I want to configure a completely different key for the public  
area example.com/dir/sub/public no matter what user.  I just add this:

[http "https://example.com/dir/sub/public"]
   sslkey=public_key

But if entries with usernames take precedence it won't work.  The only  
way to do it is to list every possible user for "example.com/dir1/sub1/ 
public" in its own new section and then continue to add a new config  
section every time a new user name is encountered.

Conversely, if a special key is supposed to be used for the user  
"contractor" everywhere on the site at or below "example.com/dir/sub",  
then the above configuration doesn't work unless user matches take  
precedence.  With the current code, one additional entry would have to  
be added like so:

[http "https://contractor@example.com/dir/sub/public"]
   sslkey=contractor_key

So my thinking was that having user matches take precedence over path  
length matches can result in endless additions to the config file  
(because you have to list all the other users to override a sub area  
and that could be a large list) whereas having path length matches  
take precedence over user matches will only result in a few, finite  
additions to the config file (the number of already-configured items  
with a longer path).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 4/4] config: allow http.<url>.* any user matching
  2013-07-22 20:24     ` Kyle J. McKay
@ 2013-07-22 21:51       ` Junio C Hamano
  2013-07-22 22:18         ` Kyle J. McKay
  2013-07-24  6:42       ` Jeff King
  1 sibling, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2013-07-22 21:51 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

"Kyle J. McKay" <mackyle@gmail.com> writes:

>> A solid wall of text is somewhat hard to read, so I'd queue the
>> equivalent of the following "git diff -w" output on top.
>
> Can I send out the change as a 'fixup!' patch?  Or do I need to send a
> new v9 patch series with the documentation update?

If you are OK with splitting it into two paragraphs with the
"longest" clarification tweak (the "patch" I showed you), just
saying so and I can squash ;-) so there is no need to resend.

>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index c418adf..635ed5d 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -1521,9 +1521,11 @@ http.<url>.*::
>> 	path portion is an exact match or a prefix that matches at a "/"
>> 	boundary.  If <url> does not include a user name, it will match a url
>> 	with any username otherwise the user name must match as well (the
>> -	password part, if present in the url, is always ignored).
>> Longer <url>
>> -	path matches take precedence over shorter matches no matter
>> what order
>> -	they occur in.  For example, if both
>> "https://user@example.com/
>> path" and
>> +	password part, if present in the url, is always ignored).  A <url>
>> +	with longer path matches take precedence over shorter matches
>> no matter
>> +	what order they occur in the configuration file.
>> ++
>> +For example, if both "https://user@example.com/path" and
>> "https://example.com/path/name" are used as a config <url> value and
>> then "https://user@example.com/path/name/here" is passed to a git
>> command, the settings in the "https://example.com/path/name" section
>
> OK.

... which essentially is your "OK" ;-)

>> I am not yet convinced that the precedence rule specified in this
>> what we want (I do not have an example why it is *not* what we want,
>> either).  Another definition could be "if user@ is present in the
>> request, give lower precedence to config entries for the site
>> without user@ than entries with user@", and I do not have a strong
>> opinion myself which one between the two is better (and there may be
>> third and other possible rule).
>>
>> Comments?
>
> Consider this site:
> ...
> So my thinking was that having user matches take precedence over path
> length matches can result in endless additions to the config file
> (because you have to list all the other users to override a sub area
> and that could be a large list) whereas having path length matches
> take precedence over user matches will only result in a few, finite
> additions to the config file (the number of already-configured items
> with a longer path).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 4/4] config: allow http.<url>.* any user matching
  2013-07-22 21:51       ` Junio C Hamano
@ 2013-07-22 22:18         ` Kyle J. McKay
  2013-07-22 22:34           ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Kyle J. McKay @ 2013-07-22 22:18 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine


On Jul 22, 2013, at 14:51, Junio C Hamano wrote:

> "Kyle J. McKay" <mackyle@gmail.com> writes:
>
>>> A solid wall of text is somewhat hard to read, so I'd queue the
>>> equivalent of the following "git diff -w" output on top.
>>
>> Can I send out the change as a 'fixup!' patch?  Or do I need to  
>> send a
>> new v9 patch series with the documentation update?
>
> If you are OK with splitting it into two paragraphs with the
> "longest" clarification tweak (the "patch" I showed you), just
> saying so and I can squash ;-) so there is no need to resend.

The wording of:

"+	password part, if present in the url, is always ignored).  A <url>
+	with longer path matches take precedence over shorter matches no  
matter
+	what order they occur in the configuration file.

needs to be fixed first.  Replace "take" with "takes" and you can go  
ahead and squash it in.  :)


>>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>>> index c418adf..635ed5d 100644
>>> --- a/Documentation/config.txt
>>> +++ b/Documentation/config.txt
>>> @@ -1521,9 +1521,11 @@ http.<url>.*::
>>> 	path portion is an exact match or a prefix that matches at a "/"
>>> 	boundary.  If <url> does not include a user name, it will match a  
>>> url
>>> 	with any username otherwise the user name must match as well (the
>>> -	password part, if present in the url, is always ignored).
>>> Longer <url>
>>> -	path matches take precedence over shorter matches no matter
>>> what order
>>> -	they occur in.  For example, if both
>>> "https://user@example.com/
>>> path" and
>>> +	password part, if present in the url, is always ignored).  A <url>
>>> +	with longer path matches take precedence over shorter matches
>>> no matter
>>> +	what order they occur in the configuration file.
>>> ++
>>> +For example, if both "https://user@example.com/path" and
>>> "https://example.com/path/name" are used as a config <url> value and
>>> then "https://user@example.com/path/name/here" is passed to a git
>>> command, the settings in the "https://example.com/path/name" section
>>
>> OK.
>
> ... which essentially is your "OK" ;-)

Yes, I meant that as "OK I will send out an update that includes  
something like this."

After replacing "take" with "takes" in the change I'm good with just  
squashing that diff in.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 4/4] config: allow http.<url>.* any user matching
  2013-07-22 22:18         ` Kyle J. McKay
@ 2013-07-22 22:34           ` Junio C Hamano
  0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2013-07-22 22:34 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Richard Hartmann, Jeff King,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

"Kyle J. McKay" <mackyle@gmail.com> writes:

> After replacing "take" with "takes" in the change I'm good with just
> squashing that diff in.

Thanks for proofreading.  Then let's omit an extra back-and-forth.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 4/4] config: allow http.<url>.* any user matching
  2013-07-22 20:24     ` Kyle J. McKay
  2013-07-22 21:51       ` Junio C Hamano
@ 2013-07-24  6:42       ` Jeff King
  2013-07-24 15:00         ` Junio C Hamano
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff King @ 2013-07-24  6:42 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: Junio C Hamano, git, David Aguilar, Petr Baudis, Richard Hartmann,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

On Mon, Jul 22, 2013 at 01:24:06PM -0700, Kyle J. McKay wrote:

> >I am not yet convinced that the precedence rule specified in this
> >what we want (I do not have an example why it is *not* what we want,
> >either).  Another definition could be "if user@ is present in the
> >request, give lower precedence to config entries for the site
> >without user@ than entries with user@", and I do not have a strong
> >opinion myself which one between the two is better (and there may be
> >third and other possible rule).
> >
> >Comments?
> 
> Consider this site:
> [...]

Thanks for explaining, and sorry I missed out on the last few rounds of
review.

I think your scheme (normalization plus special handling of the username
field) addresses my biggest concern, which is matching in the face of
optional usernames. The only two things that make me wary are:

  1. The explanation and special-casing of username is a little
     complicated to explain.

  2. The behavior for resolving the value when faced with multiple
     possibilities is completely unlike the rest of the config system
     (both dropping last-one-wins, and unlike the URL matching for
     credentials).

I think we can decide that (2) is worth it if your semantics are more
flexible in practice. It would be nice to see real-world feedback on how
people use it before setting the behavior in stone, but there's sort of
a chicken and egg problem there.

For (1), I wonder if the explanation would be simpler if the precedences
of each sub-part were simply laid out. That is, would it be correct to
say something like:

  For a config key to match a URL, each element of the config key (if
  present) is compared to that of the URL, in the following order:

    1. Protocol (e.g., `https` in `https://example.com/`). This field
       must match exactly between the config key and the URL.

    2. Host/domain name (e.g., `example.com` in `https://example.com/`).
       This field must match exactly between the config key and the URL.

    3. Path (e.g., `repo.git` in `https://example.com/repo.git`). This
       field is prefix-matched by slash-delimited path elements, so that
       config key `foo/` matches URL `foo/bar`. Longer matches take
       precedence (so `foo/bar`, if it exists, is a better match than
       just `foo/`).

    4. Username (e.g., `user` in `https://user@example.com/repo.git`).

  The list above is ordered by decreasing precedence; a URL that matches
  a config key's path is preferred to one that matches its username.

I don't know if that is more or less clear of an explanation. It makes
more sense to me, but that is probably because I wrote it. I'm also not
100% sure it describes your implementation, but I think it is equivalent
to the prefix matching with normalization.

I have a few other comments on specific patches; I'll send them
separately.

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 4/4] config: allow http.<url>.* any user matching
  2013-07-22 12:56 ` [PATCH v8 4/4] config: allow http.<url>.* any user matching Kyle J. McKay
  2013-07-22 18:00   ` Junio C Hamano
@ 2013-07-24  6:44   ` Jeff King
  1 sibling, 0 replies; 19+ messages in thread
From: Jeff King @ 2013-07-24  6:44 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

On Mon, Jul 22, 2013 at 05:56:44AM -0700, Kyle J. McKay wrote:

> +	matches a url if it refers to the same scheme, host and port and the
> +	path portion is an exact match or a prefix that matches at a "/"
> +	boundary.  If <url> does not include a user name, it will match a url
> +	with any username otherwise the user name must match as well (the
> +	password part, if present in the url, is always ignored).  Longer <url>
> +	path matches take precedence over shorter matches no matter what order
> +	they occur in.  For example, if both "https://user@example.com/path" and
> +	"https://example.com/path/name" are used as a config <url> value and
> +	then "https://user@example.com/path/name/here" is passed to a git
> +	command, the settings in the "https://example.com/path/name" section

These "https://..." should probably be `https://...`, to mark them in
asciidoc as literals.

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 3/4] tests: add new test for the url_normalize function
  2013-07-22 12:56 ` [PATCH v8 3/4] tests: add new test for the url_normalize function Kyle J. McKay
@ 2013-07-24  6:59   ` Jeff King
  2013-07-24 17:14     ` Junio C Hamano
  2013-07-24 19:01     ` Kyle J. McKay
  0 siblings, 2 replies; 19+ messages in thread
From: Jeff King @ 2013-07-24  6:59 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

On Mon, Jul 22, 2013 at 05:56:43AM -0700, Kyle J. McKay wrote:

> In order to perform sane URL matching for http.<url>.* options,
> http.c normalizes URLs before performing matches.
> 
> A new test-url-normalize test program is introduced along with
> a new t5200-url-normalize.sh script to run the tests.

While looking at test-url-normalize (and wanting to experiment some with
your series to see how it handled a few behaviors), I wondered if we
shouldn't be exposing this selection process to the user somehow via
git-config.

That is, would a shell script ever want to say "what is the value of
this config variable for url $X"? Certainly our test scripts want to,
and having a test-* program covers that, but might user scripts want to
do the same? Or even to introduce its own URL-matched config options?

How hard would it be to convert the "-c" option of test-url-normalize
into something like:

  git config --file=foo --url http noepsv $URL

which would look for http.$URL.noepsv matches.

If we want to go that route, we don't have to do it now (the test-*
interface is unpublished, and the git-config interface can simply come
later than the internal feature).  But it may be worth thinking about
now while it is in your head.

> diff --git a/test-url-normalize.c b/test-url-normalize.c
> new file mode 100644
> index 0000000..f18bd88
> --- /dev/null
> +++ b/test-url-normalize.c
> [...]
> +	if (!strcmp("sslverify", opt_lc.buf))
> +		printf("%s\n", curl_ssl_verify ? "true" : "false");
> +	else if (!strcmp("sslcert", opt_lc.buf))
> +		printf("%s\n", ssl_cert);
> +#if LIBCURL_VERSION_NUM >= 0x070903
> +	else if (!strcmp("sslkey", opt_lc.buf))
> +		printf("%s\n", ssl_key);
> +#endif
> +#if LIBCURL_VERSION_NUM >= 0x070908
> +	else if (!strcmp("sslcapath", opt_lc.buf))
> +		printf("%s\n", ssl_capath);
> +#endif
> [...]

Do we need to have the complete list of options here, including curl
version limitations? It seems like this will eventually get out of date
with the list of options. Would it be sufficient to test just one (or
even just handle a fake "http.$URL.foo" variable)?

> +#define url_normalize(u) http_options_url_normalize(u)

Does this macro do anything besides shorten the name? Is the extra
level of indirection to the reader worth it?

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 1/4] config: add support for http.<url>.* settings
  2013-07-22 12:56 ` [PATCH v8 1/4] " Kyle J. McKay
@ 2013-07-24  7:12   ` Jeff King
  0 siblings, 0 replies; 19+ messages in thread
From: Jeff King @ 2013-07-24  7:12 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

On Mon, Jul 22, 2013 at 05:56:41AM -0700, Kyle J. McKay wrote:

> +enum http_option_type {
> +	OPT_POST_BUFFER,
> +	OPT_MIN_SESSIONS,
> +	OPT_SSL_VERIFY,
> +	OPT_SSL_TRY,
> +	OPT_SSL_CERT,
> +	OPT_SSL_CAINFO,
> +	OPT_LOW_SPEED,
> +	OPT_LOW_TIME,
> +	OPT_NO_EPSV,
> +	OPT_HTTP_PROXY,
> +	OPT_COOKIE_FILE,
> +	OPT_USER_AGENT,
> +	OPT_PASSWD_REQ,
> +#ifdef USE_CURL_MULTI
> +	OPT_MAX_REQUESTS,
> +#endif
> +#if LIBCURL_VERSION_NUM >= 0x070903
> +	OPT_SSL_KEY,
> +#endif
> +#if LIBCURL_VERSION_NUM >= 0x070908
> +	OPT_SSL_CAPATH,
> +#endif
> +	OPT_MAX
> +};
> +
> +static size_t http_option_max_matched_len[OPT_MAX];

It's frustrating that we now have an extra spot to add new options to
(e.g., somebody else is adding http.savecookies in another thread, and
the merge will need to not just resolve the textual conflicts, but add
it to this other list).

Might it be simpler to just keep a dynamic list indexed by option name
(either a hash table, or a sorted string_list)? We only incur a lookup
when we find an actual config entry, so the number of lookups (and
entries) should be minuscule and not affect performance. And as a bonus,
it lets us handle arbitrary keys if we want to allow "git config" to
learn about url matching for user-specified keys.

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 4/4] config: allow http.<url>.* any user matching
  2013-07-24  6:42       ` Jeff King
@ 2013-07-24 15:00         ` Junio C Hamano
  0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2013-07-24 15:00 UTC (permalink / raw)
  To: Jeff King
  Cc: Kyle J. McKay, git, David Aguilar, Petr Baudis, Richard Hartmann,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

Jeff King <peff@peff.net> writes:

>   1. The explanation and special-casing of username is a little
>      complicated to explain.
>
>   2. The behavior for resolving the value when faced with multiple
>      possibilities is completely unlike the rest of the config system
>      (both dropping last-one-wins, and unlike the URL matching for
>      credentials).
>
> I think we can decide that (2) is worth it if your semantics are more
> flexible in practice. It would be nice to see real-world feedback on how
> people use it before setting the behavior in stone, but there's sort of
> a chicken and egg problem there.
>
> For (1), I wonder if the explanation would be simpler if the precedences
> of each sub-part were simply laid out. That is, would it be correct to
> say something like:
>
>   For a config key to match a URL, each element of the config key (if
>   present) is compared to that of the URL, in the following order:
>
>     1. Protocol (e.g., `https` in `https://example.com/`). This field
>        must match exactly between the config key and the URL.
>
>     2. Host/domain name (e.g., `example.com` in `https://example.com/`).
>        This field must match exactly between the config key and the URL.
>
>     3. Path (e.g., `repo.git` in `https://example.com/repo.git`). This
>        field is prefix-matched by slash-delimited path elements, so that
>        config key `foo/` matches URL `foo/bar`. Longer matches take
>        precedence (so `foo/bar`, if it exists, is a better match than
>        just `foo/`).
>
>     4. Username (e.g., `user` in `https://user@example.com/repo.git`).
>
>   The list above is ordered by decreasing precedence; a URL that matches
>   a config key's path is preferred to one that matches its username.
>
> I don't know if that is more or less clear of an explanation. It makes
> more sense to me, but that is probably because I wrote it. 

The above reads very clearly to me, too.  We may want to add that
the Username, if exists on the configuration key name, must match
exactly to the one in the URL the variable is queried for
(i.e. anonymous requests never match configuration for a specific
user).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 3/4] tests: add new test for the url_normalize function
  2013-07-24  6:59   ` Jeff King
@ 2013-07-24 17:14     ` Junio C Hamano
  2013-07-24 18:43       ` Kyle J. McKay
  2013-07-24 19:01     ` Kyle J. McKay
  1 sibling, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2013-07-24 17:14 UTC (permalink / raw)
  To: Jeff King
  Cc: Kyle J. McKay, git, David Aguilar, Petr Baudis, Richard Hartmann,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

Jeff King <peff@peff.net> writes:

> That is, would a shell script ever want to say "what is the value of
> this config variable for url $X"? Certainly our test scripts want to,
> and having a test-* program covers that, but might user scripts want to
> do the same? Or even to introduce its own URL-matched config options?
>
> How hard would it be to convert the "-c" option of test-url-normalize
> into something like:
>
>   git config --file=foo --url http noepsv $URL
>
> which would look for http.$URL.noepsv matches.

Lovely.

>> +#if LIBCURL_VERSION_NUM >= 0x070903
>> +	else if (!strcmp("sslkey", opt_lc.buf))
>> +		printf("%s\n", ssl_key);
>> +#endif
>> +#if LIBCURL_VERSION_NUM >= 0x070908
>> +	else if (!strcmp("sslcapath", opt_lc.buf))
>> +		printf("%s\n", ssl_capath);
>> +#endif
>> [...]
>
> Do we need to have the complete list of options here, including curl
> version limitations? It seems like this will eventually get out of date
> with the list of options. Would it be sufficient to test just one (or
> even just handle a fake "http.$URL.foo" variable)?

Yeah, and that will be in line with "git config --url" direction.

Another thing we may want to consider is to see if we can
restructure the http_options interface a bit, so that the caller can
be agonistic to the actual meaning of the key.  For example,

  "git config --url http notknownyet $URL" 

may want to be able to show the value for http.<pattern>.notknownyet
for the most matching <pattern> for a given $URL, without knowing
what the variable means, just like any other configuration that is
queried via the "git config" program.  The caller may want to pass
further type information like --bool, --int and --path as needed.

>> +#define url_normalize(u) http_options_url_normalize(u)
>
> Does this macro do anything besides shorten the name? Is the extra
> level of indirection to the reader worth it?

Probably not.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 3/4] tests: add new test for the url_normalize function
  2013-07-24 17:14     ` Junio C Hamano
@ 2013-07-24 18:43       ` Kyle J. McKay
  0 siblings, 0 replies; 19+ messages in thread
From: Kyle J. McKay @ 2013-07-24 18:43 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, git, David Aguilar, Petr Baudis, Richard Hartmann,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

On Jul 24, 2013, at 10:14, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
>> How hard would it be to convert the "-c" option of test-url-normalize
>> into something like:
>>
>>  git config --file=foo --url http noepsv $URL
>>
>> which would look for http.$URL.noepsv matches.
>
> Lovely.

[snip]

> Another thing we may want to consider is to see if we can
> restructure the http_options interface a bit, so that the caller can
> be agonistic to the actual meaning of the key.  For example,
>
>  "git config --url http notknownyet $URL"
>
> may want to be able to show the value for http.<pattern>.notknownyet
> for the most matching <pattern> for a given $URL, without knowing
> what the variable means, just like any other configuration that is
> queried via the "git config" program.  The caller may want to pass
> further type information like --bool, --int and --path as needed.
>
>>> +#define url_normalize(u) http_options_url_normalize(u)
>>
>> Does this macro do anything besides shorten the name? Is the extra
>> level of indirection to the reader worth it?
>
> Probably not.

It's a hint that the name of the function might change.  If it's going  
to be used in a more generic fashion it may not belong in http.c  
anymore in which case the name will likely change along with the  
location.  Nothing about the normalization function requires CURL or  
http/https, so it would work equally well on rsync, ftp, smtp etc.  
urls and could just move into url.{h,c} as url_normalize.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 3/4] tests: add new test for the url_normalize function
  2013-07-24  6:59   ` Jeff King
  2013-07-24 17:14     ` Junio C Hamano
@ 2013-07-24 19:01     ` Kyle J. McKay
  2013-07-24 19:03       ` Jeff King
  1 sibling, 1 reply; 19+ messages in thread
From: Kyle J. McKay @ 2013-07-24 19:01 UTC (permalink / raw)
  To: Jeff King
  Cc: git, David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

On Jul 23, 2013, at 23:59, Jeff King wrote:

>> diff --git a/test-url-normalize.c b/test-url-normalize.c
>> new file mode 100644
>> index 0000000..f18bd88
>> --- /dev/null
>> +++ b/test-url-normalize.c
>> [...]
>> +	if (!strcmp("sslverify", opt_lc.buf))
>> +		printf("%s\n", curl_ssl_verify ? "true" : "false");
>> +	else if (!strcmp("sslcert", opt_lc.buf))
>> +		printf("%s\n", ssl_cert);
>> +#if LIBCURL_VERSION_NUM >= 0x070903
>> +	else if (!strcmp("sslkey", opt_lc.buf))
>> +		printf("%s\n", ssl_key);
>> +#endif
>> +#if LIBCURL_VERSION_NUM >= 0x070908
>> +	else if (!strcmp("sslcapath", opt_lc.buf))
>> +		printf("%s\n", ssl_capath);
>> +#endif
>> [...]
>
> Do we need to have the complete list of options here, including curl
> version limitations? It seems like this will eventually get out of  
> date
> with the list of options. Would it be sufficient to test just one (or
> even just handle a fake "http.$URL.foo" variable)?

Right now, the values are only available as various strings, ints,  
longs etc. which have to be formatted differently for output.  The  
original string value they were converted from is gone.  The snippet  
shown above only shows some of the "%s" formatters.

Either the original value will have to be kept around or a  
reconstituted string depending on what:

git config --file=foo --url http noepsv $URL

should output.  If the original value was 0 or 1, should it output  
that or "false" or "true"?  The test-url-normalize code for "-c"  
normalizes the output to "false" or "true" for all boolean values and  
reconverts ints/longs to strings.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 3/4] tests: add new test for the url_normalize function
  2013-07-24 19:01     ` Kyle J. McKay
@ 2013-07-24 19:03       ` Jeff King
  0 siblings, 0 replies; 19+ messages in thread
From: Jeff King @ 2013-07-24 19:03 UTC (permalink / raw)
  To: Kyle J. McKay
  Cc: git, David Aguilar, Petr Baudis, Junio C Hamano, Richard Hartmann,
	Daniel Knittl-Frank, Jan Krüger, Alejandro Mery,
	Aaron Schrab, Eric Sunshine

On Wed, Jul 24, 2013 at 12:01:26PM -0700, Kyle J. McKay wrote:

> Right now, the values are only available as various strings, ints,
> longs etc. which have to be formatted differently for output.  The
> original string value they were converted from is gone.  The snippet
> shown above only shows some of the "%s" formatters.
> 
> Either the original value will have to be kept around or a
> reconstituted string depending on what:
> 
> git config --file=foo --url http noepsv $URL
> 
> should output.  If the original value was 0 or 1, should it output
> that or "false" or "true"?  The test-url-normalize code for "-c"
> normalizes the output to "false" or "true" for all boolean values and
> reconverts ints/longs to strings.

I think it would be the responsibility of the caller to specify what
they are looking for. I.e., add "--bool" to the git-config command line
as appropriate.

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2013-07-24 19:04 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-22 12:56 [PATCH v8 0/4] config: add support for http.<url>.* settings Kyle J. McKay
2013-07-22 12:56 ` [PATCH v8 1/4] " Kyle J. McKay
2013-07-24  7:12   ` Jeff King
2013-07-22 12:56 ` [PATCH v8 2/4] config: improve " Kyle J. McKay
2013-07-22 12:56 ` [PATCH v8 3/4] tests: add new test for the url_normalize function Kyle J. McKay
2013-07-24  6:59   ` Jeff King
2013-07-24 17:14     ` Junio C Hamano
2013-07-24 18:43       ` Kyle J. McKay
2013-07-24 19:01     ` Kyle J. McKay
2013-07-24 19:03       ` Jeff King
2013-07-22 12:56 ` [PATCH v8 4/4] config: allow http.<url>.* any user matching Kyle J. McKay
2013-07-22 18:00   ` Junio C Hamano
2013-07-22 20:24     ` Kyle J. McKay
2013-07-22 21:51       ` Junio C Hamano
2013-07-22 22:18         ` Kyle J. McKay
2013-07-22 22:34           ` Junio C Hamano
2013-07-24  6:42       ` Jeff King
2013-07-24 15:00         ` Junio C Hamano
2013-07-24  6:44   ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).