git@vger.kernel.org mailing list mirror (one of many)
* [PATCH v6 0/6] http.<url>.<key> and friends
@ 2013-07-31 19:26 Junio C Hamano
  2013-07-31 19:26 ` [PATCH v6 1/6] http.c: fix parsing of http.sslCertPasswordProtected variable Junio C Hamano
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Junio C Hamano @ 2013-07-31 19:26 UTC (permalink / raw)
  To: git; +Cc: Kyle J. McKay, Jeff King

This is my attempt to reroll Kyle's http.<url>.<key> series.

It adds general <section>.<url>.<key> support at the
infrastructure level and then rebuilds http.<url>.<key> support on
top of it.  A useful side effect of doing it this way is that it
avoids having to touch the two-name parser http_options() at all.

The same infrastructure is used to add a "--get-urlmatch" mode to "git
config", so that scripted Porcelains can use the same mechanism to
ask for the value of a <section>.<key> variable with a URL, and learn
the value of the <section>.<url>.<key> whose <url> part best matches
the given URL.  In a sense, the infrastructure makes <section>.<key>
a "virtual" variable that is customized per URL.

 * Patch 1/6 is unchanged.

 * Patch 2/6 adds only the two helpers url_normalize and
   match_urls from the original series by Kyle.

 * Patch 3/6 is the general <section>.<url>.<key> support.  The
   urlmatch_config_entry() wrapper can use an existing two-name parser
   to implement "virtual" <section>.<key> variables.

 * Patch 4/6 is the rest of Kyle's http.<url>.<key> ported on top of
   the infrastructure.

 * Patch 5/6 is unchanged from the previous round.

 * Patch 6/6 teaches "--get-urlmatch" to "git config"; this time it
   adds tests and docs.

Junio C Hamano (4):
  http.c: fix parsing of http.sslCertPasswordProtected variable
  config: add generic callback shim to parse section.<url>.key
  builtin/config: refactor collect_config()
  config: "git config --get-urlmatch" parses section.<url>.key

Kyle J. McKay (2):
  config: add helper to normalize and match URLs
  config: parse http.<url>.<variable> using urlmatch

 .gitignore                   |   1 +
 Documentation/config.txt     |  44 ++++
 Documentation/git-config.txt |  29 +++
 Makefile                     |   7 +
 builtin/config.c             | 134 +++++++++--
 http.c                       |  16 +-
 t/.gitattributes             |   1 +
 t/t1300-repo-config.sh       |  25 ++
 t/t5200-url-normalize.sh     | 199 ++++++++++++++++
 t/t5200/README               | Bin 0 -> 644 bytes
 t/t5200/config-1             | Bin 0 -> 180 bytes
 t/t5200/config-2             | Bin 0 -> 80 bytes
 t/t5200/config-3             | Bin 0 -> 118 bytes
 t/t5200/url-1                | Bin 0 -> 20 bytes
 t/t5200/url-10               | Bin 0 -> 23 bytes
 t/t5200/url-11               | Bin 0 -> 25 bytes
 t/t5200/url-2                | Bin 0 -> 20 bytes
 t/t5200/url-3                | Bin 0 -> 23 bytes
 t/t5200/url-4                | Bin 0 -> 23 bytes
 t/t5200/url-5                | Bin 0 -> 23 bytes
 t/t5200/url-6                | Bin 0 -> 23 bytes
 t/t5200/url-7                | Bin 0 -> 23 bytes
 t/t5200/url-8                | Bin 0 -> 23 bytes
 t/t5200/url-9                | Bin 0 -> 23 bytes
 test-url-normalize.c         | 137 +++++++++++
 urlmatch.c                   | 535 +++++++++++++++++++++++++++++++++++++++++++
 urlmatch.h                   |  54 +++++
 27 files changed, 1158 insertions(+), 24 deletions(-)
 create mode 100755 t/t5200-url-normalize.sh
 create mode 100644 t/t5200/README
 create mode 100644 t/t5200/config-1
 create mode 100644 t/t5200/config-2
 create mode 100644 t/t5200/config-3
 create mode 100644 t/t5200/url-1
 create mode 100644 t/t5200/url-10
 create mode 100644 t/t5200/url-11
 create mode 100644 t/t5200/url-2
 create mode 100644 t/t5200/url-3
 create mode 100644 t/t5200/url-4
 create mode 100644 t/t5200/url-5
 create mode 100644 t/t5200/url-6
 create mode 100644 t/t5200/url-7
 create mode 100644 t/t5200/url-8
 create mode 100644 t/t5200/url-9
 create mode 100644 test-url-normalize.c
 create mode 100644 urlmatch.c
 create mode 100644 urlmatch.h

-- 
1.8.4-rc0-153-g9820077

* [PATCH v6 1/6] http.c: fix parsing of http.sslCertPasswordProtected variable
  2013-07-31 19:26 [PATCH v6 0/6] http.<url>.<key> and friends Junio C Hamano
@ 2013-07-31 19:26 ` Junio C Hamano
  2013-07-31 19:26 ` [PATCH v6 2/6] config: add helper to normalize and match URLs Junio C Hamano
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2013-07-31 19:26 UTC (permalink / raw)
  To: git; +Cc: Kyle J. McKay, Jeff King

The existing code triggers only when the configuration variable is
set to true.  Once the variable is set to true in a more generic
configuration file (e.g. ~/.gitconfig), it cannot be overridden to
false in the repository-specific one (e.g. .git/config).
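
As an illustration only, the difference can be sketched in a few lines
of C.  The helper names apply_sticky() and apply_last_wins() and the
array of per-file values are assumptions made for this sketch; they are
not part of http.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for this sketch, not git's real API: each
 * element of 'values' is the boolean read from one configuration file,
 * in the order the files are read (~/.gitconfig first, .git/config last).
 */

/* Old behavior: the flag is only ever raised, never lowered. */
static int apply_sticky(const int *values, size_t n)
{
	int required = 0;
	size_t i;

	assert(values || !n);
	for (i = 0; i < n; i++)
		if (values[i])
			required = 1;
	return required;
}

/* Fixed behavior: the value read last wins. */
static int apply_last_wins(const int *values, size_t n)
{
	int required = 0;
	size_t i;

	assert(values || !n);
	for (i = 0; i < n; i++)
		required = values[i];
	return required;
}
```

With the values { 1, 0 } (true in ~/.gitconfig, false in .git/config),
apply_sticky() leaves the flag at 1 while apply_last_wins() yields 0,
which is the behavior this patch is after.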

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 http.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/http.c b/http.c
index 92aba59..37986f8 100644
--- a/http.c
+++ b/http.c
@@ -160,8 +160,7 @@ static int http_options(const char *var, const char *value, void *cb)
 	if (!strcmp("http.sslcainfo", var))
 		return git_config_string(&ssl_cainfo, var, value);
 	if (!strcmp("http.sslcertpasswordprotected", var)) {
-		if (git_config_bool(var, value))
-			ssl_cert_password_required = 1;
+		ssl_cert_password_required = git_config_bool(var, value);
 		return 0;
 	}
 	if (!strcmp("http.ssltry", var)) {
-- 
1.8.4-rc0-153-g9820077

* [PATCH v6 2/6] config: add helper to normalize and match URLs
  2013-07-31 19:26 [PATCH v6 0/6] http.<url>.<key> and friends Junio C Hamano
  2013-07-31 19:26 ` [PATCH v6 1/6] http.c: fix parsing of http.sslCertPasswordProtected variable Junio C Hamano
@ 2013-07-31 19:26 ` Junio C Hamano
  2013-07-31 20:50   ` Kyle J. McKay
  2013-07-31 19:26 ` [PATCH v6 3/6] config: add generic callback wrapper to parse section.<url>.key Junio C Hamano
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2013-07-31 19:26 UTC (permalink / raw)
  To: git; +Cc: Kyle J. McKay, Jeff King

From: "Kyle J. McKay" <mackyle@gmail.com>

Some http.* configuration variables need to take values customized
for the URL we are talking to.  We may want to set http.sslVerify to
true in general but to false only for a certain site, for example,
with a configuration file like this:

	[http]
		sslVerify = true
	[http "https://weak.example.com"]
		sslVerify = false

and let the configuration machinery pick up the latter only when
talking to "https://weak.example.com".  The latter needs to kick in
not only when the URL is exactly "https://weak.example.com", but
also when it is anything that "matches" it, e.g.

	https://weak.example.com/test
	https://me@weak.example.com/test

The <url> in the configuration key consists of the following parts,
and is considered a match to the URL we are attempting to access
under certain conditions:

  . Scheme (e.g., `https` in `https://example.com/`). This field
    must match exactly between the config key and the URL.

  . Host/domain name (e.g., `example.com` in `https://example.com/`).
    This field must match exactly between the config key and the URL.

  . Port number (e.g., `8080` in `http://example.com:8080/`).  This
    field must match exactly between the config key and the URL.
    Omitted port numbers are automatically converted to the correct
    default for the scheme before matching.

  . Path (e.g., `repo.git` in `https://example.com/repo.git`). The
    path field of the config key must match the path field of the
    URL either exactly or as a prefix of slash-delimited path
    elements.  A config key with path `foo/` matches URL path
    `foo/bar`.  A prefix can only match on a slash (`/`) boundary.
    Longer matches take precedence (so a config key with path
    `foo/bar` is a better match to URL path `foo/bar` than a config
    key with just path `foo/`).

  . User name (e.g., `me` in `https://me@example.com/repo.git`). If
    the config key has a user name, it must match the user name in
    the URL exactly. If the config key does not have a user name,
    that config key will match a URL with any user name (including
    none).

Longer matches take precedence over shorter matches.
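
The path rule can be illustrated with a small stand-alone sketch.  It is
a simplified rendering of the idea described above, not a copy of the
helper in this patch; path_match_len() is a hypothetical name, and both
arguments are assumed to be already-normalized paths that start with
'/'.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the length of the match (counting the trailing '/', even when
 * it is only implied) if 'prefix' matches 'path' exactly or at a '/'
 * boundary, and 0 otherwise.  path_match_len() is a hypothetical name
 * used only for this sketch.
 */
static size_t path_match_len(const char *path, const char *prefix)
{
	size_t len;

	assert(path && prefix);
	len = strlen(prefix);
	if (len && prefix[len - 1] == '/')
		len--;		/* compare without the trailing '/' */
	if (strncmp(path, prefix, len))
		return 0;
	if (path[len] == '\0' || path[len] == '/')
		return len + 1;	/* count the (possibly implied) '/' */
	return 0;
}
```

Under these assumptions, the prefix "/foo/" matches the path "/foo/bar"
with length 5, "/foo/bar" matches it with length 9 and therefore wins,
and "/fo" does not match at all because the prefix ends in the middle of
a path element.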

This step adds two helper functions `url_normalize()` and
`match_urls()` to help implement the above semantics. The
normalization rules are based on RFC 3986 and should result in any
two equivalent URLs being a match.
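
To see why the port rule lets "http://example.com" and
"http://example.com:0080" match, here is a rough sketch that resolves a
port to a number.  Note that it takes a different route from
url_normalize(), which drops a default port from the normalized string
instead of adding one; effective_port() is a hypothetical helper, and
only the http and https defaults are modeled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the port implied by 'scheme' and the text after the ':' in the
 * authority part ("" or NULL when no port was written), or 0 when the
 * port is invalid or the scheme has no default known to this sketch.
 * effective_port() is a hypothetical helper, not part of urlmatch.c.
 */
static unsigned long effective_port(const char *scheme, const char *port_text)
{
	char *end;
	unsigned long port;

	assert(scheme);
	if (!port_text || !*port_text) {
		if (!strcmp(scheme, "http"))
			return 80;
		if (!strcmp(scheme, "https"))
			return 443;
		return 0;
	}
	if (strspn(port_text, "0123456789") != strlen(port_text))
		return 0;	/* digits only */
	port = strtoul(port_text, &end, 10);	/* leading 0s are ignored */
	if (*end || port == 0 || port > 65535)
		return 0;	/* must be in the range 1..65535 */
	return port;
}
```

Two URLs with the same scheme then agree on the port exactly when
effective_port() returns the same non-zero number for both, regardless
of whether the port was omitted or written with leading zeros.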

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 urlmatch.c | 468 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 urlmatch.h |  36 +++++
 2 files changed, 504 insertions(+)
 create mode 100644 urlmatch.c
 create mode 100644 urlmatch.h

diff --git a/urlmatch.c b/urlmatch.c
new file mode 100644
index 0000000..e1b03ee
--- /dev/null
+++ b/urlmatch.c
@@ -0,0 +1,468 @@
+#include "cache.h"
+#include "urlmatch.h"
+
+#define URL_ALPHA "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
+#define URL_DIGIT "0123456789"
+#define URL_ALPHADIGIT URL_ALPHA URL_DIGIT
+#define URL_SCHEME_CHARS URL_ALPHADIGIT "+.-"
+#define URL_HOST_CHARS URL_ALPHADIGIT ".-[:]" /* IPv6 literals need [:] */
+#define URL_UNSAFE_CHARS " <>\"%{}|\\^`" /* plus 0x00-0x1F,0x7F-0xFF */
+#define URL_GEN_RESERVED ":/?#[]@"
+#define URL_SUB_RESERVED "!$&'()*+,;="
+#define URL_RESERVED URL_GEN_RESERVED URL_SUB_RESERVED /* only allowed delims */
+
+static int append_normalized_escapes(struct strbuf *buf,
+				     const char *from,
+				     size_t from_len,
+				     const char *esc_extra,
+				     const char *esc_ok)
+{
+	/*
+	 * Append to strbuf 'buf' characters from string 'from' with length
+	 * 'from_len' while unescaping characters that do not need to be escaped
+	 * and escaping characters that do.  The set of characters to escape
+	 * (the complement of which is unescaped) starts out as the RFC 3986
+	 * unsafe characters (0x00-0x1F,0x7F-0xFF," <>\"#%{}|\\^`").  If
+	 * 'esc_extra' is not NULL, those additional characters will also always
+	 * be escaped.  If 'esc_ok' is not NULL, those characters will be left
+	 * escaped if found that way, but will not be unescaped otherwise (used
+	 * for delimiters).  If a %-escape sequence is encountered that is not
+	 * followed by 2 hexadecimal digits, the sequence is invalid and
+	 * false (0) will be returned.  Otherwise true (1) will be returned for
+	 * success.
+	 *
+	 * Note that all %-escape sequences will be normalized to UPPERCASE
+	 * as indicated in RFC 3986.  Unless included in esc_extra or esc_ok
+	 * alphanumerics and "-._~" will always be unescaped as per RFC 3986.
+	 */
+
+	while (from_len) {
+		int ch = *from++;
+		int was_esc = 0;
+
+		from_len--;
+		if (ch == '%') {
+			if (from_len < 2 ||
+			    !isxdigit((unsigned char)from[0]) ||
+			    !isxdigit((unsigned char)from[1]))
+				return 0;
+			ch = hexval_table[(unsigned char)*from++] << 4;
+			ch |= hexval_table[(unsigned char)*from++];
+			from_len -= 2;
+			was_esc = 1;
+		}
+		if ((unsigned char)ch <= 0x1F || (unsigned char)ch >= 0x7F ||
+		    strchr(URL_UNSAFE_CHARS, ch) ||
+		    (esc_extra && strchr(esc_extra, ch)) ||
+		    (was_esc && esc_ok && strchr(esc_ok, ch)))
+			strbuf_addf(buf, "%%%02X", (unsigned char)ch);
+		else
+			strbuf_addch(buf, ch);
+	}
+
+	return 1;
+}
+
+char *url_normalize(const char *url, struct url_info *out_info)
+{
+	/*
+	 * Normalize NUL-terminated url using the following rules:
+	 *
+	 * 1. Case-insensitive parts of url will be converted to lower case
+	 * 2. %-encoded characters that do not need to be will be unencoded
+	 * 3. Characters that are not %-encoded and must be will be encoded
+	 * 4. All %-encodings will be converted to upper case hexadecimal
+	 * 5. Leading 0s are removed from port numbers
+	 * 6. If the default port for the scheme is given it will be removed
+	 * 7. A path part (including empty) not starting with '/' has one added
+	 * 8. Any dot segments (. or ..) in the path are resolved and removed
+	 * 9. IPv6 host literals are allowed (but not normalized or validated)
+	 *
+	 * The rules are based on information in RFC 3986.
+	 *
+	 * Please note this function requires a full URL including a scheme
+	 * and host part (except for file: URLs which may have an empty host).
+	 *
+	 * The return value is a newly allocated string that must be freed
+	 * or NULL if the url is not valid.
+	 *
+	 * If out_info is non-NULL, the url and err fields therein will always
+	 * be set.  If a non-NULL value is returned, it will be stored in
+	 * out_info->url as well, out_info->err will be set to NULL and the
+	 * other fields of *out_info will also be filled in.  If a NULL value
+	 * is returned, NULL will be stored in out_info->url and out_info->err
+	 * will be set to a brief, translated, error message, but no other
+	 * fields will be filled in.
+	 *
+	 * This is NOT a URL validation function.  Full URL validation is NOT
+	 * performed.  Some invalid host names are passed through this function
+	 * undetected.  However, most all other problems that make a URL invalid
+	 * will be detected (including a missing host for non file: URLs).
+	 */
+
+	size_t url_len = strlen(url);
+	struct strbuf norm;
+	size_t spanned;
+	size_t scheme_len, user_off=0, user_len=0, passwd_off=0, passwd_len=0;
+	size_t host_off=0, host_len=0, port_len=0, path_off, path_len, result_len;
+	const char *slash_ptr, *at_ptr, *colon_ptr, *path_start;
+	char *result;
+
+	/*
+	 * Copy lowercased scheme and :// suffix, %-escapes are not allowed
+	 * First character of scheme must be URL_ALPHA
+	 */
+	spanned = strspn(url, URL_SCHEME_CHARS);
+	if (!spanned || !isalpha(url[0]) || spanned + 3 > url_len ||
+	    url[spanned] != ':' || url[spanned+1] != '/' || url[spanned+2] != '/') {
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("invalid URL scheme name or missing '://' suffix");
+		}
+		return NULL; /* Bad scheme and/or missing "://" part */
+	}
+	strbuf_init(&norm, url_len);
+	scheme_len = spanned;
+	spanned += 3;
+	url_len -= spanned;
+	while (spanned--)
+		strbuf_addch(&norm, tolower(*url++));
+
+
+	/*
+	 * Copy any username:password if present normalizing %-escapes
+	 */
+	at_ptr = strchr(url, '@');
+	slash_ptr = url + strcspn(url, "/?#");
+	if (at_ptr && at_ptr < slash_ptr) {
+		user_off = norm.len;
+		if (at_ptr > url) {
+			if (!append_normalized_escapes(&norm, url, at_ptr - url,
+						       "", URL_RESERVED)) {
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid %XX escape sequence");
+				}
+				strbuf_release(&norm);
+				return NULL;
+			}
+			colon_ptr = strchr(norm.buf + scheme_len + 3, ':');
+			if (colon_ptr) {
+				passwd_off = (colon_ptr + 1) - norm.buf;
+				passwd_len = norm.len - passwd_off;
+				user_len = (passwd_off - 1) - (scheme_len + 3);
+			} else {
+				user_len = norm.len - (scheme_len + 3);
+			}
+		}
+		strbuf_addch(&norm, '@');
+		url_len -= (++at_ptr - url);
+		url = at_ptr;
+	}
+
+
+	/*
+	 * Copy the host part excluding any port part, no %-escapes allowed
+	 */
+	if (!url_len || strchr(":/?#", *url)) {
+		/* Missing host invalid for all URL schemes except file */
+		if (strncmp(norm.buf, "file:", 5)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("missing host and scheme is not 'file:'");
+			}
+			strbuf_release(&norm);
+			return NULL;
+		}
+	} else {
+		host_off = norm.len;
+	}
+	colon_ptr = slash_ptr - 1;
+	while (colon_ptr > url && *colon_ptr != ':' && *colon_ptr != ']')
+		colon_ptr--;
+	if (*colon_ptr != ':') {
+		colon_ptr = slash_ptr;
+	} else if (!host_off && colon_ptr < slash_ptr && colon_ptr + 1 != slash_ptr) {
+		/* file: URLs may not have a port number */
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("a 'file:' URL may not have a port number");
+		}
+		strbuf_release(&norm);
+		return NULL;
+	}
+	spanned = strspn(url, URL_HOST_CHARS);
+	if (spanned < colon_ptr - url) {
+		/* Host name has invalid characters */
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("invalid characters in host name");
+		}
+		strbuf_release(&norm);
+		return NULL;
+	}
+	while (url < colon_ptr) {
+		strbuf_addch(&norm, tolower(*url++));
+		url_len--;
+	}
+
+
+	/*
+	 * Check the port part and copy if not the default (after removing any
+	 * leading 0s); no %-escapes allowed
+	 */
+	if (colon_ptr < slash_ptr) {
+		/* skip the ':' and leading 0s but not the last one if all 0s */
+		url++;
+		url += strspn(url, "0");
+		if (url == slash_ptr && url[-1] == '0')
+			url--;
+		if (url == slash_ptr) {
+			/* Skip ":" port with no number, it's same as default */
+		} else if (slash_ptr - url == 2 &&
+			   !strncmp(norm.buf, "http:", 5) &&
+			   !strncmp(url, "80", 2)) {
+			/* Skip http :80 as it's the default */
+		} else if (slash_ptr - url == 3 &&
+			   !strncmp(norm.buf, "https:", 6) &&
+			   !strncmp(url, "443", 3)) {
+			/* Skip https :443 as it's the default */
+		} else {
+			/*
+			 * Port number must be all digits with leading 0s removed
+			 * and since all the protocols we deal with have a 16-bit
+			 * port number it must also be in the range 1..65535
+			 * 0 is not allowed because that means "next available"
+			 * on just about every system and therefore cannot be used
+			 */
+			unsigned long pnum = 0;
+			spanned = strspn(url, URL_DIGIT);
+			if (spanned < slash_ptr - url) {
+				/* port number has invalid characters */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid port number");
+				}
+				strbuf_release(&norm);
+				return NULL;
+			}
+			if (slash_ptr - url <= 5)
+				pnum = strtoul(url, NULL, 10);
+			if (pnum == 0 || pnum > 65535) {
+				/* port number not in range 1..65535 */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid port number");
+				}
+				strbuf_release(&norm);
+				return NULL;
+			}
+			strbuf_addch(&norm, ':');
+			strbuf_add(&norm, url, slash_ptr - url);
+			port_len = slash_ptr - url;
+		}
+		url_len -= slash_ptr - colon_ptr;
+		url = slash_ptr;
+	}
+	if (host_off)
+		host_len = norm.len - host_off;
+
+
+	/*
+	 * Now copy the path resolving any . and .. segments being careful not
+	 * to corrupt the URL by unescaping any delimiters, but do add an
+	 * initial '/' if it's missing and do normalize any %-escape sequences.
+	 */
+	path_off = norm.len;
+	path_start = norm.buf + path_off;
+	strbuf_addch(&norm, '/');
+	if (*url == '/') {
+		url++;
+		url_len--;
+	}
+	for (;;) {
+		const char *seg_start = norm.buf + norm.len;
+		const char *next_slash = url + strcspn(url, "/?#");
+		int skip_add_slash = 0;
+		/*
+		 * RFC 3986 indicates that any . or .. segments should be
+		 * unescaped before being checked for.
+		 */
+		if (!append_normalized_escapes(&norm, url, next_slash - url, "",
+					       URL_RESERVED)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("invalid %XX escape sequence");
+			}
+			strbuf_release(&norm);
+			return NULL;
+		}
+		if (!strcmp(seg_start, ".")) {
+			/* ignore a . segment; be careful not to remove initial '/' */
+			if (seg_start == path_start + 1) {
+				strbuf_setlen(&norm, norm.len - 1);
+				skip_add_slash = 1;
+			} else {
+				strbuf_setlen(&norm, norm.len - 2);
+			}
+		} else if (!strcmp(seg_start, "..")) {
+			/*
+			 * ignore a .. segment and remove the previous segment;
+			 * be careful not to remove initial '/' from path
+			 */
+			const char *prev_slash = norm.buf + norm.len - 3;
+			if (prev_slash == path_start) {
+				/* invalid .. because no previous segment to remove */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid '..' path segment");
+				}
+				strbuf_release(&norm);
+				return NULL;
+			}
+			while (*--prev_slash != '/') {}
+			if (prev_slash == path_start) {
+				strbuf_setlen(&norm, prev_slash - norm.buf + 1);
+				skip_add_slash = 1;
+			} else {
+				strbuf_setlen(&norm, prev_slash - norm.buf);
+			}
+		}
+		url_len -= next_slash - url;
+		url = next_slash;
+		/* if the next char is not '/' done with the path */
+		if (*url != '/')
+			break;
+		url++;
+		url_len--;
+		if (!skip_add_slash)
+			strbuf_addch(&norm, '/');
+	}
+	path_len = norm.len - path_off;
+
+
+	/*
+	 * Now simply copy the rest, if any, only normalizing %-escapes and
+	 * being careful not to corrupt the URL by unescaping any delimiters.
+	 */
+	if (*url) {
+		if (!append_normalized_escapes(&norm, url, url_len, "", URL_RESERVED)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("invalid %XX escape sequence");
+			}
+			strbuf_release(&norm);
+			return NULL;
+		}
+	}
+
+
+	result = strbuf_detach(&norm, &result_len);
+	if (out_info) {
+		out_info->url = result;
+		out_info->err = NULL;
+		out_info->url_len = result_len;
+		out_info->scheme_len = scheme_len;
+		out_info->user_off = user_off;
+		out_info->user_len = user_len;
+		out_info->passwd_off = passwd_off;
+		out_info->passwd_len = passwd_len;
+		out_info->host_off = host_off;
+		out_info->host_len = host_len;
+		out_info->port_len = port_len;
+		out_info->path_off = path_off;
+		out_info->path_len = path_len;
+	}
+	return result;
+}
+
+static size_t http_options_url_match_prefix(const char *url,
+					    const char *url_prefix,
+					    size_t url_prefix_len)
+{
+	/*
+	 * url_prefix matches url if url_prefix is an exact match for url or it
+	 * is a prefix of url and the match ends on a path component boundary.
+	 * Both url and url_prefix are considered to have an implicit '/' on the
+	 * end for matching purposes if they do not already.
+	 *
+	 * url must be NUL terminated.  url_prefix_len is the length of
+	 * url_prefix which need not be NUL terminated.
+	 *
+	 * The return value is the length of the match in characters (including
+	 * the final '/' even if it's implicit) or 0 for no match.
+	 *
+	 * Passing NULL as url and/or url_prefix will always cause 0 to be
+	 * returned without causing any faults.
+	 */
+	if (!url || !url_prefix)
+		return 0;
+	if (!url_prefix_len || (url_prefix_len == 1 && *url_prefix == '/'))
+		return (!*url || *url == '/') ? 1 : 0;
+	if (url_prefix[url_prefix_len - 1] == '/')
+		url_prefix_len--;
+	if (strncmp(url, url_prefix, url_prefix_len))
+		return 0;
+	if ((strlen(url) == url_prefix_len) || (url[url_prefix_len] == '/'))
+		return url_prefix_len + 1;
+	return 0;
+}
+
+int match_urls(const struct url_info *url,
+	       const struct url_info *url_prefix,
+	       int *exactusermatch)
+{
+	/*
+	 * url_prefix matches url if the scheme, host and port of url_prefix
+	 * are the same as those of url and the path portion of url_prefix
+	 * is the same as the path portion of url or it is a prefix that
+	 * matches at a '/' boundary.  If url_prefix contains a user name,
+	 * that must also exactly match the user name in url.
+	 *
+	 * If the user, host, port and path match in this fashion, the returned
+	 * value is the length of the path match including any implicit
+	 * final '/'.  For example, "http://me@example.com/path" is matched by
+	 * "http://example.com" with a path length of 1.
+	 *
+	 * If there is a match and exactusermatch is not NULL, then
+	 * *exactusermatch will be set to true if both url and url_prefix
+	 * contained a user name or false if url_prefix did not have a
+	 * user name.  If there is no match *exactusermatch is left untouched.
+	 */
+	int usermatched = 0;
+	int pathmatchlen;
+
+	if (!url || !url_prefix || !url->url || !url_prefix->url)
+		return 0;
+
+	/* check the scheme */
+	if (url_prefix->scheme_len != url->scheme_len ||
+	    strncmp(url->url, url_prefix->url, url->scheme_len))
+		return 0; /* schemes do not match */
+
+	/* check the user name if url_prefix has one */
+	if (url_prefix->user_off) {
+		if (!url->user_off || url->user_len != url_prefix->user_len ||
+		    strncmp(url->url + url->user_off,
+			    url_prefix->url + url_prefix->user_off,
+			    url->user_len))
+			return 0; /* url_prefix has a user but it's not a match */
+		usermatched = 1;
+	}
+
+	/* check the host and port */
+	if (url_prefix->host_len != url->host_len ||
+	    strncmp(url->url + url->host_off,
+		    url_prefix->url + url_prefix->host_off, url->host_len))
+		return 0; /* host names and/or ports do not match */
+
+	/* check the path */
+	pathmatchlen = http_options_url_match_prefix(
+		url->url + url->path_off,
+		url_prefix->url + url_prefix->path_off,
+		url_prefix->url_len - url_prefix->path_off);
+
+	if (pathmatchlen && exactusermatch)
+		*exactusermatch = usermatched;
+	return pathmatchlen;
+}
diff --git a/urlmatch.h b/urlmatch.h
new file mode 100644
index 0000000..b67f57f
--- /dev/null
+++ b/urlmatch.h
@@ -0,0 +1,36 @@
+#ifndef URL_MATCH_H
+#include "string-list.h"
+
+struct url_info {
+	/* normalized url on success, must be freed, otherwise NULL */
+	char *url;
+	/* if !url, a brief reason for the failure, otherwise NULL */
+	const char *err;
+
+	/* the rest of the fields are only set if url != NULL */
+
+	size_t url_len;		/* total length of url (which is now normalized) */
+	size_t scheme_len;	/* length of scheme name (excluding final :) */
+	size_t user_off;	/* offset into url to start of user name (0 => none) */
+	size_t user_len;	/* length of user name; if user_off != 0 but
+				   user_len == 0, an empty user name was given */
+	size_t passwd_off;	/* offset into url to start of passwd (0 => none) */
+	size_t passwd_len;	/* length of passwd; if passwd_off != 0 but
+				   passwd_len == 0, an empty passwd was given */
+	size_t host_off;	/* offset into url to start of host name (0 => none) */
+	size_t host_len;	/* length of host name; this INCLUDES any ':portnum';
+				 * file urls may have host_len == 0 */
+	size_t port_len;	/* if a portnum is present (port_len != 0), it has
+				 * this length (excluding the leading ':') at the
+				 * end of the host name (always 0 for file urls) */
+	size_t path_off;	/* offset into url to the start of the url path;
+				 * this will always point to a '/' character
+				 * after the url has been normalized */
+	size_t path_len;	/* length of path portion excluding any trailing
+				 * '?...' and '#...' portion; will always be >= 1 */
+};
+
+extern char *url_normalize(const char *, struct url_info *);
+extern int match_urls(const struct url_info *url, const struct url_info *url_prefix, int *exactusermatch);
+
+#endif /* URL_MATCH_H */
-- 
1.8.4-rc0-153-g9820077

* [PATCH v6 3/6] config: add generic callback wrapper to parse section.<url>.key
  2013-07-31 19:26 [PATCH v6 0/6] http.<url>.<key> and friends Junio C Hamano
  2013-07-31 19:26 ` [PATCH v6 1/6] http.c: fix parsing of http.sslCertPasswordProtected variable Junio C Hamano
  2013-07-31 19:26 ` [PATCH v6 2/6] config: add helper to normalize and match URLs Junio C Hamano
@ 2013-07-31 19:26 ` Junio C Hamano
  2013-07-31 19:26 ` [PATCH v6 4/6] config: parse http.<url>.<variable> using urlmatch Junio C Hamano
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2013-07-31 19:26 UTC (permalink / raw)
  To: git; +Cc: Kyle J. McKay, Jeff King

Existing configuration parsing functions (e.g. http_options() in
http.c) know how to parse two-level configuration variable names.
We would like to exploit them and parse something like this:

	[http]
		sslVerify = true
	[http "https://weak.example.com"]
		sslVerify = false

and pretend as if http.sslVerify were set to false when talking to
"https://weak.example.com/path".

Introduce a `urlmatch_config_entry()` wrapper that:

 - is called with the target URL (e.g. "https://weak.example.com/path"),
   and the two-level variable parser (e.g. `http_options`);

 - uses `url_normalize()` and `match_urls()` to see if configuration
   data matches the target URL; and

 - calls the traditional two-level configuration variable parser
   only for the configuration data whose <url> part matches the
   target URL (and if there are multiple matches, does so only if the
   current match is a better match than the ones previously seen).
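
The "better match" decision can be pictured as below.  This is a sketch
under stated assumptions rather than the wrapper itself: struct
candidate and should_replace() are hypothetical names, and a later entry
that ties with the current best is allowed to replace it.

```c
#include <assert.h>
#include <stddef.h>

/* One candidate: how much of the path matched, and whether the user did. */
struct candidate {
	size_t matched_len;
	int user_matched;
};

/*
 * Return 1 when 'cur' should replace 'best', 0 when it must be ignored.
 * should_replace() is a hypothetical name used only for this sketch.
 */
static int should_replace(const struct candidate *best,
			  const struct candidate *cur)
{
	assert(best && cur);
	if (cur->matched_len < best->matched_len)
		return 0;	/* a shorter path match loses */
	if (cur->matched_len == best->matched_len &&
	    !cur->user_matched && best->user_matched)
		return 0;	/* same length, but it lacks the user match */
	return 1;		/* longer, or an equally good later entry */
}
```

Only when should_replace() says yes would the two-level parser be called
with the synthesized <section>.<key> name, so the value it ends up with
comes from the most specific matching entry.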

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 urlmatch.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 urlmatch.h | 18 +++++++++++++++++
 2 files changed, 85 insertions(+)

diff --git a/urlmatch.c b/urlmatch.c
index e1b03ee..073fdd3 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -466,3 +466,70 @@ int match_urls(const struct url_info *url,
 		*exactusermatch = usermatched;
 	return pathmatchlen;
 }
+
+int urlmatch_config_entry(const char *var, const char *value, void *cb)
+{
+	struct string_list_item *item;
+	struct urlmatch_config *collect = cb;
+	struct urlmatch_item *matched;
+	struct url_info *url = &collect->url;
+	const char *key, *dot;
+	struct strbuf synthkey = STRBUF_INIT;
+	size_t matched_len = 0;
+	int user_matched = 0;
+	int retval;
+
+	key = skip_prefix(var, collect->section);
+	if (!key || *(key++) != '.') {
+		if (collect->cascade_fn)
+			return collect->cascade_fn(var, value, cb);
+		return 0; /* not interested */
+	}
+	dot = strrchr(key, '.');
+	if (dot) {
+		char *config_url, *norm_url;
+		struct url_info norm_info;
+
+		config_url = xmemdupz(key, dot - key);
+		norm_url = url_normalize(config_url, &norm_info);
+		free(config_url);
+		if (!norm_url)
+			return 0;
+		matched_len = match_urls(url, &norm_info, &user_matched);
+		free(norm_url);
+		if (!matched_len)
+			return 0;
+		key = dot + 1;
+	}
+
+	if (collect->key && strcmp(key, collect->key))
+		return 0;
+
+	item = string_list_insert(&collect->vars, key);
+	if (!item->util) {
+		matched = xcalloc(1, sizeof(*matched));
+		item->util = matched;
+	} else {
+		matched = item->util;
+		/*
+		 * Is our match shorter?  Is our match the same
+		 * length, and without user while the current
+		 * candidate is with user?  Then we cannot use it.
+		 */
+		if (matched_len < matched->matched_len ||
+		    ((matched_len == matched->matched_len) &&
+		     (!user_matched && matched->user_matched)))
+			return 0;
+		/* Otherwise, replace it with this one. */
+	}
+
+	matched->matched_len = matched_len;
+	matched->user_matched = user_matched;
+	strbuf_addstr(&synthkey, collect->section);
+	strbuf_addch(&synthkey, '.');
+	strbuf_addstr(&synthkey, key);
+	retval = collect->collect_fn(synthkey.buf, value, collect->cb);
+
+	strbuf_release(&synthkey);
+	return retval;
+}
diff --git a/urlmatch.h b/urlmatch.h
index b67f57f..b461dfd 100644
--- a/urlmatch.h
+++ b/urlmatch.h
@@ -33,4 +33,22 @@ struct url_info {
 extern char *url_normalize(const char *, struct url_info *);
 extern int match_urls(const struct url_info *url, const struct url_info *url_prefix, int *exactusermatch);
 
+struct urlmatch_item {
+	size_t matched_len;
+	char user_matched;
+};
+
+struct urlmatch_config {
+	struct string_list vars;
+	struct url_info url;
+	const char *section;
+	const char *key;
+
+	void *cb;
+	int (*collect_fn)(const char *var, const char *value, void *cb);
+	int (*cascade_fn)(const char *var, const char *value, void *cb);
+};
+
+extern int urlmatch_config_entry(const char *var, const char *value, void *cb);
+
 #endif /* URL_MATCH_H */
-- 
1.8.4-rc0-153-g9820077

* [PATCH v6 4/6] config: parse http.<url>.<variable> using urlmatch
  2013-07-31 19:26 [PATCH v6 0/6] http.<url>.<key> and friends Junio C Hamano
                   ` (2 preceding siblings ...)
  2013-07-31 19:26 ` [PATCH v6 3/6] config: add generic callback wrapper to parse section.<url>.key Junio C Hamano
@ 2013-07-31 19:26 ` Junio C Hamano
  2013-07-31 20:51   ` Kyle J. McKay
  2013-07-31 20:51   ` [PATCH ALTERNATIVE v6 0/2] http.<url>.<key> and friends Kyle J. McKay
  2013-07-31 19:26 ` [PATCH v6 5/6] builtin/config: refactor collect_config() Junio C Hamano
  2013-07-31 19:26 ` [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key Junio C Hamano
  5 siblings, 2 replies; 23+ messages in thread
From: Junio C Hamano @ 2013-07-31 19:26 UTC (permalink / raw)
  To: git; +Cc: Kyle J. McKay, Jeff King

From: "Kyle J. McKay" <mackyle@gmail.com>

Use urlmatch_config_entry() to wrap the underlying
http_options() two-level variable parser in order to set
http.<variable> to the value with the most specific URL in the
configuration.

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 .gitignore               |   1 +
 Documentation/config.txt |  44 +++++++++++
 Makefile                 |   7 ++
 http.c                   |  13 +++-
 t/.gitattributes         |   1 +
 t/t5200-url-normalize.sh | 199 +++++++++++++++++++++++++++++++++++++++++++++++
 t/t5200/README           | Bin 0 -> 644 bytes
 t/t5200/config-1         | Bin 0 -> 180 bytes
 t/t5200/config-2         | Bin 0 -> 80 bytes
 t/t5200/config-3         | Bin 0 -> 118 bytes
 t/t5200/url-1            | Bin 0 -> 20 bytes
 t/t5200/url-10           | Bin 0 -> 23 bytes
 t/t5200/url-11           | Bin 0 -> 25 bytes
 t/t5200/url-2            | Bin 0 -> 20 bytes
 t/t5200/url-3            | Bin 0 -> 23 bytes
 t/t5200/url-4            | Bin 0 -> 23 bytes
 t/t5200/url-5            | Bin 0 -> 23 bytes
 t/t5200/url-6            | Bin 0 -> 23 bytes
 t/t5200/url-7            | Bin 0 -> 23 bytes
 t/t5200/url-8            | Bin 0 -> 23 bytes
 t/t5200/url-9            | Bin 0 -> 23 bytes
 test-url-normalize.c     | 137 ++++++++++++++++++++++++++++++++
 22 files changed, 401 insertions(+), 1 deletion(-)
 create mode 100755 t/t5200-url-normalize.sh
 create mode 100644 t/t5200/README
 create mode 100644 t/t5200/config-1
 create mode 100644 t/t5200/config-2
 create mode 100644 t/t5200/config-3
 create mode 100644 t/t5200/url-1
 create mode 100644 t/t5200/url-10
 create mode 100644 t/t5200/url-11
 create mode 100644 t/t5200/url-2
 create mode 100644 t/t5200/url-3
 create mode 100644 t/t5200/url-4
 create mode 100644 t/t5200/url-5
 create mode 100644 t/t5200/url-6
 create mode 100644 t/t5200/url-7
 create mode 100644 t/t5200/url-8
 create mode 100644 t/t5200/url-9
 create mode 100644 test-url-normalize.c

diff --git a/.gitignore b/.gitignore
index 6669bf0..cd97e16 100644
--- a/.gitignore
+++ b/.gitignore
@@ -198,6 +198,7 @@
 /test-string-list
 /test-subprocess
 /test-svn-fe
+/test-url-normalize
 /test-wildmatch
 /common-cmds.h
 *.tar.gz
diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6e53fc5..60c140f 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1513,6 +1513,50 @@ http.useragent::
 	of common USER_AGENT strings (but not including those like git/1.7.1).
 	Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
 
+http.<url>.*::
+	Any of the http.* options above can be applied selectively to some URLs.
+	For a config key to match a URL, each element of the config key is
+	compared to that of the URL, in the following order:
++
+--
+. Scheme (e.g., `https` in `https://example.com/`). This field
+  must match exactly between the config key and the URL.
+
+. Host/domain name (e.g., `example.com` in `https://example.com/`).
+  This field must match exactly between the config key and the URL.
+
+. Port number (e.g., `8080` in `http://example.com:8080/`).
+  This field must match exactly between the config key and the URL.
+  Omitted port numbers are automatically converted to the correct
+  default for the scheme before matching.
+
+. Path (e.g., `repo.git` in `https://example.com/repo.git`). The
+  path field of the config key must match the path field of the URL
+  either exactly or as a prefix of slash-delimited path elements.  This means
+  a config key with path `foo/` matches URL path `foo/bar`.  A prefix can only
+  match on a slash (`/`) boundary.  Longer matches take precedence (so a config
+  key with path `foo/bar` is a better match to URL path `foo/bar` than a config
+  key with just path `foo/`).
+
+. User name (e.g., `user` in `https://user@example.com/repo.git`). If
+  the config key has a user name it must match the user name in the
+  URL exactly. If the config key does not have a user name, that
+  config key will match a URL with any user name (including none).
+--
++
+The list above is ordered by decreasing precedence; a URL that matches
+a config key's path is preferred to one that matches its user name. For example,
+if the URL is `https://user@example.com/foo/bar` a config key match of
+`https://example.com/foo` will be preferred over a config key match of
+`https://user@example.com`.
++
+All URLs are normalized before attempting any matching (the password part,
+if embedded in the URL, is always ignored for matching purposes) so that
+equivalent URLs that are simply spelled differently will match properly.
+Environment variable settings always override any matches.  The URLs that are
+matched against are those given directly to Git commands.  This means any URLs
+visited as a result of a redirection do not participate in matching.
+
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
 	does not care per se, but this information is necessary e.g. when
diff --git a/Makefile b/Makefile
index 0f931a2..ea3edba 100644
--- a/Makefile
+++ b/Makefile
@@ -567,6 +567,7 @@ TEST_PROGRAMS_NEED_X += test-sigchain
 TEST_PROGRAMS_NEED_X += test-string-list
 TEST_PROGRAMS_NEED_X += test-subprocess
 TEST_PROGRAMS_NEED_X += test-svn-fe
+TEST_PROGRAMS_NEED_X += test-url-normalize
 TEST_PROGRAMS_NEED_X += test-wildmatch
 
 TEST_PROGRAMS = $(patsubst %,%$X,$(TEST_PROGRAMS_NEED_X))
@@ -721,6 +722,7 @@ LIB_H += tree-walk.h
 LIB_H += tree.h
 LIB_H += unpack-trees.h
 LIB_H += url.h
+LIB_H += urlmatch.h
 LIB_H += userdiff.h
 LIB_H += utf8.h
 LIB_H += varint.h
@@ -868,6 +870,7 @@ LIB_OBJS += tree.o
 LIB_OBJS += tree-walk.o
 LIB_OBJS += unpack-trees.o
 LIB_OBJS += url.o
+LIB_OBJS += urlmatch.o
 LIB_OBJS += usage.o
 LIB_OBJS += userdiff.o
 LIB_OBJS += utf8.o
@@ -2235,6 +2238,10 @@ test-parse-options$X: parse-options.o parse-options-cb.o
 
 test-svn-fe$X: vcs-svn/lib.a
 
+test-url-normalize$X: test-url-normalize.o GIT-LDFLAGS $(GITLIBS)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
+		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
+
 .PRECIOUS: $(TEST_OBJS)
 
 test-%$X: test-%.o GIT-LDFLAGS $(GITLIBS)
diff --git a/http.c b/http.c
index 37986f8..5eda356 100644
--- a/http.c
+++ b/http.c
@@ -3,6 +3,7 @@
 #include "sideband.h"
 #include "run-command.h"
 #include "url.h"
+#include "urlmatch.h"
 #include "credential.h"
 #include "version.h"
 #include "pkt-line.h"
@@ -334,10 +335,20 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
 	char *low_speed_time;
+	char *normalized_url;
+	struct urlmatch_config config = { STRING_LIST_INIT_DUP };
+
+	config.section = "http";
+	config.key = NULL;
+	config.collect_fn = http_options;
+	config.cascade_fn = git_default_config;
+	config.cb = NULL;
 
 	http_is_verbose = 0;
+	normalized_url = url_normalize(url, &config.url);
 
-	git_config(http_options, NULL);
+	git_config(urlmatch_config_entry, &config);
+	free(normalized_url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
diff --git a/t/.gitattributes b/t/.gitattributes
index 1b97c54..f6f1df3 100644
--- a/t/.gitattributes
+++ b/t/.gitattributes
@@ -1 +1,2 @@
 t[0-9][0-9][0-9][0-9]/* -whitespace
+t5200/url-* binary
diff --git a/t/t5200-url-normalize.sh b/t/t5200-url-normalize.sh
new file mode 100755
index 0000000..f79bb0f
--- /dev/null
+++ b/t/t5200-url-normalize.sh
@@ -0,0 +1,199 @@
+#!/bin/sh
+
+test_description='url normalization'
+. ./test-lib.sh
+
+if test -n "$NO_CURL"; then
+	skip_all='skipping test, git built without http support'
+	test_done
+fi
+
+# The base name of the test url files
+tu="$TEST_DIRECTORY/t5200/url"
+
+# The base name of the test config files
+tc="$TEST_DIRECTORY/t5200/config"
+
+# Note that only file: URLs should be allowed without a host
+
+test_expect_success 'url scheme' '
+	! test-url-normalize "" &&
+	! test-url-normalize "_" &&
+	! test-url-normalize "scheme" &&
+	! test-url-normalize "scheme:" &&
+	! test-url-normalize "scheme:/" &&
+	! test-url-normalize "scheme://" &&
+	! test-url-normalize "file" &&
+	! test-url-normalize "file:" &&
+	! test-url-normalize "file:/" &&
+	test-url-normalize "file://" &&
+	! test-url-normalize "://acme.co" &&
+	! test-url-normalize "x_test://acme.co" &&
+	! test-url-normalize "-test://acme.co" &&
+	! test-url-normalize "0test://acme.co" &&
+	! test-url-normalize "+test://acme.co" &&
+	! test-url-normalize ".test://acme.co" &&
+	! test-url-normalize "schem%6e://" &&
+	test-url-normalize "x-Test+v1.0://acme.co" &&
+	test "$(test-url-normalize -p "AbCdeF://x.Y")" = "abcdef://x.y/"
+'
+
+test_expect_success 'url authority' '
+	! test-url-normalize "scheme://user:pass@" &&
+	! test-url-normalize "scheme://?" &&
+	! test-url-normalize "scheme://#" &&
+	! test-url-normalize "scheme:///" &&
+	! test-url-normalize "scheme://:" &&
+	! test-url-normalize "scheme://:555" &&
+	test-url-normalize "file://user:pass@" &&
+	test-url-normalize "file://?" &&
+	test-url-normalize "file://#" &&
+	test-url-normalize "file:///" &&
+	test-url-normalize "file://:" &&
+	! test-url-normalize "file://:555" &&
+	test-url-normalize "scheme://user:pass@host" &&
+	test-url-normalize "scheme://@host" &&
+	test-url-normalize "scheme://%00@host" &&
+	! test-url-normalize "scheme://%%@host" &&
+	! test-url-normalize "scheme://host_" &&
+	test-url-normalize "scheme://user:pass@host/" &&
+	test-url-normalize "scheme://@host/" &&
+	test-url-normalize "scheme://host/" &&
+	test-url-normalize "scheme://host?x" &&
+	test-url-normalize "scheme://host#x" &&
+	test-url-normalize "scheme://host/@" &&
+	test-url-normalize "scheme://host?@x" &&
+	test-url-normalize "scheme://host#@x" &&
+	test-url-normalize "scheme://[::1]" &&
+	test-url-normalize "scheme://[::1]/" &&
+	! test-url-normalize "scheme://hos%41/" &&
+	test-url-normalize "scheme://[invalid....:/" &&
+	test-url-normalize "scheme://invalid....:]/" &&
+	! test-url-normalize "scheme://invalid....:[/" &&
+	! test-url-normalize "scheme://invalid....:["
+'
+
+test_expect_success 'url port checks' '
+	test-url-normalize "xyz://q@some.host:" &&
+	test-url-normalize "xyz://q@some.host:456/" &&
+	! test-url-normalize "xyz://q@some.host:0" &&
+	! test-url-normalize "xyz://q@some.host:0000000" &&
+	test-url-normalize "xyz://q@some.host:0000001?" &&
+	test-url-normalize "xyz://q@some.host:065535#" &&
+	test-url-normalize "xyz://q@some.host:65535" &&
+	! test-url-normalize "xyz://q@some.host:65536" &&
+	! test-url-normalize "xyz://q@some.host:99999" &&
+	! test-url-normalize "xyz://q@some.host:100000" &&
+	! test-url-normalize "xyz://q@some.host:100001" &&
+	test-url-normalize "http://q@some.host:80" &&
+	test-url-normalize "https://q@some.host:443" &&
+	test-url-normalize "http://q@some.host:80/" &&
+	test-url-normalize "https://q@some.host:443?" &&
+	! test-url-normalize "http://q@:8008" &&
+	! test-url-normalize "http://:8080" &&
+	! test-url-normalize "http://:" &&
+	test-url-normalize "xyz://q@some.host:456/" &&
+	test-url-normalize "xyz://[::1]:456/" &&
+	test-url-normalize "xyz://[::1]:/" &&
+	! test-url-normalize "xyz://[::1]:000/" &&
+	! test-url-normalize "xyz://[::1]:0%300/" &&
+	! test-url-normalize "xyz://[::1]:0x80/" &&
+	! test-url-normalize "xyz://[::1]:4294967297/" &&
+	! test-url-normalize "xyz://[::1]:030f/"
+'
+
+test_expect_success 'url port normalization' '
+	test "$(test-url-normalize -p "http://x:800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:0800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:00000800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:065535")" = "http://x:65535/" &&
+	test "$(test-url-normalize -p "http://x:1")" = "http://x:1/" &&
+	test "$(test-url-normalize -p "http://x:80")" = "http://x/" &&
+	test "$(test-url-normalize -p "http://x:080")" = "http://x/" &&
+	test "$(test-url-normalize -p "http://x:000000080")" = "http://x/" &&
+	test "$(test-url-normalize -p "https://x:443")" = "https://x/" &&
+	test "$(test-url-normalize -p "https://x:0443")" = "https://x/" &&
+	test "$(test-url-normalize -p "https://x:000000443")" = "https://x/"
+'
+
+test_expect_success 'url general escapes' '
+	! test-url-normalize "http://x.y?%fg" &&
+	test "$(test-url-normalize -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
+	test "$(test-url-normalize -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
+	test "$(test-url-normalize -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
+	test "$(test-url-normalize -p "X://W/'\''")" = "x://w/'\''" &&
+	test "$(test-url-normalize -p "X://W?'\!'")" = "x://w/?'\!'"
+'
+
+test_expect_success 'url high-bit escapes' '
+	test "$(test-url-normalize -p "$(cat "$tu-1")")" = "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12" &&
+	test "$(test-url-normalize -p "$(cat "$tu-2")")" = "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-3")")" = "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-4")")" = "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-5")")" = "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-6")")" = "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-7")")" = "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-8")")" = "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-9")")" = "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-10")")" = "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-11")")" = "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD"
+'
+
+test_expect_success 'url username/password escapes' '
+	test "$(test-url-normalize -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
+'
+
+test_expect_success 'url normalized lengths' '
+	test "$(test-url-normalize -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
+	test "$(test-url-normalize -l "http://%41:%42@x.y/%61/")" = 17 &&
+	test "$(test-url-normalize -l "http://@x.y/^")" = 15
+'
+
+test_expect_success 'url . and .. segments' '
+	test "$(test-url-normalize -p "x://y/.")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/./")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/a/.")" = "x://y/a" &&
+	test "$(test-url-normalize -p "x://y/a/./")" = "x://y/a/" &&
+	test "$(test-url-normalize -p "x://y/.?")" = "x://y/?" &&
+	test "$(test-url-normalize -p "x://y/./?")" = "x://y/?" &&
+	test "$(test-url-normalize -p "x://y/a/.?")" = "x://y/a?" &&
+	test "$(test-url-normalize -p "x://y/a/./?")" = "x://y/a/?" &&
+	test "$(test-url-normalize -p "x://y/a/./b/.././../c")" = "x://y/c" &&
+	test "$(test-url-normalize -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
+	test "$(test-url-normalize -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
+	! test-url-normalize "x://y/a/./b/.././../c/././.././.." &&
+	test "$(test-url-normalize -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
+	test "$(test-url-normalize -p "x://y/%2e/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/%2E/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/a/%2e./")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/b/.%2E/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/c/%2e%2E/")" = "x://y/"
+'
+
+# http://@foo specifies an empty user name but does not specify a password
+# http://foo  specifies neither a user name nor a password
+# So they should not be equivalent
+test_expect_success 'url equivalents' '
+	test-url-normalize "httP://x" "Http://X/" &&
+	test-url-normalize "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
+	! test-url-normalize "https://@x.y/^" "httpS://x.y:443/^" &&
+	test-url-normalize "https://@x.y/^" "httpS://@x.y:0443/^" &&
+	test-url-normalize "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
+	test-url-normalize "https://@x.y/^/.." "httpS://@x.y:0443/"
+'
+
+test_expect_success 'url config normalization matching' '
+	test "$(test-url-normalize -c "$tc-1" "useragent" "https://other.example.com/")" = "other-agent" &&
+	test "$(test-url-normalize -c "$tc-1" "useragent" "https://example.com/")" = "example-agent" &&
+	test "$(test-url-normalize -c "$tc-1" "sslVerify" "https://example.com/")" = "false" &&
+	test "$(test-url-normalize -c "$tc-1" "useragent" "https://example.com/path/sub")" = "path-agent" &&
+	test "$(test-url-normalize -c "$tc-1" "sslVerify" "https://example.com/path/sub")" = "false" &&
+	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://elsewhere.com/")" = "true" &&
+	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://example.com")" = "true" &&
+	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://example.com/path")" = "true" &&
+	test "$(test-url-normalize -c "$tc-2" "useragent" "HTTPS://example.COM/p%61th")" = "example-agent" &&
+	test "$(test-url-normalize -c "$tc-2" "sslVerify" "HTTPS://example.COM/p%61th")" = "false" &&
+	test "$(test-url-normalize -c "$tc-3" "sslcainfo" "https://user@example.com/path/name/here")" = "file-1"
+'
+
+test_done
diff --git a/t/t5200/README b/t/t5200/README
new file mode 100644
index 0000000000000000000000000000000000000000..e3a67d94fb52201b565884daf050f7f76294a4f3
GIT binary patch
literal 644
zcmaKqxl+V142F9@g&u(2St@FJDrTt}C6;3izD9QT;OUW_WmpPmGULekk54&zf>i=o
zYzU46Rp128a|O#nbIWptIj4sD`t9|l!kL?g*`wgxNU9mv2?WzZaJU>EcZbWP158#=
zPfkzHLCesnHWel)k_!o>ED-~LV&j}lcCe-*tVUCyJN>;e&rm676mWFDn`}YvJ+}-f
z1oeYUA=`aDg<|lO*>-0Yj}>H1iPJLTj9lE*!l~DB<58ij^lClzUt9#WkEjYJm`DW;
z#fhv{;~SOnd7Xues$_wZMGJD^cD<P?TgS`){Dq28rPTT+^!BR83auKgHz`n*s+Q9g
zd~4-BmobE@%sp<*9VZO1P2hxKRm3B-=?TgH4zu-*__O(#702kAlhVaVyQB}Ro0+?t
kf;N@o8n6*~JaxDT^{U$O0hW*_weP>ge&O#H1FIUFFGl3$z5oCK

literal 0
HcmV?d00001

diff --git a/t/t5200/config-1 b/t/t5200/config-1
new file mode 100644
index 0000000000000000000000000000000000000000..8aaf23c41c52c28bea4da92ba172c54e8d2a4926
GIT binary patch
literal 180
zcma#fC@Cq3<>D+YPAy7IPt7Y)uvN$}$w)2I1@pK#^YUE-g2RBKB}JvFT+txq3Q8cd
z*h*hNwIVUMASYEXIX_nk%@C**%$VZhoUqiQ%(P0NNok2W#rTZUFGwuOKsOI01~m)-
D(n&qZ

literal 0
HcmV?d00001

diff --git a/t/t5200/config-2 b/t/t5200/config-2
new file mode 100644
index 0000000000000000000000000000000000000000..749f4bd5c4e93b6f0e8a18b3efe794fb3b0b5832
GIT binary patch
literal 80
zcma#fC@CpWPy&&~R{Hv>6^Xe8IjMTd`MLT9i6t3Iv0R*`#i>P!>8W`o3bqPRd0jA{
Xi?g^mCoHunGp!ORm6n)OoXQ0NccUB6

literal 0
HcmV?d00001

diff --git a/t/t5200/config-3 b/t/t5200/config-3
new file mode 100644
index 0000000000000000000000000000000000000000..5c8d3e85dda86e4e2e36d362a0d16400509a45ea
GIT binary patch
literal 118
zcma#fC@CpWPy&&~R{Hv>6^Xe8IjMTd`MLT9i6t5Od5O8HO0is=#l<<viJ5t6`3klQ
aX_+~xx`tfQs9H;lQ;QtX^<&j)#03D2JSaT?

literal 0
HcmV?d00001

diff --git a/t/t5200/url-1 b/t/t5200/url-1
new file mode 100644
index 0000000000000000000000000000000000000000..519019c5ce6c58478f048a2f39e2321370d318c6
GIT binary patch
literal 20
bcmb=h($_E4XJle#VP#|I;Nuq%6ygE^Admtt

literal 0
HcmV?d00001

diff --git a/t/t5200/url-10 b/t/t5200/url-10
new file mode 100644
index 0000000000000000000000000000000000000000..b9965de6a5d74b122179821212b2c27c8ae03e80
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFYxj5^Yr!h_xSnx`~3a>{|dCd5i<Y)

literal 0
HcmV?d00001

diff --git a/t/t5200/url-11 b/t/t5200/url-11
new file mode 100644
index 0000000000000000000000000000000000000000..f0a50f10096a20d597f40c775f09a71276e0050a
GIT binary patch
literal 25
hcmb=h($_E4Kh$u4|APe$@AvQhFrlI0!}|Suxd5(W4xs=5

literal 0
HcmV?d00001

diff --git a/t/t5200/url-2 b/t/t5200/url-2
new file mode 100644
index 0000000000000000000000000000000000000000..43334b05b2de3794d6020abd96e634a4e9e49cb0
GIT binary patch
literal 20
bcmb=h($_E47Zwo}6PJ*bmXVc{ujc{)C{+Vx

literal 0
HcmV?d00001

diff --git a/t/t5200/url-3 b/t/t5200/url-3
new file mode 100644
index 0000000000000000000000000000000000000000..7378c7bec247b996bc67b00a05ed89cf47d4b7a7
GIT binary patch
literal 23
ecmb=h($_E4Z)j|4ZfR|6@96C6?&<C8=K=t7Jqj}b

literal 0
HcmV?d00001

diff --git a/t/t5200/url-4 b/t/t5200/url-4
new file mode 100644
index 0000000000000000000000000000000000000000..220b198c97f942fea4960f51a2105cc42261061a
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFOZRvla!T~mzbHFo1C4Vp9*`u3o`%!

literal 0
HcmV?d00001

diff --git a/t/t5200/url-5 b/t/t5200/url-5
new file mode 100644
index 0000000000000000000000000000000000000000..1ccd9277792840955bb124bdde21f4b08bcccb63
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFQB2Kqok##r>Lo_tE{cAuL^}d3^M=#

literal 0
HcmV?d00001

diff --git a/t/t5200/url-6 b/t/t5200/url-6
new file mode 100644
index 0000000000000000000000000000000000000000..e8283aac6dff049d3e02454db6e684c5790a5996
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFR-z)v$VCgx45~wyS%-=zY31M4Kn}$

literal 0
HcmV?d00001

diff --git a/t/t5200/url-7 b/t/t5200/url-7
new file mode 100644
index 0000000000000000000000000000000000000000..fa7c10b615259deefd15b638b021da7c60eba1b2
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFTlaV!^FkL$H>Xb%goKr&kC454l@7%

literal 0
HcmV?d00001

diff --git a/t/t5200/url-8 b/t/t5200/url-8
new file mode 100644
index 0000000000000000000000000000000000000000..79a0ba836f5b8886b0a73f161eb292af2b105e65
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFVNA_)6~`0*Vx(G+uYsW-wL6<4>JG&

literal 0
HcmV?d00001

diff --git a/t/t5200/url-9 b/t/t5200/url-9
new file mode 100644
index 0000000000000000000000000000000000000000..8b44bec48b94467c63e8e1ad18162e465da6d6dd
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFW}+g<K*S$=jiF`>+J3B?+U9u5HkP(

literal 0
HcmV?d00001

diff --git a/test-url-normalize.c b/test-url-normalize.c
new file mode 100644
index 0000000..81d3da9
--- /dev/null
+++ b/test-url-normalize.c
@@ -0,0 +1,137 @@
+#ifdef NO_CURL
+
+int main()
+{
+	return 125;
+}
+
+#else /* !NO_CURL */
+
+#include "http.c"
+
+static int run_http_options(const char *file,
+			    const char *opt,
+			    const struct url_info *info)
+{
+	struct strbuf opt_lc;
+	size_t i, len;
+	struct urlmatch_config config = { STRING_LIST_INIT_DUP };
+
+	memcpy(&config.url, info, sizeof(*info));
+	config.section = "http";
+	config.collect_fn = http_options;
+	config.cascade_fn = git_default_config;
+	config.cb = NULL;
+
+	if (git_config_with_options(urlmatch_config_entry, &config, file, 0))
+		return 1;
+
+	len = strlen(opt);
+	strbuf_init(&opt_lc, len);
+	for (i = 0; i < len; ++i) {
+		strbuf_addch(&opt_lc, tolower(opt[i]));
+	}
+
+	if (!strcmp("sslverify", opt_lc.buf))
+		printf("%s\n", curl_ssl_verify ? "true" : "false");
+	else if (!strcmp("sslcert", opt_lc.buf))
+		printf("%s\n", ssl_cert);
+#if LIBCURL_VERSION_NUM >= 0x070903
+	else if (!strcmp("sslkey", opt_lc.buf))
+		printf("%s\n", ssl_key);
+#endif
+#if LIBCURL_VERSION_NUM >= 0x070908
+	else if (!strcmp("sslcapath", opt_lc.buf))
+		printf("%s\n", ssl_capath);
+#endif
+	else if (!strcmp("sslcainfo", opt_lc.buf))
+		printf("%s\n", ssl_cainfo);
+	else if (!strcmp("sslcertpasswordprotected", opt_lc.buf))
+		printf("%s\n", ssl_cert_password_required ? "true" : "false");
+	else if (!strcmp("ssltry", opt_lc.buf))
+		printf("%s\n", curl_ssl_try ? "true" : "false");
+	else if (!strcmp("minsessions", opt_lc.buf))
+		printf("%d\n", min_curl_sessions);
+	else if (!strcmp("maxrequests", opt_lc.buf))
+		printf("%d\n", max_requests);
+	else if (!strcmp("lowspeedlimit", opt_lc.buf))
+		printf("%ld\n", curl_low_speed_limit);
+	else if (!strcmp("lowspeedtime", opt_lc.buf))
+		printf("%ld\n", curl_low_speed_time);
+	else if (!strcmp("noepsv", opt_lc.buf))
+		printf("%s\n", curl_ftp_no_epsv ? "true" : "false");
+	else if (!strcmp("proxy", opt_lc.buf))
+		printf("%s\n", curl_http_proxy);
+	else if (!strcmp("cookiefile", opt_lc.buf))
+		printf("%s\n", curl_cookie_file);
+	else if (!strcmp("postbuffer", opt_lc.buf))
+		printf("%u\n", (unsigned)http_post_buffer);
+	else if (!strcmp("useragent", opt_lc.buf))
+		printf("%s\n", user_agent);
+
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	const char *usage = "test-url-normalize [-p | -l] <url1> | <url1> <url2>"
+		" | -c file option <url1>";
+	char *url1, *url2;
+	int opt_p = 0, opt_l = 0, opt_c = 0;
+	char *file = NULL, *optname = NULL;
+
+	/*
+	 * For one url, succeed if url_normalize succeeds on it, fail otherwise.
+	 * For two urls, succeed only if url_normalize succeeds on both and
+	 * the results compare equal with strcmp.  If -p is given (one url only)
+	 * and url_normalize succeeds, print the result followed by "\n".  If
+	 * -l is given (one url only) and url_normalize succeeds, print the
+	 * returned length in decimal followed by "\n".
+	 * If -c is given, call git_config_with_options using the specified file
+	 * and http_options and passing the normalized value of the url.  Then
+	 * print the value of 'option' afterwards.  'option' must be one of the
+	 * valid 'http.*' options.
+	 */
+
+	if (argc > 1 && !strcmp(argv[1], "-p")) {
+		opt_p = 1;
+		argc--;
+		argv++;
+	} else if (argc > 1 && !strcmp(argv[1], "-l")) {
+		opt_l = 1;
+		argc--;
+		argv++;
+	} else if (argc > 3 && !strcmp(argv[1], "-c")) {
+		opt_c = 1;
+		file = argv[2];
+		optname = argv[3];
+		argc -= 3;
+		argv += 3;
+	}
+
+	if (argc < 2 || argc > 3)
+		die(usage);
+
+	if (argc == 2) {
+		struct url_info info;
+		url1 = url_normalize(argv[1], &info);
+		if (!url1)
+			return 1;
+		if (opt_p)
+			printf("%s\n", url1);
+		if (opt_l)
+			printf("%u\n", (unsigned)info.url_len);
+		if (opt_c)
+			return run_http_options(file, optname, &info);
+		return 0;
+	}
+
+	if (opt_p || opt_l || opt_c)
+		die(usage);
+
+	url1 = url_normalize(argv[1], NULL);
+	url2 = url_normalize(argv[2], NULL);
+	return (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
+}
+
+#endif /* !NO_CURL */
-- 
1.8.4-rc0-153-g9820077


* [PATCH v6 5/6] builtin/config: refactor collect_config()
  2013-07-31 19:26 [PATCH v6 0/6] http.<url>.<key> and friends Junio C Hamano
                   ` (3 preceding siblings ...)
  2013-07-31 19:26 ` [PATCH v6 4/6] config: parse http.<url>.<variable> using urlmatch Junio C Hamano
@ 2013-07-31 19:26 ` Junio C Hamano
  2013-07-31 19:26 ` [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key Junio C Hamano
  5 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2013-07-31 19:26 UTC (permalink / raw)
  To: git; +Cc: Kyle J. McKay, Jeff King

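In order to reuse the logic to format the configuration value while
honouring the requested type, split collect_config() into two: the new
format_config() does the formatting, and collect_config() keeps the
key and value filtering and calls it.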

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/config.c | 42 +++++++++++++++++++++---------------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/builtin/config.c b/builtin/config.c
index 33c9bf9..12c5073 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -100,25 +100,13 @@ struct strbuf_list {
 	int alloc;
 };
 
-static int collect_config(const char *key_, const char *value_, void *cb)
+static int format_config(struct strbuf *buf, const char *key_, const char *value_)
 {
-	struct strbuf_list *values = cb;
-	struct strbuf *buf;
-	char value[256];
-	const char *vptr = value;
 	int must_free_vptr = 0;
 	int must_print_delim = 0;
+	char value[256];
+	const char *vptr = value;
 
-	if (!use_key_regexp && strcmp(key_, key))
-		return 0;
-	if (use_key_regexp && regexec(key_regexp, key_, 0, NULL, 0))
-		return 0;
-	if (regexp != NULL &&
-	    (do_not_match ^ !!regexec(regexp, (value_?value_:""), 0, NULL, 0)))
-		return 0;
-
-	ALLOC_GROW(values->items, values->nr + 1, values->alloc);
-	buf = &values->items[values->nr++];
 	strbuf_init(buf, 0);
 
 	if (show_keys) {
@@ -126,7 +114,7 @@ static int collect_config(const char *key_, const char *value_, void *cb)
 		must_print_delim = 1;
 	}
 	if (types == TYPE_INT)
-		sprintf(value, "%d", git_config_int(key_, value_?value_:""));
+		sprintf(value, "%d", git_config_int(key_, value_ ? value_ : ""));
 	else if (types == TYPE_BOOL)
 		vptr = git_config_bool(key_, value_) ? "true" : "false";
 	else if (types == TYPE_BOOL_OR_INT) {
@@ -154,15 +142,27 @@ static int collect_config(const char *key_, const char *value_, void *cb)
 	strbuf_addch(buf, term);
 
 	if (must_free_vptr)
-		/* If vptr must be freed, it's a pointer to a
-		 * dynamically allocated buffer, it's safe to cast to
-		 * const.
-		*/
 		free((char *)vptr);
-
 	return 0;
 }
 
+static int collect_config(const char *key_, const char *value_, void *cb)
+{
+	struct strbuf_list *values = cb;
+
+	if (!use_key_regexp && strcmp(key_, key))
+		return 0;
+	if (use_key_regexp && regexec(key_regexp, key_, 0, NULL, 0))
+		return 0;
+	if (regexp != NULL &&
+	    (do_not_match ^ !!regexec(regexp, (value_?value_:""), 0, NULL, 0)))
+		return 0;
+
+	ALLOC_GROW(values->items, values->nr + 1, values->alloc);
+
+	return format_config(&values->items[values->nr++], key_, value_);
+}
+
 static int get_value(const char *key_, const char *regex_)
 {
 	int ret = CONFIG_GENERIC_ERROR;
-- 
1.8.4-rc0-153-g9820077


* [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key
  2013-07-31 19:26 [PATCH v6 0/6] http.<url>.<key> and friends Junio C Hamano
                   ` (4 preceding siblings ...)
  2013-07-31 19:26 ` [PATCH v6 5/6] builtin/config: refactor collect_config() Junio C Hamano
@ 2013-07-31 19:26 ` Junio C Hamano
  2013-07-31 22:45   ` Jeff King
  5 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2013-07-31 19:26 UTC (permalink / raw)
  To: git; +Cc: Kyle J. McKay, Jeff King

Using the same urlmatch_config_entry() infrastructure, add a new
mode "--get-urlmatch" to the "git config" command, to learn values
for the "virtual" two-level variables customized for the specific
URL.

    git config [--<type>] --get-urlmatch <section>[.<key>] <url>

With <section>.<key> fully specified, the configuration data for
<section>.<urlpattern>.<key> for <urlpattern> that best matches the
given <url> is sought (and if not found, <section>.<key> is used)
and reported.  For example, with this configuration:

    [http]
        sslVerify
    [http "https://weak.example.com"]
        cookieFile = /tmp/cookie.txt
        sslVerify = false

You would get

    $ git config --bool --get-urlmatch http.sslVerify https://good.example.com
    true
    $ git config --bool --get-urlmatch http.sslVerify https://weak.example.com
    false

With only <section> specified, you can get a list of all variables
in the section with their values that apply to the given URL.  For example:

    $ git config --get-urlmatch http https://weak.example.com
    http.cookiefile /tmp/cookie.txt
    http.sslverify false

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-config.txt | 29 ++++++++++++++
 builtin/config.c             | 92 ++++++++++++++++++++++++++++++++++++++++++++
 t/t1300-repo-config.sh       | 25 ++++++++++++
 3 files changed, 146 insertions(+)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index d88a6fc..b48e2ec 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -15,6 +15,7 @@ SYNOPSIS
 'git config' [<file-option>] [type] [-z|--null] --get name [value_regex]
 'git config' [<file-option>] [type] [-z|--null] --get-all name [value_regex]
 'git config' [<file-option>] [type] [-z|--null] --get-regexp name_regex [value_regex]
+'git config' [<file-option>] [type] [-z|--null] --get-urlmatch name URL
 'git config' [<file-option>] --unset name [value_regex]
 'git config' [<file-option>] --unset-all name [value_regex]
 'git config' [<file-option>] --rename-section old_name new_name
@@ -95,6 +96,14 @@ OPTIONS
 	in which section and variable names are lowercased, but subsection
 	names are not.
 
+--get-urlmatch name URL::
+	When given a two-part name section.key, the value for
+	section.<url>.key whose <url> part best matches the given URL
+	is returned (if no such key exists, the value for
+	section.key is used as a fallback).  When given just the
+	section as name, do so for all the keys in the section and
+	list them.
+
 --global::
 	For writing options: write to global ~/.gitconfig file rather than
 	the repository .git/config, write to $XDG_CONFIG_HOME/git/config file
@@ -273,6 +282,13 @@ Given a .git/config like this:
 		gitproxy=proxy-command for kernel.org
 		gitproxy=default-proxy ; for all the rest
 
+	; HTTP
+	[http]
+		sslVerify
+	[http "https://weak.example.com"]
+		sslVerify = false
+		cookieFile = /tmp/cookie.txt
+
 you can set the filemode to true with
 
 ------------
@@ -358,6 +374,19 @@ RESET=$(git config --get-color "" "reset")
 echo "${WS}your whitespace color or blue reverse${RESET}"
 ------------
 
+For URLs in `https://weak.example.com`, `http.sslVerify` is set to
+false, while it is set to `true` for all others:
+
+------------
+% git config --bool --get-urlmatch http.sslverify https://good.example.com
+true
+% git config --bool --get-urlmatch http.sslverify https://weak.example.com
+false
+% git config --get-urlmatch http https://weak.example.com
+http.cookiefile /tmp/cookie.txt
+http.sslverify false
+------------
+
 include::config.txt[]
 
 GIT
diff --git a/builtin/config.c b/builtin/config.c
index 12c5073..c35c5be 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -2,6 +2,7 @@
 #include "cache.h"
 #include "color.h"
 #include "parse-options.h"
+#include "urlmatch.h"
 
 static const char *const builtin_config_usage[] = {
 	N_("git config [options]"),
@@ -41,6 +42,7 @@ static int respect_includes = -1;
 #define ACTION_SET_ALL (1<<12)
 #define ACTION_GET_COLOR (1<<13)
 #define ACTION_GET_COLORBOOL (1<<14)
+#define ACTION_GET_URLMATCH (1<<15)
 
 #define TYPE_BOOL (1<<0)
 #define TYPE_INT (1<<1)
@@ -57,6 +59,7 @@ static struct option builtin_config_options[] = {
 	OPT_BIT(0, "get", &actions, N_("get value: name [value-regex]"), ACTION_GET),
 	OPT_BIT(0, "get-all", &actions, N_("get all values: key [value-regex]"), ACTION_GET_ALL),
 	OPT_BIT(0, "get-regexp", &actions, N_("get values for regexp: name-regex [value-regex]"), ACTION_GET_REGEXP),
+	OPT_BIT(0, "get-urlmatch", &actions, N_("get value specific for the URL: section[.var] URL"), ACTION_GET_URLMATCH),
 	OPT_BIT(0, "replace-all", &actions, N_("replace all matching variables: name value [value_regex]"), ACTION_REPLACE_ALL),
 	OPT_BIT(0, "add", &actions, N_("add a new variable: name value"), ACTION_ADD),
 	OPT_BIT(0, "unset", &actions, N_("remove a variable: name [value-regex]"), ACTION_UNSET),
@@ -348,6 +351,91 @@ static int get_colorbool(int print)
 		return get_colorbool_found ? 0 : 1;
 }
 
+struct urlmatch_current_candidate_value {
+	char value_is_null;
+	struct strbuf value;
+};
+
+static int urlmatch_collect_fn(const char *var, const char *value, void *cb)
+{
+	struct string_list *values = cb;
+	struct string_list_item *item = string_list_insert(values, var);
+	struct urlmatch_current_candidate_value *matched = item->util;
+
+	if (!matched) {
+		matched = xmalloc(sizeof(*matched));
+		strbuf_init(&matched->value, 0);
+		item->util = matched;
+	} else {
+		strbuf_reset(&matched->value);
+	}
+
+	if (value) {
+		strbuf_addstr(&matched->value, value);
+		matched->value_is_null = 0;
+	} else {
+		matched->value_is_null = 1;
+	}
+	return 0;
+}
+
+static int get_urlmatch(const char *var, const char *url)
+{
+	const char *section_tail;
+	struct string_list_item *item;
+	struct urlmatch_config config = { STRING_LIST_INIT_DUP };
+	struct string_list values = STRING_LIST_INIT_DUP;
+
+	config.collect_fn = urlmatch_collect_fn;
+	config.cascade_fn = NULL;
+	config.cb = &values;
+
+	if (!url_normalize(url, &config.url))
+		die(config.url.err);
+
+	section_tail = strchr(var, '.');
+	if (section_tail) {
+		config.section = xmemdupz(var, section_tail - var);
+		config.key = strrchr(var, '.') + 1;
+		show_keys = 0;
+	} else {
+		config.section = var;
+		config.key = NULL;
+		show_keys = 1;
+	}
+
+	git_config_with_options(urlmatch_config_entry, &config,
+				given_config_file, respect_includes);
+
+	for_each_string_list_item(item, &values) {
+		struct urlmatch_current_candidate_value *matched = item->util;
+		struct strbuf key = STRBUF_INIT;
+		struct strbuf buf = STRBUF_INIT;
+
+		strbuf_addstr(&key, item->string);
+		format_config(&buf, key.buf,
+			      matched->value_is_null ? NULL : matched->value.buf);
+		fwrite(buf.buf, 1, buf.len, stdout);
+		strbuf_release(&key);
+		strbuf_release(&buf);
+
+		strbuf_release(&matched->value);
+	}
+	string_list_clear(&config.vars, 1);
+	string_list_clear(&values, 1);
+	free(config.url.url);
+
+	/*
+	 * section name may have been copied to replace the dot, in which
+	 * case it needs to be freed.  key name is either NULL (e.g. 'http'
+	 * alone) or points into var (e.g. 'http.savecookies'), and we do
+	 * not own the storage.
+	 */
+	if (config.section != var)
+		free((void *)config.section);
+	return 0;
+}
+
 int cmd_config(int argc, const char **argv, const char *prefix)
 {
 	int nongit = !startup_info->have_repository;
@@ -499,6 +587,10 @@ int cmd_config(int argc, const char **argv, const char *prefix)
 		check_argc(argc, 1, 2);
 		return get_value(argv[0], argv[1]);
 	}
+	else if (actions == ACTION_GET_URLMATCH) {
+		check_argc(argc, 2, 2);
+		return get_urlmatch(argv[0], argv[1]);
+	}
 	else if (actions == ACTION_UNSET) {
 		check_argc(argc, 1, 2);
 		if (argc == 2)
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index c4a7d84..323e880 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1087,6 +1087,31 @@ test_expect_success 'barf on incomplete string' '
 	grep " line 3 " error
 '
 
+test_expect_success 'urlmatch' '
+	cat >.git/config <<-\EOF &&
+	[http]
+		sslVerify
+	[http "https://weak.example.com"]
+		sslVerify = false
+		cookieFile = /tmp/cookie.txt
+	EOF
+
+	echo true >expect &&
+	git config --bool --get-urlmatch http.sslverify https://good.example.com >actual &&
+	test_cmp expect actual &&
+
+	echo false >expect &&
+	git config --bool --get-urlmatch http.sslverify https://weak.example.com >actual &&
+	test_cmp expect actual &&
+
+	{
+		echo http.cookiefile /tmp/cookie.txt &&
+		echo http.sslverify false
+	} >expect &&
+	git config --get-urlmatch http https://weak.example.com >actual &&
+	test_cmp expect actual
+'
+
 # good section hygiene
 test_expect_failure 'unsetting the last key in a section removes header' '
 	cat >.git/config <<-\EOF &&
-- 
1.8.4-rc0-153-g9820077

* Re: [PATCH v6 2/6] config: add helper to normalize and match URLs
  2013-07-31 19:26 ` [PATCH v6 2/6] config: add helper to normalize and match URLs Junio C Hamano
@ 2013-07-31 20:50   ` Kyle J. McKay
  0 siblings, 0 replies; 23+ messages in thread
From: Kyle J. McKay @ 2013-07-31 20:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

On Jul 31, 2013, at 12:26, Junio C Hamano wrote:

> From: "Kyle J. McKay" <mackyle@gmail.com>
>
> Some http.* configuration variables need to take values customized
> for the URL we are talking to.  We may want to set http.sslVerify to
> true in general but to false only for a certain site, for example,
> with a configuration file like this:
[...]
> urlmatch.c | 468 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> urlmatch.h |  36 +++++
> 2 files changed, 504 insertions(+)
> create mode 100644 urlmatch.c
> create mode 100644 urlmatch.h
>
> diff --git a/urlmatch.c b/urlmatch.c
> new file mode 100644
> index 0000000..e1b03ee
> --- /dev/null
> +++ b/urlmatch.c
[...]
> +
> +static size_t http_options_url_match_prefix(const char *url,
> +					    const char *url_prefix,
> +					    size_t url_prefix_len)
> +{
> +	/*
> +	 * url_prefix matches url if url_prefix is an exact match for url or it
> +	 * is a prefix of url and the match ends on a path component boundary.
> +	 * Both url and url_prefix are considered to have an implicit '/' on the
> +	 * end for matching purposes if they do not already.

This function should probably be renamed to just url_match_prefix  
since it isn't part of nor does it depend on the http_options related  
files or functions anymore.

Otherwise looks good to me.

* Re: [PATCH v6 4/6] config: parse http.<url>.<variable> using urlmatch
  2013-07-31 19:26 ` [PATCH v6 4/6] config: parse http.<url>.<variable> using urlmatch Junio C Hamano
@ 2013-07-31 20:51   ` Kyle J. McKay
  2013-07-31 20:51   ` [PATCH ALTERNATIVE v6 0/2] http.<url>.<key> and friends Kyle J. McKay
  1 sibling, 0 replies; 23+ messages in thread
From: Kyle J. McKay @ 2013-07-31 20:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

On Jul 31, 2013, at 12:26, Junio C Hamano wrote:

> From: "Kyle J. McKay" <mackyle@gmail.com>
>
> Use the urlmatch_config_entry() to wrap the underlying
> http_options() two-level variable parser in order to set
> http.<variable> to the value with the most specific URL in the
> configuration.
>
> Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Needs Peff's Signed-off-by: for the copious amount of text he wrote  
that is included verbatim in the documentation part of the patch.  He  
previously gave it for this purpose.

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 6e53fc5..60c140f 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1513,6 +1513,50 @@ http.useragent::
> 	of common USER_AGENT strings (but not including those like git/1.7.1).
> 	Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
>
> +http.<url>.*::
> +	Any of the http.* options above can be applied selectively to some urls.
> +	For a config key to match a URL, each element of the config key is
> +	compared to that of the URL, in the following order:
> ++
> +--
> +. Scheme (e.g., `https` in `https://example.com/`). This field
> +  must match exactly between the config key and the URL.
> +
> +. Host/domain name (e.g., `example.com` in `https://example.com/`).
> +  This field must match exactly between the config key and the URL.
> +
> +. Port number (e.g., `8080` in `http://example.com:8080/`).
> +  This field must match exactly between the config key and the URL.
> +  Omitted port numbers are automatically converted to the correct
> +  default for the scheme before matching.
> +
> +. Path (e.g., `repo.git` in `https://example.com/repo.git`). The
> +  path field of the config key must match the path field of the URL
> +  either exactly or as a prefix of slash-delimited path elements.  This means
> +  a config key with path `foo/` matches URL path `foo/bar`.  A prefix can only
> +  match on a slash (`/`) boundary.  Longer matches take precedence (so a config
> +  key with path `foo/bar` is a better match to URL path `foo/bar` than a config
> +  key with just path `foo/`).
> +
> +. User name (e.g., `user` in `https://user@example.com/repo.git`). If
> +  the config key has a user name it must match the user name in the
> +  URL exactly. If the config key does not have a user name, that
> +  config key will match a URL with any user name (including none).

Missing the single line follow-up patch:

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 0dd5566..f2ed9ef 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1568,7 +1568,8 @@ http.<url>.*::
> . User name (e.g., `user` in `https://user@example.com/repo.git`). If
>   the config key has a user name it must match the user name in the
>   URL exactly. If the config key does not have a user name, that
> -  config key will match a URL with any user name (including none).
> +  config key will match a URL with any user name (including none),
> +  but at a lower precedence than a config key with a user name.
> --



> diff --git a/test-url-normalize.c b/test-url-normalize.c
> new file mode 100644
> index 0000000..81d3da9
> --- /dev/null
> +++ b/test-url-normalize.c
> @@ -0,0 +1,137 @@
> +#ifdef NO_CURL
> +
> +int main()

Needs Ramsay's patch here:

-int main()
+int main(void)

> +static int run_http_options(const char *file,
> +			    const char *opt,
> +			    const struct url_info *info)
> +{
> +	struct strbuf opt_lc;
> +	size_t i, len;
> +	struct urlmatch_config config = { STRING_LIST_INIT_DUP };
> +
> +	memcpy(&config.url, info, sizeof(*info));
> +	config.section = "http";
> +	config.collect_fn = http_options;
> +	config.cascade_fn = git_default_config;
> +	config.cb = NULL;
> +
> +	if (git_config_with_options(urlmatch_config_entry, &config, file, 0))
> +		return 1;
> +
> +	len = strlen(opt);
> +	strbuf_init(&opt_lc, len);
> +	for (i = 0; i < len; ++i) {
> +		strbuf_addch(&opt_lc, tolower(opt[i]));
> +	}
> +
> +	if (!strcmp("sslverify", opt_lc.buf))
> +		printf("%s\n", curl_ssl_verify ? "true" : "false");
> +	else if (!strcmp("sslcert", opt_lc.buf))
> +		printf("%s\n", ssl_cert);
> +#if LIBCURL_VERSION_NUM >= 0x070903
> +	else if (!strcmp("sslkey", opt_lc.buf))
> +		printf("%s\n", ssl_key);
> +#endif
> +#if LIBCURL_VERSION_NUM >= 0x070908
> +	else if (!strcmp("sslcapath", opt_lc.buf))
> +		printf("%s\n", ssl_capath);
> +#endif
> +	else if (!strcmp("sslcainfo", opt_lc.buf))
> +		printf("%s\n", ssl_cainfo);
> +	else if (!strcmp("sslcertpasswordprotected", opt_lc.buf))
> +		printf("%s\n", ssl_cert_password_required ? "true" : "false");
> +	else if (!strcmp("ssltry", opt_lc.buf))
> +		printf("%s\n", curl_ssl_try ? "true" : "false");
> +	else if (!strcmp("minsessions", opt_lc.buf))
> +		printf("%d\n", min_curl_sessions);

And here

+#ifdef USE_CURL_MULTI
> +	else if (!strcmp("maxrequests", opt_lc.buf))
> +		printf("%d\n", max_requests);
+#endif
> +	else if (!strcmp("lowspeedlimit", opt_lc.buf))
> +		printf("%ld\n", curl_low_speed_limit);

Otherwise looks good to me.

* [PATCH ALTERNATIVE v6 0/2] http.<url>.<key> and friends
  2013-07-31 19:26 ` [PATCH v6 4/6] config: parse http.<url>.<variable> using urlmatch Junio C Hamano
  2013-07-31 20:51   ` Kyle J. McKay
@ 2013-07-31 20:51   ` Kyle J. McKay
  2013-07-31 20:52     ` [PATCH ALTERNATIVE v6 2/4] config: add helper to normalize and match URLs Kyle J. McKay
                       ` (3 more replies)
  1 sibling, 4 replies; 23+ messages in thread
From: Kyle J. McKay @ 2013-07-31 20:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

This series provides alternative versions of the 2/6 and 4/6 patches
previously sent as part of the:

  [PATCH v6 0/6] http.<url>.<key> and friends

series.  They are intended as complete replacements for parts 2 and 4
and include the following changes:

2/4 - Include 1-line documentation update in log comment and rename static
      function from http_options_url_match_prefix to url_match_prefix.
      
4/4 - Include 1-line documentation update together with Peff's previously
      provided Signed-off-by for the copious amount of documentation text he
      has provided that has been included verbatim.  Also include the minor
      fixes from Ramsay Jones for compilation of test-url-normalize when
      NO_CURL is defined.

If simply squashing the patches in instead is desired, here's the patch for
part 2/4:

diff --git a/urlmatch.c b/urlmatch.c
index e1b03ee7..4f38cc7b 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -376,9 +376,9 @@ char *url_normalize(const char *url, struct url_info *out_info)
 	return result;
 }
 
-static size_t http_options_url_match_prefix(const char *url,
-					    const char *url_prefix,
-					    size_t url_prefix_len)
+static size_t url_match_prefix(const char *url,
+			       const char *url_prefix,
+			       size_t url_prefix_len)
 {
 	/*
 	 * url_prefix matches url if url_prefix is an exact match for url or it
@@ -457,7 +457,7 @@ int match_urls(const struct url_info *url,
 		return 0; /* host names and/or ports do not match */
 
 	/* check the path */
-	pathmatchlen = http_options_url_match_prefix(
+	pathmatchlen = url_match_prefix(
 		url->url + url->path_off,
 		url_prefix->url + url_prefix->path_off,
 		url_prefix->url_len - url_prefix->path_off);
---

And here's the patch for part 4/4:

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 60c140f1..8cc0fd78 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1542,6 +1542,7 @@ http.<url>.*::
   the config key has a user name it must match the user name in the
   URL exactly. If the config key does not have a user name, that
-  config key will match a URL with any user name (including none).
+  config key will match a URL with any user name (including none),
+  but at a lower precedence than a config key with a user name.
 --
 +
 The list above is ordered by decreasing precedence; a URL that matches
diff --git a/test-url-normalize.c b/test-url-normalize.c
index 81d3da90..80437217 100644
--- a/test-url-normalize.c
+++ b/test-url-normalize.c
@@ -1,6 +1,6 @@
 #ifdef NO_CURL
 
-int main()
+int main(void)
 {
 	return 125;
 }
@@ -52,8 +52,10 @@ static int run_http_options(const char *file,
 		printf("%s\n", curl_ssl_try ? "true" : "false");
 	else if (!strcmp("minsessions", opt_lc.buf))
 		printf("%d\n", min_curl_sessions);
+#ifdef USE_CURL_MULTI
 	else if (!strcmp("maxrequests", opt_lc.buf))
 		printf("%d\n", max_requests);
+#endif
 	else if (!strcmp("lowspeedlimit", opt_lc.buf))
 		printf("%ld\n", curl_low_speed_limit);
 	else if (!strcmp("lowspeedtime", opt_lc.buf))
---

* [PATCH ALTERNATIVE v6 2/4] config: add helper to normalize and match URLs
  2013-07-31 20:51   ` [PATCH ALTERNATIVE v6 0/2] http.<url>.<key> and friends Kyle J. McKay
@ 2013-07-31 20:52     ` Kyle J. McKay
  2013-07-31 20:52     ` [PATCH ALTERNATIVE v6 4/4] config: parse http.<url>.<variable> using urlmatch Kyle J. McKay
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Kyle J. McKay @ 2013-07-31 20:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

Some http.* configuration variables need to take values customized
for the URL we are talking to.  We may want to set http.sslVerify to
true in general but to false only for a certain site, for example,
with a configuration file like this:

	[http]
		sslVerify = true
	[http "https://weak.example.com"]
		sslVerify = false

and let the configuration machinery pick up the latter only when
talking to "https://weak.example.com".  The latter needs to kick in
not only when the URL is exactly "https://weak.example.com", but
also when it is anything that matches it, e.g.

	https://weak.example.com/test
	https://me@weak.example.com/test

The <url> in the configuration key consists of the following parts,
and is considered a match to the URL we are attempting to access
under certain conditions:

  . Scheme (e.g., `https` in `https://example.com/`). This field
    must match exactly between the config key and the URL.

  . Host/domain name (e.g., `example.com` in `https://example.com/`).
    This field must match exactly between the config key and the URL.

  . Port number (e.g., `8080` in `http://example.com:8080/`).  This
    field must match exactly between the config key and the URL.
    Omitted port numbers are automatically converted to the correct
    default for the scheme before matching.

  . Path (e.g., `repo.git` in `https://example.com/repo.git`). The
    path field of the config key must match the path field of the
    URL either exactly or as a prefix of slash-delimited path
    elements.  A config key with path `foo/` matches URL path
    `foo/bar`.  A prefix can only match on a slash (`/`) boundary.
    Longer matches take precedence (so a config key with path
    `foo/bar` is a better match to URL path `foo/bar` than a config
    key with just path `foo/`).

  . User name (e.g., `me` in `https://me@example.com/repo.git`). If
    the config key has a user name, it must match the user name in
    the URL exactly. If the config key does not have a user name,
    that config key will match a URL with any user name (including
    none), but at a lower precedence than a config key with a user
    name.

Longer matches take precedence over shorter matches.
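
The slash-boundary rule for the path part can be illustrated with a
small standalone sketch.  This is not the url_match_prefix() added by
this patch: path_prefix_match() is a hypothetical helper that works on
NUL-terminated strings and leaves out the implicit trailing '/'
handling.  It returns the number of matched bytes so that a caller
could prefer the longer match.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the length of 'prefix' when it matches
 * 'path' exactly or as a prefix that ends on a '/' boundary, and 0
 * when it does not match.
 */
static size_t path_prefix_match(const char *path, const char *prefix)
{
	size_t plen = strlen(prefix);

	if (!plen || strncmp(path, prefix, plen))
		return 0;
	/*
	 * Accept an exact match, a prefix that itself ends in '/',
	 * or a prefix immediately followed by '/' in the path.
	 */
	if (path[plen] == '\0' || prefix[plen - 1] == '/' || path[plen] == '/')
		return plen;
	return 0;
}
```

Under these assumptions "/foo/" and "/foo" both match "/foo/bar",
"/foo" does not match "/foobar", and the exact key "/foo/bar" wins over
"/foo/" because it matches more bytes.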

This step adds two helper functions `url_normalize()` and
`match_urls()` to help implement the above semantics. The
normalization rules are based on RFC 3986 and should result in any
two equivalent urls being a match.
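
As a rough illustration of one normalization rule (decode %-escapes
that do not need to be escaped, and upper-case the ones that remain),
here is a simplified standalone sketch.  It is not the
append_normalized_escapes() from this patch: normalize_escapes(),
is_unreserved() and hexval() are hypothetical names, the output goes to
a caller-supplied buffer, and characters that arrive unescaped are
copied through unchanged rather than being escaped when unsafe.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* RFC 3986 "unreserved" characters never need to stay %-escaped. */
static int is_unreserved(int ch)
{
	return isalnum(ch) || (ch && strchr("-._~", ch));
}

/* Value of one hexadecimal digit, or -1 if 'ch' is not one. */
static int hexval(int ch)
{
	if (ch >= '0' && ch <= '9')
		return ch - '0';
	ch = tolower(ch);
	if (ch >= 'a' && ch <= 'f')
		return ch - 'a' + 10;
	return -1;
}

/*
 * Copy 'in' to 'out' (capacity 'outsz'), decoding %XX escapes of
 * unreserved characters and rewriting all other escapes with
 * upper-case hex digits.  Returns 0 on success, -1 on a malformed
 * escape or when 'out' is too small.
 */
static int normalize_escapes(const char *in, char *out, size_t outsz)
{
	size_t o = 0;

	if (!outsz)
		return -1;
	while (*in) {
		if (outsz - o < 4)	/* room for "%XX" plus NUL */
			return -1;
		if (*in == '%') {
			int hi = hexval((unsigned char)in[1]);
			int lo = hi < 0 ? -1 : hexval((unsigned char)in[2]);
			int ch;

			if (hi < 0 || lo < 0)
				return -1;
			ch = hi << 4 | lo;
			if (is_unreserved(ch))
				out[o++] = ch;
			else
				o += snprintf(out + o, outsz - o, "%%%02X", ch);
			in += 3;
		} else {
			out[o++] = *in++;
		}
	}
	out[o] = '\0';
	return 0;
}
```

For example, "%7e%2f%41" becomes "~%2FA": the tilde and the letter are
decoded because they are unreserved, while the escaped slash stays
escaped so that it is not mistaken for a path delimiter.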

Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 urlmatch.c | 468 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 urlmatch.h |  36 +++++
 2 files changed, 504 insertions(+)
 create mode 100644 urlmatch.c
 create mode 100644 urlmatch.h

diff --git a/urlmatch.c b/urlmatch.c
new file mode 100644
index 00000000..4f38cc7b
--- /dev/null
+++ b/urlmatch.c
@@ -0,0 +1,468 @@
+#include "cache.h"
+#include "urlmatch.h"
+
+#define URL_ALPHA "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
+#define URL_DIGIT "0123456789"
+#define URL_ALPHADIGIT URL_ALPHA URL_DIGIT
+#define URL_SCHEME_CHARS URL_ALPHADIGIT "+.-"
+#define URL_HOST_CHARS URL_ALPHADIGIT ".-[:]" /* IPv6 literals need [:] */
+#define URL_UNSAFE_CHARS " <>\"%{}|\\^`" /* plus 0x00-0x1F,0x7F-0xFF */
+#define URL_GEN_RESERVED ":/?#[]@"
+#define URL_SUB_RESERVED "!$&'()*+,;="
+#define URL_RESERVED URL_GEN_RESERVED URL_SUB_RESERVED /* only allowed delims */
+
+static int append_normalized_escapes(struct strbuf *buf,
+				     const char *from,
+				     size_t from_len,
+				     const char *esc_extra,
+				     const char *esc_ok)
+{
+	/*
+	 * Append to strbuf 'buf' characters from string 'from' with length
+	 * 'from_len' while unescaping characters that do not need to be escaped
+	 * and escaping characters that do.  The set of characters to escape
+	 * (the complement of which is unescaped) starts out as the RFC 3986
+	 * unsafe characters (0x00-0x1F,0x7F-0xFF," <>\"#%{}|\\^`").  If
+	 * 'esc_extra' is not NULL, those additional characters will also always
+	 * be escaped.  If 'esc_ok' is not NULL, those characters will be left
+	 * escaped if found that way, but will not be unescaped otherwise (used
+	 * for delimiters).  If a %-escape sequence is encountered that is not
+	 * followed by 2 hexadecimal digits, the sequence is invalid and
+	 * false (0) will be returned.  Otherwise true (1) will be returned for
+	 * success.
+	 *
+	 * Note that all %-escape sequences will be normalized to UPPERCASE
+	 * as indicated in RFC 3986.  Unless included in esc_extra or esc_ok
+	 * alphanumerics and "-._~" will always be unescaped as per RFC 3986.
+	 */
+
+	while (from_len) {
+		int ch = *from++;
+		int was_esc = 0;
+
+		from_len--;
+		if (ch == '%') {
+			if (from_len < 2 ||
+			    !isxdigit((unsigned char)from[0]) ||
+			    !isxdigit((unsigned char)from[1]))
+				return 0;
+			ch = hexval_table[(unsigned char)*from++] << 4;
+			ch |= hexval_table[(unsigned char)*from++];
+			from_len -= 2;
+			was_esc = 1;
+		}
+		if ((unsigned char)ch <= 0x1F || (unsigned char)ch >= 0x7F ||
+		    strchr(URL_UNSAFE_CHARS, ch) ||
+		    (esc_extra && strchr(esc_extra, ch)) ||
+		    (was_esc && esc_ok && strchr(esc_ok, ch)))
+			strbuf_addf(buf, "%%%02X", (unsigned char)ch);
+		else
+			strbuf_addch(buf, ch);
+	}
+
+	return 1;
+}
+
+char *url_normalize(const char *url, struct url_info *out_info)
+{
+	/*
+	 * Normalize NUL-terminated url using the following rules:
+	 *
+	 * 1. Case-insensitive parts of url will be converted to lower case
+	 * 2. %-encoded characters that do not need to be will be unencoded
+	 * 3. Characters that are not %-encoded and must be will be encoded
+	 * 4. All %-encodings will be converted to upper case hexadecimal
+	 * 5. Leading 0s are removed from port numbers
+	 * 6. If the default port for the scheme is given it will be removed
+	 * 7. A path part (including empty) not starting with '/' has one added
+	 * 8. Any dot segments (. or ..) in the path are resolved and removed
+	 * 9. IPv6 host literals are allowed (but not normalized or validated)
+	 *
+	 * The rules are based on information in RFC 3986.
+	 *
+	 * Please note this function requires a full URL including a scheme
+	 * and host part (except for file: URLs which may have an empty host).
+	 *
+	 * The return value is a newly allocated string that must be freed
+	 * or NULL if the url is not valid.
+	 *
+	 * If out_info is non-NULL, the url and err fields therein will always
+	 * be set.  If a non-NULL value is returned, it will be stored in
+	 * out_info->url as well, out_info->err will be set to NULL and the
+	 * other fields of *out_info will also be filled in.  If a NULL value
+	 * is returned, NULL will be stored in out_info->url and out_info->err
+	 * will be set to a brief, translated, error message, but no other
+	 * fields will be filled in.
+	 *
+	 * This is NOT a URL validation function.  Full URL validation is NOT
+	 * performed.  Some invalid host names are passed through this function
+	 * undetected.  However, most all other problems that make a URL invalid
+	 * will be detected (including a missing host for non file: URLs).
+	 */
+
+	size_t url_len = strlen(url);
+	struct strbuf norm;
+	size_t spanned;
+	size_t scheme_len, user_off=0, user_len=0, passwd_off=0, passwd_len=0;
+	size_t host_off=0, host_len=0, port_len=0, path_off, path_len, result_len;
+	const char *slash_ptr, *at_ptr, *colon_ptr, *path_start;
+	char *result;
+
+	/*
+	 * Copy lowercased scheme and :// suffix, %-escapes are not allowed
+	 * First character of scheme must be URL_ALPHA
+	 */
+	spanned = strspn(url, URL_SCHEME_CHARS);
+	if (!spanned || !isalpha(url[0]) || spanned + 3 > url_len ||
+	    url[spanned] != ':' || url[spanned+1] != '/' || url[spanned+2] != '/') {
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("invalid URL scheme name or missing '://' suffix");
+		}
+		return NULL; /* Bad scheme and/or missing "://" part */
+	}
+	strbuf_init(&norm, url_len);
+	scheme_len = spanned;
+	spanned += 3;
+	url_len -= spanned;
+	while (spanned--)
+		strbuf_addch(&norm, tolower(*url++));
+
+
+	/*
+	 * Copy any username:password if present normalizing %-escapes
+	 */
+	at_ptr = strchr(url, '@');
+	slash_ptr = url + strcspn(url, "/?#");
+	if (at_ptr && at_ptr < slash_ptr) {
+		user_off = norm.len;
+		if (at_ptr > url) {
+			if (!append_normalized_escapes(&norm, url, at_ptr - url,
+						       "", URL_RESERVED)) {
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid %XX escape sequence");
+				}
+				strbuf_release(&norm);
+				return NULL;
+			}
+			colon_ptr = strchr(norm.buf + scheme_len + 3, ':');
+			if (colon_ptr) {
+				passwd_off = (colon_ptr + 1) - norm.buf;
+				passwd_len = norm.len - passwd_off;
+				user_len = (passwd_off - 1) - (scheme_len + 3);
+			} else {
+				user_len = norm.len - (scheme_len + 3);
+			}
+		}
+		strbuf_addch(&norm, '@');
+		url_len -= (++at_ptr - url);
+		url = at_ptr;
+	}
+
+
+	/*
+	 * Copy the host part excluding any port part, no %-escapes allowed
+	 */
+	if (!url_len || strchr(":/?#", *url)) {
+		/* Missing host invalid for all URL schemes except file */
+		if (strncmp(norm.buf, "file:", 5)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("missing host and scheme is not 'file:'");
+			}
+			strbuf_release(&norm);
+			return NULL;
+		}
+	} else {
+		host_off = norm.len;
+	}
+	colon_ptr = slash_ptr - 1;
+	while (colon_ptr > url && *colon_ptr != ':' && *colon_ptr != ']')
+		colon_ptr--;
+	if (*colon_ptr != ':') {
+		colon_ptr = slash_ptr;
+	} else if (!host_off && colon_ptr < slash_ptr && colon_ptr + 1 != slash_ptr) {
+		/* file: URLs may not have a port number */
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("a 'file:' URL may not have a port number");
+		}
+		strbuf_release(&norm);
+		return NULL;
+	}
+	spanned = strspn(url, URL_HOST_CHARS);
+	if (spanned < colon_ptr - url) {
+		/* Host name has invalid characters */
+		if (out_info) {
+			out_info->url = NULL;
+			out_info->err = _("invalid characters in host name");
+		}
+		strbuf_release(&norm);
+		return NULL;
+	}
+	while (url < colon_ptr) {
+		strbuf_addch(&norm, tolower(*url++));
+		url_len--;
+	}
+
+
+	/*
+	 * Check the port part and copy if not the default (after removing any
+	 * leading 0s); no %-escapes allowed
+	 */
+	if (colon_ptr < slash_ptr) {
+		/* skip the ':' and leading 0s but not the last one if all 0s */
+		url++;
+		url += strspn(url, "0");
+		if (url == slash_ptr && url[-1] == '0')
+			url--;
+		if (url == slash_ptr) {
+			/* Skip ":" port with no number, it's same as default */
+		} else if (slash_ptr - url == 2 &&
+			   !strncmp(norm.buf, "http:", 5) &&
+			   !strncmp(url, "80", 2)) {
+			/* Skip http :80 as it's the default */
+		} else if (slash_ptr - url == 3 &&
+			   !strncmp(norm.buf, "https:", 6) &&
+			   !strncmp(url, "443", 3)) {
+			/* Skip https :443 as it's the default */
+		} else {
+			/*
+			 * Port number must be all digits with leading 0s removed
+			 * and since all the protocols we deal with have a 16-bit
+			 * port number it must also be in the range 1..65535
+			 * 0 is not allowed because that means "next available"
+			 * on just about every system and therefore cannot be used
+			 */
+			unsigned long pnum = 0;
+			spanned = strspn(url, URL_DIGIT);
+			if (spanned < slash_ptr - url) {
+				/* port number has invalid characters */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid port number");
+				}
+				strbuf_release(&norm);
+				return NULL;
+			}
+			if (slash_ptr - url <= 5)
+				pnum = strtoul(url, NULL, 10);
+			if (pnum == 0 || pnum > 65535) {
+				/* port number not in range 1..65535 */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid port number");
+				}
+				strbuf_release(&norm);
+				return NULL;
+			}
+			strbuf_addch(&norm, ':');
+			strbuf_add(&norm, url, slash_ptr - url);
+			port_len = slash_ptr - url;
+		}
+		url_len -= slash_ptr - colon_ptr;
+		url = slash_ptr;
+	}
+	if (host_off)
+		host_len = norm.len - host_off;
+
+
+	/*
+	 * Now copy the path resolving any . and .. segments being careful not
+	 * to corrupt the URL by unescaping any delimiters, but do add an
+	 * initial '/' if it's missing and do normalize any %-escape sequences.
+	 */
+	path_off = norm.len;
+	path_start = norm.buf + path_off;
+	strbuf_addch(&norm, '/');
+	if (*url == '/') {
+		url++;
+		url_len--;
+	}
+	for (;;) {
+		const char *seg_start = norm.buf + norm.len;
+		const char *next_slash = url + strcspn(url, "/?#");
+		int skip_add_slash = 0;
+		/*
+		 * RFC 3986 indicates that any . or .. segments should be
+		 * unescaped before being checked for.
+		 */
+		if (!append_normalized_escapes(&norm, url, next_slash - url, "",
+					       URL_RESERVED)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("invalid %XX escape sequence");
+			}
+			strbuf_release(&norm);
+			return NULL;
+		}
+		if (!strcmp(seg_start, ".")) {
+			/* ignore a . segment; be careful not to remove initial '/' */
+			if (seg_start == path_start + 1) {
+				strbuf_setlen(&norm, norm.len - 1);
+				skip_add_slash = 1;
+			} else {
+				strbuf_setlen(&norm, norm.len - 2);
+			}
+		} else if (!strcmp(seg_start, "..")) {
+			/*
+			 * ignore a .. segment and remove the previous segment;
+			 * be careful not to remove initial '/' from path
+			 */
+			const char *prev_slash = norm.buf + norm.len - 3;
+			if (prev_slash == path_start) {
+				/* invalid .. because no previous segment to remove */
+				if (out_info) {
+					out_info->url = NULL;
+					out_info->err = _("invalid '..' path segment");
+				}
+				strbuf_release(&norm);
+				return NULL;
+			}
+			while (*--prev_slash != '/') {}
+			if (prev_slash == path_start) {
+				strbuf_setlen(&norm, prev_slash - norm.buf + 1);
+				skip_add_slash = 1;
+			} else {
+				strbuf_setlen(&norm, prev_slash - norm.buf);
+			}
+		}
+		url_len -= next_slash - url;
+		url = next_slash;
+		/* if the next char is not '/' done with the path */
+		if (*url != '/')
+			break;
+		url++;
+		url_len--;
+		if (!skip_add_slash)
+			strbuf_addch(&norm, '/');
+	}
+	path_len = norm.len - path_off;
+
+
+	/*
+	 * Now simply copy the rest, if any, only normalizing %-escapes and
+	 * being careful not to corrupt the URL by unescaping any delimiters.
+	 */
+	if (*url) {
+		if (!append_normalized_escapes(&norm, url, url_len, "", URL_RESERVED)) {
+			if (out_info) {
+				out_info->url = NULL;
+				out_info->err = _("invalid %XX escape sequence");
+			}
+			strbuf_release(&norm);
+			return NULL;
+		}
+	}
+
+
+	result = strbuf_detach(&norm, &result_len);
+	if (out_info) {
+		out_info->url = result;
+		out_info->err = NULL;
+		out_info->url_len = result_len;
+		out_info->scheme_len = scheme_len;
+		out_info->user_off = user_off;
+		out_info->user_len = user_len;
+		out_info->passwd_off = passwd_off;
+		out_info->passwd_len = passwd_len;
+		out_info->host_off = host_off;
+		out_info->host_len = host_len;
+		out_info->port_len = port_len;
+		out_info->path_off = path_off;
+		out_info->path_len = path_len;
+	}
+	return result;
+}
+
+static size_t url_match_prefix(const char *url,
+			       const char *url_prefix,
+			       size_t url_prefix_len)
+{
+	/*
+	 * url_prefix matches url if url_prefix is an exact match for url or it
+	 * is a prefix of url and the match ends on a path component boundary.
+	 * Both url and url_prefix are considered to have an implicit '/' on the
+	 * end for matching purposes if they do not already.
+	 *
+	 * url must be NUL terminated.  url_prefix_len is the length of
+	 * url_prefix which need not be NUL terminated.
+	 *
+	 * The return value is the length of the match in characters (including
+	 * the final '/' even if it's implicit) or 0 for no match.
+	 *
+	 * Passing NULL as url and/or url_prefix will always cause 0 to be
+	 * returned without causing any faults.
+	 */
+	if (!url || !url_prefix)
+		return 0;
+	if (!url_prefix_len || (url_prefix_len == 1 && *url_prefix == '/'))
+		return (!*url || *url == '/') ? 1 : 0;
+	if (url_prefix[url_prefix_len - 1] == '/')
+		url_prefix_len--;
+	if (strncmp(url, url_prefix, url_prefix_len))
+		return 0;
+	if ((strlen(url) == url_prefix_len) || (url[url_prefix_len] == '/'))
+		return url_prefix_len + 1;
+	return 0;
+}
+
+int match_urls(const struct url_info *url,
+	       const struct url_info *url_prefix,
+	       int *exactusermatch)
+{
+	/*
+	 * url_prefix matches url if the scheme, host and port of url_prefix
+	 * are the same as those of url and the path portion of url_prefix
+	 * is the same as the path portion of url or it is a prefix that
+	 * matches at a '/' boundary.  If url_prefix contains a user name,
+	 * that must also exactly match the user name in url.
+	 *
+	 * If the user, host, port and path match in this fashion, the returned
+	 * value is the length of the path match including any implicit
+	 * final '/'.  For example, "http://me@example.com/path" is matched by
+	 * "http://example.com" with a path length of 1.
+	 *
+	 * If there is a match and exactusermatch is not NULL, then
+	 * *exactusermatch will be set to true if both url and url_prefix
+	 * contained a user name or false if url_prefix did not have a
+	 * user name.  If there is no match *exactusermatch is left untouched.
+	 */
+	int usermatched = 0;
+	int pathmatchlen;
+
+	if (!url || !url_prefix || !url->url || !url_prefix->url)
+		return 0;
+
+	/* check the scheme */
+	if (url_prefix->scheme_len != url->scheme_len ||
+	    strncmp(url->url, url_prefix->url, url->scheme_len))
+		return 0; /* schemes do not match */
+
+	/* check the user name if url_prefix has one */
+	if (url_prefix->user_off) {
+		if (!url->user_off || url->user_len != url_prefix->user_len ||
+		    strncmp(url->url + url->user_off,
+			    url_prefix->url + url_prefix->user_off,
+			    url->user_len))
+			return 0; /* url_prefix has a user but it's not a match */
+		usermatched = 1;
+	}
+
+	/* check the host and port */
+	if (url_prefix->host_len != url->host_len ||
+	    strncmp(url->url + url->host_off,
+		    url_prefix->url + url_prefix->host_off, url->host_len))
+		return 0; /* host names and/or ports do not match */
+
+	/* check the path */
+	pathmatchlen = url_match_prefix(
+		url->url + url->path_off,
+		url_prefix->url + url_prefix->path_off,
+		url_prefix->url_len - url_prefix->path_off);
+
+	if (pathmatchlen && exactusermatch)
+		*exactusermatch = usermatched;
+	return pathmatchlen;
+}
diff --git a/urlmatch.h b/urlmatch.h
new file mode 100644
index 00000000..b67f57f8
--- /dev/null
+++ b/urlmatch.h
@@ -0,0 +1,37 @@
+#ifndef URL_MATCH_H
+#define URL_MATCH_H
+#include "string-list.h"
+
+struct url_info {
+	/* normalized url on success, must be freed, otherwise NULL */
+	char *url;
+	/* if !url, a brief reason for the failure, otherwise NULL */
+	const char *err;
+
+	/* the rest of the fields are only set if url != NULL */
+
+	size_t url_len;		/* total length of url (which is now normalized) */
+	size_t scheme_len;	/* length of scheme name (excluding final :) */
+	size_t user_off;	/* offset into url to start of user name (0 => none) */
+	size_t user_len;	/* length of user name; if user_off != 0 but
+				   user_len == 0, an empty user name was given */
+	size_t passwd_off;	/* offset into url to start of passwd (0 => none) */
+	size_t passwd_len;	/* length of passwd; if passwd_off != 0 but
+				   passwd_len == 0, an empty passwd was given */
+	size_t host_off;	/* offset into url to start of host name (0 => none) */
+	size_t host_len;	/* length of host name; this INCLUDES any ':portnum';
+				 * file urls may have host_len == 0 */
+	size_t port_len;	/* if a portnum is present (port_len != 0), it has
+				 * this length (excluding the leading ':') at the
+				 * end of the host name (always 0 for file urls) */
+	size_t path_off;	/* offset into url to the start of the url path;
+				 * this will always point to a '/' character
+				 * after the url has been normalized */
+	size_t path_len;	/* length of path portion excluding any trailing
+				 * '?...' and '#...' portion; will always be >= 1 */
+};
+
+extern char *url_normalize(const char *, struct url_info *);
+extern int match_urls(const struct url_info *url, const struct url_info *url_prefix, int *exactusermatch);
+
+#endif /* URL_MATCH_H */
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH ALTERNATIVE v6 4/4] config: parse http.<url>.<variable> using urlmatch
  2013-07-31 20:51   ` [PATCH ALTERNATIVE v6 0/2] http.<url>.<key> and friends Kyle J. McKay
  2013-07-31 20:52     ` [PATCH ALTERNATIVE v6 2/4] config: add helper to normalize and match URLs Kyle J. McKay
@ 2013-07-31 20:52     ` Kyle J. McKay
  2013-07-31 22:01     ` [PATCH ALTERNATIVE v6 0/2] http.<url>.<key> and friends Junio C Hamano
  2013-07-31 22:41     ` [PATCH ALTERNATIVE v6.v2 4/6] config: parse http.<url>.<variable> using urlmatch Kyle J. McKay
  3 siblings, 0 replies; 23+ messages in thread
From: Kyle J. McKay @ 2013-07-31 20:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

Use urlmatch_config_entry() to wrap the underlying
http_options() two-level variable parser in order to set
http.<variable> to the value with the most specific URL in the
configuration.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
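
Not part of the patch: a minimal standalone sketch of the slash-boundary
prefix rule that the Documentation/config.txt hunk in this patch describes
for the path field. It is modeled on url_match_prefix() from urlmatch.c;
the name path_prefix_match() is hypothetical, and both arguments are
assumed to be NUL-terminated, already-normalized paths.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper modeled on url_match_prefix() in urlmatch.c.
 * Returns the length of the match, counting the final '/' even when
 * it is only implicit, or 0 if prefix does not match path on a '/'
 * boundary.
 */
size_t path_prefix_match(const char *path, const char *prefix)
{
	size_t prefix_len;

	assert(path && prefix);
	prefix_len = strlen(prefix);

	/* an empty prefix or a bare "/" matches the root of any path */
	if (!prefix_len || (prefix_len == 1 && *prefix == '/'))
		return (!*path || *path == '/') ? 1 : 0;
	/* ignore one trailing '/' so "/foo/" and "/foo" behave alike */
	if (prefix[prefix_len - 1] == '/')
		prefix_len--;
	if (strncmp(path, prefix, prefix_len))
		return 0;
	/* the match must stop at the end of path or on a '/' */
	if (path[prefix_len] == '\0' || path[prefix_len] == '/')
		return prefix_len + 1;
	return 0;
}
```

With this sketch, a prefix of "/foo/" matches the path "/foo/bar" with
length 5, while a prefix of "/foo" does not match "/foobar" at all
because the match would end in the middle of a path component.
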
 .gitignore               |   1 +
 Documentation/config.txt |  46 +++++++++++
 Makefile                 |   7 ++
 http.c                   |  13 +++-
 t/.gitattributes         |   1 +
 t/t5200-url-normalize.sh | 199 +++++++++++++++++++++++++++++++++++++++++++++++
 t/t5200/README           |  18 +++++
 t/t5200/config-1         |   8 ++
 t/t5200/config-2         |   3 +
 t/t5200/config-3         |   4 +
 t/t5200/url-1            | Bin 0 -> 20 bytes
 t/t5200/url-10           | Bin 0 -> 23 bytes
 t/t5200/url-11           | Bin 0 -> 25 bytes
 t/t5200/url-2            | Bin 0 -> 20 bytes
 t/t5200/url-3            | Bin 0 -> 23 bytes
 t/t5200/url-4            | Bin 0 -> 23 bytes
 t/t5200/url-5            | Bin 0 -> 23 bytes
 t/t5200/url-6            | Bin 0 -> 23 bytes
 t/t5200/url-7            | Bin 0 -> 23 bytes
 t/t5200/url-8            | Bin 0 -> 23 bytes
 t/t5200/url-9            | Bin 0 -> 23 bytes
 test-url-normalize.c     | 139 +++++++++++++++++++++++++++++++++
 22 files changed, 438 insertions(+), 1 deletion(-)
 create mode 100755 t/t5200-url-normalize.sh
 create mode 100644 t/t5200/README
 create mode 100644 t/t5200/config-1
 create mode 100644 t/t5200/config-2
 create mode 100644 t/t5200/config-3
 create mode 100644 t/t5200/url-1
 create mode 100644 t/t5200/url-10
 create mode 100644 t/t5200/url-11
 create mode 100644 t/t5200/url-2
 create mode 100644 t/t5200/url-3
 create mode 100644 t/t5200/url-4
 create mode 100644 t/t5200/url-5
 create mode 100644 t/t5200/url-6
 create mode 100644 t/t5200/url-7
 create mode 100644 t/t5200/url-8
 create mode 100644 t/t5200/url-9
 create mode 100644 test-url-normalize.c
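
Not part of the patch: a rough sketch of the precedence rule stated in
the documentation hunk of this patch, where a longer path match beats a
user name match. struct match_score and is_better_match() are
hypothetical names used only for illustration, and how two equally good
matches are resolved is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of how well one config key matched a URL. */
struct match_score {
	size_t path_len;	/* path match length; 0 means no match */
	int user_matched;	/* nonzero if the key named a user and it matched */
};

/*
 * Return nonzero if candidate is strictly better than best: a longer
 * path match wins, and a user name match only breaks a tie between
 * two path matches of equal length.
 */
int is_better_match(const struct match_score *candidate,
		    const struct match_score *best)
{
	assert(candidate && best);
	if (!candidate->path_len)
		return 0;
	if (candidate->path_len != best->path_len)
		return candidate->path_len > best->path_len;
	return !!candidate->user_matched > !!best->user_matched;
}
```

For the URL `https://user@example.com/foo/bar`, the key
`https://example.com/foo` would score a path length of 5 without a user
match, and the key `https://user@example.com` would score a path length
of 1 with a user match, so the first key is preferred.
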

diff --git a/.gitignore b/.gitignore
index 6669bf0c..cd97e16a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -198,6 +198,7 @@
 /test-string-list
 /test-subprocess
 /test-svn-fe
+/test-url-normalize
 /test-wildmatch
 /common-cmds.h
 *.tar.gz
diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6e53fc50..8cc0fd78 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1513,6 +1513,52 @@ http.useragent::
 	of common USER_AGENT strings (but not including those like git/1.7.1).
 	Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
 
+http.<url>.*::
+	Any of the http.* options above can be applied selectively to some URLs.
+	For a config key to match a URL, each element of the config key is
+	compared to that of the URL, in the following order:
++
+--
+. Scheme (e.g., `https` in `https://example.com/`). This field
+  must match exactly between the config key and the URL.
+
+. Host/domain name (e.g., `example.com` in `https://example.com/`).
+  This field must match exactly between the config key and the URL.
+
+. Port number (e.g., `8080` in `http://example.com:8080/`).
+  This field must match exactly between the config key and the URL.
+  Omitted port numbers are automatically converted to the correct
+  default for the scheme before matching.
+
+. Path (e.g., `repo.git` in `https://example.com/repo.git`). The
+  path field of the config key must match the path field of the URL
+  either exactly or as a prefix of slash-delimited path elements.  This means
+  a config key with path `foo/` matches URL path `foo/bar`.  A prefix can only
+  match on a slash (`/`) boundary.  Longer matches take precedence (so a config
+  key with path `foo/bar` is a better match to URL path `foo/bar` than a config
+  key with just path `foo/`).
+
+. User name (e.g., `user` in `https://user@example.com/repo.git`). If
+  the config key has a user name it must match the user name in the
+  URL exactly. If the config key does not have a user name,
+  that config key will match a URL with any user name
+  (including none), but at a lower precedence than a config key
+  with a user name.
+--
++
+The list above is ordered by decreasing precedence; a URL that matches
+a config key's path is preferred to one that matches its user name. For example,
+if the URL is `https://user@example.com/foo/bar` a config key match of
+`https://example.com/foo` will be preferred over a config key match of
+`https://user@example.com`.
++
+All URLs are normalized before attempting any matching (the password part,
+if embedded in the URL, is always ignored for matching purposes) so that
+equivalent URLs that are simply spelled differently will match properly.
+Environment variable settings always override any matches.  The URLs that are
+matched against are those given directly to Git commands.  This means any URLs
+visited as a result of a redirection do not participate in matching.
+
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
 	does not care per se, but this information is necessary e.g. when
diff --git a/Makefile b/Makefile
index 0f931a20..ea3edbae 100644
--- a/Makefile
+++ b/Makefile
@@ -567,6 +567,7 @@ TEST_PROGRAMS_NEED_X += test-sigchain
 TEST_PROGRAMS_NEED_X += test-string-list
 TEST_PROGRAMS_NEED_X += test-subprocess
 TEST_PROGRAMS_NEED_X += test-svn-fe
+TEST_PROGRAMS_NEED_X += test-url-normalize
 TEST_PROGRAMS_NEED_X += test-wildmatch
 
 TEST_PROGRAMS = $(patsubst %,%$X,$(TEST_PROGRAMS_NEED_X))
@@ -721,6 +722,7 @@ LIB_H += tree-walk.h
 LIB_H += tree.h
 LIB_H += unpack-trees.h
 LIB_H += url.h
+LIB_H += urlmatch.h
 LIB_H += userdiff.h
 LIB_H += utf8.h
 LIB_H += varint.h
@@ -868,6 +870,7 @@ LIB_OBJS += tree.o
 LIB_OBJS += tree-walk.o
 LIB_OBJS += unpack-trees.o
 LIB_OBJS += url.o
+LIB_OBJS += urlmatch.o
 LIB_OBJS += usage.o
 LIB_OBJS += userdiff.o
 LIB_OBJS += utf8.o
@@ -2235,6 +2238,10 @@ test-parse-options$X: parse-options.o parse-options-cb.o
 
 test-svn-fe$X: vcs-svn/lib.a
 
+test-url-normalize$X: test-url-normalize.o GIT-LDFLAGS $(GITLIBS)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
+		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
+
 .PRECIOUS: $(TEST_OBJS)
 
 test-%$X: test-%.o GIT-LDFLAGS $(GITLIBS)
diff --git a/http.c b/http.c
index 37986f82..5eda356f 100644
--- a/http.c
+++ b/http.c
@@ -3,6 +3,7 @@
 #include "sideband.h"
 #include "run-command.h"
 #include "url.h"
+#include "urlmatch.h"
 #include "credential.h"
 #include "version.h"
 #include "pkt-line.h"
@@ -334,10 +335,20 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
 	char *low_speed_time;
+	char *normalized_url;
+	struct urlmatch_config config = { STRING_LIST_INIT_DUP };
+
+	config.section = "http";
+	config.key = NULL;
+	config.collect_fn = http_options;
+	config.cascade_fn = git_default_config;
+	config.cb = NULL;
 
 	http_is_verbose = 0;
+	normalized_url = url_normalize(url, &config.url);
 
-	git_config(http_options, NULL);
+	git_config(urlmatch_config_entry, &config);
+	free(normalized_url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
diff --git a/t/.gitattributes b/t/.gitattributes
index 1b97c546..f6f1df35 100644
--- a/t/.gitattributes
+++ b/t/.gitattributes
@@ -1 +1,2 @@
 t[0-9][0-9][0-9][0-9]/* -whitespace
+t5200/url-* binary
diff --git a/t/t5200-url-normalize.sh b/t/t5200-url-normalize.sh
new file mode 100755
index 00000000..f79bb0fb
--- /dev/null
+++ b/t/t5200-url-normalize.sh
@@ -0,0 +1,199 @@
+#!/bin/sh
+
+test_description='url normalization'
+. ./test-lib.sh
+
+if test -n "$NO_CURL"; then
+	skip_all='skipping test, git built without http support'
+	test_done
+fi
+
+# The base name of the test url files
+tu="$TEST_DIRECTORY/t5200/url"
+
+# The base name of the test config files
+tc="$TEST_DIRECTORY/t5200/config"
+
+# Note that only file: URLs should be allowed without a host
+
+test_expect_success 'url scheme' '
+	! test-url-normalize "" &&
+	! test-url-normalize "_" &&
+	! test-url-normalize "scheme" &&
+	! test-url-normalize "scheme:" &&
+	! test-url-normalize "scheme:/" &&
+	! test-url-normalize "scheme://" &&
+	! test-url-normalize "file" &&
+	! test-url-normalize "file:" &&
+	! test-url-normalize "file:/" &&
+	test-url-normalize "file://" &&
+	! test-url-normalize "://acme.co" &&
+	! test-url-normalize "x_test://acme.co" &&
+	! test-url-normalize "-test://acme.co" &&
+	! test-url-normalize "0test://acme.co" &&
+	! test-url-normalize "+test://acme.co" &&
+	! test-url-normalize ".test://acme.co" &&
+	! test-url-normalize "schem%6e://" &&
+	test-url-normalize "x-Test+v1.0://acme.co" &&
+	test "$(test-url-normalize -p "AbCdeF://x.Y")" = "abcdef://x.y/"
+'
+
+test_expect_success 'url authority' '
+	! test-url-normalize "scheme://user:pass@" &&
+	! test-url-normalize "scheme://?" &&
+	! test-url-normalize "scheme://#" &&
+	! test-url-normalize "scheme:///" &&
+	! test-url-normalize "scheme://:" &&
+	! test-url-normalize "scheme://:555" &&
+	test-url-normalize "file://user:pass@" &&
+	test-url-normalize "file://?" &&
+	test-url-normalize "file://#" &&
+	test-url-normalize "file:///" &&
+	test-url-normalize "file://:" &&
+	! test-url-normalize "file://:555" &&
+	test-url-normalize "scheme://user:pass@host" &&
+	test-url-normalize "scheme://@host" &&
+	test-url-normalize "scheme://%00@host" &&
+	! test-url-normalize "scheme://%%@host" &&
+	! test-url-normalize "scheme://host_" &&
+	test-url-normalize "scheme://user:pass@host/" &&
+	test-url-normalize "scheme://@host/" &&
+	test-url-normalize "scheme://host/" &&
+	test-url-normalize "scheme://host?x" &&
+	test-url-normalize "scheme://host#x" &&
+	test-url-normalize "scheme://host/@" &&
+	test-url-normalize "scheme://host?@x" &&
+	test-url-normalize "scheme://host#@x" &&
+	test-url-normalize "scheme://[::1]" &&
+	test-url-normalize "scheme://[::1]/" &&
+	! test-url-normalize "scheme://hos%41/" &&
+	test-url-normalize "scheme://[invalid....:/" &&
+	test-url-normalize "scheme://invalid....:]/" &&
+	! test-url-normalize "scheme://invalid....:[/" &&
+	! test-url-normalize "scheme://invalid....:["
+'
+
+test_expect_success 'url port checks' '
+	test-url-normalize "xyz://q@some.host:" &&
+	test-url-normalize "xyz://q@some.host:456/" &&
+	! test-url-normalize "xyz://q@some.host:0" &&
+	! test-url-normalize "xyz://q@some.host:0000000" &&
+	test-url-normalize "xyz://q@some.host:0000001?" &&
+	test-url-normalize "xyz://q@some.host:065535#" &&
+	test-url-normalize "xyz://q@some.host:65535" &&
+	! test-url-normalize "xyz://q@some.host:65536" &&
+	! test-url-normalize "xyz://q@some.host:99999" &&
+	! test-url-normalize "xyz://q@some.host:100000" &&
+	! test-url-normalize "xyz://q@some.host:100001" &&
+	test-url-normalize "http://q@some.host:80" &&
+	test-url-normalize "https://q@some.host:443" &&
+	test-url-normalize "http://q@some.host:80/" &&
+	test-url-normalize "https://q@some.host:443?" &&
+	! test-url-normalize "http://q@:8008" &&
+	! test-url-normalize "http://:8080" &&
+	! test-url-normalize "http://:" &&
+	test-url-normalize "xyz://q@some.host:456/" &&
+	test-url-normalize "xyz://[::1]:456/" &&
+	test-url-normalize "xyz://[::1]:/" &&
+	! test-url-normalize "xyz://[::1]:000/" &&
+	! test-url-normalize "xyz://[::1]:0%300/" &&
+	! test-url-normalize "xyz://[::1]:0x80/" &&
+	! test-url-normalize "xyz://[::1]:4294967297/" &&
+	! test-url-normalize "xyz://[::1]:030f/"
+'
+
+test_expect_success 'url port normalization' '
+	test "$(test-url-normalize -p "http://x:800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:0800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:00000800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:065535")" = "http://x:65535/" &&
+	test "$(test-url-normalize -p "http://x:1")" = "http://x:1/" &&
+	test "$(test-url-normalize -p "http://x:80")" = "http://x/" &&
+	test "$(test-url-normalize -p "http://x:080")" = "http://x/" &&
+	test "$(test-url-normalize -p "http://x:000000080")" = "http://x/" &&
+	test "$(test-url-normalize -p "https://x:443")" = "https://x/" &&
+	test "$(test-url-normalize -p "https://x:0443")" = "https://x/" &&
+	test "$(test-url-normalize -p "https://x:000000443")" = "https://x/"
+'
+
+test_expect_success 'url general escapes' '
+	! test-url-normalize "http://x.y?%fg" &&
+	test "$(test-url-normalize -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
+	test "$(test-url-normalize -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
+	test "$(test-url-normalize -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
+	test "$(test-url-normalize -p "X://W/'\''")" = "x://w/'\''" &&
+	test "$(test-url-normalize -p "X://W?'\!'")" = "x://w/?'\!'"
+'
+
+test_expect_success 'url high-bit escapes' '
+	test "$(test-url-normalize -p "$(cat "$tu-1")")" = "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12" &&
+	test "$(test-url-normalize -p "$(cat "$tu-2")")" = "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-3")")" = "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-4")")" = "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-5")")" = "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-6")")" = "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-7")")" = "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-8")")" = "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-9")")" = "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-10")")" = "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-11")")" = "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD"
+'
+
+test_expect_success 'url username/password escapes' '
+	test "$(test-url-normalize -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
+'
+
+test_expect_success 'url normalized lengths' '
+	test "$(test-url-normalize -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
+	test "$(test-url-normalize -l "http://%41:%42@x.y/%61/")" = 17 &&
+	test "$(test-url-normalize -l "http://@x.y/^")" = 15
+'
+
+test_expect_success 'url . and .. segments' '
+	test "$(test-url-normalize -p "x://y/.")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/./")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/a/.")" = "x://y/a" &&
+	test "$(test-url-normalize -p "x://y/a/./")" = "x://y/a/" &&
+	test "$(test-url-normalize -p "x://y/.?")" = "x://y/?" &&
+	test "$(test-url-normalize -p "x://y/./?")" = "x://y/?" &&
+	test "$(test-url-normalize -p "x://y/a/.?")" = "x://y/a?" &&
+	test "$(test-url-normalize -p "x://y/a/./?")" = "x://y/a/?" &&
+	test "$(test-url-normalize -p "x://y/a/./b/.././../c")" = "x://y/c" &&
+	test "$(test-url-normalize -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
+	test "$(test-url-normalize -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
+	! test-url-normalize "x://y/a/./b/.././../c/././.././.." &&
+	test "$(test-url-normalize -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
+	test "$(test-url-normalize -p "x://y/%2e/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/%2E/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/a/%2e./")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/b/.%2E/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/c/%2e%2E/")" = "x://y/"
+'
+
+# http://@foo specifies an empty user name but does not specify a password
+# http://foo  specifies neither a user name nor a password
+# So they should not be equivalent
+test_expect_success 'url equivalents' '
+	test-url-normalize "httP://x" "Http://X/" &&
+	test-url-normalize "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
+	! test-url-normalize "https://@x.y/^" "httpS://x.y:443/^" &&
+	test-url-normalize "https://@x.y/^" "httpS://@x.y:0443/^" &&
+	test-url-normalize "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
+	test-url-normalize "https://@x.y/^/.." "httpS://@x.y:0443/"
+'
+
+test_expect_success 'url config normalization matching' '
+	test "$(test-url-normalize -c "$tc-1" "useragent" "https://other.example.com/")" = "other-agent" &&
+	test "$(test-url-normalize -c "$tc-1" "useragent" "https://example.com/")" = "example-agent" &&
+	test "$(test-url-normalize -c "$tc-1" "sslVerify" "https://example.com/")" = "false" &&
+	test "$(test-url-normalize -c "$tc-1" "useragent" "https://example.com/path/sub")" = "path-agent" &&
+	test "$(test-url-normalize -c "$tc-1" "sslVerify" "https://example.com/path/sub")" = "false" &&
+	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://elsewhere.com/")" = "true" &&
+	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://example.com")" = "true" &&
+	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://example.com/path")" = "true" &&
+	test "$(test-url-normalize -c "$tc-2" "useragent" "HTTPS://example.COM/p%61th")" = "example-agent" &&
+	test "$(test-url-normalize -c "$tc-2" "sslVerify" "HTTPS://example.COM/p%61th")" = "false" &&
+	test "$(test-url-normalize -c "$tc-3" "sslcainfo" "https://user@example.com/path/name/here")" = "file-1"
+'
+
+test_done
diff --git a/t/t5200/README b/t/t5200/README
new file mode 100644
index 00000000..e3a67d94
--- /dev/null
+++ b/t/t5200/README
@@ -0,0 +1,18 @@
+The url data files in this directory contain URLs with characters
+in the range 0x01-0x1f and 0x7f-0xff to test the proper normalization
+of unprintable characters.
+
+A select few characters in the 0x01-0x1f range are skipped to help
+avoid problems running the test itself.
+
+The urls are in test files in this directory rather than being
+embedded in the test script for portability.
+
+The config data files in this directory represent configurations
+to be parsed by http_options so that the final option value can be
+tested.
+
+The config files may contain more than one same-named section to
+simulate having a system, global and .git config file.
+
+
diff --git a/t/t5200/config-1 b/t/t5200/config-1
new file mode 100644
index 00000000..8aaf23c4
--- /dev/null
+++ b/t/t5200/config-1
@@ -0,0 +1,8 @@
+[http]
+	useragent = other-agent
+	noEPSV = true
+[http "https://example.com"]
+	useragent = example-agent
+	sslVerify = false
+[http "https://example.com/path"]
+	useragent = path-agent
diff --git a/t/t5200/config-2 b/t/t5200/config-2
new file mode 100644
index 00000000..749f4bd5
--- /dev/null
+++ b/t/t5200/config-2
@@ -0,0 +1,3 @@
+[http "https://example.com/path"]
+	useragent = example-agent
+	sslVerify = false
diff --git a/t/t5200/config-3 b/t/t5200/config-3
new file mode 100644
index 00000000..5c8d3e85
--- /dev/null
+++ b/t/t5200/config-3
@@ -0,0 +1,4 @@
+[http "https://example.com/path/name"]
+	sslcainfo = file-1
+[http "https://user@example.com/path"]
+	sslcainfo = file-2
diff --git a/t/t5200/url-1 b/t/t5200/url-1
new file mode 100644
index 0000000000000000000000000000000000000000..519019c5ce6c58478f048a2f39e2321370d318c6
GIT binary patch
literal 20
bcmb=h($_E4XJle#VP#|I;Nuq%6ygE^Admtt

literal 0
HcmV?d00001

diff --git a/t/t5200/url-10 b/t/t5200/url-10
new file mode 100644
index 0000000000000000000000000000000000000000..b9965de6a5d74b122179821212b2c27c8ae03e80
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFYxj5^Yr!h_xSnx`~3a>{|dCd5i<Y)

literal 0
HcmV?d00001

diff --git a/t/t5200/url-11 b/t/t5200/url-11
new file mode 100644
index 0000000000000000000000000000000000000000..f0a50f10096a20d597f40c775f09a71276e0050a
GIT binary patch
literal 25
hcmb=h($_E4Kh$u4|APe$@AvQhFrlI0!}|Suxd5(W4xs=5

literal 0
HcmV?d00001

diff --git a/t/t5200/url-2 b/t/t5200/url-2
new file mode 100644
index 0000000000000000000000000000000000000000..43334b05b2de3794d6020abd96e634a4e9e49cb0
GIT binary patch
literal 20
bcmb=h($_E47Zwo}6PJ*bmXVc{ujc{)C{+Vx

literal 0
HcmV?d00001

diff --git a/t/t5200/url-3 b/t/t5200/url-3
new file mode 100644
index 0000000000000000000000000000000000000000..7378c7bec247b996bc67b00a05ed89cf47d4b7a7
GIT binary patch
literal 23
ecmb=h($_E4Z)j|4ZfR|6@96C6?&<C8=K=t7Jqj}b

literal 0
HcmV?d00001

diff --git a/t/t5200/url-4 b/t/t5200/url-4
new file mode 100644
index 0000000000000000000000000000000000000000..220b198c97f942fea4960f51a2105cc42261061a
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFOZRvla!T~mzbHFo1C4Vp9*`u3o`%!

literal 0
HcmV?d00001

diff --git a/t/t5200/url-5 b/t/t5200/url-5
new file mode 100644
index 0000000000000000000000000000000000000000..1ccd9277792840955bb124bdde21f4b08bcccb63
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFQB2Kqok##r>Lo_tE{cAuL^}d3^M=#

literal 0
HcmV?d00001

diff --git a/t/t5200/url-6 b/t/t5200/url-6
new file mode 100644
index 0000000000000000000000000000000000000000..e8283aac6dff049d3e02454db6e684c5790a5996
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFR-z)v$VCgx45~wyS%-=zY31M4Kn}$

literal 0
HcmV?d00001

diff --git a/t/t5200/url-7 b/t/t5200/url-7
new file mode 100644
index 0000000000000000000000000000000000000000..fa7c10b615259deefd15b638b021da7c60eba1b2
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFTlaV!^FkL$H>Xb%goKr&kC454l@7%

literal 0
HcmV?d00001

diff --git a/t/t5200/url-8 b/t/t5200/url-8
new file mode 100644
index 0000000000000000000000000000000000000000..79a0ba836f5b8886b0a73f161eb292af2b105e65
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFVNA_)6~`0*Vx(G+uYsW-wL6<4>JG&

literal 0
HcmV?d00001

diff --git a/t/t5200/url-9 b/t/t5200/url-9
new file mode 100644
index 0000000000000000000000000000000000000000..8b44bec48b94467c63e8e1ad18162e465da6d6dd
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFW}+g<K*S$=jiF`>+J3B?+U9u5HkP(

literal 0
HcmV?d00001

diff --git a/test-url-normalize.c b/test-url-normalize.c
new file mode 100644
index 00000000..80437217
--- /dev/null
+++ b/test-url-normalize.c
@@ -0,0 +1,139 @@
+#ifdef NO_CURL
+
+int main(void)
+{
+	return 125;
+}
+
+#else /* !NO_CURL */
+
+#include "http.c"
+
+static int run_http_options(const char *file,
+			    const char *opt,
+			    const struct url_info *info)
+{
+	struct strbuf opt_lc;
+	size_t i, len;
+	struct urlmatch_config config = { STRING_LIST_INIT_DUP };
+
+	memcpy(&config.url, info, sizeof(*info));
+	config.section = "http";
+	config.collect_fn = http_options;
+	config.cascade_fn = git_default_config;
+	config.cb = NULL;
+
+	if (git_config_with_options(urlmatch_config_entry, &config, file, 0))
+		return 1;
+
+	len = strlen(opt);
+	strbuf_init(&opt_lc, len);
+	for (i = 0; i < len; ++i) {
+		strbuf_addch(&opt_lc, tolower(opt[i]));
+	}
+
+	if (!strcmp("sslverify", opt_lc.buf))
+		printf("%s\n", curl_ssl_verify ? "true" : "false");
+	else if (!strcmp("sslcert", opt_lc.buf))
+		printf("%s\n", ssl_cert);
+#if LIBCURL_VERSION_NUM >= 0x070903
+	else if (!strcmp("sslkey", opt_lc.buf))
+		printf("%s\n", ssl_key);
+#endif
+#if LIBCURL_VERSION_NUM >= 0x070908
+	else if (!strcmp("sslcapath", opt_lc.buf))
+		printf("%s\n", ssl_capath);
+#endif
+	else if (!strcmp("sslcainfo", opt_lc.buf))
+		printf("%s\n", ssl_cainfo);
+	else if (!strcmp("sslcertpasswordprotected", opt_lc.buf))
+		printf("%s\n", ssl_cert_password_required ? "true" : "false");
+	else if (!strcmp("ssltry", opt_lc.buf))
+		printf("%s\n", curl_ssl_try ? "true" : "false");
+	else if (!strcmp("minsessions", opt_lc.buf))
+		printf("%d\n", min_curl_sessions);
+#ifdef USE_CURL_MULTI
+	else if (!strcmp("maxrequests", opt_lc.buf))
+		printf("%d\n", max_requests);
+#endif
+	else if (!strcmp("lowspeedlimit", opt_lc.buf))
+		printf("%ld\n", curl_low_speed_limit);
+	else if (!strcmp("lowspeedtime", opt_lc.buf))
+		printf("%ld\n", curl_low_speed_time);
+	else if (!strcmp("noepsv", opt_lc.buf))
+		printf("%s\n", curl_ftp_no_epsv ? "true" : "false");
+	else if (!strcmp("proxy", opt_lc.buf))
+		printf("%s\n", curl_http_proxy);
+	else if (!strcmp("cookiefile", opt_lc.buf))
+		printf("%s\n", curl_cookie_file);
+	else if (!strcmp("postbuffer", opt_lc.buf))
+		printf("%u\n", (unsigned)http_post_buffer);
+	else if (!strcmp("useragent", opt_lc.buf))
+		printf("%s\n", user_agent);
+
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	const char *usage = "test-url-normalize [-p | -l] <url1> | <url1> <url2>"
+		" | -c file option <url1>";
+	char *url1, *url2;
+	int opt_p = 0, opt_l = 0, opt_c = 0;
+	char *file = NULL, *optname = NULL;
+
+	/*
+	 * For one url, succeed if url_normalize succeeds on it, fail otherwise.
+	 * For two urls, succeed only if url_normalize succeeds on both and
+	 * the results compare equal with strcmp.  If -p is given (one url only)
+	 * and url_normalize succeeds, print the result followed by "\n".  If
+	 * -l is given (one url only) and url_normalize succeeds, print the
+	 * returned length in decimal followed by "\n".
+	 * If -c is given, call git_config_with_options using the specified file
+	 * and http_options and passing the normalized value of the url.  Then
+	 * print the value of 'option' afterwards.  'option' must be one of the
+	 * valid 'http.*' options.
+	 */
+
+	if (argc > 1 && !strcmp(argv[1], "-p")) {
+		opt_p = 1;
+		argc--;
+		argv++;
+	} else if (argc > 1 && !strcmp(argv[1], "-l")) {
+		opt_l = 1;
+		argc--;
+		argv++;
+	} else if (argc > 3 && !strcmp(argv[1], "-c")) {
+		opt_c = 1;
+		file = argv[2];
+		optname = argv[3];
+		argc -= 3;
+		argv += 3;
+	}
+
+	if (argc < 2 || argc > 3)
+		die(usage);
+
+	if (argc == 2) {
+		struct url_info info;
+		url1 = url_normalize(argv[1], &info);
+		if (!url1)
+			return 1;
+		if (opt_p)
+			printf("%s\n", url1);
+		if (opt_l)
+			printf("%u\n", (unsigned)info.url_len);
+		if (opt_c)
+			return run_http_options(file, optname, &info);
+		return 0;
+	}
+
+	if (opt_p || opt_l || opt_c)
+		die(usage);
+
+	url1 = url_normalize(argv[1], NULL);
+	url2 = url_normalize(argv[2], NULL);
+	return (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
+}
+
+#endif /* !NO_CURL */
-- 
1.8.3

* Re: [PATCH ALTERNATIVE v6 0/2] http.<url>.<key> and friends
  2013-07-31 20:51   ` [PATCH ALTERNATIVE v6 0/2] http.<url>.<key> and friends Kyle J. McKay
  2013-07-31 20:52     ` [PATCH ALTERNATIVE v6 2/4] config: add helper to normalize and match URLs Kyle J. McKay
  2013-07-31 20:52     ` [PATCH ALTERNATIVE v6 4/4] config: parse http.<url>.<variable> using urlmatch Kyle J. McKay
@ 2013-07-31 22:01     ` Junio C Hamano
  2013-07-31 22:41     ` [PATCH ALTERNATIVE v6.v2 4/6] config: parse http.<url>.<variable> using urlmatch Kyle J. McKay
  3 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2013-07-31 22:01 UTC (permalink / raw)
  To: Kyle J. McKay; +Cc: git, Jeff King

Thanks for a quick turnaround.  Will replace the two patches with
these and queue.

* [PATCH ALTERNATIVE v6.v2 4/6] config: parse http.<url>.<variable> using urlmatch
  2013-07-31 20:51   ` [PATCH ALTERNATIVE v6 0/2] http.<url>.<key> and friends Kyle J. McKay
                       ` (2 preceding siblings ...)
  2013-07-31 22:01     ` [PATCH ALTERNATIVE v6 0/2] http.<url>.<key> and friends Junio C Hamano
@ 2013-07-31 22:41     ` Kyle J. McKay
  3 siblings, 0 replies; 23+ messages in thread
From: Kyle J. McKay @ 2013-07-31 22:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

Use urlmatch_config_entry() to wrap the underlying
http_options() two-level variable parser in order to set
http.<variable> to the value with the most specific URL in the
configuration.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

Somehow I managed to screw up the previous version of 4/6 by
duplicating a line in the config.txt patch.  Arrggh!  Anyhow,
here's a corrected version of 4/6 with the duplicate line
removed.  I apologize for my clumsiness.

 .gitignore               |   1 +
 Documentation/config.txt |  46 +++++++++++
 Makefile                 |   7 ++
 http.c                   |  13 +++-
 t/.gitattributes         |   1 +
 t/t5200-url-normalize.sh | 199 +++++++++++++++++++++++++++++++++++++++++++++++
 t/t5200/README           |  18 +++++
 t/t5200/config-1         |   8 ++
 t/t5200/config-2         |   3 +
 t/t5200/config-3         |   4 +
 t/t5200/url-1            | Bin 0 -> 20 bytes
 t/t5200/url-10           | Bin 0 -> 23 bytes
 t/t5200/url-11           | Bin 0 -> 25 bytes
 t/t5200/url-2            | Bin 0 -> 20 bytes
 t/t5200/url-3            | Bin 0 -> 23 bytes
 t/t5200/url-4            | Bin 0 -> 23 bytes
 t/t5200/url-5            | Bin 0 -> 23 bytes
 t/t5200/url-6            | Bin 0 -> 23 bytes
 t/t5200/url-7            | Bin 0 -> 23 bytes
 t/t5200/url-8            | Bin 0 -> 23 bytes
 t/t5200/url-9            | Bin 0 -> 23 bytes
 test-url-normalize.c     | 139 +++++++++++++++++++++++++++++++++
 22 files changed, 438 insertions(+), 1 deletion(-)
 create mode 100755 t/t5200-url-normalize.sh
 create mode 100644 t/t5200/README
 create mode 100644 t/t5200/config-1
 create mode 100644 t/t5200/config-2
 create mode 100644 t/t5200/config-3
 create mode 100644 t/t5200/url-1
 create mode 100644 t/t5200/url-10
 create mode 100644 t/t5200/url-11
 create mode 100644 t/t5200/url-2
 create mode 100644 t/t5200/url-3
 create mode 100644 t/t5200/url-4
 create mode 100644 t/t5200/url-5
 create mode 100644 t/t5200/url-6
 create mode 100644 t/t5200/url-7
 create mode 100644 t/t5200/url-8
 create mode 100644 t/t5200/url-9
 create mode 100644 test-url-normalize.c

diff --git a/.gitignore b/.gitignore
index 6669bf0c..cd97e16a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -198,6 +198,7 @@
 /test-string-list
 /test-subprocess
 /test-svn-fe
+/test-url-normalize
 /test-wildmatch
 /common-cmds.h
 *.tar.gz
diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6e53fc50..8cc0fd78 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1513,6 +1513,51 @@ http.useragent::
 	of common USER_AGENT strings (but not including those like git/1.7.1).
 	Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
 
+http.<url>.*::
+	Any of the http.* options above can be applied selectively to some urls.
+	For a config key to match a URL, each element of the config key is
+	compared to that of the URL, in the following order:
++
+--
+. Scheme (e.g., `https` in `https://example.com/`). This field
+  must match exactly between the config key and the URL.
+
+. Host/domain name (e.g., `example.com` in `https://example.com/`).
+  This field must match exactly between the config key and the URL.
+
+. Port number (e.g., `8080` in `http://example.com:8080/`).
+  This field must match exactly between the config key and the URL.
+  Omitted port numbers are automatically converted to the correct
+  default for the scheme before matching.
+
+. Path (e.g., `repo.git` in `https://example.com/repo.git`). The
+  path field of the config key must match the path field of the URL
+  either exactly or as a prefix of slash-delimited path elements.  This means
+  a config key with path `foo/` matches URL path `foo/bar`.  A prefix can only
+  match on a slash (`/`) boundary.  Longer matches take precedence (so a config
+  key with path `foo/bar` is a better match to URL path `foo/bar` than a config
+  key with just path `foo/`).
+
+. User name (e.g., `user` in `https://user@example.com/repo.git`). If
+  the config key has a user name it must match the user name in the
+  URL exactly. If the config key does not have a user name, that
+  config key will match a URL with any user name (including none),
+  but at a lower precedence than a config key with a user name.
+--
++
+The list above is ordered by decreasing precedence; a URL that matches
+a config key's path is preferred to one that matches its user name. For example,
+if the URL is `https://user@example.com/foo/bar` a config key match of
+`https://example.com/foo` will be preferred over a config key match of
+`https://user@example.com`.
++
+All URLs are normalized before attempting any matching (the password part,
+if embedded in the URL, is always ignored for matching purposes) so that
+equivalent urls that are simply spelled differently will match properly.
+Environment variable settings always override any matches.  The urls that are
+matched against are those given directly to Git commands.  This means any URLs
+visited as a result of a redirection do not participate in matching.
+
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
 	does not care per se, but this information is necessary e.g. when
diff --git a/Makefile b/Makefile
index 0f931a20..ea3edbae 100644
--- a/Makefile
+++ b/Makefile
@@ -567,6 +567,7 @@ TEST_PROGRAMS_NEED_X += test-sigchain
 TEST_PROGRAMS_NEED_X += test-string-list
 TEST_PROGRAMS_NEED_X += test-subprocess
 TEST_PROGRAMS_NEED_X += test-svn-fe
+TEST_PROGRAMS_NEED_X += test-url-normalize
 TEST_PROGRAMS_NEED_X += test-wildmatch
 
 TEST_PROGRAMS = $(patsubst %,%$X,$(TEST_PROGRAMS_NEED_X))
@@ -721,6 +722,7 @@ LIB_H += tree-walk.h
 LIB_H += tree.h
 LIB_H += unpack-trees.h
 LIB_H += url.h
+LIB_H += urlmatch.h
 LIB_H += userdiff.h
 LIB_H += utf8.h
 LIB_H += varint.h
@@ -868,6 +870,7 @@ LIB_OBJS += tree.o
 LIB_OBJS += tree-walk.o
 LIB_OBJS += unpack-trees.o
 LIB_OBJS += url.o
+LIB_OBJS += urlmatch.o
 LIB_OBJS += usage.o
 LIB_OBJS += userdiff.o
 LIB_OBJS += utf8.o
@@ -2235,6 +2238,10 @@ test-parse-options$X: parse-options.o parse-options-cb.o
 
 test-svn-fe$X: vcs-svn/lib.a
 
+test-url-normalize$X: test-url-normalize.o GIT-LDFLAGS $(GITLIBS)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
+		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
+
 .PRECIOUS: $(TEST_OBJS)
 
 test-%$X: test-%.o GIT-LDFLAGS $(GITLIBS)
diff --git a/http.c b/http.c
index 37986f82..5eda356f 100644
--- a/http.c
+++ b/http.c
@@ -3,6 +3,7 @@
 #include "sideband.h"
 #include "run-command.h"
 #include "url.h"
+#include "urlmatch.h"
 #include "credential.h"
 #include "version.h"
 #include "pkt-line.h"
@@ -334,10 +335,20 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
 	char *low_speed_time;
+	char *normalized_url;
+	struct urlmatch_config config = { STRING_LIST_INIT_DUP };
+
+	config.section = "http";
+	config.key = NULL;
+	config.collect_fn = http_options;
+	config.cascade_fn = git_default_config;
+	config.cb = NULL;
 
 	http_is_verbose = 0;
+	normalized_url = url_normalize(url, &config.url);
 
-	git_config(http_options, NULL);
+	git_config(urlmatch_config_entry, &config);
+	free(normalized_url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
diff --git a/t/.gitattributes b/t/.gitattributes
index 1b97c546..f6f1df35 100644
--- a/t/.gitattributes
+++ b/t/.gitattributes
@@ -1 +1,2 @@
 t[0-9][0-9][0-9][0-9]/* -whitespace
+t5200/url-* binary
diff --git a/t/t5200-url-normalize.sh b/t/t5200-url-normalize.sh
new file mode 100755
index 00000000..f79bb0fb
--- /dev/null
+++ b/t/t5200-url-normalize.sh
@@ -0,0 +1,199 @@
+#!/bin/sh
+
+test_description='url normalization'
+. ./test-lib.sh
+
+if test -n "$NO_CURL"; then
+	skip_all='skipping test, git built without http support'
+	test_done
+fi
+
+# The base name of the test url files
+tu="$TEST_DIRECTORY/t5200/url"
+
+# The base name of the test config files
+tc="$TEST_DIRECTORY/t5200/config"
+
+# Note that only file: URLs should be allowed without a host
+
+test_expect_success 'url scheme' '
+	! test-url-normalize "" &&
+	! test-url-normalize "_" &&
+	! test-url-normalize "scheme" &&
+	! test-url-normalize "scheme:" &&
+	! test-url-normalize "scheme:/" &&
+	! test-url-normalize "scheme://" &&
+	! test-url-normalize "file" &&
+	! test-url-normalize "file:" &&
+	! test-url-normalize "file:/" &&
+	test-url-normalize "file://" &&
+	! test-url-normalize "://acme.co" &&
+	! test-url-normalize "x_test://acme.co" &&
+	! test-url-normalize "-test://acme.co" &&
+	! test-url-normalize "0test://acme.co" &&
+	! test-url-normalize "+test://acme.co" &&
+	! test-url-normalize ".test://acme.co" &&
+	! test-url-normalize "schem%6e://" &&
+	test-url-normalize "x-Test+v1.0://acme.co" &&
+	test "$(test-url-normalize -p "AbCdeF://x.Y")" = "abcdef://x.y/"
+'
+
+test_expect_success 'url authority' '
+	! test-url-normalize "scheme://user:pass@" &&
+	! test-url-normalize "scheme://?" &&
+	! test-url-normalize "scheme://#" &&
+	! test-url-normalize "scheme:///" &&
+	! test-url-normalize "scheme://:" &&
+	! test-url-normalize "scheme://:555" &&
+	test-url-normalize "file://user:pass@" &&
+	test-url-normalize "file://?" &&
+	test-url-normalize "file://#" &&
+	test-url-normalize "file:///" &&
+	test-url-normalize "file://:" &&
+	! test-url-normalize "file://:555" &&
+	test-url-normalize "scheme://user:pass@host" &&
+	test-url-normalize "scheme://@host" &&
+	test-url-normalize "scheme://%00@host" &&
+	! test-url-normalize "scheme://%%@host" &&
+	! test-url-normalize "scheme://host_" &&
+	test-url-normalize "scheme://user:pass@host/" &&
+	test-url-normalize "scheme://@host/" &&
+	test-url-normalize "scheme://host/" &&
+	test-url-normalize "scheme://host?x" &&
+	test-url-normalize "scheme://host#x" &&
+	test-url-normalize "scheme://host/@" &&
+	test-url-normalize "scheme://host?@x" &&
+	test-url-normalize "scheme://host#@x" &&
+	test-url-normalize "scheme://[::1]" &&
+	test-url-normalize "scheme://[::1]/" &&
+	! test-url-normalize "scheme://hos%41/" &&
+	test-url-normalize "scheme://[invalid....:/" &&
+	test-url-normalize "scheme://invalid....:]/" &&
+	! test-url-normalize "scheme://invalid....:[/" &&
+	! test-url-normalize "scheme://invalid....:["
+'
+
+test_expect_success 'url port checks' '
+	test-url-normalize "xyz://q@some.host:" &&
+	test-url-normalize "xyz://q@some.host:456/" &&
+	! test-url-normalize "xyz://q@some.host:0" &&
+	! test-url-normalize "xyz://q@some.host:0000000" &&
+	test-url-normalize "xyz://q@some.host:0000001?" &&
+	test-url-normalize "xyz://q@some.host:065535#" &&
+	test-url-normalize "xyz://q@some.host:65535" &&
+	! test-url-normalize "xyz://q@some.host:65536" &&
+	! test-url-normalize "xyz://q@some.host:99999" &&
+	! test-url-normalize "xyz://q@some.host:100000" &&
+	! test-url-normalize "xyz://q@some.host:100001" &&
+	test-url-normalize "http://q@some.host:80" &&
+	test-url-normalize "https://q@some.host:443" &&
+	test-url-normalize "http://q@some.host:80/" &&
+	test-url-normalize "https://q@some.host:443?" &&
+	! test-url-normalize "http://q@:8008" &&
+	! test-url-normalize "http://:8080" &&
+	! test-url-normalize "http://:" &&
+	test-url-normalize "xyz://q@some.host:456/" &&
+	test-url-normalize "xyz://[::1]:456/" &&
+	test-url-normalize "xyz://[::1]:/" &&
+	! test-url-normalize "xyz://[::1]:000/" &&
+	! test-url-normalize "xyz://[::1]:0%300/" &&
+	! test-url-normalize "xyz://[::1]:0x80/" &&
+	! test-url-normalize "xyz://[::1]:4294967297/" &&
+	! test-url-normalize "xyz://[::1]:030f/"
+'
+
+test_expect_success 'url port normalization' '
+	test "$(test-url-normalize -p "http://x:800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:0800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:00000800")" = "http://x:800/" &&
+	test "$(test-url-normalize -p "http://x:065535")" = "http://x:65535/" &&
+	test "$(test-url-normalize -p "http://x:1")" = "http://x:1/" &&
+	test "$(test-url-normalize -p "http://x:80")" = "http://x/" &&
+	test "$(test-url-normalize -p "http://x:080")" = "http://x/" &&
+	test "$(test-url-normalize -p "http://x:000000080")" = "http://x/" &&
+	test "$(test-url-normalize -p "https://x:443")" = "https://x/" &&
+	test "$(test-url-normalize -p "https://x:0443")" = "https://x/" &&
+	test "$(test-url-normalize -p "https://x:000000443")" = "https://x/"
+'
+
+test_expect_success 'url general escapes' '
+	! test-url-normalize "http://x.y?%fg" &&
+	test "$(test-url-normalize -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
+	test "$(test-url-normalize -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
+	test "$(test-url-normalize -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
+	test "$(test-url-normalize -p "X://W/'\''")" = "x://w/'\''" &&
+	test "$(test-url-normalize -p "X://W?'\!'")" = "x://w/?'\!'"
+'
+
+test_expect_success 'url high-bit escapes' '
+	test "$(test-url-normalize -p "$(cat "$tu-1")")" = "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12" &&
+	test "$(test-url-normalize -p "$(cat "$tu-2")")" = "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-3")")" = "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-4")")" = "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F" &&
+	test "$(test-url-normalize -p "$(cat "$tu-5")")" = "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-6")")" = "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-7")")" = "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-8")")" = "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-9")")" = "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-10")")" = "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF" &&
+	test "$(test-url-normalize -p "$(cat "$tu-11")")" = "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD"
+'
+
+test_expect_success 'url username/password escapes' '
+	test "$(test-url-normalize -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
+'
+
+test_expect_success 'url normalized lengths' '
+	test "$(test-url-normalize -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
+	test "$(test-url-normalize -l "http://%41:%42@x.y/%61/")" = 17 &&
+	test "$(test-url-normalize -l "http://@x.y/^")" = 15
+'
+
+test_expect_success 'url . and .. segments' '
+	test "$(test-url-normalize -p "x://y/.")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/./")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/a/.")" = "x://y/a" &&
+	test "$(test-url-normalize -p "x://y/a/./")" = "x://y/a/" &&
+	test "$(test-url-normalize -p "x://y/.?")" = "x://y/?" &&
+	test "$(test-url-normalize -p "x://y/./?")" = "x://y/?" &&
+	test "$(test-url-normalize -p "x://y/a/.?")" = "x://y/a?" &&
+	test "$(test-url-normalize -p "x://y/a/./?")" = "x://y/a/?" &&
+	test "$(test-url-normalize -p "x://y/a/./b/.././../c")" = "x://y/c" &&
+	test "$(test-url-normalize -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
+	test "$(test-url-normalize -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
+	! test-url-normalize "x://y/a/./b/.././../c/././.././.." &&
+	test "$(test-url-normalize -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
+	test "$(test-url-normalize -p "x://y/%2e/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/%2E/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/a/%2e./")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/b/.%2E/")" = "x://y/" &&
+	test "$(test-url-normalize -p "x://y/c/%2e%2E/")" = "x://y/"
+'
+
+# http://@foo specifies an empty user name but does not specify a password
+# http://foo  specifies neither a user name nor a password
+# So they should not be equivalent
+test_expect_success 'url equivalents' '
+	test-url-normalize "httP://x" "Http://X/" &&
+	test-url-normalize "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
+	! test-url-normalize "https://@x.y/^" "httpS://x.y:443/^" &&
+	test-url-normalize "https://@x.y/^" "httpS://@x.y:0443/^" &&
+	test-url-normalize "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
+	test-url-normalize "https://@x.y/^/.." "httpS://@x.y:0443/"
+'
+
+test_expect_success 'url config normalization matching' '
+	test "$(test-url-normalize -c "$tc-1" "useragent" "https://other.example.com/")" = "other-agent" &&
+	test "$(test-url-normalize -c "$tc-1" "useragent" "https://example.com/")" = "example-agent" &&
+	test "$(test-url-normalize -c "$tc-1" "sslVerify" "https://example.com/")" = "false" &&
+	test "$(test-url-normalize -c "$tc-1" "useragent" "https://example.com/path/sub")" = "path-agent" &&
+	test "$(test-url-normalize -c "$tc-1" "sslVerify" "https://example.com/path/sub")" = "false" &&
+	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://elsewhere.com/")" = "true" &&
+	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://example.com")" = "true" &&
+	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://example.com/path")" = "true" &&
+	test "$(test-url-normalize -c "$tc-2" "useragent" "HTTPS://example.COM/p%61th")" = "example-agent" &&
+	test "$(test-url-normalize -c "$tc-2" "sslVerify" "HTTPS://example.COM/p%61th")" = "false" &&
+	test "$(test-url-normalize -c "$tc-3" "sslcainfo" "https://user@example.com/path/name/here")" = "file-1"
+'
+
+test_done
diff --git a/t/t5200/README b/t/t5200/README
new file mode 100644
index 00000000..e3a67d94
--- /dev/null
+++ b/t/t5200/README
@@ -0,0 +1,18 @@
+The url data files in this directory contain URLs with characters
+in the range 0x01-0x1f and 0x7f-0xff to test the proper normalization
+of unprintable characters.
+
+A select few characters in the 0x01-0x1f range are skipped to help
+avoid problems running the test itself.
+
+The urls are in test files in this directory rather than being
+embedded in the test script for portability.
+
+The config data files in this directory represent configurations
+to be parsed by http_options so that the final option value can be
+tested.
+
+The config files may contain more than one same-named section to
+simulate having a system, global and .git config file.
+
+
diff --git a/t/t5200/config-1 b/t/t5200/config-1
new file mode 100644
index 00000000..8aaf23c4
--- /dev/null
+++ b/t/t5200/config-1
@@ -0,0 +1,8 @@
+[http]
+	useragent = other-agent
+	noEPSV = true
+[http "https://example.com"]
+	useragent = example-agent
+	sslVerify = false
+[http "https://example.com/path"]
+	useragent = path-agent
diff --git a/t/t5200/config-2 b/t/t5200/config-2
new file mode 100644
index 00000000..749f4bd5
--- /dev/null
+++ b/t/t5200/config-2
@@ -0,0 +1,3 @@
+[http "https://example.com/path"]
+	useragent = example-agent
+	sslVerify = false
diff --git a/t/t5200/config-3 b/t/t5200/config-3
new file mode 100644
index 00000000..5c8d3e85
--- /dev/null
+++ b/t/t5200/config-3
@@ -0,0 +1,4 @@
+[http "https://example.com/path/name"]
+	sslcainfo = file-1
+[http "https://user@example.com/path"]
+	sslcainfo = file-2
diff --git a/t/t5200/url-1 b/t/t5200/url-1
new file mode 100644
index 0000000000000000000000000000000000000000..519019c5ce6c58478f048a2f39e2321370d318c6
GIT binary patch
literal 20
bcmb=h($_E4XJle#VP#|I;Nuq%6ygE^Admtt

literal 0
HcmV?d00001

diff --git a/t/t5200/url-10 b/t/t5200/url-10
new file mode 100644
index 0000000000000000000000000000000000000000..b9965de6a5d74b122179821212b2c27c8ae03e80
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFYxj5^Yr!h_xSnx`~3a>{|dCd5i<Y)

literal 0
HcmV?d00001

diff --git a/t/t5200/url-11 b/t/t5200/url-11
new file mode 100644
index 0000000000000000000000000000000000000000..f0a50f10096a20d597f40c775f09a71276e0050a
GIT binary patch
literal 25
hcmb=h($_E4Kh$u4|APe$@AvQhFrlI0!}|Suxd5(W4xs=5

literal 0
HcmV?d00001

diff --git a/t/t5200/url-2 b/t/t5200/url-2
new file mode 100644
index 0000000000000000000000000000000000000000..43334b05b2de3794d6020abd96e634a4e9e49cb0
GIT binary patch
literal 20
bcmb=h($_E47Zwo}6PJ*bmXVc{ujc{)C{+Vx

literal 0
HcmV?d00001

diff --git a/t/t5200/url-3 b/t/t5200/url-3
new file mode 100644
index 0000000000000000000000000000000000000000..7378c7bec247b996bc67b00a05ed89cf47d4b7a7
GIT binary patch
literal 23
ecmb=h($_E4Z)j|4ZfR|6@96C6?&<C8=K=t7Jqj}b

literal 0
HcmV?d00001

diff --git a/t/t5200/url-4 b/t/t5200/url-4
new file mode 100644
index 0000000000000000000000000000000000000000..220b198c97f942fea4960f51a2105cc42261061a
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFOZRvla!T~mzbHFo1C4Vp9*`u3o`%!

literal 0
HcmV?d00001

diff --git a/t/t5200/url-5 b/t/t5200/url-5
new file mode 100644
index 0000000000000000000000000000000000000000..1ccd9277792840955bb124bdde21f4b08bcccb63
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFQB2Kqok##r>Lo_tE{cAuL^}d3^M=#

literal 0
HcmV?d00001

diff --git a/t/t5200/url-6 b/t/t5200/url-6
new file mode 100644
index 0000000000000000000000000000000000000000..e8283aac6dff049d3e02454db6e684c5790a5996
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFR-z)v$VCgx45~wyS%-=zY31M4Kn}$

literal 0
HcmV?d00001

diff --git a/t/t5200/url-7 b/t/t5200/url-7
new file mode 100644
index 0000000000000000000000000000000000000000..fa7c10b615259deefd15b638b021da7c60eba1b2
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFTlaV!^FkL$H>Xb%goKr&kC454l@7%

literal 0
HcmV?d00001

diff --git a/t/t5200/url-8 b/t/t5200/url-8
new file mode 100644
index 0000000000000000000000000000000000000000..79a0ba836f5b8886b0a73f161eb292af2b105e65
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFVNA_)6~`0*Vx(G+uYsW-wL6<4>JG&

literal 0
HcmV?d00001

diff --git a/t/t5200/url-9 b/t/t5200/url-9
new file mode 100644
index 0000000000000000000000000000000000000000..8b44bec48b94467c63e8e1ad18162e465da6d6dd
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFW}+g<K*S$=jiF`>+J3B?+U9u5HkP(

literal 0
HcmV?d00001

diff --git a/test-url-normalize.c b/test-url-normalize.c
new file mode 100644
index 00000000..80437217
--- /dev/null
+++ b/test-url-normalize.c
@@ -0,0 +1,139 @@
+#ifdef NO_CURL
+
+int main(void)
+{
+	return 125;
+}
+
+#else /* !NO_CURL */
+
+#include "http.c"
+
+static int run_http_options(const char *file,
+			    const char *opt,
+			    const struct url_info *info)
+{
+	struct strbuf opt_lc;
+	size_t i, len;
+	struct urlmatch_config config = { STRING_LIST_INIT_DUP };
+
+	memcpy(&config.url, info, sizeof(*info));
+	config.section = "http";
+	config.collect_fn = http_options;
+	config.cascade_fn = git_default_config;
+	config.cb = NULL;
+
+	if (git_config_with_options(urlmatch_config_entry, &config, file, 0))
+		return 1;
+
+	len = strlen(opt);
+	strbuf_init(&opt_lc, len);
+	for (i = 0; i < len; ++i) {
+		strbuf_addch(&opt_lc, tolower(opt[i]));
+	}
+
+	if (!strcmp("sslverify", opt_lc.buf))
+		printf("%s\n", curl_ssl_verify ? "true" : "false");
+	else if (!strcmp("sslcert", opt_lc.buf))
+		printf("%s\n", ssl_cert);
+#if LIBCURL_VERSION_NUM >= 0x070903
+	else if (!strcmp("sslkey", opt_lc.buf))
+		printf("%s\n", ssl_key);
+#endif
+#if LIBCURL_VERSION_NUM >= 0x070908
+	else if (!strcmp("sslcapath", opt_lc.buf))
+		printf("%s\n", ssl_capath);
+#endif
+	else if (!strcmp("sslcainfo", opt_lc.buf))
+		printf("%s\n", ssl_cainfo);
+	else if (!strcmp("sslcertpasswordprotected", opt_lc.buf))
+		printf("%s\n", ssl_cert_password_required ? "true" : "false");
+	else if (!strcmp("ssltry", opt_lc.buf))
+		printf("%s\n", curl_ssl_try ? "true" : "false");
+	else if (!strcmp("minsessions", opt_lc.buf))
+		printf("%d\n", min_curl_sessions);
+#ifdef USE_CURL_MULTI
+	else if (!strcmp("maxrequests", opt_lc.buf))
+		printf("%d\n", max_requests);
+#endif
+	else if (!strcmp("lowspeedlimit", opt_lc.buf))
+		printf("%ld\n", curl_low_speed_limit);
+	else if (!strcmp("lowspeedtime", opt_lc.buf))
+		printf("%ld\n", curl_low_speed_time);
+	else if (!strcmp("noepsv", opt_lc.buf))
+		printf("%s\n", curl_ftp_no_epsv ? "true" : "false");
+	else if (!strcmp("proxy", opt_lc.buf))
+		printf("%s\n", curl_http_proxy);
+	else if (!strcmp("cookiefile", opt_lc.buf))
+		printf("%s\n", curl_cookie_file);
+	else if (!strcmp("postbuffer", opt_lc.buf))
+		printf("%u\n", (unsigned)http_post_buffer);
+	else if (!strcmp("useragent", opt_lc.buf))
+		printf("%s\n", user_agent);
+
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	const char *usage = "test-url-normalize [-p | -l] <url1> | <url1> <url2>"
+		" | -c file option <url1>";
+	char *url1, *url2;
+	int opt_p = 0, opt_l = 0, opt_c = 0;
+	char *file = NULL, *optname = NULL;
+
+	/*
+	 * For one url, succeed if url_normalize succeeds on it, fail otherwise.
+	 * For two urls, succeed only if url_normalize succeeds on both and
+	 * the results compare equal with strcmp.  If -p is given (one url only)
+	 * and url_normalize succeeds, print the result followed by "\n".  If
+	 * -l is given (one url only) and url_normalize succeeds, print the
+	 * returned length in decimal followed by "\n".
+	 * If -c is given, call git_config_with_options using the specified file
+	 * and http_options and passing the normalized value of the url.  Then
+	 * print the value of 'option' afterwards.  'option' must be one of the
+	 * valid 'http.*' options.
+	 */
+
+	if (argc > 1 && !strcmp(argv[1], "-p")) {
+		opt_p = 1;
+		argc--;
+		argv++;
+	} else if (argc > 1 && !strcmp(argv[1], "-l")) {
+		opt_l = 1;
+		argc--;
+		argv++;
+	} else if (argc > 3 && !strcmp(argv[1], "-c")) {
+		opt_c = 1;
+		file = argv[2];
+		optname = argv[3];
+		argc -= 3;
+		argv += 3;
+	}
+
+	if (argc < 2 || argc > 3)
+		die(usage);
+
+	if (argc == 2) {
+		struct url_info info;
+		url1 = url_normalize(argv[1], &info);
+		if (!url1)
+			return 1;
+		if (opt_p)
+			printf("%s\n", url1);
+		if (opt_l)
+			printf("%u\n", (unsigned)info.url_len);
+		if (opt_c)
+			return run_http_options(file, optname, &info);
+		return 0;
+	}
+
+	if (opt_p || opt_l || opt_c)
+		die(usage);
+
+	url1 = url_normalize(argv[1], NULL);
+	url2 = url_normalize(argv[2], NULL);
+	return (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
+}
+
+#endif /* !NO_CURL */
-- 
1.8.3

* Re: [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key
  2013-07-31 19:26 ` [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key Junio C Hamano
@ 2013-07-31 22:45   ` Jeff King
  2013-07-31 23:03     ` Kyle J. McKay
  2013-07-31 23:47     ` [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key Junio C Hamano
  0 siblings, 2 replies; 23+ messages in thread
From: Jeff King @ 2013-07-31 22:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Kyle J. McKay

On Wed, Jul 31, 2013 at 12:26:08PM -0700, Junio C Hamano wrote:

> Using the same urlmatch_config_entry() infrastructure, add a new
> mode "--get-urlmatch" to the "git config" command, to learn values
> for the "virtual" two-level variables customized for the specific
> URL.
> 
>     git config [--<type>] --get-urlmatch <section>[.<key>] <url>

Do we want something like this on top, to convert the third form of
test-url-normalize into git-config calls?

It would be nicer squashed in, but the tests are added earlier in the
series than "--get-urlmatch", so we would have to rip the tests out of
the earlier patches and have a "[PATCH 7/6] add tests for url
normalizing".

Two things to note about my test conversion:

  1. Git-config expects pre-canonicalized variable names (so http.noepsv
     instead of "http.noEPSV"). I think the "git config --get" code path
     does this for the caller, so we should probably do the same for
     "--get-urlmatch". And it is even easier here, because we know that
     "http.noEPSV" does not contain a case-sensitive middle part. :)

  2. I turned the many 'test "$(git foo)" = "bar"' invocations into a
     wrapper function that uses test_cmp. This helped immensely with
     debugging (1).

     The wrapper is a little ugly. I do wonder if we actually need all
     of these tests (i.e., it is not clear to me what different things
     each is testing, and if it is not simply trying to exercise the
     different variable names, which now all follow the same code path,
     because git-config does not care about the particular names).
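
To make point (1) concrete, here is a minimal sketch of the kind of
canonicalization meant there: lowercase the section and the variable
name, and leave any middle part (the URL) alone.  The helper name
canon_key is hypothetical and exists only for illustration; it is not
part of git or of this series, and the real fix would presumably live
in the C code path of "git config" rather than in shell.

```shell
# canon_key is a hypothetical helper for illustration only; it is not
# part of git or of this series.  It lowercases the section and the
# variable name of a config key and leaves any middle part (here, the
# URL) untouched.
canon_key () {
	key=$1
	case "$key" in
	*.*) ;;
	*) return 1 ;;	# not a <section>.<key> name
	esac
	section=$(printf '%s' "${key%%.*}" | tr '[:upper:]' '[:lower:]')
	rest=${key#*.}
	case "$rest" in
	*.*)
		# three-level name: keep the middle part exactly as given
		name=$(printf '%s' "${rest##*.}" | tr '[:upper:]' '[:lower:]')
		printf '%s.%s.%s\n' "$section" "${rest%.*}" "$name"
		;;
	*)
		# two-level name: both parts are case-insensitive
		name=$(printf '%s' "$rest" | tr '[:upper:]' '[:lower:]')
		printf '%s.%s\n' "$section" "$name"
		;;
	esac
}
```

With this, "canon_key http.noEPSV" prints "http.noepsv", which is the
spelling the converted tests below have to pass by hand.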

---
 t/t5200-url-normalize.sh | 40 ++++++++-------
 test-url-normalize.c     | 93 ++---------------------------------
 2 files changed, 27 insertions(+), 106 deletions(-)

diff --git a/t/t5200-url-normalize.sh b/t/t5200-url-normalize.sh
index f79bb0f..a5190f7 100755
--- a/t/t5200-url-normalize.sh
+++ b/t/t5200-url-normalize.sh
@@ -3,11 +3,6 @@ test_description='url normalization'
 test_description='url normalization'
 . ./test-lib.sh
 
-if test -n "$NO_CURL"; then
-	skip_all='skipping test, git built without http support'
-	test_done
-fi
-
 # The base name of the test url files
 tu="$TEST_DIRECTORY/t5200/url"
 
@@ -182,18 +177,27 @@ test_expect_success 'url equivalents' '
 	test-url-normalize "https://@x.y/^/.." "httpS://@x.y:0443/"
 '
 
-test_expect_success 'url config normalization matching' '
-	test "$(test-url-normalize -c "$tc-1" "useragent" "https://other.example.com/")" = "other-agent" &&
-	test "$(test-url-normalize -c "$tc-1" "useragent" "https://example.com/")" = "example-agent" &&
-	test "$(test-url-normalize -c "$tc-1" "sslVerify" "https://example.com/")" = "false" &&
-	test "$(test-url-normalize -c "$tc-1" "useragent" "https://example.com/path/sub")" = "path-agent" &&
-	test "$(test-url-normalize -c "$tc-1" "sslVerify" "https://example.com/path/sub")" = "false" &&
-	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://elsewhere.com/")" = "true" &&
-	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://example.com")" = "true" &&
-	test "$(test-url-normalize -c "$tc-1" "noEPSV" "https://example.com/path")" = "true" &&
-	test "$(test-url-normalize -c "$tc-2" "useragent" "HTTPS://example.COM/p%61th")" = "example-agent" &&
-	test "$(test-url-normalize -c "$tc-2" "sslVerify" "HTTPS://example.COM/p%61th")" = "false" &&
-	test "$(test-url-normalize -c "$tc-3" "sslcainfo" "https://user@example.com/path/name/here")" = "file-1"
-'
+check_url_config() {
+	config=$1; shift
+	expect=$1; shift
+
+	test_expect_success "config normalization ($*)" "
+		echo '$expect' >expect &&
+		git config --file="\$config" --get-urlmatch $* >actual &&
+		test_cmp expect actual
+	"
+}
+
+check_url_config "$tc-1" other-agent http.useragent https://other.example.com/
+check_url_config "$tc-1" example-agent http.useragent https://example.com/
+check_url_config "$tc-1" false --bool http.sslverify https://example.com/
+check_url_config "$tc-1" path-agent http.useragent https://example.com/path/sub
+check_url_config "$tc-1" false --bool http.sslverify https://example.com/path/sub
+check_url_config "$tc-1" true --bool http.noepsv https://elsewhere.com/
+check_url_config "$tc-1" true --bool http.noepsv https://example.com
+check_url_config "$tc-1" true --bool http.noepsv https://example.com/path
+check_url_config "$tc-2" example-agent http.useragent HTTPS://example.COM/p%61th
+check_url_config "$tc-2" false --bool http.sslverify HTTPS://example.COM/p%61th
+check_url_config "$tc-3" file-1 http.sslcainfo https://user@example.com/path/name/here
 
 test_done
diff --git a/test-url-normalize.c b/test-url-normalize.c
index 81d3da9..34a1a67 100644
--- a/test-url-normalize.c
+++ b/test-url-normalize.c
@@ -1,84 +1,11 @@ int main(int argc, char **argv)
-#ifdef NO_CURL
-
-int main()
-{
-	return 125;
-}
-
-#else /* !NO_CURL */
-
-#include "http.c"
-
-static int run_http_options(const char *file,
-			    const char *opt,
-			    const struct url_info *info)
-{
-	struct strbuf opt_lc;
-	size_t i, len;
-	struct urlmatch_config config = { STRING_LIST_INIT_DUP };
-
-	memcpy(&config.url, info, sizeof(*info));
-	config.section = "http";
-	config.collect_fn = http_options;
-	config.cascade_fn = git_default_config;
-	config.cb = NULL;
-
-	if (git_config_with_options(urlmatch_config_entry, &config, file, 0))
-		return 1;
-
-	len = strlen(opt);
-	strbuf_init(&opt_lc, len);
-	for (i = 0; i < len; ++i) {
-		strbuf_addch(&opt_lc, tolower(opt[i]));
-	}
-
-	if (!strcmp("sslverify", opt_lc.buf))
-		printf("%s\n", curl_ssl_verify ? "true" : "false");
-	else if (!strcmp("sslcert", opt_lc.buf))
-		printf("%s\n", ssl_cert);
-#if LIBCURL_VERSION_NUM >= 0x070903
-	else if (!strcmp("sslkey", opt_lc.buf))
-		printf("%s\n", ssl_key);
-#endif
-#if LIBCURL_VERSION_NUM >= 0x070908
-	else if (!strcmp("sslcapath", opt_lc.buf))
-		printf("%s\n", ssl_capath);
-#endif
-	else if (!strcmp("sslcainfo", opt_lc.buf))
-		printf("%s\n", ssl_cainfo);
-	else if (!strcmp("sslcertpasswordprotected", opt_lc.buf))
-		printf("%s\n", ssl_cert_password_required ? "true" : "false");
-	else if (!strcmp("ssltry", opt_lc.buf))
-		printf("%s\n", curl_ssl_try ? "true" : "false");
-	else if (!strcmp("minsessions", opt_lc.buf))
-		printf("%d\n", min_curl_sessions);
-	else if (!strcmp("maxrequests", opt_lc.buf))
-		printf("%d\n", max_requests);
-	else if (!strcmp("lowspeedlimit", opt_lc.buf))
-		printf("%ld\n", curl_low_speed_limit);
-	else if (!strcmp("lowspeedtime", opt_lc.buf))
-		printf("%ld\n", curl_low_speed_time);
-	else if (!strcmp("noepsv", opt_lc.buf))
-		printf("%s\n", curl_ftp_no_epsv ? "true" : "false");
-	else if (!strcmp("proxy", opt_lc.buf))
-		printf("%s\n", curl_http_proxy);
-	else if (!strcmp("cookiefile", opt_lc.buf))
-		printf("%s\n", curl_cookie_file);
-	else if (!strcmp("postbuffer", opt_lc.buf))
-		printf("%u\n", (unsigned)http_post_buffer);
-	else if (!strcmp("useragent", opt_lc.buf))
-		printf("%s\n", user_agent);
-
-	return 0;
-}
+#include "git-compat-util.h"
+#include "urlmatch.h"
 
 int main(int argc, char **argv)
 {
-	const char *usage = "test-url-normalize [-p | -l] <url1> | <url1> <url2>"
-		" | -c file option <url1>";
+	const char *usage = "test-url-normalize [-p | -l] <url1> | <url1> <url2>";
 	char *url1, *url2;
-	int opt_p = 0, opt_l = 0, opt_c = 0;
-	char *file = NULL, *optname = NULL;
+	int opt_p = 0, opt_l = 0;
 
 	/*
 	 * For one url, succeed if url_normalize succeeds on it, fail otherwise.
@@ -101,12 +28,6 @@ int main(int argc, char **argv)
 		opt_l = 1;
 		argc--;
 		argv++;
-	} else if (argc > 3 && !strcmp(argv[1], "-c")) {
-		opt_c = 1;
-		file = argv[2];
-		optname = argv[3];
-		argc -= 3;
-		argv += 3;
 	}
 
 	if (argc < 2 || argc > 3)
@@ -121,17 +42,13 @@ int main(int argc, char **argv)
 			printf("%s\n", url1);
 		if (opt_l)
 			printf("%u\n", (unsigned)info.url_len);
-		if (opt_c)
-			return run_http_options(file, optname, &info);
 		return 0;
 	}
 
-	if (opt_p || opt_l || opt_c)
+	if (opt_p || opt_l)
 		die(usage);
 
 	url1 = url_normalize(argv[1], NULL);
 	url2 = url_normalize(argv[2], NULL);
 	return (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
 }
-
-#endif /* !NO_CURL */


* Re: [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key
  2013-07-31 22:45   ` Jeff King
@ 2013-07-31 23:03     ` Kyle J. McKay
  2013-07-31 23:44       ` Jeff King
  2013-08-05 20:20       ` [PATCH ALTERNATIVE v6.v3 4/6] config: parse http.<url>.<variable> using urlmatch Kyle J. McKay
  2013-07-31 23:47     ` [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key Junio C Hamano
  1 sibling, 2 replies; 23+ messages in thread
From: Kyle J. McKay @ 2013-07-31 23:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On Jul 31, 2013, at 15:45, Jeff King wrote:

> On Wed, Jul 31, 2013 at 12:26:08PM -0700, Junio C Hamano wrote:
>
>> Using the same urlmatch_config_entry() infrastructure, add a new
>> mode "--get-urlmatch" to the "git config" command, to learn values
>> for the "virtual" two-level variables customized for the specific
>> URL.
>>
>>    git config [--<type>] --get-urlmatch <section>[.<key>] <url>
>
> Do we want something like this on top, to convert the third form of
> test-url-normalize into git-config calls?
>
> It would be nicer squashed in, but the tests are added earlier in the
> series than "--get-urlmatch", so we would have to rip the tests out of
> the earlier patches and have a "[PATCH 7/6] add tests for url
> normalizing".
>
> Two things to note about my test conversion:
>
>  1. Git-config expects pre-canonicalized variable names (so http.noepsv
>     instead of "http.noEPSV"). I think the "git config --get" code path
>     does this for the caller, so we should probably do the same for
>     "--get-urlmatch". And it is even easier here, because we know that
>     "http.noEPSV" does not contain a case-sensitive middle part. :)

The test was testing that too, which I think is a good thing.  Your  
replacement does not test that.  With a fix for --get-urlmatch as you  
mention above, the tests can check that again.

>  2. I turned the many 'test "$(git foo)" = "bar"' invocations into a
>     wrapper function that uses test_cmp. This helped immensely with
>     debugging (1).
>
>     The wrapper is a little ugly. I do wonder if we actually need all
>     of these tests (i.e., it is not clear to me what different things
>     each is testing, and if it is not simply trying to exercise the
>     different variable names, which now all follow the same code path,
>     because git-config does not care about the particular names).

Each one tests a different item from the "$tc-n" config file to make  
sure that everything that's in each config file actually behaves as  
expected.

If we do this (and I don't really have any objection except for the  
point noted above), then the tests really need to move out from t5200  
as they're not tied to the http operations anymore.  Also the Makefile  
rule for test-url-normalize.c needs to be simplified since it won't  
need the extra options to make it link since it's no longer including  
http.c.

The README has this:

> First digit tells the family:
>
>         0 - the absolute basics and global stuff
>         1 - the basic commands concerning database
>         2 - the basic commands concerning the working tree
>         3 - the other basic commands (e.g. ls-files)
>         4 - the diff commands
>         5 - the pull and exporting commands
>         6 - the revision tree commands (even e.g. merge-base)
>         7 - the porcelainish commands concerning the working tree
>         8 - the porcelainish commands concerning forensics
>         9 - the git tools

But the best choice does not immediately jump out at me.  However,
looking at the other tests that are there, I think perhaps
1307-config-url might be a reasonable choice.


* Re: [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key
  2013-07-31 23:03     ` Kyle J. McKay
@ 2013-07-31 23:44       ` Jeff King
  2013-08-01 17:25         ` Junio C Hamano
  2013-08-05 20:20       ` [PATCH ALTERNATIVE v6.v3 4/6] config: parse http.<url>.<variable> using urlmatch Kyle J. McKay
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2013-07-31 23:44 UTC (permalink / raw)
  To: Kyle J. McKay; +Cc: Junio C Hamano, git

On Wed, Jul 31, 2013 at 04:03:01PM -0700, Kyle J. McKay wrote:

> > 1. Git-config expects pre-canonicalized variable names (so http.noepsv
> >    instead of "http.noEPSV"). I think the "git config --get" code path
> >    does this for the caller, so we should probably do the same for
> >    "--get-urlmatch". And it is even easier here, because we know that
> >    "http.noEPSV" does not contain a case-sensitive middle part. :)
> 
> The test was testing that too, which I think is a good thing.  Your
> replacement does not test that.  With a fix for --get-urlmatch as you
> mention above, the tests can check that again.

I do not think the existing tests were checking anything interesting in
that respect. The url-matching code does not do the canonicalization,
nor should it (the internal callbacks for all variables should use
the canonical lowercase version). So we were only testing that
test-url-normalize lowercased them, which is not something we actually
care about (nobody but the test script should ever call it).

That being said, git-config _should_ be lowercasing to match the normal
--get code path. I think the fix (squashable on top of 6/6 + my earlier
patch) is just:

diff --git a/builtin/config.c b/builtin/config.c
index c35c5be..9328a90 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -589,7 +589,9 @@ int cmd_config(int argc, const char **argv, const char *prefix)
 	}
 	else if (actions == ACTION_GET_URLMATCH) {
 		check_argc(argc, 2, 2);
-		return get_urlmatch(argv[0], argv[1]);
+		if (git_config_parse_key(argv[0], &key, NULL))
+			return CONFIG_INVALID_KEY;
+		return get_urlmatch(key, argv[1]);
 	}
 	else if (actions == ACTION_UNSET) {
 		check_argc(argc, 1, 2);
diff --git a/t/t5200-url-normalize.sh b/t/t5200-url-normalize.sh
index a5190f7..7284dc6 100755
--- a/t/t5200-url-normalize.sh
+++ b/t/t5200-url-normalize.sh
@@ -190,14 +190,14 @@ check_url_config "$tc-2" example-agent http.useragent HTTPS://example.COM/p%61th
 
 check_url_config "$tc-1" other-agent http.useragent https://other.example.com/
 check_url_config "$tc-1" example-agent http.useragent https://example.com/
-check_url_config "$tc-1" false --bool http.sslverify https://example.com/
+check_url_config "$tc-1" false --bool http.sslVerify https://example.com/
 check_url_config "$tc-1" path-agent http.useragent https://example.com/path/sub
-check_url_config "$tc-1" false --bool http.sslverify https://example.com/path/sub
-check_url_config "$tc-1" true --bool http.noepsv https://elsewhere.com/
-check_url_config "$tc-1" true --bool http.noepsv https://example.com
-check_url_config "$tc-1" true --bool http.noepsv https://example.com/path
+check_url_config "$tc-1" false --bool http.sslVerify https://example.com/path/sub
+check_url_config "$tc-1" true --bool http.noEPSV https://elsewhere.com/
+check_url_config "$tc-1" true --bool http.noEPSV https://example.com
+check_url_config "$tc-1" true --bool http.noEPSV https://example.com/path
 check_url_config "$tc-2" example-agent http.useragent HTTPS://example.COM/p%61th
-check_url_config "$tc-2" false --bool http.sslverify HTTPS://example.COM/p%61th
+check_url_config "$tc-2" false --bool http.sslVerify HTTPS://example.COM/p%61th
 check_url_config "$tc-3" file-1 http.sslcainfo https://user@example.com/path/name/here
 
 test_done

> >    The wrapper is a little ugly. I do wonder if we actually need all
> >    of these tests (i.e., it is not clear to me what different things
> >    each is testing, and if it is not simply trying to exercise the
> >    different variable names, which now all follow the same code path,
> >    because git-config does not care about the particular names).
> 
> Each one tests a different item from the "$tc-n" config file to make
> sure that everything that's in each config file actually behaves as
> expected.

I guess I don't understand why we have so many items in each file. That
is, we have:

	"$tc-1" "useragent" "https://other.example.com/" = "other-agent"
	"$tc-1" "useragent" "https://example.com/" = "example-agent"

The first checks that we do not apply within a sub-domain (but fall back
to http.useragent), and the second checks that we do properly apply the
full domain.

	"$tc-1" "sslVerify" "https://example.com/" = "false"

This check seems redundant with the second one above.

	"$tc-1" "useragent" "https://example.com/path/sub" = "path-agent"
	"$tc-1" "sslVerify" "https://example.com/path/sub" = "false"

Here we make sure paths are preferred over non-paths (the first one),
but that config keys with non-paths are still used (the second).

	"$tc-1" "noEPSV" "https://elsewhere.com/" = "true"

This seems redundant with the first test (check that we do not match and
fall back to http.noepsv).

	"$tc-1" "noEPSV" "https://example.com" = "true"

Not sure what we are testing here; there is no variable besides the
one in http.noepsv. Somewhat redundant with the first test.

	"$tc-1" "noEPSV" "https://example.com/path" = "true"

Ditto.

	"$tc-2" "useragent" "HTTPS://example.COM/p%61th" = "example-agent"
	"$tc-2" "sslVerify" "HTTPS://example.COM/p%61th" = "false"

Testing normalization, though they seem redundant with each other.

	"$tc-3" "sslcainfo" "https://user@example.com/path/name/here" = "file-1"

Testing specific pathnames preferred to usernames, which is useful.

I don't mean to nitpick. It was just hard as a reader to figure out what
specifically each one was interested in checking, which means it may be
similarly hard if one of the tests is later broken for the investigator
to figure out what was happening. I don't know if it is worth putting
each in its own test and annotating what each is looking for (it may
also help show gaps; e.g., we check that longer pathnames trump
usernames, but we do not check that the same pathname prefers the
version with the username).
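
To make the precedence I am describing concrete, here is a rough sketch. It
is not the urlmatch code itself: struct match_score, path_match_len() and
better_match() are hypothetical names, and the paths are assumed to be
normalized already and to start with a slash. A key's path matches only as a
prefix on a slash boundary, a longer path match wins, and the user name only
breaks ties between equal path lengths.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of how well one config key matched a URL. */
struct match_score {
	size_t path_len;   /* length of the matched path prefix */
	int user_matched;  /* 1 if the key named a user and it matched */
};

/*
 * Return the number of bytes of url_path covered by key_path, or 0 if
 * key_path is not a prefix of url_path ending on a slash boundary.
 */
size_t path_match_len(const char *key_path, const char *url_path)
{
	size_t klen, ulen;

	assert(key_path && url_path);
	klen = strlen(key_path);
	ulen = strlen(url_path);
	if (klen == 0 || klen > ulen || strncmp(key_path, url_path, klen))
		return 0;
	/* Exact match, or the prefix stops at a '/' on either side. */
	if (klen == ulen || key_path[klen - 1] == '/' || url_path[klen] == '/')
		return klen;
	return 0;
}

/* Nonzero if a beats b: longer path first, then a matched user name. */
int better_match(const struct match_score *a, const struct match_score *b)
{
	if (a->path_len != b->path_len)
		return a->path_len > b->path_len;
	return a->user_matched > b->user_matched;
}
```

Under these assumptions a key with path "/path" and no user beats a key
with path "/" and a matching user, which is what the "$tc-3" case checks;
the tie-break on the user name is the part we do not currently test.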

> If we do this (and I don't really have any objection except for the
> point noted above), then the tests really need to move out from t5200
> [...]
> But the best choice does not immediately jump out at me.  However,
> looking at the other tests that are there, I think perhaps
> 1307-config-url might be a reasonable choice.

Yes, that makes sense to me; all of the other config is in the t1300
series.

-Peff


* Re: [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key
  2013-07-31 22:45   ` Jeff King
  2013-07-31 23:03     ` Kyle J. McKay
@ 2013-07-31 23:47     ` Junio C Hamano
  1 sibling, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2013-07-31 23:47 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Kyle J. McKay

Jeff King <peff@peff.net> writes:

>   1. Git-config expects pre-canonicalized variable names (so http.noepsv
>      instead of "http.noEPSV"). I think the "git config --get" code path
>      does this for the caller, so we should probably do the same for
>      "--get-urlmatch". And it is even easier here, because we know that
>      "http.noEPSV" does not contain a case-sensitive middle part. :)

I'll squash these in later, but here is from my working copy.
Thanks for spotting.

 builtin/config.c       | 32 +++++++++++++++++++-------------
 t/t1300-repo-config.sh |  4 ++--
 2 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/builtin/config.c b/builtin/config.c
index c35c5be..c046f54 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -379,9 +379,22 @@ static int urlmatch_collect_fn(const char *var, const char *value, void *cb)
 	return 0;
 }
 
+static char *dup_downcase(const char *string)
+{
+	char *result;
+	size_t len, i;
+
+	len = strlen(string);
+	result = xmalloc(len + 1);
+	for (i = 0; i < len; i++)
+		result[i] = tolower(string[i]);
+	result[i] = '\0';
+	return result;
+}
+
 static int get_urlmatch(const char *var, const char *url)
 {
-	const char *section_tail;
+	char *section_tail;
 	struct string_list_item *item;
 	struct urlmatch_config config = { STRING_LIST_INIT_DUP };
 	struct string_list values = STRING_LIST_INIT_DUP;
@@ -393,13 +406,13 @@ static int get_urlmatch(const char *var, const char *url)
 	if (!url_normalize(url, &config.url))
 		die(config.url.err);
 
-	section_tail = strchr(var, '.');
+	config.section = dup_downcase(var);
+	section_tail = strchr(config.section, '.');
 	if (section_tail) {
-		config.section = xmemdupz(var, section_tail - var);
-		config.key = strrchr(var, '.') + 1;
+		*section_tail = '\0';
+		config.key = section_tail + 1;
 		show_keys = 0;
 	} else {
-		config.section = var;
 		config.key = NULL;
 		show_keys = 1;
 	}
@@ -425,14 +438,7 @@ static int get_urlmatch(const char *var, const char *url)
 	string_list_clear(&values, 1);
 	free(config.url.url);
 
-	/*
-	 * section name may have been copied to replace the dot, in which
-	 * case it needs to be freed.  key name is either NULL (e.g. 'http'
-	 * alone) or points into var (e.g. 'http.savecookies'), and we do
-	 * not own the storage.
-	 */
-	if (config.section != var)
-		free((void *)config.section);
+	free((void *)config.section);
 	return 0;
 }
 
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 323e880..c23f478 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1097,7 +1097,7 @@ test_expect_success 'urlmatch' '
 	EOF
 
 	echo true >expect &&
-	git config --bool --get-urlmatch http.sslverify https://good.example.com >actual &&
+	git config --bool --get-urlmatch http.SSLverify https://good.example.com >actual &&
 	test_cmp expect actual &&
 
 	echo false >expect &&
@@ -1108,7 +1108,7 @@ test_expect_success 'urlmatch' '
 		echo http.cookiefile /tmp/cookie.txt &&
 		echo http.sslverify false
 	} >expect &&
-	git config --get-urlmatch http https://weak.example.com >actual &&
+	git config --get-urlmatch HTTP https://weak.example.com >actual &&
 	test_cmp expect actual
 '
 


* Re: [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key
  2013-07-31 23:44       ` Jeff King
@ 2013-08-01 17:25         ` Junio C Hamano
  2013-08-01 17:30           ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2013-08-01 17:25 UTC (permalink / raw)
  To: Jeff King; +Cc: Kyle J. McKay, git

Jeff King <peff@peff.net> writes:

> That being said, git-config _should_ be lowercasing to match the normal
> --get code path. I think the fix (squashable on top of 6/6 + my earlier
> patch) is just:
>
> diff --git a/builtin/config.c b/builtin/config.c
> index c35c5be..9328a90 100644
> --- a/builtin/config.c
> +++ b/builtin/config.c
> @@ -589,7 +589,9 @@ int cmd_config(int argc, const char **argv, const char *prefix)
>  	}
>  	else if (actions == ACTION_GET_URLMATCH) {
>  		check_argc(argc, 2, 2);
> -		return get_urlmatch(argv[0], argv[1]);
> +		if (git_config_parse_key(argv[0], &key, NULL))
> +			return CONFIG_INVALID_KEY;
> +		return get_urlmatch(key, argv[1]);
>  	}
>  	else if (actions == ACTION_UNSET) {
>  		check_argc(argc, 1, 2);

If we drop the "list every key in section.*" mode, the above should
be sufficient, I think.

I do not know how useful it would be for a scripted Porcelain
to be able to ask

    $ git config --get-urlmatch http https://weak.example.com/path/to/git.git

and get all the "http.*" variables that will apply to the given URL.
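
For reference, the two modes differ only in how the first argument is
split. The following is a simplified sketch of that step, not the code in
builtin/config.c: split_section_key() is a hypothetical helper that
lowercases a copy of the argument and cuts it at the first dot, leaving the
key NULL when only a section was given.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: lowercase a copy of "var" and split it at the
 * first dot.  *section receives the allocated copy (the caller frees
 * it); *key points into that same buffer, or is NULL when "var" names
 * a whole section.  Returns 0 on success, -1 on allocation failure.
 */
int split_section_key(const char *var, char **section, const char **key)
{
	size_t len, i;
	char *copy, *dot;

	assert(var && section && key);
	len = strlen(var);
	copy = malloc(len + 1);
	if (!copy)
		return -1;
	for (i = 0; i < len; i++)
		copy[i] = (char)tolower((unsigned char)var[i]);
	copy[len] = '\0';

	dot = strchr(copy, '.');
	if (dot) {
		*dot = '\0';      /* terminate the section name */
		*key = dot + 1;   /* single-variable mode */
	} else {
		*key = NULL;      /* list-every-key mode */
	}
	*section = copy;
	return 0;
}
```

Given "http.SSLverify" this yields section "http" and key "sslverify";
given "HTTP" alone it yields section "http" and a NULL key, which is the
case the listing mode relies on.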


* Re: [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key
  2013-08-01 17:25         ` Junio C Hamano
@ 2013-08-01 17:30           ` Jeff King
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2013-08-01 17:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kyle J. McKay, git

On Thu, Aug 01, 2013 at 10:25:13AM -0700, Junio C Hamano wrote:

> > diff --git a/builtin/config.c b/builtin/config.c
> > index c35c5be..9328a90 100644
> > --- a/builtin/config.c
> > +++ b/builtin/config.c
> > @@ -589,7 +589,9 @@ int cmd_config(int argc, const char **argv, const char *prefix)
> >  	}
> >  	else if (actions == ACTION_GET_URLMATCH) {
> >  		check_argc(argc, 2, 2);
> > -		return get_urlmatch(argv[0], argv[1]);
> > +		if (git_config_parse_key(argv[0], &key, NULL))
> > +			return CONFIG_INVALID_KEY;
> > +		return get_urlmatch(key, argv[1]);
> >  	}
> >  	else if (actions == ACTION_UNSET) {
> >  		check_argc(argc, 1, 2);
> 
> If we drop the "list every key in section.*" mode, the above should
> be sufficient, I think.

Ah, right, forgot about that again.

> I do not know how useful it would be for a scripted Porcelain
> to be able to ask
> 
>     $ git config --get-urlmatch http https://weak.example.com/path/to/git.git
> 
> and get all the "http.*" variables that will apply to the given URL.

I do not think it is all that useful, but I don't think it hurts too
much to have it (and it would be a hard operation to do otherwise). The
implementation you posted that downcases as we memdup the key makes
sense to me.

The key thing is that it is done only by git-config, as we would not
want to pay the allocation cost for every config key we look at during a
git_config() run, but I think your implementation is fine in that
respect.

-Peff


* [PATCH ALTERNATIVE v6.v3 4/6] config: parse http.<url>.<variable> using urlmatch
  2013-07-31 23:03     ` Kyle J. McKay
  2013-07-31 23:44       ` Jeff King
@ 2013-08-05 20:20       ` Kyle J. McKay
  2013-08-05 22:56         ` Junio C Hamano
  1 sibling, 1 reply; 23+ messages in thread
From: Kyle J. McKay @ 2013-08-05 20:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

Use the urlmatch_config_entry() to wrap the underlying
http_options() two-level variable parser in order to set
http.<variable> to the value with the most specific URL in the
configuration.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

This version of 4/6 moves the tests to t0110 since urlmatch is now global.
The config tests are removed since part 6/6 already has those and they no
longer belong with the urlmatch normalization tests.

The Makefile rule has been removed since it's no longer needed to build
correctly as the test program no longer includes http.c.

Other than those changes (and a minor rename to reflect the new location),
this patch is identical to the previous v6.v2 4/6.

 .gitignore                        |   1 +
 Documentation/config.txt          |  45 ++++++++++
 Makefile                          |   3 +
 http.c                            |  13 ++-
 t/.gitattributes                  |   1 +
 t/t0110-urlmatch-normalization.sh | 177 ++++++++++++++++++++++++++++++++++++++
 t/t0110/README                    |   9 ++
 t/t0110/url-1                     | Bin 0 -> 20 bytes
 t/t0110/url-10                    | Bin 0 -> 23 bytes
 t/t0110/url-11                    | Bin 0 -> 25 bytes
 t/t0110/url-2                     | Bin 0 -> 20 bytes
 t/t0110/url-3                     | Bin 0 -> 23 bytes
 t/t0110/url-4                     | Bin 0 -> 23 bytes
 t/t0110/url-5                     | Bin 0 -> 23 bytes
 t/t0110/url-6                     | Bin 0 -> 23 bytes
 t/t0110/url-7                     | Bin 0 -> 23 bytes
 t/t0110/url-8                     | Bin 0 -> 23 bytes
 t/t0110/url-9                     | Bin 0 -> 23 bytes
 test-urlmatch-normalization.c     |  50 +++++++++++
 19 files changed, 298 insertions(+), 1 deletion(-)
 create mode 100755 t/t0110-urlmatch-normalization.sh
 create mode 100644 t/t0110/README
 create mode 100644 t/t0110/url-1
 create mode 100644 t/t0110/url-10
 create mode 100644 t/t0110/url-11
 create mode 100644 t/t0110/url-2
 create mode 100644 t/t0110/url-3
 create mode 100644 t/t0110/url-4
 create mode 100644 t/t0110/url-5
 create mode 100644 t/t0110/url-6
 create mode 100644 t/t0110/url-7
 create mode 100644 t/t0110/url-8
 create mode 100644 t/t0110/url-9
 create mode 100644 test-urlmatch-normalization.c

diff --git a/.gitignore b/.gitignore
index 6669bf0c..b8524bfe 100644
--- a/.gitignore
+++ b/.gitignore
@@ -198,6 +198,7 @@
 /test-string-list
 /test-subprocess
 /test-svn-fe
+/test-urlmatch-normalization
 /test-wildmatch
 /common-cmds.h
 *.tar.gz
diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6e53fc50..a81f3ab7 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1513,6 +1513,51 @@ http.useragent::
 	of common USER_AGENT strings (but not including those like git/1.7.1).
 	Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
 
+http.<url>.*::
+	Any of the http.* options above can be applied selectively to some urls.
+	For a config key to match a URL, each element of the config key is
+	compared to that of the URL, in the following order:
++
+--
+. Scheme (e.g., `https` in `https://example.com/`). This field
+  must match exactly between the config key and the URL.
+
+. Host/domain name (e.g., `example.com` in `https://example.com/`).
+  This field must match exactly between the config key and the URL.
+
+. Port number (e.g., `8080` in `http://example.com:8080/`).
+  This field must match exactly between the config key and the URL.
+  Omitted port numbers are automatically converted to the correct
+  default for the scheme before matching.
+
+. Path (e.g., `repo.git` in `https://example.com/repo.git`). The
+  path field of the config key must match the path field of the URL
+  either exactly or as a prefix of slash-delimited path elements.  This means
+  a config key with path `foo/` matches URL path `foo/bar`.  A prefix can only
+  match on a slash (`/`) boundary.  Longer matches take precedence (so a config
+  key with path `foo/bar` is a better match to URL path `foo/bar` than a config
+  key with just path `foo/`).
+
+. User name (e.g., `user` in `https://user@example.com/repo.git`). If
+  the config key has a user name it must match the user name in the
+  URL exactly. If the config key does not have a user name, that
+  config key will match a URL with any user name (including none),
+  but at a lower precedence than a config key with a user name.
+--
++
+The list above is ordered by decreasing precedence; a URL that matches
+a config key's path is preferred to one that matches its user name. For example,
+if the URL is `https://user@example.com/foo/bar` a config key match of
+`https://example.com/foo` will be preferred over a config key match of
+`https://user@example.com`.
++
+All URLs are normalized before attempting any matching (the password part,
+if embedded in the URL, is always ignored for matching purposes) so that
+equivalent urls that are simply spelled differently will match properly.
+Environment variable settings always override any matches.  The urls that are
+matched against are those given directly to Git commands.  This means any URLs
+visited as a result of a redirection do not participate in matching.
+
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
 	does not care per se, but this information is necessary e.g. when
diff --git a/Makefile b/Makefile
index 0f931a20..2df742c2 100644
--- a/Makefile
+++ b/Makefile
@@ -567,6 +567,7 @@ TEST_PROGRAMS_NEED_X += test-sigchain
 TEST_PROGRAMS_NEED_X += test-string-list
 TEST_PROGRAMS_NEED_X += test-subprocess
 TEST_PROGRAMS_NEED_X += test-svn-fe
+TEST_PROGRAMS_NEED_X += test-urlmatch-normalization
 TEST_PROGRAMS_NEED_X += test-wildmatch
 
 TEST_PROGRAMS = $(patsubst %,%$X,$(TEST_PROGRAMS_NEED_X))
@@ -721,6 +722,7 @@ LIB_H += tree-walk.h
 LIB_H += tree.h
 LIB_H += unpack-trees.h
 LIB_H += url.h
+LIB_H += urlmatch.h
 LIB_H += userdiff.h
 LIB_H += utf8.h
 LIB_H += varint.h
@@ -868,6 +870,7 @@ LIB_OBJS += tree.o
 LIB_OBJS += tree-walk.o
 LIB_OBJS += unpack-trees.o
 LIB_OBJS += url.o
+LIB_OBJS += urlmatch.o
 LIB_OBJS += usage.o
 LIB_OBJS += userdiff.o
 LIB_OBJS += utf8.o
diff --git a/http.c b/http.c
index 37986f82..5eda356f 100644
--- a/http.c
+++ b/http.c
@@ -3,6 +3,7 @@
 #include "sideband.h"
 #include "run-command.h"
 #include "url.h"
+#include "urlmatch.h"
 #include "credential.h"
 #include "version.h"
 #include "pkt-line.h"
@@ -334,10 +335,20 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
 	char *low_speed_time;
+	char *normalized_url;
+	struct urlmatch_config config = { STRING_LIST_INIT_DUP };
+
+	config.section = "http";
+	config.key = NULL;
+	config.collect_fn = http_options;
+	config.cascade_fn = git_default_config;
+	config.cb = NULL;
 
 	http_is_verbose = 0;
+	normalized_url = url_normalize(url, &config.url);
 
-	git_config(http_options, NULL);
+	git_config(urlmatch_config_entry, &config);
+	free(normalized_url);
 
 	curl_global_init(CURL_GLOBAL_ALL);
 
diff --git a/t/.gitattributes b/t/.gitattributes
index 1b97c546..2d44088f 100644
--- a/t/.gitattributes
+++ b/t/.gitattributes
@@ -1 +1,2 @@
 t[0-9][0-9][0-9][0-9]/* -whitespace
+t0110/url-* binary
diff --git a/t/t0110-urlmatch-normalization.sh b/t/t0110-urlmatch-normalization.sh
new file mode 100755
index 00000000..8d6096d4
--- /dev/null
+++ b/t/t0110-urlmatch-normalization.sh
@@ -0,0 +1,177 @@
+#!/bin/sh
+
+test_description='urlmatch URL normalization'
+. ./test-lib.sh
+
+# The base name of the test url files
+tu="$TEST_DIRECTORY/t0110/url"
+
+# Note that only file: URLs should be allowed without a host
+
+test_expect_success 'url scheme' '
+	! test-urlmatch-normalization "" &&
+	! test-urlmatch-normalization "_" &&
+	! test-urlmatch-normalization "scheme" &&
+	! test-urlmatch-normalization "scheme:" &&
+	! test-urlmatch-normalization "scheme:/" &&
+	! test-urlmatch-normalization "scheme://" &&
+	! test-urlmatch-normalization "file" &&
+	! test-urlmatch-normalization "file:" &&
+	! test-urlmatch-normalization "file:/" &&
+	test-urlmatch-normalization "file://" &&
+	! test-urlmatch-normalization "://acme.co" &&
+	! test-urlmatch-normalization "x_test://acme.co" &&
+	! test-urlmatch-normalization "-test://acme.co" &&
+	! test-urlmatch-normalization "0test://acme.co" &&
+	! test-urlmatch-normalization "+test://acme.co" &&
+	! test-urlmatch-normalization ".test://acme.co" &&
+	! test-urlmatch-normalization "schem%6e://" &&
+	test-urlmatch-normalization "x-Test+v1.0://acme.co" &&
+	test "$(test-urlmatch-normalization -p "AbCdeF://x.Y")" = "abcdef://x.y/"
+'
+
+test_expect_success 'url authority' '
+	! test-urlmatch-normalization "scheme://user:pass@" &&
+	! test-urlmatch-normalization "scheme://?" &&
+	! test-urlmatch-normalization "scheme://#" &&
+	! test-urlmatch-normalization "scheme:///" &&
+	! test-urlmatch-normalization "scheme://:" &&
+	! test-urlmatch-normalization "scheme://:555" &&
+	test-urlmatch-normalization "file://user:pass@" &&
+	test-urlmatch-normalization "file://?" &&
+	test-urlmatch-normalization "file://#" &&
+	test-urlmatch-normalization "file:///" &&
+	test-urlmatch-normalization "file://:" &&
+	! test-urlmatch-normalization "file://:555" &&
+	test-urlmatch-normalization "scheme://user:pass@host" &&
+	test-urlmatch-normalization "scheme://@host" &&
+	test-urlmatch-normalization "scheme://%00@host" &&
+	! test-urlmatch-normalization "scheme://%%@host" &&
+	! test-urlmatch-normalization "scheme://host_" &&
+	test-urlmatch-normalization "scheme://user:pass@host/" &&
+	test-urlmatch-normalization "scheme://@host/" &&
+	test-urlmatch-normalization "scheme://host/" &&
+	test-urlmatch-normalization "scheme://host?x" &&
+	test-urlmatch-normalization "scheme://host#x" &&
+	test-urlmatch-normalization "scheme://host/@" &&
+	test-urlmatch-normalization "scheme://host?@x" &&
+	test-urlmatch-normalization "scheme://host#@x" &&
+	test-urlmatch-normalization "scheme://[::1]" &&
+	test-urlmatch-normalization "scheme://[::1]/" &&
+	! test-urlmatch-normalization "scheme://hos%41/" &&
+	test-urlmatch-normalization "scheme://[invalid....:/" &&
+	test-urlmatch-normalization "scheme://invalid....:]/" &&
+	! test-urlmatch-normalization "scheme://invalid....:[/" &&
+	! test-urlmatch-normalization "scheme://invalid....:["
+'
+
+test_expect_success 'url port checks' '
+	test-urlmatch-normalization "xyz://q@some.host:" &&
+	test-urlmatch-normalization "xyz://q@some.host:456/" &&
+	! test-urlmatch-normalization "xyz://q@some.host:0" &&
+	! test-urlmatch-normalization "xyz://q@some.host:0000000" &&
+	test-urlmatch-normalization "xyz://q@some.host:0000001?" &&
+	test-urlmatch-normalization "xyz://q@some.host:065535#" &&
+	test-urlmatch-normalization "xyz://q@some.host:65535" &&
+	! test-urlmatch-normalization "xyz://q@some.host:65536" &&
+	! test-urlmatch-normalization "xyz://q@some.host:99999" &&
+	! test-urlmatch-normalization "xyz://q@some.host:100000" &&
+	! test-urlmatch-normalization "xyz://q@some.host:100001" &&
+	test-urlmatch-normalization "http://q@some.host:80" &&
+	test-urlmatch-normalization "https://q@some.host:443" &&
+	test-urlmatch-normalization "http://q@some.host:80/" &&
+	test-urlmatch-normalization "https://q@some.host:443?" &&
+	! test-urlmatch-normalization "http://q@:8008" &&
+	! test-urlmatch-normalization "http://:8080" &&
+	! test-urlmatch-normalization "http://:" &&
+	test-urlmatch-normalization "xyz://q@some.host:456/" &&
+	test-urlmatch-normalization "xyz://[::1]:456/" &&
+	test-urlmatch-normalization "xyz://[::1]:/" &&
+	! test-urlmatch-normalization "xyz://[::1]:000/" &&
+	! test-urlmatch-normalization "xyz://[::1]:0%300/" &&
+	! test-urlmatch-normalization "xyz://[::1]:0x80/" &&
+	! test-urlmatch-normalization "xyz://[::1]:4294967297/" &&
+	! test-urlmatch-normalization "xyz://[::1]:030f/"
+'
+
+test_expect_success 'url port normalization' '
+	test "$(test-urlmatch-normalization -p "http://x:800")" = "http://x:800/" &&
+	test "$(test-urlmatch-normalization -p "http://x:0800")" = "http://x:800/" &&
+	test "$(test-urlmatch-normalization -p "http://x:00000800")" = "http://x:800/" &&
+	test "$(test-urlmatch-normalization -p "http://x:065535")" = "http://x:65535/" &&
+	test "$(test-urlmatch-normalization -p "http://x:1")" = "http://x:1/" &&
+	test "$(test-urlmatch-normalization -p "http://x:80")" = "http://x/" &&
+	test "$(test-urlmatch-normalization -p "http://x:080")" = "http://x/" &&
+	test "$(test-urlmatch-normalization -p "http://x:000000080")" = "http://x/" &&
+	test "$(test-urlmatch-normalization -p "https://x:443")" = "https://x/" &&
+	test "$(test-urlmatch-normalization -p "https://x:0443")" = "https://x/" &&
+	test "$(test-urlmatch-normalization -p "https://x:000000443")" = "https://x/"
+'
+
+test_expect_success 'url general escapes' '
+	! test-urlmatch-normalization "http://x.y?%fg" &&
+	test "$(test-urlmatch-normalization -p "X://W/%7e%41^%3a")" = "x://w/~A%5E%3A" &&
+	test "$(test-urlmatch-normalization -p "X://W/:/?#[]@")" = "x://w/:/?#[]@" &&
+	test "$(test-urlmatch-normalization -p "X://W/$&()*+,;=")" = "x://w/$&()*+,;=" &&
+	test "$(test-urlmatch-normalization -p "X://W/'\''")" = "x://w/'\''" &&
+	test "$(test-urlmatch-normalization -p "X://W?'\!'")" = "x://w/?'\!'"
+'
+
+test_expect_success 'url high-bit escapes' '
+	test "$(test-urlmatch-normalization -p "$(cat "$tu-1")")" = "x://q/%01%02%03%04%05%06%07%08%0E%0F%10%11%12" &&
+	test "$(test-urlmatch-normalization -p "$(cat "$tu-2")")" = "x://q/%13%14%15%16%17%18%19%1B%1C%1D%1E%1F%7F" &&
+	test "$(test-urlmatch-normalization -p "$(cat "$tu-3")")" = "x://q/%80%81%82%83%84%85%86%87%88%89%8A%8B%8C%8D%8E%8F" &&
+	test "$(test-urlmatch-normalization -p "$(cat "$tu-4")")" = "x://q/%90%91%92%93%94%95%96%97%98%99%9A%9B%9C%9D%9E%9F" &&
+	test "$(test-urlmatch-normalization -p "$(cat "$tu-5")")" = "x://q/%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF" &&
+	test "$(test-urlmatch-normalization -p "$(cat "$tu-6")")" = "x://q/%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF" &&
+	test "$(test-urlmatch-normalization -p "$(cat "$tu-7")")" = "x://q/%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF" &&
+	test "$(test-urlmatch-normalization -p "$(cat "$tu-8")")" = "x://q/%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF" &&
+	test "$(test-urlmatch-normalization -p "$(cat "$tu-9")")" = "x://q/%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF" &&
+	test "$(test-urlmatch-normalization -p "$(cat "$tu-10")")" = "x://q/%F0%F1%F2%F3%F4%F5%F6%F7%F8%F9%FA%FB%FC%FD%FE%FF" &&
+	test "$(test-urlmatch-normalization -p "$(cat "$tu-11")")" = "x://q/%C2%80%DF%BF%E0%A0%80%EF%BF%BD%F0%90%80%80%F0%AF%BF%BD"
+'
+
+test_expect_success 'url username/password escapes' '
+	test "$(test-urlmatch-normalization -p "x://%41%62(^):%70+d@foo")" = "x://Ab(%5E):p+d@foo/"
+'
+
+test_expect_success 'url normalized lengths' '
+	test "$(test-urlmatch-normalization -l "Http://%4d%65:%4d^%70@The.Host")" = 25 &&
+	test "$(test-urlmatch-normalization -l "http://%41:%42@x.y/%61/")" = 17 &&
+	test "$(test-urlmatch-normalization -l "http://@x.y/^")" = 15
+'
+
+test_expect_success 'url . and .. segments' '
+	test "$(test-urlmatch-normalization -p "x://y/.")" = "x://y/" &&
+	test "$(test-urlmatch-normalization -p "x://y/./")" = "x://y/" &&
+	test "$(test-urlmatch-normalization -p "x://y/a/.")" = "x://y/a" &&
+	test "$(test-urlmatch-normalization -p "x://y/a/./")" = "x://y/a/" &&
+	test "$(test-urlmatch-normalization -p "x://y/.?")" = "x://y/?" &&
+	test "$(test-urlmatch-normalization -p "x://y/./?")" = "x://y/?" &&
+	test "$(test-urlmatch-normalization -p "x://y/a/.?")" = "x://y/a?" &&
+	test "$(test-urlmatch-normalization -p "x://y/a/./?")" = "x://y/a/?" &&
+	test "$(test-urlmatch-normalization -p "x://y/a/./b/.././../c")" = "x://y/c" &&
+	test "$(test-urlmatch-normalization -p "x://y/a/./b/../.././c/")" = "x://y/c/" &&
+	test "$(test-urlmatch-normalization -p "x://y/a/./b/.././../c/././.././.")" = "x://y/" &&
+	! test-urlmatch-normalization "x://y/a/./b/.././../c/././.././.." &&
+	test "$(test-urlmatch-normalization -p "x://y/a/./?/././..")" = "x://y/a/?/././.." &&
+	test "$(test-urlmatch-normalization -p "x://y/%2e/")" = "x://y/" &&
+	test "$(test-urlmatch-normalization -p "x://y/%2E/")" = "x://y/" &&
+	test "$(test-urlmatch-normalization -p "x://y/a/%2e./")" = "x://y/" &&
+	test "$(test-urlmatch-normalization -p "x://y/b/.%2E/")" = "x://y/" &&
+	test "$(test-urlmatch-normalization -p "x://y/c/%2e%2E/")" = "x://y/"
+'
+
+# http://@foo specifies an empty user name but does not specify a password
+# http://foo  specifies neither a user name nor a password
+# So they should not be equivalent
+test_expect_success 'url equivalents' '
+	test-urlmatch-normalization "httP://x" "Http://X/" &&
+	test-urlmatch-normalization "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
+	! test-urlmatch-normalization "https://@x.y/^" "httpS://x.y:443/^" &&
+	test-urlmatch-normalization "https://@x.y/^" "httpS://@x.y:0443/^" &&
+	test-urlmatch-normalization "https://@x.y/^/../abc" "httpS://@x.y:0443/abc" &&
+	test-urlmatch-normalization "https://@x.y/^/.." "httpS://@x.y:0443/"
+'
+
+test_done
diff --git a/t/t0110/README b/t/t0110/README
new file mode 100644
index 00000000..ad4a50ec
--- /dev/null
+++ b/t/t0110/README
@@ -0,0 +1,9 @@
+The url data files in this directory contain URLs with characters
+in the range 0x01-0x1f and 0x7f-0xff to test the proper normalization
+of unprintable characters.
+
+A select few characters in the 0x01-0x1f range are skipped to help
+avoid problems running the test itself.
+
+The urls are in test files in this directory rather than being
+embedded in the test script for portability.
diff --git a/t/t0110/url-1 b/t/t0110/url-1
new file mode 100644
index 0000000000000000000000000000000000000000..519019c5ce6c58478f048a2f39e2321370d318c6
GIT binary patch
literal 20
bcmb=h($_E4XJle#VP#|I;Nuq%6ygE^Admtt

literal 0
HcmV?d00001

diff --git a/t/t0110/url-10 b/t/t0110/url-10
new file mode 100644
index 0000000000000000000000000000000000000000..b9965de6a5d74b122179821212b2c27c8ae03e80
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFYxj5^Yr!h_xSnx`~3a>{|dCd5i<Y)

literal 0
HcmV?d00001

diff --git a/t/t0110/url-11 b/t/t0110/url-11
new file mode 100644
index 0000000000000000000000000000000000000000..f0a50f10096a20d597f40c775f09a71276e0050a
GIT binary patch
literal 25
hcmb=h($_E4Kh$u4|APe$@AvQhFrlI0!}|Suxd5(W4xs=5

literal 0
HcmV?d00001

diff --git a/t/t0110/url-2 b/t/t0110/url-2
new file mode 100644
index 0000000000000000000000000000000000000000..43334b05b2de3794d6020abd96e634a4e9e49cb0
GIT binary patch
literal 20
bcmb=h($_E47Zwo}6PJ*bmXVc{ujc{)C{+Vx

literal 0
HcmV?d00001

diff --git a/t/t0110/url-3 b/t/t0110/url-3
new file mode 100644
index 0000000000000000000000000000000000000000..7378c7bec247b996bc67b00a05ed89cf47d4b7a7
GIT binary patch
literal 23
ecmb=h($_E4Z)j|4ZfR|6@96C6?&<C8=K=t7Jqj}b

literal 0
HcmV?d00001

diff --git a/t/t0110/url-4 b/t/t0110/url-4
new file mode 100644
index 0000000000000000000000000000000000000000..220b198c97f942fea4960f51a2105cc42261061a
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFOZRvla!T~mzbHFo1C4Vp9*`u3o`%!

literal 0
HcmV?d00001

diff --git a/t/t0110/url-5 b/t/t0110/url-5
new file mode 100644
index 0000000000000000000000000000000000000000..1ccd9277792840955bb124bdde21f4b08bcccb63
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFQB2Kqok##r>Lo_tE{cAuL^}d3^M=#

literal 0
HcmV?d00001

diff --git a/t/t0110/url-6 b/t/t0110/url-6
new file mode 100644
index 0000000000000000000000000000000000000000..e8283aac6dff049d3e02454db6e684c5790a5996
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFR-z)v$VCgx45~wyS%-=zY31M4Kn}$

literal 0
HcmV?d00001

diff --git a/t/t0110/url-7 b/t/t0110/url-7
new file mode 100644
index 0000000000000000000000000000000000000000..fa7c10b615259deefd15b638b021da7c60eba1b2
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFTlaV!^FkL$H>Xb%goKr&kC454l@7%

literal 0
HcmV?d00001

diff --git a/t/t0110/url-8 b/t/t0110/url-8
new file mode 100644
index 0000000000000000000000000000000000000000..79a0ba836f5b8886b0a73f161eb292af2b105e65
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFVNA_)6~`0*Vx(G+uYsW-wL6<4>JG&

literal 0
HcmV?d00001

diff --git a/t/t0110/url-9 b/t/t0110/url-9
new file mode 100644
index 0000000000000000000000000000000000000000..8b44bec48b94467c63e8e1ad18162e465da6d6dd
GIT binary patch
literal 23
hcmV+y0O<dCIxjDAFW}+g<K*S$=jiF`>+J3B?+U9u5HkP(

literal 0
HcmV?d00001

diff --git a/test-urlmatch-normalization.c b/test-urlmatch-normalization.c
new file mode 100644
index 00000000..2603899b
--- /dev/null
+++ b/test-urlmatch-normalization.c
@@ -0,0 +1,50 @@
+#include "git-compat-util.h"
+#include "urlmatch.h"
+
+int main(int argc, char **argv)
+{
+	const char *usage = "test-urlmatch-normalization [-p | -l] <url1> | <url1> <url2>";
+	char *url1, *url2;
+	int opt_p = 0, opt_l = 0;
+
+	/*
+	 * For one url, succeed if url_normalize succeeds on it, fail otherwise.
+	 * For two urls, succeed only if url_normalize succeeds on both and
+	 * the results compare equal with strcmp.  If -p is given (one url only)
+	 * and url_normalize succeeds, print the result followed by "\n".  If
+	 * -l is given (one url only) and url_normalize succeeds, print the
+	 * returned length in decimal followed by "\n".
+	 */
+
+	if (argc > 1 && !strcmp(argv[1], "-p")) {
+		opt_p = 1;
+		argc--;
+		argv++;
+	} else if (argc > 1 && !strcmp(argv[1], "-l")) {
+		opt_l = 1;
+		argc--;
+		argv++;
+	}
+
+	if (argc < 2 || argc > 3)
+		die("%s", usage);
+
+	if (argc == 2) {
+		struct url_info info;
+		url1 = url_normalize(argv[1], &info);
+		if (!url1)
+			return 1;
+		if (opt_p)
+			printf("%s\n", url1);
+		if (opt_l)
+			printf("%u\n", (unsigned)info.url_len);
+		return 0;
+	}
+
+	if (opt_p || opt_l)
+		die("%s", usage);
+
+	url1 = url_normalize(argv[1], NULL);
+	url2 = url_normalize(argv[2], NULL);
+	return (url1 && url2 && !strcmp(url1, url2)) ? 0 : 1;
+}
-- 
1.8.3

* Re: [PATCH ALTERNATIVE v6.v3 4/6] config: parse http.<url>.<variable> using urlmatch
  2013-08-05 20:20       ` [PATCH ALTERNATIVE v6.v3 4/6] config: parse http.<url>.<variable> using urlmatch Kyle J. McKay
@ 2013-08-05 22:56         ` Junio C Hamano
  2013-08-05 23:57           ` Kyle J. McKay
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2013-08-05 22:56 UTC (permalink / raw)
  To: Kyle J. McKay; +Cc: git, Jeff King

"Kyle J. McKay" <mackyle@gmail.com> writes:

> Use the urlmatch_config_entry() to wrap the underlying
> http_options() two-level variable parser in order to set
> http.<variable> to the value with the most specific URL in the
> configuration.
>
> Signed-off-by: Jeff King <peff@peff.net>
> Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---

Oops, what did we sign-off?

> This version of 4/6 moves the tests to t0110 since urlmatch is now global.
> The config tests are removed since part 6/6 already has those and they no
> longer belong with the urlmatch normalization tests.
>
> The Makefile rule has been removed since it's no longer needed to build
> correctly as the test program no longer includes http.c.
>
> Other than those changes (and a minor rename to reflect the new location),
> this patch is identical to the previous v6.v2 4/6.

Ahh, figures.  Thanks.

Peff, any comments?

> diff --git a/t/t0110-urlmatch-normalization.sh b/t/t0110-urlmatch-normalization.sh
> new file mode 100755
> index 00000000..8d6096d4
> --- /dev/null
> +++ b/t/t0110-urlmatch-normalization.sh
> @@ -0,0 +1,177 @@
> +#!/bin/sh
> +
> +test_description='urlmatch URL normalization'
> +. ./test-lib.sh
> +
> +# The base name of the test url files
> +tu="$TEST_DIRECTORY/t0110/url"
> +
> +# Note that only file: URLs should be allowed without a host

It is somewhat unfortunate that the form most commonly used for
pushing is not supported at all, i.e.

	host:path

Current configuration set may not have anything interesting to
affect the git-over-ssh push codepath, so in practice it may not
matter, though.

> +test_expect_success 'url authority' '

"authority" refers to the host part? (not a complaint, but is a
question)

> +test_expect_success 'url port checks' '
> +	test-urlmatch-normalization "xyz://q@some.host:" &&

This is presumably replaced by a default port for xyz:// scheme,
whatever the default port is, in other words, it is as if no colon
is given at the end?

> +	test-urlmatch-normalization "xyz://q@some.host:456/" &&
> +	! test-urlmatch-normalization "xyz://q@some.host:0" &&
> +	! test-urlmatch-normalization "xyz://q@some.host:0000000" &&

Port #0 is disallowed?

> +	test-urlmatch-normalization "xyz://q@some.host:0000001?" &&

Is it the same as specifying "xyz://q@some.host:1?" and does it
match "xyz://q@some.host:1"?

> +	test-urlmatch-normalization "xyz://q@some.host:065535#" &&

Ditto, for 65535 and without #-fragment at the end?

> +test_expect_success 'url port normalization' '
> +	test "$(test-urlmatch-normalization -p "http://x:800")" = "http://x:800/" &&
> +	test "$(test-urlmatch-normalization -p "http://x:0800")" = "http://x:800/" &&
> +	test "$(test-urlmatch-normalization -p "http://x:00000800")" = "http://x:800/" &&
> +	test "$(test-urlmatch-normalization -p "http://x:065535")" = "http://x:65535/" &&
> +	test "$(test-urlmatch-normalization -p "http://x:1")" = "http://x:1/" &&
> +	test "$(test-urlmatch-normalization -p "http://x:80")" = "http://x/" &&
> +	test "$(test-urlmatch-normalization -p "http://x:080")" = "http://x/" &&
> +	test "$(test-urlmatch-normalization -p "http://x:000000080")" = "http://x/" &&
> +	test "$(test-urlmatch-normalization -p "https://x:443")" = "https://x/" &&
> +	test "$(test-urlmatch-normalization -p "https://x:0443")" = "https://x/" &&
> +	test "$(test-urlmatch-normalization -p "https://x:000000443")" = "https://x/"
> +'

OK, these answer most of the previous questions.

> +# http://@foo specifies an empty user name but does not specify a password
> +# http://foo  specifies neither a user name nor a password
> +# So they should not be equivalent
> +test_expect_success 'url equivalents' '
> +	test-urlmatch-normalization "httP://x" "Http://X/" &&
> +	test-urlmatch-normalization "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
> +	! test-urlmatch-normalization "https://@x.y/^" "httpS://x.y:443/^" &&

The comment is about this test, which seems to make sense.  What is
"^"?  Just a random valid character that can appear in the path?
(not a complaint, but is a question).

* Re: [PATCH ALTERNATIVE v6.v3 4/6] config: parse http.<url>.<variable> using urlmatch
  2013-08-05 22:56         ` Junio C Hamano
@ 2013-08-05 23:57           ` Kyle J. McKay
  0 siblings, 0 replies; 23+ messages in thread
From: Kyle J. McKay @ 2013-08-05 23:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

On Aug 5, 2013, at 15:56, Junio C Hamano wrote:
> "Kyle J. McKay" <mackyle@gmail.com> writes:
>
>> Use the urlmatch_config_entry() to wrap the underlying
>> http_options() two-level variable parser in order to set
>> http.<variable> to the value with the most specific URL in the
>> configuration.
>>
>> Signed-off-by: Jeff King <peff@peff.net>
>> Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
>> Signed-off-by: Junio C Hamano <gitster@pobox.com>
>> ---
>
> Oops, what did we sign-off?

Some code removal.  No new additions.  Actually this:

On Aug 1, 2013, at 14:44, Junio C Hamano wrote:

> * jc/url-match (2013-07-31) 6 commits
> - config: "git config --get-urlmatch" parses section.<url>.key
> - builtin/config: refactor collect_config()
> - config: parse http.<url>.<variable> using urlmatch
> - config: add generic callback wrapper to parse section.<url>.key
> - config: add helper to normalize and match URLs
> - http.c: fix parsing of http.sslCertPasswordProtected variable
>
> Reroll of km/http-curl-config-per-url topic.  Peff raises many good
> points about the tests for http.* variables.
>
> Waiting for the discussion to conclude, hopefully with a replacement test.

As requested.

>> This version of 4/6 moves the tests to t0110 since urlmatch is now global.
>> The config tests are removed since part 6/6 already has those and they no
>> longer belong with the urlmatch normalization tests.
>>
>> The Makefile rule has been removed since it's no longer needed to build
>> correctly as the test program no longer includes http.c.
>>
>> Other than those changes (and a minor rename to reflect the new location),
>> this patch is identical to the previous v6.v2 4/6.
>
> Ahh, figures.  Thanks.

The remaining tests, by the way, have not changed.  They are identical  
to previous versions.

> Peff, any comments?
>
>> diff --git a/t/t0110-urlmatch-normalization.sh b/t/t0110-urlmatch-normalization.sh
>> new file mode 100755
>> index 00000000..8d6096d4
>> --- /dev/null
>> +++ b/t/t0110-urlmatch-normalization.sh
>> @@ -0,0 +1,177 @@
>> +#!/bin/sh
>> +
>> +test_description='urlmatch URL normalization'
>> +. ./test-lib.sh
>> +
>> +# The base name of the test url files
>> +tu="$TEST_DIRECTORY/t0110/url"
>> +
>> +# Note that only file: URLs should be allowed without a host
>
> It is somewhat unfortunate that the form most commonly used for
> pushing is not supported at all, i.e.
>
> 	host:path

That is an SSH extension, and such strings are certainly not URLs
according to RFC 3986, because that would require every host to be
its own scheme.

Also, host:path cannot, in the general case, be unambiguously
translated to a URL.

For example, repo.or.cz:srv/git/alt-git has no translation.  It is
different from repo.or.cz:/srv/git/alt-git, which does have a
translation.  There's no guarantee that inserting a '/' will not
change the meaning of the URL (that only happens to be the case on
repo.or.cz because all the ssh git users in the chroot jail have a '/'
home directory).
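
To make the distinction concrete, here is a minimal shell sketch.  It
is only an illustration: the helper name scp_to_url is made up, and
mapping host:/abs/path to ssh://host/abs/path is the usual git
convention, assumed here rather than taken from this series.

```sh
# Hypothetical helper, not part of git.  Translate an scp-like
# "host:path" into an ssh:// URL, but only when the path is absolute.
# A relative path depends on the remote user's home directory, so the
# function refuses to guess and returns non-zero.
scp_to_url () {
	case "$1" in
	*://*) return 1 ;;	# already has a scheme; not scp-like
	*:*) ;;			# looks like host:path
	*) return 1 ;;		# no colon at all, e.g. a local path
	esac
	host=${1%%:*}
	path=${1#*:}
	case "$path" in
	/*) printf 'ssh://%s%s\n' "$host" "$path" ;;
	*) return 1 ;;
	esac
}

scp_to_url repo.or.cz:/srv/git/alt-git
scp_to_url repo.or.cz:srv/git/alt-git || echo "no unambiguous translation"
```

The first call prints ssh://repo.or.cz/srv/git/alt-git; the second
falls through to the message because the path is relative.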

> Current configuration set may not have anything interesting to
> affect the git-over-ssh push codepath, so in practice it may not
> matter, though.
>
>> +test_expect_success 'url authority' '
>
> "authority" refers to the host part? (not a complaint, but is a
> question)

It refers to this production from RFC 3986 Section "3.2 Authority":

authority = [ userinfo "@" ] host [ ":" port ]
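
As a rough illustration of that production, the following shell
sketch splits an authority string into its three parts.  It is not
the parser in urlmatch.c; the function name split_authority and the
variable has_userinfo are made up for the example.  The has_userinfo
flag is there because an empty userinfo ("@x.y") is not the same as
no userinfo ("x.y").

```sh
# Hypothetical helper, not git's parser.  Sets the shell variables
# userinfo, has_userinfo, host and port from an authority string.
# No validation is done; it only shows where the parts are cut.
split_authority () {
	rest=$1
	userinfo= has_userinfo=no host= port=
	case "$rest" in
	*@*)
		has_userinfo=yes
		userinfo=${rest%%@*}
		rest=${rest#*@}
		;;
	esac
	case "$rest" in
	\[*\]:*)		# bracketed IPv6 literal followed by ":port"
		host=${rest%:*}
		port=${rest##*:}
		;;
	\[*\])			# bracketed IPv6 literal, no port
		host=$rest
		;;
	*:*)
		host=${rest%%:*}
		port=${rest#*:}
		;;
	*)
		host=$rest
		;;
	esac
}

split_authority 'q@some.host:456'
echo "userinfo=$userinfo host=$host port=$port"
```

With the input above this prints "userinfo=q host=some.host port=456".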

>> +test_expect_success 'url port checks' '
>> +	test-urlmatch-normalization "xyz://q@some.host:" &&
>
> This is presumably replaced by a default port for xyz:// scheme,
> whatever the default port is, in other words, it is as if no colon
> is given at the end?

Yes.

The "port" production above is:

port = *DIGIT

which means 0 or more digits.

>> +	test-urlmatch-normalization "xyz://q@some.host:456/" &&
>> +	! test-urlmatch-normalization "xyz://q@some.host:0" &&
>> +	! test-urlmatch-normalization "xyz://q@some.host:0000000" &&
>
> Port #0 is disallowed?

Intentionally so.

The comments from urlmatch.c talk about this:

/*
  * Port number must be all digits with leading 0s removed
  * and since all the protocols we deal with have a 16-bit
  * port number it must also be in the range 1..65535
  * 0 is not allowed because that means "next available"
  * on just about every system and therefore cannot be used
  */
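
The rule in that comment, together with the empty-port case above,
can be sketched in shell as follows.  This is an illustration only,
not the code from urlmatch.c; normalize_port and drop_default_port
are made-up names, and the http/https defaults are the ones the
'url port normalization' test exercises.

```sh
# Hypothetical helper.  Prints the normalized port on stdout (an empty
# line for an empty port) and returns non-zero for an invalid port.
normalize_port () {
	port=$1
	if test -z "$port"
	then
		printf '\n'	# "host:" with no digits: scheme default applies
		return 0
	fi
	case "$port" in
	*[!0-9]*)
		return 1 ;;	# anything other than digits is rejected
	esac
	while test "${port#0}" != "$port"
	do
		port=${port#0}	# drop one leading zero per iteration
	done
	# A port of all zeros has collapsed to the empty string by now.
	# The length check keeps huge values away from the numeric test.
	if test -z "$port" || test ${#port} -gt 5 || test "$port" -gt 65535
	then
		return 1
	fi
	printf '%s\n' "$port"
}

# Hypothetical helper.  Drop the port when it is the scheme's default.
drop_default_port () {
	case "$1:$2" in
	http:80 | https:443) printf '\n' ;;
	*) printf '%s\n' "$2" ;;
	esac
}

normalize_port 0800
normalize_port 0 || echo "port 0 rejected"
```

The two calls at the end print "800" and "port 0 rejected".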

>> +	test-urlmatch-normalization "xyz://q@some.host:0000001?" &&
>
> Is it the same as specifying "xyz://q@some.host:1?" and does it
> match "xyz://q@some.host:1"?
>
>> +	test-urlmatch-normalization "xyz://q@some.host:065535#" &&
>
> Ditto, for 65535 and without #-fragment at the end?
>
>> +test_expect_success 'url port normalization' '
>> +	test "$(test-urlmatch-normalization -p "http://x:800")" = "http://x:800/" &&
>> +	test "$(test-urlmatch-normalization -p "http://x:0800")" = "http://x:800/" &&
>> +	test "$(test-urlmatch-normalization -p "http://x:00000800")" = "http://x:800/" &&
>> +	test "$(test-urlmatch-normalization -p "http://x:065535")" = "http://x:65535/" &&
>> +	test "$(test-urlmatch-normalization -p "http://x:1")" = "http://x:1/" &&
>> +	test "$(test-urlmatch-normalization -p "http://x:80")" = "http://x/" &&
>> +	test "$(test-urlmatch-normalization -p "http://x:080")" = "http://x/" &&
>> +	test "$(test-urlmatch-normalization -p "http://x:000000080")" = "http://x/" &&
>> +	test "$(test-urlmatch-normalization -p "https://x:443")" = "https://x/" &&
>> +	test "$(test-urlmatch-normalization -p "https://x:0443")" = "https://x/" &&
>> +	test "$(test-urlmatch-normalization -p "https://x:000000443")" = "https://x/"
>> +'
>
> OK, these answer most of the previous questions.
>
>> +# http://@foo specifies an empty user name but does not specify a password
>> +# http://foo  specifies neither a user name nor a password
>> +# So they should not be equivalent
>> +test_expect_success 'url equivalents' '
>> +	test-urlmatch-normalization "httP://x" "Http://X/" &&
>> +	test-urlmatch-normalization "Http://%4d%65:%4d^%70@The.Host" "hTTP://Me:%4D^p@the.HOST:80/" &&
>> +	! test-urlmatch-normalization "https://@x.y/^" "httpS://x.y:443/^" &&
>
> The comment is about this test, which seems to make sense.  What is
> "^"?  Just a random valid character that can appear in the path?
> (not a complaint, but is a question).

The character '^' is one of the always-unsafe characters that must  
always be escaped.  It's also one of the always-unsafe characters  
that's easy to include in the tests as it doesn't require escaping or  
backslashing or binary includes.  It doesn't otherwise have any  
special meaning.
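
For illustration, here is a minimal shell sketch of that escaping
step.  It is not the urlmatch.c code: escape_unsafe is a made-up
name, the input is assumed to be ASCII, and existing %XX sequences
are passed through untouched instead of being decoded or
upper-cased.  The set of characters left alone (RFC 3986 unreserved
and reserved characters) is inferred from the 'url general escapes'
test.

```sh
# Hypothetical helper, ASCII input only.  Percent-escape every
# character that is neither unreserved nor reserved in RFC 3986.
escape_unsafe () {
	allowed='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
	# unreserved marks, gen-delims, sub-delims, "%" and the single quote
	allowed=${allowed}'-._~:/?#[]@!$&()*+,;=%'"'"
	src=$1 out=
	while test -n "$src"
	do
		rest=${src#?}
		c=${src%"$rest"}	# first character of what is left
		case "$allowed" in
		*"$c"*)
			out=$out$c ;;
		*)
			# "'$c" makes printf use the character's numeric value
			out=$out$(printf '%%%02X' "'$c") ;;
		esac
		src=$rest
	done
	printf '%s\n' "$out"
}

escape_unsafe 'http://@x.y/^'
```

The call prints http://@x.y/%5E, which is 15 characters long and so
agrees with the 'url normalized lengths' test for the same URL.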

Thread overview: 23+ messages
2013-07-31 19:26 [PATCH v6 0/6] http.<url>.<key> and friends Junio C Hamano
2013-07-31 19:26 ` [PATCH v6 1/6] http.c: fix parsing of http.sslCertPasswordProtected variable Junio C Hamano
2013-07-31 19:26 ` [PATCH v6 2/6] config: add helper to normalize and match URLs Junio C Hamano
2013-07-31 20:50   ` Kyle J. McKay
2013-07-31 19:26 ` [PATCH v6 3/6] config: add generic callback wrapper to parse section.<url>.key Junio C Hamano
2013-07-31 19:26 ` [PATCH v6 4/6] config: parse http.<url>.<variable> using urlmatch Junio C Hamano
2013-07-31 20:51   ` Kyle J. McKay
2013-07-31 20:51   ` [PATCH ALTERNATIVE v6 0/2] http.<url>.<key> and friends Kyle J. McKay
2013-07-31 20:52     ` [PATCH ALTERNATIVE v6 2/4] config: add helper to normalize and match URLs Kyle J. McKay
2013-07-31 20:52     ` [PATCH ALTERNATIVE v6 4/4] config: parse http.<url>.<variable> using urlmatch Kyle J. McKay
2013-07-31 22:01     ` [PATCH ALTERNATIVE v6 0/2] http.<url>.<key> and friends Junio C Hamano
2013-07-31 22:41     ` [PATCH ALTERNATIVE v6.v2 4/6] config: parse http.<url>.<variable> using urlmatch Kyle J. McKay
2013-07-31 19:26 ` [PATCH v6 5/6] builtin/config: refactor collect_config() Junio C Hamano
2013-07-31 19:26 ` [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key Junio C Hamano
2013-07-31 22:45   ` Jeff King
2013-07-31 23:03     ` Kyle J. McKay
2013-07-31 23:44       ` Jeff King
2013-08-01 17:25         ` Junio C Hamano
2013-08-01 17:30           ` Jeff King
2013-08-05 20:20       ` [PATCH ALTERNATIVE v6.v3 4/6] config: parse http.<url>.<variable> using urlmatch Kyle J. McKay
2013-08-05 22:56         ` Junio C Hamano
2013-08-05 23:57           ` Kyle J. McKay
2013-07-31 23:47     ` [PATCH v6 6/6] config: "git config --get-urlmatch" parses section.<url>.key Junio C Hamano
