From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <patrick.steinhardt@elego.de>,
	Junio C Hamano <gitster@pobox.com>,
	Patrick Steinhardt <ps@pks.im>
Subject: [PATCH v5 5/5] urlmatch: allow globbing for the URL host part
Date: Tue, 31 Jan 2017 10:01:47 +0100
Message-ID: <542b3b76f6c23a2b0e1e1306b6e58ce522977bf3.1485853153.git.ps@pks.im>
In-Reply-To: <cover.1485853153.git.ps@pks.im>

From: Patrick Steinhardt <patrick.steinhardt@elego.de>

The URL matching function computes for two URLs whether they match or not.
The match is performed by splitting up the URL into different parts and
then doing an exact comparison with the to-be-matched URL.

The main user of `urlmatch` is the configuration subsystem. It allows
setting certain configuration values based on the URL being connected
to, via keys like `http.<url>.*`. A common use case is to set proxies
only for remotes which match the given URL. Unfortunately, requiring
exact matches for all parts of the URL can become quite tedious in some
setups. Imagine for example a corporate network with dozens or even
hundreds of subdomains, each of which would have to be configured
individually.

Allow users to write an asterisk '*' in place of any 'host' or
'subdomain' label as part of the host name.  For example,
"http.https://*.example.com.proxy" sets "http.proxy" for all direct
subdomains of "https://example.com", e.g. "https://foo.example.com", but
not "https://foo.bar.example.com".

Signed-off-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
Helped-by: Junio C Hamano <gitster@pobox.com>
---
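For illustration only, and not part of this patch: a minimal standalone
sketch of the label-by-label host comparison described in the commit
message. It assumes the host parts have already been extracted as
NUL-terminated strings; `host_matches` and `label_end` are hypothetical
names, whereas the patch itself works on `struct url_info` offsets in
`match_host`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the next '.' within s[0..n), or s + n if none. */
static const char *label_end(const char *s, size_t n)
{
	const char *dot = memchr(s, '.', n);
	return dot ? dot : s + n;
}

/*
 * Hypothetical helper: return 1 if url_host matches pattern_host when
 * compared label by label, 0 otherwise. A pattern label consisting
 * solely of "*" matches any single label of the URL host.
 */
static int host_matches(const char *url_host, const char *pattern_host)
{
	size_t url_len, pat_len;

	assert(url_host && pattern_host);
	url_len = strlen(url_host);
	pat_len = strlen(pattern_host);

	while (url_len && pat_len) {
		const char *url_next = label_end(url_host, url_len);
		const char *pat_next = label_end(pattern_host, pat_len);
		size_t url_label = (size_t)(url_next - url_host);
		size_t pat_label = (size_t)(pat_next - pattern_host);

		if (pat_label == 1 && pattern_host[0] == '*')
			; /* wildcard label: accept whatever the URL has here */
		else if (pat_label != url_label ||
			 memcmp(url_host, pattern_host, url_label))
			return 0; /* labels differ */

		/* step over the '.' separator, if there is one */
		if (url_next < url_host + url_len)
			url_next++;
		url_len -= (size_t)(url_next - url_host);
		url_host = url_next;
		if (pat_next < pattern_host + pat_len)
			pat_next++;
		pat_len -= (size_t)(pat_next - pattern_host);
		pattern_host = pat_next;
	}

	/* both sides must be fully consumed for a match */
	return !url_len && !pat_len;
}
```

With this sketch, `host_matches("foo.example.com", "*.example.com")`
returns 1, while "foo.bar.example.com" and "example.com" both return 0
against the same pattern, because the number of labels has to agree.
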
 Documentation/config.txt |  5 +++-
 t/t1300-repo-config.sh   | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
 urlmatch.c               | 49 +++++++++++++++++++++++++++++---
 3 files changed, 121 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index af2ae4cc0..ee155d8a6 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1914,7 +1914,10 @@ http.<url>.*::
   must match exactly between the config key and the URL.
 
 . Host/domain name (e.g., `example.com` in `https://example.com/`).
-  This field must match exactly between the config key and the URL.
+  This field must match between the config key and the URL. It is
+  possible to specify a `*` as part of the host name to match all subdomains
+  at this level. `https://*.example.com/` for example would match
+  `https://foo.example.com/`, but not `https://foo.bar.example.com/`.
 
 . Port number (e.g., `8080` in `http://example.com:8080/`).
   This field must match exactly between the config key and the URL.
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 6c844d519..052f12021 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1187,6 +1187,18 @@ test_expect_success 'urlmatch favors more specific URLs' '
 		cookieFile = /tmp/user.txt
 	[http "https://averylonguser@example.com/"]
 		cookieFile = /tmp/averylonguser.txt
+	[http "https://preceding.example.com"]
+		cookieFile = /tmp/preceding.txt
+	[http "https://*.example.com"]
+		cookieFile = /tmp/wildcard.txt
+	[http "https://*.example.com/wildcardwithsubdomain"]
+		cookieFile = /tmp/wildcardwithsubdomain.txt
+	[http "https://trailing.example.com"]
+		cookieFile = /tmp/trailing.txt
+	[http "https://user@*.example.com/"]
+		cookieFile = /tmp/wildcardwithuser.txt
+	[http "https://sub.example.com/"]
+		cookieFile = /tmp/sub.txt
 	EOF
 
 	echo http.cookiefile /tmp/root.txt >expect &&
@@ -1207,6 +1219,66 @@ test_expect_success 'urlmatch favors more specific URLs' '
 
 	echo http.cookiefile /tmp/subdirectory.txt >expect &&
 	git config --get-urlmatch HTTP https://averylonguser@example.com/subdirectory >actual &&
+	test_cmp expect actual &&
+
+	echo http.cookiefile /tmp/preceding.txt >expect &&
+	git config --get-urlmatch HTTP https://preceding.example.com >actual &&
+	test_cmp expect actual &&
+
+	echo http.cookiefile /tmp/wildcard.txt >expect &&
+	git config --get-urlmatch HTTP https://wildcard.example.com >actual &&
+	test_cmp expect actual &&
+
+	echo http.cookiefile /tmp/sub.txt >expect &&
+	git config --get-urlmatch HTTP https://sub.example.com/wildcardwithsubdomain >actual &&
+	test_cmp expect actual &&
+
+	echo http.cookiefile /tmp/trailing.txt >expect &&
+	git config --get-urlmatch HTTP https://trailing.example.com >actual &&
+	test_cmp expect actual &&
+
+	echo http.cookiefile /tmp/sub.txt >expect &&
+	git config --get-urlmatch HTTP https://user@sub.example.com >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'urlmatch with wildcard' '
+	cat >.git/config <<-\EOF &&
+	[http]
+		sslVerify
+	[http "https://*.example.com"]
+		sslVerify = false
+		cookieFile = /tmp/cookie.txt
+	EOF
+
+	test_expect_code 1 git config --bool --get-urlmatch doesnt.exist https://good.example.com >actual &&
+	test_must_be_empty actual &&
+
+	echo true >expect &&
+	git config --bool --get-urlmatch http.SSLverify https://example.com >actual &&
+	test_cmp expect actual &&
+
+	echo true >expect &&
+	git config --bool --get-urlmatch http.SSLverify https://good-example.com >actual &&
+	test_cmp expect actual &&
+
+	echo true >expect &&
+	git config --bool --get-urlmatch http.sslverify https://deep.nested.example.com >actual &&
+	test_cmp expect actual &&
+
+	echo false >expect &&
+	git config --bool --get-urlmatch http.sslverify https://good.example.com >actual &&
+	test_cmp expect actual &&
+
+	{
+		echo http.cookiefile /tmp/cookie.txt &&
+		echo http.sslverify false
+	} >expect &&
+	git config --get-urlmatch HTTP https://good.example.com >actual &&
+	test_cmp expect actual &&
+
+	echo http.sslverify >expect &&
+	git config --get-urlmatch HTTP https://more.example.com.au >actual &&
 	test_cmp expect actual
 '
 
diff --git a/urlmatch.c b/urlmatch.c
index f79887825..6c12f1a48 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -63,6 +63,49 @@ static int append_normalized_escapes(struct strbuf *buf,
 	return 1;
 }
 
+static const char *end_of_token(const char *s, int c, size_t n)
+{
+	const char *next = memchr(s, c, n);
+	if (!next)
+		next = s + n;
+	return next;
+}
+
+static int match_host(const struct url_info *url_info,
+		      const struct url_info *pattern_info)
+{
+	const char *url = url_info->url + url_info->host_off;
+	const char *pat = pattern_info->url + pattern_info->host_off;
+	int url_len = url_info->host_len;
+	int pat_len = pattern_info->host_len;
+
+	while (url_len && pat_len) {
+		const char *url_next = end_of_token(url, '.', url_len);
+		const char *pat_next = end_of_token(pat, '.', pat_len);
+
+		if (pat_next == pat + 1 && pat[0] == '*')
+			/* wildcard matches anything */
+			;
+		else if ((pat_next - pat) == (url_next - url) &&
+			 !memcmp(url, pat, url_next - url))
+			/* the components are the same */
+			;
+		else
+			return 0; /* found an unmatch */
+
+		if (url_next < url + url_len)
+			url_next++;
+		url_len -= url_next - url;
+		url = url_next;
+		if (pat_next < pat + pat_len)
+			pat_next++;
+		pat_len -= pat_next - pat;
+		pat = pat_next;
+	}
+
+	return (!url_len && !pat_len);
+}
+
 static char *url_normalize_1(const char *url, struct url_info *out_info, char allow_globs)
 {
 	/*
@@ -467,9 +510,7 @@ static int match_urls(const struct url_info *url,
 	}
 
 	/* check the host */
-	if (url_prefix->host_len != url->host_len ||
-	    strncmp(url->url + url->host_off,
-		    url_prefix->url + url_prefix->host_off, url->host_len))
+	if (!match_host(url, url_prefix))
 		return 0; /* host names do not match */
 
 	/* check the port */
@@ -528,7 +569,7 @@ int urlmatch_config_entry(const char *var, const char *value, void *cb)
 		struct url_info norm_info;
 
 		config_url = xmemdupz(key, dot - key);
-		norm_url = url_normalize(config_url, &norm_info);
+		norm_url = url_normalize_1(config_url, &norm_info, 1);
 		free(config_url);
 		if (!norm_url)
 			return 0;
-- 
2.11.0


Thread overview: 37+ messages
2017-01-23 13:06 [PATCH v1 0/2] urlmatch: allow regexp-based matches Patrick Steinhardt
2017-01-23 13:06 ` [PATCH v1 1/2] mailmap: add Patrick Steinhardt's work address Patrick Steinhardt
2017-01-23 13:06 ` [PATCH v1 2/2] urlmatch: allow regex-based URL matching Patrick Steinhardt
2017-01-23 19:53 ` [PATCH v1 0/2] urlmatch: allow regexp-based matches Junio C Hamano
2017-01-24 11:29   ` Patrick Steinhardt
2017-01-24 17:00 ` [PATCH v2 0/4] " Patrick Steinhardt
2017-01-24 17:00 ` [PATCH v2 1/4] mailmap: add Patrick Steinhardt's work address Patrick Steinhardt
2017-01-24 17:00 ` [PATCH v2 2/4] urlmatch: enable normalization of URLs with globs Patrick Steinhardt
2017-01-24 17:00 ` [PATCH v2 3/4] urlmatch: split host and port fields in `struct url_info` Patrick Steinhardt
2017-01-24 17:00 ` [PATCH v2 4/4] urlmatch: allow globbing for the URL host part Patrick Steinhardt
2017-01-24 17:52   ` Philip Oakley
2017-01-25  9:57     ` Patrick Steinhardt
2017-01-25  9:56 ` [PATCH v3 0/4] urlmatch: allow regexp-based matches Patrick Steinhardt
2017-01-25  9:56 ` [PATCH v3 1/4] mailmap: add Patrick Steinhardt's work address Patrick Steinhardt
2017-01-25  9:56 ` [PATCH v3 2/4] urlmatch: enable normalization of URLs with globs Patrick Steinhardt
2017-01-25  9:56 ` [PATCH v3 3/4] urlmatch: split host and port fields in `struct url_info` Patrick Steinhardt
2017-01-25  9:56 ` [PATCH v3 4/4] urlmatch: allow globbing for the URL host part Patrick Steinhardt
2017-01-26 20:43   ` Junio C Hamano
2017-01-26 20:49     ` Junio C Hamano
2017-01-26 21:12       ` Junio C Hamano
2017-01-27  6:21   ` Patrick Steinhardt
2017-01-27 17:45     ` Junio C Hamano
2017-01-27 10:32 ` [PATCH v4 0/5] urlmatch: allow wildcard-based matches Patrick Steinhardt
2017-01-30 22:00   ` Junio C Hamano
2017-01-30 22:52     ` Junio C Hamano
2017-01-31  8:26       ` Patrick Steinhardt
2017-01-27 10:32 ` [PATCH v4 1/5] mailmap: add Patrick Steinhardt's work address Patrick Steinhardt
2017-01-27 10:32 ` [PATCH v4 2/5] urlmatch: enable normalization of URLs with globs Patrick Steinhardt
2017-01-27 10:32 ` [PATCH v4 3/5] urlmatch: split host and port fields in `struct url_info` Patrick Steinhardt
2017-01-27 10:32 ` [PATCH v4 4/5] urlmatch: include host and port in urlmatch length Patrick Steinhardt
2017-01-27 10:32 ` [PATCH v4 5/5] urlmatch: allow globbing for the URL host part Patrick Steinhardt
2017-01-31  9:01 ` [PATCH v5 0/5] urlmatch: allow wildcard-based matches Patrick Steinhardt
2017-01-31  9:01 ` [PATCH v5 1/5] mailmap: add Patrick Steinhardt's work address Patrick Steinhardt
2017-01-31  9:01 ` [PATCH v5 2/5] urlmatch: enable normalization of URLs with globs Patrick Steinhardt
2017-01-31  9:01 ` [PATCH v5 3/5] urlmatch: split host and port fields in `struct url_info` Patrick Steinhardt
2017-01-31  9:01 ` [PATCH v5 4/5] urlmatch: include host in urlmatch ranking Patrick Steinhardt
2017-01-31  9:01 ` Patrick Steinhardt [this message]
