git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Philip Oakley" <philipoakley@iee.org>
To: "Patrick Steinhardt" <patrick.steinhardt@elego.de>,
	<git@vger.kernel.org>
Cc: "Patrick Steinhardt" <ps@pks.im>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Patrick Steinhardt" <patrick.steinhardt@elego.de>
Subject: Re: [PATCH v2 4/4] urlmatch: allow globbing for the URL host part
Date: Tue, 24 Jan 2017 17:52:39 -0000	[thread overview]
Message-ID: <5BF60E6DF2C04DF081466BC3BD3F954F@PhilipOakley> (raw)
In-Reply-To: 20170124170031.18069-5-patrick.steinhardt@elego.de

From: "Patrick Steinhardt" <patrick.steinhardt@elego.de>

a quick comment on the documentation part ..

> The URL matching function computes for two URLs whether they match not.
> The match is performed by splitting up the URL into different parts and
> then doing an exact comparison with the to-be-matched URL.
>
> The main user of `urlmatch` is the configuration subsystem. It allows to
> set certain configurations based on the URL which is being connected to
> via keys like `http.<url>.*`. A common use case for this is to set
> proxies for only some remotes which match the given URL. Unfortunately,
> having exact matches for all parts of the URL can become quite tedious
> in some setups. Imagine for example a corporate network where there are
> dozens or even hundreds of subdomains, which would have to be configured
> individually.
>
> This commit introduces the ability to use globbing in the host-part of
> the URLs. A user can simply specify a `*` as part of the host name to
> match all subdomains at this level. For example adding a configuration
> key `http.https://*.example.com.proxy` will match all subdomains of
> `https://example.com`.
>
> Signed-off-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
> ---
> Documentation/config.txt |  5 ++++-
> t/t1300-repo-config.sh   | 36 ++++++++++++++++++++++++++++++++++++
> urlmatch.c               | 38 ++++++++++++++++++++++++++++++++++----
> 3 files changed, 74 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 506431267..a78921c2b 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1914,7 +1914,10 @@ http.<url>.*::
>   must match exactly between the config key and the URL.
>
> . Host/domain name (e.g., `example.com` in `https://example.com/`).
> -  This field must match exactly between the config key and the URL.
> +  This field must match between the config key and the URL. It is
> +  possible to use globs in the config key to match all subdomains, e.g.
> +  `https://*.example.com/` to match all subdomains of `example.com`. Note
> +  that a glob only every matches a single part of the hostname.

[s/every/ever/ ?]

the "match all subdomains" appears to contradict the "a glob only ever 
matches a single part ".

Maybe borrow the example from the 0/4 cover letter
"so for example `https://foo.bar.example.com` would not match in the case of 
`http.https://*.example.com` " (If I understood it correctly.

A simple example often clarifies much better than more words.
--
Philip

>
> . Port number (e.g., `8080` in `http://example.com:8080/`).
>   This field must match exactly between the config key and the URL.
> diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
> index 923bfc5a2..ec545e092 100755
> --- a/t/t1300-repo-config.sh
> +++ b/t/t1300-repo-config.sh
> @@ -1177,6 +1177,42 @@ test_expect_success 'urlmatch' '
>  test_cmp expect actual
> '
>
> +test_expect_success 'glob-based urlmatch' '
> + cat >.git/config <<-\EOF &&
> + [http]
> + sslVerify
> + [http "https://*.example.com"]
> + sslVerify = false
> + cookieFile = /tmp/cookie.txt
> + EOF
> +
> + test_expect_code 1 git config --bool --get-urlmatch doesnt.exist 
> https://good.example.com >actual &&
> + test_must_be_empty actual &&
> +
> + echo true >expect &&
> + git config --bool --get-urlmatch http.SSLverify https://example.com 
>  >actual &&
> + test_cmp expect actual &&
> +
> + echo true >expect &&
> + git config --bool --get-urlmatch http.SSLverify https://good-example.com 
>  >actual &&
> + test_cmp expect actual &&
> +
> + echo true >expect &&
> + git config --bool --get-urlmatch http.sslverify 
> https://deep.nested.example.com >actual &&
> + test_cmp expect actual &&
> +
> + echo false >expect &&
> + git config --bool --get-urlmatch http.sslverify https://good.example.com 
>  >actual &&
> + test_cmp expect actual &&
> +
> + {
> + echo http.cookiefile /tmp/cookie.txt &&
> + echo http.sslverify false
> + } >expect &&
> + git config --get-urlmatch HTTP https://good.example.com >actual &&
> + test_cmp expect actual
> +'
> +
> # good section hygiene
> test_expect_failure 'unsetting the last key in a section removes header' '
>  cat >.git/config <<-\EOF &&
> diff --git a/urlmatch.c b/urlmatch.c
> index e328905eb..53ff972a6 100644
> --- a/urlmatch.c
> +++ b/urlmatch.c
> @@ -63,6 +63,38 @@ static int append_normalized_escapes(struct strbuf 
> *buf,
>  return 1;
> }
>
> +static int match_host(const struct url_info *url_info,
> +       const struct url_info *pattern_info)
> +{
> + char *url = xmemdupz(url_info->url + url_info->host_off, 
> url_info->host_len);
> + char *pat = xmemdupz(pattern_info->url + pattern_info->host_off, 
> pattern_info->host_len);
> + char *url_tok, *pat_tok, *url_save, *pat_save;
> + int matching;
> +
> + url_tok = strtok_r(url, ".", &url_save);
> + pat_tok = strtok_r(pat, ".", &pat_save);
> +
> + for (; url_tok && pat_tok; url_tok = strtok_r(NULL, ".", &url_save),
> +    pat_tok = strtok_r(NULL, ".", &pat_save)) {
> + if (!strcmp(pat_tok, "*"))
> + continue; /* a simple glob matches everything */
> +
> + if (strcmp(url_tok, pat_tok)) {
> + /* subdomains do not match */
> + matching = 0;
> + break;
> + }
> + }
> +
> + /* matching if both URL and pattern are at their ends */
> + matching = (url_tok == NULL && pat_tok == NULL);
> +
> + free(url);
> + free(pat);
> +
> + return matching;
> +}
> +
> static char *url_normalize_1(const char *url, struct url_info *out_info, 
> char allow_globs)
> {
>  /*
> @@ -467,9 +499,7 @@ static int match_urls(const struct url_info *url,
>  }
>
>  /* check the host */
> - if (url_prefix->host_len != url->host_len ||
> -     strncmp(url->url + url->host_off,
> -     url_prefix->url + url_prefix->host_off, url->host_len))
> + if (!match_host(url, url_prefix))
>  return 0; /* host names do not match */
>
>  /* check the port */
> @@ -512,7 +542,7 @@ int urlmatch_config_entry(const char *var, const char 
> *value, void *cb)
>  struct url_info norm_info;
>
>  config_url = xmemdupz(key, dot - key);
> - norm_url = url_normalize(config_url, &norm_info);
> + norm_url = url_normalize_1(config_url, &norm_info, 1);
>  free(config_url);
>  if (!norm_url)
>  return 0;
> -- 
> 2.11.0
>
> 


  reply	other threads:[~2017-01-24 17:52 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-23 13:06 [PATCH v1 0/2] urlmatch: allow regexp-based matches Patrick Steinhardt
2017-01-23 13:06 ` [PATCH v1 1/2] mailmap: add Patrick Steinhardt's work address Patrick Steinhardt
2017-01-23 13:06 ` [PATCH v1 2/2] urlmatch: allow regex-based URL matching Patrick Steinhardt
2017-01-23 19:53 ` [PATCH v1 0/2] urlmatch: allow regexp-based matches Junio C Hamano
2017-01-24 11:29   ` Patrick Steinhardt
2017-01-24 17:00 ` [PATCH v2 0/4] " Patrick Steinhardt
2017-01-24 17:00 ` [PATCH v2 1/4] mailmap: add Patrick Steinhardt's work address Patrick Steinhardt
2017-01-24 17:00 ` [PATCH v2 2/4] urlmatch: enable normalization of URLs with globs Patrick Steinhardt
2017-01-24 17:00 ` [PATCH v2 3/4] urlmatch: split host and port fields in `struct url_info` Patrick Steinhardt
2017-01-24 17:00 ` [PATCH v2 4/4] urlmatch: allow globbing for the URL host part Patrick Steinhardt
2017-01-24 17:52   ` Philip Oakley [this message]
2017-01-25  9:57     ` Patrick Steinhardt
2017-01-25  9:56 ` [PATCH v3 0/4] urlmatch: allow regexp-based matches Patrick Steinhardt
2017-01-25  9:56 ` [PATCH v3 1/4] mailmap: add Patrick Steinhardt's work address Patrick Steinhardt
2017-01-25  9:56 ` [PATCH v3 2/4] urlmatch: enable normalization of URLs with globs Patrick Steinhardt
2017-01-25  9:56 ` [PATCH v3 3/4] urlmatch: split host and port fields in `struct url_info` Patrick Steinhardt
2017-01-25  9:56 ` [PATCH v3 4/4] urlmatch: allow globbing for the URL host part Patrick Steinhardt
2017-01-26 20:43   ` Junio C Hamano
2017-01-26 20:49     ` Junio C Hamano
2017-01-26 21:12       ` Junio C Hamano
2017-01-27  6:21   ` Patrick Steinhardt
2017-01-27 17:45     ` Junio C Hamano
2017-01-27 10:32 ` [PATCH v4 0/5] urlmatch: allow wildcard-based matches Patrick Steinhardt
2017-01-30 22:00   ` Junio C Hamano
2017-01-30 22:52     ` Junio C Hamano
2017-01-31  8:26       ` Patrick Steinhardt
2017-01-27 10:32 ` [PATCH v4 1/5] mailmap: add Patrick Steinhardt's work address Patrick Steinhardt
2017-01-27 10:32 ` [PATCH v4 2/5] urlmatch: enable normalization of URLs with globs Patrick Steinhardt
2017-01-27 10:32 ` [PATCH v4 3/5] urlmatch: split host and port fields in `struct url_info` Patrick Steinhardt
2017-01-27 10:32 ` [PATCH v4 4/5] urlmatch: include host and port in urlmatch length Patrick Steinhardt
2017-01-27 10:32 ` [PATCH v4 5/5] urlmatch: allow globbing for the URL host part Patrick Steinhardt
2017-01-31  9:01 ` [PATCH v5 0/5] urlmatch: allow wildcard-based matches Patrick Steinhardt
2017-01-31  9:01 ` [PATCH v5 1/5] mailmap: add Patrick Steinhardt's work address Patrick Steinhardt
2017-01-31  9:01 ` [PATCH v5 2/5] urlmatch: enable normalization of URLs with globs Patrick Steinhardt
2017-01-31  9:01 ` [PATCH v5 3/5] urlmatch: split host and port fields in `struct url_info` Patrick Steinhardt
2017-01-31  9:01 ` [PATCH v5 4/5] urlmatch: include host in urlmatch ranking Patrick Steinhardt
2017-01-31  9:01 ` [PATCH v5 5/5] urlmatch: allow globbing for the URL host part Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5BF60E6DF2C04DF081466BC3BD3F954F@PhilipOakley \
    --to=philipoakley@iee.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=patrick.steinhardt@elego.de \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).