git@vger.kernel.org mailing list mirror (one of many)
From: Patrick Steinhardt <patrick.steinhardt@elego.de>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <patrick.steinhardt@elego.de>,
	Junio C Hamano <gitster@pobox.com>,
	Patrick Steinhardt <ps@pks.im>,
	Philip Oakley <philipoakley@iee.org>
Subject: [PATCH v3 3/4] urlmatch: split host and port fields in `struct url_info`
Date: Wed, 25 Jan 2017 10:56:47 +0100
Message-ID: <20170125095648.4116-4-patrick.steinhardt@elego.de>
In-Reply-To: <20170125095648.4116-1-patrick.steinhardt@elego.de>
In-Reply-To: <20170123130635.29577-1-patrick.steinhardt@elego.de>

The `url_info` structure describes a normalized URL, with each of the
URL's components represented by its own offset and length fields. Host
and port are the exception: both are covered by the host fields, as
`host_len` includes any ':portnum'. Getting at the host or the port
separately is thus more involved than necessary.

To make the port more readily accessible, split up the host and port
fields. `host_len` no longer includes the port, and a new `port_off`
field holds the offset to the port, if one is present.

These fields are filled in by `url_normalize_1`; `match_urls`, which
reads them, is adjusted to compare host and port separately. This change
makes it easier later on to treat host and port differently when
introducing globs for domains.

Signed-off-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
---
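Not part of the commit: a minimal sketch of how a caller could use the
split fields. `struct hostport`, `copy_port` and `same_host_and_port`
are hypothetical names made up for this illustration; only the field
names and the shape of the comparison are taken from the patch. For the
normalized URL `https://example.com:8080/repo`, the host starts at
offset 8 with length 11 and the port at offset 20 with length 4. With
the old layout, `host_len` would have been 16, covering
`example.com:8080`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the host/port fields of `struct url_info`. */
struct hostport {
	const char *url;	/* normalized URL */
	size_t host_off;	/* offset to start of host name (0 => none) */
	size_t host_len;	/* length of host name, excluding ':portnum' */
	size_t port_off;	/* offset to start of port number (0 => none) */
	size_t port_len;	/* length of port number, excluding ':' */
};

/*
 * Copy the port digits into `buf` as a NUL-terminated string.
 * Returns 1 on success, 0 if the URL has no port, and -1 if
 * `buf` is too small.
 */
int copy_port(const struct hostport *hp, char *buf, size_t buflen)
{
	assert(hp && buf);
	if (!hp->port_off)
		return 0;
	if (hp->port_len >= buflen)
		return -1;
	memcpy(buf, hp->url + hp->port_off, hp->port_len);
	buf[hp->port_len] = '\0';
	return 1;
}

/* Compare host and port separately, in the same way match_urls() does. */
int same_host_and_port(const struct hostport *a, const struct hostport *b)
{
	if (a->host_len != b->host_len ||
	    strncmp(a->url + a->host_off, b->url + b->host_off, a->host_len))
		return 0; /* host names do not match */
	if (a->port_len != b->port_len ||
	    strncmp(a->url + a->port_off, b->url + b->port_off, a->port_len))
		return 0; /* ports do not match */
	return 1;
}
```

With the fields split like this, a later change can swap the host
comparison for a glob match without having to cut the port off first.
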
 urlmatch.c | 16 ++++++++++++----
 urlmatch.h |  9 +++++----
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/urlmatch.c b/urlmatch.c
index d350478c0..e328905eb 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -104,7 +104,7 @@ static char *url_normalize_1(const char *url, struct url_info *out_info, char al
 	struct strbuf norm;
 	size_t spanned;
 	size_t scheme_len, user_off=0, user_len=0, passwd_off=0, passwd_len=0;
-	size_t host_off=0, host_len=0, port_len=0, path_off, path_len, result_len;
+	size_t host_off=0, host_len=0, port_off=0, port_len=0, path_off, path_len, result_len;
 	const char *slash_ptr, *at_ptr, *colon_ptr, *path_start;
 	char *result;
 
@@ -263,6 +263,7 @@ static char *url_normalize_1(const char *url, struct url_info *out_info, char al
 				return NULL;
 			}
 			strbuf_addch(&norm, ':');
+			port_off = norm.len;
 			strbuf_add(&norm, url, slash_ptr - url);
 			port_len = slash_ptr - url;
 		}
@@ -270,7 +271,7 @@ static char *url_normalize_1(const char *url, struct url_info *out_info, char al
 		url = slash_ptr;
 	}
 	if (host_off)
-		host_len = norm.len - host_off;
+		host_len = norm.len - host_off - (port_len ? port_len + 1 : 0);
 
 
 	/*
@@ -378,6 +379,7 @@ static char *url_normalize_1(const char *url, struct url_info *out_info, char al
 		out_info->passwd_len = passwd_len;
 		out_info->host_off = host_off;
 		out_info->host_len = host_len;
+		out_info->port_off = port_off;
 		out_info->port_len = port_len;
 		out_info->path_off = path_off;
 		out_info->path_len = path_len;
@@ -464,11 +466,17 @@ static int match_urls(const struct url_info *url,
 		usermatched = 1;
 	}
 
-	/* check the host and port */
+	/* check the host */
 	if (url_prefix->host_len != url->host_len ||
 	    strncmp(url->url + url->host_off,
 		    url_prefix->url + url_prefix->host_off, url->host_len))
-		return 0; /* host names and/or ports do not match */
+		return 0; /* host names do not match */
+
+	/* check the port */
+	if (url_prefix->port_len != url->port_len ||
+	    strncmp(url->url + url->port_off,
+		    url_prefix->url + url_prefix->port_off, url->port_len))
+		return 0; /* ports do not match */
 
 	/* check the path */
 	pathmatchlen = url_match_prefix(
diff --git a/urlmatch.h b/urlmatch.h
index 528862adc..0ea812b03 100644
--- a/urlmatch.h
+++ b/urlmatch.h
@@ -18,11 +18,12 @@ struct url_info {
 	size_t passwd_len;	/* length of passwd; if passwd_off != 0 but
 				   passwd_len == 0, an empty passwd was given */
 	size_t host_off;	/* offset into url to start of host name (0 => none) */
-	size_t host_len;	/* length of host name; this INCLUDES any ':portnum';
+	size_t host_len;	/* length of host name;
 				 * file urls may have host_len == 0 */
-	size_t port_len;	/* if a portnum is present (port_len != 0), it has
-				 * this length (excluding the leading ':') at the
-				 * end of the host name (always 0 for file urls) */
+	size_t port_off;	/* offset into url to start of port number (0 => none) */
+	size_t port_len;	/* if a portnum is present (port_off != 0), it has
+				 * this length (excluding the leading ':') starting
+				 * from port_off (always 0 for file urls) */
 	size_t path_off;	/* offset into url to the start of the url path;
 				 * this will always point to a '/' character
 				 * after the url has been normalized */
-- 
2.11.0


