git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v3 0/6] connect: various cleanups
@ 2016-05-01  6:02 Mike Hommey
  2016-05-01  6:02 ` [PATCH 1/6] connect: remove get_port() Mike Hommey
                   ` (6 more replies)
  0 siblings, 7 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-01  6:02 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi


On Fri, Apr 29, 2016 at 09:58:37AM -0700, Junio C Hamano wrote:
> So I would suggest restructuring this series to do
> 
>  * 2/3 (DIAG consolidation)
>  * refactoring in 3/3 but not s/static/extern/
> 
> and in optional follow(s)-up, do
> 
>  * s/static/extern/ and update to *.h in 3/3
> 
>  * 1/3, but I do not think it is necessary for users of
>    prepare_ssh_command()

I went a little further on the restructuring and did some more cleanup.
This could probably go even further, but that'd be well out of the scope
I started those patches for :)

The order in which the changes have been done in this new series is
debatable, but I thought the split in smaller parts made things easier
to review, even if they could have been done in different orders.

I'm leaving the optional follows-up for now. I'll work around the functions
being static somehow on my end.


Mike Hommey (6):
  connect: remove get_port()
  connect: uniformize and group CONNECT_DIAG_URL handling code
  connect: only match the host with core.gitProxy
  connect: pass separate host and port to git_tcp_connect and
    git_proxy_connect
  connect: don't xstrdup target_host
  connect: move ssh command line preparation to a separate function

 connect.c             | 184 ++++++++++++++++++++++----------------------------
 t/t5500-fetch-pack.sh |  25 ++++---
 2 files changed, 95 insertions(+), 114 deletions(-)

-- 
2.8.1.16.g58dac65.dirty

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 1/6] connect: remove get_port()
  2016-05-01  6:02 [PATCH v3 0/6] connect: various cleanups Mike Hommey
@ 2016-05-01  6:02 ` Mike Hommey
  2016-05-01 10:10   ` Torsten Bögershausen
  2016-05-01  6:02 ` [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code Mike Hommey
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 46+ messages in thread
From: Mike Hommey @ 2016-05-01  6:02 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

get_port() is only used as a fallback when get_host_and_port() does not
return a port. But get_port() does the same search as
get_host_and_port(), except get_host_and_port() starts from the end of
the host, respecting square brackets for ipv6 addresses, and get_port(),
operating after get_host_and_port(), works on a modified host string
that has square brackes removed if there were any.

I cannot think of any legal host:port string that would not have a port
returned by get_host_and_port() *and* have one returned by get_port().
So just remove get_port().

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/connect.c b/connect.c
index c53f3f1..45308ef 100644
--- a/connect.c
+++ b/connect.c
@@ -571,22 +571,6 @@ static struct child_process *git_proxy_connect(int fd[2], char *host)
 	return proxy;
 }
 
-static char *get_port(char *host)
-{
-	char *end;
-	char *p = strchr(host, ':');
-
-	if (p) {
-		long port = strtol(p + 1, &end, 10);
-		if (end != p + 1 && *end == '\0' && 0 <= port && port < 65536) {
-			*p = '\0';
-			return p+1;
-		}
-	}
-
-	return NULL;
-}
-
 /*
  * Extract protocol and relevant parts from the specified connection URL.
  * The caller must free() the returned strings.
@@ -742,9 +726,6 @@ struct child_process *git_connect(int fd[2], const char *url,
 			transport_check_allowed("ssh");
 			get_host_and_port(&ssh_host, &port);
 
-			if (!port)
-				port = get_port(ssh_host);
-
 			if (flags & CONNECT_DIAG_URL) {
 				printf("Diag: url=%s\n", url ? url : "NULL");
 				printf("Diag: protocol=%s\n", prot_name(protocol));
-- 
2.8.1.16.g58dac65.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code
  2016-05-01  6:02 [PATCH v3 0/6] connect: various cleanups Mike Hommey
  2016-05-01  6:02 ` [PATCH 1/6] connect: remove get_port() Mike Hommey
@ 2016-05-01  6:02 ` Mike Hommey
  2016-05-01 13:37   ` Torsten Bögershausen
  2016-05-02  4:56   ` Torsten Bögershausen
  2016-05-01  6:02 ` [PATCH 3/6] connect: only match the host with core.gitProxy Mike Hommey
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-01  6:02 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

The CONNECT_DIAG_URL code for PROTO_GIT and PROTO_SSH were different in
subtle ways. Those differences are not significant enough to justify the
duplication of code, while this change also avoids starting to
initialize a connection at all when handling CONNECT_DIAG_URL.

This also moves the get_host_and_port() call earlier than it was
happening, while preserving hostandport intact (get_host_and_port()
possibly altering the string it's passed), because all or hostandport,
host and port may be used.

Further changes are going to remove several uses of hostandport where
get_host_and_port() is used further down the line.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c             | 28 +++++++++-------------------
 t/t5500-fetch-pack.sh | 25 +++++++++++++++----------
 2 files changed, 24 insertions(+), 29 deletions(-)

diff --git a/connect.c b/connect.c
index 45308ef..98449bc 100644
--- a/connect.c
+++ b/connect.c
@@ -656,7 +656,8 @@ static struct child_process no_fork = CHILD_PROCESS_INIT;
 struct child_process *git_connect(int fd[2], const char *url,
 				  const char *prog, int flags)
 {
-	char *hostandport, *path;
+	char *hostandport, *path, *host;
+	const char *port = NULL;
 	struct child_process *conn = &no_fork;
 	enum protocol protocol;
 	struct strbuf cmd = STRBUF_INIT;
@@ -667,10 +668,14 @@ struct child_process *git_connect(int fd[2], const char *url,
 	signal(SIGCHLD, SIG_DFL);
 
 	protocol = parse_connect_url(url, &hostandport, &path);
-	if ((flags & CONNECT_DIAG_URL) && (protocol != PROTO_SSH)) {
+	host = xstrdup(hostandport);
+	get_host_and_port(&host, &port);
+	if (flags & CONNECT_DIAG_URL) {
 		printf("Diag: url=%s\n", url ? url : "NULL");
 		printf("Diag: protocol=%s\n", prot_name(protocol));
 		printf("Diag: hostandport=%s\n", hostandport ? hostandport : "NULL");
+		printf("Diag: userandhost=%s\n", host ? host : "NULL");
+		printf("Diag: port=%s\n", port ? port : "NONE");
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
 	} else if (protocol == PROTO_GIT) {
@@ -721,23 +726,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		if (protocol == PROTO_SSH) {
 			const char *ssh;
 			int putty = 0, tortoiseplink = 0;
-			char *ssh_host = hostandport;
-			const char *port = NULL;
 			transport_check_allowed("ssh");
-			get_host_and_port(&ssh_host, &port);
-
-			if (flags & CONNECT_DIAG_URL) {
-				printf("Diag: url=%s\n", url ? url : "NULL");
-				printf("Diag: protocol=%s\n", prot_name(protocol));
-				printf("Diag: userandhost=%s\n", ssh_host ? ssh_host : "NULL");
-				printf("Diag: port=%s\n", port ? port : "NONE");
-				printf("Diag: path=%s\n", path ? path : "NULL");
-
-				free(hostandport);
-				free(path);
-				free(conn);
-				return NULL;
-			}
 
 			ssh = getenv("GIT_SSH_COMMAND");
 			if (!ssh) {
@@ -779,7 +768,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 				argv_array_push(&conn->args, putty ? "-P" : "-p");
 				argv_array_push(&conn->args, port);
 			}
-			argv_array_push(&conn->args, ssh_host);
+			argv_array_push(&conn->args, host);
 		} else {
 			transport_check_allowed("file");
 		}
@@ -793,6 +782,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		strbuf_release(&cmd);
 	}
 	free(hostandport);
+	free(host);
 	free(path);
 	return conn;
 }
diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index e5f83bf..360c59d 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -537,7 +537,7 @@ check_prot_path () {
 	Diag: protocol=$2
 	Diag: path=$3
 	EOF
-	git fetch-pack --diag-url "$1" | grep -v hostandport= >actual &&
+	git fetch-pack --diag-url "$1" | grep -v host= | grep -v port= >actual &&
 	test_cmp expected actual
 }
 
@@ -546,22 +546,27 @@ check_prot_host_port_path () {
 	case "$2" in
 		*ssh*)
 		pp=ssh
-		uah=userandhost
-		ehost=$(echo $3 | tr -d "[]")
-		diagport="Diag: port=$4"
 		;;
 		*)
-		pp=$p
-		uah=hostandport
-		ehost=$(echo $3$4 | sed -e "s/22$/:22/" -e "s/NONE//")
-		diagport=""
+		pp=$2
 		;;
 	esac
+	case "$4" in
+		NONE)
+		hostandport=$3
+		;;
+		*)
+		hostandport=$3:$4
+		;;
+	esac
+	ehost=$(echo $3 | tr -d "[]")
+	diagport="Diag: port=$4"
 	cat >exp <<-EOF &&
 	Diag: url=$1
 	Diag: protocol=$pp
-	Diag: $uah=$ehost
-	$diagport
+	Diag: hostandport=$hostandport
+	Diag: userandhost=$ehost
+	Diag: port=$4
 	Diag: path=$5
 	EOF
 	grep -v "^$" exp >expected
-- 
2.8.1.16.g58dac65.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 3/6] connect: only match the host with core.gitProxy
  2016-05-01  6:02 [PATCH v3 0/6] connect: various cleanups Mike Hommey
  2016-05-01  6:02 ` [PATCH 1/6] connect: remove get_port() Mike Hommey
  2016-05-01  6:02 ` [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code Mike Hommey
@ 2016-05-01  6:02 ` Mike Hommey
  2016-05-01  6:02 ` [PATCH 4/6] connect: pass separate host and port to git_tcp_connect and git_proxy_connect Mike Hommey
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-01  6:02 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

Currently, core.gitProxy doesn't actually match purely on domain names
as documented: it also matches ports.

So a core.gitProxy value like "script for kernel.org" doesn't make the
script called for an url like git://kernel.org:port/path, while it is
called for git://kernel.org/path.

This per-port behavior seems like an oversight rather than a deliberate
choice, so, make git://kernel.org:port/path call the gitProxy script in
the case described above.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/connect.c b/connect.c
index 98449bc..1c8ed6b 100644
--- a/connect.c
+++ b/connect.c
@@ -695,7 +695,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		/* These underlying connection commands die() if they
 		 * cannot connect.
 		 */
-		if (git_use_proxy(hostandport))
+		if (git_use_proxy(host))
 			conn = git_proxy_connect(fd, hostandport);
 		else
 			git_tcp_connect(fd, hostandport, flags);
-- 
2.8.1.16.g58dac65.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 4/6] connect: pass separate host and port to git_tcp_connect and git_proxy_connect
  2016-05-01  6:02 [PATCH v3 0/6] connect: various cleanups Mike Hommey
                   ` (2 preceding siblings ...)
  2016-05-01  6:02 ` [PATCH 3/6] connect: only match the host with core.gitProxy Mike Hommey
@ 2016-05-01  6:02 ` Mike Hommey
  2016-05-01  6:02 ` [PATCH 5/6] connect: don't xstrdup target_host Mike Hommey
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-01  6:02 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

Instead of having each of them (and their callees) take a host:port
string and call get_host_and_port on it on their own, rely on the
get_host_and_port that already happens in git_connect and pass down
the separate host and port strings we got out of it.

Note that in git_tcp_connect_sock, the port was set to "<none>" in a
case that never can happen, so that code path was removed.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c | 31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)

diff --git a/connect.c b/connect.c
index 1c8ed6b..b3fce84 100644
--- a/connect.c
+++ b/connect.c
@@ -343,18 +343,16 @@ static const char *ai_name(const struct addrinfo *ai)
 /*
  * Returns a connected socket() fd, or else die()s.
  */
-static int git_tcp_connect_sock(char *host, int flags)
+static int git_tcp_connect_sock(const char *host, const char *port, int flags)
 {
 	struct strbuf error_message = STRBUF_INIT;
 	int sockfd = -1;
-	const char *port = STR(DEFAULT_GIT_PORT);
 	struct addrinfo hints, *ai0, *ai;
 	int gai;
 	int cnt = 0;
 
-	get_host_and_port(&host, &port);
-	if (!*port)
-		port = "<none>";
+	if (!port)
+		port = STR(DEFAULT_GIT_PORT);
 
 	memset(&hints, 0, sizeof(hints));
 	if (flags & CONNECT_IPV4)
@@ -411,11 +409,10 @@ static int git_tcp_connect_sock(char *host, int flags)
 /*
  * Returns a connected socket() fd, or else die()s.
  */
-static int git_tcp_connect_sock(char *host, int flags)
+static int git_tcp_connect_sock(const char *host, const char *port, int flags)
 {
 	struct strbuf error_message = STRBUF_INIT;
 	int sockfd = -1;
-	const char *port = STR(DEFAULT_GIT_PORT);
 	char *ep;
 	struct hostent *he;
 	struct sockaddr_in sa;
@@ -423,7 +420,8 @@ static int git_tcp_connect_sock(char *host, int flags)
 	unsigned int nport;
 	int cnt;
 
-	get_host_and_port(&host, &port);
+	if (!port)
+		port = STR(DEFAULT_GIT_PORT);
 
 	if (flags & CONNECT_VERBOSE)
 		fprintf(stderr, "Looking up %s ... ", host);
@@ -482,9 +480,10 @@ static int git_tcp_connect_sock(char *host, int flags)
 #endif /* NO_IPV6 */
 
 
-static void git_tcp_connect(int fd[2], char *host, int flags)
+static void git_tcp_connect(int fd[2], const char *host, const char *port,
+			    int flags)
 {
-	int sockfd = git_tcp_connect_sock(host, flags);
+	int sockfd = git_tcp_connect_sock(host, port, flags);
 
 	fd[0] = sockfd;
 	fd[1] = dup(sockfd);
@@ -550,18 +549,16 @@ static int git_use_proxy(const char *host)
 	return (git_proxy_command && *git_proxy_command);
 }
 
-static struct child_process *git_proxy_connect(int fd[2], char *host)
+static struct child_process *git_proxy_connect(int fd[2], const char *host,
+					       const char *port)
 {
-	const char *port = STR(DEFAULT_GIT_PORT);
 	struct child_process *proxy;
 
-	get_host_and_port(&host, &port);
-
 	proxy = xmalloc(sizeof(*proxy));
 	child_process_init(proxy);
 	argv_array_push(&proxy->args, git_proxy_command);
 	argv_array_push(&proxy->args, host);
-	argv_array_push(&proxy->args, port);
+	argv_array_push(&proxy->args, port ? port : STR(DEFAULT_GIT_PORT));
 	proxy->in = -1;
 	proxy->out = -1;
 	if (start_command(proxy))
@@ -696,9 +693,9 @@ struct child_process *git_connect(int fd[2], const char *url,
 		 * cannot connect.
 		 */
 		if (git_use_proxy(host))
-			conn = git_proxy_connect(fd, hostandport);
+			conn = git_proxy_connect(fd, host, port);
 		else
-			git_tcp_connect(fd, hostandport, flags);
+			git_tcp_connect(fd, host, port, flags);
 		/*
 		 * Separate original protocol components prog and path
 		 * from extended host header with a NUL byte.
-- 
2.8.1.16.g58dac65.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 5/6] connect: don't xstrdup target_host
  2016-05-01  6:02 [PATCH v3 0/6] connect: various cleanups Mike Hommey
                   ` (3 preceding siblings ...)
  2016-05-01  6:02 ` [PATCH 4/6] connect: pass separate host and port to git_tcp_connect and git_proxy_connect Mike Hommey
@ 2016-05-01  6:02 ` Mike Hommey
  2016-05-01  6:02 ` [PATCH 6/6] connect: move ssh command line preparation to a separate function Mike Hommey
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
  6 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-01  6:02 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

Now that hostandport is left unmodified in git_connect (we don't pass
it to get_host_and_port() anymore), we can avoid xstrdup'ing it.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/connect.c b/connect.c
index b3fce84..e2b976e 100644
--- a/connect.c
+++ b/connect.c
@@ -683,9 +683,9 @@ struct child_process *git_connect(int fd[2], const char *url,
 		 */
 		char *target_host = getenv("GIT_OVERRIDE_VIRTUAL_HOST");
 		if (target_host)
-			target_host = xstrdup(target_host);
+			target_host = target_host;
 		else
-			target_host = xstrdup(hostandport);
+			target_host = hostandport;
 
 		transport_check_allowed("git");
 
@@ -707,7 +707,6 @@ struct child_process *git_connect(int fd[2], const char *url,
 			     "%s %s%chost=%s%c",
 			     prog, path, 0,
 			     target_host, 0);
-		free(target_host);
 	} else {
 		conn = xmalloc(sizeof(*conn));
 		child_process_init(conn);
-- 
2.8.1.16.g58dac65.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 6/6] connect: move ssh command line preparation to a separate function
  2016-05-01  6:02 [PATCH v3 0/6] connect: various cleanups Mike Hommey
                   ` (4 preceding siblings ...)
  2016-05-01  6:02 ` [PATCH 5/6] connect: don't xstrdup target_host Mike Hommey
@ 2016-05-01  6:02 ` Mike Hommey
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
  6 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-01  6:02 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c | 101 ++++++++++++++++++++++++++++++++++----------------------------
 1 file changed, 55 insertions(+), 46 deletions(-)

diff --git a/connect.c b/connect.c
index e2b976e..db78eb2 100644
--- a/connect.c
+++ b/connect.c
@@ -639,6 +639,58 @@ static enum protocol parse_connect_url(const char *url_orig, char **ret_host,
 
 static struct child_process no_fork = CHILD_PROCESS_INIT;
 
+static int prepare_ssh_command(struct argv_array *cmd, const char *host,
+			       const char *port, int flags)
+{
+	const char *ssh;
+	int putty = 0, tortoiseplink = 0, use_shell = 1;
+	transport_check_allowed("ssh");
+
+	ssh = getenv("GIT_SSH_COMMAND");
+	if (!ssh) {
+		const char *base;
+		char *ssh_dup;
+
+		/*
+		 * GIT_SSH is the no-shell version of
+		 * GIT_SSH_COMMAND (and must remain so for
+		 * historical compatibility).
+		 */
+		use_shell = 0;
+
+		ssh = getenv("GIT_SSH");
+		if (!ssh)
+			ssh = "ssh";
+
+		ssh_dup = xstrdup(ssh);
+		base = basename(ssh_dup);
+
+		tortoiseplink = !strcasecmp(base, "tortoiseplink") ||
+			!strcasecmp(base, "tortoiseplink.exe");
+		putty = tortoiseplink ||
+			!strcasecmp(base, "plink") ||
+			!strcasecmp(base, "plink.exe");
+
+		free(ssh_dup);
+	}
+
+	argv_array_push(cmd, ssh);
+	if (flags & CONNECT_IPV4)
+		argv_array_push(cmd, "-4");
+	else if (flags & CONNECT_IPV6)
+		argv_array_push(cmd, "-6");
+	if (tortoiseplink)
+		argv_array_push(cmd, "-batch");
+	if (port) {
+		/* P is for PuTTY, p is for OpenSSH */
+		argv_array_push(cmd, putty ? "-P" : "-p");
+		argv_array_push(cmd, port);
+	}
+	argv_array_push(cmd, host);
+
+	return use_shell;
+}
+
 /*
  * This returns a dummy child_process if the transport protocol does not
  * need fork(2), or a struct child_process object if it does.  Once done,
@@ -717,55 +769,12 @@ struct child_process *git_connect(int fd[2], const char *url,
 
 		/* remove repo-local variables from the environment */
 		conn->env = local_repo_env;
-		conn->use_shell = 1;
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
-			const char *ssh;
-			int putty = 0, tortoiseplink = 0;
-			transport_check_allowed("ssh");
-
-			ssh = getenv("GIT_SSH_COMMAND");
-			if (!ssh) {
-				const char *base;
-				char *ssh_dup;
-
-				/*
-				 * GIT_SSH is the no-shell version of
-				 * GIT_SSH_COMMAND (and must remain so for
-				 * historical compatibility).
-				 */
-				conn->use_shell = 0;
-
-				ssh = getenv("GIT_SSH");
-				if (!ssh)
-					ssh = "ssh";
-
-				ssh_dup = xstrdup(ssh);
-				base = basename(ssh_dup);
-
-				tortoiseplink = !strcasecmp(base, "tortoiseplink") ||
-					!strcasecmp(base, "tortoiseplink.exe");
-				putty = tortoiseplink ||
-					!strcasecmp(base, "plink") ||
-					!strcasecmp(base, "plink.exe");
-
-				free(ssh_dup);
-			}
-
-			argv_array_push(&conn->args, ssh);
-			if (flags & CONNECT_IPV4)
-				argv_array_push(&conn->args, "-4");
-			else if (flags & CONNECT_IPV6)
-				argv_array_push(&conn->args, "-6");
-			if (tortoiseplink)
-				argv_array_push(&conn->args, "-batch");
-			if (port) {
-				/* P is for PuTTY, p is for OpenSSH */
-				argv_array_push(&conn->args, putty ? "-P" : "-p");
-				argv_array_push(&conn->args, port);
-			}
-			argv_array_push(&conn->args, host);
+			conn->use_shell = prepare_ssh_command(
+				&conn->args, host, port, flags);
 		} else {
+			conn->use_shell = 1;
 			transport_check_allowed("file");
 		}
 		argv_array_push(&conn->args, cmd.buf);
-- 
2.8.1.16.g58dac65.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/6] connect: remove get_port()
  2016-05-01  6:02 ` [PATCH 1/6] connect: remove get_port() Mike Hommey
@ 2016-05-01 10:10   ` Torsten Bögershausen
  2016-05-01 21:43     ` Mike Hommey
  2016-05-03  5:03     ` Jeff King
  0 siblings, 2 replies; 46+ messages in thread
From: Torsten Bögershausen @ 2016-05-01 10:10 UTC (permalink / raw)
  To: Mike Hommey, git; +Cc: gitster, tboegi

On 2016-05-01 08.02, Mike Hommey wrote:
> get_port() is only used as a fallback when get_host_and_port() does not
> return a port. But get_port() does the same search as
> get_host_and_port(), except get_host_and_port() starts from the end of
> the host, respecting square brackets for ipv6 addresses, and get_port(),
> operating after get_host_and_port(), works on a modified host string
> that has square brackes removed if there were any.
typo: brackets.
> 
> I cannot think of any legal host:port string that would not have a port
> returned by get_host_and_port() *and* have one returned by get_port().
> So just remove get_port().
> 
> Signed-off-by: Mike Hommey <mh@glandium.org>
Does this pass the test-suite ?
It doesn't pass here, t5601:

not ok 39 - bracketed hostnames are still ssh
#
#               git clone "[myhost:123]:src" ssh-bracket-clone &&
#               expect_ssh "-p 123" myhost src
#
not ok 40 - uplink is not treated as putty
#
#               copy_ssh_wrapper_as "$TRASH_DIRECTORY/uplink" &&
#               git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
#               expect_ssh "-p 123" myhost src
#
not ok 41 - plink is treated specially (as putty)
#
#               copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
#               git clone "[myhost:123]:src" ssh-bracket-clone-plink-0 &&
#               expect_ssh "-P 123" myhost src
#
not ok 42 - plink.exe is treated specially (as putty)
#
#               copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink.exe" &&
#               git clone "[myhost:123]:src" ssh-bracket-clone-plink-1 &&
#               expect_ssh "-P 123" myhost src
#
not ok 43 - tortoiseplink is like putty, with extra arguments

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code
  2016-05-01  6:02 ` [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code Mike Hommey
@ 2016-05-01 13:37   ` Torsten Bögershausen
  2016-05-01 23:20     ` Mike Hommey
  2016-05-02  4:56   ` Torsten Bögershausen
  1 sibling, 1 reply; 46+ messages in thread
From: Torsten Bögershausen @ 2016-05-01 13:37 UTC (permalink / raw)
  To: Mike Hommey, git; +Cc: gitster, tboegi



On 01.05.16 08:02, Mike Hommey wrote:
> The CONNECT_DIAG_URL code for PROTO_GIT and PROTO_SSH were different in
> subtle ways.
Yes, and there (historical) reasons for that.
The first implementation did support IPV6 with SSH:


commit 5ba884483fe1a5f9ce1ce5e3c5e1c37c0fd296c4
    [PATCH] GIT: Try all addresses for given remote name
   
    Try all addresses for given remote name until it succeeds.  Also
    supports IPv6.

This lead to a regression, which was fixed here:
commit ce6f8e7ec2bbebe2472e23b684cae0a4adf325ad
    Fix git protocol connection 'port' override
   
    It was broken by the IPv6 patches - we need to remove the ":" part from
    the hostname for a successful name lookup.


Later the ssh:// syntax was added:

commit c05186cc38ca4605bff1f275619d7d0faeaf2fa5
    Support git+ssh:// and ssh+git:// URL
   
Later support for [] was added:
commit 356bece0a2725818191b12f6e991490d52baa1d1
    GIT: Support [address] in URLs
    Allow IPv6address/IPvFuture enclosed by [] in URLs, like:
       git push '[3ffe:ffff:...:1]:GIT/git'
    or
       git push 'ssh://[3ffe:ffff:...:1]/GIT/git'

Later, support for a trailing ':' was added:
commit 608d48b2207a6152839a9762c7a66f217bceb440
    Fix "getaddrinfo()" buglet
    So when somebody passes me a "please pull" request pointing to something
    like the following
    	git://git.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb.git
    (note the extraneous colon at the end of the host name), git would happily
    try to connect to port 0, which would generally just cause the remote to
    not even answer, and the "connect()" will take a long time to time out.

Then it was possible to use ssh:// with a port number:
commit 2e7766655abd0312a6bf4db6a6ff141e7e4ac8b6
    URL: allow port specification in ssh:// URLs


Parsing of the port number was improved here:
commit 8f1482536ad680fcd738158e76e254a534f2e690
    connect.c: stricter port validation, silence compiler warning


Then, it was possible to pass the port number to putty or plink:
commit 36ad53ffee6ed5b7c277cde660f526fd8ce3d68f
    connect.c: Support PuTTY plink and TortoisePlink as SSH on Windows

This caused a regression, see below.

Later, there was a fix for [] together with a port number:
commit 7acf438215d1b0e8e47a707f3585de8486a2e5fe
    git: Wrong parsing of ssh urls with IPv6 literals ignores port


A better way to distinguish "local file system" URLS from scp like URLs,
(None of them has a scheme, but they are nice to use)
commit 60003340cda05f5ecd79ee8522b21eda038b994b
    clone: allow cloning local paths with colons in them

This fix lead to another regression, which was partly resolved here
(And that's where test cases came in)
commit 8d3d28f5dba94a15a79975e4adc909c295c80d80
    clone: tighten "local paths with colons" check a bit
   

 In order to allow diagnostics in the parser, the diag-url feature was added:
commit 5610b7c0c6957cf0b236b6fac087c1f4dc209376
    git fetch-pack: add --diag-url
 
The the parser was improved:
commit 6a59974869c0a6e9399101bc02223b4c00a8aff2
    git fetch: support host:/~repo

and refactored:
commit 83b058752707a6ba4af51ebc98c47913bc7d2d25
    git_connect(): refactor the port handling for ssh

And more refactored:
commit c59ab2e52a64abd7fded97e0983a9b7f3d0508a0
    connect.c: refactor url parsing

And more refactored:
commit a2036d7e00ad8aa16ba010a80078e10f0e4568a3
    git_connect(): use common return point

Later the parser was improved:
commit 86ceb337ec340c7db9b060b90bfab05a08b8251b
    connect.c: allow ssh://user@[2001:db8::1]/repo.git
commit 3f55ccab8e0fec73c8e38b909e9bb4963bfb8f6a
    t5500: show user name and host in diag-url

However, one of the refactoring broke a use case,
which was valid before, leading to regressions around the world:

commit 6b6c5f7a2f66751a93afce54277a1f30ab0dc521
    connect.c: ignore extra colon after hostname

Now back to the regression introduced by supporting plink/putty with
ports, it was improved by a preparing commit, followed by the fix:

commit baaf233755f71c057d28b9e8692e24d4fca7d22f
    connect: improve check for plink to reduce false positives

commit 37ee646e72d7f39d61a538e21a4c2721e32cb444
    connect: simplify SSH connection code path

Later IPV6/IPV4 only connections had been supported:
commit c915f11eb4922e154e29cf62d3b549d65c06a170
    connect & http: support -4 and -6 switches for remote operations

I skipped the dates and names (I was responsible for one regression)
I hope this gives a half-correct overview,
why I am reluctant to change any code in connect.c
unless there is a fix for a real world problem.

And even here, the test cases should be changed first (and reviewed) in an own commit.
Marked as test_must_failure.
The c-code can be changed in the next commit, and the TC are marked as "test_expect_success"


Back to another topic:
If you want to support the native Git-support for native HG via hg-serve,
I will be happy to assist with reviews.
Please, if possible, don't touch connect.c at all.
(Beside the memory leak fix).

It may be better to start with a real remote-helper and copy the code from connect.c
And, later, if there are common code paths, refactor stuff.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/6] connect: remove get_port()
  2016-05-01 10:10   ` Torsten Bögershausen
@ 2016-05-01 21:43     ` Mike Hommey
  2016-05-03  5:03     ` Jeff King
  1 sibling, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-01 21:43 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git, gitster

On Sun, May 01, 2016 at 12:10:09PM +0200, Torsten Bögershausen wrote:
> On 2016-05-01 08.02, Mike Hommey wrote:
> > get_port() is only used as a fallback when get_host_and_port() does not
> > return a port. But get_port() does the same search as
> > get_host_and_port(), except get_host_and_port() starts from the end of
> > the host, respecting square brackets for ipv6 addresses, and get_port(),
> > operating after get_host_and_port(), works on a modified host string
> > that has square brackes removed if there were any.
> typo: brackets.
> > 
> > I cannot think of any legal host:port string that would not have a port
> > returned by get_host_and_port() *and* have one returned by get_port().
> > So just remove get_port().
> > 
> > Signed-off-by: Mike Hommey <mh@glandium.org>
> Does this pass the test-suite ?
> It doesn't pass here, t5601:
> 
> not ok 39 - bracketed hostnames are still ssh
> #
> #               git clone "[myhost:123]:src" ssh-bracket-clone &&
> #               expect_ssh "-p 123" myhost src
> #
> not ok 40 - uplink is not treated as putty
> #
> #               copy_ssh_wrapper_as "$TRASH_DIRECTORY/uplink" &&
> #               git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
> #               expect_ssh "-p 123" myhost src
> #
> not ok 41 - plink is treated specially (as putty)
> #
> #               copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
> #               git clone "[myhost:123]:src" ssh-bracket-clone-plink-0 &&
> #               expect_ssh "-P 123" myhost src
> #
> not ok 42 - plink.exe is treated specially (as putty)
> #
> #               copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink.exe" &&
> #               git clone "[myhost:123]:src" ssh-bracket-clone-plink-1 &&
> #               expect_ssh "-P 123" myhost src
> #
> not ok 43 - tortoiseplink is like putty, with extra arguments
> 

sigh. I was assuming all the relevant cases were tested in t5500, but
apparently they aren't.

Mike

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code
  2016-05-01 13:37   ` Torsten Bögershausen
@ 2016-05-01 23:20     ` Mike Hommey
  0 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-01 23:20 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git, gitster

On Sun, May 01, 2016 at 03:37:24PM +0200, Torsten Bögershausen wrote:
> I skipped the dates and names (I was responsible for one regression)
> I hope this gives a half-correct overview,
> why I am reluctant to change any code in connect.c
> unless there is a fix for a real world problem.

I don't agree that connect.c should be left untouched because it's
fragile. On the contrary, if anything, it should be made less fragile.

And the more I look at it, the more I'm tempted to just change the
parse_connect_url function itself to do _all_ the parsing, instead of
having some code paths invoke get_host_and_port and some others invoke
get_port on top of that, for some effect that, when you look at the
code, you can't know what it's supposed to be. If anything, my attempts
at cleaning up the code, and partially failing are a demonstration that
the code *does* need some clean up.

> And even here, the test cases should be changed first (and reviewed) in an own commit.
> Marked as test_must_failure.
> The c-code can be changed in the next commit, and the TC are marked as "test_expect_success"

What is the benefit from doing so?

> Back to another topic:
> If you want to support the native Git-support for native HG via hg-serve,
> I will be happy to assist with reviews.
> Please, if possible, don't touch connect.c at all.
> (Beside the memory leak fix).
> 
> It may be better to start with a real remote-helper and copy the code from connect.c
> And, later, if there are common code paths, refactor stuff.

There *are* common code paths. Which is exactly why I'm trying to reuse
connect.c without copying it.

Mike

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code
  2016-05-01  6:02 ` [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code Mike Hommey
  2016-05-01 13:37   ` Torsten Bögershausen
@ 2016-05-02  4:56   ` Torsten Bögershausen
  2016-05-02  8:31     ` Mike Hommey
  1 sibling, 1 reply; 46+ messages in thread
From: Torsten Bögershausen @ 2016-05-02  4:56 UTC (permalink / raw)
  To: Mike Hommey, git; +Cc: gitster, tboegi

On 05/01/2016 08:02 AM, Mike Hommey wrote:
> +	if (flags & CONNECT_DIAG_URL) {
>   		printf("Diag: url=%s\n", url ? url : "NULL");
>   		printf("Diag: protocol=%s\n", prot_name(protocol));
>   		printf("Diag: hostandport=%s\n", hostandport ? hostandport : "NULL");
> +		printf("Diag: userandhost=%s\n", host ? host : "NULL");
> +		printf("Diag: port=%s\n", port ? port : "NONE");
>   		printf("Diag: path=%s\n", path ? path : "NULL");
>   		conn = NULL;
Does it make sense  to write the host twice?
If things are cleaned up, would something like this make sense ?

		printf("Diag: url=%s\n", url ? url : "NULL");
  		printf("Diag: protocol=%s\n", prot_name(protocol));
		printf("Diag: user=%s\n", user ? user : "NULL");
  		printf("Diag: host=%s\n", host ? host : "NULL");
		printf("Diag: port=%s\n", port ? port : "NONE");
  		printf("Diag: path=%s\n", path ? path : "NULL");
  		
or
		printf("Diag: url=%s\n", url ? url : "NULL");
  		printf("Diag: protocol=%s\n", prot_name(protocol));
                 if (user)
			printf("Diag: user=%s\n", user);
  		printf("Diag: host=%s\n", host ? host : "NULL");
		printf("Diag: port=%s\n", port ? port : "NONE");
  		printf("Diag: path=%s\n", path ? path : "NULL");

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code
  2016-05-02  4:56   ` Torsten Bögershausen
@ 2016-05-02  8:31     ` Mike Hommey
  2016-05-02 11:29       ` Torsten Bögershausen
  0 siblings, 1 reply; 46+ messages in thread
From: Mike Hommey @ 2016-05-02  8:31 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git, gitster

On Mon, May 02, 2016 at 06:56:54AM +0200, Torsten Bögershausen wrote:
> On 05/01/2016 08:02 AM, Mike Hommey wrote:
> > +	if (flags & CONNECT_DIAG_URL) {
> >   		printf("Diag: url=%s\n", url ? url : "NULL");
> >   		printf("Diag: protocol=%s\n", prot_name(protocol));
> >   		printf("Diag: hostandport=%s\n", hostandport ? hostandport : "NULL");
> > +		printf("Diag: userandhost=%s\n", host ? host : "NULL");
> > +		printf("Diag: port=%s\n", port ? port : "NONE");
> >   		printf("Diag: path=%s\n", path ? path : "NULL");
> >   		conn = NULL;
> Does it make sense  to write the host twice?
> If things are cleaned up, would something like this make sense ?
> 
> 		printf("Diag: url=%s\n", url ? url : "NULL");
>  		printf("Diag: protocol=%s\n", prot_name(protocol));
> 		printf("Diag: user=%s\n", user ? user : "NULL");
>  		printf("Diag: host=%s\n", host ? host : "NULL");
> 		printf("Diag: port=%s\n", port ? port : "NONE");
>  		printf("Diag: path=%s\n", path ? path : "NULL");

That's what I'm converging towards, in the end. There is one thing that
needs hostandport, though: the git protocol host header. But I don't
really like that parse_connect_url would return user, host, port, *and*
hostandport. How about "reconstructing" hostandport in that case, adding
square brackets when the host contains colons?

BTW, the git protocol currently doesn't reject urls with users and
doesn't seem to handle them properly either. My changes would fix this
by separating user and host at the parse_connect_url level, but the
question then is what to do when there is a user part? die()?

Mike

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code
  2016-05-02  8:31     ` Mike Hommey
@ 2016-05-02 11:29       ` Torsten Bögershausen
  2016-05-02 12:38         ` Mike Hommey
  2016-05-02 22:05         ` Junio C Hamano
  0 siblings, 2 replies; 46+ messages in thread
From: Torsten Bögershausen @ 2016-05-02 11:29 UTC (permalink / raw)
  To: Mike Hommey; +Cc: git, gitster

On 05/02/2016 10:31 AM, Mike Hommey wrote:
> On Mon, May 02, 2016 at 06:56:54AM +0200, Torsten Bögershausen wrote:
>> On 05/01/2016 08:02 AM, Mike Hommey wrote:
>>> +	if (flags & CONNECT_DIAG_URL) {
>>>    		printf("Diag: url=%s\n", url ? url : "NULL");
>>>    		printf("Diag: protocol=%s\n", prot_name(protocol));
>>>    		printf("Diag: hostandport=%s\n", hostandport ? hostandport : "NULL");
>>> +		printf("Diag: userandhost=%s\n", host ? host : "NULL");
>>> +		printf("Diag: port=%s\n", port ? port : "NONE");
>>>    		printf("Diag: path=%s\n", path ? path : "NULL");
>>>    		conn = NULL;
>> Does it make sense  to write the host twice?
>> If things are cleaned up, would something like this make sense ?
>>
>> 		printf("Diag: url=%s\n", url ? url : "NULL");
>>   		printf("Diag: protocol=%s\n", prot_name(protocol));
>> 		printf("Diag: user=%s\n", user ? user : "NULL");
>>   		printf("Diag: host=%s\n", host ? host : "NULL");
>> 		printf("Diag: port=%s\n", port ? port : "NONE");
>>   		printf("Diag: path=%s\n", path ? path : "NULL");
>
> That's what I'm converging towards, in the end. There is one thing that
> needs hostandport, though: the git protocol host header. But I don't
> really like that parse_connect_url would return user, host, port, *and*
> hostandport. How about "reconstructing" hostandport in that case, adding
> square brackets when the host contains colons?
>
> BTW, the git protocol currently doesn't reject urls with users and
> doesn't seem to handle them properly either. My changes would fix this
> by separating user and host at the parse_connect_url level, but the
> question then is what to do when there is a user part? die()?
>
> Mike
>
That is what happening:
git clone git://xx@git.kernel.org/pub/scm/git/git.git
Cloning into 'git'...
fatal: Unable to look up xx@git.kernel.org (port 9418) (Name or service 
not known)

And that may explain, why we have different handling of ssh vs git:
The URL-scheme for git:// simply doesn't specify a user name at all.

git://host:[port]/path/to/repo
Knowing that, the "@" will be feed into the name resolver,
and that's OK.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code
  2016-05-02 11:29       ` Torsten Bögershausen
@ 2016-05-02 12:38         ` Mike Hommey
  2016-05-02 22:05         ` Junio C Hamano
  1 sibling, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-02 12:38 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git, gitster

On Mon, May 02, 2016 at 01:29:15PM +0200, Torsten Bögershausen wrote:
> On 05/02/2016 10:31 AM, Mike Hommey wrote:
> > On Mon, May 02, 2016 at 06:56:54AM +0200, Torsten Bögershausen wrote:
> > > On 05/01/2016 08:02 AM, Mike Hommey wrote:
> > > > +	if (flags & CONNECT_DIAG_URL) {
> > > >    		printf("Diag: url=%s\n", url ? url : "NULL");
> > > >    		printf("Diag: protocol=%s\n", prot_name(protocol));
> > > >    		printf("Diag: hostandport=%s\n", hostandport ? hostandport : "NULL");
> > > > +		printf("Diag: userandhost=%s\n", host ? host : "NULL");
> > > > +		printf("Diag: port=%s\n", port ? port : "NONE");
> > > >    		printf("Diag: path=%s\n", path ? path : "NULL");
> > > >    		conn = NULL;
> > > Does it make sense  to write the host twice?
> > > If things are cleaned up, would something like this make sense ?
> > > 
> > > 		printf("Diag: url=%s\n", url ? url : "NULL");
> > >   		printf("Diag: protocol=%s\n", prot_name(protocol));
> > > 		printf("Diag: user=%s\n", user ? user : "NULL");
> > >   		printf("Diag: host=%s\n", host ? host : "NULL");
> > > 		printf("Diag: port=%s\n", port ? port : "NONE");
> > >   		printf("Diag: path=%s\n", path ? path : "NULL");
> > 
> > That's what I'm converging towards, in the end. There is one thing that
> > needs hostandport, though: the git protocol host header. But I don't
> > really like that parse_connect_url would return user, host, port, *and*
> > hostandport. How about "reconstructing" hostandport in that case, adding
> > square brackets when the host contains colons?
> > 
> > BTW, the git protocol currently doesn't reject urls with users and
> > doesn't seem to handle them properly either. My changes would fix this
> > by separating user and host at the parse_connect_url level, but the
> > question then is what to do when there is a user part? die()?
> > 
> > Mike
> > 
> That is what happening:
> git clone git://xx@git.kernel.org/pub/scm/git/git.git
> Cloning into 'git'...
> fatal: Unable to look up xx@git.kernel.org (port 9418) (Name or service not
> known)
> 
> And that may explain, why we have different handling of ssh vs git:
> The URL-scheme for git:// simply doesn't specify a user name at all.
> 
> git://host:[port]/path/to/repo
> Knowing that, the "@" will be feed into the name resolver,
> and that's OK.

But we might as well throw an error before, because @ is not going to
appear in a valid hostname.

Mike

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code
  2016-05-02 11:29       ` Torsten Bögershausen
  2016-05-02 12:38         ` Mike Hommey
@ 2016-05-02 22:05         ` Junio C Hamano
  2016-05-02 23:14           ` Junio C Hamano
  1 sibling, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2016-05-02 22:05 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: Mike Hommey, git

Torsten Bögershausen <tboegi@web.de> writes:

> git://host:[port]/path/to/repo
> Knowing that, the "@" will be feed into the name resolver,
> and that's OK.

Is it OK?  It is plausible that our client side may even want to
accept git://user:pass@host:port/local/part, and as an anonymous
service, allow it to go to git://host:port/local/part without
sending user:pass part over the wire.  Or with the same knowledge
that git:// is an anonymous service, it is also a plausible policy
to error such a request out.  To implement either needs a robust
parsing of the URL, doesn't it?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code
  2016-05-02 22:05         ` Junio C Hamano
@ 2016-05-02 23:14           ` Junio C Hamano
  0 siblings, 0 replies; 46+ messages in thread
From: Junio C Hamano @ 2016-05-02 23:14 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: Mike Hommey, git

Junio C Hamano <gitster@pobox.com> writes:

> Torsten Bögershausen <tboegi@web.de> writes:
>
>> git://host:[port]/path/to/repo
>> Knowing that, the "@" will be feed into the name resolver,
>> and that's OK.
>
> Is it OK?  It is plausible that our client side may even want to
> accept git://user:pass@host:port/local/part, and as an anonymous
> service, allow it to go to git://host:port/local/part without
> sending user:pass part over the wire.  Or with the same knowledge
> that git:// is an anonymous service, it is also a plausible policy
> to error such a request out.  To implement either needs a robust
> parsing of the URL, doesn't it?

To put it differently, there is a vast difference between

 (1) knowing that xx@git.kernel.org is asking to access
     git.kernel.org as user 'xx' and failing because of a policy
     that says "we do not send auth material over the wire when we
     know we are doing anonymous access"; and

 (2) not caring the distinction between xx@git.kernel.org and
     git.kernel.org, and implicitly relying on the DNS to forbid '@'
     in the hostname and to return a look-up failure when the whole
     string of the former is taken as a hostname in order to fail
     that request.

If we want to fail the request, we should be in control of the
policy to fail.  Even in an unlikely (and impossible) world where
suddenly resolver starts allowing "xx@git.kernel.org" as a host, we
do not want to be making a connection to a bogus host that is not
what the end user requested, i.e. git.kernel.org.

Otherwise we cannot change the policy to allow it in the future if
it turns out to be necessary.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/6] connect: remove get_port()
  2016-05-01 10:10   ` Torsten Bögershausen
  2016-05-01 21:43     ` Mike Hommey
@ 2016-05-03  5:03     ` Jeff King
  2016-05-03  5:11       ` Mike Hommey
  1 sibling, 1 reply; 46+ messages in thread
From: Jeff King @ 2016-05-03  5:03 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: Mike Hommey, git, gitster

On Sun, May 01, 2016 at 12:10:09PM +0200, Torsten Bögershausen wrote:

> On 2016-05-01 08.02, Mike Hommey wrote:
> > get_port() is only used as a fallback when get_host_and_port() does not
> > return a port. But get_port() does the same search as
> > get_host_and_port(), except get_host_and_port() starts from the end of
> > the host, respecting square brackets for ipv6 addresses, and get_port(),
> > operating after get_host_and_port(), works on a modified host string
> > that has square brackes removed if there were any.
> typo: brackets.
> > 
> > I cannot think of any legal host:port string that would not have a port
> > returned by get_host_and_port() *and* have one returned by get_port().
> > So just remove get_port().
> > 
> > Signed-off-by: Mike Hommey <mh@glandium.org>
> Does this pass the test-suite ?
> It doesn't pass here, t5601:

Hmm. I do notice that Mike's patch only touches the ssh code path. The
normal TCP code path _just_ uses get_host_and_port. Does that mean the
TCP side is missing some cases, or is the extra parsing just for
handling bracketed "[host:port]:path" cases for ssh, like:

> not ok 39 - bracketed hostnames are still ssh
> #
> #               git clone "[myhost:123]:src" ssh-bracket-clone &&
> #               expect_ssh "-p 123" myhost src
> #

?

-Peff

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/6] connect: remove get_port()
  2016-05-03  5:03     ` Jeff King
@ 2016-05-03  5:11       ` Mike Hommey
  0 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  5:11 UTC (permalink / raw)
  To: Jeff King; +Cc: Torsten Bögershausen, git, gitster

On Tue, May 03, 2016 at 01:03:38AM -0400, Jeff King wrote:
> On Sun, May 01, 2016 at 12:10:09PM +0200, Torsten Bögershausen wrote:
> 
> > On 2016-05-01 08.02, Mike Hommey wrote:
> > > get_port() is only used as a fallback when get_host_and_port() does not
> > > return a port. But get_port() does the same search as
> > > get_host_and_port(), except get_host_and_port() starts from the end of
> > > the host, respecting square brackets for ipv6 addresses, and get_port(),
> > > operating after get_host_and_port(), works on a modified host string
> > > that has square brackes removed if there were any.
> > typo: brackets.
> > > 
> > > I cannot think of any legal host:port string that would not have a port
> > > returned by get_host_and_port() *and* have one returned by get_port().
> > > So just remove get_port().
> > > 
> > > Signed-off-by: Mike Hommey <mh@glandium.org>
> > Does this pass the test-suite ?
> > It doesn't pass here, t5601:
> 
> Hmm. I do notice that Mike's patch only touches the ssh code path. The
> normal TCP code path _just_ uses get_host_and_port. Does that mean the
> TCP side is missing some cases, or is the extra parsing just for
> handling bracketed "[host:port]:path" cases for ssh, like:
> 
> > not ok 39 - bracketed hostnames are still ssh
> > #
> > #               git clone "[myhost:123]:src" ssh-bracket-clone &&
> > #               expect_ssh "-p 123" myhost src
> > #
> 
> ?

The latter.

Mike

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v4 00/11] connect: various cleanups
  2016-05-01  6:02 [PATCH v3 0/6] connect: various cleanups Mike Hommey
                   ` (5 preceding siblings ...)
  2016-05-01  6:02 ` [PATCH 6/6] connect: move ssh command line preparation to a separate function Mike Hommey
@ 2016-05-03  8:50 ` Mike Hommey
  2016-05-03  8:50   ` [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases Mike Hommey
                     ` (10 more replies)
  6 siblings, 11 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  8:50 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

I went even further this time around. I'm not totally satistifed with
the resulting parse_connect_url function, but at least it feels to me
this series puts us in a better place to actually improve it further.

Mike Hommey (11):
  add fetch-pack --diag-url tests for some corner cases
  connect: call get_host_and_port() earlier
  connect: only match the host with core.gitProxy
  connect: fill the host header in the git protocol with the host and
    port variables
  connect: make parse_connect_url() return separated host and port
  connect: group CONNECT_DIAG_URL handling code
  connect: make parse_connect_url() return the user part of the url as a
    separate value
  connect: change the --diag-url output to separate user and host
  connect: use "-l user" instead of "user@" on ssh command line
  connect: actively reject git:// urls with a user part
  connect: move ssh command line preparation to a separate function

 connect.c             | 223 ++++++++++++++++++++++++++++----------------------
 t/t5500-fetch-pack.sh |  46 ++++++++---
 t/t5601-clone.sh      |  52 ++++++++++--
 3 files changed, 204 insertions(+), 117 deletions(-)

-- 
2.8.1.16.gaa70619.dirty

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
@ 2016-05-03  8:50   ` Mike Hommey
  2016-05-03 16:07     ` Torsten Bögershausen
  2016-05-03 16:07     ` Junio C Hamano
  2016-05-03  8:50   ` [PATCH v4 02/11] connect: call get_host_and_port() earlier Mike Hommey
                     ` (9 subsequent siblings)
  10 siblings, 2 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  8:50 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

t5603-clone-dirname uses url patterns that are not tested with
fetch-pack --diag-url, and it would be useful if they were.

Interestingly, some of those tests, involving both a port and a
user:password pair, don't currently pass. Note that even if a
user:password pair is actually not supported by git, the values used
could be valid user names (user names can actually contain colons
and at signs), and are still worth testing the url parser for.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 t/t5500-fetch-pack.sh | 38 ++++++++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

Note I'm not /entirely/ sure about colons in user names, but ssh happily
sends requests to authenticate with logins containing colons. I however
*do* know it works with at signs (hg.mozilla.org ssh accounts are email
addresses).

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index e5f83bf..1f0133f 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -569,12 +569,27 @@ check_prot_host_port_path () {
 	test_cmp expected actual
 }
 
-for r in repo re:po re/po
+test_maybe_fail () {
+	host=$1; shift
+	case $host in
+		git=*)
+		test_expect_success "$@"
+		;;
+		*:*@*)
+		test_expect_failure "$@"
+		;;
+		*)
+		test_expect_success "$@"
+		;;
+	esac
+}
+
+for r in repo re:po re/po re@po
 do
 	# git or ssh with scheme
 	for p in "ssh+git" "git+ssh" git ssh
 	do
-		for h in host user@host user@[::1] user@::1
+		for h in host user@host user@[::1] user@::1 user:password@host user:passw@rd@host
 		do
 			for c in "" :
 			do
@@ -587,9 +602,12 @@ do
 				'
 			done
 		done
-		for h in host User@host User@[::1]
+		for h in host User@host User@[::1] User:password@host User:passw@rd@host
 		do
-			test_expect_success "fetch-pack --diag-url $p://$h:22/$r" '
+			test_maybe_fail "$p=$h" "fetch-pack --diag-url $p://$h:22/$r" '
+				check_prot_host_port_path $p://$h:22/$r $p "$h" 22 "/$r"
+			'
+			test_maybe_fail "$p=$h" "fetch-pack --diag-url $p://$h:22/$r" '
 				check_prot_host_port_path $p://$h:22/$r $p "$h" 22 "/$r"
 			'
 		done
@@ -628,6 +646,18 @@ do
 			check_prot_host_port_path $h:/~$r $p "$h" NONE "~$r"
 		'
 	done
+	#ssh without scheme with port
+	p=ssh
+	for h in host user@host user:password@host user:passw@rd@host
+	do
+		test_maybe_fail "$h" "fetch-pack --diag-url [$h:22]:$r" '
+			check_prot_host_port_path [$h:22]:$r $p $h 22 "$r"
+		'
+		# Do "/~" -> "~" conversion
+		test_maybe_fail "$h" "fetch-pack --diag-url [$h:22]:/~$r" '
+			check_prot_host_port_path [$h:22]:/~$r $p $h 22 "~$r"
+		'
+	done
 done
 
 test_expect_success MINGW 'fetch-pack --diag-url file://c:/repo' '
-- 
2.8.1.16.gaa70619.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v4 02/11] connect: call get_host_and_port() earlier
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
  2016-05-03  8:50   ` [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases Mike Hommey
@ 2016-05-03  8:50   ` Mike Hommey
  2016-05-03  8:50   ` [PATCH v4 03/11] connect: only match the host with core.gitProxy Mike Hommey
                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  8:50 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

Currently, get_host_and_port() is called in git_connect() for the ssh
protocol, and in git_tcp_connect_sock() for the git protocol. Instead
of doing this, just call it from a single place, right after
parse_connect_url(), and pass the host and port separately to
git_*_connect() functions.

We however preserve hostandport, at least for now.

Note that in git_tcp_connect_sock, the port was set to "<none>" in a
case that never can happen, so that code path was removed.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c | 47 +++++++++++++++++++++++------------------------
 1 file changed, 23 insertions(+), 24 deletions(-)

diff --git a/connect.c b/connect.c
index c53f3f1..d3448c2 100644
--- a/connect.c
+++ b/connect.c
@@ -343,18 +343,16 @@ static const char *ai_name(const struct addrinfo *ai)
 /*
  * Returns a connected socket() fd, or else die()s.
  */
-static int git_tcp_connect_sock(char *host, int flags)
+static int git_tcp_connect_sock(const char *host, const char *port, int flags)
 {
 	struct strbuf error_message = STRBUF_INIT;
 	int sockfd = -1;
-	const char *port = STR(DEFAULT_GIT_PORT);
 	struct addrinfo hints, *ai0, *ai;
 	int gai;
 	int cnt = 0;
 
-	get_host_and_port(&host, &port);
-	if (!*port)
-		port = "<none>";
+	if (!port)
+		port = STR(DEFAULT_GIT_PORT);
 
 	memset(&hints, 0, sizeof(hints));
 	if (flags & CONNECT_IPV4)
@@ -411,11 +409,10 @@ static int git_tcp_connect_sock(char *host, int flags)
 /*
  * Returns a connected socket() fd, or else die()s.
  */
-static int git_tcp_connect_sock(char *host, int flags)
+static int git_tcp_connect_sock(const char *host, const char *port, int flags)
 {
 	struct strbuf error_message = STRBUF_INIT;
 	int sockfd = -1;
-	const char *port = STR(DEFAULT_GIT_PORT);
 	char *ep;
 	struct hostent *he;
 	struct sockaddr_in sa;
@@ -423,7 +420,8 @@ static int git_tcp_connect_sock(char *host, int flags)
 	unsigned int nport;
 	int cnt;
 
-	get_host_and_port(&host, &port);
+	if (!port)
+		port = STR(DEFAULT_GIT_PORT);
 
 	if (flags & CONNECT_VERBOSE)
 		fprintf(stderr, "Looking up %s ... ", host);
@@ -482,9 +480,10 @@ static int git_tcp_connect_sock(char *host, int flags)
 #endif /* NO_IPV6 */
 
 
-static void git_tcp_connect(int fd[2], char *host, int flags)
+static void git_tcp_connect(int fd[2], const char *host, const char *port,
+			    int flags)
 {
-	int sockfd = git_tcp_connect_sock(host, flags);
+	int sockfd = git_tcp_connect_sock(host, port, flags);
 
 	fd[0] = sockfd;
 	fd[1] = dup(sockfd);
@@ -550,18 +549,16 @@ static int git_use_proxy(const char *host)
 	return (git_proxy_command && *git_proxy_command);
 }
 
-static struct child_process *git_proxy_connect(int fd[2], char *host)
+static struct child_process *git_proxy_connect(int fd[2], const char *host,
+					       const char *port)
 {
-	const char *port = STR(DEFAULT_GIT_PORT);
 	struct child_process *proxy;
 
-	get_host_and_port(&host, &port);
-
 	proxy = xmalloc(sizeof(*proxy));
 	child_process_init(proxy);
 	argv_array_push(&proxy->args, git_proxy_command);
 	argv_array_push(&proxy->args, host);
-	argv_array_push(&proxy->args, port);
+	argv_array_push(&proxy->args, port ? port : STR(DEFAULT_GIT_PORT));
 	proxy->in = -1;
 	proxy->out = -1;
 	if (start_command(proxy))
@@ -672,7 +669,8 @@ static struct child_process no_fork = CHILD_PROCESS_INIT;
 struct child_process *git_connect(int fd[2], const char *url,
 				  const char *prog, int flags)
 {
-	char *hostandport, *path;
+	char *hostandport, *path, *host;
+	const char *port = NULL;
 	struct child_process *conn = &no_fork;
 	enum protocol protocol;
 	struct strbuf cmd = STRBUF_INIT;
@@ -683,6 +681,8 @@ struct child_process *git_connect(int fd[2], const char *url,
 	signal(SIGCHLD, SIG_DFL);
 
 	protocol = parse_connect_url(url, &hostandport, &path);
+	host = xstrdup(hostandport);
+	get_host_and_port(&host, &port);
 	if ((flags & CONNECT_DIAG_URL) && (protocol != PROTO_SSH)) {
 		printf("Diag: url=%s\n", url ? url : "NULL");
 		printf("Diag: protocol=%s\n", prot_name(protocol));
@@ -707,9 +707,9 @@ struct child_process *git_connect(int fd[2], const char *url,
 		 * cannot connect.
 		 */
 		if (git_use_proxy(hostandport))
-			conn = git_proxy_connect(fd, hostandport);
+			conn = git_proxy_connect(fd, host, port);
 		else
-			git_tcp_connect(fd, hostandport, flags);
+			git_tcp_connect(fd, host, port, flags);
 		/*
 		 * Separate original protocol components prog and path
 		 * from extended host header with a NUL byte.
@@ -737,22 +737,20 @@ struct child_process *git_connect(int fd[2], const char *url,
 		if (protocol == PROTO_SSH) {
 			const char *ssh;
 			int putty = 0, tortoiseplink = 0;
-			char *ssh_host = hostandport;
-			const char *port = NULL;
 			transport_check_allowed("ssh");
-			get_host_and_port(&ssh_host, &port);
 
 			if (!port)
-				port = get_port(ssh_host);
+				port = get_port(host);
 
 			if (flags & CONNECT_DIAG_URL) {
 				printf("Diag: url=%s\n", url ? url : "NULL");
 				printf("Diag: protocol=%s\n", prot_name(protocol));
-				printf("Diag: userandhost=%s\n", ssh_host ? ssh_host : "NULL");
+				printf("Diag: userandhost=%s\n", host ? host : "NULL");
 				printf("Diag: port=%s\n", port ? port : "NONE");
 				printf("Diag: path=%s\n", path ? path : "NULL");
 
 				free(hostandport);
+				free(host);
 				free(path);
 				free(conn);
 				return NULL;
@@ -798,7 +796,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 				argv_array_push(&conn->args, putty ? "-P" : "-p");
 				argv_array_push(&conn->args, port);
 			}
-			argv_array_push(&conn->args, ssh_host);
+			argv_array_push(&conn->args, host);
 		} else {
 			transport_check_allowed("file");
 		}
@@ -812,6 +810,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		strbuf_release(&cmd);
 	}
 	free(hostandport);
+	free(host);
 	free(path);
 	return conn;
 }
-- 
2.8.1.16.gaa70619.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v4 03/11] connect: only match the host with core.gitProxy
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
  2016-05-03  8:50   ` [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases Mike Hommey
  2016-05-03  8:50   ` [PATCH v4 02/11] connect: call get_host_and_port() earlier Mike Hommey
@ 2016-05-03  8:50   ` Mike Hommey
  2016-05-03  8:50   ` [PATCH v4 04/11] connect: fill the host header in the git protocol with the host and port variables Mike Hommey
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  8:50 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

Currently, core.gitProxy doesn't actually match purely on domain names
as documented: it also matches ports.

So a core.gitProxy value like "script for kernel.org" doesn't make the
script called for an url like git://kernel.org:port/path, while it is
called for git://kernel.org/path.

This per-port behavior seems like an oversight rather than a deliberate
choice, so, make git://kernel.org:port/path call the gitProxy script in
the case described above.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/connect.c b/connect.c
index d3448c2..2a08318 100644
--- a/connect.c
+++ b/connect.c
@@ -706,7 +706,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		/* These underlying connection commands die() if they
 		 * cannot connect.
 		 */
-		if (git_use_proxy(hostandport))
+		if (git_use_proxy(host))
 			conn = git_proxy_connect(fd, host, port);
 		else
 			git_tcp_connect(fd, host, port, flags);
-- 
2.8.1.16.gaa70619.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v4 04/11] connect: fill the host header in the git protocol with the host and port variables
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
                     ` (2 preceding siblings ...)
  2016-05-03  8:50   ` [PATCH v4 03/11] connect: only match the host with core.gitProxy Mike Hommey
@ 2016-05-03  8:50   ` Mike Hommey
  2016-05-03  8:50   ` [PATCH v4 05/11] connect: make parse_connect_url() return separated host and port Mike Hommey
                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  8:50 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

The last use of the hostandport variable, besides being strdup'ed before
being split into host and port, is to fill the host header in the git
protocol.

Instead of relying on parse_connect_url() to return a host:port string
that makes sense there, construct one from the host and port variables.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/connect.c b/connect.c
index 2a08318..ed1a00d 100644
--- a/connect.c
+++ b/connect.c
@@ -695,11 +695,24 @@ struct child_process *git_connect(int fd[2], const char *url,
 		 * connect, unless the user has overridden us in
 		 * the environment.
 		 */
-		char *target_host = getenv("GIT_OVERRIDE_VIRTUAL_HOST");
-		if (target_host)
-			target_host = xstrdup(target_host);
-		else
-			target_host = xstrdup(hostandport);
+		struct strbuf target_host = STRBUF_INIT;
+		char *override_vhost = getenv("GIT_OVERRIDE_VIRTUAL_HOST");
+		if (override_vhost)
+			strbuf_addstr(&target_host, override_vhost);
+		else {
+			/* If the host contains a colon (ipv6 address), it
+			 * needs to be enclosed with square brackets. */
+			const char *colon = strchr(host, ':');
+			if (colon)
+				strbuf_addch(&target_host, '[');
+			strbuf_addstr(&target_host, host);
+			if (colon)
+				strbuf_addch(&target_host, ']');
+			if (port) {
+				strbuf_addch(&target_host, ':');
+				strbuf_addstr(&target_host, port);
+			}
+		}
 
 		transport_check_allowed("git");
 
@@ -720,8 +733,8 @@ struct child_process *git_connect(int fd[2], const char *url,
 		packet_write(fd[1],
 			     "%s %s%chost=%s%c",
 			     prog, path, 0,
-			     target_host, 0);
-		free(target_host);
+			     target_host.buf, 0);
+		strbuf_release(&target_host);
 	} else {
 		conn = xmalloc(sizeof(*conn));
 		child_process_init(conn);
-- 
2.8.1.16.gaa70619.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v4 05/11] connect: make parse_connect_url() return separated host and port
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
                     ` (3 preceding siblings ...)
  2016-05-03  8:50   ` [PATCH v4 04/11] connect: fill the host header in the git protocol with the host and port variables Mike Hommey
@ 2016-05-03  8:50   ` Mike Hommey
  2016-05-03  8:50   ` [PATCH v4 06/11] connect: group CONNECT_DIAG_URL handling code Mike Hommey
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  8:50 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

Now that nothing besides CONNECT_DIAG_URL is using hostandport, we can
have parse_connect_url() itself do the host and port splitting.

This still leaves "user@" part of the host, if there is one, which will
be addressed in a subsequent change. This however does add /some/
handling of the "user@" part of the host, in order to pick the port
properly. This also fixes the tests added 4 commits ago.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c             | 30 ++++++++++++++++++------------
 t/t5500-fetch-pack.sh | 38 +++++++++-----------------------------
 2 files changed, 27 insertions(+), 41 deletions(-)

diff --git a/connect.c b/connect.c
index ed1a00d..3428149 100644
--- a/connect.c
+++ b/connect.c
@@ -589,10 +589,11 @@ static char *get_port(char *host)
  * The caller must free() the returned strings.
  */
 static enum protocol parse_connect_url(const char *url_orig, char **ret_host,
-				       char **ret_path)
+				       char **ret_port, char **ret_path)
 {
 	char *url;
 	char *host, *path;
+	const char *port = NULL;
 	char *end;
 	int separator = '/';
 	enum protocol protocol = PROTO_LOCAL;
@@ -647,7 +648,17 @@ static enum protocol parse_connect_url(const char *url_orig, char **ret_host,
 	path = xstrdup(path);
 	*end = '\0';
 
+	get_host_and_port(&host, &port);
+
+	if (*host && !port) {
+		/* The host might contain a user:password string, ignore it
+		 * when searching for the port again */
+		char *end_user = strrchr(host, '@');
+		port = get_port(end_user ? end_user : host);
+	}
+
 	*ret_host = xstrdup(host);
+	*ret_port = port ? xstrdup(port) : NULL;
 	*ret_path = path;
 	free(url);
 	return protocol;
@@ -669,8 +680,7 @@ static struct child_process no_fork = CHILD_PROCESS_INIT;
 struct child_process *git_connect(int fd[2], const char *url,
 				  const char *prog, int flags)
 {
-	char *hostandport, *path, *host;
-	const char *port = NULL;
+	char *host, *port, *path;
 	struct child_process *conn = &no_fork;
 	enum protocol protocol;
 	struct strbuf cmd = STRBUF_INIT;
@@ -680,13 +690,12 @@ struct child_process *git_connect(int fd[2], const char *url,
 	 */
 	signal(SIGCHLD, SIG_DFL);
 
-	protocol = parse_connect_url(url, &hostandport, &path);
-	host = xstrdup(hostandport);
-	get_host_and_port(&host, &port);
+	protocol = parse_connect_url(url, &host, &port, &path);
 	if ((flags & CONNECT_DIAG_URL) && (protocol != PROTO_SSH)) {
 		printf("Diag: url=%s\n", url ? url : "NULL");
 		printf("Diag: protocol=%s\n", prot_name(protocol));
-		printf("Diag: hostandport=%s\n", hostandport ? hostandport : "NULL");
+		printf("Diag: userandhost=%s\n", host ? host : "NULL");
+		printf("Diag: port=%s\n", port ? port : "NONE");
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
 	} else if (protocol == PROTO_GIT) {
@@ -752,9 +761,6 @@ struct child_process *git_connect(int fd[2], const char *url,
 			int putty = 0, tortoiseplink = 0;
 			transport_check_allowed("ssh");
 
-			if (!port)
-				port = get_port(host);
-
 			if (flags & CONNECT_DIAG_URL) {
 				printf("Diag: url=%s\n", url ? url : "NULL");
 				printf("Diag: protocol=%s\n", prot_name(protocol));
@@ -762,8 +768,8 @@ struct child_process *git_connect(int fd[2], const char *url,
 				printf("Diag: port=%s\n", port ? port : "NONE");
 				printf("Diag: path=%s\n", path ? path : "NULL");
 
-				free(hostandport);
 				free(host);
+				free(port);
 				free(path);
 				free(conn);
 				return NULL;
@@ -822,8 +828,8 @@ struct child_process *git_connect(int fd[2], const char *url,
 		fd[1] = conn->in;  /* write to child's stdin */
 		strbuf_release(&cmd);
 	}
-	free(hostandport);
 	free(host);
+	free(port);
 	free(path);
 	return conn;
 }
diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 1f0133f..09d46c3 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -537,7 +537,7 @@ check_prot_path () {
 	Diag: protocol=$2
 	Diag: path=$3
 	EOF
-	git fetch-pack --diag-url "$1" | grep -v hostandport= >actual &&
+	git fetch-pack --diag-url "$1" | grep -v host= | grep -v port= >actual &&
 	test_cmp expected actual
 }
 
@@ -546,22 +546,17 @@ check_prot_host_port_path () {
 	case "$2" in
 		*ssh*)
 		pp=ssh
-		uah=userandhost
-		ehost=$(echo $3 | tr -d "[]")
-		diagport="Diag: port=$4"
 		;;
 		*)
-		pp=$p
-		uah=hostandport
-		ehost=$(echo $3$4 | sed -e "s/22$/:22/" -e "s/NONE//")
-		diagport=""
+		pp=$2
 		;;
 	esac
+	ehost=$(echo $3 | tr -d "[]")
 	cat >exp <<-EOF &&
 	Diag: url=$1
 	Diag: protocol=$pp
-	Diag: $uah=$ehost
-	$diagport
+	Diag: userandhost=$ehost
+	Diag: port=$4
 	Diag: path=$5
 	EOF
 	grep -v "^$" exp >expected
@@ -569,21 +564,6 @@ check_prot_host_port_path () {
 	test_cmp expected actual
 }
 
-test_maybe_fail () {
-	host=$1; shift
-	case $host in
-		git=*)
-		test_expect_success "$@"
-		;;
-		*:*@*)
-		test_expect_failure "$@"
-		;;
-		*)
-		test_expect_success "$@"
-		;;
-	esac
-}
-
 for r in repo re:po re/po re@po
 do
 	# git or ssh with scheme
@@ -604,10 +584,10 @@ do
 		done
 		for h in host User@host User@[::1] User:password@host User:passw@rd@host
 		do
-			test_maybe_fail "$p=$h" "fetch-pack --diag-url $p://$h:22/$r" '
+			test_expect_success "fetch-pack --diag-url $p://$h:22/$r" '
 				check_prot_host_port_path $p://$h:22/$r $p "$h" 22 "/$r"
 			'
-			test_maybe_fail "$p=$h" "fetch-pack --diag-url $p://$h:22/$r" '
+			test_expect_success "fetch-pack --diag-url $p://$h:22/$r" '
 				check_prot_host_port_path $p://$h:22/$r $p "$h" 22 "/$r"
 			'
 		done
@@ -650,11 +630,11 @@ do
 	p=ssh
 	for h in host user@host user:password@host user:passw@rd@host
 	do
-		test_maybe_fail "$h" "fetch-pack --diag-url [$h:22]:$r" '
+		test_expect_success "fetch-pack --diag-url [$h:22]:$r" '
 			check_prot_host_port_path [$h:22]:$r $p $h 22 "$r"
 		'
 		# Do "/~" -> "~" conversion
-		test_maybe_fail "$h" "fetch-pack --diag-url [$h:22]:/~$r" '
+		test_expect_success "fetch-pack --diag-url [$h:22]:/~$r" '
 			check_prot_host_port_path [$h:22]:/~$r $p $h 22 "~$r"
 		'
 	done
-- 
2.8.1.16.gaa70619.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v4 06/11] connect: group CONNECT_DIAG_URL handling code
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
                     ` (4 preceding siblings ...)
  2016-05-03  8:50   ` [PATCH v4 05/11] connect: make parse_connect_url() return separated host and port Mike Hommey
@ 2016-05-03  8:50   ` Mike Hommey
  2016-05-03  8:50   ` [PATCH v4 07/11] connect: make parse_connect_url() return the user part of the url as a separate value Mike Hommey
                     ` (4 subsequent siblings)
  10 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  8:50 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

Previous changes made both branches handling CONNECT_DIAG_URL identical.
We can now remove one of those branches and have CONNECT_DIAG_URL be
handled in one place.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/connect.c b/connect.c
index 3428149..8813f90 100644
--- a/connect.c
+++ b/connect.c
@@ -691,7 +691,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 	signal(SIGCHLD, SIG_DFL);
 
 	protocol = parse_connect_url(url, &host, &port, &path);
-	if ((flags & CONNECT_DIAG_URL) && (protocol != PROTO_SSH)) {
+	if (flags & CONNECT_DIAG_URL) {
 		printf("Diag: url=%s\n", url ? url : "NULL");
 		printf("Diag: protocol=%s\n", prot_name(protocol));
 		printf("Diag: userandhost=%s\n", host ? host : "NULL");
@@ -761,20 +761,6 @@ struct child_process *git_connect(int fd[2], const char *url,
 			int putty = 0, tortoiseplink = 0;
 			transport_check_allowed("ssh");
 
-			if (flags & CONNECT_DIAG_URL) {
-				printf("Diag: url=%s\n", url ? url : "NULL");
-				printf("Diag: protocol=%s\n", prot_name(protocol));
-				printf("Diag: userandhost=%s\n", host ? host : "NULL");
-				printf("Diag: port=%s\n", port ? port : "NONE");
-				printf("Diag: path=%s\n", path ? path : "NULL");
-
-				free(host);
-				free(port);
-				free(path);
-				free(conn);
-				return NULL;
-			}
-
 			ssh = getenv("GIT_SSH_COMMAND");
 			if (!ssh) {
 				const char *base;
-- 
2.8.1.16.gaa70619.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v4 07/11] connect: make parse_connect_url() return the user part of the url as a separate value
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
                     ` (5 preceding siblings ...)
  2016-05-03  8:50   ` [PATCH v4 06/11] connect: group CONNECT_DIAG_URL handling code Mike Hommey
@ 2016-05-03  8:50   ` Mike Hommey
  2016-05-03  8:50   ` [PATCH v4 08/11] connect: change the --diag-url output to separate user and host Mike Hommey
                     ` (3 subsequent siblings)
  10 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  8:50 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c | 37 +++++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/connect.c b/connect.c
index 8813f90..e95e385 100644
--- a/connect.c
+++ b/connect.c
@@ -588,11 +588,13 @@ static char *get_port(char *host)
  * Extract protocol and relevant parts from the specified connection URL.
  * The caller must free() the returned strings.
  */
-static enum protocol parse_connect_url(const char *url_orig, char **ret_host,
-				       char **ret_port, char **ret_path)
+static enum protocol parse_connect_url(const char *url_orig, char **ret_user,
+				       char **ret_host, char **ret_port,
+				       char **ret_path)
 {
 	char *url;
 	char *host, *path;
+	const char *user = NULL;
 	const char *port = NULL;
 	char *end;
 	int separator = '/';
@@ -650,13 +652,20 @@ static enum protocol parse_connect_url(const char *url_orig, char **ret_host,
 
 	get_host_and_port(&host, &port);
 
-	if (*host && !port) {
+	if (*host) {
 		/* The host might contain a user:password string, ignore it
 		 * when searching for the port again */
 		char *end_user = strrchr(host, '@');
-		port = get_port(end_user ? end_user : host);
+		if (end_user) {
+			*end_user = '\0';
+			user = host;
+			host = end_user + 1;
+		}
 	}
+	if (!port)
+		port = get_port(host);
 
+	*ret_user = user ? xstrdup(user) : NULL;
 	*ret_host = xstrdup(host);
 	*ret_port = port ? xstrdup(port) : NULL;
 	*ret_path = path;
@@ -680,7 +689,7 @@ static struct child_process no_fork = CHILD_PROCESS_INIT;
 struct child_process *git_connect(int fd[2], const char *url,
 				  const char *prog, int flags)
 {
-	char *host, *port, *path;
+	char *user, *host, *port, *path;
 	struct child_process *conn = &no_fork;
 	enum protocol protocol;
 	struct strbuf cmd = STRBUF_INIT;
@@ -690,11 +699,14 @@ struct child_process *git_connect(int fd[2], const char *url,
 	 */
 	signal(SIGCHLD, SIG_DFL);
 
-	protocol = parse_connect_url(url, &host, &port, &path);
+	protocol = parse_connect_url(url, &user, &host, &port, &path);
 	if (flags & CONNECT_DIAG_URL) {
 		printf("Diag: url=%s\n", url ? url : "NULL");
 		printf("Diag: protocol=%s\n", prot_name(protocol));
-		printf("Diag: userandhost=%s\n", host ? host : "NULL");
+		if (user)
+			printf("Diag: userandhost=%s@%s\n", user, host);
+		else
+			printf("Diag: userandhost=%s\n", host ? host : "NULL");
 		printf("Diag: port=%s\n", port ? port : "NONE");
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
@@ -801,7 +813,15 @@ struct child_process *git_connect(int fd[2], const char *url,
 				argv_array_push(&conn->args, putty ? "-P" : "-p");
 				argv_array_push(&conn->args, port);
 			}
-			argv_array_push(&conn->args, host);
+			if (user) {
+				struct strbuf userandhost = STRBUF_INIT;
+				strbuf_addstr(&userandhost, user);
+				strbuf_addch(&userandhost, '@');
+				strbuf_addstr(&userandhost, host);
+				argv_array_push(&conn->args, userandhost.buf);
+				strbuf_release(&userandhost);
+			} else
+				argv_array_push(&conn->args, host);
 		} else {
 			transport_check_allowed("file");
 		}
@@ -814,6 +834,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		fd[1] = conn->in;  /* write to child's stdin */
 		strbuf_release(&cmd);
 	}
+	free(user);
 	free(host);
 	free(port);
 	free(path);
-- 
2.8.1.16.gaa70619.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v4 08/11] connect: change the --diag-url output to separate user and host
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
                     ` (6 preceding siblings ...)
  2016-05-03  8:50   ` [PATCH v4 07/11] connect: make parse_connect_url() return the user part of the url as a separate value Mike Hommey
@ 2016-05-03  8:50   ` Mike Hommey
  2016-05-03 16:20     ` Torsten Bögershausen
  2016-05-03  8:50   ` [PATCH v4 09/11] connect: use "-l user" instead of "user@" on ssh command line Mike Hommey
                     ` (2 subsequent siblings)
  10 siblings, 1 reply; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  8:50 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c             |  6 ++----
 t/t5500-fetch-pack.sh | 14 ++++++++++++--
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/connect.c b/connect.c
index e95e385..2c5b722 100644
--- a/connect.c
+++ b/connect.c
@@ -703,10 +703,8 @@ struct child_process *git_connect(int fd[2], const char *url,
 	if (flags & CONNECT_DIAG_URL) {
 		printf("Diag: url=%s\n", url ? url : "NULL");
 		printf("Diag: protocol=%s\n", prot_name(protocol));
-		if (user)
-			printf("Diag: userandhost=%s@%s\n", user, host);
-		else
-			printf("Diag: userandhost=%s\n", host ? host : "NULL");
+		printf("Diag: user=%s\n", user ? user : "NULL");
+		printf("Diag: host=%s\n", host ? host : "NULL");
 		printf("Diag: port=%s\n", port ? port : "NONE");
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 09d46c3..0c2f79f 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -537,7 +537,7 @@ check_prot_path () {
 	Diag: protocol=$2
 	Diag: path=$3
 	EOF
-	git fetch-pack --diag-url "$1" | grep -v host= | grep -v port= >actual &&
+	git fetch-pack --diag-url "$1" | grep -v user= | grep -v host= | grep -v port= >actual &&
 	test_cmp expected actual
 }
 
@@ -552,10 +552,20 @@ check_prot_host_port_path () {
 		;;
 	esac
 	ehost=$(echo $3 | tr -d "[]")
+	case "$ehost" in
+		*@*)
+		user=${ehost%@*}
+		ehost=${ehost#$user@}
+		;;
+		*)
+		user=NULL
+		;;
+	esac
 	cat >exp <<-EOF &&
 	Diag: url=$1
 	Diag: protocol=$pp
-	Diag: userandhost=$ehost
+	Diag: user=$user
+	Diag: host=$ehost
 	Diag: port=$4
 	Diag: path=$5
 	EOF
-- 
2.8.1.16.gaa70619.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v4 09/11] connect: use "-l user" instead of "user@" on ssh command line
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
                     ` (7 preceding siblings ...)
  2016-05-03  8:50   ` [PATCH v4 08/11] connect: change the --diag-url output to separate user and host Mike Hommey
@ 2016-05-03  8:50   ` Mike Hommey
  2016-05-03 16:25     ` Torsten Bögershausen
  2016-05-03 17:33     ` Eric Sunshine
  2016-05-03  8:50   ` [PATCH v4 10/11] connect: actively reject git:// urls with a user part Mike Hommey
  2016-05-03  8:50   ` [PATCH v4 11/11] connect: move ssh command line preparation to a separate function Mike Hommey
  10 siblings, 2 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  8:50 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

While it is not strictly necessary, it makes the connect code simpler
when there is user.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c        | 12 ++++--------
 t/t5601-clone.sh | 52 ++++++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 48 insertions(+), 16 deletions(-)

This is entirely optional.

diff --git a/connect.c b/connect.c
index 2c5b722..0cec822 100644
--- a/connect.c
+++ b/connect.c
@@ -812,14 +812,10 @@ struct child_process *git_connect(int fd[2], const char *url,
 				argv_array_push(&conn->args, port);
 			}
 			if (user) {
-				struct strbuf userandhost = STRBUF_INIT;
-				strbuf_addstr(&userandhost, user);
-				strbuf_addch(&userandhost, '@');
-				strbuf_addstr(&userandhost, host);
-				argv_array_push(&conn->args, userandhost.buf);
-				strbuf_release(&userandhost);
-			} else
-				argv_array_push(&conn->args, host);
+				argv_array_push(&conn->args, "-l");
+				argv_array_push(&conn->args, user);
+			}
+			argv_array_push(&conn->args, host);
 		} else {
 			transport_check_allowed("file");
 		}
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index c1efb8e..98fe861 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -464,38 +464,74 @@ test_expect_success 'clone ssh://host.xz:22/~repo' '
 '
 
 #IPv6
-for tuah in ::1 [::1] [::1]: user@::1 user@[::1] user@[::1]: [user@::1] [user@::1]:
+for tah in ::1 [::1] [::1]:
+do
+	ehost=$(echo $tah | sed -e "s/1]:/1]/ "| tr -d "[]")
+	test_expect_success "clone ssh://$tah/home/user/repo" "
+	  test_clone_url ssh://$tah/home/user/repo $ehost /home/user/repo
+	"
+done
+
+for tuah in user@::1 user@[::1] user@[::1]: [user@::1] [user@::1]:
 do
 	ehost=$(echo $tuah | sed -e "s/1]:/1]/ "| tr -d "[]")
+	ehost=${ehost#user@}
 	test_expect_success "clone ssh://$tuah/home/user/repo" "
-	  test_clone_url ssh://$tuah/home/user/repo $ehost /home/user/repo
+	  test_clone_url ssh://$tuah/home/user/repo '-l user $ehost' /home/user/repo
 	"
 done
 
 #IPv6 from home directory
-for tuah in ::1 [::1] user@::1 user@[::1] [user@::1]
+for tah in ::1 [::1]
+do
+	eah=$(echo $tah | tr -d "[]")
+	test_expect_success "clone ssh://$tah/~repo" "
+	  test_clone_url ssh://$tah/~repo $eah '~repo'
+	"
+done
+
+for tuah in user@::1 user@[::1] [user@::1]
 do
 	euah=$(echo $tuah | tr -d "[]")
+	eah=${euah#user@}
 	test_expect_success "clone ssh://$tuah/~repo" "
-	  test_clone_url ssh://$tuah/~repo $euah '~repo'
+	  test_clone_url ssh://$tuah/~repo '-l user' $eah '~repo'
 	"
 done
 
 #IPv6 with port number
-for tuah in [::1] user@[::1] [user@::1]
+for tah in [::1]
+do
+	eah=$(echo $tah | tr -d "[]")
+	test_expect_success "clone ssh://$tah:22/home/user/repo" "
+	  test_clone_url ssh://$tah:22/home/user/repo '-p 22' $eah /home/user/repo
+	"
+done
+
+for tuah in user@[::1] [user@::1]
 do
 	euah=$(echo $tuah | tr -d "[]")
+	eah=${euah#user@}
 	test_expect_success "clone ssh://$tuah:22/home/user/repo" "
-	  test_clone_url ssh://$tuah:22/home/user/repo '-p 22' $euah /home/user/repo
+	  test_clone_url ssh://$tuah:22/home/user/repo '-p 22 -l user' $eah /home/user/repo
 	"
 done
 
 #IPv6 from home directory with port number
-for tuah in [::1] user@[::1] [user@::1]
+for tah in [::1]
+do
+	eah=$(echo $tah | tr -d "[]")
+	test_expect_success "clone ssh://$tah:22/~repo" "
+	  test_clone_url ssh://$tah:22/~repo '-p 22' $eah '~repo'
+	"
+done
+
+for tuah in user@[::1] [user@::1]
 do
 	euah=$(echo $tuah | tr -d "[]")
+	eah=${euah#user@}
 	test_expect_success "clone ssh://$tuah:22/~repo" "
-	  test_clone_url ssh://$tuah:22/~repo '-p 22' $euah '~repo'
+	  test_clone_url ssh://$tuah:22/~repo '-p 22 -l user' $eah '~repo'
 	"
 done
 
-- 
2.8.1.16.gaa70619.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v4 10/11] connect: actively reject git:// urls with a user part
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
                     ` (8 preceding siblings ...)
  2016-05-03  8:50   ` [PATCH v4 09/11] connect: use "-l user" instead of "user@" on ssh command line Mike Hommey
@ 2016-05-03  8:50   ` Mike Hommey
  2016-05-03  8:50   ` [PATCH v4 11/11] connect: move ssh command line preparation to a separate function Mike Hommey
  10 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  8:50 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

Currently, urls of the for git://user@host don't work because user@host
is not resolving at the DNS level, but we shouldn't be relying on it
being an invalid host name, and actively reject it for containing a
username in the first place.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/connect.c b/connect.c
index 0cec822..215d6d9 100644
--- a/connect.c
+++ b/connect.c
@@ -716,6 +716,9 @@ struct child_process *git_connect(int fd[2], const char *url,
 		 */
 		struct strbuf target_host = STRBUF_INIT;
 		char *override_vhost = getenv("GIT_OVERRIDE_VIRTUAL_HOST");
+		if (user)
+			die("user@host is not allowed in git:// urls");
+
 		if (override_vhost)
 			strbuf_addstr(&target_host, override_vhost);
 		else {
-- 
2.8.1.16.gaa70619.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v4 11/11] connect: move ssh command line preparation to a separate function
  2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
                     ` (9 preceding siblings ...)
  2016-05-03  8:50   ` [PATCH v4 10/11] connect: actively reject git:// urls with a user part Mike Hommey
@ 2016-05-03  8:50   ` Mike Hommey
  2016-05-03 12:30     ` [PATCH v4.1 " Mike Hommey
  10 siblings, 1 reply; 46+ messages in thread
From: Mike Hommey @ 2016-05-03  8:50 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 connect.c | 109 ++++++++++++++++++++++++++++++++++----------------------------
 1 file changed, 59 insertions(+), 50 deletions(-)

diff --git a/connect.c b/connect.c
index 215d6d9..37b3140 100644
--- a/connect.c
+++ b/connect.c
@@ -673,6 +673,62 @@ static enum protocol parse_connect_url(const char *url_orig, char **ret_user,
 	return protocol;
 }
 
+static int prepare_ssh_command(struct argv_array *cmd, const char *user,
+			       const char *host, const char *port, int flags)
+{
+	const char *ssh;
+	int putty = 0, tortoiseplink = 0, use_shell = 1;
+	transport_check_allowed("ssh");
+
+	ssh = getenv("GIT_SSH_COMMAND");
+	if (!ssh) {
+		const char *base;
+		char *ssh_dup;
+
+		/*
+		 * GIT_SSH is the no-shell version of
+		 * GIT_SSH_COMMAND (and must remain so for
+		 * historical compatibility).
+		 */
+		use_shell = 0;
+
+		ssh = getenv("GIT_SSH");
+		if (!ssh)
+			ssh = "ssh";
+
+		ssh_dup = xstrdup(ssh);
+		base = basename(ssh_dup);
+
+		tortoiseplink = !strcasecmp(base, "tortoiseplink") ||
+			!strcasecmp(base, "tortoiseplink.exe");
+		putty = tortoiseplink ||
+			!strcasecmp(base, "plink") ||
+			!strcasecmp(base, "plink.exe");
+
+		free(ssh_dup);
+	}
+
+	argv_array_push(cmd, ssh);
+	if (flags & CONNECT_IPV4)
+		argv_array_push(cmd, "-4");
+	else if (flags & CONNECT_IPV6)
+		argv_array_push(cmd, "-6");
+	if (tortoiseplink)
+		argv_array_push(cmd, "-batch");
+	if (port) {
+		/* P is for PuTTY, p is for OpenSSH */
+		argv_array_push(cmd, putty ? "-P" : "-p");
+		argv_array_push(cmd, port);
+	}
+	if (user) {
+		argv_array_push(cmd, "-l");
+		argv_array_push(cmd, user);
+	}
+	argv_array_push(cmd, host);
+
+	return use_shell;
+}
+
 static struct child_process no_fork = CHILD_PROCESS_INIT;
 
 /*
@@ -767,59 +823,12 @@ struct child_process *git_connect(int fd[2], const char *url,
 
 		/* remove repo-local variables from the environment */
 		conn->env = local_repo_env;
-		conn->use_shell = 1;
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
-			const char *ssh;
-			int putty = 0, tortoiseplink = 0;
-			transport_check_allowed("ssh");
-
-			ssh = getenv("GIT_SSH_COMMAND");
-			if (!ssh) {
-				const char *base;
-				char *ssh_dup;
-
-				/*
-				 * GIT_SSH is the no-shell version of
-				 * GIT_SSH_COMMAND (and must remain so for
-				 * historical compatibility).
-				 */
-				conn->use_shell = 0;
-
-				ssh = getenv("GIT_SSH");
-				if (!ssh)
-					ssh = "ssh";
-
-				ssh_dup = xstrdup(ssh);
-				base = basename(ssh_dup);
-
-				tortoiseplink = !strcasecmp(base, "tortoiseplink") ||
-					!strcasecmp(base, "tortoiseplink.exe");
-				putty = tortoiseplink ||
-					!strcasecmp(base, "plink") ||
-					!strcasecmp(base, "plink.exe");
-
-				free(ssh_dup);
-			}
-
-			argv_array_push(&conn->args, ssh);
-			if (flags & CONNECT_IPV4)
-				argv_array_push(&conn->args, "-4");
-			else if (flags & CONNECT_IPV6)
-				argv_array_push(&conn->args, "-6");
-			if (tortoiseplink)
-				argv_array_push(&conn->args, "-batch");
-			if (port) {
-				/* P is for PuTTY, p is for OpenSSH */
-				argv_array_push(&conn->args, putty ? "-P" : "-p");
-				argv_array_push(&conn->args, port);
-			}
-			if (user) {
-				argv_array_push(&conn->args, "-l");
-				argv_array_push(&conn->args, user);
-			}
-			argv_array_push(&conn->args, host);
+			conn->use_shell = prepare_ssh_command(
+				&conn->args, host, port, flags);
 		} else {
+			conn->use_shell = 1;
 			transport_check_allowed("file");
 		}
 		argv_array_push(&conn->args, cmd.buf);
-- 
2.8.1.16.gaa70619.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v4.1 11/11] connect: move ssh command line preparation to a separate function
  2016-05-03  8:50   ` [PATCH v4 11/11] connect: move ssh command line preparation to a separate function Mike Hommey
@ 2016-05-03 12:30     ` Mike Hommey
  0 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03 12:30 UTC (permalink / raw)
  To: git; +Cc: gitster, tboegi

---
 connect.c | 109 ++++++++++++++++++++++++++++++++++----------------------------
 1 file changed, 59 insertions(+), 50 deletions(-)

Err, I had forgotten to commit --amend to add a missing argument.

diff --git a/connect.c b/connect.c
index 215d6d9..5f86983 100644
--- a/connect.c
+++ b/connect.c
@@ -673,6 +673,62 @@ static enum protocol parse_connect_url(const char *url_orig, char **ret_user,
 	return protocol;
 }
 
+static int prepare_ssh_command(struct argv_array *cmd, const char *user,
+			       const char *host, const char *port, int flags)
+{
+	const char *ssh;
+	int putty = 0, tortoiseplink = 0, use_shell = 1;
+	transport_check_allowed("ssh");
+
+	ssh = getenv("GIT_SSH_COMMAND");
+	if (!ssh) {
+		const char *base;
+		char *ssh_dup;
+
+		/*
+		 * GIT_SSH is the no-shell version of
+		 * GIT_SSH_COMMAND (and must remain so for
+		 * historical compatibility).
+		 */
+		use_shell = 0;
+
+		ssh = getenv("GIT_SSH");
+		if (!ssh)
+			ssh = "ssh";
+
+		ssh_dup = xstrdup(ssh);
+		base = basename(ssh_dup);
+
+		tortoiseplink = !strcasecmp(base, "tortoiseplink") ||
+			!strcasecmp(base, "tortoiseplink.exe");
+		putty = tortoiseplink ||
+			!strcasecmp(base, "plink") ||
+			!strcasecmp(base, "plink.exe");
+
+		free(ssh_dup);
+	}
+
+	argv_array_push(cmd, ssh);
+	if (flags & CONNECT_IPV4)
+		argv_array_push(cmd, "-4");
+	else if (flags & CONNECT_IPV6)
+		argv_array_push(cmd, "-6");
+	if (tortoiseplink)
+		argv_array_push(cmd, "-batch");
+	if (port) {
+		/* P is for PuTTY, p is for OpenSSH */
+		argv_array_push(cmd, putty ? "-P" : "-p");
+		argv_array_push(cmd, port);
+	}
+	if (user) {
+		argv_array_push(cmd, "-l");
+		argv_array_push(cmd, user);
+	}
+	argv_array_push(cmd, host);
+
+	return use_shell;
+}
+
 static struct child_process no_fork = CHILD_PROCESS_INIT;
 
 /*
@@ -767,59 +823,12 @@ struct child_process *git_connect(int fd[2], const char *url,
 
 		/* remove repo-local variables from the environment */
 		conn->env = local_repo_env;
-		conn->use_shell = 1;
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
-			const char *ssh;
-			int putty = 0, tortoiseplink = 0;
-			transport_check_allowed("ssh");
-
-			ssh = getenv("GIT_SSH_COMMAND");
-			if (!ssh) {
-				const char *base;
-				char *ssh_dup;
-
-				/*
-				 * GIT_SSH is the no-shell version of
-				 * GIT_SSH_COMMAND (and must remain so for
-				 * historical compatibility).
-				 */
-				conn->use_shell = 0;
-
-				ssh = getenv("GIT_SSH");
-				if (!ssh)
-					ssh = "ssh";
-
-				ssh_dup = xstrdup(ssh);
-				base = basename(ssh_dup);
-
-				tortoiseplink = !strcasecmp(base, "tortoiseplink") ||
-					!strcasecmp(base, "tortoiseplink.exe");
-				putty = tortoiseplink ||
-					!strcasecmp(base, "plink") ||
-					!strcasecmp(base, "plink.exe");
-
-				free(ssh_dup);
-			}
-
-			argv_array_push(&conn->args, ssh);
-			if (flags & CONNECT_IPV4)
-				argv_array_push(&conn->args, "-4");
-			else if (flags & CONNECT_IPV6)
-				argv_array_push(&conn->args, "-6");
-			if (tortoiseplink)
-				argv_array_push(&conn->args, "-batch");
-			if (port) {
-				/* P is for PuTTY, p is for OpenSSH */
-				argv_array_push(&conn->args, putty ? "-P" : "-p");
-				argv_array_push(&conn->args, port);
-			}
-			if (user) {
-				argv_array_push(&conn->args, "-l");
-				argv_array_push(&conn->args, user);
-			}
-			argv_array_push(&conn->args, host);
+			conn->use_shell = prepare_ssh_command(
+				&conn->args, user, host, port, flags);
 		} else {
+			conn->use_shell = 1;
 			transport_check_allowed("file");
 		}
 		argv_array_push(&conn->args, cmd.buf);
-- 
2.8.1.18.gbe709d1.dirty

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases
  2016-05-03  8:50   ` [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases Mike Hommey
@ 2016-05-03 16:07     ` Torsten Bögershausen
  2016-05-03 16:07     ` Junio C Hamano
  1 sibling, 0 replies; 46+ messages in thread
From: Torsten Bögershausen @ 2016-05-03 16:07 UTC (permalink / raw)
  To: Mike Hommey, git; +Cc: gitster, tboegi

On 2016-05-03 10.50, Mike Hommey wrote:
> diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
> index e5f83bf..1f0133f 100755
> --- a/t/t5500-fetch-pack.sh
> +++ b/t/t5500-fetch-pack.sh
> @@ -569,12 +569,27 @@ check_prot_host_port_path () {
>  	test_cmp expected actual
>  }
>  
> -for r in repo re:po re/po
> +test_maybe_fail () {
test_may_fail sounds non-deterministic or so ;-)
how about test_git_and_colon() or similar ?
> +	host=$1; shift
> +	case $host in
> +		git=*)
> +		test_expect_success "$@"
> +		;;
> +		*:*@*)
> +		test_expect_failure "$@"
> +		;;
> +		*)
> +		test_expect_success "$@"
> +		;;
> +	esac
> +}

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases
  2016-05-03  8:50   ` [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases Mike Hommey
  2016-05-03 16:07     ` Torsten Bögershausen
@ 2016-05-03 16:07     ` Junio C Hamano
  2016-05-03 16:30       ` Torsten Bögershausen
  2016-05-03 22:48       ` Mike Hommey
  1 sibling, 2 replies; 46+ messages in thread
From: Junio C Hamano @ 2016-05-03 16:07 UTC (permalink / raw)
  To: Mike Hommey; +Cc: git, tboegi

Mike Hommey <mh@glandium.org> writes:

> t5603-clone-dirname uses url patterns that are not tested with
> fetch-pack --diag-url, and it would be useful if they were.
>
> Interestingly, some of those tests, involving both a port and a
> user:password pair, don't currently pass. Note that even if a
> user:password pair is actually not supported by git, the values used
> could be valid user names (user names can actually contain colons
> and at signs), and are still worth testing the url parser for.

I am not sure about the last part of this (and the tests in the
patch for them).  When you are constrained by the Common Internet
Scheme Syntax, i.e.

    <scheme>://<user>:<password>@<host>:<port>/<url-path>

you cannot have arbitrary characters in these parts; within the user
and password field, any ":", "@", or "/" must be encoded.

Which maens that for the purpose of the parser you are modifying,
you can rely on these three special characters to parse things out
(decoding after the code determines which part is user and which
part is password is a separate issue).

> Signed-off-by: Mike Hommey <mh@glandium.org>
> ---
>  t/t5500-fetch-pack.sh | 38 ++++++++++++++++++++++++++++++++++----
>  1 file changed, 34 insertions(+), 4 deletions(-)
>
> Note I'm not /entirely/ sure about colons in user names, but ssh happily
> sends requests to authenticate with logins containing colons. I however
> *do* know it works with at signs (hg.mozilla.org ssh accounts are email
> addresses).
>
> diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
> index e5f83bf..1f0133f 100755
> --- a/t/t5500-fetch-pack.sh
> +++ b/t/t5500-fetch-pack.sh
> @@ -569,12 +569,27 @@ check_prot_host_port_path () {
>  	test_cmp expected actual
>  }
>  
> -for r in repo re:po re/po
> +test_maybe_fail () {

That is way too confusing a name when reading the caller of it by
being too close to generic test helpers like test_might_fail,
test_must_fail, etc.

> +	host=$1; shift
> +	case $host in
> +		git=*)

Dedent this line by one level.

> +		test_expect_success "$@"
> +		;;

These two lines are indented correctly.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 08/11] connect: change the --diag-url output to separate user and host
  2016-05-03  8:50   ` [PATCH v4 08/11] connect: change the --diag-url output to separate user and host Mike Hommey
@ 2016-05-03 16:20     ` Torsten Bögershausen
  2016-05-03 17:23       ` Eric Sunshine
  0 siblings, 1 reply; 46+ messages in thread
From: Torsten Bögershausen @ 2016-05-03 16:20 UTC (permalink / raw)
  To: Mike Hommey, git; +Cc: gitster, tboegi

On 2016-05-03 10.50, Mike Hommey wrote:
> Signed-off-by: Mike Hommey <mh@glandium.org>
> ---
>  connect.c             |  6 ++----
>  t/t5500-fetch-pack.sh | 14 ++++++++++++--
>  2 files changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/connect.c b/connect.c
> index e95e385..2c5b722 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -703,10 +703,8 @@ struct child_process *git_connect(int fd[2], const char *url,
>  	if (flags & CONNECT_DIAG_URL) {
>  		printf("Diag: url=%s\n", url ? url : "NULL");
>  		printf("Diag: protocol=%s\n", prot_name(protocol));
> -		if (user)
> -			printf("Diag: userandhost=%s@%s\n", user, host);
> -		else
> -			printf("Diag: userandhost=%s\n", host ? host : "NULL");
> +		printf("Diag: user=%s\n", user ? user : "NULL");
> +		printf("Diag: host=%s\n", host ? host : "NULL");
>  		printf("Diag: port=%s\n", port ? port : "NONE");
>  		printf("Diag: path=%s\n", path ? path : "NULL");
>  		conn = NULL;
> diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
> index 09d46c3..0c2f79f 100755
> --- a/t/t5500-fetch-pack.sh
> +++ b/t/t5500-fetch-pack.sh
> @@ -537,7 +537,7 @@ check_prot_path () {
>  	Diag: protocol=$2
>  	Diag: path=$3
>  	EOF
> -	git fetch-pack --diag-url "$1" | grep -v host= | grep -v port= >actual &&
> +	git fetch-pack --diag-url "$1" | grep -v user= | grep -v host= | grep -v port= >actual &&
Running grep a couple of times is probably not optimal in terms of spawning a
process....
Does

git fetch-pack --diag-url "$1" | egrep -v "user=|host=|port=" >actual &&
work ?
or the version like this:
git fetch-pack --diag-url "$1" | egrep -v "(user|host|port)=" >actual &&


(And before I forget it: The whole series makes sense, thanks for that,
it may be good if I review the final result as well as all the small changes)

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 09/11] connect: use "-l user" instead of "user@" on ssh command line
  2016-05-03  8:50   ` [PATCH v4 09/11] connect: use "-l user" instead of "user@" on ssh command line Mike Hommey
@ 2016-05-03 16:25     ` Torsten Bögershausen
  2016-05-03 17:50       ` Junio C Hamano
  2016-05-03 17:33     ` Eric Sunshine
  1 sibling, 1 reply; 46+ messages in thread
From: Torsten Bögershausen @ 2016-05-03 16:25 UTC (permalink / raw)
  To: Mike Hommey, git; +Cc: gitster, tboegi

On 2016-05-03 10.50, Mike Hommey wrote:
> While it is not strictly necessary, it makes the connect code simpler
> when there is user.
> 

That commit message does't tell too much, I think.

Besides that, I'm sure it will break (at least) my ssh wrapper scripts,
which rely on user@host to be passed into the script.

The thing is that some hosts don't have a DNS entry, but can be reached via
host.local, if avahi is running.
And my wrapper script parses the url, looks up the host via netlookup,
or avahi using host.local, and feeds the result into /usr/bin/ssh.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases
  2016-05-03 16:07     ` Junio C Hamano
@ 2016-05-03 16:30       ` Torsten Bögershausen
  2016-05-03 22:48       ` Mike Hommey
  1 sibling, 0 replies; 46+ messages in thread
From: Torsten Bögershausen @ 2016-05-03 16:30 UTC (permalink / raw)
  To: Junio C Hamano, Mike Hommey; +Cc: git, tboegi

On 2016-05-03 18.07, Junio C Hamano wrote:
> Mike Hommey <mh@glandium.org> writes:
> 
>> t5603-clone-dirname uses url patterns that are not tested with
>> fetch-pack --diag-url, and it would be useful if they were.
>>
>> Interestingly, some of those tests, involving both a port and a
>> user:password pair, don't currently pass. Note that even if a
>> user:password pair is actually not supported by git, the values used
>> could be valid user names (user names can actually contain colons
>> and at signs), and are still worth testing the url parser for.
> 
> I am not sure about the last part of this (and the tests in the
> patch for them).  When you are constrained by the Common Internet
> Scheme Syntax, i.e.
> 
>     <scheme>://<user>:<password>@<host>:<port>/<url-path>
> 
> you cannot have arbitrary characters in these parts; within the user
> and password field, any ":", "@", or "/" must be encoded.
> 
I thinnk we have an old bug here:
	if (is_url(url_orig))
		url = url_decode(url_orig);
	else
		url = xstrdup(url_orig);

The the url should be separated into the components first,
and afther that url-path should got into url_decode,
and may be password, username....
(That's out of my head)

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 08/11] connect: change the --diag-url output to separate user and host
  2016-05-03 16:20     ` Torsten Bögershausen
@ 2016-05-03 17:23       ` Eric Sunshine
  2016-05-03 22:50         ` Mike Hommey
  0 siblings, 1 reply; 46+ messages in thread
From: Eric Sunshine @ 2016-05-03 17:23 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: Mike Hommey, Git List, Junio C Hamano

On Tue, May 3, 2016 at 12:20 PM, Torsten Bögershausen <tboegi@web.de> wrote:
> On 2016-05-03 10.50, Mike Hommey wrote:
>> -     git fetch-pack --diag-url "$1" | grep -v host= | grep -v port= >actual &&
>> +     git fetch-pack --diag-url "$1" | grep -v user= | grep -v host= | grep -v port= >actual &&
> Running grep a couple of times is probably not optimal in terms of spawning a
> process....
> Does
>
> git fetch-pack --diag-url "$1" | egrep -v "user=|host=|port=" >actual &&
> work ?
> or the version like this:
> git fetch-pack --diag-url "$1" | egrep -v "(user|host|port)=" >actual &&

I always worry about portability problems with these "advanced"
expressions in grep and sed, however, both of these work fine under
Mac OS X and FreeBSD (which is where problems often manifest).

An alterante expression which shouldn't raise any portability alarms would be:

    sed -e /user=/d -e /host=/d -e /port=/d

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 09/11] connect: use "-l user" instead of "user@" on ssh command line
  2016-05-03  8:50   ` [PATCH v4 09/11] connect: use "-l user" instead of "user@" on ssh command line Mike Hommey
  2016-05-03 16:25     ` Torsten Bögershausen
@ 2016-05-03 17:33     ` Eric Sunshine
  2016-05-03 22:52       ` Mike Hommey
  1 sibling, 1 reply; 46+ messages in thread
From: Eric Sunshine @ 2016-05-03 17:33 UTC (permalink / raw)
  To: Mike Hommey; +Cc: Git List, Junio C Hamano, Torsten Bögershausen

On Tue, May 3, 2016 at 4:50 AM, Mike Hommey <mh@glandium.org> wrote:
> While it is not strictly necessary, it makes the connect code simpler
> when there is user.
>
> Signed-off-by: Mike Hommey <mh@glandium.org>
> ---
> diff --git a/connect.c b/connect.c
> @@ -812,14 +812,10 @@ struct child_process *git_connect(int fd[2], const char *url,
>                         if (user) {
> -                               struct strbuf userandhost = STRBUF_INIT;
> -                               strbuf_addstr(&userandhost, user);
> -                               strbuf_addch(&userandhost, '@');
> -                               strbuf_addstr(&userandhost, host);
> -                               argv_array_push(&conn->args, userandhost.buf);
> -                               strbuf_release(&userandhost);
> -                       } else
> -                               argv_array_push(&conn->args, host);
> +                               argv_array_push(&conn->args, "-l");
> +                               argv_array_push(&conn->args, user);
> +                       }
> +                       argv_array_push(&conn->args, host);

Even simpler would be a one-liner for the user case:

    if (user)
        argv_array_pushf(&conn->args, "%s@%s"", user, host);
    else
        argv_array_push(&conn->args, host);

>                 } else {
>                         transport_check_allowed("file");
>                 }

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 09/11] connect: use "-l user" instead of "user@" on ssh command line
  2016-05-03 16:25     ` Torsten Bögershausen
@ 2016-05-03 17:50       ` Junio C Hamano
  0 siblings, 0 replies; 46+ messages in thread
From: Junio C Hamano @ 2016-05-03 17:50 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: Mike Hommey, git

Torsten Bögershausen <tboegi@web.de> writes:

> On 2016-05-03 10.50, Mike Hommey wrote:
>> While it is not strictly necessary, it makes the connect code simpler
>> when there is user.
>> 
>
> That commit message does't tell too much, I think.

"Doesn't tell too much" is not necessarily bad, but "tells too
little" is, and I think this tells me enough to say it is not a good
change ;-)

> Besides that, I'm sure it will break (at least) my ssh wrapper scripts,
> which rely on user@host to be passed into the script.

Thanks for bringing it up.  "By reducing the language we accept it
makes my coding simpler" is not a good excuse to break existing
users, and "While it is not strictly necessary, " is a good hint
that the author _knows_ that the change can either (1) be done
without, or (2) be done in a way that does not break existing users
and yet make the end result easier to read.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases
  2016-05-03 16:07     ` Junio C Hamano
  2016-05-03 16:30       ` Torsten Bögershausen
@ 2016-05-03 22:48       ` Mike Hommey
  2016-05-05 21:52         ` Mike Hommey
  1 sibling, 1 reply; 46+ messages in thread
From: Mike Hommey @ 2016-05-03 22:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, tboegi

On Tue, May 03, 2016 at 09:07:41AM -0700, Junio C Hamano wrote:
> Mike Hommey <mh@glandium.org> writes:
> 
> > t5603-clone-dirname uses url patterns that are not tested with
> > fetch-pack --diag-url, and it would be useful if they were.
> >
> > Interestingly, some of those tests, involving both a port and a
> > user:password pair, don't currently pass. Note that even if a
> > user:password pair is actually not supported by git, the values used
> > could be valid user names (user names can actually contain colons
> > and at signs), and are still worth testing the url parser for.
> 
> I am not sure about the last part of this (and the tests in the
> patch for them).  When you are constrained by the Common Internet
> Scheme Syntax, i.e.
> 
>     <scheme>://<user>:<password>@<host>:<port>/<url-path>
> 
> you cannot have arbitrary characters in these parts; within the user
> and password field, any ":", "@", or "/" must be encoded.
> 
> Which maens that for the purpose of the parser you are modifying,
> you can rely on these three special characters to parse things out
> (decoding after the code determines which part is user and which
> part is password is a separate issue).

t5603-clone-dirname contains a test for e.g. ssh://user:passw@rd@host:1234/
That's the basis for these additions. Whether that should work or not is
besides what I was interested in, which was to have a single test file to
run to test my changes, instead of several.

Strictly speaking, this patch is not necessary, because it only covers
things that I found while breaking other tests.

So, there are multiple possible ways forward here:
- Completely remove this patch for v5 of the series.
- Remove the user:passw@rd cases because of the @.
- Remove the user:password cases because we do nothing with the password
  anyways.
- A combination of both of the above.

I don't really care which is picked, at this point I just want to get
over with this series ;)

Mike

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 08/11] connect: change the --diag-url output to separate user and host
  2016-05-03 17:23       ` Eric Sunshine
@ 2016-05-03 22:50         ` Mike Hommey
  0 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03 22:50 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Torsten Bögershausen, Git List, Junio C Hamano

On Tue, May 03, 2016 at 01:23:37PM -0400, Eric Sunshine wrote:
> On Tue, May 3, 2016 at 12:20 PM, Torsten Bögershausen <tboegi@web.de> wrote:
> > On 2016-05-03 10.50, Mike Hommey wrote:
> >> -     git fetch-pack --diag-url "$1" | grep -v host= | grep -v port= >actual &&
> >> +     git fetch-pack --diag-url "$1" | grep -v user= | grep -v host= | grep -v port= >actual &&
> > Running grep a couple of times is probably not optimal in terms of spawning a
> > process....
> > Does
> >
> > git fetch-pack --diag-url "$1" | egrep -v "user=|host=|port=" >actual &&
> > work ?
> > or the version like this:
> > git fetch-pack --diag-url "$1" | egrep -v "(user|host|port)=" >actual &&
> 
> I always worry about portability problems with these "advanced"
> expressions in grep and sed, however, both of these work fine under
> Mac OS X and FreeBSD (which is where problems often manifest).

That was my concern. But it looks like we already rely on the
egrep "(|)" form working in some other test, so I guess it's fine to use
that.

Mike

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 09/11] connect: use "-l user" instead of "user@" on ssh command line
  2016-05-03 17:33     ` Eric Sunshine
@ 2016-05-03 22:52       ` Mike Hommey
  0 siblings, 0 replies; 46+ messages in thread
From: Mike Hommey @ 2016-05-03 22:52 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Junio C Hamano, Torsten Bögershausen

On Tue, May 03, 2016 at 01:33:24PM -0400, Eric Sunshine wrote:
> On Tue, May 3, 2016 at 4:50 AM, Mike Hommey <mh@glandium.org> wrote:
> > While it is not strictly necessary, it makes the connect code simpler
> > when there is user.
> >
> > Signed-off-by: Mike Hommey <mh@glandium.org>
> > ---
> > diff --git a/connect.c b/connect.c
> > @@ -812,14 +812,10 @@ struct child_process *git_connect(int fd[2], const char *url,
> >                         if (user) {
> > -                               struct strbuf userandhost = STRBUF_INIT;
> > -                               strbuf_addstr(&userandhost, user);
> > -                               strbuf_addch(&userandhost, '@');
> > -                               strbuf_addstr(&userandhost, host);
> > -                               argv_array_push(&conn->args, userandhost.buf);
> > -                               strbuf_release(&userandhost);
> > -                       } else
> > -                               argv_array_push(&conn->args, host);
> > +                               argv_array_push(&conn->args, "-l");
> > +                               argv_array_push(&conn->args, user);
> > +                       }
> > +                       argv_array_push(&conn->args, host);
> 
> Even simpler would be a one-liner for the user case:
> 
>     if (user)
>         argv_array_pushf(&conn->args, "%s@%s"", user, host);

Oh, I should have read the argv-array.h header. Thanks.

Mike

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases
  2016-05-03 22:48       ` Mike Hommey
@ 2016-05-05 21:52         ` Mike Hommey
  2016-05-06  4:17           ` Torsten Bögershausen
  0 siblings, 1 reply; 46+ messages in thread
From: Mike Hommey @ 2016-05-05 21:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, tboegi

On Wed, May 04, 2016 at 07:48:30AM +0900, Mike Hommey wrote:
> On Tue, May 03, 2016 at 09:07:41AM -0700, Junio C Hamano wrote:
> > Mike Hommey <mh@glandium.org> writes:
> > 
> > > t5603-clone-dirname uses url patterns that are not tested with
> > > fetch-pack --diag-url, and it would be useful if they were.
> > >
> > > Interestingly, some of those tests, involving both a port and a
> > > user:password pair, don't currently pass. Note that even if a
> > > user:password pair is actually not supported by git, the values used
> > > could be valid user names (user names can actually contain colons
> > > and at signs), and are still worth testing the url parser for.
> > 
> > I am not sure about the last part of this (and the tests in the
> > patch for them).  When you are constrained by the Common Internet
> > Scheme Syntax, i.e.
> > 
> >     <scheme>://<user>:<password>@<host>:<port>/<url-path>
> > 
> > you cannot have arbitrary characters in these parts; within the user
> > and password field, any ":", "@", or "/" must be encoded.
> > 
> > Which maens that for the purpose of the parser you are modifying,
> > you can rely on these three special characters to parse things out
> > (decoding after the code determines which part is user and which
> > part is password is a separate issue).
> 
> t5603-clone-dirname contains a test for e.g. ssh://user:passw@rd@host:1234/
> That's the basis for these additions. Whether that should work or not is
> besides what I was interested in, which was to have a single test file to
> run to test my changes, instead of several.
> 
> Strictly speaking, this patch is not necessary, because it only covers
> things that I found while breaking other tests.
> 
> So, there are multiple possible ways forward here:
> - Completely remove this patch for v5 of the series.
> - Remove the user:passw@rd cases because of the @.
> - Remove the user:password cases because we do nothing with the password
>   anyways.
> - A combination of both of the above.

Any opinions on this?

Mike

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases
  2016-05-05 21:52         ` Mike Hommey
@ 2016-05-06  4:17           ` Torsten Bögershausen
  2016-05-06 15:52             ` Junio C Hamano
  0 siblings, 1 reply; 46+ messages in thread
From: Torsten Bögershausen @ 2016-05-06  4:17 UTC (permalink / raw)
  To: Mike Hommey, Junio C Hamano; +Cc: git, tboegi

On 05.05.16 23:52, Mike Hommey wrote:
> On Wed, May 04, 2016 at 07:48:30AM +0900, Mike Hommey wrote:
>> On Tue, May 03, 2016 at 09:07:41AM -0700, Junio C Hamano wrote:
>>> Mike Hommey <mh@glandium.org> writes:
>>>
>>>> t5603-clone-dirname uses url patterns that are not tested with
>>>> fetch-pack --diag-url, and it would be useful if they were.
>>>>
>>>> Interestingly, some of those tests, involving both a port and a
>>>> user:password pair, don't currently pass. Note that even if a
>>>> user:password pair is actually not supported by git, the values used
>>>> could be valid user names (user names can actually contain colons
>>>> and at signs), and are still worth testing the url parser for.
>>>
>>> I am not sure about the last part of this (and the tests in the
>>> patch for them).  When you are constrained by the Common Internet
>>> Scheme Syntax, i.e.
>>>
>>>     <scheme>://<user>:<password>@<host>:<port>/<url-path>
>>>
>>> you cannot have arbitrary characters in these parts; within the user
>>> and password field, any ":", "@", or "/" must be encoded.
>>>
>>> Which maens that for the purpose of the parser you are modifying,
>>> you can rely on these three special characters to parse things out
>>> (decoding after the code determines which part is user and which
>>> part is password is a separate issue).
>>
>> t5603-clone-dirname contains a test for e.g. ssh://user:passw@rd@host:1234/
>> That's the basis for these additions. Whether that should work or not is
>> besides what I was interested in, which was to have a single test file to
>> run to test my changes, instead of several.
>>
>> Strictly speaking, this patch is not necessary, because it only covers
>> things that I found while breaking other tests.
>>
>> So, there are multiple possible ways forward here:
>> - Completely remove this patch for v5 of the series.
>> - Remove the user:passw@rd cases because of the @.
>> - Remove the user:password cases because we do nothing with the password
>>   anyways.
>> - A combination of both of the above.
> 
> Any opinions on this?

ssh itself does not use a password:

SSH(1)                    BSD General Commands Manual                   SSH(1)

NAME
     ssh -- OpenSSH SSH client (remote login program)

SYNOPSIS
     ssh [-1246AaCfgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
         [-D [bind_address:]port] [-e escape_char] [-F configfile] [-I pkcs11]
         [-i identity_file] [-L [bind_address:]port:host:hostport]
         [-l login_name] [-m mac_spec] [-O ctl_cmd] [-o option] [-p port]
         [-R [bind_address:]port:host:hostport] [-S ctl_path] [-W host:port]
         [-w local_tun[:remote_tun]] [user@]hostname [command]


Neither does Git.
A user name must not include a ':'

The user:password came in here:
Commit 92722efec01f67a54b
clone: do not use port number as dir name

Actually, looking back, it may have been better to say
git clone ssh://aaaa:bbbb@host:/path
is illegal and simply die() out.

Back to your question and looking at the offered alternatives. I would vote for
"Completely remove this patch for v5 of the series."

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases
  2016-05-06  4:17           ` Torsten Bögershausen
@ 2016-05-06 15:52             ` Junio C Hamano
  0 siblings, 0 replies; 46+ messages in thread
From: Junio C Hamano @ 2016-05-06 15:52 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: Mike Hommey, git

Torsten Bögershausen <tboegi@web.de> writes:

> ssh itself does not use a password:
> ...
> Neither does Git.
> ...
> The user:password came in here:
> Commit 92722efec01f67a54b
> clone: do not use port number as dir name
>
> Actually, looking back, it may have been better to say
> git clone ssh://aaaa:bbbb@host:/path
> is illegal and simply die() out.

RFC2396, which updated RFC1738, discourages the use of :password
in "3.2.2 Server-based Naming Authority", for obvious reasons.

   Some URL schemes use the format "user:password" in the userinfo
   field.  This practice is NOT RECOMMENDED ...

and then this is marked as deprecated in RFC3986 "3.2.1. User
Information".

   Use of the format "user:password" in the userinfo field is
   deprecated.  Applications should not render as clear text any
   data after the first colon (":") character found within a
   userinfo subcomponent unless the data after the colon is the
   empty string (indicating no password).

However, at the parser level that _knows_ the syntax, you shouldn't
be unilaterally turning these "not recommended" and "deprecated" to
"forbidden".  It should be prepared to see ':' to its input, if only
to correctly recognize that as an attempt to express :password, in
order to be able to hide the data after the first colon when running
in verbose mode for example.

I'd recommend that the parser to allow <user>:<password>@<host>, and
at least notice ':' that appears before the first '@' as having a
depreated form of <userinfo>.  After stripping <scheme>:// from the
front, it is OK to assume that everything before the first '@' is
<userinfo> (in RFC2396 lingo), and everything in <userinfo> that is
before the first ':' is <user> when doing so.  

>>> ...  When you are constrained by the Common Internet
>>> Scheme Syntax, i.e.
>>>
>>>     <scheme>://<user>:<password>@<host>:<port>/<url-path>
>>>
>>> you cannot have arbitrary characters in these parts; within the user
>>> and password field, any ":", "@", or "/" must be encoded.

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2016-05-06 15:52 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-01  6:02 [PATCH v3 0/6] connect: various cleanups Mike Hommey
2016-05-01  6:02 ` [PATCH 1/6] connect: remove get_port() Mike Hommey
2016-05-01 10:10   ` Torsten Bögershausen
2016-05-01 21:43     ` Mike Hommey
2016-05-03  5:03     ` Jeff King
2016-05-03  5:11       ` Mike Hommey
2016-05-01  6:02 ` [PATCH 2/6] connect: uniformize and group CONNECT_DIAG_URL handling code Mike Hommey
2016-05-01 13:37   ` Torsten Bögershausen
2016-05-01 23:20     ` Mike Hommey
2016-05-02  4:56   ` Torsten Bögershausen
2016-05-02  8:31     ` Mike Hommey
2016-05-02 11:29       ` Torsten Bögershausen
2016-05-02 12:38         ` Mike Hommey
2016-05-02 22:05         ` Junio C Hamano
2016-05-02 23:14           ` Junio C Hamano
2016-05-01  6:02 ` [PATCH 3/6] connect: only match the host with core.gitProxy Mike Hommey
2016-05-01  6:02 ` [PATCH 4/6] connect: pass separate host and port to git_tcp_connect and git_proxy_connect Mike Hommey
2016-05-01  6:02 ` [PATCH 5/6] connect: don't xstrdup target_host Mike Hommey
2016-05-01  6:02 ` [PATCH 6/6] connect: move ssh command line preparation to a separate function Mike Hommey
2016-05-03  8:50 ` [PATCH v4 00/11] connect: various cleanups Mike Hommey
2016-05-03  8:50   ` [PATCH v4 01/11] add fetch-pack --diag-url tests for some corner cases Mike Hommey
2016-05-03 16:07     ` Torsten Bögershausen
2016-05-03 16:07     ` Junio C Hamano
2016-05-03 16:30       ` Torsten Bögershausen
2016-05-03 22:48       ` Mike Hommey
2016-05-05 21:52         ` Mike Hommey
2016-05-06  4:17           ` Torsten Bögershausen
2016-05-06 15:52             ` Junio C Hamano
2016-05-03  8:50   ` [PATCH v4 02/11] connect: call get_host_and_port() earlier Mike Hommey
2016-05-03  8:50   ` [PATCH v4 03/11] connect: only match the host with core.gitProxy Mike Hommey
2016-05-03  8:50   ` [PATCH v4 04/11] connect: fill the host header in the git protocol with the host and port variables Mike Hommey
2016-05-03  8:50   ` [PATCH v4 05/11] connect: make parse_connect_url() return separated host and port Mike Hommey
2016-05-03  8:50   ` [PATCH v4 06/11] connect: group CONNECT_DIAG_URL handling code Mike Hommey
2016-05-03  8:50   ` [PATCH v4 07/11] connect: make parse_connect_url() return the user part of the url as a separate value Mike Hommey
2016-05-03  8:50   ` [PATCH v4 08/11] connect: change the --diag-url output to separate user and host Mike Hommey
2016-05-03 16:20     ` Torsten Bögershausen
2016-05-03 17:23       ` Eric Sunshine
2016-05-03 22:50         ` Mike Hommey
2016-05-03  8:50   ` [PATCH v4 09/11] connect: use "-l user" instead of "user@" on ssh command line Mike Hommey
2016-05-03 16:25     ` Torsten Bögershausen
2016-05-03 17:50       ` Junio C Hamano
2016-05-03 17:33     ` Eric Sunshine
2016-05-03 22:52       ` Mike Hommey
2016-05-03  8:50   ` [PATCH v4 10/11] connect: actively reject git:// urls with a user part Mike Hommey
2016-05-03  8:50   ` [PATCH v4 11/11] connect: move ssh command line preparation to a separate function Mike Hommey
2016-05-03 12:30     ` [PATCH v4.1 " Mike Hommey

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).