git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC 0/7] transitioning to protocol v2
@ 2017-08-24 22:53 Brandon Williams
  2017-08-24 22:53 ` [RFC 1/7] pkt-line: add packet_write function Brandon Williams
                   ` (8 more replies)
  0 siblings, 9 replies; 31+ messages in thread
From: Brandon Williams @ 2017-08-24 22:53 UTC (permalink / raw)
  To: git; +Cc: peff, jrnieder, sbeller, gitster, jonathantanmy, Brandon Williams

Another version of Git's wire protocol is a topic that has been discussed and
attempted by many in the community over the years.  The biggest challenge, as
far as I understand, has been coming up with a transition plan to using the new
server without breaking existing clients and servers.  As such this RFC is
really only concerned with solidifying a transition plan.  Once it has been
decided how we can transition to a new protocol we can get into decided what
this new protocol would look like (though it would obviously eliminate the ref
advertisement ;).

The best way to preserve functionality with old servers and clients would be to
communicate using the same end point but have the client send a bit of extra
information with its initial request.  This extra information would need to be
sent in such a way that old servers ignore it and operate normally (using
protocol v1).  The client would then need to be able to look at a server's
response to determine whether the server understands and is speaking v2 or has
ignored the clients request to use a newer protocol and is speaking v1.

Patches 1-5 enable a client to unconditionally send this back-channel
information to a server.  This is done by sending a version number after a
second NUL byte in git://, in the envvar GIT_PROTOCOL in file:// and ssh://,
and in an http header in http://, https://.  Patches 6-7 teach a client and
upload-pack to send and recognize a request to use protocol v2.

The biggest question I'm trying to answer is if these are reasonable ways with
which to communicate a request to a server to use a newer protocol, without
breaking current servers/clients.  As far as I've tested, with patches 1-5
applied I can still communicate with current servers without causing any
problems.

Any comments/discussion is welcome!

Brandon Williams (7):
  pkt-line: add packet_write function
  pkt-line: add strbuf_packet_read
  protocol: tell server that the client understands v2
  t: fix ssh tests to cope with using '-o SendEnv=GIT_PROTOCOL'
  http: send Git-Protocol-Version header
  transport: teach client to recognize v2 server response
  upload-pack: ack version 2

 builtin/fetch-pack.c         |   4 +-
 builtin/send-pack.c          |   5 +-
 connect.c                    | 196 +++++++++++++++++++++++++++----------------
 daemon.c                     |  28 ++++++-
 http.c                       |   7 ++
 pkt-line.c                   |  27 ++++++
 pkt-line.h                   |   2 +
 remote-curl.c                |   7 +-
 remote.h                     |  22 ++++-
 t/lib-proto-disable.sh       |   1 +
 t/t5551-http-fetch-smart.sh  |   2 +
 t/t5601-clone.sh             |  10 +--
 t/t5602-clone-remote-exec.sh |   4 +-
 transport.c                  |  60 +++++++++++--
 upload-pack.c                |  11 +++
 15 files changed, 286 insertions(+), 100 deletions(-)

-- 
2.14.1.342.g6490525c54-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [RFC 1/7] pkt-line: add packet_write function
  2017-08-24 22:53 [RFC 0/7] transitioning to protocol v2 Brandon Williams
@ 2017-08-24 22:53 ` Brandon Williams
  2017-08-24 22:53 ` [RFC 2/7] pkt-line: add strbuf_packet_read Brandon Williams
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 31+ messages in thread
From: Brandon Williams @ 2017-08-24 22:53 UTC (permalink / raw)
  To: git; +Cc: peff, jrnieder, sbeller, gitster, jonathantanmy, Brandon Williams

Add a function which can be used to write the contents of an arbitrary
buffer.  This makes it easy to build up data in a strbuf before writing
the packet.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 6 ++++++
 pkt-line.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 7db911957..cf98f371b 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -188,6 +188,12 @@ static int packet_write_gently(const int fd_out, const char *buf, size_t size)
 	return error("packet write failed");
 }
 
+void packet_write(const int fd_out, const char *buf, size_t size)
+{
+	if (packet_write_gently(fd_out, buf, size))
+		die_errno("packet write failed");
+}
+
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
 {
 	va_list args;
diff --git a/pkt-line.h b/pkt-line.h
index 66ef610fc..d9e9783b1 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -22,6 +22,7 @@
 void packet_flush(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
+void packet_write(const int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
-- 
2.14.1.342.g6490525c54-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC 2/7] pkt-line: add strbuf_packet_read
  2017-08-24 22:53 [RFC 0/7] transitioning to protocol v2 Brandon Williams
  2017-08-24 22:53 ` [RFC 1/7] pkt-line: add packet_write function Brandon Williams
@ 2017-08-24 22:53 ` Brandon Williams
  2017-08-24 22:53 ` [RFC 3/7] protocol: tell server that the client understands v2 Brandon Williams
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 31+ messages in thread
From: Brandon Williams @ 2017-08-24 22:53 UTC (permalink / raw)
  To: git; +Cc: peff, jrnieder, sbeller, gitster, jonathantanmy, Brandon Williams

Add function which can be used to read the contents of a single pkt-line
into a strbuf.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 21 +++++++++++++++++++++
 pkt-line.h |  1 +
 2 files changed, 22 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index cf98f371b..875524ab8 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -312,6 +312,27 @@ int packet_read(int fd, char **src_buf, size_t *src_len,
 	return len;
 }
 
+int strbuf_packet_read(int fd_in, struct strbuf *sb_out, int options)
+{
+	int packet_len;
+	strbuf_grow(sb_out, LARGE_PACKET_DATA_MAX);
+	packet_len = packet_read(fd_in, NULL, NULL,
+				 /*
+				  * strbuf_grow() above always allocates one extra byte to
+				  * store a '\0' at the end of the string. packet_read()
+				  * writes a '\0' extra byte at the end, too. Let it know
+				  * that there is already room for the extra byte.
+				  */
+				 sb_out->buf, LARGE_PACKET_DATA_MAX+1,
+				 options);
+	if (packet_len < 0)
+		return packet_len;
+
+	sb_out->len = packet_len;
+
+	return sb_out->len;
+}
+
 static char *packet_read_line_generic(int fd,
 				      char **src, size_t *src_len,
 				      int *dst_len)
diff --git a/pkt-line.h b/pkt-line.h
index d9e9783b1..c24c4f290 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -65,6 +65,7 @@ int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
 		*buffer, unsigned size, int options);
 
+int strbuf_packet_read(int fd_in, struct strbuf *sb_out, int options);
 /*
  * Convenience wrapper for packet_read that is not gentle, and sets the
  * CHOMP_NEWLINE option. The return value is NULL for a flush packet,
-- 
2.14.1.342.g6490525c54-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC 3/7] protocol: tell server that the client understands v2
  2017-08-24 22:53 [RFC 0/7] transitioning to protocol v2 Brandon Williams
  2017-08-24 22:53 ` [RFC 1/7] pkt-line: add packet_write function Brandon Williams
  2017-08-24 22:53 ` [RFC 2/7] pkt-line: add strbuf_packet_read Brandon Williams
@ 2017-08-24 22:53 ` Brandon Williams
  2017-08-25 17:45   ` Junio C Hamano
  2017-08-24 22:53 ` [RFC 4/7] t: fix ssh tests to cope with using '-o SendEnv=GIT_PROTOCOL' Brandon Williams
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Brandon Williams @ 2017-08-24 22:53 UTC (permalink / raw)
  To: git; +Cc: peff, jrnieder, sbeller, gitster, jonathantanmy, Brandon Williams

Teach the connection logic to tell a serve that it understands protocol
v2.  This is done in 2 different ways for the built in protocols.

1. git://
   A normal request is structured as "command path/to/repo\0host=..\0"
   and due to a bug in an old version of git-daemon 73bb33a94 (daemon:
   Strictly parse the "extra arg" part of the command, 2009-06-04) we
   aren't able to place any extra args (separated by NULs) besides the
   host.  In order to get around this limitation put protocol version
   information after a second NUL byte so the request is structured
   like: "command path/to/repo\0host=..\0\0version=2\0".  git-daemon can
   then parse out the version number and set GIT_PROTOCOL.

2. ssh://, file://
   Set GIT_PROTOCOL envvar with the desired protocol version.  The
   envvar can be sent across ssh by using '-o SendEnv=GIT_PROTOCOL' and
   having the server whitelist this envvar.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c | 31 ++++++++++++++++++++++++++-----
 daemon.c  | 28 +++++++++++++++++++++++++---
 2 files changed, 51 insertions(+), 8 deletions(-)

diff --git a/connect.c b/connect.c
index 49b28b83b..d609267be 100644
--- a/connect.c
+++ b/connect.c
@@ -793,6 +793,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
 	} else if (protocol == PROTO_GIT) {
+		struct strbuf request = STRBUF_INIT;
 		/*
 		 * Set up virtual host information based on where we will
 		 * connect, unless the user has overridden us in
@@ -820,12 +821,23 @@ struct child_process *git_connect(int fd[2], const char *url,
 		 * Note: Do not add any other headers here!  Doing so
 		 * will cause older git-daemon servers to crash.
 		 */
-		packet_write_fmt(fd[1],
-			     "%s %s%chost=%s%c",
-			     prog, path, 0,
-			     target_host, 0);
+		strbuf_addf(&request,
+			    "%s %s%chost=%s%c",
+			    prog, path, 0,
+			    target_host, 0);
+
+		/* If using a new version put that stuff here after a second null byte */
+		strbuf_addch(&request, '\0');
+		strbuf_addf(&request, "version=%d%c", 2, '\0');
+		/* subsequent supported versions can also be added */
+		strbuf_addf(&request, "version=%d%c", 3, '\0');
+
+		packet_write(fd[1], request.buf, request.len);
+
 		free(target_host);
+		strbuf_release(&request);
 	} else {
+		const char *const *var;
 		conn = xmalloc(sizeof(*conn));
 		child_process_init(conn);
 
@@ -837,7 +849,9 @@ struct child_process *git_connect(int fd[2], const char *url,
 		sq_quote_buf(&cmd, path);
 
 		/* remove repo-local variables from the environment */
-		conn->env = local_repo_env;
+		for (var = local_repo_env; *var; var++)
+			argv_array_push(&conn->env_array, *var);
+
 		conn->use_shell = 1;
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
@@ -890,6 +904,12 @@ struct child_process *git_connect(int fd[2], const char *url,
 			}
 
 			argv_array_push(&conn->args, ssh);
+
+			/* protocol v2! */
+			argv_array_push(&conn->args, "-o");
+			argv_array_push(&conn->args, "SendEnv=GIT_PROTOCOL");
+			argv_array_push(&conn->env_array, "GIT_PROTOCOL=2");
+
 			if (flags & CONNECT_IPV4)
 				argv_array_push(&conn->args, "-4");
 			else if (flags & CONNECT_IPV6)
@@ -904,6 +924,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 			argv_array_push(&conn->args, ssh_host);
 		} else {
 			transport_check_allowed("file");
+			argv_array_push(&conn->env_array, "GIT_PROTOCOL=2");
 		}
 		argv_array_push(&conn->args, cmd.buf);
 
diff --git a/daemon.c b/daemon.c
index 30747075f..76a7b2d64 100644
--- a/daemon.c
+++ b/daemon.c
@@ -574,7 +574,7 @@ static void canonicalize_client(struct strbuf *out, const char *in)
 /*
  * Read the host as supplied by the client connection.
  */
-static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
+static char *parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
 {
 	char *val;
 	int vallen;
@@ -602,6 +602,22 @@ static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
 		if (extra_args < end && *extra_args)
 			die("Invalid request");
 	}
+
+	return extra_args;
+}
+
+static void parse_extra_args(const char *extra_args, int buflen)
+{
+	const char *end = extra_args + buflen;
+
+	for (; extra_args < end; extra_args += strlen(extra_args) + 1) {
+		const char *arg = extra_args;
+		if (skip_prefix(arg, "version=", &arg))
+			fprintf(stderr, "VERSION=%s\n", arg);
+
+		fprintf(stderr, "%s\n", extra_args);
+
+	}
 }
 
 /*
@@ -716,8 +732,14 @@ static int execute(void)
 		pktlen--;
 	}
 
-	if (len != pktlen)
-		parse_host_arg(&hi, line + len + 1, pktlen - len - 1);
+	if (len != pktlen) {
+		const char *extra_args;
+		/* retreive host */
+		extra_args = parse_host_arg(&hi, line + len + 1, pktlen - len - 1);
+
+		/* determine version */
+		parse_extra_args(extra_args + 1, pktlen - (extra_args - line) - 1);
+	}
 
 	for (i = 0; i < ARRAY_SIZE(daemon_service); i++) {
 		struct daemon_service *s = &(daemon_service[i]);
-- 
2.14.1.342.g6490525c54-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC 4/7] t: fix ssh tests to cope with using '-o SendEnv=GIT_PROTOCOL'
  2017-08-24 22:53 [RFC 0/7] transitioning to protocol v2 Brandon Williams
                   ` (2 preceding siblings ...)
  2017-08-24 22:53 ` [RFC 3/7] protocol: tell server that the client understands v2 Brandon Williams
@ 2017-08-24 22:53 ` Brandon Williams
  2017-08-24 22:53 ` [RFC 5/7] http: send Git-Protocol-Version header Brandon Williams
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 31+ messages in thread
From: Brandon Williams @ 2017-08-24 22:53 UTC (permalink / raw)
  To: git; +Cc: peff, jrnieder, sbeller, gitster, jonathantanmy, Brandon Williams

Update some of our tests to cope with ssh being launched with the option
to send the protocol version.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/lib-proto-disable.sh       |  1 +
 t/t5601-clone.sh             | 10 +++++-----
 t/t5602-clone-remote-exec.sh |  4 ++--
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/t/lib-proto-disable.sh b/t/lib-proto-disable.sh
index 83babe57d..d19c88f96 100644
--- a/t/lib-proto-disable.sh
+++ b/t/lib-proto-disable.sh
@@ -194,6 +194,7 @@ setup_ssh_wrapper () {
 	test_expect_success 'setup ssh wrapper' '
 		write_script ssh-wrapper <<-\EOF &&
 		echo >&2 "ssh: $*"
+		shift; shift
 		host=$1; shift
 		cd "$TRASH_DIRECTORY/$host" &&
 		eval "$*"
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 9c56f771b..7e65013c5 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -332,13 +332,13 @@ expect_ssh () {
 		1)
 			;;
 		2)
-			echo "ssh: $1 git-upload-pack '$2'"
+			echo "ssh: -o SendEnv=GIT_PROTOCOL $1 git-upload-pack '$2'"
 			;;
 		3)
-			echo "ssh: $1 $2 git-upload-pack '$3'"
+			echo "ssh: -o SendEnv=GIT_PROTOCOL $1 $2 git-upload-pack '$3'"
 			;;
 		*)
-			echo "ssh: $1 $2 git-upload-pack '$3' $4"
+			echo "ssh: $1 -o SendEnv=GIT_PROTOCOL $2 $3 git-upload-pack '$4'"
 		esac
 	} >"$TRASH_DIRECTORY/ssh-expect" &&
 	(cd "$TRASH_DIRECTORY" && test_cmp ssh-expect ssh-output)
@@ -390,7 +390,7 @@ test_expect_success 'double quoted plink.exe in GIT_SSH_COMMAND' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink.exe" &&
 	GIT_SSH_COMMAND="\"$TRASH_DIRECTORY/plink.exe\" -v" \
 		git clone "[myhost:123]:src" ssh-bracket-clone-plink-3 &&
-	expect_ssh "-v -P 123" myhost src
+	expect_ssh "-v" "-P 123" myhost src
 '
 
 SQ="'"
@@ -398,7 +398,7 @@ test_expect_success 'single quoted plink.exe in GIT_SSH_COMMAND' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink.exe" &&
 	GIT_SSH_COMMAND="$SQ$TRASH_DIRECTORY/plink.exe$SQ -v" \
 		git clone "[myhost:123]:src" ssh-bracket-clone-plink-4 &&
-	expect_ssh "-v -P 123" myhost src
+	expect_ssh "-v" "-P 123" myhost src
 '
 
 test_expect_success 'GIT_SSH_VARIANT overrides plink detection' '
diff --git a/t/t5602-clone-remote-exec.sh b/t/t5602-clone-remote-exec.sh
index cbcceab9d..b0d80cadd 100755
--- a/t/t5602-clone-remote-exec.sh
+++ b/t/t5602-clone-remote-exec.sh
@@ -13,14 +13,14 @@ test_expect_success setup '
 
 test_expect_success 'clone calls git upload-pack unqualified with no -u option' '
 	test_must_fail env GIT_SSH=./not_ssh git clone localhost:/path/to/repo junk &&
-	echo "localhost git-upload-pack '\''/path/to/repo'\''" >expected &&
+	echo "-o SendEnv=GIT_PROTOCOL localhost git-upload-pack '\''/path/to/repo'\''" >expected &&
 	test_cmp expected not_ssh_output
 '
 
 test_expect_success 'clone calls specified git upload-pack with -u option' '
 	test_must_fail env GIT_SSH=./not_ssh \
 		git clone -u ./something/bin/git-upload-pack localhost:/path/to/repo junk &&
-	echo "localhost ./something/bin/git-upload-pack '\''/path/to/repo'\''" >expected &&
+	echo "-o SendEnv=GIT_PROTOCOL localhost ./something/bin/git-upload-pack '\''/path/to/repo'\''" >expected &&
 	test_cmp expected not_ssh_output
 '
 
-- 
2.14.1.342.g6490525c54-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC 5/7] http: send Git-Protocol-Version header
  2017-08-24 22:53 [RFC 0/7] transitioning to protocol v2 Brandon Williams
                   ` (3 preceding siblings ...)
  2017-08-24 22:53 ` [RFC 4/7] t: fix ssh tests to cope with using '-o SendEnv=GIT_PROTOCOL' Brandon Williams
@ 2017-08-24 22:53 ` Brandon Williams
  2017-08-30 10:55   ` Kevin Daudt
  2017-08-24 22:53 ` [RFC 6/7] transport: teach client to recognize v2 server response Brandon Williams
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Brandon Williams @ 2017-08-24 22:53 UTC (permalink / raw)
  To: git; +Cc: peff, jrnieder, sbeller, gitster, jonathantanmy, Brandon Williams

Tell a serve that protocol v2 can be used by sending an http header
indicating this.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 http.c                      | 7 +++++++
 t/t5551-http-fetch-smart.sh | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/http.c b/http.c
index fa8666a21..504a14a5a 100644
--- a/http.c
+++ b/http.c
@@ -896,6 +896,11 @@ static void set_from_env(const char **var, const char *envname)
 		*var = val;
 }
 
+static const char *get_version(void)
+{
+	return "Git-Protocol-Version: 2";
+}
+
 void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
@@ -926,6 +931,8 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (remote)
 		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);
 
+	extra_http_headers = curl_slist_append(extra_http_headers, get_version());
+
 	pragma_header = curl_slist_append(http_copy_default_headers(),
 		"Pragma: no-cache");
 	no_pragma_header = curl_slist_append(http_copy_default_headers(),
diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index a51b7e20d..ce13f2425 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -27,6 +27,7 @@ cat >exp <<EOF
 > GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
 > Accept: */*
 > Accept-Encoding: gzip
+> Git-Protocol-Version: 2
 > Pragma: no-cache
 < HTTP/1.1 200 OK
 < Pragma: no-cache
@@ -34,6 +35,7 @@ cat >exp <<EOF
 < Content-Type: application/x-git-upload-pack-advertisement
 > POST /smart/repo.git/git-upload-pack HTTP/1.1
 > Accept-Encoding: gzip
+> Git-Protocol-Version: 2
 > Content-Type: application/x-git-upload-pack-request
 > Accept: application/x-git-upload-pack-result
 > Content-Length: xxx
-- 
2.14.1.342.g6490525c54-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC 6/7] transport: teach client to recognize v2 server response
  2017-08-24 22:53 [RFC 0/7] transitioning to protocol v2 Brandon Williams
                   ` (4 preceding siblings ...)
  2017-08-24 22:53 ` [RFC 5/7] http: send Git-Protocol-Version header Brandon Williams
@ 2017-08-24 22:53 ` Brandon Williams
  2017-08-24 22:53 ` [RFC 7/7] upload-pack: ack version 2 Brandon Williams
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 31+ messages in thread
From: Brandon Williams @ 2017-08-24 22:53 UTC (permalink / raw)
  To: git; +Cc: peff, jrnieder, sbeller, gitster, jonathantanmy, Brandon Williams

Teach a client to recognize that a server understand protocol v2 by
looking at the first pkt-line the server sends in response.  This is
done by looking for the response "version 2" sent by upload-pack.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch-pack.c |   4 +-
 builtin/send-pack.c  |   5 +-
 connect.c            | 165 ++++++++++++++++++++++++++++++---------------------
 remote-curl.c        |   7 ++-
 remote.h             |  22 ++++++-
 transport.c          |  60 ++++++++++++++++---
 6 files changed, 178 insertions(+), 85 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 366b9d13f..a2a5e1c73 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -52,6 +52,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	struct fetch_pack_args args;
 	struct oid_array shallow = OID_ARRAY_INIT;
 	struct string_list deepen_not = STRING_LIST_INIT_DUP;
+	struct remote_refs_scanner scanner;
 
 	packet_trace_identity("fetch-pack");
 
@@ -193,7 +194,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 		if (!conn)
 			return args.diag_url ? 0 : 1;
 	}
-	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
+	remote_refs_scanner_init(&scanner, &ref, 0, NULL, &shallow);
+	get_remote_heads(fd[0], NULL, 0, &scanner);
 
 	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
 			 &shallow, pack_lockfile_ptr);
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index fc4f0bb5f..92ec1f871 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -154,6 +154,7 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 	int progress = -1;
 	int from_stdin = 0;
 	struct push_cas_option cas = {0};
+	struct remote_refs_scanner scanner;
 
 	struct option options[] = {
 		OPT__VERBOSITY(&verbose),
@@ -256,8 +257,8 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 			args.verbose ? CONNECT_VERBOSE : 0);
 	}
 
-	get_remote_heads(fd[0], NULL, 0, &remote_refs, REF_NORMAL,
-			 &extra_have, &shallow);
+	remote_refs_scanner_init(&scanner, &remote_refs, REF_NORMAL, &extra_have, &shallow);
+	get_remote_heads(fd[0], NULL, 0, &scanner);
 
 	transport_verify_remote_names(nr_refspecs, refspecs);
 
diff --git a/connect.c b/connect.c
index d609267be..732b651d9 100644
--- a/connect.c
+++ b/connect.c
@@ -107,97 +107,124 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
-/*
- * Read all the refs from the other end
- */
-struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+void remote_refs_scanner_init(struct remote_refs_scanner *scanner,
 			      struct ref **list, unsigned int flags,
 			      struct oid_array *extra_have,
 			      struct oid_array *shallow_points)
 {
-	struct ref **orig_list = list;
+	memset(scanner, 0, sizeof(*scanner));
+
+	scanner->orig_list = list;
+	*list = NULL;
+	scanner->list = list;
+	scanner->flags = flags;
+	scanner->extra_have = extra_have;
+	scanner->shallow_points = shallow_points;
+}
+
+int process_ref(struct remote_refs_scanner *scanner,
+		const char *buffer, int len)
+{
+	struct ref *ref;
+	struct object_id old_oid;
+	const char *name;
+	int name_len;
+	const char *arg;
+	int ret = 1;
+
+	if (len < 0)
+		die_initial_contact(scanner->seen_response);
+
+	if (!len) {
+		ret = 0;
+		goto out;
+	}
+
+	if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
+		die("remote error: %s", arg);
+
+	if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
+	    skip_prefix(buffer, "shallow ", &arg)) {
+		if (get_oid_hex(arg, &old_oid))
+			die("protocol error: expected shallow sha-1, got '%s'", arg);
+		if (!scanner->shallow_points)
+			die("repository on the other end cannot be shallow");
+		oid_array_append(scanner->shallow_points, &old_oid);
+		goto out;
+	}
+
+	if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
+	    buffer[GIT_SHA1_HEXSZ] != ' ')
+		die("protocol error: expected sha/ref, got '%s'", buffer);
+	name = buffer + GIT_SHA1_HEXSZ + 1;
+
+	name_len = strlen(name);
+	if (len != name_len + GIT_SHA1_HEXSZ + 1) {
+		free(server_capabilities);
+		server_capabilities = xstrdup(name + name_len + 1);
+	}
+
+	if (scanner->extra_have && !strcmp(name, ".have")) {
+		oid_array_append(scanner->extra_have, &old_oid);
+		goto out;
+	}
+
+	if (!strcmp(name, "capabilities^{}")) {
+		if (scanner->seen_response)
+			die("protocol error: unexpected capabilities^{}");
+		if (scanner->recieved_dummy_capabilities_ref)
+			die("protocol error: multiple capabilities^{}");
+		scanner->recieved_dummy_capabilities_ref = 1;
+		ret = 1;
+		goto out;
+	}
+
+	if (!check_ref(name, scanner->flags))
+		goto out;
+
+	if (scanner->recieved_dummy_capabilities_ref)
+		die("protocol error: unexpected ref after capabilities^{}");
+
+	ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
+	oidcpy(&ref->old_oid, &old_oid);
+
+	*scanner->list = ref;
+	scanner->list = &ref->next;
+
+out:
+	scanner->seen_response = 1;
+	scanner->done = !ret;
+	return ret;
+}
 
+/*
+ * Read all the refs from the other end
+ */
+struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+			      struct remote_refs_scanner *scanner)
+{
 	/*
 	 * A hang-up after seeing some response from the other end
 	 * means that it is unexpected, as we know the other end is
 	 * willing to talk to us.  A hang-up before seeing any
 	 * response does not necessarily mean an ACL problem, though.
 	 */
-	int saw_response;
-	int got_dummy_ref_with_capabilities_declaration = 0;
-
-	*list = NULL;
-	for (saw_response = 0; ; saw_response = 1) {
-		struct ref *ref;
-		struct object_id old_oid;
-		char *name;
-		int len, name_len;
+	while (!scanner->done) {
 		char *buffer = packet_buffer;
-		const char *arg;
+		int len;
 
 		len = packet_read(in, &src_buf, &src_len,
 				  packet_buffer, sizeof(packet_buffer),
 				  PACKET_READ_GENTLE_ON_EOF |
 				  PACKET_READ_CHOMP_NEWLINE);
-		if (len < 0)
-			die_initial_contact(saw_response);
 
-		if (!len)
+		if (!process_ref(scanner, buffer, len))
 			break;
-
-		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
-			die("remote error: %s", arg);
-
-		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
-			skip_prefix(buffer, "shallow ", &arg)) {
-			if (get_oid_hex(arg, &old_oid))
-				die("protocol error: expected shallow sha-1, got '%s'", arg);
-			if (!shallow_points)
-				die("repository on the other end cannot be shallow");
-			oid_array_append(shallow_points, &old_oid);
-			continue;
-		}
-
-		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
-			buffer[GIT_SHA1_HEXSZ] != ' ')
-			die("protocol error: expected sha/ref, got '%s'", buffer);
-		name = buffer + GIT_SHA1_HEXSZ + 1;
-
-		name_len = strlen(name);
-		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
-			free(server_capabilities);
-			server_capabilities = xstrdup(name + name_len + 1);
-		}
-
-		if (extra_have && !strcmp(name, ".have")) {
-			oid_array_append(extra_have, &old_oid);
-			continue;
-		}
-
-		if (!strcmp(name, "capabilities^{}")) {
-			if (saw_response)
-				die("protocol error: unexpected capabilities^{}");
-			if (got_dummy_ref_with_capabilities_declaration)
-				die("protocol error: multiple capabilities^{}");
-			got_dummy_ref_with_capabilities_declaration = 1;
-			continue;
-		}
-
-		if (!check_ref(name, flags))
-			continue;
-
-		if (got_dummy_ref_with_capabilities_declaration)
-			die("protocol error: unexpected ref after capabilities^{}");
-
-		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
-		oidcpy(&ref->old_oid, &old_oid);
-		*list = ref;
-		list = &ref->next;
 	}
 
-	annotate_refs_with_symref_info(*orig_list);
+	annotate_refs_with_symref_info(*scanner->orig_list);
 
-	return list;
+	return scanner->list;
 }
 
 static const char *parse_feature_value(const char *feature_list, const char *feature, int *lenp)
diff --git a/remote-curl.c b/remote-curl.c
index 0053b0954..04c191493 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -175,9 +175,12 @@ static struct discovery *last_discovery;
 
 static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 {
+	struct remote_refs_scanner scanner;
 	struct ref *list = NULL;
-	get_remote_heads(-1, heads->buf, heads->len, &list,
-			 for_push ? REF_NORMAL : 0, NULL, &heads->shallow);
+	remote_refs_scanner_init(&scanner, &list, for_push ? REF_NORMAL : 0,
+				 NULL, &heads->shallow);
+	get_remote_heads(-1, heads->buf, heads->len, &scanner);
+
 	return list;
 }
 
diff --git a/remote.h b/remote.h
index 2ecf4c8c7..3a60b8fc4 100644
--- a/remote.h
+++ b/remote.h
@@ -150,10 +150,26 @@ int check_ref_type(const struct ref *ref, int flags);
 void free_refs(struct ref *ref);
 
 struct oid_array;
+
+struct remote_refs_scanner {
+	struct ref **orig_list;
+	struct ref **list;
+	unsigned int flags;
+	struct oid_array *extra_have;
+	struct oid_array *shallow_points;
+	unsigned seen_response : 1;
+	unsigned recieved_dummy_capabilities_ref : 1;
+	unsigned done : 1;
+};
+void remote_refs_scanner_init(struct remote_refs_scanner *scanner,
+			      struct ref **list, unsigned int flags,
+			      struct oid_array *extra_have,
+			      struct oid_array *shallow_points);
+int process_ref(struct remote_refs_scanner *scanner,
+		const char *buffer, int len);
+
 extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
-				     struct ref **list, unsigned int flags,
-				     struct oid_array *extra_have,
-				     struct oid_array *shallow);
+				     struct remote_refs_scanner *scanner);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
diff --git a/transport.c b/transport.c
index d75ff0514..c05e167d6 100644
--- a/transport.c
+++ b/transport.c
@@ -187,18 +187,59 @@ static int connect_setup(struct transport *transport, int for_push)
 	return 0;
 }
 
+static int determine_version(int fd, struct strbuf *packet)
+{
+	const char *v;
+	int len;
+
+	len = strbuf_packet_read(fd, packet,
+				 PACKET_READ_GENTLE_ON_EOF |
+				 PACKET_READ_CHOMP_NEWLINE);
+	if (len < 0)
+		return -1;
+
+	if (skip_prefix(packet->buf, "version ", &v)) {
+		return 2;
+	}
+
+	return 1;
+}
+
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push)
 {
 	struct git_transport_data *data = transport->data;
+	struct strbuf buf = STRBUF_INIT;
+	struct remote_refs_scanner scanner;
 	struct ref *refs;
 
+	remote_refs_scanner_init(&scanner, &refs, for_push ? REF_NORMAL : 0,
+				 &data->extra_have, &data->shallow);
+
 	connect_setup(transport, for_push);
-	get_remote_heads(data->fd[0], NULL, 0, &refs,
-			 for_push ? REF_NORMAL : 0,
-			 &data->extra_have,
-			 &data->shallow);
+
+	if (0)
+		determine_version(-1, NULL);
+	switch(determine_version(data->fd[0], &buf)) {
+	case 2:
+		/* The server understands Protocol v2 */
+		fprintf(stderr, "Server understands Protocol v2!\n");
+		break;
+	case 1:
+		/* Server is speaking Protocol v1 and sent a ref so process it */
+		process_ref(&scanner, buf.buf, buf.len);
+		break;
+	default:
+		process_ref(&scanner, NULL, -1);
+		break;
+
+	}
+
+	get_remote_heads(data->fd[0], NULL, 0, &scanner);
+
 	data->got_remote_heads = 1;
 
+	strbuf_release(&buf);
+
 	return refs;
 }
 
@@ -231,9 +272,11 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.update_shallow = data->options.update_shallow;
 
 	if (!data->got_remote_heads) {
+		struct remote_refs_scanner scanner;
 		connect_setup(transport, 0);
-		get_remote_heads(data->fd[0], NULL, 0, &refs_tmp, 0,
-				 NULL, &data->shallow);
+
+		remote_refs_scanner_init(&scanner, &refs_tmp, 0, NULL, &data->shallow);
+		get_remote_heads(data->fd[0], NULL, 0, &scanner);
 		data->got_remote_heads = 1;
 	}
 
@@ -544,10 +587,11 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 
 	if (!data->got_remote_heads) {
 		struct ref *tmp_refs;
+		struct remote_refs_scanner scanner;
 		connect_setup(transport, 1);
 
-		get_remote_heads(data->fd[0], NULL, 0, &tmp_refs, REF_NORMAL,
-				 NULL, &data->shallow);
+		remote_refs_scanner_init(&scanner, &tmp_refs, REF_NORMAL, NULL, &data->shallow);
+		get_remote_heads(data->fd[0], NULL, 0, &scanner);
 		data->got_remote_heads = 1;
 	}
 
-- 
2.14.1.342.g6490525c54-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC 7/7] upload-pack: ack version 2
  2017-08-24 22:53 [RFC 0/7] transitioning to protocol v2 Brandon Williams
                   ` (5 preceding siblings ...)
  2017-08-24 22:53 ` [RFC 6/7] transport: teach client to recognize v2 server response Brandon Williams
@ 2017-08-24 22:53 ` Brandon Williams
  2017-09-01 22:02   ` Bryan Turner
  2017-08-25  1:19 ` [RFC 0/7] transitioning to protocol v2 Junio C Hamano
  2017-08-25 17:29 ` Jeff King
  8 siblings, 1 reply; 31+ messages in thread
From: Brandon Williams @ 2017-08-24 22:53 UTC (permalink / raw)
  To: git; +Cc: peff, jrnieder, sbeller, gitster, jonathantanmy, Brandon Williams

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 upload-pack.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/upload-pack.c b/upload-pack.c
index 7efff2fbf..0f853152f 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1032,9 +1032,15 @@ static int upload_pack_config(const char *var, const char *value, void *unused)
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
+void upload_pack_v2(void)
+{
+	packet_write_fmt(1, "%s\n", "version 2");
+}
+
 int cmd_main(int argc, const char **argv)
 {
 	const char *dir;
+	const char *version;
 	int strict = 0;
 	struct option options[] = {
 		OPT_BOOL(0, "stateless-rpc", &stateless_rpc,
@@ -1067,6 +1073,11 @@ int cmd_main(int argc, const char **argv)
 		die("'%s' does not appear to be a git repository", dir);
 
 	git_config(upload_pack_config, NULL);
+
+	version = getenv("GIT_PROTOCOL");
+	if (!strcmp(version, "2"))
+		upload_pack_v2();
+
 	upload_pack();
 	return 0;
 }
-- 
2.14.1.342.g6490525c54-goog


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-24 22:53 [RFC 0/7] transitioning to protocol v2 Brandon Williams
                   ` (6 preceding siblings ...)
  2017-08-24 22:53 ` [RFC 7/7] upload-pack: ack version 2 Brandon Williams
@ 2017-08-25  1:19 ` Junio C Hamano
  2017-08-25 17:07   ` Stefan Beller
  2017-08-25 17:29 ` Jeff King
  8 siblings, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2017-08-25  1:19 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, peff, jrnieder, sbeller, jonathantanmy

Brandon Williams <bmwill@google.com> writes:

> The best way to preserve functionality with old servers and clients would be to
> communicate using the same end point but have the client send a bit of extra
> information with its initial request.  This extra information would need to be
> sent in such a way that old servers ignore it and operate normally (using
> protocol v1).  The client would then need to be able to look at a server's
> response to determine whether the server understands and is speaking v2 or has
> ignored the clients request to use a newer protocol and is speaking v1.

Good. I think the idle talk last round was for the server to tell
the v1 client "we are still doing the slow ls-remote comes first
protocol with this exchange, but just for future reference, you can
use the v2 endpoint with me", which was way less desirable (even
though it may be safer).

> Patches 1-5 enable a client to unconditionally send this back-channel
> information to a server.  This is done by sending a version number after a
> second NUL byte in git://, in the envvar GIT_PROTOCOL in file:// and ssh://,
> and in an http header in http://, https://.  Patches 6-7 teach a client and
> upload-pack to send and recognize a request to use protocol v2.

All sounds sensible.

 - for git://, if you say you found a hole in the protocol to stuff
   this information, I simply believe you ;-)  Good job.

 - http:// and https:// should be a no-brainer as the HTTP headers
   give ample room to send information from the initiator side.

 - For ssh://, I do not think it is sane to assume that we can
   convince server operators to allow passing any random environment
   variable.  If the use of this specific variable turns out to be
   popular enough, however, and its benefit outweighs administrative
   burden, perhaps it is not impossible to convince them to allow
   AcceptEnv on this single variable.

Thanks.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-25  1:19 ` [RFC 0/7] transitioning to protocol v2 Junio C Hamano
@ 2017-08-25 17:07   ` Stefan Beller
  2017-08-25 17:14     ` Junio C Hamano
  0 siblings, 1 reply; 31+ messages in thread
From: Stefan Beller @ 2017-08-25 17:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Brandon Williams, git@vger.kernel.org, Jeff King, Jonathan Nieder,
	Jonathan Tan

On Thu, Aug 24, 2017 at 6:19 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Brandon Williams <bmwill@google.com> writes:
>
>> The best way to preserve functionality with old servers and clients would be to
>> communicate using the same end point but have the client send a bit of extra
>> information with its initial request.  This extra information would need to be
>> sent in such a way that old servers ignore it and operate normally (using
>> protocol v1).  The client would then need to be able to look at a server's
>> response to determine whether the server understands and is speaking v2 or has
>> ignored the clients request to use a newer protocol and is speaking v1.
>
> Good. I think the idle talk last round was for the server to tell
> the v1 client "we are still doing the slow ls-remote comes first
> protocol with this exchange, but just for future reference, you can
> use the v2 endpoint with me", which was way less desirable (even
> though it may be safer).

The idea that this RFC series tries to show case is safer IMO.

>> Patches 1-5 enable a client to unconditionally send this back-channel
>> information to a server.  This is done by sending a version number after a
>> second NUL byte in git://, in the envvar GIT_PROTOCOL in file:// and ssh://,
>> and in an http header in http://, https://.  Patches 6-7 teach a client and
>> upload-pack to send and recognize a request to use protocol v2.
>
> All sounds sensible.
>
>  - for git://, if you say you found a hole in the protocol to stuff
>    this information, I simply believe you ;-)  Good job.

The hole is incredible funny and sad IMHO, so I would expect that
this series (specifically the review once it leaves RFC state) focusses
on how to allow future protocols with no such hacks. So AFAICT
the core idea of this series is that we can have an exchange

client through poked hole>  "We can speak version 4,3, and 1 as fallback"
server > "Ok, 3 it is"
[ protocol v3 is executed, I don't know what it looks like. ]

Alternatively when the server is old:

client >  "We can speak version 4,3, and 1 as fallback"
server > lists refs
client continues v1 as usual, because the version announcements
are so different from the first ref in the refs advertisement of v1,
such that the client knows for sure which version is talked. (despite
never getting an explicit "it is v1")

Or if the client is old:

client > (nothing)
server > lists refs, current v1 style.


>  - http:// and https:// should be a no-brainer as the HTTP headers
>    give ample room to send information from the initiator side.
>
>  - For ssh://, I do not think it is sane to assume that we can
>    convince server operators to allow passing any random environment
>    variable.  If the use of this specific variable turns out to be
>    popular enough, however, and its benefit outweighs administrative
>    burden, perhaps it is not impossible to convince them to allow
>    AcceptEnv on this single variable.

Once the next generation protocol demonstrates it is far superior
than the current protocol admins will switch happily. (Some ideas:
(a) refstable instead packed-refs format, yielding better compression for
ref advertisement, (b) refs-in-want to cut down ref advertisement to
just the needs, (c) negotiation to draw from the remote reflog)

For now I would suggest we put a protocol v2 in place that is
the current protocol + a version number coming through the
poked hole at the beginning; the goal and review of this series
ought to focus on getting the version handshake right (and future
proof for an eventual v3 if needed in another 10 years)

Regarding the patches itself:

 patches 1,2 seem ok, no further comment
 patches 3-5 are the client stating that it can understand v2.
 which means that patch 6 ("actually understand a v2, that
 looks surprisingly similar to v1") should come before 3-5.

 patch 7 can be either before or after 6, the server side
 seems independent of the client side for the sake of
 this patch series.

Thanks,
Stefan

>
> Thanks.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-25 17:07   ` Stefan Beller
@ 2017-08-25 17:14     ` Junio C Hamano
  2017-08-25 17:36       ` Jeff King
  0 siblings, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2017-08-25 17:14 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Brandon Williams, git@vger.kernel.org, Jeff King, Jonathan Nieder,
	Jonathan Tan

Stefan Beller <sbeller@google.com> writes:

> For now I would suggest we put a protocol v2 in place that is
> the current protocol + a version number coming through the
> poked hole at the beginning; the goal and review of this series
> ought to focus on getting the version handshake right...

Oh, we are in absolute agreement on that.  It would be nice if we
can have new tests to demonostrate three combinations working well
(i.e. use 'installed git' whose path is given externally on one end
of the connection, while the just-built binary sits on the other
end, in addition to making sure just-built binary successfully talks
with itself).


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-24 22:53 [RFC 0/7] transitioning to protocol v2 Brandon Williams
                   ` (7 preceding siblings ...)
  2017-08-25  1:19 ` [RFC 0/7] transitioning to protocol v2 Junio C Hamano
@ 2017-08-25 17:29 ` Jeff King
  2017-08-25 17:35   ` Jonathan Nieder
                     ` (2 more replies)
  8 siblings, 3 replies; 31+ messages in thread
From: Jeff King @ 2017-08-25 17:29 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, jrnieder, sbeller, gitster, jonathantanmy

On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:

> Another version of Git's wire protocol is a topic that has been discussed and
> attempted by many in the community over the years.  The biggest challenge, as
> far as I understand, has been coming up with a transition plan to using the new
> server without breaking existing clients and servers.  As such this RFC is
> really only concerned with solidifying a transition plan.  Once it has been
> decided how we can transition to a new protocol we can get into decided what
> this new protocol would look like (though it would obviously eliminate the ref
> advertisement ;).

Sadly, while splitting these things apart makes the protocol
conceptually cleaner, I'm not sure if we can consider them separately
and avoid adding an extra round-trip to the protocol.

For instance, let's say as a client that I've communicated "I would like
to speak v2" to the server. I don't immediately know if it was received
and respected, so I have to wait for the server to say "OK, I know v2"
before sending any more data (like my list of capabilities that I'd like
the server to know before doing the ref advertisement).

So what was perhaps:

  C: please run upload-pack
  S: advertisement + caps
  C: caps + wants
  C+S: async have negotiation
  S: packfile

becomes:

  C: please run upload-pack (v2 if you support it)
  S: yes, I speak v2
  C: caps (including that I'm interested only in refs/heads/foo)
  S: advertise refs/heads/foo + caps
  C+S async have negotiation
  S: packfile

That extra round-trip is probably tolerable for stateful connections
like git:// or ssh. But what about http? We have to add a whole
request/response pair just to find out if v2 is supported.

But what if we instead think of it not as "protocol v2" but as "can I
give the server some hints that it may end up ignoring", then we end up
with something more like:

  C: please run upload-pack (btw, I'm only interested in refs/heads/foo)
  S: advertisement + caps (hopefully limited to foo, but client is prepared to receive all)
  ... etc, as before ...

It's a subtle distinction, but the question becomes not "can we sneak in
an extra bit of information" but "can we sneak in a reasonable number of
arbitrary key/value pairs".

Given the techniques you've used here, I suspect the answer may be
"yes". We could stick arbitrary data in each of those methods (though I
suspect our length may be limited to about 1024 bytes if we want
compatibility with very old git servers).

> The biggest question I'm trying to answer is if these are reasonable ways with
> which to communicate a request to a server to use a newer protocol, without
> breaking current servers/clients.  As far as I've tested, with patches 1-5
> applied I can still communicate with current servers without causing any
> problems.

Current git.git servers, I assume?. How much do we want to care about
alternate implementations? I would not be surprised if other git://
implementations are more picky about cruft after the virtual-host field
(though I double-checked GitHub's implementation at least, and it is
fine).

I don't think libgit2 implements the server side. That leaves probably
JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
and GitLab use.

There's not really a spec here.  I'm not entirely opposed to saying "if
your server does not behave like the git.git one it is wrong". But this
is awfully quirky behavior to be relying on, and if it behaves badly
with those implementations it will create headaches for users. The
centralized services I'm not too worried about; they'll upgrade
promptly. But any deployments of those systems may hang around for
years.

I dunno. Maybe it would be enough to have a config to switch off this
feature, which would give people using those systems an escape hatch
(until they upgrade). Or alternatively, I guess make this optional to
start with, and let early adopters turn it on and complain to their server
vendors for a while before flipping the default to on.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-25 17:29 ` Jeff King
@ 2017-08-25 17:35   ` Jonathan Nieder
  2017-08-25 17:41     ` Jeff King
  2017-08-29 20:08     ` Jeff Hostetler
  2017-08-25 17:48   ` Junio C Hamano
  2017-08-30 20:38   ` Bryan Turner
  2 siblings, 2 replies; 31+ messages in thread
From: Jonathan Nieder @ 2017-08-25 17:35 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, git, sbeller, gitster, jonathantanmy

Hi,

Jeff King wrote:
> On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:

>> Another version of Git's wire protocol is a topic that has been discussed and
>> attempted by many in the community over the years.  The biggest challenge, as
>> far as I understand, has been coming up with a transition plan to using the new
>> server without breaking existing clients and servers.  As such this RFC is
>> really only concerned with solidifying a transition plan.  Once it has been
>> decided how we can transition to a new protocol we can get into decided what
>> this new protocol would look like (though it would obviously eliminate the ref
>> advertisement ;).
>
> Sadly, while splitting these things apart makes the protocol
> conceptually cleaner, I'm not sure if we can consider them separately
> and avoid adding an extra round-trip to the protocol.

How about the idea of using this mechanism to implement a protocol
"v1"?

The reply would be the same as today, except that it has a "protocol
v1" pkt-line at the beginning.  So this doesn't change the number of
round-trips --- it just validates the protocol migration approach.

I agree with you that an actual protocol v2 needs to change how the
capability exchange etc work.  bmwill@ has mentioned some thoughts about
this privately.  Probably he can say more here too.

[...]
> Given the techniques you've used here, I suspect the answer may be
> "yes". We could stick arbitrary data in each of those methods (though I
> suspect our length may be limited to about 1024 bytes if we want
> compatibility with very old git servers).

Yes, including arbitrary data to be able to include some kinds of
requests inline in the initial request is one of the design goals.

>> The biggest question I'm trying to answer is if these are reasonable ways with
>> which to communicate a request to a server to use a newer protocol, without
>> breaking current servers/clients.  As far as I've tested, with patches 1-5
>> applied I can still communicate with current servers without causing any
>> problems.
>
> Current git.git servers, I assume?. How much do we want to care about
> alternate implementations? I would not be surprised if other git://
> implementations are more picky about cruft after the virtual-host field
> (though I double-checked GitHub's implementation at least, and it is
> fine).

FWIW JGit copes fine with this.

> I don't think libgit2 implements the server side. That leaves probably
> JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
> and GitLab use.

I'd be happy if someone tests the patches against those. :)

> There's not really a spec here.

Technically pack-protocol is a spec, just not a very clear one.

It does say this kind of client request is invalid.  The idea of this
series is to change the spec. :)

[...]
> I dunno. Maybe it would be enough to have a config to switch off this
> feature, which would give people using those systems an escape hatch
> (until they upgrade).

I'd rather not.  That means there's less motivation for people to
report compatibility problems so we can fix them.

It might be necessary as a temporary escape hatch, though.

>                        Or alternatively, I guess make this optional to
> start with, and let early adopters turn it on and complain to their server
> vendors for a while before flipping the default to on.

Can we do that by making it a patch / letting it cook for a while in
'next'? :)

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-25 17:14     ` Junio C Hamano
@ 2017-08-25 17:36       ` Jeff King
  0 siblings, 0 replies; 31+ messages in thread
From: Jeff King @ 2017-08-25 17:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, Brandon Williams, git@vger.kernel.org,
	Jonathan Nieder, Jonathan Tan

On Fri, Aug 25, 2017 at 10:14:13AM -0700, Junio C Hamano wrote:

> Stefan Beller <sbeller@google.com> writes:
> 
> > For now I would suggest we put a protocol v2 in place that is
> > the current protocol + a version number coming through the
> > poked hole at the beginning; the goal and review of this series
> > ought to focus on getting the version handshake right...
> 
> Oh, we are in absolute agreement on that.  It would be nice if we
> can have new tests to demonostrate three combinations working well
> (i.e. use 'installed git' whose path is given externally on one end
> of the connection, while the just-built binary sits on the other
> end, in addition to making sure just-built binary successfully talks
> with itself).

The harness in t/interop can perhaps help with that (at least between
existing git versions; testing across other implementations makes the
setup a lot tricker).

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-25 17:35   ` Jonathan Nieder
@ 2017-08-25 17:41     ` Jeff King
  2017-08-25 18:50       ` Brandon Williams
  2017-08-29 20:08     ` Jeff Hostetler
  1 sibling, 1 reply; 31+ messages in thread
From: Jeff King @ 2017-08-25 17:41 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Brandon Williams, git, sbeller, gitster, jonathantanmy

On Fri, Aug 25, 2017 at 10:35:50AM -0700, Jonathan Nieder wrote:

> > Sadly, while splitting these things apart makes the protocol
> > conceptually cleaner, I'm not sure if we can consider them separately
> > and avoid adding an extra round-trip to the protocol.
> 
> How about the idea of using this mechanism to implement a protocol
> "v1"?
> 
> The reply would be the same as today, except that it has a "protocol
> v1" pkt-line at the beginning.  So this doesn't change the number of
> round-trips --- it just validates the protocol migration approach.

I'm not that worried about validating the ideas here to shoe-horn a
version field. I'm worried about what "step 2" is going to look like,
and whether "we shoe-horned a version field" can be extended to "we
shoe-horned arbitrary data".

Maybe those are the same thing, and validating the first validates the
second. But I don't think it's a foregone conclusion.

> > Current git.git servers, I assume?. How much do we want to care about
> > alternate implementations? I would not be surprised if other git://
> > implementations are more picky about cruft after the virtual-host field
> > (though I double-checked GitHub's implementation at least, and it is
> > fine).
> 
> FWIW JGit copes fine with this.

Thanks for checking.

> > I don't think libgit2 implements the server side. That leaves probably
> > JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
> > and GitLab use.
> 
> I'd be happy if someone tests the patches against those. :)

Me too. I don't have an easy setup for the last 3.

> >                        Or alternatively, I guess make this optional to
> > start with, and let early adopters turn it on and complain to their server
> > vendors for a while before flipping the default to on.
> 
> Can we do that by making it a patch / letting it cook for a while in
> 'next'? :)

If people actually ran 'next', that would help. I was hoping that we
could get it out to the masses behind a feature flag, and dangle it in
front of them with "this will improve fetch performance if you turn it
on". But that carrot implies going all the way through the follow-on
steps of designing the performance-improving v2 extensions and getting
them implemented on the server side.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 3/7] protocol: tell server that the client understands v2
  2017-08-24 22:53 ` [RFC 3/7] protocol: tell server that the client understands v2 Brandon Williams
@ 2017-08-25 17:45   ` Junio C Hamano
  2017-08-25 18:53     ` Brandon Williams
  0 siblings, 1 reply; 31+ messages in thread
From: Junio C Hamano @ 2017-08-25 17:45 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, peff, jrnieder, sbeller, jonathantanmy

Brandon Williams <bmwill@google.com> writes:

> +		/* If using a new version put that stuff here after a second null byte */
> +		strbuf_addch(&request, '\0');
> +		strbuf_addf(&request, "version=%d%c", 2, '\0');
> +		/* subsequent supported versions can also be added */
> +		strbuf_addf(&request, "version=%d%c", 3, '\0');

Isn't this last one meant only as a comment?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-25 17:29 ` Jeff King
  2017-08-25 17:35   ` Jonathan Nieder
@ 2017-08-25 17:48   ` Junio C Hamano
  2017-08-30 20:38   ` Bryan Turner
  2 siblings, 0 replies; 31+ messages in thread
From: Junio C Hamano @ 2017-08-25 17:48 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, git, jrnieder, sbeller, jonathantanmy

Jeff King <peff@peff.net> writes:

> But what if we instead think of it not as "protocol v2" but as "can I
> give the server some hints that it may end up ignoring", then we end up
> with something more like:
>
>   C: please run upload-pack (btw, I'm only interested in refs/heads/foo)
>   S: advertisement + caps (hopefully limited to foo, but client is prepared to receive all)
>   ... etc, as before ...

Nice.  The caps that come back can tell us between the cases where
they only had refs/heads/foo and nothing else, or if they limited
their output to it among many others we told them to ignore, so
there is no ambiguity.

> Or alternatively, I guess make this optional to start with, and
> let early adopters turn it on and complain to their server vendors
> for a while before flipping the default to on.

That sounds like a safe transition plan.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-25 17:41     ` Jeff King
@ 2017-08-25 18:50       ` Brandon Williams
  0 siblings, 0 replies; 31+ messages in thread
From: Brandon Williams @ 2017-08-25 18:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, git, sbeller, gitster, jonathantanmy

On 08/25, Jeff King wrote:
> On Fri, Aug 25, 2017 at 10:35:50AM -0700, Jonathan Nieder wrote:
> 
> > > Sadly, while splitting these things apart makes the protocol
> > > conceptually cleaner, I'm not sure if we can consider them separately
> > > and avoid adding an extra round-trip to the protocol.
> > 
> > How about the idea of using this mechanism to implement a protocol
> > "v1"?
> > 
> > The reply would be the same as today, except that it has a "protocol
> > v1" pkt-line at the beginning.  So this doesn't change the number of
> > round-trips --- it just validates the protocol migration approach.
> 
> I'm not that worried about validating the ideas here to shoe-horn a
> version field. I'm worried about what "step 2" is going to look like,
> and whether "we shoe-horned a version field" can be extended to "we
> shoe-horned arbitrary data".

I know that this initial RFC didn't mention adding arbitrary data in the
initial request but I fully intend that the final version will allow
such a thing.  One such idea (via the envvar, though the same can be
done with git-daemon) would be to have GIT_PROTOCOL hold colon
delimited values, versions could be indicated by "v1", "v2", etc while
arbitrary key/values as "key=value" such that the envvar would look
like: 'v1:v2:key1=value1:key2=value2'.  This would mean that the client
understands protocol version 1 and 2 (so either are acceptable for the
server to select since there is a change the server doesn't understand
v2 or some other higher version number) and that if supported it wants
key1 to have value 'value1' and key2 to have value 'value2'.

As you said the initial request to git-daemon is limited in length (I
think envvars have a length limit too?) so we would still be limited in
how much data we can send in that initial request and so we should
design a new protocol in such a way that doesn't hinge on adding extra
data in that first request (aside from a version number) but that can
use it to speed things up if at all possible.

> > Can we do that by making it a patch / letting it cook for a while in
> > 'next'? :)
> 
> If people actually ran 'next', that would help. I was hoping that we
> could get it out to the masses behind a feature flag, and dangle it in
> front of them with "this will improve fetch performance if you turn it
> on". But that carrot implies going all the way through the follow-on
> steps of designing the performance-improving v2 extensions and getting
> them implemented on the server side.

We run 'next' here so we will be able to get at least a little bit of
feedback from a small subset of users.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 3/7] protocol: tell server that the client understands v2
  2017-08-25 17:45   ` Junio C Hamano
@ 2017-08-25 18:53     ` Brandon Williams
  2017-08-25 18:55       ` Brandon Williams
  0 siblings, 1 reply; 31+ messages in thread
From: Brandon Williams @ 2017-08-25 18:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, peff, jrnieder, sbeller, jonathantanmy

On 08/25, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > +		/* If using a new version put that stuff here after a second null byte */
> > +		strbuf_addch(&request, '\0');
> > +		strbuf_addf(&request, "version=%d%c", 2, '\0');
> > +		/* subsequent supported versions can also be added */
> > +		strbuf_addf(&request, "version=%d%c", 3, '\0');
> 
> Isn't this last one meant only as a comment?

Sorry since this was structured as an RFC I didn't go back through the
code with a fine tooth comb to ensure I removed or commented out any
debugging statements.  Stefan also pointed out to me that I left in an
if (0) statement somewhere haha.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 3/7] protocol: tell server that the client understands v2
  2017-08-25 18:53     ` Brandon Williams
@ 2017-08-25 18:55       ` Brandon Williams
  0 siblings, 0 replies; 31+ messages in thread
From: Brandon Williams @ 2017-08-25 18:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, peff, jrnieder, sbeller, jonathantanmy

On 08/25, Brandon Williams wrote:
> On 08/25, Junio C Hamano wrote:
> > Brandon Williams <bmwill@google.com> writes:
> > 
> > > +		/* If using a new version put that stuff here after a second null byte */
> > > +		strbuf_addch(&request, '\0');
> > > +		strbuf_addf(&request, "version=%d%c", 2, '\0');
> > > +		/* subsequent supported versions can also be added */
> > > +		strbuf_addf(&request, "version=%d%c", 3, '\0');
> > 
> > Isn't this last one meant only as a comment?
> 
> Sorry since this was structured as an RFC I didn't go back through the
> code with a fine tooth comb to ensure I removed or commented out any
> debugging statements.  Stefan also pointed out to me that I left in an
> if (0) statement somewhere haha.

Oh one more thing about this line.  I added it to show (and check) that
git-daemon would be able to cope with more than one field being added as
I wanted to avoid the issue we had when adding the 'host' field where
additional fields aren't allowed as they aren't gracefully ignored by
the server.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-25 17:35   ` Jonathan Nieder
  2017-08-25 17:41     ` Jeff King
@ 2017-08-29 20:08     ` Jeff Hostetler
  2017-08-29 21:10       ` Brandon Williams
  2017-08-30  3:06       ` Jeff King
  1 sibling, 2 replies; 31+ messages in thread
From: Jeff Hostetler @ 2017-08-29 20:08 UTC (permalink / raw)
  To: Jonathan Nieder, Jeff King
  Cc: Brandon Williams, git, sbeller, gitster, jonathantanmy



On 8/25/2017 1:35 PM, Jonathan Nieder wrote:
> Hi,
> 
> Jeff King wrote:
>> On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:
> 
>>> Another version of Git's wire protocol is a topic that has been discussed and
>>> attempted by many in the community over the years.  The biggest challenge, as
>>> far as I understand, has been coming up with a transition plan to using the new
>>> server without breaking existing clients and servers.  As such this RFC is
>>> really only concerned with solidifying a transition plan.  Once it has been
>>> decided how we can transition to a new protocol we can get into decided what
>>> this new protocol would look like (though it would obviously eliminate the ref
>>> advertisement ;).
>>
> 
>> I don't think libgit2 implements the server side. That leaves probably
>> JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
>> and GitLab use.
> 
> I'd be happy if someone tests the patches against those. :)

I just wanted to jump in here and say I've done some initial
testing of this against VSTS and so far it seems fine.  And yes,
we have a custom git server.

VSTS doesn't support the "git://" protocol, so the double-null trick
isn't an issue for us.  But "https://" worked just fine.  I'm still
asking around internally whether we support passing SSH environment
variables.

Jeff


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-29 20:08     ` Jeff Hostetler
@ 2017-08-29 21:10       ` Brandon Williams
  2017-08-30  3:06       ` Jeff King
  1 sibling, 0 replies; 31+ messages in thread
From: Brandon Williams @ 2017-08-29 21:10 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jonathan Nieder, Jeff King, git, sbeller, gitster, jonathantanmy

On 08/29, Jeff Hostetler wrote:
> 
> 
> On 8/25/2017 1:35 PM, Jonathan Nieder wrote:
> >Hi,
> >
> >Jeff King wrote:
> >>On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:
> >
> >>>Another version of Git's wire protocol is a topic that has been discussed and
> >>>attempted by many in the community over the years.  The biggest challenge, as
> >>>far as I understand, has been coming up with a transition plan to using the new
> >>>server without breaking existing clients and servers.  As such this RFC is
> >>>really only concerned with solidifying a transition plan.  Once it has been
> >>>decided how we can transition to a new protocol we can get into decided what
> >>>this new protocol would look like (though it would obviously eliminate the ref
> >>>advertisement ;).
> >>
> >
> >>I don't think libgit2 implements the server side. That leaves probably
> >>JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
> >>and GitLab use.
> >
> >I'd be happy if someone tests the patches against those. :)
> 
> I just wanted to jump in here and say I've done some initial
> testing of this against VSTS and so far it seems fine.  And yes,
> we have a custom git server.
> 
> VSTS doesn't support the "git://" protocol, so the double-null trick
> isn't an issue for us.  But "https://" worked just fine.  I'm still
> asking around internally whether we support passing SSH environment
> variables.
> 
> Jeff
> 

Thanks for checking on this, I really appreciate it.  Please let me know
if anything I haven't thought of becomes an issue.

I'm currently working on getting these patches into a more polished
state to be used (as discussed elsewhere on this thread) as a precursor
to an actual v2.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-29 20:08     ` Jeff Hostetler
  2017-08-29 21:10       ` Brandon Williams
@ 2017-08-30  3:06       ` Jeff King
  2017-08-30 13:30         ` Jeff Hostetler
  1 sibling, 1 reply; 31+ messages in thread
From: Jeff King @ 2017-08-30  3:06 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jonathan Nieder, Brandon Williams, git, sbeller, gitster,
	jonathantanmy

On Tue, Aug 29, 2017 at 04:08:25PM -0400, Jeff Hostetler wrote:

> I just wanted to jump in here and say I've done some initial
> testing of this against VSTS and so far it seems fine.  And yes,
> we have a custom git server.

Great, thank you for checking.

> VSTS doesn't support the "git://" protocol, so the double-null trick
> isn't an issue for us.  But "https://" worked just fine.  I'm still
> asking around internally whether we support passing SSH environment
> variables.

The key thing for ssh is not whether you support passing environment
variables. It's whether you quietly ignore unknown variables rather than
cutting off the connection.

To support the v2 protocol you'd need to pass the new variables, but
you'd also need to modify your server to actually do something useful
with them anyway. At this point we're mostly concerned with whether we
can safely pass the variables to current implementations unconditionally
and get a reasonable outcome.

To be honest, I'm not all that worried about VSTS either way. It's a
centralized service which will likely get v2 extensions implemented in a
timely manner (once they exist). I'm much more concerned about
shrink-wrap implementations deployed in the wild, which may hang around
without being upgraded for years. If v2-aware clients send requests that
cause those old implementations to choke, users won't be happy.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 5/7] http: send Git-Protocol-Version header
  2017-08-24 22:53 ` [RFC 5/7] http: send Git-Protocol-Version header Brandon Williams
@ 2017-08-30 10:55   ` Kevin Daudt
  0 siblings, 0 replies; 31+ messages in thread
From: Kevin Daudt @ 2017-08-30 10:55 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, peff, jrnieder, sbeller, gitster, jonathantanmy

On Thu, Aug 24, 2017 at 03:53:26PM -0700, Brandon Williams wrote:
> Tell a serve that protocol v2 can be used by sending an http header
> indicating this.

s/serve/server/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-30  3:06       ` Jeff King
@ 2017-08-30 13:30         ` Jeff Hostetler
  2017-08-30 16:54           ` Brandon Williams
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff Hostetler @ 2017-08-30 13:30 UTC (permalink / raw)
  To: Jeff King
  Cc: Jonathan Nieder, Brandon Williams, git, sbeller, gitster,
	jonathantanmy



On 8/29/2017 11:06 PM, Jeff King wrote:
> On Tue, Aug 29, 2017 at 04:08:25PM -0400, Jeff Hostetler wrote:
> 
>> I just wanted to jump in here and say I've done some initial
>> testing of this against VSTS and so far it seems fine.  And yes,
>> we have a custom git server.
> 
> Great, thank you for checking.
> 
>> VSTS doesn't support the "git://" protocol, so the double-null trick
>> isn't an issue for us.  But "https://" worked just fine.  I'm still
>> asking around internally whether we support passing SSH environment
>> variables.
> 
> The key thing for ssh is not whether you support passing environment
> variables. It's whether you quietly ignore unknown variables rather than
> cutting off the connection.
 >
> To support the v2 protocol you'd need to pass the new variables, but
> you'd also need to modify your server to actually do something useful
> with them anyway. At this point we're mostly concerned with whether we
> can safely pass the variables to current implementations unconditionally
> and get a reasonable outcome.

Right.  I just spoke with our server folks and, currently, our SSH
support quietly eats ALL variables.   So we're safe :-)

I'm starting a conversation with them to pass them thru so we can
be ready for this.  (Assuming we choose to go this way.)

Thanks,
Jeff



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-30 13:30         ` Jeff Hostetler
@ 2017-08-30 16:54           ` Brandon Williams
  0 siblings, 0 replies; 31+ messages in thread
From: Brandon Williams @ 2017-08-30 16:54 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Jeff King, Jonathan Nieder, git, sbeller, gitster, jonathantanmy

On 08/30, Jeff Hostetler wrote:
> 
> 
> On 8/29/2017 11:06 PM, Jeff King wrote:
> >On Tue, Aug 29, 2017 at 04:08:25PM -0400, Jeff Hostetler wrote:
> >
> >>I just wanted to jump in here and say I've done some initial
> >>testing of this against VSTS and so far it seems fine.  And yes,
> >>we have a custom git server.
> >
> >Great, thank you for checking.
> >
> >>VSTS doesn't support the "git://" protocol, so the double-null trick
> >>isn't an issue for us.  But "https://" worked just fine.  I'm still
> >>asking around internally whether we support passing SSH environment
> >>variables.
> >
> >The key thing for ssh is not whether you support passing environment
> >variables. It's whether you quietly ignore unknown variables rather than
> >cutting off the connection.
> >
> >To support the v2 protocol you'd need to pass the new variables, but
> >you'd also need to modify your server to actually do something useful
> >with them anyway. At this point we're mostly concerned with whether we
> >can safely pass the variables to current implementations unconditionally
> >and get a reasonable outcome.
> 
> Right.  I just spoke with our server folks and, currently, our SSH
> support quietly eats ALL variables.   So we're safe :-)
> 
> I'm starting a conversation with them to pass them thru so we can
> be ready for this.  (Assuming we choose to go this way.)

Perfect! Thanks again.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-25 17:29 ` Jeff King
  2017-08-25 17:35   ` Jonathan Nieder
  2017-08-25 17:48   ` Junio C Hamano
@ 2017-08-30 20:38   ` Bryan Turner
  2017-08-30 21:12     ` Brandon Williams
  2 siblings, 1 reply; 31+ messages in thread
From: Bryan Turner @ 2017-08-30 20:38 UTC (permalink / raw)
  To: Jeff King
  Cc: Brandon Williams, Git Users, Jonathan Nieder, sbeller,
	Junio C Hamano, jonathantanmy

On Fri, Aug 25, 2017 at 10:29 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:
>
>> The biggest question I'm trying to answer is if these are reasonable ways with
>> which to communicate a request to a server to use a newer protocol, without
>> breaking current servers/clients.  As far as I've tested, with patches 1-5
>> applied I can still communicate with current servers without causing any
>> problems.
>
> Current git.git servers, I assume?. How much do we want to care about
> alternate implementations? I would not be surprised if other git://
> implementations are more picky about cruft after the virtual-host field
> (though I double-checked GitHub's implementation at least, and it is
> fine).
>
> I don't think libgit2 implements the server side. That leaves probably
> JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
> and GitLab use.

Before I manually apply the patches to test how they work with
Bitbucket Server, are they applied on a branch somewhere where I can
just fetch them? If not, I'll apply them manually and verify.

Just based on the description, though, I expect no issues. We don't
currently support the git:// protocol. Our HTTP handling passes
headers through to the receive-pack and upload-pack processes as
environment variables (with a little massaging), but doesn't consider
them itself; it only considers the URL and "service" query parameter
to decide what command to run and to detect "dumb" requests. Our SSH
handling ignores any environment variables provided and does not
forward them to the git process, similar to VSTS.

I'll confirm explicitly, to be certain, but just based on reading the
overview and knowing our code I think the described approaches should
work fine.

Best regards,
Bryan Turner

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-30 20:38   ` Bryan Turner
@ 2017-08-30 21:12     ` Brandon Williams
  2017-09-01 23:06       ` Bryan Turner
  0 siblings, 1 reply; 31+ messages in thread
From: Brandon Williams @ 2017-08-30 21:12 UTC (permalink / raw)
  To: Bryan Turner
  Cc: Jeff King, Git Users, Jonathan Nieder, sbeller, Junio C Hamano,
	jonathantanmy

On 08/30, Bryan Turner wrote:
> On Fri, Aug 25, 2017 at 10:29 AM, Jeff King <peff@peff.net> wrote:
> > On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:
> >
> >> The biggest question I'm trying to answer is if these are reasonable ways with
> >> which to communicate a request to a server to use a newer protocol, without
> >> breaking current servers/clients.  As far as I've tested, with patches 1-5
> >> applied I can still communicate with current servers without causing any
> >> problems.
> >
> > Current git.git servers, I assume?. How much do we want to care about
> > alternate implementations? I would not be surprised if other git://
> > implementations are more picky about cruft after the virtual-host field
> > (though I double-checked GitHub's implementation at least, and it is
> > fine).
> >
> > I don't think libgit2 implements the server side. That leaves probably
> > JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
> > and GitLab use.
> 
> Before I manually apply the patches to test how they work with
> Bitbucket Server, are they applied on a branch somewhere where I can
> just fetch them? If not, I'll apply them manually and verify.

I just pushed this set of patches up to: https://github.com/bmwill/git/tree/protocol-v2
so you should be able to fetch them from there (saves you from having to
manually applying the patches).

> Just based on the description, though, I expect no issues. We don't
> currently support the git:// protocol. Our HTTP handling passes
> headers through to the receive-pack and upload-pack processes as
> environment variables (with a little massaging), but doesn't consider
> them itself; it only considers the URL and "service" query parameter
> to decide what command to run and to detect "dumb" requests. Our SSH
> handling ignores any environment variables provided and does not
> forward them to the git process, similar to VSTS.
> 
> I'll confirm explicitly, to be certain, but just based on reading the
> overview and knowing our code I think the described approaches should
> work fine.

Perfect!  Thanks for taking the time to verify that this will work.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 7/7] upload-pack: ack version 2
  2017-08-24 22:53 ` [RFC 7/7] upload-pack: ack version 2 Brandon Williams
@ 2017-09-01 22:02   ` Bryan Turner
  2017-09-01 23:20     ` Brandon Williams
  0 siblings, 1 reply; 31+ messages in thread
From: Bryan Turner @ 2017-09-01 22:02 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Git Users, Jeff King, Jonathan Nieder, sbeller, Junio C Hamano,
	jonathantanmy

On Thu, Aug 24, 2017 at 3:53 PM, Brandon Williams <bmwill@google.com> wrote:
> +
> +       version = getenv("GIT_PROTOCOL");
> +       if (!strcmp(version, "2"))
> +               upload_pack_v2();
> +

I think the "if" check here needs some type of NULL check for
"version" before calling "strcmp". Without that, if the "GIT_PROTOCOL"
environment variable isn't set, git-upload-pack SEGVs.

This came up when I was testing the "protocol-v2" branch with
Bitbucket Server. For performance reasons, we sometimes run ref
advertisements as a separate process, when serving fetches, to allow
throttling the ref advertisement separately from actually generating
and sending a packfile. Since CI servers continuously poll for
changes, but usually don't find any, we want to be able to serve ref
advertisements, which have minimal overhead, with much higher
parallelism than serving packs. To do that, we run "git-upload-pack
--advertize-refs" directly, and that command has never *required*
"GIT_PROTOCOL" before this change.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC 0/7] transitioning to protocol v2
  2017-08-30 21:12     ` Brandon Williams
@ 2017-09-01 23:06       ` Bryan Turner
  0 siblings, 0 replies; 31+ messages in thread
From: Bryan Turner @ 2017-09-01 23:06 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Jeff King, Git Users, Jonathan Nieder, Stefan Beller,
	Junio C Hamano, Jonathan Tan

On Wed, Aug 30, 2017 at 2:12 PM, Brandon Williams <bmwill@google.com> wrote:
> On 08/30, Bryan Turner wrote:
>> On Fri, Aug 25, 2017 at 10:29 AM, Jeff King <peff@peff.net> wrote:
>> > On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:
>> >
>> >> The biggest question I'm trying to answer is if these are reasonable ways with
>> >> which to communicate a request to a server to use a newer protocol, without
>> >> breaking current servers/clients.  As far as I've tested, with patches 1-5
>> >> applied I can still communicate with current servers without causing any
>> >> problems.
>> >
>> > Current git.git servers, I assume?. How much do we want to care about
>> > alternate implementations? I would not be surprised if other git://
>> > implementations are more picky about cruft after the virtual-host field
>> > (though I double-checked GitHub's implementation at least, and it is
>> > fine).
>> >
>> > I don't think libgit2 implements the server side. That leaves probably
>> > JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
>> > and GitLab use.
>>
>> Before I manually apply the patches to test how they work with
>> Bitbucket Server, are they applied on a branch somewhere where I can
>> just fetch them? If not, I'll apply them manually and verify.
>
> I just pushed this set of patches up to: https://github.com/bmwill/git/tree/protocol-v2
> so you should be able to fetch them from there (saves you from having to
> manually applying the patches).

Thanks for that! It made testing a lot simpler.

>
>> Just based on the description, though, I expect no issues. We don't
>> currently support the git:// protocol. Our HTTP handling passes
>> headers through to the receive-pack and upload-pack processes as
>> environment variables (with a little massaging), but doesn't consider
>> them itself; it only considers the URL and "service" query parameter
>> to decide what command to run and to detect "dumb" requests. Our SSH
>> handling ignores any environment variables provided and does not
>> forward them to the git process, similar to VSTS.
>>
>> I'll confirm explicitly, to be certain, but just based on reading the
>> overview and knowing our code I think the described approaches should
>> work fine.
>
> Perfect!  Thanks for taking the time to verify that this will work.

With 2 small tweaks on top of "protocol-v2", I was able to run our
full command and hosting (HTTP and SSH) test suites without issues.

diff --git a/transport.c b/transport.c
index c05e167..37b5d83 100644
--- a/transport.c
+++ b/transport.c
@@ -222,7 +222,8 @@ static struct ref *get_refs_via_connect(struct
transport *transport, int for_pus
        switch(determine_version(data->fd[0], &buf)) {
        case 2:
                /* The server understands Protocol v2 */
-               fprintf(stderr, "Server understands Protocol v2!\n");
+               if (transport->verbose >= 0)
+                       fprintf(stderr, "Server understands Protocol v2!\n");
                break;
        case 1:
                /* Server is speaking Protocol v1 and sent a ref so
process it */
diff --git a/upload-pack.c b/upload-pack.c
index 0f85315..7c59495 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1075,7 +1075,7 @@ int cmd_main(int argc, const char **argv)
        git_config(upload_pack_config, NULL);

        version = getenv("GIT_PROTOCOL");
-       if (!strcmp(version, "2"))
+       if (version && !strcmp(version, "2"))
                upload_pack_v2();

        upload_pack();

I'd imagine the "Server understands Protocol v2!" message won't
actually *ship* in Git, so it's likely making that honor "--quiet"
doesn't actually matter; I only adjusted it because we have a test
that verifies "git clone --quiet" is quiet. The "if (version" change I
commented on separately in the 7/7 patch that introduced the check.

Thanks again for publishing this, and for letting me test it!

>
> --
> Brandon Williams

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [RFC 7/7] upload-pack: ack version 2
  2017-09-01 22:02   ` Bryan Turner
@ 2017-09-01 23:20     ` Brandon Williams
  0 siblings, 0 replies; 31+ messages in thread
From: Brandon Williams @ 2017-09-01 23:20 UTC (permalink / raw)
  To: Bryan Turner
  Cc: Git Users, Jeff King, Jonathan Nieder, sbeller, Junio C Hamano,
	jonathantanmy

On 09/01, Bryan Turner wrote:
> On Thu, Aug 24, 2017 at 3:53 PM, Brandon Williams <bmwill@google.com> wrote:
> > +
> > +       version = getenv("GIT_PROTOCOL");
> > +       if (!strcmp(version, "2"))
> > +               upload_pack_v2();
> > +
> 
> I think the "if" check here needs some type of NULL check for
> "version" before calling "strcmp". Without that, if the "GIT_PROTOCOL"
> environment variable isn't set, git-upload-pack SEGVs.
> 
> This came up when I was testing the "protocol-v2" branch with
> Bitbucket Server. For performance reasons, we sometimes run ref
> advertisements as a separate process, when serving fetches, to allow
> throttling the ref advertisement separately from actually generating
> and sending a packfile. Since CI servers continuously poll for
> changes, but usually don't find any, we want to be able to serve ref
> advertisements, which have minimal overhead, with much higher
> parallelism than serving packs. To do that, we run "git-upload-pack
> --advertize-refs" directly, and that command has never *required*
> "GIT_PROTOCOL" before this change.

Thanks for pointing this out.  Since this was an RFC I wasn't being
careful with doing these sorts of checks :).  I'm currently working on
the non-RFC version of this series and it is getting close to being in a
state where I can send it out for more careful review.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2017-09-01 23:20 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-24 22:53 [RFC 0/7] transitioning to protocol v2 Brandon Williams
2017-08-24 22:53 ` [RFC 1/7] pkt-line: add packet_write function Brandon Williams
2017-08-24 22:53 ` [RFC 2/7] pkt-line: add strbuf_packet_read Brandon Williams
2017-08-24 22:53 ` [RFC 3/7] protocol: tell server that the client understands v2 Brandon Williams
2017-08-25 17:45   ` Junio C Hamano
2017-08-25 18:53     ` Brandon Williams
2017-08-25 18:55       ` Brandon Williams
2017-08-24 22:53 ` [RFC 4/7] t: fix ssh tests to cope with using '-o SendEnv=GIT_PROTOCOL' Brandon Williams
2017-08-24 22:53 ` [RFC 5/7] http: send Git-Protocol-Version header Brandon Williams
2017-08-30 10:55   ` Kevin Daudt
2017-08-24 22:53 ` [RFC 6/7] transport: teach client to recognize v2 server response Brandon Williams
2017-08-24 22:53 ` [RFC 7/7] upload-pack: ack version 2 Brandon Williams
2017-09-01 22:02   ` Bryan Turner
2017-09-01 23:20     ` Brandon Williams
2017-08-25  1:19 ` [RFC 0/7] transitioning to protocol v2 Junio C Hamano
2017-08-25 17:07   ` Stefan Beller
2017-08-25 17:14     ` Junio C Hamano
2017-08-25 17:36       ` Jeff King
2017-08-25 17:29 ` Jeff King
2017-08-25 17:35   ` Jonathan Nieder
2017-08-25 17:41     ` Jeff King
2017-08-25 18:50       ` Brandon Williams
2017-08-29 20:08     ` Jeff Hostetler
2017-08-29 21:10       ` Brandon Williams
2017-08-30  3:06       ` Jeff King
2017-08-30 13:30         ` Jeff Hostetler
2017-08-30 16:54           ` Brandon Williams
2017-08-25 17:48   ` Junio C Hamano
2017-08-30 20:38   ` Bryan Turner
2017-08-30 21:12     ` Brandon Williams
2017-09-01 23:06       ` Bryan Turner

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).