git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
* [RFC/WIP PATCH 00/11] Protocol version 2, again!
@ 2015-05-26 22:01 Stefan Beller
  2015-05-26 22:01 ` [RFC/WIP PATCH 01/11] upload-pack: make client capability parsing code a separate function Stefan Beller
                   ` (11 more replies)
  0 siblings, 12 replies; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, peff, gitster, Stefan Beller

    "Just give us something to play around with" - Peff at GitMerge 2015

This is another approach for updating the pack protocol of Git.
While in the previous attempts I tried to come up with the perfect
specification of the new protocol I realized that such an approach
doesn't lead anywhere. So this time actual working code is included!

The included code is not complete, but a minimal example of

    git config remote.origin.transportversion 2
    git fetch origin master # uses the new protocol!

works.

In a previous discussion[1] towards version 2 of the pack protocol I wanted
to come up with a protocol which even includes negotiating what is done
in the protocol exchange, such as having a command "push, fetch, ls-remote"
being part of the protocol. This is not a very good approach as it's too much
work to be done at the same time.

This patch series focusses on just the fetching side, so the first four patches
are teaching git-upload-pack about the new pack protocol. The next five patches
will teach git fetch and fetch-pack how to use the version 2 protocol. Then there
will be a small test and a documentation update.

The new protocol works just like the old protocol, except for
the capabilities negotiation being before any exchange of real data.

This solves the problem of having a first huge chunk of data (the refs
advertisement) sent over the wire and just realizing in between that we
don't need these data for that operation and no way to cancel.

By having a capabilites negotiation first the new protocol may be even
more future-proof than the current one, as the capabilites will hopefully
be kept small and all larger data transfers will get their own later stage
in the protocol.

To determine the protocol version we check the ending of the
invoked program for an appended version number to see which protocol
version we're using in an exchange. At first I thought we should append
a unique suffix instead of a version number to make a distinction easier
for the kind of protocol we want to talk (There may be the traditional
protocol with no suffix, or the "upload-pack-capabilities-first" protocol
which will transmit the capabilities first).

My preference for a string suffix instead of a sequential number is
motiviated by the discussion of sha1 vs sequential numbers to describe
a state of a repository. The main difference here is however how often
we expect changes. Version 1 of the protocol is current for 10 years by
now, so I do not changes to the protocol number often, which makes it
suitable for just having a counter appended to the binary.

The advantage of just a counting number is its small size,
and I think this advantage outweights the disadvantage
of having named protocol versions.

This series doesn't include an automatic upgrade of the protocol version
for later changes if the server supports it, so for now the use of the new
protocol needs to be activated manually via setting remote.origin.transportversion.

Any comments welcome!

Thanks,
Stefan

[1] http://permalink.gmane.org/gmane.comp.version-control.git/267572

Nguyễn Thái Ngọc Duy (2):
  upload-pack: make client capability parsing code a separate function
  upload-pack: only accept capabilities on the first "want" line

Stefan Beller (9):
  upload-pack: move capabilities out of send_ref
  upload-pack-2: Implement the version 2 of upload-pack
  transport: add infrastructure to support a protocol version number
  remote.h: add get_remote_capabilities, request_capabilities
  fetch-pack: use the configured transport protocol
  transport: connect_setup appends protocol version number
  transport: get_refs_via_connect exchanges capabilities before refs.
  t5544: add a test case for the new protocol
  Document protocol version 2

 .gitignore                                        |   1 +
 Documentation/technical/pack-protocol.txt         |  86 ++++++++++++--
 Documentation/technical/protocol-capabilities.txt |  15 ---
 Makefile                                          |   2 +
 builtin/fetch-pack.c                              |  17 ++-
 builtin/fetch.c                                   |   6 +
 connect.c                                         |  48 +++++++-
 fetch-pack.h                                      |   1 +
 remote.c                                          |   2 +
 remote.h                                          |   5 +
 t/t5544-fetch-2.sh                                |  40 +++++++
 transport-helper.c                                |   1 +
 transport.c                                       |  60 +++++++++-
 transport.h                                       |   4 +
 upload-pack-2.c                                   |   1 +
 upload-pack.c                                     | 138 +++++++++++++++++-----
 16 files changed, 366 insertions(+), 61 deletions(-)
 create mode 100755 t/t5544-fetch-2.sh
 create mode 120000 upload-pack-2.c

-- 
2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [RFC/WIP PATCH 01/11] upload-pack: make client capability parsing code a separate function
  2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
@ 2015-05-26 22:01 ` Stefan Beller
  2015-05-26 22:01 ` [RFC/WIP PATCH 02/11] upload-pack: only accept capabilities on the first "want" line Stefan Beller
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, peff, gitster, Stefan Beller

From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---
 upload-pack.c | 44 +++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 745fda8..5449ff7 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -531,6 +531,28 @@ error:
 	}
 }
 
+static void parse_features(const char *features)
+{
+	if (parse_feature_request(features, "multi_ack_detailed"))
+		multi_ack = 2;
+	else if (parse_feature_request(features, "multi_ack"))
+		multi_ack = 1;
+	if (parse_feature_request(features, "no-done"))
+		no_done = 1;
+	if (parse_feature_request(features, "thin-pack"))
+		use_thin_pack = 1;
+	if (parse_feature_request(features, "ofs-delta"))
+		use_ofs_delta = 1;
+	if (parse_feature_request(features, "side-band-64k"))
+		use_sideband = LARGE_PACKET_MAX;
+	else if (parse_feature_request(features, "side-band"))
+		use_sideband = DEFAULT_PACKET_MAX;
+	if (parse_feature_request(features, "no-progress"))
+		no_progress = 1;
+	if (parse_feature_request(features, "include-tag"))
+		use_include_tag = 1;
+}
+
 static void receive_needs(void)
 {
 	struct object_array shallows = OBJECT_ARRAY_INIT;
@@ -540,7 +562,6 @@ static void receive_needs(void)
 	shallow_nr = 0;
 	for (;;) {
 		struct object *o;
-		const char *features;
 		unsigned char sha1_buf[20];
 		char *line = packet_read_line(0, NULL);
 		reset_timeout();
@@ -575,26 +596,7 @@ static void receive_needs(void)
 			die("git upload-pack: protocol error, "
 			    "expected to get sha, not '%s'", line);
 
-		features = line + 45;
-
-		if (parse_feature_request(features, "multi_ack_detailed"))
-			multi_ack = 2;
-		else if (parse_feature_request(features, "multi_ack"))
-			multi_ack = 1;
-		if (parse_feature_request(features, "no-done"))
-			no_done = 1;
-		if (parse_feature_request(features, "thin-pack"))
-			use_thin_pack = 1;
-		if (parse_feature_request(features, "ofs-delta"))
-			use_ofs_delta = 1;
-		if (parse_feature_request(features, "side-band-64k"))
-			use_sideband = LARGE_PACKET_MAX;
-		else if (parse_feature_request(features, "side-band"))
-			use_sideband = DEFAULT_PACKET_MAX;
-		if (parse_feature_request(features, "no-progress"))
-			no_progress = 1;
-		if (parse_feature_request(features, "include-tag"))
-			use_include_tag = 1;
+		parse_features(line + 45);
 
 		o = parse_object(sha1_buf);
 		if (!o)
-- 
2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [RFC/WIP PATCH 02/11] upload-pack: only accept capabilities on the first "want" line
  2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
  2015-05-26 22:01 ` [RFC/WIP PATCH 01/11] upload-pack: make client capability parsing code a separate function Stefan Beller
@ 2015-05-26 22:01 ` Stefan Beller
  2015-05-26 22:17   ` Junio C Hamano
  2015-05-26 22:01 ` [RFC/WIP PATCH 03/11] upload-pack: move capabilities out of send_ref Stefan Beller
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, peff, gitster, Stefan Beller

From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

pack-protocol.txt says so and fetch-pack also follows it even though
upload-pack is a bit lax. Fix it.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---
 upload-pack.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/upload-pack.c b/upload-pack.c
index 5449ff7..6734777 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -558,6 +558,7 @@ static void receive_needs(void)
 	struct object_array shallows = OBJECT_ARRAY_INIT;
 	int depth = 0;
 	int has_non_tip = 0;
+	int first_want = 1;
 
 	shallow_nr = 0;
 	for (;;) {
@@ -596,7 +597,11 @@ static void receive_needs(void)
 			die("git upload-pack: protocol error, "
 			    "expected to get sha, not '%s'", line);
 
-		parse_features(line + 45);
+		if (first_want) {
+			parse_features(line + 45);
+			first_want = 0;
+		} else if (line[45])
+			die("garbage at the end of 'want' line %s", line + 45);
 
 		o = parse_object(sha1_buf);
 		if (!o)
-- 
2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [RFC/WIP PATCH 03/11] upload-pack: move capabilities out of send_ref
  2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
  2015-05-26 22:01 ` [RFC/WIP PATCH 01/11] upload-pack: make client capability parsing code a separate function Stefan Beller
  2015-05-26 22:01 ` [RFC/WIP PATCH 02/11] upload-pack: only accept capabilities on the first "want" line Stefan Beller
@ 2015-05-26 22:01 ` Stefan Beller
  2015-05-26 22:01 ` [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack Stefan Beller
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, peff, gitster, Stefan Beller

This will make it easier to reuse the capabilities in a later patch.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 upload-pack.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 6734777..a5f75b7 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -716,33 +716,35 @@ static void format_symref_info(struct strbuf *buf, struct string_list *symref)
 		strbuf_addf(buf, " symref=%s:%s", item->string, (char *)item->util);
 }
 
-static int send_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
-{
-	static const char *capabilities = "multi_ack thin-pack side-band"
+static const char *advertise_capabilities = "multi_ack thin-pack side-band"
 		" side-band-64k ofs-delta shallow no-progress"
 		" include-tag multi_ack_detailed";
+
+static int send_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
+{
+
 	const char *refname_nons = strip_namespace(refname);
 	unsigned char peeled[20];
 
 	if (mark_our_ref(refname, sha1))
 		return 0;
 
-	if (capabilities) {
+	if (advertise_capabilities) {
 		struct strbuf symref_info = STRBUF_INIT;
 
 		format_symref_info(&symref_info, cb_data);
 		packet_write(1, "%s %s%c%s%s%s%s agent=%s\n",
 			     sha1_to_hex(sha1), refname_nons,
-			     0, capabilities,
+			     0, advertise_capabilities,
 			     allow_tip_sha1_in_want ? " allow-tip-sha1-in-want" : "",
 			     stateless_rpc ? " no-done" : "",
 			     symref_info.buf,
 			     git_user_agent_sanitized());
 		strbuf_release(&symref_info);
+		advertise_capabilities = NULL;
 	} else {
 		packet_write(1, "%s %s\n", sha1_to_hex(sha1), refname_nons);
 	}
-	capabilities = NULL;
 	if (!peel_ref(refname, peeled))
 		packet_write(1, "%s %s^{}\n", sha1_to_hex(peeled), refname_nons);
 	return 0;
-- 
2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack
  2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
                   ` (2 preceding siblings ...)
  2015-05-26 22:01 ` [RFC/WIP PATCH 03/11] upload-pack: move capabilities out of send_ref Stefan Beller
@ 2015-05-26 22:01 ` Stefan Beller
  2015-05-27  2:30   ` Eric Sunshine
  2015-05-27  6:35   ` Jeff King
  2015-05-26 22:01 ` [RFC/WIP PATCH 05/11] transport: add infrastructure to support a protocol version number Stefan Beller
                   ` (7 subsequent siblings)
  11 siblings, 2 replies; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, peff, gitster, Stefan Beller

In upload-pack-2 we send each capability in its own packet.
By reusing the advertise_capabilities and eventually setting it to
NULL we will be able to reuse the methods for refs advertisement.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 .gitignore      |  1 +
 Makefile        |  2 ++
 upload-pack-2.c |  1 +
 upload-pack.c   | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 4 files changed, 78 insertions(+), 3 deletions(-)
 create mode 120000 upload-pack-2.c

diff --git a/.gitignore b/.gitignore
index a052419..a3c8ab9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -165,6 +165,7 @@
 /git-update-server-info
 /git-upload-archive
 /git-upload-pack
+/git-upload-pack-2
 /git-var
 /git-verify-commit
 /git-verify-pack
diff --git a/Makefile b/Makefile
index 25a453b..0f3ee41 100644
--- a/Makefile
+++ b/Makefile
@@ -560,6 +560,7 @@ PROGRAM_OBJS += sh-i18n--envsubst.o
 PROGRAM_OBJS += shell.o
 PROGRAM_OBJS += show-index.o
 PROGRAM_OBJS += upload-pack.o
+PROGRAM_OBJS += upload-pack-2.o
 PROGRAM_OBJS += remote-testsvn.o
 
 # Binary suffix, set to .exe for Windows builds
@@ -625,6 +626,7 @@ OTHER_PROGRAMS = git$X
 # what test wrappers are needed and 'install' will install, in bindir
 BINDIR_PROGRAMS_NEED_X += git
 BINDIR_PROGRAMS_NEED_X += git-upload-pack
+BINDIR_PROGRAMS_NEED_X += git-upload-pack-2
 BINDIR_PROGRAMS_NEED_X += git-receive-pack
 BINDIR_PROGRAMS_NEED_X += git-upload-archive
 BINDIR_PROGRAMS_NEED_X += git-shell
diff --git a/upload-pack-2.c b/upload-pack-2.c
new file mode 120000
index 0000000..e30a871
--- /dev/null
+++ b/upload-pack-2.c
@@ -0,0 +1 @@
+upload-pack.c
\ No newline at end of file
diff --git a/upload-pack.c b/upload-pack.c
index a5f75b7..84f9ee3 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -716,10 +716,47 @@ static void format_symref_info(struct strbuf *buf, struct string_list *symref)
 		strbuf_addf(buf, " symref=%s:%s", item->string, (char *)item->util);
 }
 
-static const char *advertise_capabilities = "multi_ack thin-pack side-band"
+static char *advertise_capabilities = "multi_ack thin-pack side-band"
 		" side-band-64k ofs-delta shallow no-progress"
 		" include-tag multi_ack_detailed";
 
+/*
+ * Reads the next capability and puts it into dst as a null terminated string.
+ * Returns true if more capabilities can be read.
+ * */
+static int next_capability(char *dst)
+{
+	int len = 0;
+	if (!*advertise_capabilities) {
+		/* make sure to not advertise capabilities afterwards */
+		advertise_capabilities = NULL;
+		return 0;
+	}
+
+	while (advertise_capabilities[len] != '\0' &&
+	       advertise_capabilities[len] != ' ')
+		len ++;
+	strncpy(dst, advertise_capabilities, len);
+	dst[len] = '\0';
+
+	advertise_capabilities += len;
+	if (*advertise_capabilities == ' ')
+		advertise_capabilities++;
+
+	return 1;
+}
+
+static void send_capabilities(void)
+{
+	char buf[100];
+
+	while (next_capability(buf))
+		packet_write(1, "capability:%s\n", buf);
+
+	packet_write(1, "agent:%s\n", git_user_agent_sanitized());
+	packet_flush(1);
+}
+
 static int send_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
 {
 
@@ -794,6 +831,28 @@ static void upload_pack(void)
 	}
 }
 
+static void receive_capabilities(void)
+{
+	int done = 0;
+	while (1) {
+		char *line = packet_read_line(0, NULL);
+		if (!line)
+			break;
+		if (starts_with(line, "capability:"))
+			parse_features(line + strlen("capability:"));
+	}
+}
+
+static void upload_pack_version_2(void)
+{
+	send_capabilities();
+	receive_capabilities();
+
+	/* The rest of the protocol stays the same, capabilities advertising
+	   is disabled though. */
+	upload_pack();
+}
+
 static int upload_pack_config(const char *var, const char *value, void *unused)
 {
 	if (!strcmp("uploadpack.allowtipsha1inwant", var))
@@ -806,16 +865,24 @@ static int upload_pack_config(const char *var, const char *value, void *unused)
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
+static int endswith(const char *teststring, const char *ending)
+{
+	int slen = strlen(teststring);
+	int elen = strlen(ending);
+	return !strcmp(teststring + slen - elen, ending);
+}
+
 int main(int argc, char **argv)
 {
 	char *dir;
+	const char *cmd;
 	int i;
 	int strict = 0;
 
 	git_setup_gettext();
 
 	packet_trace_identity("upload-pack");
-	git_extract_argv0_path(argv[0]);
+	cmd = git_extract_argv0_path(argv[0]);
 	check_replace_refs = 0;
 
 	for (i = 1; i < argc; i++) {
@@ -857,6 +924,10 @@ int main(int argc, char **argv)
 		die("'%s' does not appear to be a git repository", dir);
 
 	git_config(upload_pack_config, NULL);
-	upload_pack();
+
+	if (endswith(cmd, "-2"))
+		upload_pack_version_2();
+	else
+		upload_pack();
 	return 0;
 }
-- 
2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [RFC/WIP PATCH 05/11] transport: add infrastructure to support a protocol version number
  2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
                   ` (3 preceding siblings ...)
  2015-05-26 22:01 ` [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack Stefan Beller
@ 2015-05-26 22:01 ` Stefan Beller
  2015-05-27  6:39   ` Jeff King
  2015-05-26 22:01 ` [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities Stefan Beller
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, peff, gitster, Stefan Beller

The transport version set via command line argument in
git fetch takes precedence over the configured version.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/fetch.c    |  6 ++++++
 remote.c           |  2 ++
 remote.h           |  2 ++
 transport-helper.c |  1 +
 transport.c        | 13 +++++++++++++
 transport.h        |  4 ++++
 6 files changed, 28 insertions(+)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7910419..d2e1828 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -46,6 +46,7 @@ static const char *recurse_submodules_default;
 static int shown_url = 0;
 static int refmap_alloc, refmap_nr;
 static const char **refmap_array;
+static const char *transport_version;
 
 static int option_parse_recurse_submodules(const struct option *opt,
 				   const char *arg, int unset)
@@ -121,6 +122,9 @@ static struct option builtin_fetch_options[] = {
 		   N_("default mode for recursion"), PARSE_OPT_HIDDEN },
 	OPT_BOOL(0, "update-shallow", &update_shallow,
 		 N_("accept refs that update .git/shallow")),
+	OPT_STRING('y', "transport-version", &transport_version,
+		   N_("transport-version"),
+		   N_("specify transport version to be used")),
 	{ OPTION_CALLBACK, 0, "refmap", NULL, N_("refmap"),
 	  N_("specify fetch refmap"), PARSE_OPT_NONEG, parse_refmap_arg },
 	OPT_END()
@@ -848,6 +852,8 @@ static struct transport *prepare_transport(struct remote *remote)
 	struct transport *transport;
 	transport = transport_get(remote, NULL);
 	transport_set_verbosity(transport, verbosity, progress);
+	if (transport_version)
+		set_option(transport, TRANS_OPT_TRANSPORTVERSION, transport_version);
 	if (upload_pack)
 		set_option(transport, TRANS_OPT_UPLOADPACK, upload_pack);
 	if (keep)
diff --git a/remote.c b/remote.c
index 68901b0..2914d9d 100644
--- a/remote.c
+++ b/remote.c
@@ -476,6 +476,8 @@ static int handle_config(const char *key, const char *value, void *cb)
 					 key, value);
 	} else if (!strcmp(subkey, ".vcs")) {
 		return git_config_string(&remote->foreign_vcs, key, value);
+	} else if (!strcmp(subkey, ".transportversion")) {
+		remote->transport_version = git_config_int(key, value);
 	}
 	return 0;
 }
diff --git a/remote.h b/remote.h
index 02d66ce..04e2310 100644
--- a/remote.h
+++ b/remote.h
@@ -50,6 +50,8 @@ struct remote {
 	const char *receivepack;
 	const char *uploadpack;
 
+	int transport_version;
+
 	/*
 	 * for curl remotes only
 	 */
diff --git a/transport-helper.c b/transport-helper.c
index 5d99a6b..ab3cd5b 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -247,6 +247,7 @@ static int disconnect_helper(struct transport *transport)
 }
 
 static const char *unsupported_options[] = {
+	TRANS_OPT_TRANSPORTVERSION,
 	TRANS_OPT_UPLOADPACK,
 	TRANS_OPT_RECEIVEPACK,
 	TRANS_OPT_THIN,
diff --git a/transport.c b/transport.c
index f080e93..3ef15f6 100644
--- a/transport.c
+++ b/transport.c
@@ -479,6 +479,16 @@ static int set_git_option(struct git_transport_options *opts,
 	} else if (!strcmp(name, TRANS_OPT_PUSH_CERT)) {
 		opts->push_cert = !!value;
 		return 0;
+	} else if (!strcmp(name, TRANS_OPT_TRANSPORTVERSION)) {
+		if (!value)
+			opts->transport_version = 0;
+		else {
+			char *end;
+			opts->transport_version = strtol(value, &end, 0);
+			if (*end)
+				die("transport: invalid transport version option '%s'", value);
+		}
+		return 0;
 	}
 	return 1;
 }
@@ -988,6 +998,9 @@ struct transport *transport_get(struct remote *remote, const char *url)
 		ret->smart_options->receivepack = "git-receive-pack";
 		if (remote->receivepack)
 			ret->smart_options->receivepack = remote->receivepack;
+		if (remote->transport_version)
+			ret->smart_options->transport_version =
+				remote->transport_version;
 	}
 
 	return ret;
diff --git a/transport.h b/transport.h
index 18d2cf8..e07d356 100644
--- a/transport.h
+++ b/transport.h
@@ -14,6 +14,7 @@ struct git_transport_options {
 	unsigned update_shallow : 1;
 	unsigned push_cert : 1;
 	int depth;
+	int transport_version;
 	const char *uploadpack;
 	const char *receivepack;
 	struct push_cas_option *cas;
@@ -162,6 +163,9 @@ struct transport *transport_get(struct remote *, const char *);
 /* Send push certificates */
 #define TRANS_OPT_PUSH_CERT "pushcert"
 
+/* Use a new version of the git protocol */
+#define TRANS_OPT_TRANSPORTVERSION "transportversion"
+
 /**
  * Returns 0 if the option was used, non-zero otherwise. Prints a
  * message to stderr if the option is not used.
-- 
2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities
  2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
                   ` (4 preceding siblings ...)
  2015-05-26 22:01 ` [RFC/WIP PATCH 05/11] transport: add infrastructure to support a protocol version number Stefan Beller
@ 2015-05-26 22:01 ` Stefan Beller
  2015-05-27  3:25   ` Eric Sunshine
  2015-05-27  6:45   ` Jeff King
  2015-05-26 22:01 ` [RFC/WIP PATCH 07/11] fetch-pack: use the configured transport protocol Stefan Beller
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, peff, gitster, Stefan Beller

Instead of calling get_remote_heads as a first command during the
protocol exchange, we need to have fine grained control over the
capability negotiation in version 2 of the protocol.

Introduce get_remote_capabilities, which will just listen to
capabilities of the remote and request_capabilities which will
tell the selection of capabilities to the remote.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 connect.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++-
 remote.h  |  3 +++
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/connect.c b/connect.c
index c0144d8..bb17618 100644
--- a/connect.c
+++ b/connect.c
@@ -105,8 +105,54 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
+void get_remote_capabilities(int in, char *src_buf, size_t src_len)
+{
+	struct strbuf capabilities_string = STRBUF_INIT;
+	for (;;) {
+		int len;
+		char *line = packet_buffer;
+		const char *arg;
+
+		len = packet_read(in, &src_buf, &src_len,
+				  packet_buffer, sizeof(packet_buffer),
+				  PACKET_READ_GENTLE_ON_EOF |
+				  PACKET_READ_CHOMP_NEWLINE);
+		if (len < 0)
+			die_initial_contact(0);
+
+		if (!len)
+			break;
+
+		if (len > 4 && skip_prefix(line, "ERR ", &arg))
+			die("remote error: %s", arg);
+
+		if (starts_with(line, "capability:")) {
+			strbuf_addstr(&capabilities_string, line + strlen("capability:"));
+			strbuf_addch(&capabilities_string, ' ');
+		}
+	}
+	free(server_capabilities);
+	server_capabilities = xmalloc(capabilities_string.len + 1);
+	server_capabilities = strbuf_detach(&capabilities_string, NULL);
+
+	strbuf_release(&capabilities_string);
+}
+
+int request_capabilities(int out)
+{
+	fprintf(stderr, "request_capabilities\n");
+	// todo: send our capabilities
+	packet_write(out, "capability:foo");
+	packet_flush(out);
+}
+
 /*
- * Read all the refs from the other end
+ * Read all the refs from the other end,
+ * in is a file descriptor input stream
+ * src_buf and src_len may be an alternative way to specify the input.
+ * list is the output of refs
+ * extra_have is a list to store information learned about sha1 the other side has.
+ * shallow_points
  */
 struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 			      struct ref **list, unsigned int flags,
diff --git a/remote.h b/remote.h
index 04e2310..7381ddf 100644
--- a/remote.h
+++ b/remote.h
@@ -146,6 +146,9 @@ int check_ref_type(const struct ref *ref, int flags);
 void free_refs(struct ref *ref);
 
 struct sha1_array;
+
+extern void get_remote_capabilities(int in, char *src_buf, size_t src_len);
+extern int request_capabilities(int out);
 extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 				     struct ref **list, unsigned int flags,
 				     struct sha1_array *extra_have,
-- 
2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [RFC/WIP PATCH 07/11] fetch-pack: use the configured transport protocol
  2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
                   ` (5 preceding siblings ...)
  2015-05-26 22:01 ` [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities Stefan Beller
@ 2015-05-26 22:01 ` Stefan Beller
  2015-05-26 22:19   ` Junio C Hamano
  2015-05-27  6:53   ` Jeff King
  2015-05-26 22:01 ` [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number Stefan Beller
                   ` (4 subsequent siblings)
  11 siblings, 2 replies; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, peff, gitster, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 builtin/fetch-pack.c | 17 ++++++++++++++++-
 fetch-pack.h         |  1 +
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 4a6b340..32dc8b0 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -127,6 +127,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 			args.update_shallow = 1;
 			continue;
 		}
+		if (!strcmp("--transport-version", arg)) {
+			args.version = strtol(arg + strlen("--transport-version"), NULL, 0);
+			continue;
+		}
 		usage(fetch_pack_usage);
 	}
 
@@ -175,7 +179,18 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 		if (!conn)
 			return args.diag_url ? 0 : 1;
 	}
-	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
+
+	switch (args.version) {
+	default:
+	case 2:
+		get_remote_capabilities(fd[0], NULL, 0);
+		request_capabilities(fd[1]);
+		break;
+	case 1: /* fall through */
+	case 0:
+		get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
+		break;
+	}
 
 	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
 			 &shallow, pack_lockfile_ptr);
diff --git a/fetch-pack.h b/fetch-pack.h
index bb7fd76..b48b4f5 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -10,6 +10,7 @@ struct fetch_pack_args {
 	const char *uploadpack;
 	int unpacklimit;
 	int depth;
+	int version;
 	unsigned quiet:1;
 	unsigned keep_pack:1;
 	unsigned lock_pack:1;
-- 
2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number
  2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
                   ` (6 preceding siblings ...)
  2015-05-26 22:01 ` [RFC/WIP PATCH 07/11] fetch-pack: use the configured transport protocol Stefan Beller
@ 2015-05-26 22:01 ` Stefan Beller
  2015-05-26 22:21   ` Junio C Hamano
                     ` (2 more replies)
  2015-05-26 22:01 ` [RFC/WIP PATCH 09/11] transport: get_refs_via_connect exchanges capabilities before refs Stefan Beller
                   ` (3 subsequent siblings)
  11 siblings, 3 replies; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, peff, gitster, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 transport.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/transport.c b/transport.c
index 3ef15f6..33644a6 100644
--- a/transport.c
+++ b/transport.c
@@ -496,15 +496,29 @@ static int set_git_option(struct git_transport_options *opts,
 static int connect_setup(struct transport *transport, int for_push, int verbose)
 {
 	struct git_transport_data *data = transport->data;
+	const char *remote_program;
+	char *buf = 0;
 
 	if (data->conn)
 		return 0;
 
+	remote_program = (for_push ? data->options.receivepack
+				   : data->options.uploadpack);
+
+	if (transport->smart_options
+	    && transport->smart_options->transport_version) {
+		buf = xmalloc(strlen(remote_program) + 12);
+		sprintf(buf, "%s-%d", remote_program,
+			transport->smart_options->transport_version);
+		remote_program = buf;
+	}
+
 	data->conn = git_connect(data->fd, transport->url,
-				 for_push ? data->options.receivepack :
-				 data->options.uploadpack,
+				 remote_program,
 				 verbose ? CONNECT_VERBOSE : 0);
 
+	free(buf);
+
 	return 0;
 }
 
-- 
2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [RFC/WIP PATCH 09/11] transport: get_refs_via_connect exchanges capabilities before refs.
  2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
                   ` (7 preceding siblings ...)
  2015-05-26 22:01 ` [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number Stefan Beller
@ 2015-05-26 22:01 ` Stefan Beller
  2015-05-27  5:37   ` Eric Sunshine
  2015-05-27  7:06   ` Jeff King
  2015-05-26 22:01 ` [RFC/WIP PATCH 10/11] t5544: add a test case for the new protocol Stefan Beller
                   ` (2 subsequent siblings)
  11 siblings, 2 replies; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, peff, gitster, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 transport.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/transport.c b/transport.c
index 33644a6..1cd9b77 100644
--- a/transport.c
+++ b/transport.c
@@ -526,12 +526,33 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 {
 	struct git_transport_data *data = transport->data;
 	struct ref *refs;
+	int version = 0;
 
+	if (transport->smart_options)
+		version = transport->smart_options->transport_version;
 	connect_setup(transport, for_push, 0);
-	get_remote_heads(data->fd[0], NULL, 0, &refs,
-			 for_push ? REF_NORMAL : 0,
-			 &data->extra_have,
-			 &data->shallow);
+	switch (version) {
+		default: /*
+			  * Configured a protocol version > 2?
+			  * Try version 2 as it's the most future proof.
+			  */
+			/* fall through */
+		case 2: /* first talk about capabilities, then get the heads */
+			get_remote_capabilities(data->fd[0], NULL, 0);
+			request_capabilities(data->fd[1]);
+			get_remote_heads(data->fd[0], NULL, 0, &refs,
+					 for_push ? REF_NORMAL : 0,
+					 &data->extra_have,
+					 &data->shallow);
+			break;
+		case 1: /* configured version 1, fall through */
+		case 0: /* unconfigured, use first protocol */
+			get_remote_heads(data->fd[0], NULL, 0, &refs,
+					 for_push ? REF_NORMAL : 0,
+					 &data->extra_have,
+					 &data->shallow);
+			break;
+	}
 	data->got_remote_heads = 1;
 
 	return refs;
-- 
2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [RFC/WIP PATCH 10/11] t5544: add a test case for the new protocol
  2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
                   ` (8 preceding siblings ...)
  2015-05-26 22:01 ` [RFC/WIP PATCH 09/11] transport: get_refs_via_connect exchanges capabilities before refs Stefan Beller
@ 2015-05-26 22:01 ` Stefan Beller
  2015-05-27  5:34   ` Eric Sunshine
  2015-05-27  7:12   ` Jeff King
  2015-05-26 22:01 ` [RFC/WIP PATCH 11/11] Document protocol version 2 Stefan Beller
  2015-05-27  6:18 ` [RFC/WIP PATCH 00/11] Protocol version 2, again! Jeff King
  11 siblings, 2 replies; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, peff, gitster, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 t/t5544-fetch-2.sh | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100755 t/t5544-fetch-2.sh

diff --git a/t/t5544-fetch-2.sh b/t/t5544-fetch-2.sh
new file mode 100755
index 0000000..beee46c
--- /dev/null
+++ b/t/t5544-fetch-2.sh
@@ -0,0 +1,40 @@
+#!/bin/sh
+#
+# Copyright (c) 2015 Stefan Beller
+#
+
+test_description='Testing version 2 of the fetch protocol'
+
+. ./test-lib.sh
+
+mk_repo_pair () {
+	rm -rf client server &&
+	test_create_repo client &&
+	test_create_repo server &&
+	(
+		cd server &&
+		git config receive.denyCurrentBranch warn
+	) &&
+	(
+		cd client &&
+		git remote add origin ../server
+		git config remote.origin.transportversion 2
+	)
+}
+
+test_expect_success 'setup' '
+	mk_repo_pair &&
+	(
+		cd server &&
+		test_commit one
+	) &&
+	(
+		cd client &&
+		git fetch origin master
+	)
+'
+
+# More to come here, similar to t5515 having a sub directory full of expected
+# data going over the wire.
+
+test_done
-- 
2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [RFC/WIP PATCH 11/11] Document protocol version 2
  2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
                   ` (9 preceding siblings ...)
  2015-05-26 22:01 ` [RFC/WIP PATCH 10/11] t5544: add a test case for the new protocol Stefan Beller
@ 2015-05-26 22:01 ` Stefan Beller
  2015-05-29 20:35   ` Junio C Hamano
  2015-05-27  6:18 ` [RFC/WIP PATCH 00/11] Protocol version 2, again! Jeff King
  11 siblings, 1 reply; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:01 UTC (permalink / raw)
  To: git; +Cc: pclouds, peff, gitster, Stefan Beller

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/technical/pack-protocol.txt         | 86 ++++++++++++++++++++---
 Documentation/technical/protocol-capabilities.txt | 15 ----
 2 files changed, 77 insertions(+), 24 deletions(-)

diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index fc09c63..4e1c82e 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -1,11 +1,11 @@
 Packfile transfer protocols
 ===========================
 
-Git supports transferring data in packfiles over the ssh://, git:// and
+Git supports transferring data in packfiles over the ssh://, git://, http:// and
 file:// transports.  There exist two sets of protocols, one for pushing
 data from a client to a server and another for fetching data from a
-server to a client.  All three transports (ssh, git, file) use the same
-protocol to transfer data.
+server to a client.  The three transports (ssh, git, file) use the same
+protocol to transfer data. http is documented in http-protocol.txt.
 
 The processes invoked in the canonical Git implementation are 'upload-pack'
 on the server side and 'fetch-pack' on the client side for fetching data;
@@ -14,6 +14,12 @@ data.  The protocol functions to have a server tell a client what is
 currently on the server, then for the two to negotiate the smallest amount
 of data to send in order to fully update one or the other.
 
+"upload-pack-2" and "receive-pack-2" are the next generation of
+"upload-pack" and "receive-pack" respectively. The first two are
+referred as "version 2" in this document and pack-capabilities.txt
+while the last two are "version 1". Unless stated otherwise, "version 1"
+is implied.
+
 Transports
 ----------
 There are three transports over which the packfile protocol is
@@ -42,7 +48,8 @@ hostname parameter, terminated by a NUL byte.
 
 --
    git-proto-request = request-command SP pathname NUL [ host-parameter NUL ]
-   request-command   = "git-upload-pack" / "git-receive-pack" /
+   request-command   = "git-upload-pack" / "git-upload-pack-2" /
+		       "git-receive-pack" / "git-receive-pack-2" /
 		       "git-upload-archive"   ; case sensitive
    pathname          = *( %x01-ff ) ; exclude NUL
    host-parameter    = "host=" hostname [ ":" port ]
@@ -124,6 +131,48 @@ has, the first can 'fetch' from the second.  This operation determines
 what data the server has that the client does not then streams that
 data down to the client in packfile format.
 
+Capability discovery (v2)
+-------------------------
+
+In version 1, capability discovery is part of reference discovery and
+covered in reference discovery section.
+
+In version 2, when the client initially connects, the server
+immediately sends its capabilities to the client followed by a flush.
+Then the client must send the list of server capabilities it wants to
+use to the server.
+
+   S: 00XXcapability:lang\n
+   S: 00XXcapability:thin-pack\n
+   S: 00XXcapability:ofs-delta\n
+   S: 00XXagent:agent=git/2:3.4.5+custom-739-gb850f98\n
+   S: 0000
+
+   C: 00XXcap:thin-pack\n
+   C: 00XXcap:ofs-delta\n
+   C: 00XXcap:lang=en\n
+   C: 00XXagent:agent=git/custom_string\n
+   C: 0000
+
+----
+  capability-list  =  *(capability) [agent LF] flush-pkt
+  capability       =  PKT-LINE("capability:" keyvaluepair LF)
+  agent            =  keyvaluepair LF
+  keyvaluepair     =  1*(LC_ALPHA / DIGIT / "-" / "_" / "=")
+  LC_ALPHA         =  %x61-7A
+----
+
+The client MUST ignore any data on pkt-lines starting with anything
+different than "capability" for future ease of extension.
+
+The client MUST NOT ask for capabilities the server did not say it
+supports. The server MUST diagnose and abort if capabilities it does
+not understand was requested. The server MUST NOT ignore capabilities
+that client requested and server advertised.  As a consequence of these
+rules, server MUST NOT advertise capabilities it does not understand.
+
+See protocol-capabilities.txt for a list of allowed server and client
+capabilities and descriptions.
 
 Reference Discovery
 -------------------
@@ -154,10 +203,14 @@ If HEAD is a valid ref, HEAD MUST appear as the first advertised
 ref.  If HEAD is not a valid ref, HEAD MUST NOT appear in the
 advertisement list at all, but other refs may still appear.
 
-The stream MUST include capability declarations behind a NUL on the
-first ref. The peeled value of a ref (that is "ref^{}") MUST be
-immediately after the ref itself, if presented. A conforming server
-MUST peel the ref if it's an annotated tag.
+In version 1 the stream MUST include capability declarations behind
+a NUL on the first ref. The peeled value of a ref (that is "ref^{}")
+MUST be immediately after the ref itself, if presented. A conforming
+server MUST peel the ref if it's an annotated tag.
+
+In version 2 the capabilities are already negotiated, so the first ref
+MUST NOT be followed by any capability advertisement, but it should be
+treated as any other refs advertising line.
 
 ----
   advertised-refs  =  (no-refs / list-of-refs)
@@ -178,13 +231,28 @@ MUST peel the ref if it's an annotated tag.
   shallow          =  PKT-LINE("shallow" SP obj-id)
 
   capability-list  =  capability *(SP capability)
-  capability       =  1*(LC_ALPHA / DIGIT / "-" / "_")
+  capability       =  1*(LC_ALPHA / DIGIT / "-" / "_" / "=")
   LC_ALPHA         =  %x61-7A
 ----
 
 Server and client MUST use lowercase for obj-id, both MUST treat obj-id
 as case-insensitive.
 
+On the very first line of the initial server response of either
+receive-pack and upload-pack the first reference is followed by
+a NUL byte and then a list of space delimited server capabilities.
+These allow the server to declare what it can and cannot support
+to the client.
+
+Client will then send a space separated list of capabilities it wants
+to be in effect. The client MUST NOT ask for capabilities the server
+did not say it supports.
+
+Server MUST diagnose and abort if capabilities it does not understand
+was sent.  Server MUST NOT ignore capabilities that client requested
+and server advertised.  As a consequence of these rules, server MUST
+NOT advertise capabilities it does not understand.
+
 See protocol-capabilities.txt for a list of allowed server capabilities
 and descriptions.
 
diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt
index 4f8a7bf..a6241d8 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -3,21 +3,6 @@ Git Protocol Capabilities
 
 Servers SHOULD support all capabilities defined in this document.
 
-On the very first line of the initial server response of either
-receive-pack and upload-pack the first reference is followed by
-a NUL byte and then a list of space delimited server capabilities.
-These allow the server to declare what it can and cannot support
-to the client.
-
-Client will then send a space separated list of capabilities it wants
-to be in effect. The client MUST NOT ask for capabilities the server
-did not say it supports.
-
-Server MUST diagnose and abort if capabilities it does not understand
-was sent.  Server MUST NOT ignore capabilities that client requested
-and server advertised.  As a consequence of these rules, server MUST
-NOT advertise capabilities it does not understand.
-
 The 'atomic', 'report-status', 'delete-refs', 'quiet', and 'push-cert'
 capabilities are sent and recognized by the receive-pack (push to server)
 process.
-- 
2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 02/11] upload-pack: only accept capabilities on the first "want" line
  2015-05-26 22:01 ` [RFC/WIP PATCH 02/11] upload-pack: only accept capabilities on the first "want" line Stefan Beller
@ 2015-05-26 22:17   ` Junio C Hamano
  2015-05-26 22:20     ` Stefan Beller
  0 siblings, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2015-05-26 22:17 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, peff

Stefan Beller <sbeller@google.com> writes:

> From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>
> pack-protocol.txt says so and fetch-pack also follows it even though
> upload-pack is a bit lax. Fix it.

Hmm, I actually think the .txt file unsuccessfully tried to close
the barn door after horse has long left.  The existing clients that
read protocol capabilities keep the last-read copy and then
overwrite it when a new one came, which is why we couldn't update
the protocol by sending only incremental changes, etc. without
breaking existing clients.

I somehow thought that we already discussed why this was bad the
first time Duy's patch was posted, but apparently we didn't.

X-<.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 07/11] fetch-pack: use the configured transport protocol
  2015-05-26 22:01 ` [RFC/WIP PATCH 07/11] fetch-pack: use the configured transport protocol Stefan Beller
@ 2015-05-26 22:19   ` Junio C Hamano
  2015-05-26 22:23     ` Stefan Beller
  2015-05-27  6:53   ` Jeff King
  1 sibling, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2015-05-26 22:19 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, peff

Stefan Beller <sbeller@google.com> writes:

> @@ -175,7 +179,18 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>  		if (!conn)
>  			return args.diag_url ? 0 : 1;
>  	}
> -	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
> +
> +	switch (args.version) {
> +	default:
> +	case 2:
> +		get_remote_capabilities(fd[0], NULL, 0);
> +		request_capabilities(fd[1]);
> +		break;
> +	case 1: /* fall through */
> +	case 0:
> +		get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
> +		break;
> +	}


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 02/11] upload-pack: only accept capabilities on the first "want" line
  2015-05-26 22:17   ` Junio C Hamano
@ 2015-05-26 22:20     ` Stefan Beller
  0 siblings, 0 replies; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Duy Nguyen, Jeff King

On Tue, May 26, 2015 at 3:17 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>>
>> pack-protocol.txt says so and fetch-pack also follows it even though
>> upload-pack is a bit lax. Fix it.
>
> Hmm, I actually think the .txt file unsuccessfully tried to close
> the barn door after horse has long left.  The existing clients that
> read protocol capabilities keep the last-read copy and then
> overwrite it when a new one came, which is why we couldn't update
> the protocol by sending only incremental changes, etc. without
> breaking existing clients.

On the other hand I am not aware of any remote implementation
handing out capabilities after the first line, so this would not break
anything to my knowledge.

>
> I somehow thought that we already discussed why this was bad the
> first time Duy's patch was posted, but apparently we didn't.

/me checks the archive. I may have overlooked that part. :(

>
> X-<.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number
  2015-05-26 22:01 ` [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number Stefan Beller
@ 2015-05-26 22:21   ` Junio C Hamano
  2015-05-26 22:31     ` Stefan Beller
  2015-05-27  3:33   ` Eric Sunshine
  2015-05-27  7:02   ` Jeff King
  2 siblings, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2015-05-26 22:21 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, peff

Stefan Beller <sbeller@google.com> writes:

> +	if (transport->smart_options
> +	    && transport->smart_options->transport_version) {
> +		buf = xmalloc(strlen(remote_program) + 12);
> +		sprintf(buf, "%s-%d", remote_program,
> +			transport->smart_options->transport_version);
> +		remote_program = buf;
> +	}

Bikeshedding: so the versioning scheme is that the current one is
zero, and the next one is two?  I would have expected that there
would be something like

	if (...->version < 2) {
        	... append "-%d" ...
	}

involved.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 07/11] fetch-pack: use the configured transport protocol
  2015-05-26 22:19   ` Junio C Hamano
@ 2015-05-26 22:23     ` Stefan Beller
  0 siblings, 0 replies; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Duy Nguyen, Jeff King

On Tue, May 26, 2015 at 3:19 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> @@ -175,7 +179,18 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>>               if (!conn)
>>                       return args.diag_url ? 0 : 1;
>>       }
>> -     get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
>> +
>> +     switch (args.version) {
>> +     default:
>> +     case 2:
>> +             get_remote_capabilities(fd[0], NULL, 0);
>> +             request_capabilities(fd[1]);
>> +             break;

Actually this is wrong, we need to actually fall through from here as well,
so we not only talk capabilities negotiation, but then continue
with get_remote_heads.

>> +     case 1: /* fall through */
>> +     case 0:
>> +             get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
>> +             break;
>> +     }
>
> Nice ;-)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number
  2015-05-26 22:21   ` Junio C Hamano
@ 2015-05-26 22:31     ` Stefan Beller
  2015-05-27  5:09       ` Junio C Hamano
  0 siblings, 1 reply; 61+ messages in thread
From: Stefan Beller @ 2015-05-26 22:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Duy Nguyen, Jeff King

On Tue, May 26, 2015 at 3:21 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> +     if (transport->smart_options
>> +         && transport->smart_options->transport_version) {
>> +             buf = xmalloc(strlen(remote_program) + 12);
>> +             sprintf(buf, "%s-%d", remote_program,
>> +                     transport->smart_options->transport_version);
>> +             remote_program = buf;
>> +     }
>
> Bikeshedding: so the versioning scheme is that the current one is
> zero, and the next one is two?  I would have expected that there
> would be something like

I think currently we have version 1. But we don't advertise it, so
I'll call it 0
(the default or unadvertised or however you name it. 0 as in "unsure" maybe)

I thought about future proofing this version a bit. Say version 2 is bad because
I don't have the experience of 10 years Git nor of maintaining large
projects and
you want to make a version 3 soon. And this would support that just fine.

The meaning being: Any version except 0 should have a dedicated
extension -${version}
The 0 is left out for backwards compatibility.

So in a later patch where we want to introduce force-using of old
versions  you could
configure upload-pack to be explicit upload-pack-1. The upload-pack-1
version is not
yet there with this series though.

>
>         if (...->version < 2) {
>                 ... append "-%d" ...
>         }
>
> involved.

Oh! I see here you would count the current one as 1, which has no
number extension,
and any further would have a -${version}. That would transport the
intention much better
I guess.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack
  2015-05-26 22:01 ` [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack Stefan Beller
@ 2015-05-27  2:30   ` Eric Sunshine
  2015-05-27  6:35   ` Jeff King
  1 sibling, 0 replies; 61+ messages in thread
From: Eric Sunshine @ 2015-05-27  2:30 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Git List, Nguyễn Thái Ngọc Duy, Jeff King,
	Junio C Hamano

On Tue, May 26, 2015 at 6:01 PM, Stefan Beller <sbeller@google.com> wrote:
> In upload-pack-2 we send each capability in its own packet.
> By reusing the advertise_capabilities and eventually setting it to
> NULL we will be able to reuse the methods for refs advertisement.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> diff --git a/upload-pack-2.c b/upload-pack-2.c
> new file mode 120000
> index 0000000..e30a871
> --- /dev/null
> +++ b/upload-pack-2.c
> @@ -716,10 +716,47 @@ static void format_symref_info(struct strbuf *buf, struct string_list *symref)
>                 strbuf_addf(buf, " symref=%s:%s", item->string, (char *)item->util);
>  }
>
> -static const char *advertise_capabilities = "multi_ack thin-pack side-band"
> +static char *advertise_capabilities = "multi_ack thin-pack side-band"
>                 " side-band-64k ofs-delta shallow no-progress"
>                 " include-tag multi_ack_detailed";
>
> +/*
> + * Reads the next capability and puts it into dst as a null terminated string.
> + * Returns true if more capabilities can be read.
> + * */
> +static int next_capability(char *dst)
> +{
> +       int len = 0;
> +       if (!*advertise_capabilities) {
> +               /* make sure to not advertise capabilities afterwards */
> +               advertise_capabilities = NULL;

You set advertise_capabilities to NULL here but then unconditionally
dereference that NULL in the if-statement just above (if someone calls
next_capability() again). Seems fishy (and crashy) to me. Either don't
dereference the NULL or don't bother setting it to NULL (in which
case, you'll dereference and find '\0', which should serve the same
purpose).

> +               return 0;
> +       }
> +
> +       while (advertise_capabilities[len] != '\0' &&
> +              advertise_capabilities[len] != ' ')
> +               len ++;

Style: len++

> +       strncpy(dst, advertise_capabilities, len);
> +       dst[len] = '\0';
> +
> +       advertise_capabilities += len;
> +       if (*advertise_capabilities == ' ')
> +               advertise_capabilities++;

If for some reason, someone happens to add an extra space between
capabilities in advertise_capabilities, then the capability returned
by the next invocation of next_capability() be zero-length, which is
probably undesirable, and certainly not expected by the caller.
Skipping whitespace in a loop would be more robust.

> +       return 1;
> +}

I realize that this is RFC, but my overall reaction to
next_capability() in its current form is that it's ugly . A somewhat
cleaner approach would be to pass some state into next_capability()
and have it modify that state rather than the global
advertise_capabilities. The passed-in state could be as simple as a
'const char *' which initially points at the start of
advertise_capabilities and then gets advanced; until, finally, it
points at the '\0' at the end of advertise_capabilities.

> +static void send_capabilities(void)
> +{
> +       char buf[100];
> +
> +       while (next_capability(buf))
> +               packet_write(1, "capability:%s\n", buf);
> +
> +       packet_write(1, "agent:%s\n", git_user_agent_sanitized());
> +       packet_flush(1);
> +}
> +
>  static int send_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
>  {
>
> @@ -794,6 +831,28 @@ static void upload_pack(void)
>         }
>  }
>
> +static void receive_capabilities(void)
> +{
> +       int done = 0;
> +       while (1) {
> +               char *line = packet_read_line(0, NULL);

If you declare this 'const char *', then it becomes much more obvious
that it's not your responsibility to free() the result (and,
tangentially, that you have no intention of modifying the content).

> +               if (!line)
> +                       break;
> +               if (starts_with(line, "capability:"))
> +                       parse_features(line + strlen("capability:"));

See skip_prefix().

> +       }
> +}
> +
> +static void upload_pack_version_2(void)
> +{
> +       send_capabilities();
> +       receive_capabilities();
> +
> +       /* The rest of the protocol stays the same, capabilities advertising
> +          is disabled though. */

    /*
     * This is a multi-line
     * comment.
     */

> +       upload_pack();
> +}
> +
>  static int upload_pack_config(const char *var, const char *value, void *unused)
>  {
>         if (!strcmp("uploadpack.allowtipsha1inwant", var))
> @@ -806,16 +865,24 @@ static int upload_pack_config(const char *var, const char *value, void *unused)
>         return parse_hide_refs_config(var, value, "uploadpack");
>  }
>
> +static int endswith(const char *teststring, const char *ending)
> +{
> +       int slen = strlen(teststring);
> +       int elen = strlen(ending);

You may be confident today that no callers are supplying an 'ending'
which is longer than 'teststring', but someone may some day break that
assumption. At the very least, for robustness, add an assert() or
die() if 'elen' is greater than 'slen'.

> +       return !strcmp(teststring + slen - elen, ending);
> +}
> +
>  int main(int argc, char **argv)
>  {
>         char *dir;
> +       const char *cmd;
>         int i;
>         int strict = 0;
>
>         git_setup_gettext();
>
>         packet_trace_identity("upload-pack");
> -       git_extract_argv0_path(argv[0]);
> +       cmd = git_extract_argv0_path(argv[0]);
>         check_replace_refs = 0;
>
>         for (i = 1; i < argc; i++) {
> @@ -857,6 +924,10 @@ int main(int argc, char **argv)
>                 die("'%s' does not appear to be a git repository", dir);
>
>         git_config(upload_pack_config, NULL);
> -       upload_pack();
> +
> +       if (endswith(cmd, "-2"))
> +               upload_pack_version_2();
> +       else
> +               upload_pack();
>         return 0;
>  }
> --
> 2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities
  2015-05-26 22:01 ` [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities Stefan Beller
@ 2015-05-27  3:25   ` Eric Sunshine
  2015-05-27  6:50     ` Jeff King
  2015-05-27  6:45   ` Jeff King
  1 sibling, 1 reply; 61+ messages in thread
From: Eric Sunshine @ 2015-05-27  3:25 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Git List, Nguyễn Thái Ngọc Duy, Jeff King,
	Junio C Hamano

On Tue, May 26, 2015 at 6:01 PM, Stefan Beller <sbeller@google.com> wrote:
> Instead of calling get_remote_heads as a first command during the
> protocol exchange, we need to have fine grained control over the
> capability negotiation in version 2 of the protocol.
>
> Introduce get_remote_capabilities, which will just listen to
> capabilities of the remote and request_capabilities which will
> tell the selection of capabilities to the remote.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> diff --git a/connect.c b/connect.c
> index c0144d8..bb17618 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -105,8 +105,54 @@ static void annotate_refs_with_symref_info(struct ref *ref)
>         string_list_clear(&symref, 0);
>  }
>
> +void get_remote_capabilities(int in, char *src_buf, size_t src_len)
> +{
> +       struct strbuf capabilities_string = STRBUF_INIT;
> +       for (;;) {
> +               int len;
> +               char *line = packet_buffer;
> +               const char *arg;
> +
> +               len = packet_read(in, &src_buf, &src_len,
> +                                 packet_buffer, sizeof(packet_buffer),
> +                                 PACKET_READ_GENTLE_ON_EOF |
> +                                 PACKET_READ_CHOMP_NEWLINE);
> +               if (len < 0)
> +                       die_initial_contact(0);
> +
> +               if (!len)
> +                       break;
> +
> +               if (len > 4 && skip_prefix(line, "ERR ", &arg))

The 'len > 4' check is needed because there's no guarantee that 'line'
is NUL-terminated. Correct?

> +                       die("remote error: %s", arg);
> +
> +               if (starts_with(line, "capability:")) {
> +                       strbuf_addstr(&capabilities_string, line + strlen("capability:"));

skip_prefix() maybe?

> +                       strbuf_addch(&capabilities_string, ' ');
> +               }
> +       }
> +       free(server_capabilities);
> +       server_capabilities = xmalloc(capabilities_string.len + 1);
> +       server_capabilities = strbuf_detach(&capabilities_string, NULL);

So, you're allocating a buffer for server_capabilities and then
immediately throwing away (leaking) that buffer. Seems fishy.

> +       strbuf_release(&capabilities_string);

Not outright incorrect, but you've just detached capabilities_string's
buffer, so releasing it doesn't buy anything.

> +}
> +
> +int request_capabilities(int out)
> +{
> +       fprintf(stderr, "request_capabilities\n");
> +       // todo: send our capabilities
> +       packet_write(out, "capability:foo");
> +       packet_flush(out);
> +}
> +
>  /*
> - * Read all the refs from the other end
> + * Read all the refs from the other end,
> + * in is a file descriptor input stream
> + * src_buf and src_len may be an alternative way to specify the input.

The bit about src_buf and src_len conveys little or nothing. After
reading it, I'm as clueless to their purpose as I would be if they
were undocumented.

> + * list is the output of refs
> + * extra_have is a list to store information learned about sha1 the other side has.
> + * shallow_points

'shallow_points' what?

>   */
>  struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>                               struct ref **list, unsigned int flags,
> diff --git a/remote.h b/remote.h
> index 04e2310..7381ddf 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -146,6 +146,9 @@ int check_ref_type(const struct ref *ref, int flags);
>  void free_refs(struct ref *ref);
>
>  struct sha1_array;
> +
> +extern void get_remote_capabilities(int in, char *src_buf, size_t src_len);
> +extern int request_capabilities(int out);
>  extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>                                      struct ref **list, unsigned int flags,
>                                      struct sha1_array *extra_have,
> --
> 2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number
  2015-05-26 22:01 ` [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number Stefan Beller
  2015-05-26 22:21   ` Junio C Hamano
@ 2015-05-27  3:33   ` Eric Sunshine
  2015-05-27  7:02   ` Jeff King
  2 siblings, 0 replies; 61+ messages in thread
From: Eric Sunshine @ 2015-05-27  3:33 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Git List, Nguyễn Thái Ngọc Duy, Jeff King,
	Junio C Hamano

On Tue, May 26, 2015 at 6:01 PM, Stefan Beller <sbeller@google.com> wrote:
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> diff --git a/transport.c b/transport.c
> index 3ef15f6..33644a6 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -496,15 +496,29 @@ static int set_git_option(struct git_transport_options *opts,
>  static int connect_setup(struct transport *transport, int for_push, int verbose)
>  {
>         struct git_transport_data *data = transport->data;
> +       const char *remote_program;
> +       char *buf = 0;

Naming this 'to_free' would make its purpose more obvious[1], and be
more consistent with code elsewhere in the project.

[1]: http://article.gmane.org/gmane.comp.version-control.git/256125/

>         if (data->conn)
>                 return 0;
>
> +       remote_program = (for_push ? data->options.receivepack
> +                                  : data->options.uploadpack);
> +
> +       if (transport->smart_options
> +           && transport->smart_options->transport_version) {
> +               buf = xmalloc(strlen(remote_program) + 12);
> +               sprintf(buf, "%s-%d", remote_program,
> +                       transport->smart_options->transport_version);
> +               remote_program = buf;
> +       }
> +
>         data->conn = git_connect(data->fd, transport->url,
> -                                for_push ? data->options.receivepack :
> -                                data->options.uploadpack,
> +                                remote_program,
>                                  verbose ? CONNECT_VERBOSE : 0);
>
> +       free(buf);
> +
>         return 0;
>  }
>
> --
> 2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number
  2015-05-26 22:31     ` Stefan Beller
@ 2015-05-27  5:09       ` Junio C Hamano
  2015-05-27  6:56         ` Jeff King
  0 siblings, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2015-05-27  5:09 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Duy Nguyen, Jeff King

Stefan Beller <sbeller@google.com> writes:

> On Tue, May 26, 2015 at 3:21 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
>>
>>         if (...->version < 2) {
>>                 ... append "-%d" ...
>>         }
>>
>> involved.
>
> Oh! I see here you would count the current one as 1, which has no
> number extension, and any further would have a -${version}. That
> would transport the intention much better I guess.

Yeah, except that I screwed up my comparison.  Obviously, I should
have said "If version is 2 or later, then append -%d to the name,
otherwise use the name as-is".

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 10/11] t5544: add a test case for the new protocol
  2015-05-26 22:01 ` [RFC/WIP PATCH 10/11] t5544: add a test case for the new protocol Stefan Beller
@ 2015-05-27  5:34   ` Eric Sunshine
  2015-05-27  7:12   ` Jeff King
  1 sibling, 0 replies; 61+ messages in thread
From: Eric Sunshine @ 2015-05-27  5:34 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Git List, Nguyễn Thái Ngọc Duy, Jeff King,
	Junio C Hamano

On Tue, May 26, 2015 at 6:01 PM, Stefan Beller <sbeller@google.com> wrote:
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> diff --git a/t/t5544-fetch-2.sh b/t/t5544-fetch-2.sh
> new file mode 100755
> index 0000000..beee46c
> --- /dev/null
> +++ b/t/t5544-fetch-2.sh
> @@ -0,0 +1,40 @@
> +#!/bin/sh
> +#
> +# Copyright (c) 2015 Stefan Beller
> +#
> +
> +test_description='Testing version 2 of the fetch protocol'
> +
> +. ./test-lib.sh
> +
> +mk_repo_pair () {
> +       rm -rf client server &&
> +       test_create_repo client &&
> +       test_create_repo server &&
> +       (
> +               cd server &&
> +               git config receive.denyCurrentBranch warn
> +       ) &&
> +       (
> +               cd client &&
> +               git remote add origin ../server

Broken &&-chain.

> +               git config remote.origin.transportversion 2
> +       )
> +}
> +
> +test_expect_success 'setup' '
> +       mk_repo_pair &&
> +       (
> +               cd server &&
> +               test_commit one
> +       ) &&
> +       (
> +               cd client &&
> +               git fetch origin master
> +       )
> +'
> +
> +# More to come here, similar to t5515 having a sub directory full of expected
> +# data going over the wire.
> +
> +test_done
> --
> 2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 09/11] transport: get_refs_via_connect exchanges capabilities before refs.
  2015-05-26 22:01 ` [RFC/WIP PATCH 09/11] transport: get_refs_via_connect exchanges capabilities before refs Stefan Beller
@ 2015-05-27  5:37   ` Eric Sunshine
  2015-05-27  7:06   ` Jeff King
  1 sibling, 0 replies; 61+ messages in thread
From: Eric Sunshine @ 2015-05-27  5:37 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Git List, Nguyễn Thái Ngọc Duy, Jeff King,
	Junio C Hamano

On Tue, May 26, 2015 at 6:01 PM, Stefan Beller <sbeller@google.com> wrote:
> transport: get_refs_via_connect exchanges capabilities before refs.

s/exchanges/exchange/
s/\.$//

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  transport.c | 29 +++++++++++++++++++++++++----
>  1 file changed, 25 insertions(+), 4 deletions(-)
>
> diff --git a/transport.c b/transport.c
> index 33644a6..1cd9b77 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -526,12 +526,33 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
>  {
>         struct git_transport_data *data = transport->data;
>         struct ref *refs;
> +       int version = 0;
>
> +       if (transport->smart_options)
> +               version = transport->smart_options->transport_version;
>         connect_setup(transport, for_push, 0);
> -       get_remote_heads(data->fd[0], NULL, 0, &refs,
> -                        for_push ? REF_NORMAL : 0,
> -                        &data->extra_have,
> -                        &data->shallow);
> +       switch (version) {
> +               default: /*
> +                         * Configured a protocol version > 2?
> +                         * Try version 2 as it's the most future proof.
> +                         */
> +                       /* fall through */
> +               case 2: /* first talk about capabilities, then get the heads */
> +                       get_remote_capabilities(data->fd[0], NULL, 0);
> +                       request_capabilities(data->fd[1]);
> +                       get_remote_heads(data->fd[0], NULL, 0, &refs,
> +                                        for_push ? REF_NORMAL : 0,
> +                                        &data->extra_have,
> +                                        &data->shallow);
> +                       break;
> +               case 1: /* configured version 1, fall through */
> +               case 0: /* unconfigured, use first protocol */
> +                       get_remote_heads(data->fd[0], NULL, 0, &refs,
> +                                        for_push ? REF_NORMAL : 0,
> +                                        &data->extra_have,
> +                                        &data->shallow);
> +                       break;
> +       }
>         data->got_remote_heads = 1;
>
>         return refs;
> --
> 2.4.1.345.gab207b6.dirty

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 00/11] Protocol version 2, again!
  2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
                   ` (10 preceding siblings ...)
  2015-05-26 22:01 ` [RFC/WIP PATCH 11/11] Document protocol version 2 Stefan Beller
@ 2015-05-27  6:18 ` Jeff King
  2015-05-27  7:08   ` Jeff King
  11 siblings, 1 reply; 61+ messages in thread
From: Jeff King @ 2015-05-27  6:18 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, gitster

On Tue, May 26, 2015 at 03:01:04PM -0700, Stefan Beller wrote:

>     "Just give us something to play around with" - Peff at GitMerge 2015

Sounds like something I would say.

> The new protocol works just like the old protocol, except for
> the capabilities negotiation being before any exchange of real data.

I like this approach. We all know that one next step is going to be a
"show me only these refs" capability negotiation of some kind. But it's
good to keep the version-breaking changes separated from that, and this
should be enough to bootstrap that conversation later.

If I understand correctly, your proposed protocol allows for a single
back-and-forth of capabilities followed by the server speaking the ref
advertisement. So it is worth thinking a moment how we might have a more
involved conversation before the advertisement if we want to do so in
the future.

I think in the simplest case, the server claims to understand the "foo"
extension, and then the client says "I wish to use foo". And then a rule
of "foo" might be that the conversation continues in some way before
sending the advertisement. Like (each line is a pkt-line):

  ...
  S: foo
  S: flush
  ...
  C: foo
  S: Here's some extra foo data.
  C: Now I respond to that foo data.
  S: Now the foo conversation is done. Here's the ref advertisement.

What if there are multiple such extensions? E.g., if the client asks for
both "foo" and "bar", and both require extra back-and-forth messages?
Which conversation happens first?

Maybe the rule is just "whichever one the client asked for first in the
capabilities list". And I think in general we want to avoid protocol
round-trips if we can (so we'd prefer the client say
"refspec=refs/heads/*", and not "I understand refspecs, too! Let's have
a conversation about which ones I'm interested in."). But I think it's
worth giving it a little thought to make sure we don't paint ourselves
in a corner.

> My preference for a string suffix instead of a sequential number is
> motiviated by the discussion of sha1 vs sequential numbers to describe
> a state of a repository. The main difference here is however how often
> we expect changes. Version 1 of the protocol is current for 10 years by
> now, so I do not changes to the protocol number often, which makes it
> suitable for just having a counter appended to the binary.

I think I prefer a number. I'm really hoping that v2 lasts even longer
than v1 has, because the capability negotiation should allow us to
extend it from within rather than breaking compatibility.

As a minor nit, I think I like "upload-pack-v2" better than
"upload-pack-2". But I can live with it either way. :)

> This series doesn't include an automatic upgrade of the protocol version
> for later changes if the server supports it, so for now the use of the new
> protocol needs to be activated manually via setting remote.origin.transportversion.

I think that's a good start. Last time we discussed it, I think
everybody was more or less on board with client probing (so v1 would
start to say "btw, I speak v2", and then client would set its
remote.origin.transportversion flag). That can come later, and we are
not painting ourselves into a corner.

I think we also discussed passing the version information out-of-band.
So over git-daemon, as "git-upload-pack repo\0host=...\0\0version=2",
for example. I _think_ we are also fine to build that on top of what you
have here. If the version information makes it through to upload-pack,
then we can do v2, and if not, we are no worse off than we were. HTTP
can do a similar out-of-band thing, but I think ssh is mostly out of
luck. The best I could think of was passing an environment variable, but
ssh typically only lets through a few variables. We could abuse PATH or
something, but that's getting pretty nasty.

Anyway, that is all for the future. The only reason I mention it is to
make sure that we are not closing any future doors, and I don't think we
are.

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack
  2015-05-26 22:01 ` [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack Stefan Beller
  2015-05-27  2:30   ` Eric Sunshine
@ 2015-05-27  6:35   ` Jeff King
  2015-05-27 17:30     ` Eric Sunshine
  2015-05-27 17:40     ` Stefan Beller
  1 sibling, 2 replies; 61+ messages in thread
From: Jeff King @ 2015-05-27  6:35 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, gitster

On Tue, May 26, 2015 at 03:01:08PM -0700, Stefan Beller wrote:

> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -716,10 +716,47 @@ static void format_symref_info(struct strbuf *buf, struct string_list *symref)
>  		strbuf_addf(buf, " symref=%s:%s", item->string, (char *)item->util);
>  }
>  
> -static const char *advertise_capabilities = "multi_ack thin-pack side-band"
> +static char *advertise_capabilities = "multi_ack thin-pack side-band"
>  		" side-band-64k ofs-delta shallow no-progress"
>  		" include-tag multi_ack_detailed";

If we are upload-pack-2, should we advertise that in the capabilities? I
think it may make things easier later if we try to provide some
opportunistic out-of-band data. E.g., if see tell git-daemon:

  git-upload-pack repo\0host=whatever\0\0version=2

how do we know whether version=2 was understood and kicked us into v2
mode, versus an old server that ignored it?

It also just makes the protocol more self-describing; from the
conversation you can see which version is in use.

> +static void send_capabilities(void)
> +{
> +	char buf[100];
> +
> +	while (next_capability(buf))
> +		packet_write(1, "capability:%s\n", buf);

Having a static-sized buffer and then passing it to a function which has
no idea of the buffer size seems like a recipe for a buffer overflow. :)

You are fine here because you are parsing the hard-coded capabilities
string, and we know 100 is enough there. But it's nice to avoid such
magic.

Like Eric, I find the whole next_capability thing a little ugly. His
suggestion to pass in the parsing state is an improvement, but I wonder
why we need to parse at all. Can we keep the capabilities as:

  const char *capabilities[] = {
	"multi_ack",
	"thin-pack",
	... etc ...
  };

and then loop over the array? The v1 writer will need to be modified,
but it should be fairly straightforward (loop and add them to the buffer
rather than dumping the whole string).

Also, do we need the capability noise-word? They all have it, except
for:

> +	packet_write(1, "agent:%s\n", git_user_agent_sanitized());

But isn't that basically a capability, too (I agree it is stretching the
definition of "capability", but I think that ship has sailed; the
client's response is not "I support this, too" but "I want to enable
this").

IOW, is there a reason that the initial conversation is not just:

  pkt-line("multi_ack");
  pkt-line("thin-pack");
  ...
  pkt-line("agent=v2.5.0");
  pkt-line(0000);

We already have framing for each capability due to the use of pkt-line.
The "capability:" is just one more thing the client has to parse past.
The keys are already unique up to any "=", so I don't think there is any
ambiguity (e.g., we don't care about "capability:agent"; we have already
assigned a meaning to the term "agent", and will never introduce a
standalone capability with the same name).

Likewise, if we introduce new protocol items here, the client should
either ignore them (if it does not understand them) or know what to do
with them.

> +static void receive_capabilities(void)
> +{
> +	int done = 0;
> +	while (1) {
> +		char *line = packet_read_line(0, NULL);
> +		if (!line)
> +			break;
> +		if (starts_with(line, "capability:"))
> +			parse_features(line + strlen("capability:"));
> +	}

Use:

  const char *cap;
  if (skip_prefix(line, "capability:", &cap))
	parse_features(cap);

Or better yet, if you take my suggestion above:

  parse_features(line);

:)

> +static int endswith(const char *teststring, const char *ending)
> +{
> +	int slen = strlen(teststring);
> +	int elen = strlen(ending);
> +	return !strcmp(teststring + slen - elen, ending);
> +}

Eric mentioned the underflow problems here, but I wonder even more:
what's wrong with the global ends_with() that we already provide?

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 05/11] transport: add infrastructure to support a protocol version number
  2015-05-26 22:01 ` [RFC/WIP PATCH 05/11] transport: add infrastructure to support a protocol version number Stefan Beller
@ 2015-05-27  6:39   ` Jeff King
  2015-05-27 19:01     ` Stefan Beller
  2015-05-27 19:10     ` Junio C Hamano
  0 siblings, 2 replies; 61+ messages in thread
From: Jeff King @ 2015-05-27  6:39 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, gitster

On Tue, May 26, 2015 at 03:01:09PM -0700, Stefan Beller wrote:

> +	OPT_STRING('y', "transport-version", &transport_version,
> +		   N_("transport-version"),
> +		   N_("specify transport version to be used")),

Interesting choice for the short option ("-v" would be nice, but
obviously it is taken). Do we want to delay on claiming the
short-and-sweet 'y' until we are sure this is something people will use
a lot? In an ideal world, it is not (i.e., auto-upgrade and other tricks
become good enough that nobody bothers to specify it manually).

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities
  2015-05-26 22:01 ` [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities Stefan Beller
  2015-05-27  3:25   ` Eric Sunshine
@ 2015-05-27  6:45   ` Jeff King
  2015-05-29 19:39     ` Stefan Beller
  1 sibling, 1 reply; 61+ messages in thread
From: Jeff King @ 2015-05-27  6:45 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, gitster

On Tue, May 26, 2015 at 03:01:10PM -0700, Stefan Beller wrote:

> +void get_remote_capabilities(int in, char *src_buf, size_t src_len)
> +{
> +	struct strbuf capabilities_string = STRBUF_INIT;
> +	for (;;) {
> +		int len;
> +		char *line = packet_buffer;
> +		const char *arg;
> +
> +		len = packet_read(in, &src_buf, &src_len,
> +				  packet_buffer, sizeof(packet_buffer),
> +				  PACKET_READ_GENTLE_ON_EOF |
> +				  PACKET_READ_CHOMP_NEWLINE);
> +		if (len < 0)
> +			die_initial_contact(0);
> +
> +		if (!len)
> +			break;
> +
> +		if (len > 4 && skip_prefix(line, "ERR ", &arg))
> +			die("remote error: %s", arg);
> +
> +		if (starts_with(line, "capability:")) {
> +			strbuf_addstr(&capabilities_string, line + strlen("capability:"));
> +			strbuf_addch(&capabilities_string, ' ');
> +		}
> +	}

I think this is the reverse case of next_capabilities in the upload-pack
side, so I'll make the reverse suggestion. :) Would it make things nicer
if both v1 and v2 parsed the capabilities into a string_list?

> +	free(server_capabilities);
> +	server_capabilities = xmalloc(capabilities_string.len + 1);
> +	server_capabilities = strbuf_detach(&capabilities_string, NULL);

Is this xmalloc line left over? The strbuf_detach will write over it.

> +	strbuf_release(&capabilities_string);

No need to release if we just detached.

> +int request_capabilities(int out)
> +{
> +	fprintf(stderr, "request_capabilities\n");

Debug cruft, I presume.

> +	// todo: send our capabilities
> +	packet_write(out, "capability:foo");

No C99 comments. But I think this is just a placeholder. :)

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities
  2015-05-27  3:25   ` Eric Sunshine
@ 2015-05-27  6:50     ` Jeff King
  2015-05-27 17:19       ` Eric Sunshine
  0 siblings, 1 reply; 61+ messages in thread
From: Jeff King @ 2015-05-27  6:50 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Stefan Beller, Git List, Nguyễn Thái Ngọc Duy,
	Junio C Hamano

On Tue, May 26, 2015 at 11:25:05PM -0400, Eric Sunshine wrote:

> > +               len = packet_read(in, &src_buf, &src_len,
> > +                                 packet_buffer, sizeof(packet_buffer),
> > +                                 PACKET_READ_GENTLE_ON_EOF |
> > +                                 PACKET_READ_CHOMP_NEWLINE);
> > +               if (len < 0)
> > +                       die_initial_contact(0);
> > +
> > +               if (!len)
> > +                       break;
> > +
> > +               if (len > 4 && skip_prefix(line, "ERR ", &arg))
> 
> The 'len > 4' check is needed because there's no guarantee that 'line'
> is NUL-terminated. Correct?

I think this was just blindly copied from get_remote_heads(). And I
think that code was being overly paranoid. Ever since f3a3214 (Make
send/receive-pack be closer to doing something interesting, 2005-06-29),
the pkt-line reader will add an extra NUL to the buffer to ease cases
like this.

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 07/11] fetch-pack: use the configured transport protocol
  2015-05-26 22:01 ` [RFC/WIP PATCH 07/11] fetch-pack: use the configured transport protocol Stefan Beller
  2015-05-26 22:19   ` Junio C Hamano
@ 2015-05-27  6:53   ` Jeff King
  1 sibling, 0 replies; 61+ messages in thread
From: Jeff King @ 2015-05-27  6:53 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, gitster

On Tue, May 26, 2015 at 03:01:11PM -0700, Stefan Beller wrote:

> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index 4a6b340..32dc8b0 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -127,6 +127,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>  			args.update_shallow = 1;
>  			continue;
>  		}
> +		if (!strcmp("--transport-version", arg)) {
> +			args.version = strtol(arg + strlen("--transport-version"), NULL, 0);
> +			continue;
> +		}

You strcmp() the arg here, so there can't be anything at arg +
strlen(...), can there? Did you want:

  starts_with(arg, "--transports-version=")

here? Or better yet, skip_prefix().

> -	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
> +
> +	switch (args.version) {
> +	default:
> +	case 2:
> +		get_remote_capabilities(fd[0], NULL, 0);
> +		request_capabilities(fd[1]);
> +		break;

Should the "default" case be to complain about an unknown version,
rather than fall-through to 2?

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number
  2015-05-27  5:09       ` Junio C Hamano
@ 2015-05-27  6:56         ` Jeff King
  0 siblings, 0 replies; 61+ messages in thread
From: Jeff King @ 2015-05-27  6:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, git, Duy Nguyen

On Tue, May 26, 2015 at 10:09:45PM -0700, Junio C Hamano wrote:

> Stefan Beller <sbeller@google.com> writes:
> 
> > On Tue, May 26, 2015 at 3:21 PM, Junio C Hamano <gitster@pobox.com> wrote:
> >
> >>
> >>         if (...->version < 2) {
> >>                 ... append "-%d" ...
> >>         }
> >>
> >> involved.
> >
> > Oh! I see here you would count the current one as 1, which has no
> > number extension, and any further would have a -${version}. That
> > would transport the intention much better I guess.
> 
> Yeah, except that I screwed up my comparison.  Obviously, I should
> have said "If version is 2 or later, then append -%d to the name,
> otherwise use the name as-is".

FWIW, I had similar head-scratching over Stefan's original. I think it's
OK to say "version 1 is magical, and for historical reasons does not
have its number appended". There's no need for us ever to make
"upload-pack-1"; it just introduces more headaches.

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number
  2015-05-26 22:01 ` [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number Stefan Beller
  2015-05-26 22:21   ` Junio C Hamano
  2015-05-27  3:33   ` Eric Sunshine
@ 2015-05-27  7:02   ` Jeff King
  2 siblings, 0 replies; 61+ messages in thread
From: Jeff King @ 2015-05-27  7:02 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, gitster

On Tue, May 26, 2015 at 03:01:12PM -0700, Stefan Beller wrote:

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  transport.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/transport.c b/transport.c
> index 3ef15f6..33644a6 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -496,15 +496,29 @@ static int set_git_option(struct git_transport_options *opts,
>  static int connect_setup(struct transport *transport, int for_push, int verbose)
>  {
>  	struct git_transport_data *data = transport->data;
> +	const char *remote_program;
> +	char *buf = 0;

Use NULL when you mean a NULL pointer (they're equivalent to the
compiler, but the word is easier to read).

I agree on Eric's naming this "to_free" (and I consider it idiomatic to
assign them in a chain, like "foo = to_free = xmalloc(...)", but we
don't always do that).

> +	if (transport->smart_options
> +	    && transport->smart_options->transport_version) {
> +		buf = xmalloc(strlen(remote_program) + 12);
> +		sprintf(buf, "%s-%d", remote_program,
> +			transport->smart_options->transport_version);
> +		remote_program = buf;
> +	}

Using xstrfmt can help you avoid magic numbers and repetition,
like:

  to_free = xstrfmt("%s-%d",
                    remote_program,
		    transport->smart_options->transport_version);

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 09/11] transport: get_refs_via_connect exchanges capabilities before refs.
  2015-05-26 22:01 ` [RFC/WIP PATCH 09/11] transport: get_refs_via_connect exchanges capabilities before refs Stefan Beller
  2015-05-27  5:37   ` Eric Sunshine
@ 2015-05-27  7:06   ` Jeff King
  1 sibling, 0 replies; 61+ messages in thread
From: Jeff King @ 2015-05-27  7:06 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, gitster

On Tue, May 26, 2015 at 03:01:13PM -0700, Stefan Beller wrote:

> +	switch (version) {
> +		default: /*
> +			  * Configured a protocol version > 2?
> +			  * Try version 2 as it's the most future proof.
> +			  */
> +			/* fall through */

Same comment here as earlier. If we do a v3, it might not be compatible
with v2. Shouldn't we bail if the user asked for it?

> +		case 2: /* first talk about capabilities, then get the heads */
> +			get_remote_capabilities(data->fd[0], NULL, 0);
> +			request_capabilities(data->fd[1]);
> +			get_remote_heads(data->fd[0], NULL, 0, &refs,
> +					 for_push ? REF_NORMAL : 0,
> +					 &data->extra_have,
> +					 &data->shallow);
> +			break;
> +		case 1: /* configured version 1, fall through */
> +		case 0: /* unconfigured, use first protocol */
> +			get_remote_heads(data->fd[0], NULL, 0, &refs,
> +					 for_push ? REF_NORMAL : 0,
> +					 &data->extra_have,
> +					 &data->shallow);
> +			break;
> +	}

I actually kind of wonder if we should just die("BUG") here on seeing
"0". Is this low level the right place to do the "default to v1"?
Because eventually we're going to want to default to v2, I would think
(after a few years have passed, at least).  It seems like we should be
making that decision centrally when we init the transport options.

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 00/11] Protocol version 2, again!
  2015-05-27  6:18 ` [RFC/WIP PATCH 00/11] Protocol version 2, again! Jeff King
@ 2015-05-27  7:08   ` Jeff King
  2015-06-01 17:49     ` Stefan Beller
  0 siblings, 1 reply; 61+ messages in thread
From: Jeff King @ 2015-05-27  7:08 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, gitster

On Wed, May 27, 2015 at 02:18:18AM -0400, Jeff King wrote:

> > The new protocol works just like the old protocol, except for
> > the capabilities negotiation being before any exchange of real data.
> 
> I like this approach. [...]

So now I've read through all the patches. I still like it. :)

There's a lot of work to be done still, but I think this is a great
start. Thanks for getting the ball rolling.

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 10/11] t5544: add a test case for the new protocol
  2015-05-26 22:01 ` [RFC/WIP PATCH 10/11] t5544: add a test case for the new protocol Stefan Beller
  2015-05-27  5:34   ` Eric Sunshine
@ 2015-05-27  7:12   ` Jeff King
  1 sibling, 0 replies; 61+ messages in thread
From: Jeff King @ 2015-05-27  7:12 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, gitster

On Tue, May 26, 2015 at 03:01:14PM -0700, Stefan Beller wrote:

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  t/t5544-fetch-2.sh | 40 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 40 insertions(+)
>  create mode 100755 t/t5544-fetch-2.sh

Obviously we are not there yet, but a fun test will be to globally bump
the transport version to "2" and try to run the test suite.

We will also want to test interoperation between v1 and v2. Though the
_best_ test of that is not hitting a v2 server configured for v1, but
hitting an actual older version of git. Which is outside the scope of
the current test suite, as it always operates on a single version.

It might be nice to have a separate test suite for doing
interoperability, that knows how to build various versions (there's
already some prior art in t/perf). I know this isn't the first time this
concept has come up.

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities
  2015-05-27  6:50     ` Jeff King
@ 2015-05-27 17:19       ` Eric Sunshine
  2015-05-27 20:09         ` Jeff King
  0 siblings, 1 reply; 61+ messages in thread
From: Eric Sunshine @ 2015-05-27 17:19 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, Git List, Nguyễn Thái Ngọc,
	Junio C Hamano

On Wed, May 27, 2015 at 2:50 AM, Jeff King <peff@peff.net> wrote:
> On Tue, May 26, 2015 at 11:25:05PM -0400, Eric Sunshine wrote:
>
>> > +               len = packet_read(in, &src_buf, &src_len,
>> > +                                 packet_buffer, sizeof(packet_buffer),
>> > +                                 PACKET_READ_GENTLE_ON_EOF |
>> > +                                 PACKET_READ_CHOMP_NEWLINE);
>> > +               if (len < 0)
>> > +                       die_initial_contact(0);
>> > +
>> > +               if (!len)
>> > +                       break;
>> > +
>> > +               if (len > 4 && skip_prefix(line, "ERR ", &arg))
>>
>> The 'len > 4' check is needed because there's no guarantee that 'line'
>> is NUL-terminated. Correct?
>
> I think this was just blindly copied from get_remote_heads(). And I
> think that code was being overly paranoid. Ever since f3a3214 (Make
> send/receive-pack be closer to doing something interesting, 2005-06-29),
> the pkt-line reader will add an extra NUL to the buffer to ease cases
> like this.

Thanks. I had started digging into packet_read() to determine whether
it guaranteed NUL-termination, but didn't get far enough to decide. I
agree that if NUL-termination is guaranteed, then the 'len > 4' check
is superfluous (and confusing, which is why it caught my attention in
the first place).

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack
  2015-05-27  6:35   ` Jeff King
@ 2015-05-27 17:30     ` Eric Sunshine
  2015-05-27 20:14       ` Jeff King
  2015-05-27 17:40     ` Stefan Beller
  1 sibling, 1 reply; 61+ messages in thread
From: Eric Sunshine @ 2015-05-27 17:30 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, Git List, Nguyễn Thái Ngọc Duy,
	Junio C Hamano

On Wed, May 27, 2015 at 2:35 AM, Jeff King <peff@peff.net> wrote:
> On Tue, May 26, 2015 at 03:01:08PM -0700, Stefan Beller wrote:
>> +static void send_capabilities(void)
>> +{
>> +     char buf[100];
>> +
>> +     while (next_capability(buf))
>> +             packet_write(1, "capability:%s\n", buf);
>
> Like Eric, I find the whole next_capability thing a little ugly. His
> suggestion to pass in the parsing state is an improvement, but I wonder
> why we need to parse at all. Can we keep the capabilities as:
>
>   const char *capabilities[] = {
>         "multi_ack",
>         "thin-pack",
>         ... etc ...
>   };
>
> and then loop over the array?

Yes, that would be much nicer. I also had this in mind but didn't know
if there was a strong reason for the capabilities to be shoehorned
into a single string as they are currently.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack
  2015-05-27  6:35   ` Jeff King
  2015-05-27 17:30     ` Eric Sunshine
@ 2015-05-27 17:40     ` Stefan Beller
  2015-05-27 20:34       ` Jeff King
  1 sibling, 1 reply; 61+ messages in thread
From: Stefan Beller @ 2015-05-27 17:40 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Duy Nguyen, Junio C Hamano

On Tue, May 26, 2015 at 11:35 PM, Jeff King <peff@peff.net> wrote:
> On Tue, May 26, 2015 at 03:01:08PM -0700, Stefan Beller wrote:
>
>> --- a/upload-pack.c
>> +++ b/upload-pack.c
>> @@ -716,10 +716,47 @@ static void format_symref_info(struct strbuf *buf, struct string_list *symref)
>>               strbuf_addf(buf, " symref=%s:%s", item->string, (char *)item->util);
>>  }
>>
>> -static const char *advertise_capabilities = "multi_ack thin-pack side-band"
>> +static char *advertise_capabilities = "multi_ack thin-pack side-band"
>>               " side-band-64k ofs-delta shallow no-progress"
>>               " include-tag multi_ack_detailed";
>
> If we are upload-pack-2, should we advertise that in the capabilities? I
> think it may make things easier later if we try to provide some
> opportunistic out-of-band data. E.g., if see tell git-daemon:
>
>   git-upload-pack repo\0host=whatever\0\0version=2
>
> how do we know whether version=2 was understood and kicked us into v2
> mode, versus an old server that ignored it?

So in my vision we would call git-upload-pack-2 instead of having a "version=2".
and if git-upload-pack-2 doesn't exist, the whole conversation is
over, the client
it is left to make up some good error message or retry version 1.

But I think advertising both which versions the server could deal
with, as well as
the currently expected version is a good thing.

capability: can_speak=1,2
capability: speaking_now=2

>
> It also just makes the protocol more self-describing; from the
> conversation you can see which version is in use.
>
>> +static void send_capabilities(void)
>> +{
>> +     char buf[100];
>> +
>> +     while (next_capability(buf))
>> +             packet_write(1, "capability:%s\n", buf);
>
> Having a static-sized buffer and then passing it to a function which has
> no idea of the buffer size seems like a recipe for a buffer overflow. :)

Yes. All patches I proposed are very brittle. My first goal was to have the
last test passing (an actual fetch with version 2). Now I will start looking at
making things robust, as by now you all seem to not disagree totally.

>
> You are fine here because you are parsing the hard-coded capabilities
> string, and we know 100 is enough there. But it's nice to avoid such
> magic.
>
> Like Eric, I find the whole next_capability thing a little ugly. His
> suggestion to pass in the parsing state is an improvement, but I wonder
> why we need to parse at all. Can we keep the capabilities as:
>
>   const char *capabilities[] = {
>         "multi_ack",
>         "thin-pack",
>         ... etc ...
>   };
>
> and then loop over the array? The v1 writer will need to be modified,
> but it should be fairly straightforward (loop and add them to the buffer
> rather than dumping the whole string).

Thanks for the design guidance! I'll change that.

>
> Also, do we need the capability noise-word?

I thought it opens up a new possible door in the future.
As we ignore anything not starting with "capability" for now, you
could introduce
your foo and bar ping pong easily and still be version 2 compatible.

S: capability: thin
S: capability: another-capability
S: ping-pong foo
S: done
C: (not having understood ping-pong) just answering with capability: thin
C: done, let's proceed to refs advertisement

The alternative client would do:

C: ping-pong: foo-data1a
S: ping-pong: foo-data1b
C: ping-pong: foo-data2a
C: capability: thin
...

> They all have it, except
> for:
>
>> +     packet_write(1, "agent:%s\n", git_user_agent_sanitized());
>
> But isn't that basically a capability, too (I agree it is stretching the
> definition of "capability", but I think that ship has sailed; the
> client's response is not "I support this, too" but "I want to enable
> this").
>
> IOW, is there a reason that the initial conversation is not just:
>
>   pkt-line("multi_ack");
>   pkt-line("thin-pack");
>   ...
>   pkt-line("agent=v2.5.0");
>   pkt-line(0000);
>
> We already have framing for each capability due to the use of pkt-line.
> The "capability:" is just one more thing the client has to parse past.
> The keys are already unique up to any "=", so I don't think there is any
> ambiguity (e.g., we don't care about "capability:agent"; we have already
> assigned a meaning to the term "agent", and will never introduce a
> standalone capability with the same name).

It looks clearer to me (we're not wasting band-width), maybe this ping pong
thing can be part of the capabilities announcement too, so we're good this way.

>
> Likewise, if we introduce new protocol items here, the client should
> either ignore them (if it does not understand them) or know what to do
> with them.
>
>> +static void receive_capabilities(void)
>> +{
>> +     int done = 0;
>> +     while (1) {
>> +             char *line = packet_read_line(0, NULL);
>> +             if (!line)
>> +                     break;
>> +             if (starts_with(line, "capability:"))
>> +                     parse_features(line + strlen("capability:"));
>> +     }
>
> Use:
>
>   const char *cap;
>   if (skip_prefix(line, "capability:", &cap))
>         parse_features(cap);
>
> Or better yet, if you take my suggestion above:
>
>   parse_features(line);

will do.

>
> :)
>
>> +static int endswith(const char *teststring, const char *ending)
>> +{
>> +     int slen = strlen(teststring);
>> +     int elen = strlen(ending);
>> +     return !strcmp(teststring + slen - elen, ending);
>> +}
>
> Eric mentioned the underflow problems here, but I wonder even more:
> what's wrong with the global ends_with() that we already provide?

I was missing knowledge we have that, and apparently I was thinking it's
faster to come up with my own version than to look for it. :)

>
> -Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 05/11] transport: add infrastructure to support a protocol version number
  2015-05-27  6:39   ` Jeff King
@ 2015-05-27 19:01     ` Stefan Beller
  2015-05-27 20:17       ` Jeff King
  2015-05-27 19:10     ` Junio C Hamano
  1 sibling, 1 reply; 61+ messages in thread
From: Stefan Beller @ 2015-05-27 19:01 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Duy Nguyen, Junio C Hamano

On Tue, May 26, 2015 at 11:39 PM, Jeff King <peff@peff.net> wrote:
> On Tue, May 26, 2015 at 03:01:09PM -0700, Stefan Beller wrote:
>
>> +     OPT_STRING('y', "transport-version", &transport_version,
>> +                N_("transport-version"),
>> +                N_("specify transport version to be used")),
>
> Interesting choice for the short option ("-v" would be nice, but
> obviously it is taken). Do we want to delay on claiming the
> short-and-sweet 'y' until we are sure this is something people will use
> a lot? In an ideal world, it is not (i.e., auto-upgrade and other tricks
> become good enough that nobody bothers to specify it manually).

I made the choice this way:
"Oh crap! -v is taken. so which letter is most likely not taken, so I
can move on?"

So if you have any other proposal with actual reasons I'd switch in a
heart beat.
I figured the -y is good to testing and debugging, but in an ideal
world we don't
want people messing with the transport option because of automatic upgrades
as you said.

Or do you rather hint on dropping the short option at all, and just having NULL
in the field?

>
> -Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 05/11] transport: add infrastructure to support a protocol version number
  2015-05-27  6:39   ` Jeff King
  2015-05-27 19:01     ` Stefan Beller
@ 2015-05-27 19:10     ` Junio C Hamano
  1 sibling, 0 replies; 61+ messages in thread
From: Junio C Hamano @ 2015-05-27 19:10 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jeff King, git, pclouds

Jeff King <peff@peff.net> writes:

> On Tue, May 26, 2015 at 03:01:09PM -0700, Stefan Beller wrote:
>
>> +	OPT_STRING('y', "transport-version", &transport_version,
>> +		   N_("transport-version"),
>> +		   N_("specify transport version to be used")),
>
> Interesting choice for the short option ("-v" would be nice, but
> obviously it is taken). Do we want to delay on claiming the
> short-and-sweet 'y' until we are sure this is something people will use
> a lot? In an ideal world, it is not (i.e., auto-upgrade and other tricks
> become good enough that nobody bothers to specify it manually).

Yes, just stuff 0 (not NULL but NUL) there; unless we have a very
good reason to believe that the option will be used every day to
toggle per invocation settings, we shouldn't squat on a short and
sweet single letter.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities
  2015-05-27 17:19       ` Eric Sunshine
@ 2015-05-27 20:09         ` Jeff King
  0 siblings, 0 replies; 61+ messages in thread
From: Jeff King @ 2015-05-27 20:09 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Stefan Beller, Git List, Nguyễn Thái Ngọc,
	Junio C Hamano

On Wed, May 27, 2015 at 01:19:39PM -0400, Eric Sunshine wrote:

> >> The 'len > 4' check is needed because there's no guarantee that 'line'
> >> is NUL-terminated. Correct?
> >
> > I think this was just blindly copied from get_remote_heads(). And I
> > think that code was being overly paranoid. Ever since f3a3214 (Make
> > send/receive-pack be closer to doing something interesting, 2005-06-29),
> > the pkt-line reader will add an extra NUL to the buffer to ease cases
> > like this.
> 
> Thanks. I had started digging into packet_read() to determine whether
> it guaranteed NUL-termination, but didn't get far enough to decide. I
> agree that if NUL-termination is guaranteed, then the 'len > 4' check
> is superfluous (and confusing, which is why it caught my attention in
> the first place).

Yeah, agreed that it should be cleaned up. Interestingly, if you dig on
that line, I've touched it several times myself and never noticed this.
:)

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack
  2015-05-27 17:30     ` Eric Sunshine
@ 2015-05-27 20:14       ` Jeff King
  0 siblings, 0 replies; 61+ messages in thread
From: Jeff King @ 2015-05-27 20:14 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Stefan Beller, Git List, Nguyễn Thái Ngọc Duy,
	Junio C Hamano

On Wed, May 27, 2015 at 01:30:28PM -0400, Eric Sunshine wrote:

> > Like Eric, I find the whole next_capability thing a little ugly. His
> > suggestion to pass in the parsing state is an improvement, but I wonder
> > why we need to parse at all. Can we keep the capabilities as:
> >
> >   const char *capabilities[] = {
> >         "multi_ack",
> >         "thin-pack",
> >         ... etc ...
> >   };
> >
> > and then loop over the array?
> 
> Yes, that would be much nicer. I also had this in mind but didn't know
> if there was a strong reason for the capabilities to be shoehorned
> into a single string as they are currently.

I don't think there is a good reason, beyond it being the simplest thing
for the current code to work. But as you can see from the existing
packet_write() in upload-pack, it's already going through some
contortions to handle optional capabilities (i.e., "capabilities" is by
no means the full list anymore).

Doing it item by item will mean we can't use a single packet_write() in
the v1 code, but it's OK to format into a buffer here (we already need
such a buffer for format_symref_info anyway).

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 05/11] transport: add infrastructure to support a protocol version number
  2015-05-27 19:01     ` Stefan Beller
@ 2015-05-27 20:17       ` Jeff King
  0 siblings, 0 replies; 61+ messages in thread
From: Jeff King @ 2015-05-27 20:17 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Duy Nguyen, Junio C Hamano

On Wed, May 27, 2015 at 12:01:50PM -0700, Stefan Beller wrote:

> > Interesting choice for the short option ("-v" would be nice, but
> > obviously it is taken). Do we want to delay on claiming the
> > short-and-sweet 'y' until we are sure this is something people will use
> > a lot? In an ideal world, it is not (i.e., auto-upgrade and other tricks
> > become good enough that nobody bothers to specify it manually).
> [...]
> Or do you rather hint on dropping the short option at all, and just having NULL
> in the field?

Yes, that's what I was hinting.

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack
  2015-05-27 17:40     ` Stefan Beller
@ 2015-05-27 20:34       ` Jeff King
  2015-05-27 20:45         ` Stefan Beller
  0 siblings, 1 reply; 61+ messages in thread
From: Jeff King @ 2015-05-27 20:34 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Duy Nguyen, Junio C Hamano

On Wed, May 27, 2015 at 10:40:37AM -0700, Stefan Beller wrote:

> > If we are upload-pack-2, should we advertise that in the capabilities? I
> > think it may make things easier later if we try to provide some
> > opportunistic out-of-band data. E.g., if see tell git-daemon:
> >
> >   git-upload-pack repo\0host=whatever\0\0version=2
> >
> > how do we know whether version=2 was understood and kicked us into v2
> > mode, versus an old server that ignored it?
> 
> So in my vision we would call git-upload-pack-2 instead of having a "version=2".
> and if git-upload-pack-2 doesn't exist, the whole conversation is
> over, the client
> it is left to make up some good error message or retry version 1.

I'd like for that to be a starting point for us, and then to be able to
add-on "hints" to ease the transition path in whatever way we want. We
may even not do that in the long run, but I want to leave the door open
if we can.

> But I think advertising both which versions the server could deal
> with, as well as
> the currently expected version is a good thing.
> 
> capability: can_speak=1,2
> capability: speaking_now=2

I was thinking just "speaking_now=2", but it probably makes sense to do
both. I do not think using it to "downgrade" will ever be particularly
useful (certainly not from v2 to v1, since to understand the flag both
sides must be v2 in the first place). But advertising that via the v1
conversation will be a good way to tell the other side that upgrade is
possible.

> > Also, do we need the capability noise-word?
> 
> I thought it opens up a new possible door in the future.
> As we ignore anything not starting with "capability" for now, you
> could introduce
> your foo and bar ping pong easily and still be version 2 compatible.
> 
> S: capability: thin
> S: capability: another-capability
> S: ping-pong foo
> S: done
> C: (not having understood ping-pong) just answering with capability: thin
> C: done, let's proceed to refs advertisement
> 
> The alternative client would do:
> 
> C: ping-pong: foo-data1a
> S: ping-pong: foo-data1b
> C: ping-pong: foo-data2a
> C: capability: thin
> ...

Right, but I think (and please correct me if there's a case I'm missing)
that the behavior is the same whether it is spelled "ping-pong" or
"capability:ping-pong". That is, the rule for "capability:" is "if you
do not understand it, ignore it and do not mention it in your
capabilities; the server is required to assume you were written before
that capability was invented". But that is _also_ the rule for
ping-pong, I think.

> > Eric mentioned the underflow problems here, but I wonder even more:
> > what's wrong with the global ends_with() that we already provide?
> 
> I was missing knowledge we have that, and apparently I was thinking it's
> faster to come up with my own version than to look for it. :)

Makes sense. :)

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack
  2015-05-27 20:34       ` Jeff King
@ 2015-05-27 20:45         ` Stefan Beller
  2015-05-27 21:46           ` Jeff King
  0 siblings, 1 reply; 61+ messages in thread
From: Stefan Beller @ 2015-05-27 20:45 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Duy Nguyen, Junio C Hamano

On Wed, May 27, 2015 at 1:34 PM, Jeff King <peff@peff.net> wrote:
> On Wed, May 27, 2015 at 10:40:37AM -0700, Stefan Beller wrote:
>
>> > If we are upload-pack-2, should we advertise that in the capabilities? I
>> > think it may make things easier later if we try to provide some
>> > opportunistic out-of-band data. E.g., if see tell git-daemon:
>> >
>> >   git-upload-pack repo\0host=whatever\0\0version=2
>> >
>> > how do we know whether version=2 was understood and kicked us into v2
>> > mode, versus an old server that ignored it?
>>
>> So in my vision we would call git-upload-pack-2 instead of having a "version=2".
>> and if git-upload-pack-2 doesn't exist, the whole conversation is
>> over, the client
>> it is left to make up some good error message or retry version 1.
>
> I'd like for that to be a starting point for us, and then to be able to
> add-on "hints" to ease the transition path in whatever way we want. We
> may even not do that in the long run, but I want to leave the door open
> if we can.
>
>> But I think advertising both which versions the server could deal
>> with, as well as
>> the currently expected version is a good thing.
>>
>> capability: can_speak=1,2
>> capability: speaking_now=2
>
> I was thinking just "speaking_now=2", but it probably makes sense to do
> both. I do not think using it to "downgrade" will ever be particularly
> useful (certainly not from v2 to v1, since to understand the flag both
> sides must be v2 in the first place). But advertising that via the v1
> conversation will be a good way to tell the other side that upgrade is
> possible.

If for some reason we discover a flaw in the current version, which
makes it unusable
(a buffer overflow?, some stupid abuse which makes the capability list huge),
you may want to force downgrading (and in the very distant future when we are
current on version 4 and have dropped version 1 already, you can only downgrade
to 2 and 3, so I can see value in it.

Another idea to make it all more future proof:
"capability: speaking_now=2" must be sent as the first line, so then
you can adapt
on the client side easily for which version you are listening.

>
>> > Also, do we need the capability noise-word?
>>
>> I thought it opens up a new possible door in the future.
>> As we ignore anything not starting with "capability" for now, you
>> could introduce
>> your foo and bar ping pong easily and still be version 2 compatible.
>>
>> S: capability: thin
>> S: capability: another-capability
>> S: ping-pong foo
>> S: done
>> C: (not having understood ping-pong) just answering with capability: thin
>> C: done, let's proceed to refs advertisement
>>
>> The alternative client would do:
>>
>> C: ping-pong: foo-data1a
>> S: ping-pong: foo-data1b
>> C: ping-pong: foo-data2a
>> C: capability: thin
>> ...
>
> Right, but I think (and please correct me if there's a case I'm missing)
> that the behavior is the same whether it is spelled "ping-pong" or
> "capability:ping-pong". That is, the rule for "capability:" is "if you
> do not understand it, ignore it and do not mention it in your
> capabilities; the server is required to assume you were written before
> that capability was invented". But that is _also_ the rule for
> ping-pong, I think.

The rules are the same, right. But the allowed characters are limited
(in theory)
as the regular expressions given for the capabilities don't allow for
binary data
for example, but only well formed ASCII text, space separated.
The "ping-pong" keyword could introduce a binary stream there
including line feeds. (Today it sounds like a stupid idea though)

>
>> > Eric mentioned the underflow problems here, but I wonder even more:
>> > what's wrong with the global ends_with() that we already provide?
>>
>> I was missing knowledge we have that, and apparently I was thinking it's
>> faster to come up with my own version than to look for it. :)
>
> Makes sense. :)
>
> -Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack
  2015-05-27 20:45         ` Stefan Beller
@ 2015-05-27 21:46           ` Jeff King
  0 siblings, 0 replies; 61+ messages in thread
From: Jeff King @ 2015-05-27 21:46 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Duy Nguyen, Junio C Hamano

On Wed, May 27, 2015 at 01:45:55PM -0700, Stefan Beller wrote:

> > Right, but I think (and please correct me if there's a case I'm
> > missing) that the behavior is the same whether it is spelled
> > "ping-pong" or "capability:ping-pong". That is, the rule for
> > "capability:" is "if you do not understand it, ignore it and do not
> > mention it in your capabilities; the server is required to assume
> > you were written before that capability was invented". But that is
> > _also_ the rule for ping-pong, I think.
> 
> The rules are the same, right. But the allowed characters are limited
> (in theory) as the regular expressions given for the capabilities
> don't allow for binary data for example, but only well formed ASCII
> text, space separated.  The "ping-pong" keyword could introduce a
> binary stream there including line feeds. (Today it sounds like a
> stupid idea though)

Do we need that restriction? IOW, as long as we parse from the start of
the packet and give up as soon as we realize it is not a thing we
understand, I do not think anybody is hurt by the contents of the item.

E.g., if an old client sees:

   00XXping-pong=<binary goo>

It knows:

  1. The item starts with ping-pong; we don't know what that is, so we
     never even bother looking at the binary goo.

  2. It's in a pkt-line. We read to the end of the packet line and throw
     the rest of the data away. Now we're synchronized to read the next
     item.

Of course, "ping-pong" may also introduce another phase to the protocol
which is not encapsulated in pkt-lines (e.g., if the data is too big to
fit right inside the capability pkt-line, or if it needs a lot of back
and forth like ref negotiation). But then we only proceed to that
phase if both sides have said "I understand ping-pong".

So I think we are capable of boot-strapping just about anything using
capability negotiation (with the exception of fixing the capability
negotiation itself; but even that, we can bootstrap a second more
intensive capability negotiation using a capability in the initial
list).

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities
  2015-05-27  6:45   ` Jeff King
@ 2015-05-29 19:39     ` Stefan Beller
  2015-05-29 22:08       ` Jeff King
  0 siblings, 1 reply; 61+ messages in thread
From: Stefan Beller @ 2015-05-29 19:39 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Duy Nguyen, Junio C Hamano

On Tue, May 26, 2015 at 11:45 PM, Jeff King <peff@peff.net> wrote:
> On Tue, May 26, 2015 at 03:01:10PM -0700, Stefan Beller wrote:
>
>> +void get_remote_capabilities(int in, char *src_buf, size_t src_len)
>> +{
>> +     struct strbuf capabilities_string = STRBUF_INIT;
>> +     for (;;) {
>> +             int len;
>> +             char *line = packet_buffer;
>> +             const char *arg;
>> +
>> +             len = packet_read(in, &src_buf, &src_len,
>> +                               packet_buffer, sizeof(packet_buffer),
>> +                               PACKET_READ_GENTLE_ON_EOF |
>> +                               PACKET_READ_CHOMP_NEWLINE);
>> +             if (len < 0)
>> +                     die_initial_contact(0);
>> +
>> +             if (!len)
>> +                     break;
>> +
>> +             if (len > 4 && skip_prefix(line, "ERR ", &arg))
>> +                     die("remote error: %s", arg);
>> +
>> +             if (starts_with(line, "capability:")) {
>> +                     strbuf_addstr(&capabilities_string, line + strlen("capability:"));
>> +                     strbuf_addch(&capabilities_string, ' ');
>> +             }
>> +     }
>
> I think this is the reverse case of next_capabilities in the upload-pack
> side, so I'll make the reverse suggestion. :) Would it make things nicer
> if both v1 and v2 parsed the capabilities into a string_list?

Ok, I'll do that. Though this makes future enhancements a bit uneasy.
Say we want to transport a message by the server admins, this might be
the right place to do.

    if (starts_with("message"))
        fprintf(stderr, ....

Of course we can later add this in the future, but it would break older
clients (clients as of this patch series). That's why I like the idea of
adding a prefix here. Maybe just a "c:" as an abbreviation for capability.
But now making it short and concise makes it painful in the future when
we want to transport anything else apart from capabilities in this phase
of the protocol exchange. Of course we could have a capability "server-message"
indicating that after the capabilities we'll have a dedicated message to be
displayed to the user.

Yeah well that should do.

I'll just parse in a string_list for now.

>
>> +     free(server_capabilities);
>> +     server_capabilities = xmalloc(capabilities_string.len + 1);
>> +     server_capabilities = strbuf_detach(&capabilities_string, NULL);
>
> Is this xmalloc line left over? The strbuf_detach will write over it.

No, I wasn't reading the fine documentation and assuming things I
should not.

>
>> +     strbuf_release(&capabilities_string);
>
> No need to release if we just detached.
>
>> +int request_capabilities(int out)
>> +{
>> +     fprintf(stderr, "request_capabilities\n");
>
> Debug cruft, I presume.
>
>> +     // todo: send our capabilities
>> +     packet_write(out, "capability:foo");
>
> No C99 comments. But I think this is just a placeholder. :)
>
> -Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 11/11] Document protocol version 2
  2015-05-26 22:01 ` [RFC/WIP PATCH 11/11] Document protocol version 2 Stefan Beller
@ 2015-05-29 20:35   ` Junio C Hamano
  2015-05-29 21:36     ` Stefan Beller
  0 siblings, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2015-05-29 20:35 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, pclouds, peff

Stefan Beller <sbeller@google.com> writes:

> @@ -1,11 +1,11 @@
>  Packfile transfer protocols
>  ===========================
>  
> -Git supports transferring data in packfiles over the ssh://, git:// and
> +Git supports transferring data in packfiles over the ssh://, git://, http:// and

When you have chance, can you do things like this, which is a clear
improvement of the current document even if we never had v2, as
separate patches?

> +Capability discovery (v2)
> +-------------------------
> ...
> +  capability-list  =  *(capability) [agent LF] flush-pkt
> +  capability       =  PKT-LINE("capability:" keyvaluepair LF)
> +  agent            =  keyvaluepair LF
> +  keyvaluepair     =  1*(LC_ALPHA / DIGIT / "-" / "_" / "=")

What is the "=" doing there?  If you meant to cover things like
"lang=en" with this, I do not think it is a good idea.  Rather, it
should be more like this:

	capability = 1*(LC_ALPHA / DIGIT / "-" / "_") [ "=" value ]
	value = 0*( any octet other than LF, NUL )

in order to leave us wiggle room to have more than very limited
subset of US-ASCII in 'value'.  I suspect that we may want to allow
anything other than LF (unlike v1 that allowed anything other than
SP and LF).

> +  LC_ALPHA         =  %x61-7A
> +----
> +
> +The client MUST ignore any data on pkt-lines starting with anything
> +different than "capability" for future ease of extension.
> +
> +The client MUST NOT ask for capabilities the server did not say it
> +supports. The server MUST diagnose and abort if capabilities it does
> +not understand was requested. The server MUST NOT ignore capabilities
> +that client requested and server advertised.  As a consequence of these
> +rules, server MUST NOT advertise capabilities it does not understand.

I think it was already discussed that we shouldn't do the
"capability:" and "cap:" prefixes in reviews of earlier parts, so
the details of this part would be updated?

> @@ -154,10 +203,14 @@ If HEAD is a valid ref, HEAD MUST appear as the first advertised
>  ref.  If HEAD is not a valid ref, HEAD MUST NOT appear in the
>  advertisement list at all, but other refs may still appear.
>  
> -The stream MUST include capability declarations behind a NUL on the
> -first ref. The peeled value of a ref (that is "ref^{}") MUST be
> -immediately after the ref itself, if presented. A conforming server
> -MUST peel the ref if it's an annotated tag.
> +In version 1 the stream MUST include capability declarations behind
> +a NUL on the first ref. The peeled value of a ref (that is "ref^{}")
> +MUST be immediately after the ref itself, if presented. A conforming
> +server MUST peel the ref if it's an annotated tag.
> +
> +In version 2 the capabilities are already negotiated, so the first ref
> +MUST NOT be followed by any capability advertisement, but it should be
> +treated as any other refs advertising line.

Sensible.

> @@ -178,13 +231,28 @@ MUST peel the ref if it's an annotated tag.
>    shallow          =  PKT-LINE("shallow" SP obj-id)
>  
>    capability-list  =  capability *(SP capability)
> -  capability       =  1*(LC_ALPHA / DIGIT / "-" / "_")
> +  capability       =  1*(LC_ALPHA / DIGIT / "-" / "_" / "=")

Ditto.

Thanks.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 11/11] Document protocol version 2
  2015-05-29 20:35   ` Junio C Hamano
@ 2015-05-29 21:36     ` Stefan Beller
  2015-05-29 21:52       ` Junio C Hamano
  0 siblings, 1 reply; 61+ messages in thread
From: Stefan Beller @ 2015-05-29 21:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Duy Nguyen, Jeff King

On Fri, May 29, 2015 at 1:35 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> @@ -1,11 +1,11 @@
>>  Packfile transfer protocols
>>  ===========================
>>
>> -Git supports transferring data in packfiles over the ssh://, git:// and
>> +Git supports transferring data in packfiles over the ssh://, git://, http:// and
>
> When you have chance, can you do things like this, which is a clear
> improvement of the current document even if we never had v2, as
> separate patches?

will do.

>
>> +Capability discovery (v2)
>> +-------------------------
>> ...
>> +  capability-list  =  *(capability) [agent LF] flush-pkt
>> +  capability       =  PKT-LINE("capability:" keyvaluepair LF)
>> +  agent            =  keyvaluepair LF
>> +  keyvaluepair     =  1*(LC_ALPHA / DIGIT / "-" / "_" / "=")
>
> What is the "=" doing there?  If you meant to cover things like
> "lang=en" with this, I do not think it is a good idea.  Rather, it
> should be more like this:
>
>         capability = 1*(LC_ALPHA / DIGIT / "-" / "_") [ "=" value ]
>         value = 0*( any octet other than LF, NUL )
>
> in order to leave us wiggle room to have more than very limited
> subset of US-ASCII in 'value'.  I suspect that we may want to allow
> anything other than LF (unlike v1 that allowed anything other than
> SP and LF).

Currently we can do a = as part of the line after the first ref, such as

    symref=HEAD:refs/heads/master agent=git/2:2.4.0

so I thought we want to keep this. And below I just corrected what I thought
was a difference between documentation and implementation.

>
>> +  LC_ALPHA         =  %x61-7A
>> +----
>> +
>> +The client MUST ignore any data on pkt-lines starting with anything
>> +different than "capability" for future ease of extension.
>> +
>> +The client MUST NOT ask for capabilities the server did not say it
>> +supports. The server MUST diagnose and abort if capabilities it does
>> +not understand was requested. The server MUST NOT ignore capabilities
>> +that client requested and server advertised.  As a consequence of these
>> +rules, server MUST NOT advertise capabilities it does not understand.
>
> I think it was already discussed that we shouldn't do the
> "capability:" and "cap:" prefixes in reviews of earlier parts, so
> the details of this part would be updated?

sure

>
>> @@ -154,10 +203,14 @@ If HEAD is a valid ref, HEAD MUST appear as the first advertised
>>  ref.  If HEAD is not a valid ref, HEAD MUST NOT appear in the
>>  advertisement list at all, but other refs may still appear.
>>
>> -The stream MUST include capability declarations behind a NUL on the
>> -first ref. The peeled value of a ref (that is "ref^{}") MUST be
>> -immediately after the ref itself, if presented. A conforming server
>> -MUST peel the ref if it's an annotated tag.
>> +In version 1 the stream MUST include capability declarations behind
>> +a NUL on the first ref. The peeled value of a ref (that is "ref^{}")
>> +MUST be immediately after the ref itself, if presented. A conforming
>> +server MUST peel the ref if it's an annotated tag.
>> +
>> +In version 2 the capabilities are already negotiated, so the first ref
>> +MUST NOT be followed by any capability advertisement, but it should be
>> +treated as any other refs advertising line.
>
> Sensible.
>
>> @@ -178,13 +231,28 @@ MUST peel the ref if it's an annotated tag.
>>    shallow          =  PKT-LINE("shallow" SP obj-id)
>>
>>    capability-list  =  capability *(SP capability)
>> -  capability       =  1*(LC_ALPHA / DIGIT / "-" / "_")
>> +  capability       =  1*(LC_ALPHA / DIGIT / "-" / "_" / "=")
>
> Ditto.
>
> Thanks.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 11/11] Document protocol version 2
  2015-05-29 21:36     ` Stefan Beller
@ 2015-05-29 21:52       ` Junio C Hamano
  2015-05-29 22:21         ` Jeff King
  0 siblings, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2015-05-29 21:52 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Duy Nguyen, Jeff King

Stefan Beller <sbeller@google.com> writes:

>>> +Capability discovery (v2)
>>> +-------------------------
>>> ...
>>> +  capability-list  =  *(capability) [agent LF] flush-pkt
>>> +  capability       =  PKT-LINE("capability:" keyvaluepair LF)
>>> +  agent            =  keyvaluepair LF
>>> +  keyvaluepair     =  1*(LC_ALPHA / DIGIT / "-" / "_" / "=")
>>
>> What is the "=" doing there?  If you meant to cover things like
>> "lang=en" with this, I do not think it is a good idea.  Rather, it
>> should be more like this:
>>
>>         capability = 1*(LC_ALPHA / DIGIT / "-" / "_") [ "=" value ]
>>         value = 0*( any octet other than LF, NUL )
>>
>> in order to leave us wiggle room to have more than very limited
>> subset of US-ASCII in 'value'.  I suspect that we may want to allow
>> anything other than LF (unlike v1 that allowed anything other than
>> SP and LF).
>
> Currently we can do a = as part of the line after the first ref, such as
>
>     symref=HEAD:refs/heads/master agent=git/2:2.4.0
>
> so I thought we want to keep this.

I do not understand that statement.

Capability exchange in v2 is one packet per cap, so the above
example would be expressed as:

	symref=HEAD:refs/heads/master
        agent=git/2:2.4.0

right?  Your "keyvaluepair" is limited to [a-z0-9-_=]*, and neither
of the above two can be expressed with that, which was why I said
you need two different set of characters before and after "=".  Left
hand side of "=" is tightly limited and that is OK.  Right hand side
may contain characters like ':', '.' and '/', so your alphabet need
to be more lenient, even in v1 (which I would imagine would be "any
octet other than SP, LF and NUL").

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities
  2015-05-29 19:39     ` Stefan Beller
@ 2015-05-29 22:08       ` Jeff King
  0 siblings, 0 replies; 61+ messages in thread
From: Jeff King @ 2015-05-29 22:08 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Duy Nguyen, Junio C Hamano

On Fri, May 29, 2015 at 12:39:35PM -0700, Stefan Beller wrote:

> > I think this is the reverse case of next_capabilities in the upload-pack
> > side, so I'll make the reverse suggestion. :) Would it make things nicer
> > if both v1 and v2 parsed the capabilities into a string_list?
> 
> Ok, I'll do that. Though this makes future enhancements a bit uneasy.
> Say we want to transport a message by the server admins, this might be
> the right place to do.
> 
>     if (starts_with("message"))
>         fprintf(stderr, ....
> 
> Of course we can later add this in the future, but it would break older
> clients (clients as of this patch series). That's why I like the idea of
> adding a prefix here. Maybe just a "c:" as an abbreviation for capability.

I don't understand how that breaks existing clients. Under your scheme,
the older client says "message? That does not start with capability:, so
I must ignore it". Without the "capability:" flag, it becomes "message?
I do not know that type, so I must ignore it".

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 11/11] Document protocol version 2
  2015-05-29 21:52       ` Junio C Hamano
@ 2015-05-29 22:21         ` Jeff King
  2015-06-01 23:14           ` Stefan Beller
  0 siblings, 1 reply; 61+ messages in thread
From: Jeff King @ 2015-05-29 22:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, git, Duy Nguyen

On Fri, May 29, 2015 at 02:52:14PM -0700, Junio C Hamano wrote:

> > Currently we can do a = as part of the line after the first ref, such as
> >
> >     symref=HEAD:refs/heads/master agent=git/2:2.4.0
> >
> > so I thought we want to keep this.
> 
> I do not understand that statement.
> 
> Capability exchange in v2 is one packet per cap, so the above
> example would be expressed as:
> 
> 	symref=HEAD:refs/heads/master
>         agent=git/2:2.4.0
> 
> right?  Your "keyvaluepair" is limited to [a-z0-9-_=]*, and neither
> of the above two can be expressed with that, which was why I said
> you need two different set of characters before and after "=".  Left
> hand side of "=" is tightly limited and that is OK.  Right hand side
> may contain characters like ':', '.' and '/', so your alphabet need
> to be more lenient, even in v1 (which I would imagine would be "any
> octet other than SP, LF and NUL").

Yes. See git_user_agent_sanitized(), for example, which allows basically
any printable ASCII except for SP.

I think the v2 capabilities do not even need to have that restriction.
It can allow arbitrary binary data, because it has an 8bit-clean framing
mechanism (pkt-lines). Of course, that means such capabilities cannot be
represented in a v1 conversation (whose framing mechanism involves SP
and NUL). But it's probably acceptable to introduce new capabilities
which are only available in a v2 conversation. Old clients that do not
understand v2 would not understand the capability either. It does
require new clients implementing the capability to _also_ implement v2
if they have not done so, but I do not mind pushing people in that
direction.

The initial v2 client implementation should probably do a few cautionary
things, then:

  1. Do _not_ fold the per-pkt capabilities into a v1 string; that loses
     the robust framing. I suggested string_list earlier, but probably
     we want a list of ptr/len pair, so that it can remain NUL-clean.

  2. Avoid holding on to unknown packets longer than necessary. Some
     capability pkt-lines may be arbitrarily large (up to 64K). If we do
     not understand them during the v2 read of the capabilities, there
     is no point hanging on to them. It's not _wrong_ to do so, but just
     inefficient; if we know that clients will just throw away unknown
     packets, then we can later introduce new packets with large data,
     without worrying about wasting the client's resources.

     I suspect it's not that big a deal either way, though. I have no
     plans for sending a bunch of large packets, and anyway network
     bandwidth is probably more precious than client memory.

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 00/11] Protocol version 2, again!
  2015-05-27  7:08   ` Jeff King
@ 2015-06-01 17:49     ` Stefan Beller
  2015-06-02 10:10       ` Duy Nguyen
  2015-06-04 13:09       ` Jeff King
  0 siblings, 2 replies; 61+ messages in thread
From: Stefan Beller @ 2015-06-01 17:49 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Duy Nguyen, Junio C Hamano

On Wed, May 27, 2015 at 12:08 AM, Jeff King <peff@peff.net> wrote:
> On Wed, May 27, 2015 at 02:18:18AM -0400, Jeff King wrote:
>
>> > The new protocol works just like the old protocol, except for
>> > the capabilities negotiation being before any exchange of real data.
>>
>> I like this approach. [...]
>
> So now I've read through all the patches. I still like it. :)
>
> There's a lot of work to be done still, but I think this is a great
> start. Thanks for getting the ball rolling.
>
> -Peff

Thanks for the reviews. I think I got them all covered by now and
I am pretty happy about the upload-pack part for now.

However the client side with builtin/fetch, builtin/fetch-pack, fetch-pack
is a bit of a mystery to me, as I cannot fully grasp the difference between
 * connect.{h,c}
 * remote.{h.c}
 * transport.{h.c}
there. All of it seems to be doing network related stuff, but I have trouble
getting the big picture. I am assuming all of these 3 are rather a low level,
used like a library, though there must be even more hierarchy in there,
connect is most low level judging from the header file and used by
the other two.
transport.h seems to provide the most toplevel library stuff as it includes
remote.h in its header?

The problem I am currently trying to tackle, is passing the options through all
the layers early on. so in a few places we have code like

    switch (version) {
    case 2: /* first talk about capabilities, then get the heads */
        get_remote_capabilities(data->fd[0], NULL, 0);
        select_capabilities();
        request_capabilities(data->fd[1]);
        /* fall through */
    case 1:
        get_remote_heads(data->fd[0], NULL, 0, &refs,
                 for_push ? REF_NORMAL : 0,
                 &data->extra_have,
                 &data->shallow);
        break;
    default:
        die("BUG: Transport version %d not supported", version);
        break;
    }

and the issue I am concerned about is the select_capabilities as well as
the request_capabilities function here. The select_capabilities functionality
is currently residing in the high level parts of the code as it both depends on
the advertised server capabilities and on the user input (--use-frotz or config
options), so the capability selection is done in fetchpack.c

So there are 2 routes to go: Either we leave the select_capabilities in the
upper layers (near the actual high level command, fetch, fetchpack) or we put
it into the transport layer and just passing in a struct what the user desires.
And when the users desire doesn't meet the server capabilities we die deep down
in the transport layer.

Any advice on how to proceed there welcome!

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 11/11] Document protocol version 2
  2015-05-29 22:21         ` Jeff King
@ 2015-06-01 23:14           ` Stefan Beller
  2015-06-01 23:40             ` Stefan Beller
  2015-06-02 17:06             ` Junio C Hamano
  0 siblings, 2 replies; 61+ messages in thread
From: Stefan Beller @ 2015-06-01 23:14 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Duy Nguyen

On Fri, May 29, 2015 at 3:21 PM, Jeff King <peff@peff.net> wrote:
> On Fri, May 29, 2015 at 02:52:14PM -0700, Junio C Hamano wrote:
>
>> > Currently we can do a = as part of the line after the first ref, such as
>> >
>> >     symref=HEAD:refs/heads/master agent=git/2:2.4.0
>> >
>> > so I thought we want to keep this.
>>
>> I do not understand that statement.
>>
>> Capability exchange in v2 is one packet per cap, so the above
>> example would be expressed as:
>>
>>       symref=HEAD:refs/heads/master
>>         agent=git/2:2.4.0
>>
>> right?  Your "keyvaluepair" is limited to [a-z0-9-_=]*, and neither
>> of the above two can be expressed with that, which was why I said
>> you need two different set of characters before and after "=".  Left
>> hand side of "=" is tightly limited and that is OK.  Right hand side
>> may contain characters like ':', '.' and '/', so your alphabet need
>> to be more lenient, even in v1 (which I would imagine would be "any
>> octet other than SP, LF and NUL").

I think the recent issue with the push certificates shows that having arbitrary
data after the = is a bad idea. So we need to be very cautious when to allow
which data after the =.

I'll try split up the patch.

>
> Yes. See git_user_agent_sanitized(), for example, which allows basically
> any printable ASCII except for SP.
>
> I think the v2 capabilities do not even need to have that restriction.
> It can allow arbitrary binary data, because it has an 8bit-clean framing
> mechanism (pkt-lines). Of course, that means such capabilities cannot be
> represented in a v1 conversation (whose framing mechanism involves SP
> and NUL). But it's probably acceptable to introduce new capabilities
> which are only available in a v2 conversation. Old clients that do not
> understand v2 would not understand the capability either. It does
> require new clients implementing the capability to _also_ implement v2
> if they have not done so, but I do not mind pushing people in that
> direction.
>
> The initial v2 client implementation should probably do a few cautionary
> things, then:
>
>   1. Do _not_ fold the per-pkt capabilities into a v1 string; that loses
>      the robust framing. I suggested string_list earlier, but probably
>      we want a list of ptr/len pair, so that it can remain NUL-clean.
>
>   2. Avoid holding on to unknown packets longer than necessary. Some
>      capability pkt-lines may be arbitrarily large (up to 64K). If we do
>      not understand them during the v2 read of the capabilities, there
>      is no point hanging on to them. It's not _wrong_ to do so, but just
>      inefficient; if we know that clients will just throw away unknown
>      packets, then we can later introduce new packets with large data,
>      without worrying about wasting the client's resources.
>
>      I suspect it's not that big a deal either way, though. I have no
>      plans for sending a bunch of large packets, and anyway network
>      bandwidth is probably more precious than client memory.

That's very sensible thoughts after rereading this email. The version
I'll be sending out today will not follow those suggestions though. :(

>
> -Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 11/11] Document protocol version 2
  2015-06-01 23:14           ` Stefan Beller
@ 2015-06-01 23:40             ` Stefan Beller
  2015-06-04 13:18               ` Jeff King
  2015-06-02 17:06             ` Junio C Hamano
  1 sibling, 1 reply; 61+ messages in thread
From: Stefan Beller @ 2015-06-01 23:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Duy Nguyen

On Mon, Jun 1, 2015 at 4:14 PM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, May 29, 2015 at 3:21 PM, Jeff King <peff@peff.net> wrote:
>> On Fri, May 29, 2015 at 02:52:14PM -0700, Junio C Hamano wrote:
>>
>>> > Currently we can do a = as part of the line after the first ref, such as
>>> >
>>> >     symref=HEAD:refs/heads/master agent=git/2:2.4.0
>>> >
>>> > so I thought we want to keep this.
>>>
>>> I do not understand that statement.
>>>
>>> Capability exchange in v2 is one packet per cap, so the above
>>> example would be expressed as:
>>>
>>>       symref=HEAD:refs/heads/master
>>>         agent=git/2:2.4.0
>>>
>>> right?  Your "keyvaluepair" is limited to [a-z0-9-_=]*, and neither
>>> of the above two can be expressed with that, which was why I said
>>> you need two different set of characters before and after "=".  Left
>>> hand side of "=" is tightly limited and that is OK.  Right hand side
>>> may contain characters like ':', '.' and '/', so your alphabet need
>>> to be more lenient, even in v1 (which I would imagine would be "any
>>> octet other than SP, LF and NUL").
>
> I think the recent issue with the push certificates shows that having arbitrary
> data after the = is a bad idea. So we need to be very cautious when to allow
> which data after the =.
>
> I'll try split up the patch.
>
>>
>> Yes. See git_user_agent_sanitized(), for example, which allows basically
>> any printable ASCII except for SP.
>>
>> I think the v2 capabilities do not even need to have that restriction.
>> It can allow arbitrary binary data, because it has an 8bit-clean framing
>> mechanism (pkt-lines). Of course, that means such capabilities cannot be
>> represented in a v1 conversation (whose framing mechanism involves SP
>> and NUL). But it's probably acceptable to introduce new capabilities
>> which are only available in a v2 conversation. Old clients that do not
>> understand v2 would not understand the capability either. It does
>> require new clients implementing the capability to _also_ implement v2
>> if they have not done so, but I do not mind pushing people in that
>> direction.
>>
>> The initial v2 client implementation should probably do a few cautionary
>> things, then:
>>
>>   1. Do _not_ fold the per-pkt capabilities into a v1 string; that loses
>>      the robust framing. I suggested string_list earlier, but probably
>>      we want a list of ptr/len pair, so that it can remain NUL-clean.
>>
>>   2. Avoid holding on to unknown packets longer than necessary. Some
>>      capability pkt-lines may be arbitrarily large (up to 64K). If we do
>>      not understand them during the v2 read of the capabilities, there
>>      is no point hanging on to them. It's not _wrong_ to do so, but just
>>      inefficient; if we know that clients will just throw away unknown
>>      packets, then we can later introduce new packets with large data,
>>      without worrying about wasting the client's resources.
>>
>>      I suspect it's not that big a deal either way, though. I have no
>>      plans for sending a bunch of large packets, and anyway network
>>      bandwidth is probably more precious than client memory.
>
> That's very sensible thoughts after rereading this email. The version
> I'll be sending out today will not follow those suggestions though. :(

Thinking about this further, maybe it is a good idea to restrict the
capabilities
advertising to alphabetical order?

The exchange would look like this:

server:
  for capability in list:
      pkt_write(capability)
  pkt_flush

client:
  do
    line = recv_pkt()
    parse_capability(line)
  while line != flush

with parse_capability checking if we know the capability and maybe setting some
internal field if we know this capability.

Now if we assume the number of capabilities grows over time a lot (someone may
"abuse" it for a cool feature, similar to the refs currently. Nobody
thought about
having so many refs in advance)

So how does parse_capability scale w.r.t the number of capabilities?
If parse_capability is just a linear search then it is O(n) and with n
capabilities
the client faces an O(n^2) computation which is bad. So if we were to require
alphabetic capabilities, you could internally keep track and the whole operation
is O(n). I just wonder if this is premature optimization or some thought we need
to think of.

To prevent this problem from popping up, it must be easier to
introduce a new phase
after the capabilities exchange than to just abuse the capabilities
phase for whatever
you plan on doing.

Thanks,
Stefan

>
>>
>> -Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 00/11] Protocol version 2, again!
  2015-06-01 17:49     ` Stefan Beller
@ 2015-06-02 10:10       ` Duy Nguyen
  2015-06-04 13:09       ` Jeff King
  1 sibling, 0 replies; 61+ messages in thread
From: Duy Nguyen @ 2015-06-02 10:10 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jeff King, git, Junio C Hamano

On Tue, Jun 2, 2015 at 12:49 AM, Stefan Beller <sbeller@google.com> wrote:
> However the client side with builtin/fetch, builtin/fetch-pack, fetch-pack
> is a bit of a mystery to me, as I cannot fully grasp the difference between
>  * connect.{h,c}
>  * remote.{h.c}
>  * transport.{h.c}
> there. All of it seems to be doing network related stuff, but I have trouble
> getting the big picture. I am assuming all of these 3 are rather a low level,
> used like a library, though there must be even more hierarchy in there,
> connect is most low level judging from the header file and used by
> the other two.
> transport.h seems to provide the most toplevel library stuff as it includes
> remote.h in its header?

I think transport.c is there to support non-native protocols (and
later on, smart http). So yeah it's basically the API for fetches and
pushes. git-log over those files may reveal their purposes, especially
the few first versions of them.

> The problem I am currently trying to tackle, is passing the options through all
> the layers early on. so in a few places we have code like
>
>     switch (version) {
>     case 2: /* first talk about capabilities, then get the heads */
>         get_remote_capabilities(data->fd[0], NULL, 0);
>         select_capabilities();
>         request_capabilities(data->fd[1]);
>         /* fall through */
>     case 1:
>         get_remote_heads(data->fd[0], NULL, 0, &refs,
>                  for_push ? REF_NORMAL : 0,
>                  &data->extra_have,
>                  &data->shallow);
>         break;
>     default:
>         die("BUG: Transport version %d not supported", version);
>         break;
>     }
>
> and the issue I am concerned about is the select_capabilities as well as
> the request_capabilities function here. The select_capabilities functionality
> is currently residing in the high level parts of the code as it both depends on
> the advertised server capabilities and on the user input (--use-frotz or config
> options), so the capability selection is done in fetchpack.c
>
> So there are 2 routes to go: Either we leave the select_capabilities in the
> upper layers (near the actual high level command, fetch, fetchpack) or we put
> it into the transport layer and just passing in a struct what the user desires.
> And when the users desire doesn't meet the server capabilities we die deep down
> in the transport layer.

I read the latest re-roll and I think the placement makes sense. You
can't put protocol specific at transport level because "pack protocol"
is just one of the supported protocols. There is smart-http (which
shares a bunch of code, but from transport perspective is a separate
protocol), and then user-defined protocols that know nothing about
this v2.
-- 
Duy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 11/11] Document protocol version 2
  2015-06-01 23:14           ` Stefan Beller
  2015-06-01 23:40             ` Stefan Beller
@ 2015-06-02 17:06             ` Junio C Hamano
  1 sibling, 0 replies; 61+ messages in thread
From: Junio C Hamano @ 2015-06-02 17:06 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jeff King, git, Duy Nguyen

Stefan Beller <sbeller@google.com> writes:

> I think the recent issue with the push certificates shows that having arbitrary
> data after the = is a bad idea.

I do not think push certificate episode tells any such thing.

It was about not carefully using cryptography with arbitrary data.
How that arbitrary data came to the machinery is irrelevant.  We
could have used base-64 to encode the server nonce when transferring
it to the machinery via capability, but then decode it in order to
place it in the cerficiate.

Do not restrict transport for such a reason and make legitimate uses
of the transport unnecessarily harder for later users.  What needs
to be done is to think how the data that was transport was used.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 00/11] Protocol version 2, again!
  2015-06-01 17:49     ` Stefan Beller
  2015-06-02 10:10       ` Duy Nguyen
@ 2015-06-04 13:09       ` Jeff King
  2015-06-04 16:44         ` Stefan Beller
  1 sibling, 1 reply; 61+ messages in thread
From: Jeff King @ 2015-06-04 13:09 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Duy Nguyen, Junio C Hamano

On Mon, Jun 01, 2015 at 10:49:45AM -0700, Stefan Beller wrote:

> However the client side with builtin/fetch, builtin/fetch-pack, fetch-pack
> is a bit of a mystery to me, as I cannot fully grasp the difference between
>  * connect.{h,c}
>  * remote.{h.c}
>  * transport.{h.c}
> there. All of it seems to be doing network related stuff, but I have trouble
> getting the big picture. I am assuming all of these 3 are rather a low level,
> used like a library, though there must be even more hierarchy in there,
> connect is most low level judging from the header file and used by
> the other two.
> transport.h seems to provide the most toplevel library stuff as it includes
> remote.h in its header?

connect.c was originally "the git protocol", and was used by send-pack
and fetch-pack. Other individual programs implemented other transports.
Later, as the interface moved towards everybody running "fetch" and
"push", and those delegating work to the individual transports, we got
transport.c, which is an abstract interface for all transports. It
delegates actual git-protocol work to the functions in connect.c (or
bundle work elsewhere, or handles remote-helpers itself).

And then remote.c contains routines for dealing with the remotes at a
logical level. E.g., which refs to fetch or push, etc.

So in theory, the flow is something like:

  - fetch.c knows "the user wants to fetch from 'foo'"

  - fetch asks remote.c: "who is remote 'foo'"; we get back a URL

  - fetch asks transport.c: "what are the refs for $URL"

  - it turns out to be a git URL. transport.c calls into connect.c to
    implement get_refs_via_connect.

  - after fetch gets back the list of refs, it uses routines in remote.c
    to figure out which refs we are interested in, handle refspecs, etc

  - now fetch asks transport.c: "OK, fetch just these refs"

  - transport.c again calls into connect.c to handle the actual fetch

Of course over the years a lot of cruft has grown around all of them. I
wouldn't be surprised if there are functions which cross these
abstractions, or other random functions inside each file that do not
belong.

> and the issue I am concerned about is the select_capabilities as well as
> the request_capabilities function here. The select_capabilities functionality
> is currently residing in the high level parts of the code as it both depends on
> the advertised server capabilities and on the user input (--use-frotz or config
> options), so the capability selection is done in fetchpack.c
> 
> So there are 2 routes to go: Either we leave the select_capabilities in the
> upper layers (near the actual high level command, fetch, fetchpack) or we put
> it into the transport layer and just passing in a struct what the user desires.
> And when the users desire doesn't meet the server capabilities we die deep down
> in the transport layer.

I think you have to leave it in the fetch-pack code. As you note, it's
the place where we know about what the user is asking for and can
manipulate the list. And not all transports even support capabilities
like this.

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 11/11] Document protocol version 2
  2015-06-01 23:40             ` Stefan Beller
@ 2015-06-04 13:18               ` Jeff King
  2015-06-04 17:01                 ` Junio C Hamano
  0 siblings, 1 reply; 61+ messages in thread
From: Jeff King @ 2015-06-04 13:18 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, git, Duy Nguyen

On Mon, Jun 01, 2015 at 04:40:54PM -0700, Stefan Beller wrote:

> Thinking about this further, maybe it is a good idea to restrict the
> capabilities advertising to alphabetical order?

This seems like an unnecessary restriction. The main impetus seems to
be:

> So how does parse_capability scale w.r.t the number of capabilities?
> If parse_capability is just a linear search then it is O(n) and with n
> capabilities the client faces an O(n^2) computation which is bad. So
> if we were to require alphabetic capabilities, you could internally
> keep track and the whole operation is O(n). I just wonder if this is
> premature optimization or some thought we need to think of.

but that is an implementation problem, not a protocol problem. The
parsing side can easily use something better than O(n) lookups (e.g., a
binary search). I think we can live with O(n lg n) to parse the whole
list. A true in-order merge would be O(n), but the difference is
probably negligible here, because...

> Now if we assume the number of capabilities grows over time a lot
> (someone may "abuse" it for a cool feature, similar to the refs
> currently. Nobody thought about having so many refs in advance)

I think if it grows without bound, we are doing it wrong. We should
generally only be adding capabilities that require a single short tag in
the list (server supports "foo", client asks for "foo"). I'd find it
acceptable to add ones that repeat, as long as we are very sure that the
repetition is going to be small, or scales with the size of the config
(e.g., servers says "here is a mirror you can seed from; here is another
mirror you can seed from").

We should definitely _not_ add anything that scales with the repository
size. For instance, the "symref" field only shows the "HEAD" now. That's
OK, as it's constant size. We do not show _all_ symrefs right now
because of pkt-line length limitations. With v2, we could in theory
start doing so (one per pkt-line). But that does not make it a good
idea. The right way to implement that is:

  1. the server says "I can tell you about symrefs"

  2. client says "OK, I understand how to parse your symref data"

  3. for each ref in the advertisement, tack on "\0symref=..." (or
     whatever).

The capability portion of the conversation remains constant-sized, and
the O(# of refs) portion is part of the ref advertisement, which is
inherently of that magnitude.

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 00/11] Protocol version 2, again!
  2015-06-04 13:09       ` Jeff King
@ 2015-06-04 16:44         ` Stefan Beller
  0 siblings, 0 replies; 61+ messages in thread
From: Stefan Beller @ 2015-06-04 16:44 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Duy Nguyen, Junio C Hamano

On Thu, Jun 4, 2015 at 6:09 AM, Jeff King <peff@peff.net> wrote:
> On Mon, Jun 01, 2015 at 10:49:45AM -0700, Stefan Beller wrote:
>
>> However the client side with builtin/fetch, builtin/fetch-pack, fetch-pack
>> is a bit of a mystery to me, as I cannot fully grasp the difference between
>>  * connect.{h,c}
>>  * remote.{h.c}
>>  * transport.{h.c}
>> there. All of it seems to be doing network related stuff, but I have trouble
>> getting the big picture. I am assuming all of these 3 are rather a low level,
>> used like a library, though there must be even more hierarchy in there,
>> connect is most low level judging from the header file and used by
>> the other two.
>> transport.h seems to provide the most toplevel library stuff as it includes
>> remote.h in its header?
>
> connect.c was originally "the git protocol", and was used by send-pack
> and fetch-pack. Other individual programs implemented other transports.
> Later, as the interface moved towards everybody running "fetch" and
> "push", and those delegating work to the individual transports, we got
> transport.c, which is an abstract interface for all transports. It
> delegates actual git-protocol work to the functions in connect.c (or
> bundle work elsewhere, or handles remote-helpers itself).
>
> And then remote.c contains routines for dealing with the remotes at a
> logical level. E.g., which refs to fetch or push, etc.
>
> So in theory, the flow is something like:
>
>   - fetch.c knows "the user wants to fetch from 'foo'"
>
>   - fetch asks remote.c: "who is remote 'foo'"; we get back a URL
>
>   - fetch asks transport.c: "what are the refs for $URL"
>
>   - it turns out to be a git URL. transport.c calls into connect.c to
>     implement get_refs_via_connect.

Currently the distinction which protocol to speak is made
here (in get_refs_via_connect), which may be a bit late. Though I updating
the git protocol only first would also be feasible.

So for the next git protocol

 - get_refs_via_connect first asks for the capabilities and gets an answer from
   upload-pack-2. Now what?

 - we could have a callback in struct transport, which must be set
accordingly by
   fetch in step 4 (it turns out to be a git URL. transport.c ...)
   This callback is called with each pkt-line such as

        void parse-capability(char *line, struct
*transport_capabilities, void *cdata);

The line would contain the pkt-line, while the transport_capabilities
would be a struct
similar as in "[RFCv2 06/16] remote.h: add new struct for options",
where the fetch
implementation must select the right bits. Looking at fetch-pack.{c,h}
we only expose
one do-it-all method there, so we currently don't have file wide
easily accessible variables,
but rather all in a struct fetch_pack_args, which carries important
information for the
selection process such as verbosity or desired options. This is why a
void* comes in
handy as well. (It will be easy later to adapt that to the sending
side as well).

Instead of a full grown line by line callback we could also just
collect all the capabilities
first in a string list and then only call back once into a

    void select_capabilities(struct string_list *available, struct
string_list *selected);

I think I'd find this second approach more handy as there are subtle
details you'd miss in
the first approach. Looking at fetch-pack.c, do_fetch_pack (line 790),
we have one
selection (no_done) conditioned on another (multi_ack_detailed), so
having the full list
there makes the code easier.

This second approach however might not be as future proof as the
first, because we store
all received capabilities (which may grow large in the future) and not
throw unknowns away
immediately.

I tend to rather implement the second one (easier to read/maintain trumps a
maybe-performance-problem-in-the-future).

This performance-problem-in-the-future could be mitigated easily by having a
preselection in transport.c get_capabilities, which ignores any capabilities not
white listed there (harder to maintain though, as we have a more than one spot
where to put a list)

By writing this mail I realized another thing. I have had the patch
    "[RFCv2 09/16] remote.h: add get_remote_capabilities, request_capabilities"
which has request_capabilities just translating from a struct
containing some bits
into a sequence of pkt-lines containing the actual protocol
capabilities. Maybe we
should not have that in the connect file, but rather as proposed in
this email, the
high level command directly selects the strings to put back on the
wire. (By having
"struct string_list *selected" as part of the select_capabilities arguments.)
then the request_capabilities in connect.c would be dumbed down to just:

    void request_capabilities(int out, struct string_list *list)
    {
        struct string_list_item *item;
        for_each_string_list_item(item, list) {
             packet_write(out, item->string);
        }
        packet_flush(out);
    }

I think that would be reasonable?

>
>   - after fetch gets back the list of refs, it uses routines in remote.c
>     to figure out which refs we are interested in, handle refspecs, etc
>
>   - now fetch asks transport.c: "OK, fetch just these refs"
>
>   - transport.c again calls into connect.c to handle the actual fetch
>
> Of course over the years a lot of cruft has grown around all of them. I
> wouldn't be surprised if there are functions which cross these
> abstractions, or other random functions inside each file that do not
> belong.
>
>> and the issue I am concerned about is the select_capabilities as well as
>> the request_capabilities function here. The select_capabilities functionality
>> is currently residing in the high level parts of the code as it both depends on
>> the advertised server capabilities and on the user input (--use-frotz or config
>> options), so the capability selection is done in fetchpack.c
>>
>> So there are 2 routes to go: Either we leave the select_capabilities in the
>> upper layers (near the actual high level command, fetch, fetchpack) or we put
>> it into the transport layer and just passing in a struct what the user desires.
>> And when the users desire doesn't meet the server capabilities we die deep down
>> in the transport layer.
>
> I think you have to leave it in the fetch-pack code. As you note, it's
> the place where we know about what the user is asking for and can
> manipulate the list. And not all transports even support capabilities
> like this.
>
> -Peff

Okay

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC/WIP PATCH 11/11] Document protocol version 2
  2015-06-04 13:18               ` Jeff King
@ 2015-06-04 17:01                 ` Junio C Hamano
  0 siblings, 0 replies; 61+ messages in thread
From: Junio C Hamano @ 2015-06-04 17:01 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, git, Duy Nguyen

Jeff King <peff@peff.net> writes:

> We should definitely _not_ add anything that scales with the repository
> size. For instance, the "symref" field only shows the "HEAD" now. That's
> OK, as it's constant size.

I agree that this is an easy-to-explain rule to keep the design
sensible.

> We do not show _all_ symrefs right now
> because of pkt-line length limitations. With v2, we could in theory
> start doing so (one per pkt-line). But that does not make it a good
> idea.

Very true.  If we want symref information conveyed, then we should
either make a new "symref advertisement" available (and it is OK to
use a single-bit capability to enable or disable that new section),
or make the "ref advertisement" natively capable of always doing so
(i.e. if there is no need to make this optional).

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, back to index

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
2015-05-26 22:01 ` [RFC/WIP PATCH 01/11] upload-pack: make client capability parsing code a separate function Stefan Beller
2015-05-26 22:01 ` [RFC/WIP PATCH 02/11] upload-pack: only accept capabilities on the first "want" line Stefan Beller
2015-05-26 22:17   ` Junio C Hamano
2015-05-26 22:20     ` Stefan Beller
2015-05-26 22:01 ` [RFC/WIP PATCH 03/11] upload-pack: move capabilities out of send_ref Stefan Beller
2015-05-26 22:01 ` [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack Stefan Beller
2015-05-27  2:30   ` Eric Sunshine
2015-05-27  6:35   ` Jeff King
2015-05-27 17:30     ` Eric Sunshine
2015-05-27 20:14       ` Jeff King
2015-05-27 17:40     ` Stefan Beller
2015-05-27 20:34       ` Jeff King
2015-05-27 20:45         ` Stefan Beller
2015-05-27 21:46           ` Jeff King
2015-05-26 22:01 ` [RFC/WIP PATCH 05/11] transport: add infrastructure to support a protocol version number Stefan Beller
2015-05-27  6:39   ` Jeff King
2015-05-27 19:01     ` Stefan Beller
2015-05-27 20:17       ` Jeff King
2015-05-27 19:10     ` Junio C Hamano
2015-05-26 22:01 ` [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities Stefan Beller
2015-05-27  3:25   ` Eric Sunshine
2015-05-27  6:50     ` Jeff King
2015-05-27 17:19       ` Eric Sunshine
2015-05-27 20:09         ` Jeff King
2015-05-27  6:45   ` Jeff King
2015-05-29 19:39     ` Stefan Beller
2015-05-29 22:08       ` Jeff King
2015-05-26 22:01 ` [RFC/WIP PATCH 07/11] fetch-pack: use the configured transport protocol Stefan Beller
2015-05-26 22:19   ` Junio C Hamano
2015-05-26 22:23     ` Stefan Beller
2015-05-27  6:53   ` Jeff King
2015-05-26 22:01 ` [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number Stefan Beller
2015-05-26 22:21   ` Junio C Hamano
2015-05-26 22:31     ` Stefan Beller
2015-05-27  5:09       ` Junio C Hamano
2015-05-27  6:56         ` Jeff King
2015-05-27  3:33   ` Eric Sunshine
2015-05-27  7:02   ` Jeff King
2015-05-26 22:01 ` [RFC/WIP PATCH 09/11] transport: get_refs_via_connect exchanges capabilities before refs Stefan Beller
2015-05-27  5:37   ` Eric Sunshine
2015-05-27  7:06   ` Jeff King
2015-05-26 22:01 ` [RFC/WIP PATCH 10/11] t5544: add a test case for the new protocol Stefan Beller
2015-05-27  5:34   ` Eric Sunshine
2015-05-27  7:12   ` Jeff King
2015-05-26 22:01 ` [RFC/WIP PATCH 11/11] Document protocol version 2 Stefan Beller
2015-05-29 20:35   ` Junio C Hamano
2015-05-29 21:36     ` Stefan Beller
2015-05-29 21:52       ` Junio C Hamano
2015-05-29 22:21         ` Jeff King
2015-06-01 23:14           ` Stefan Beller
2015-06-01 23:40             ` Stefan Beller
2015-06-04 13:18               ` Jeff King
2015-06-04 17:01                 ` Junio C Hamano
2015-06-02 17:06             ` Junio C Hamano
2015-05-27  6:18 ` [RFC/WIP PATCH 00/11] Protocol version 2, again! Jeff King
2015-05-27  7:08   ` Jeff King
2015-06-01 17:49     ` Stefan Beller
2015-06-02 10:10       ` Duy Nguyen
2015-06-04 13:09       ` Jeff King
2015-06-04 16:44         ` Stefan Beller

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Example config snippet for mirrors

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git