git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
* [PATCH 0/8] ref-in-want
@ 2018-06-05 17:51 Brandon Williams
  2018-06-05 17:51 ` [PATCH 1/8] test-pkt-line: add unpack-sideband subcommand Brandon Williams
                   ` (9 more replies)
  0 siblings, 10 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-05 17:51 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

This series adds the ref-in-want feature which was originally proposed
by Jonathan Tan
(https://public-inbox.org/git/cover.1485381677.git.jonathantanmy@google.com/).

Back when ref-in-want was first discussed it was decided that we should
first solve the issue of moving to a new wire format and find a way to
limit the ref-advertisement before moving forward with ref-in-want.  Now
that protocol version 2 is a reality, and that refs can be filtered on
the server side, we can revisit ref-in-want.

This version of ref-in-want is a bit more restrictive than what Jonathan
originally proposed (only full ref names are allowed instead of globs
and OIDs), but it is meant to accomplish the same goal (solve the issues
of refs changing during negotiation).

Brandon Williams (8):
  test-pkt-line: add unpack-sideband subcommand
  upload-pack: implement ref-in-want
  upload-pack: test negotiation with changing repository
  fetch: refactor the population of peer ref OIDs
  fetch: refactor fetch_refs into two functions
  fetch: refactor to make function args narrower
  fetch-pack: put shallow info in output parameter
  fetch-pack: implement ref-in-want

 Documentation/config.txt                |   4 +
 Documentation/technical/protocol-v2.txt |  28 ++-
 builtin/clone.c                         |   4 +-
 builtin/fetch.c                         | 126 +++++++-----
 fetch-object.c                          |   2 +-
 fetch-pack.c                            |  52 +++--
 remote.c                                |   1 +
 remote.h                                |   1 +
 t/helper/test-pkt-line.c                |  37 ++++
 t/lib-httpd.sh                          |   1 +
 t/lib-httpd/apache.conf                 |   8 +
 t/lib-httpd/one-time-sed.sh             |  16 ++
 t/t5703-upload-pack-ref-in-want.sh      | 245 ++++++++++++++++++++++++
 transport-helper.c                      |   6 +-
 transport-internal.h                    |   9 +-
 transport.c                             |  34 +++-
 transport.h                             |   3 +-
 upload-pack.c                           |  64 +++++++
 18 files changed, 564 insertions(+), 77 deletions(-)
 create mode 100644 t/lib-httpd/one-time-sed.sh
 create mode 100755 t/t5703-upload-pack-ref-in-want.sh

-- 
2.17.1.1185.g55be947832-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 1/8] test-pkt-line: add unpack-sideband subcommand
  2018-06-05 17:51 [PATCH 0/8] ref-in-want Brandon Williams
@ 2018-06-05 17:51 ` Brandon Williams
  2018-06-05 17:51 ` [PATCH 2/8] upload-pack: implement ref-in-want Brandon Williams
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-05 17:51 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

Add an 'unpack-sideband' subcommand to the test-pkt-line helper to
enable unpacking packet line data sent multiplexed using a sideband.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/helper/test-pkt-line.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c
index 0f19e53c7..2a55ffff1 100644
--- a/t/helper/test-pkt-line.c
+++ b/t/helper/test-pkt-line.c
@@ -1,3 +1,4 @@
+#include "cache.h"
 #include "pkt-line.h"
 
 static void pack_line(const char *line)
@@ -48,6 +49,40 @@ static void unpack(void)
 	}
 }
 
+static void unpack_sideband(void)
+{
+	struct packet_reader reader;
+	packet_reader_init(&reader, 0, NULL, 0,
+			   PACKET_READ_GENTLE_ON_EOF |
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (packet_reader_read(&reader) != PACKET_READ_EOF) {
+		int band;
+		int fd;
+
+		switch (reader.status) {
+		case PACKET_READ_EOF:
+			break;
+		case PACKET_READ_NORMAL:
+			band = reader.line[0] & 0xff;
+			if (band == 1)
+				fd = 1;
+			else
+				fd = 2;
+
+			write_or_die(fd, reader.line+1, reader.pktlen-1);
+
+			if (band == 3)
+				die("sind-band error");
+			break;
+		case PACKET_READ_FLUSH:
+			return;
+		case PACKET_READ_DELIM:
+			break;
+		}
+	}
+}
+
 int cmd_main(int argc, const char **argv)
 {
 	if (argc < 2)
@@ -57,6 +92,8 @@ int cmd_main(int argc, const char **argv)
 		pack(argc - 2, argv + 2);
 	else if (!strcmp(argv[1], "unpack"))
 		unpack();
+	else if (!strcmp(argv[1], "unpack-sideband"))
+		unpack_sideband();
 	else
 		die("invalid argument '%s'", argv[1]);
 
-- 
2.17.1.1185.g55be947832-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 2/8] upload-pack: implement ref-in-want
  2018-06-05 17:51 [PATCH 0/8] ref-in-want Brandon Williams
  2018-06-05 17:51 ` [PATCH 1/8] test-pkt-line: add unpack-sideband subcommand Brandon Williams
@ 2018-06-05 17:51 ` Brandon Williams
  2018-06-05 19:11   ` Ramsay Jones
  2018-06-05 20:32   ` Ævar Arnfjörð Bjarmason
  2018-06-05 17:51 ` [PATCH 3/8] upload-pack: test negotiation with changing repository Brandon Williams
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-05 17:51 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

Currently, while performing packfile negotiation, clients are only
allowed to specify their desired objects using object ids.  This causes
a vulnerability to failure when an object turns non-existent during
negotiation, which may happen if, for example, the desired repository is
provided by multiple Git servers in a load-balancing arrangement.

In order to eliminate this vulnerability, implement the ref-in-want
feature for the 'fetch' command in protocol version 2.  This feature
enables the 'fetch' command to support requests in the form of ref names
through a new "want-ref <ref>" parameter.  At the conclusion of
negotiation, the server will send a list of all of the wanted references
(as provided by "want-ref" lines) in addition to the generated packfile.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/config.txt                |   4 +
 Documentation/technical/protocol-v2.txt |  28 ++++-
 t/t5703-upload-pack-ref-in-want.sh      | 153 ++++++++++++++++++++++++
 upload-pack.c                           |  64 ++++++++++
 4 files changed, 248 insertions(+), 1 deletion(-)
 create mode 100755 t/t5703-upload-pack-ref-in-want.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ab641bf5a..acafe6c8d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3479,6 +3479,10 @@ Note that this configuration variable is ignored if it is seen in the
 repository-level config (this is a safety measure against fetching from
 untrusted repositories).
 
+uploadpack.allowRefInWant::
+	If this option is set, `upload-pack` will support the `ref-in-want`
+	feature of the protocol version 2 `fetch` command.
+
 url.<base>.insteadOf::
 	Any URL that starts with this value will be rewritten to
 	start, instead, with <base>. In cases where some site serves a
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 49bda76d2..8367e09b8 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -299,12 +299,21 @@ included in the client's request:
 	for use with partial clone and partial fetch operations. See
 	`rev-list` for possible "filter-spec" values.
 
+If the 'ref-in-want' feature is advertised, the following argument can
+be included in the client's request as well as the potential addition of
+the 'wanted-refs' section in the server's response as explained below.
+
+    want-ref <ref>
+	Indicates to the server than the client wants to retrieve a
+	particular ref, where <ref> is the full name of a ref on the
+	server.
+
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
 header.
 
     output = *section
-    section = (acknowledgments | shallow-info | packfile)
+    section = (acknowledgments | shallow-info | wanted-refs | packfile)
 	      (flush-pkt | delim-pkt)
 
     acknowledgments = PKT-LINE("acknowledgments" LF)
@@ -319,6 +328,10 @@ header.
     shallow = "shallow" SP obj-id
     unshallow = "unshallow" SP obj-id
 
+    wanted-refs = PKT-LINE("wanted-refs" LF)
+		  *PKT-Line(wanted-ref LF)
+    wanted-ref = obj-id SP refname
+
     packfile = PKT-LINE("packfile" LF)
 	       *PKT-LINE(%x01-03 *%x00-ff)
 
@@ -379,6 +392,19 @@ header.
 	* This section is only included if a packfile section is also
 	  included in the response.
 
+    wanted-refs section
+	* This section is only included if the client has requested a
+	  ref using a 'want-ref' line and if a packfile section is also
+	  included in the response.
+
+	* Always begins with the section header "wanted-refs"
+
+	* The server will send a ref listing ("<oid> <refname>") for
+	  each reference requested using 'want-ref' lines.
+
+	* Ther server MUST NOT send any refs which were not requested
+	  using 'want-ref' lines.
+
     packfile section
 	* This section is only included if the client has sent 'want'
 	  lines in its request and either requested that no more
diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh
new file mode 100755
index 000000000..0ef182970
--- /dev/null
+++ b/t/t5703-upload-pack-ref-in-want.sh
@@ -0,0 +1,153 @@
+#!/bin/sh
+
+test_description='upload-pack ref-in-want'
+
+. ./test-lib.sh
+
+get_actual_refs() {
+	sed -n '/wanted-refs/,/0001/p' <out | sed '1d;$d' | test-pkt-line unpack >actual_refs
+}
+
+get_actual_commits() {
+	sed -n '/packfile/,/0000/p' <out | sed '1d' | test-pkt-line unpack-sideband >o.pack &&
+	git index-pack o.pack &&
+	git verify-pack -v o.idx | grep commit | cut -c-40 | sort >actual_commits
+}
+
+check_output() {
+	get_actual_refs &&
+	test_cmp expected_refs actual_refs &&
+	get_actual_commits &&
+	test_cmp expected_commits actual_commits
+}
+
+# c(o/foo) d(o/bar)
+#        \ /
+#         b   e(baz)  f(master)
+#          \__  |  __/
+#             \ | /
+#               a
+test_expect_success 'setup repository' '
+	test_commit a &&
+	git checkout -b o/foo &&
+	test_commit b &&
+	test_commit c &&
+	git checkout -b o/bar b &&
+	test_commit d &&
+	git checkout -b baz a &&
+	test_commit e &&
+	git checkout master &&
+	test_commit f
+'
+
+test_expect_success 'config controls ref-in-want advertisement' '
+	git serve --advertise-capabilities >out &&
+	! grep -a ref-in-want out &&
+
+	git config uploadpack.allowRefInWant false &&
+	git serve --advertise-capabilities >out &&
+	! grep -a ref-in-want out &&
+
+	git config uploadpack.allowRefInWant true &&
+	git serve --advertise-capabilities >out &&
+	grep -a ref-in-want out
+'
+
+test_expect_success 'invalid want-ref line' '
+	test-pkt-line pack >in <<-EOF &&
+	command=fetch
+	0001
+	no-progress
+	want-ref refs/heads/non-existent
+	done
+	0000
+	EOF
+
+	test_must_fail git serve --stateless-rpc 2>out <in &&
+	grep "unknown ref" out
+'
+
+test_expect_success 'basic want-ref' '
+	cat >expected_refs <<-EOF &&
+	$(git rev-parse f) refs/heads/master
+	EOF
+	git rev-parse f | sort >expected_commits &&
+
+	test-pkt-line pack >in <<-EOF &&
+	command=fetch
+	0001
+	no-progress
+	want-ref refs/heads/master
+	have $(git rev-parse a)
+	done
+	0000
+	EOF
+
+	git serve --stateless-rpc >out <in &&
+	check_output
+'
+
+test_expect_success 'multiple want-ref lines' '
+	cat >expected_refs <<-EOF &&
+	$(git rev-parse c) refs/heads/o/foo
+	$(git rev-parse d) refs/heads/o/bar
+	EOF
+	git rev-parse c d | sort >expected_commits &&
+
+	test-pkt-line pack >in <<-EOF &&
+	command=fetch
+	0001
+	no-progress
+	want-ref refs/heads/o/foo
+	want-ref refs/heads/o/bar
+	have $(git rev-parse b)
+	done
+	0000
+	EOF
+
+	git serve --stateless-rpc >out <in &&
+	check_output
+'
+
+test_expect_success 'mix want and want-ref' '
+	cat >expected_refs <<-EOF &&
+	$(git rev-parse f) refs/heads/master
+	EOF
+	git rev-parse e f | sort >expected_commits &&
+
+	test-pkt-line pack >in <<-EOF &&
+	command=fetch
+	0001
+	no-progress
+	want-ref refs/heads/master
+	want $(git rev-parse e)
+	have $(git rev-parse a)
+	done
+	0000
+	EOF
+
+	git serve --stateless-rpc >out <in &&
+	check_output
+'
+
+test_expect_success 'want-ref with ref we already have commit for' '
+	cat >expected_refs <<-EOF &&
+	$(git rev-parse c) refs/heads/o/foo
+	EOF
+	>expected_commits &&
+
+	test-pkt-line pack >in <<-EOF &&
+	command=fetch
+	0001
+	no-progress
+	want-ref refs/heads/o/foo
+	have $(git rev-parse c)
+	done
+	0000
+	EOF
+
+	git serve --stateless-rpc >out <in &&
+	check_output
+'
+
+test_done
diff --git a/upload-pack.c b/upload-pack.c
index 87c6722ea..47858d367 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -64,6 +64,7 @@ static const char *pack_objects_hook;
 
 static int filter_capability_requested;
 static int allow_filter;
+static int allow_ref_in_want;
 static struct list_objects_filter_options filter_options;
 
 static void reset_timeout(void)
@@ -1075,6 +1076,8 @@ static int upload_pack_config(const char *var, const char *value, void *unused)
 			return git_config_string(&pack_objects_hook, var, value);
 	} else if (!strcmp("uploadpack.allowfilter", var)) {
 		allow_filter = git_config_bool(var, value);
+	} else if (!strcmp("uploadpack.allowrefinwant", var)) {
+		allow_ref_in_want = git_config_bool(var, value);
 	}
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
@@ -1114,6 +1117,7 @@ void upload_pack(struct upload_pack_options *options)
 
 struct upload_pack_data {
 	struct object_array wants;
+	struct string_list wanted_refs;
 	struct oid_array haves;
 
 	struct object_array shallows;
@@ -1135,12 +1139,14 @@ struct upload_pack_data {
 static void upload_pack_data_init(struct upload_pack_data *data)
 {
 	struct object_array wants = OBJECT_ARRAY_INIT;
+	struct string_list wanted_refs = STRING_LIST_INIT_DUP;
 	struct oid_array haves = OID_ARRAY_INIT;
 	struct object_array shallows = OBJECT_ARRAY_INIT;
 	struct string_list deepen_not = STRING_LIST_INIT_DUP;
 
 	memset(data, 0, sizeof(*data));
 	data->wants = wants;
+	data->wanted_refs = wanted_refs;
 	data->haves = haves;
 	data->shallows = shallows;
 	data->deepen_not = deepen_not;
@@ -1149,6 +1155,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 static void upload_pack_data_clear(struct upload_pack_data *data)
 {
 	object_array_clear(&data->wants);
+	string_list_clear(&data->wanted_refs, 1);
 	oid_array_clear(&data->haves);
 	object_array_clear(&data->shallows);
 	string_list_clear(&data->deepen_not, 0);
@@ -1185,6 +1192,32 @@ static int parse_want(const char *line)
 	return 0;
 }
 
+static int parse_want_ref(const char *line, struct string_list *wanted_refs)
+{
+	const char *arg;
+	if (skip_prefix(line, "want-ref ", &arg)) {
+		struct object_id oid;
+		struct string_list_item *item;
+		struct object *o;
+
+		if (read_ref(arg, &oid))
+			die("unknown ref %s", arg);
+
+		item = string_list_append(wanted_refs, arg);
+		item->util = oiddup(&oid);
+
+		o = parse_object_or_die(&oid, arg);
+		if (!(o->flags & WANTED)) {
+			o->flags |= WANTED;
+			add_object_array(o, NULL, &want_obj);
+		}
+
+		return 1;
+	}
+
+	return 0;
+}
+
 static int parse_have(const char *line, struct oid_array *haves)
 {
 	const char *arg;
@@ -1210,6 +1243,8 @@ static void process_args(struct packet_reader *request,
 		/* process want */
 		if (parse_want(arg))
 			continue;
+		if (allow_ref_in_want && parse_want_ref(arg, &data->wanted_refs))
+			continue;
 		/* process have line */
 		if (parse_have(arg, &data->haves))
 			continue;
@@ -1352,6 +1387,24 @@ static int process_haves_and_send_acks(struct upload_pack_data *data)
 	return ret;
 }
 
+static void send_wanted_ref_info(struct upload_pack_data *data)
+{
+	const struct string_list_item *item;
+
+	if (!data->wanted_refs.nr)
+		return;
+
+	packet_write_fmt(1, "wanted-refs\n");
+
+	for_each_string_list_item(item, &data->wanted_refs) {
+		packet_write_fmt(1, "%s %s\n",
+				 oid_to_hex(item->util),
+				 item->string);
+	}
+
+	packet_delim(1);
+}
+
 static void send_shallow_info(struct upload_pack_data *data)
 {
 	/* No shallow info needs to be sent */
@@ -1418,6 +1471,7 @@ int upload_pack_v2(struct repository *r, struct argv_array *keys,
 				state = FETCH_DONE;
 			break;
 		case FETCH_SEND_PACK:
+			send_wanted_ref_info(&data);
 			send_shallow_info(&data);
 
 			packet_write_fmt(1, "packfile\n");
@@ -1438,12 +1492,22 @@ int upload_pack_advertise(struct repository *r,
 {
 	if (value) {
 		int allow_filter_value;
+		int allow_ref_in_want;
+
 		strbuf_addstr(value, "shallow");
+
 		if (!repo_config_get_bool(the_repository,
 					 "uploadpack.allowfilter",
 					 &allow_filter_value) &&
 		    allow_filter_value)
 			strbuf_addstr(value, " filter");
+
+		if (!repo_config_get_bool(the_repository,
+					 "uploadpack.allowrefinwant",
+					 &allow_ref_in_want) &&
+		    allow_ref_in_want)
+			strbuf_addstr(value, " ref-in-want");
 	}
+
 	return 1;
 }
-- 
2.17.1.1185.g55be947832-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 3/8] upload-pack: test negotiation with changing repository
  2018-06-05 17:51 [PATCH 0/8] ref-in-want Brandon Williams
  2018-06-05 17:51 ` [PATCH 1/8] test-pkt-line: add unpack-sideband subcommand Brandon Williams
  2018-06-05 17:51 ` [PATCH 2/8] upload-pack: implement ref-in-want Brandon Williams
@ 2018-06-05 17:51 ` Brandon Williams
  2018-06-05 17:51 ` [PATCH 4/8] fetch: refactor the population of peer ref OIDs Brandon Williams
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-05 17:51 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

Add tests to check the behavior of fetching from a repository which
changes between rounds of negotiation (for example, when different
servers in a load-balancing agreement participate in the same stateless
RPC negotiation). This forms a baseline of comparison to the ref-in-want
functionality (which will be introduced to the client in subsequent
commits), and ensures that subsequent commits do not change existing
behavior.

As part of this effort, a mechanism to substitute strings in a single
HTTP response is added.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/lib-httpd.sh                     |  1 +
 t/lib-httpd/apache.conf            |  8 +++
 t/lib-httpd/one-time-sed.sh        | 16 ++++++
 t/t5703-upload-pack-ref-in-want.sh | 92 ++++++++++++++++++++++++++++++
 4 files changed, 117 insertions(+)
 create mode 100644 t/lib-httpd/one-time-sed.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 435a37465..84f8efdd4 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -132,6 +132,7 @@ prepare_httpd() {
 	cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH"
 	install_script broken-smart-http.sh
 	install_script error.sh
+	install_script one-time-sed.sh
 
 	ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules"
 
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 724d9ae46..fe68d37bb 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -111,9 +111,14 @@ Alias /auth/dumb/ www/auth/dumb/
 	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
 	SetEnv GIT_HTTP_EXPORT_ALL
 </LocationMatch>
+<LocationMatch /one_time_sed/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+	SetEnv GIT_HTTP_EXPORT_ALL
+</LocationMatch>
 ScriptAliasMatch /smart_*[^/]*/(.*) ${GIT_EXEC_PATH}/git-http-backend/$1
 ScriptAlias /broken_smart/ broken-smart-http.sh/
 ScriptAlias /error/ error.sh/
+ScriptAliasMatch /one_time_sed/(.*) one-time-sed.sh/$1
 <Directory ${GIT_EXEC_PATH}>
 	Options FollowSymlinks
 </Directory>
@@ -123,6 +128,9 @@ ScriptAlias /error/ error.sh/
 <Files error.sh>
   Options ExecCGI
 </Files>
+<Files one-time-sed.sh>
+	Options ExecCGI
+</Files>
 <Files ${GIT_EXEC_PATH}/git-http-backend>
 	Options ExecCGI
 </Files>
diff --git a/t/lib-httpd/one-time-sed.sh b/t/lib-httpd/one-time-sed.sh
new file mode 100644
index 000000000..a9c4aa5f4
--- /dev/null
+++ b/t/lib-httpd/one-time-sed.sh
@@ -0,0 +1,16 @@
+#!/bin/sh
+
+if [ -e one-time-sed ]; then
+	"$GIT_EXEC_PATH/git-http-backend" >out
+
+	sed "$(cat one-time-sed)" <out >out_modified
+
+	if diff out out_modified >/dev/null; then
+		cat out
+	else
+		cat out_modified
+		rm one-time-sed
+	fi
+else
+	"$GIT_EXEC_PATH/git-http-backend"
+fi
diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh
index 0ef182970..979ab6d03 100755
--- a/t/t5703-upload-pack-ref-in-want.sh
+++ b/t/t5703-upload-pack-ref-in-want.sh
@@ -150,4 +150,96 @@ test_expect_success 'want-ref with ref we already have commit for' '
 	check_output
 '
 
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+REPO="$HTTPD_DOCUMENT_ROOT_PATH/repo"
+LOCAL_PRISTINE="$(pwd)/local_pristine"
+
+test_expect_success 'setup repos for change-while-negotiating test' '
+	(
+		git init "$REPO" &&
+		cd "$REPO" &&
+		>.git/git-daemon-export-ok &&
+		test_commit m1 &&
+		git tag -d m1 &&
+
+		# Local repo with many commits (so that negotiation will take
+		# more than 1 request/response pair)
+		git clone "http://127.0.0.1:$LIB_HTTPD_PORT/smart/repo" "$LOCAL_PRISTINE" &&
+		cd "$LOCAL_PRISTINE" &&
+		git checkout -b side &&
+		for i in $(seq 1 33); do test_commit s$i; done &&
+
+		# Add novel commits to upstream
+		git checkout master &&
+		cd "$REPO" &&
+		test_commit m2 &&
+		test_commit m3 &&
+		git tag -d m2 m3
+	) &&
+	git -C "$LOCAL_PRISTINE" remote set-url origin "http://127.0.0.1:$LIB_HTTPD_PORT/one_time_sed/repo" &&
+	git -C "$LOCAL_PRISTINE" config protocol.version 2
+'
+
+inconsistency() {
+	# Simulate that the server initially reports $2 as the ref
+	# corresponding to $1, and after that, $1 as the ref corresponding to
+	# $1. This corresponds to the real-life situation where the server's
+	# repository appears to change during negotiation, for example, when
+	# different servers in a load-balancing arrangement serve (stateless)
+	# RPCs during a single negotiation.
+	printf "s/%s/%s/" \
+	       $(git -C "$REPO" rev-parse $1 | tr -d "\n") \
+	       $(git -C "$REPO" rev-parse $2 | tr -d "\n") \
+	       >"$HTTPD_ROOT_PATH/one-time-sed"
+}
+
+test_expect_success 'server is initially ahead - no ref in want' '
+	git -C "$REPO" config uploadpack.allowRefInWant false &&
+	rm -rf local &&
+	cp -r "$LOCAL_PRISTINE" local &&
+	inconsistency master 1234567890123456789012345678901234567890 &&
+	test_must_fail git -C local fetch 2>err &&
+	grep "ERR upload-pack: not our ref" err
+'
+
+test_expect_failure 'server is initially ahead - ref in want' '
+	git -C "$REPO" config uploadpack.allowRefInWant true &&
+	rm -rf local &&
+	cp -r "$LOCAL_PRISTINE" local &&
+	inconsistency master 1234567890123456789012345678901234567890 &&
+	git -C local fetch &&
+
+	git -C "$REPO" rev-parse --verify master >expected &&
+	git -C local rev-parse --verify refs/remotes/origin/master >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'server is initially behind - no ref in want' '
+	git -C "$REPO" config uploadpack.allowRefInWant false &&
+	rm -rf local &&
+	cp -r "$LOCAL_PRISTINE" local &&
+	inconsistency master "master^" &&
+	git -C local fetch &&
+
+	git -C "$REPO" rev-parse --verify "master^" >expected &&
+	git -C local rev-parse --verify refs/remotes/origin/master >actual &&
+	test_cmp expected actual
+'
+
+test_expect_failure 'server is initially behind - ref in want' '
+	git -C "$REPO" config uploadpack.allowRefInWant true &&
+	rm -rf local &&
+	cp -r "$LOCAL_PRISTINE" local &&
+	inconsistency master "master^" &&
+	git -C local fetch &&
+
+	git -C "$REPO" rev-parse --verify "master" >expected &&
+	git -C local rev-parse --verify refs/remotes/origin/master >actual &&
+	test_cmp expected actual
+'
+
+stop_httpd
+
 test_done
-- 
2.17.1.1185.g55be947832-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 4/8] fetch: refactor the population of peer ref OIDs
  2018-06-05 17:51 [PATCH 0/8] ref-in-want Brandon Williams
                   ` (2 preceding siblings ...)
  2018-06-05 17:51 ` [PATCH 3/8] upload-pack: test negotiation with changing repository Brandon Williams
@ 2018-06-05 17:51 ` Brandon Williams
  2018-06-05 17:51 ` [PATCH 5/8] fetch: refactor fetch_refs into two functions Brandon Williams
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-05 17:51 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

Populate peer ref OIDs in get_ref_map instead of do_fetch. Besides
tightening scopes of variables in the code, this also prepares for
get_ref_map being able to be called multiple times within do_fetch.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ea5b9669a..545635448 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -351,6 +351,7 @@ static struct ref *get_ref_map(struct transport *transport,
 	/* opportunistically-updated references: */
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
+	struct string_list existing_refs = STRING_LIST_INIT_DUP;
 	const struct ref *remote_refs;
 
 	if (rs->nr)
@@ -458,7 +459,23 @@ static struct ref *get_ref_map(struct transport *transport,
 		tail = &rm->next;
 	}
 
-	return ref_remove_duplicates(ref_map);
+	ref_map = ref_remove_duplicates(ref_map);
+
+	for_each_ref(add_existing, &existing_refs);
+	for (rm = ref_map; rm; rm = rm->next) {
+		if (rm->peer_ref) {
+			struct string_list_item *peer_item =
+				string_list_lookup(&existing_refs,
+						   rm->peer_ref->name);
+			if (peer_item) {
+				struct object_id *old_oid = peer_item->util;
+				oidcpy(&rm->peer_ref->old_oid, old_oid);
+			}
+		}
+	}
+	string_list_clear(&existing_refs, 1);
+
+	return ref_map;
 }
 
 #define STORE_REF_ERROR_OTHER 1
@@ -1110,14 +1127,10 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map)
 static int do_fetch(struct transport *transport,
 		    struct refspec *rs)
 {
-	struct string_list existing_refs = STRING_LIST_INIT_DUP;
 	struct ref *ref_map;
-	struct ref *rm;
 	int autotags = (transport->remote->fetch_tags == 1);
 	int retcode = 0;
 
-	for_each_ref(add_existing, &existing_refs);
-
 	if (tags == TAGS_DEFAULT) {
 		if (transport->remote->fetch_tags == 2)
 			tags = TAGS_SET;
@@ -1136,18 +1149,6 @@ static int do_fetch(struct transport *transport,
 	if (!update_head_ok)
 		check_not_current_branch(ref_map);
 
-	for (rm = ref_map; rm; rm = rm->next) {
-		if (rm->peer_ref) {
-			struct string_list_item *peer_item =
-				string_list_lookup(&existing_refs,
-						   rm->peer_ref->name);
-			if (peer_item) {
-				struct object_id *old_oid = peer_item->util;
-				oidcpy(&rm->peer_ref->old_oid, old_oid);
-			}
-		}
-	}
-
 	if (tags == TAGS_DEFAULT && autotags)
 		transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1");
 	if (prune) {
@@ -1183,7 +1184,6 @@ static int do_fetch(struct transport *transport,
 	}
 
  cleanup:
-	string_list_clear(&existing_refs, 1);
 	return retcode;
 }
 
-- 
2.17.1.1185.g55be947832-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 5/8] fetch: refactor fetch_refs into two functions
  2018-06-05 17:51 [PATCH 0/8] ref-in-want Brandon Williams
                   ` (3 preceding siblings ...)
  2018-06-05 17:51 ` [PATCH 4/8] fetch: refactor the population of peer ref OIDs Brandon Williams
@ 2018-06-05 17:51 ` Brandon Williams
  2018-06-05 17:51 ` [PATCH 6/8] fetch: refactor to make function args narrower Brandon Williams
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-05 17:51 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

Refactor the fetch_refs function into a function that does the fetching
of refs and another function that stores them.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 545635448..ee8b87c78 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -967,10 +967,16 @@ static int fetch_refs(struct transport *transport, struct ref *ref_map)
 	int ret = quickfetch(ref_map);
 	if (ret)
 		ret = transport_fetch_refs(transport, ref_map);
-	if (!ret)
-		ret |= store_updated_refs(transport->url,
-				transport->remote->name,
-				ref_map);
+	if (ret)
+		transport_unlock_pack(transport);
+	return ret;
+}
+
+static int consume_refs(struct transport *transport, struct ref *ref_map)
+{
+	int ret = store_updated_refs(transport->url,
+				     transport->remote->name,
+				     ref_map);
 	transport_unlock_pack(transport);
 	return ret;
 }
@@ -1116,7 +1122,8 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map)
 	transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL);
 	transport_set_option(transport, TRANS_OPT_DEPTH, "0");
 	transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL);
-	fetch_refs(transport, ref_map);
+	if (!fetch_refs(transport, ref_map))
+		consume_refs(transport, ref_map);
 
 	if (gsecondary) {
 		transport_disconnect(gsecondary);
@@ -1165,7 +1172,7 @@ static int do_fetch(struct transport *transport,
 				   transport->url);
 		}
 	}
-	if (fetch_refs(transport, ref_map)) {
+	if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) {
 		free_refs(ref_map);
 		retcode = 1;
 		goto cleanup;
-- 
2.17.1.1185.g55be947832-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 6/8] fetch: refactor to make function args narrower
  2018-06-05 17:51 [PATCH 0/8] ref-in-want Brandon Williams
                   ` (4 preceding siblings ...)
  2018-06-05 17:51 ` [PATCH 5/8] fetch: refactor fetch_refs into two functions Brandon Williams
@ 2018-06-05 17:51 ` Brandon Williams
  2018-06-05 17:51 ` [PATCH 7/8] fetch-pack: put shallow info in output parameter Brandon Williams
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-05 17:51 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

Refactor find_non_local_tags and get_ref_map to only take the
information they need instead of the entire transport struct. Besides
improving code clarity, this also improves their flexibility, allowing
for a different set of refs to be used instead of relying on the ones
stored in the transport struct.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c | 52 ++++++++++++++++++++++++-------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ee8b87c78..b600e1f10 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -254,9 +254,9 @@ static int will_fetch(struct ref **head, const unsigned char *sha1)
 	return 0;
 }
 
-static void find_non_local_tags(struct transport *transport,
-			struct ref **head,
-			struct ref ***tail)
+static void find_non_local_tags(const struct ref *refs,
+				struct ref **head,
+				struct ref ***tail)
 {
 	struct string_list existing_refs = STRING_LIST_INIT_DUP;
 	struct string_list remote_refs = STRING_LIST_INIT_NODUP;
@@ -264,7 +264,7 @@ static void find_non_local_tags(struct transport *transport,
 	struct string_list_item *item = NULL;
 
 	for_each_ref(add_existing, &existing_refs);
-	for (ref = transport_get_remote_refs(transport, NULL); ref; ref = ref->next) {
+	for (ref = refs; ref; ref = ref->next) {
 		if (!starts_with(ref->name, "refs/tags/"))
 			continue;
 
@@ -338,7 +338,8 @@ static void find_non_local_tags(struct transport *transport,
 	string_list_clear(&remote_refs, 0);
 }
 
-static struct ref *get_ref_map(struct transport *transport,
+static struct ref *get_ref_map(struct remote *remote,
+			       const struct ref *remote_refs,
 			       struct refspec *rs,
 			       int tags, int *autotags)
 {
@@ -346,27 +347,11 @@ static struct ref *get_ref_map(struct transport *transport,
 	struct ref *rm;
 	struct ref *ref_map = NULL;
 	struct ref **tail = &ref_map;
-	struct argv_array ref_prefixes = ARGV_ARRAY_INIT;
 
 	/* opportunistically-updated references: */
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
 	struct string_list existing_refs = STRING_LIST_INIT_DUP;
-	const struct ref *remote_refs;
-
-	if (rs->nr)
-		refspec_ref_prefixes(rs, &ref_prefixes);
-	else if (transport->remote && transport->remote->fetch.nr)
-		refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
-
-	if (ref_prefixes.argc &&
-	    (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) {
-		argv_array_push(&ref_prefixes, "refs/tags/");
-	}
-
-	remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
-
-	argv_array_clear(&ref_prefixes);
 
 	if (rs->nr) {
 		struct refspec *fetch_refspec;
@@ -403,7 +388,7 @@ static struct ref *get_ref_map(struct transport *transport,
 		if (refmap.nr)
 			fetch_refspec = &refmap;
 		else
-			fetch_refspec = &transport->remote->fetch;
+			fetch_refspec = &remote->fetch;
 
 		for (i = 0; i < fetch_refspec->nr; i++)
 			get_fetch_map(ref_map, &fetch_refspec->items[i], &oref_tail, 1);
@@ -411,7 +396,6 @@ static struct ref *get_ref_map(struct transport *transport,
 		die("--refmap option is only meaningful with command-line refspec(s).");
 	} else {
 		/* Use the defaults */
-		struct remote *remote = transport->remote;
 		struct branch *branch = branch_get(NULL);
 		int has_merge = branch_has_merge_config(branch);
 		if (remote &&
@@ -450,7 +434,7 @@ static struct ref *get_ref_map(struct transport *transport,
 		/* also fetch all tags */
 		get_fetch_map(remote_refs, tag_refspec, &tail, 0);
 	else if (tags == TAGS_DEFAULT && *autotags)
-		find_non_local_tags(transport, &ref_map, &tail);
+		find_non_local_tags(remote_refs, &ref_map, &tail);
 
 	/* Now append any refs to be updated opportunistically: */
 	*tail = orefs;
@@ -1137,6 +1121,8 @@ static int do_fetch(struct transport *transport,
 	struct ref *ref_map;
 	int autotags = (transport->remote->fetch_tags == 1);
 	int retcode = 0;
+	const struct ref *remote_refs;
+	struct argv_array ref_prefixes = ARGV_ARRAY_INIT;
 
 	if (tags == TAGS_DEFAULT) {
 		if (transport->remote->fetch_tags == 2)
@@ -1152,7 +1138,21 @@ static int do_fetch(struct transport *transport,
 			goto cleanup;
 	}
 
-	ref_map = get_ref_map(transport, rs, tags, &autotags);
+	if (rs->nr)
+		refspec_ref_prefixes(rs, &ref_prefixes);
+	else if (transport->remote && transport->remote->fetch.nr)
+		refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
+
+	if (ref_prefixes.argc &&
+	    (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) {
+		argv_array_push(&ref_prefixes, "refs/tags/");
+	}
+
+	remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
+	argv_array_clear(&ref_prefixes);
+
+	ref_map = get_ref_map(transport->remote, remote_refs, rs,
+			      tags, &autotags);
 	if (!update_head_ok)
 		check_not_current_branch(ref_map);
 
@@ -1184,7 +1184,7 @@ static int do_fetch(struct transport *transport,
 	if (tags == TAGS_DEFAULT && autotags) {
 		struct ref **tail = &ref_map;
 		ref_map = NULL;
-		find_non_local_tags(transport, &ref_map, &tail);
+		find_non_local_tags(remote_refs, &ref_map, &tail);
 		if (ref_map)
 			backfill_tags(transport, ref_map);
 		free_refs(ref_map);
-- 
2.17.1.1185.g55be947832-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 7/8] fetch-pack: put shallow info in output parameter
  2018-06-05 17:51 [PATCH 0/8] ref-in-want Brandon Williams
                   ` (5 preceding siblings ...)
  2018-06-05 17:51 ` [PATCH 6/8] fetch: refactor to make function args narrower Brandon Williams
@ 2018-06-05 17:51 ` Brandon Williams
  2018-06-05 17:51 ` [PATCH 8/8] fetch-pack: implement ref-in-want Brandon Williams
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-05 17:51 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

Expand the transport fetch method signature, by adding an output
parameter, to allow transports to return information about the refs they
have fetched.  Then communicate shallow status information through this
mechanism instead of by modifying the input list of refs.

This does require clients to sometimes generate the ref map twice: once
from the list of refs provided by the remote (as is currently done) and
potentially once from the new list of refs that the fetch mechanism
provides.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/clone.c      |  4 ++--
 builtin/fetch.c      | 23 +++++++++++++++++++----
 fetch-object.c       |  2 +-
 fetch-pack.c         | 17 +++++++++--------
 transport-helper.c   |  6 ++++--
 transport-internal.h |  9 ++++++++-
 transport.c          | 34 ++++++++++++++++++++++++++++------
 transport.h          |  3 ++-
 8 files changed, 73 insertions(+), 25 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 99e73dae8..8f86d99c5 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1155,7 +1155,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 			}
 
 		if (!is_local && !complete_refs_before_fetch)
-			transport_fetch_refs(transport, mapped_refs);
+			transport_fetch_refs(transport, mapped_refs, NULL);
 
 		remote_head = find_ref_by_name(refs, "HEAD");
 		remote_head_points_at =
@@ -1197,7 +1197,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (is_local)
 		clone_local(path, git_dir);
 	else if (refs && complete_refs_before_fetch)
-		transport_fetch_refs(transport, mapped_refs);
+		transport_fetch_refs(transport, mapped_refs, NULL);
 
 	update_remote_refs(refs, mapped_refs, remote_head_points_at,
 			   branch_top.buf, reflog_msg.buf, transport,
diff --git a/builtin/fetch.c b/builtin/fetch.c
index b600e1f10..ddf44ba1a 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -946,11 +946,13 @@ static int quickfetch(struct ref *ref_map)
 	return check_connected(iterate_ref_map, &rm, &opt);
 }
 
-static int fetch_refs(struct transport *transport, struct ref *ref_map)
+static int fetch_refs(struct transport *transport, struct ref *ref_map,
+		      struct ref **updated_remote_refs)
 {
 	int ret = quickfetch(ref_map);
 	if (ret)
-		ret = transport_fetch_refs(transport, ref_map);
+		ret = transport_fetch_refs(transport, ref_map,
+					   updated_remote_refs);
 	if (ret)
 		transport_unlock_pack(transport);
 	return ret;
@@ -1106,7 +1108,7 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map)
 	transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL);
 	transport_set_option(transport, TRANS_OPT_DEPTH, "0");
 	transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL);
-	if (!fetch_refs(transport, ref_map))
+	if (!fetch_refs(transport, ref_map, NULL))
 		consume_refs(transport, ref_map);
 
 	if (gsecondary) {
@@ -1122,6 +1124,7 @@ static int do_fetch(struct transport *transport,
 	int autotags = (transport->remote->fetch_tags == 1);
 	int retcode = 0;
 	const struct ref *remote_refs;
+	struct ref *new_remote_refs = NULL;
 	struct argv_array ref_prefixes = ARGV_ARRAY_INIT;
 
 	if (tags == TAGS_DEFAULT) {
@@ -1172,7 +1175,19 @@ static int do_fetch(struct transport *transport,
 				   transport->url);
 		}
 	}
-	if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) {
+
+	if (fetch_refs(transport, ref_map, &new_remote_refs)) {
+		free_refs(ref_map);
+		retcode = 1;
+		goto cleanup;
+	}
+	if (new_remote_refs) {
+		free_refs(ref_map);
+		ref_map = get_ref_map(transport->remote, new_remote_refs, rs,
+				      tags, &autotags);
+		free_refs(new_remote_refs);
+	}
+	if (consume_refs(transport, ref_map)) {
 		free_refs(ref_map);
 		retcode = 1;
 		goto cleanup;
diff --git a/fetch-object.c b/fetch-object.c
index 853624f81..48fe63dd6 100644
--- a/fetch-object.c
+++ b/fetch-object.c
@@ -19,7 +19,7 @@ static void fetch_refs(const char *remote_name, struct ref *ref)
 
 	transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
 	transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
-	transport_fetch_refs(transport, ref);
+	transport_fetch_refs(transport, ref, NULL);
 	fetch_if_missing = original_fetch_if_missing;
 }
 
diff --git a/fetch-pack.c b/fetch-pack.c
index a320ce987..7799ee2cd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1470,12 +1470,13 @@ static int remove_duplicates_in_refs(struct ref **ref, int nr)
 }
 
 static void update_shallow(struct fetch_pack_args *args,
-			   struct ref **sought, int nr_sought,
+			   struct ref *refs,
 			   struct shallow_info *si)
 {
 	struct oid_array ref = OID_ARRAY_INIT;
 	int *status;
-	int i;
+	int i = 0;
+	struct ref *r;
 
 	if (args->deepen && alternate_shallow_file) {
 		if (*alternate_shallow_file == '\0') { /* --unshallow */
@@ -1517,8 +1518,8 @@ static void update_shallow(struct fetch_pack_args *args,
 	remove_nonexistent_theirs_shallow(si);
 	if (!si->nr_ours && !si->nr_theirs)
 		return;
-	for (i = 0; i < nr_sought; i++)
-		oid_array_append(&ref, &sought[i]->old_oid);
+	for (r = refs; r; r = r->next)
+		oid_array_append(&ref, &r->old_oid);
 	si->ref = &ref;
 
 	if (args->update_shallow) {
@@ -1552,12 +1553,12 @@ static void update_shallow(struct fetch_pack_args *args,
 	 * remote is also shallow, check what ref is safe to update
 	 * without updating .git/shallow
 	 */
-	status = xcalloc(nr_sought, sizeof(*status));
+	status = xcalloc(ref.nr, sizeof(*status));
 	assign_shallow_commits_to_refs(si, NULL, status);
 	if (si->nr_ours || si->nr_theirs) {
-		for (i = 0; i < nr_sought; i++)
+		for (r = refs; r; r = r->next, i++)
 			if (status[i])
-				sought[i]->status = REF_STATUS_REJECT_SHALLOW;
+				r->status = REF_STATUS_REJECT_SHALLOW;
 	}
 	free(status);
 	oid_array_clear(&ref);
@@ -1591,7 +1592,7 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
 					&si, pack_lockfile);
 	reprepare_packed_git(the_repository);
-	update_shallow(args, sought, nr_sought, &si);
+	update_shallow(args, ref_cpy, &si);
 	clear_shallow_info(&si);
 	return ref_cpy;
 }
diff --git a/transport-helper.c b/transport-helper.c
index 1f8ff7e94..8b5abca29 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -651,14 +651,16 @@ static int connect_helper(struct transport *transport, const char *name,
 }
 
 static int fetch(struct transport *transport,
-		 int nr_heads, struct ref **to_fetch)
+		 int nr_heads, struct ref **to_fetch,
+		 struct ref **fetched_refs)
 {
 	struct helper_data *data = transport->data;
 	int i, count;
 
 	if (process_connect(transport, 0)) {
 		do_take_over(transport);
-		return transport->vtable->fetch(transport, nr_heads, to_fetch);
+		return transport->vtable->fetch(transport, nr_heads, to_fetch,
+						fetched_refs);
 	}
 
 	count = 0;
diff --git a/transport-internal.h b/transport-internal.h
index 1cde6258a..eeb6c340e 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -36,11 +36,18 @@ struct transport_vtable {
 	 * Fetch the objects for the given refs. Note that this gets
 	 * an array, and should ignore the list structure.
 	 *
+	 * The transport *may* provide, in fetched_refs, the list of refs that
+	 * it fetched.  If the transport knows anything about the fetched refs
+	 * that the caller does not know (for example, shallow status), it
+	 * should provide that list of refs and include that information in the
+	 * list.
+	 *
 	 * If the transport did not get hashes for refs in
 	 * get_refs_list(), it should set the old_sha1 fields in the
 	 * provided refs now.
 	 **/
-	int (*fetch)(struct transport *transport, int refs_nr, struct ref **refs);
+	int (*fetch)(struct transport *transport, int refs_nr, struct ref **refs,
+		     struct ref **fetched_refs);
 
 	/**
 	 * Push the objects and refs. Send the necessary objects, and
diff --git a/transport.c b/transport.c
index a32da30de..8704c20f1 100644
--- a/transport.c
+++ b/transport.c
@@ -151,7 +151,8 @@ static struct ref *get_refs_from_bundle(struct transport *transport,
 }
 
 static int fetch_refs_from_bundle(struct transport *transport,
-			       int nr_heads, struct ref **to_fetch)
+			       int nr_heads, struct ref **to_fetch,
+			       struct ref **fetched_refs)
 {
 	struct bundle_transport_data *data = transport->data;
 	return unbundle(&data->header, data->fd,
@@ -287,7 +288,8 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 }
 
 static int fetch_refs_via_pack(struct transport *transport,
-			       int nr_heads, struct ref **to_fetch)
+			       int nr_heads, struct ref **to_fetch,
+			       struct ref **fetched_refs)
 {
 	int ret = 0;
 	struct git_transport_data *data = transport->data;
@@ -354,8 +356,12 @@ static int fetch_refs_via_pack(struct transport *transport,
 	if (report_unmatched_refs(to_fetch, nr_heads))
 		ret = -1;
 
+	if (fetched_refs)
+		*fetched_refs = refs;
+	else
+		free_refs(refs);
+
 	free_refs(refs_tmp);
-	free_refs(refs);
 	free(dest);
 	return ret;
 }
@@ -1215,19 +1221,31 @@ const struct ref *transport_get_remote_refs(struct transport *transport,
 	return transport->remote_refs;
 }
 
-int transport_fetch_refs(struct transport *transport, struct ref *refs)
+int transport_fetch_refs(struct transport *transport, struct ref *refs,
+			 struct ref **fetched_refs)
 {
 	int rc;
 	int nr_heads = 0, nr_alloc = 0, nr_refs = 0;
 	struct ref **heads = NULL;
+	struct ref *nop_head = NULL, **nop_tail = &nop_head;
 	struct ref *rm;
 
 	for (rm = refs; rm; rm = rm->next) {
 		nr_refs++;
 		if (rm->peer_ref &&
 		    !is_null_oid(&rm->old_oid) &&
-		    !oidcmp(&rm->peer_ref->old_oid, &rm->old_oid))
+		    !oidcmp(&rm->peer_ref->old_oid, &rm->old_oid)) {
+			/*
+			 * These need to be reported as fetched, but we don not
+			 * actually need to fetch them.
+			 */
+			if (fetched_refs) {
+				struct ref *nop_ref = copy_ref(rm);
+				*nop_tail = nop_ref;
+				nop_tail = &nop_ref->next;
+			}
 			continue;
+		}
 		ALLOC_GROW(heads, nr_heads + 1, nr_alloc);
 		heads[nr_heads++] = rm;
 	}
@@ -1245,7 +1263,11 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 			heads[nr_heads++] = rm;
 	}
 
-	rc = transport->vtable->fetch(transport, nr_heads, heads);
+	rc = transport->vtable->fetch(transport, nr_heads, heads, fetched_refs);
+	if (fetched_refs && nop_head) {
+		*nop_tail = *fetched_refs;
+		*fetched_refs = nop_head;
+	}
 
 	free(heads);
 	return rc;
diff --git a/transport.h b/transport.h
index 7792b0858..3dff767a8 100644
--- a/transport.h
+++ b/transport.h
@@ -218,7 +218,8 @@ int transport_push(struct transport *connection,
 const struct ref *transport_get_remote_refs(struct transport *transport,
 					    const struct argv_array *ref_prefixes);
 
-int transport_fetch_refs(struct transport *transport, struct ref *refs);
+int transport_fetch_refs(struct transport *transport, struct ref *refs,
+			 struct ref **fetched_refs);
 void transport_unlock_pack(struct transport *transport);
 int transport_disconnect(struct transport *transport);
 char *transport_anonymize_url(const char *url);
-- 
2.17.1.1185.g55be947832-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 8/8] fetch-pack: implement ref-in-want
  2018-06-05 17:51 [PATCH 0/8] ref-in-want Brandon Williams
                   ` (6 preceding siblings ...)
  2018-06-05 17:51 ` [PATCH 7/8] fetch-pack: put shallow info in output parameter Brandon Williams
@ 2018-06-05 17:51 ` Brandon Williams
  2018-06-13 21:39 ` [PATCH v2 0/8] ref-in-want Brandon Williams
  2018-06-15 19:04 ` [PATCH " Jonathan Tan
  9 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-05 17:51 UTC (permalink / raw)
  To: git; +Cc: Brandon Williams

Implement ref-in-want on the client side so that when a server supports
the "ref-in-want" feature, a client will send "want-ref" lines for each
reference the client wants to fetch.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 fetch-pack.c                       | 35 +++++++++++++++++++++++++++---
 remote.c                           |  1 +
 remote.h                           |  1 +
 t/t5703-upload-pack-ref-in-want.sh |  4 ++--
 4 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 7799ee2cd..51e8356ba 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1102,9 +1102,10 @@ static void add_shallow_requests(struct strbuf *req_buf,
 
 static void add_wants(const struct ref *wants, struct strbuf *req_buf)
 {
+	int use_ref_in_want = server_supports_feature("fetch", "ref-in-want", 0);
+
 	for ( ; wants ; wants = wants->next) {
 		const struct object_id *remote = &wants->old_oid;
-		const char *remote_hex;
 		struct object *o;
 
 		/*
@@ -1122,8 +1123,10 @@ static void add_wants(const struct ref *wants, struct strbuf *req_buf)
 			continue;
 		}
 
-		remote_hex = oid_to_hex(remote);
-		packet_buf_write(req_buf, "want %s\n", remote_hex);
+		if (!use_ref_in_want || wants->exact_sha1)
+			packet_buf_write(req_buf, "want %s\n", oid_to_hex(remote));
+		else
+			packet_buf_write(req_buf, "want-ref %s\n", wants->name);
 	}
 }
 
@@ -1334,6 +1337,29 @@ static void receive_shallow_info(struct fetch_pack_args *args,
 	args->deepen = 1;
 }
 
+static void receive_wanted_refs(struct packet_reader *reader, struct ref *refs)
+{
+	process_section_header(reader, "wanted-refs", 0);
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		struct object_id oid;
+		const char *end;
+		struct ref *r = NULL;
+
+		if (parse_oid_hex(reader->line, &oid, &end) || *end++ != ' ')
+			die("expected wanted-ref, got '%s'", reader->line);
+
+		for (r = refs; r; r = r->next) {
+			if (!strcmp(end, r->name)) {
+				oidcpy(&r->old_oid, &oid);
+				break;
+			}
+		}
+	}
+
+	if (reader->status != PACKET_READ_DELIM)
+		die("error processing wanted refs: %d", reader->status);
+}
+
 enum fetch_state {
 	FETCH_CHECK_LOCAL = 0,
 	FETCH_SEND_REQUEST,
@@ -1408,6 +1434,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 			if (process_section_header(&reader, "shallow-info", 1))
 				receive_shallow_info(args, &reader);
 
+			if (process_section_header(&reader, "wanted-refs", 1))
+				receive_wanted_refs(&reader, ref);
+
 			/* get the pack */
 			process_section_header(&reader, "packfile", 0);
 			if (get_pack(args, fd, pack_lockfile))
diff --git a/remote.c b/remote.c
index abe80c139..c9d452ac0 100644
--- a/remote.c
+++ b/remote.c
@@ -1735,6 +1735,7 @@ int get_fetch_map(const struct ref *remote_refs,
 		if (refspec->exact_sha1) {
 			ref_map = alloc_ref(name);
 			get_oid_hex(name, &ref_map->old_oid);
+			ref_map->exact_sha1 = 1;
 		} else {
 			ref_map = get_remote_ref(remote_refs, name);
 		}
diff --git a/remote.h b/remote.h
index 45ecc6cef..e5338e368 100644
--- a/remote.h
+++ b/remote.h
@@ -73,6 +73,7 @@ struct ref {
 		force:1,
 		forced_update:1,
 		expect_old_sha1:1,
+		exact_sha1:1,
 		deletion:1;
 
 	enum {
diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh
index 979ab6d03..b94a51380 100755
--- a/t/t5703-upload-pack-ref-in-want.sh
+++ b/t/t5703-upload-pack-ref-in-want.sh
@@ -204,7 +204,7 @@ test_expect_success 'server is initially ahead - no ref in want' '
 	grep "ERR upload-pack: not our ref" err
 '
 
-test_expect_failure 'server is initially ahead - ref in want' '
+test_expect_success 'server is initially ahead - ref in want' '
 	git -C "$REPO" config uploadpack.allowRefInWant true &&
 	rm -rf local &&
 	cp -r "$LOCAL_PRISTINE" local &&
@@ -228,7 +228,7 @@ test_expect_success 'server is initially behind - no ref in want' '
 	test_cmp expected actual
 '
 
-test_expect_failure 'server is initially behind - ref in want' '
+test_expect_success 'server is initially behind - ref in want' '
 	git -C "$REPO" config uploadpack.allowRefInWant true &&
 	rm -rf local &&
 	cp -r "$LOCAL_PRISTINE" local &&
-- 
2.17.1.1185.g55be947832-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/8] upload-pack: implement ref-in-want
  2018-06-05 17:51 ` [PATCH 2/8] upload-pack: implement ref-in-want Brandon Williams
@ 2018-06-05 19:11   ` Ramsay Jones
  2018-06-05 20:32   ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 46+ messages in thread
From: Ramsay Jones @ 2018-06-05 19:11 UTC (permalink / raw)
  To: Brandon Williams, git



On 05/06/18 18:51, Brandon Williams wrote:
> Currently, while performing packfile negotiation, clients are only
> allowed to specify their desired objects using object ids.  This causes
> a vulnerability to failure when an object turns non-existent during
> negotiation, which may happen if, for example, the desired repository is
> provided by multiple Git servers in a load-balancing arrangement.
> 
> In order to eliminate this vulnerability, implement the ref-in-want
> feature for the 'fetch' command in protocol version 2.  This feature
> enables the 'fetch' command to support requests in the form of ref names
> through a new "want-ref <ref>" parameter.  At the conclusion of
> negotiation, the server will send a list of all of the wanted references
> (as provided by "want-ref" lines) in addition to the generated packfile.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  Documentation/config.txt                |   4 +
>  Documentation/technical/protocol-v2.txt |  28 ++++-
>  t/t5703-upload-pack-ref-in-want.sh      | 153 ++++++++++++++++++++++++
>  upload-pack.c                           |  64 ++++++++++
>  4 files changed, 248 insertions(+), 1 deletion(-)
>  create mode 100755 t/t5703-upload-pack-ref-in-want.sh
> 
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index ab641bf5a..acafe6c8d 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -3479,6 +3479,10 @@ Note that this configuration variable is ignored if it is seen in the
>  repository-level config (this is a safety measure against fetching from
>  untrusted repositories).
>  
> +uploadpack.allowRefInWant::
> +	If this option is set, `upload-pack` will support the `ref-in-want`
> +	feature of the protocol version 2 `fetch` command.
> +
>  url.<base>.insteadOf::
>  	Any URL that starts with this value will be rewritten to
>  	start, instead, with <base>. In cases where some site serves a
> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index 49bda76d2..8367e09b8 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -299,12 +299,21 @@ included in the client's request:
>  	for use with partial clone and partial fetch operations. See
>  	`rev-list` for possible "filter-spec" values.
>  
> +If the 'ref-in-want' feature is advertised, the following argument can
> +be included in the client's request as well as the potential addition of
> +the 'wanted-refs' section in the server's response as explained below.
> +
> +    want-ref <ref>
> +	Indicates to the server than the client wants to retrieve a
> +	particular ref, where <ref> is the full name of a ref on the
> +	server.
> +
>  The response of `fetch` is broken into a number of sections separated by
>  delimiter packets (0001), with each section beginning with its section
>  header.
>  
>      output = *section
> -    section = (acknowledgments | shallow-info | packfile)
> +    section = (acknowledgments | shallow-info | wanted-refs | packfile)
>  	      (flush-pkt | delim-pkt)
>  
>      acknowledgments = PKT-LINE("acknowledgments" LF)
> @@ -319,6 +328,10 @@ header.
>      shallow = "shallow" SP obj-id
>      unshallow = "unshallow" SP obj-id
>  
> +    wanted-refs = PKT-LINE("wanted-refs" LF)
> +		  *PKT-Line(wanted-ref LF)

s/PKT-Line/PKT-LINE/

ATB,
Ramsay Jones


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/8] upload-pack: implement ref-in-want
  2018-06-05 17:51 ` [PATCH 2/8] upload-pack: implement ref-in-want Brandon Williams
  2018-06-05 19:11   ` Ramsay Jones
@ 2018-06-05 20:32   ` Ævar Arnfjörð Bjarmason
  2018-06-06 21:32     ` Brandon Williams
  1 sibling, 1 reply; 46+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-06-05 20:32 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git


On Tue, Jun 05 2018, Brandon Williams wrote:

> +uploadpack.allowRefInWant::
> +	If this option is set, `upload-pack` will support the `ref-in-want`
> +	feature of the protocol version 2 `fetch` command.
> +

I think it makes sense to elaborate a bit on what this is for. Having
read this series through, and to make sure I understood this, maybe
something like this:

   This feature is intended for the benefit of load-balanced servers
   which may not have the same view of what SHA-1s their refs point to,
   but are guaranteed to never advertise a reference that another server
   serving the request doesn't know about.

I.e. from what I can tell this gives no benefits for someone using a
monolithic git server, except insofar as there would be a slight
decrease in network traffic if the average length of refs is less than
the length of a SHA-1.

That's fair enough, just something we should prominently say.

It does have the "disadvantage", if you can call it that, that it's
introducing a race condition between when we read the ref advertisement
and are promised XYZ refs, but may actually get ABC, but I can't think
of a reason anyone would care about this in practice.

The reason I'm saying "another server [...] doesn't know about" above is
that 2/8 has this:

	if (read_ref(arg, &oid))
		die("unknown ref %s", arg);

Doesn't that mean that if server A in your pool advertises master, next
& pu, and you then go and fetch from server B advertising master & next,
but not "pu" that the clone will die?

Presumably at Google you either have something to ensure a consistent
view, e.g. only advertise refs by name older than N seconds, or globally
update ref name but not their contents, and don't allow deleting refs
(or give them the same treatment).

But that, and again, I may have misunderstood this whole thing,
significantly reduces the utility of this feature for anyone "in the
wild" since nothing shipped with "git" gives you that feature.

The naïve way to do slave mirroring with stock git is to have a
post-receive hook that pushes to your mirrors in a for-loop, or has them
fetch from the master in a loop, and then round-robin LB those
servers. Due to the "die on nonexisting" semantics in this extension
that'll result in failed clones.

So I think we should either be really vocal about that caveat, or
perhaps think of how we could make that configurable, e.g. what happens
if the server says "sorry, don't know about that one", and carries on
with the rest it does know about?

Is there a way for client & server to gracefully recover from that?
E.g. send "master" & "next" now, and when I pull again in a few seconds
I get the new "pu"?

Also, as a digression isn't that a problem shared with protocol v2 in
general? I.e. without this extension isn't it going to make another
connection to the naïve LB'd mirroring setup described above and find
that SHA-1s as well as refs don't match?

BREAK.

Also is if this E-Mail wasn't long enough, on a completely different
topic, in an earlier discussion in
https://public-inbox.org/git/87inaje1uv.fsf@evledraar.gmail.com/ I noted
that it would be neat-o to have optional wildmatch/pcre etc. matching
for the use case you're not caring about here (and I don't expect you
to, you're solving a different problem).

But let's say I want to add that after this, and being unfamiliar with
the protocol v2 conventions. Would that be a whole new
ref-in-want-wildmatch-prefix capability with a new
want-ref-wildmatch-prefix verb, or is there some less verbose way we can
anticipate that use-case and internally version / advertise
sub-capabilities?

I don't know if that makes any sense, and would be fine with just a
ref-in-want-wildmatch-prefix if that's the way to do it. I just think
it's inevitable that we'll have such a thing eventually, so it's worth
thinking about how such a future extension fits in.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/8] upload-pack: implement ref-in-want
  2018-06-05 20:32   ` Ævar Arnfjörð Bjarmason
@ 2018-06-06 21:32     ` Brandon Williams
  2018-06-06 22:42       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 46+ messages in thread
From: Brandon Williams @ 2018-06-06 21:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

On 06/05, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Jun 05 2018, Brandon Williams wrote:
> 
> > +uploadpack.allowRefInWant::
> > +	If this option is set, `upload-pack` will support the `ref-in-want`
> > +	feature of the protocol version 2 `fetch` command.
> > +
> 
> I think it makes sense to elaborate a bit on what this is for. Having
> read this series through, and to make sure I understood this, maybe
> something like this:
> 
>    This feature is intended for the benefit of load-balanced servers
>    which may not have the same view of what SHA-1s their refs point to,
>    but are guaranteed to never advertise a reference that another server
>    serving the request doesn't know about.
> 
> I.e. from what I can tell this gives no benefits for someone using a
> monolithic git server, except insofar as there would be a slight
> decrease in network traffic if the average length of refs is less than
> the length of a SHA-1.

Yeah I agree that the motivation should probably be spelled out more,
thanks for the suggestion.

> 
> That's fair enough, just something we should prominently say.
> 
> It does have the "disadvantage", if you can call it that, that it's
> introducing a race condition between when we read the ref advertisement
> and are promised XYZ refs, but may actually get ABC, but I can't think
> of a reason anyone would care about this in practice.
> 
> The reason I'm saying "another server [...] doesn't know about" above is
> that 2/8 has this:
> 
> 	if (read_ref(arg, &oid))
> 		die("unknown ref %s", arg);
> 
> Doesn't that mean that if server A in your pool advertises master, next
> & pu, and you then go and fetch from server B advertising master & next,
> but not "pu" that the clone will die?
> 
> Presumably at Google you either have something to ensure a consistent
> view, e.g. only advertise refs by name older than N seconds, or globally
> update ref name but not their contents, and don't allow deleting refs
> (or give them the same treatment).
> 
> But that, and again, I may have misunderstood this whole thing,
> significantly reduces the utility of this feature for anyone "in the
> wild" since nothing shipped with "git" gives you that feature.
> 
> The naïve way to do slave mirroring with stock git is to have a
> post-receive hook that pushes to your mirrors in a for-loop, or has them
> fetch from the master in a loop, and then round-robin LB those
> servers. Due to the "die on nonexisting" semantics in this extension
> that'll result in failed clones.
> 
> So I think we should either be really vocal about that caveat, or
> perhaps think of how we could make that configurable, e.g. what happens
> if the server says "sorry, don't know about that one", and carries on
> with the rest it does know about?

Jonathan actually pointed this out to me earlier and I think the best
way to deal with this is to just ignore the refs that the server doesn't
know about instead of dying here. I mean its no worse than what we
already have and we shouldn't hit this case too often.  And that way the
fetch can still proceed.

> 
> Is there a way for client & server to gracefully recover from that?
> E.g. send "master" & "next" now, and when I pull again in a few seconds
> I get the new "pu"?

I think in this case the client would just need to wait for some amount
of replication delay and attempt fetching at a later point.

> 
> Also, as a digression isn't that a problem shared with protocol v2 in
> general? I.e. without this extension isn't it going to make another
> connection to the naïve LB'd mirroring setup described above and find
> that SHA-1s as well as refs don't match?

This is actually an issue with fetch using either v2 or v0.  Unless I'm
misunderstanding what you're asking here.

> 
> BREAK.
> 
> Also is if this E-Mail wasn't long enough, on a completely different
> topic, in an earlier discussion in
> https://public-inbox.org/git/87inaje1uv.fsf@evledraar.gmail.com/ I noted
> that it would be neat-o to have optional wildmatch/pcre etc. matching
> for the use case you're not caring about here (and I don't expect you
> to, you're solving a different problem).
> 
> But let's say I want to add that after this, and being unfamiliar with
> the protocol v2 conventions. Would that be a whole new
> ref-in-want-wildmatch-prefix capability with a new
> want-ref-wildmatch-prefix verb, or is there some less verbose way we can
> anticipate that use-case and internally version / advertise
> sub-capabilities?
> 
> I don't know if that makes any sense, and would be fine with just a
> ref-in-want-wildmatch-prefix if that's the way to do it. I just think
> it's inevitable that we'll have such a thing eventually, so it's worth
> thinking about how such a future extension fits in.

Yes back when introducing the server-side ref filtering in ls-refs we
originally talked about included wildmatch or other forms of pattern
matching.  We opted to not over complicate things and favored prefix
matching because it didn't bake in some subset of globbing or regex and
it was easier to compute on the server side.

Anyway back to your question.  Yes if at some point in the future we
wanted to add in wildmatch/pcre to the protocol for ls-refs or for
ref-in-want then it could be added as a feature or capability.  I don't
think it would require adding a whole new verb (it probably would for
the ls-refs case since the verb used there is "ref-prefix") but the
capability could mean that the "want-ref" verb now understands wildmatch
patterns in addition to fully qualified refs.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/8] upload-pack: implement ref-in-want
  2018-06-06 21:32     ` Brandon Williams
@ 2018-06-06 22:42       ` Ævar Arnfjörð Bjarmason
  2018-06-06 22:45         ` Brandon Williams
  0 siblings, 1 reply; 46+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-06-06 22:42 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git


On Wed, Jun 06 2018, Brandon Williams wrote:

> On 06/05, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Tue, Jun 05 2018, Brandon Williams wrote:
>>
>> > +uploadpack.allowRefInWant::
>> > +	If this option is set, `upload-pack` will support the `ref-in-want`
>> > +	feature of the protocol version 2 `fetch` command.
>> > +
>>
>> I think it makes sense to elaborate a bit on what this is for. Having
>> read this series through, and to make sure I understood this, maybe
>> something like this:
>>
>>    This feature is intended for the benefit of load-balanced servers
>>    which may not have the same view of what SHA-1s their refs point to,
>>    but are guaranteed to never advertise a reference that another server
>>    serving the request doesn't know about.
>>
>> I.e. from what I can tell this gives no benefits for someone using a
>> monolithic git server, except insofar as there would be a slight
>> decrease in network traffic if the average length of refs is less than
>> the length of a SHA-1.
>
> Yeah I agree that the motivation should probably be spelled out more,
> thanks for the suggestion.
>
>>
>> That's fair enough, just something we should prominently say.
>>
>> It does have the "disadvantage", if you can call it that, that it's
>> introducing a race condition between when we read the ref advertisement
>> and are promised XYZ refs, but may actually get ABC, but I can't think
>> of a reason anyone would care about this in practice.
>>
>> The reason I'm saying "another server [...] doesn't know about" above is
>> that 2/8 has this:
>>
>> 	if (read_ref(arg, &oid))
>> 		die("unknown ref %s", arg);
>>
>> Doesn't that mean that if server A in your pool advertises master, next
>> & pu, and you then go and fetch from server B advertising master & next,
>> but not "pu" that the clone will die?
>>
>> Presumably at Google you either have something to ensure a consistent
>> view, e.g. only advertise refs by name older than N seconds, or globally
>> update ref name but not their contents, and don't allow deleting refs
>> (or give them the same treatment).
>>
>> But that, and again, I may have misunderstood this whole thing,
>> significantly reduces the utility of this feature for anyone "in the
>> wild" since nothing shipped with "git" gives you that feature.
>>
>> The naïve way to do slave mirroring with stock git is to have a
>> post-receive hook that pushes to your mirrors in a for-loop, or has them
>> fetch from the master in a loop, and then round-robin LB those
>> servers. Due to the "die on nonexisting" semantics in this extension
>> that'll result in failed clones.
>>
>> So I think we should either be really vocal about that caveat, or
>> perhaps think of how we could make that configurable, e.g. what happens
>> if the server says "sorry, don't know about that one", and carries on
>> with the rest it does know about?
>
> Jonathan actually pointed this out to me earlier and I think the best
> way to deal with this is to just ignore the refs that the server doesn't
> know about instead of dying here. I mean its no worse than what we
> already have and we shouldn't hit this case too often.  And that way the
> fetch can still proceed.
>
>>
>> Is there a way for client & server to gracefully recover from that?
>> E.g. send "master" & "next" now, and when I pull again in a few seconds
>> I get the new "pu"?
>
> I think in this case the client would just need to wait for some amount
> of replication delay and attempt fetching at a later point.
>
>>
>> Also, as a digression isn't that a problem shared with protocol v2 in
>> general? I.e. without this extension isn't it going to make another
>> connection to the naïve LB'd mirroring setup described above and find
>> that SHA-1s as well as refs don't match?
>
> This is actually an issue with fetch using either v2 or v0.  Unless I'm
> misunderstanding what you're asking here.

Isn't the whole dialog in v1 guaranteed to be with one server from
intial ref advertisement to the client saying have/want, or is that just
with ssh?

In any case the reason the above is an issue here is because you're
getting the advertisement from a different server than you're
negotiating the pack with, right?

>>
>> BREAK.
>>
>> Also is if this E-Mail wasn't long enough, on a completely different
>> topic, in an earlier discussion in
>> https://public-inbox.org/git/87inaje1uv.fsf@evledraar.gmail.com/ I noted
>> that it would be neat-o to have optional wildmatch/pcre etc. matching
>> for the use case you're not caring about here (and I don't expect you
>> to, you're solving a different problem).
>>
>> But let's say I want to add that after this, and being unfamiliar with
>> the protocol v2 conventions. Would that be a whole new
>> ref-in-want-wildmatch-prefix capability with a new
>> want-ref-wildmatch-prefix verb, or is there some less verbose way we can
>> anticipate that use-case and internally version / advertise
>> sub-capabilities?
>>
>> I don't know if that makes any sense, and would be fine with just a
>> ref-in-want-wildmatch-prefix if that's the way to do it. I just think
>> it's inevitable that we'll have such a thing eventually, so it's worth
>> thinking about how such a future extension fits in.
>
> Yes back when introducing the server-side ref filtering in ls-refs we
> originally talked about included wildmatch or other forms of pattern
> matching.  We opted to not over complicate things and favored prefix
> matching because it didn't bake in some subset of globbing or regex and
> it was easier to compute on the server side.
>
> Anyway back to your question.  Yes if at some point in the future we
> wanted to add in wildmatch/pcre to the protocol for ls-refs or for
> ref-in-want then it could be added as a feature or capability.  I don't
> think it would require adding a whole new verb (it probably would for
> the ls-refs case since the verb used there is "ref-prefix") but the
> capability could mean that the "want-ref" verb now understands wildmatch
> patterns in addition to fully qualified refs.

Probably still makes sense to have it be a different verb since some
things in wildmatch / regex are metachars but may be valid in ref names.

Thanks!

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/8] upload-pack: implement ref-in-want
  2018-06-06 22:42       ` Ævar Arnfjörð Bjarmason
@ 2018-06-06 22:45         ` Brandon Williams
  0 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-06 22:45 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

On 06/07, Ævar Arnfjörð Bjarmason wrote:
> 
> On Wed, Jun 06 2018, Brandon Williams wrote:
> 
> > On 06/05, Ævar Arnfjörð Bjarmason wrote:
> >>
> >> On Tue, Jun 05 2018, Brandon Williams wrote:
> >>
> >> > +uploadpack.allowRefInWant::
> >> > +	If this option is set, `upload-pack` will support the `ref-in-want`
> >> > +	feature of the protocol version 2 `fetch` command.
> >> > +
> >>
> >> I think it makes sense to elaborate a bit on what this is for. Having
> >> read this series through, and to make sure I understood this, maybe
> >> something like this:
> >>
> >>    This feature is intended for the benefit of load-balanced servers
> >>    which may not have the same view of what SHA-1s their refs point to,
> >>    but are guaranteed to never advertise a reference that another server
> >>    serving the request doesn't know about.
> >>
> >> I.e. from what I can tell this gives no benefits for someone using a
> >> monolithic git server, except insofar as there would be a slight
> >> decrease in network traffic if the average length of refs is less than
> >> the length of a SHA-1.
> >
> > Yeah I agree that the motivation should probably be spelled out more,
> > thanks for the suggestion.
> >
> >>
> >> That's fair enough, just something we should prominently say.
> >>
> >> It does have the "disadvantage", if you can call it that, that it's
> >> introducing a race condition between when we read the ref advertisement
> >> and are promised XYZ refs, but may actually get ABC, but I can't think
> >> of a reason anyone would care about this in practice.
> >>
> >> The reason I'm saying "another server [...] doesn't know about" above is
> >> that 2/8 has this:
> >>
> >> 	if (read_ref(arg, &oid))
> >> 		die("unknown ref %s", arg);
> >>
> >> Doesn't that mean that if server A in your pool advertises master, next
> >> & pu, and you then go and fetch from server B advertising master & next,
> >> but not "pu" that the clone will die?
> >>
> >> Presumably at Google you either have something to ensure a consistent
> >> view, e.g. only advertise refs by name older than N seconds, or globally
> >> update ref name but not their contents, and don't allow deleting refs
> >> (or give them the same treatment).
> >>
> >> But that, and again, I may have misunderstood this whole thing,
> >> significantly reduces the utility of this feature for anyone "in the
> >> wild" since nothing shipped with "git" gives you that feature.
> >>
> >> The naïve way to do slave mirroring with stock git is to have a
> >> post-receive hook that pushes to your mirrors in a for-loop, or has them
> >> fetch from the master in a loop, and then round-robin LB those
> >> servers. Due to the "die on nonexisting" semantics in this extension
> >> that'll result in failed clones.
> >>
> >> So I think we should either be really vocal about that caveat, or
> >> perhaps think of how we could make that configurable, e.g. what happens
> >> if the server says "sorry, don't know about that one", and carries on
> >> with the rest it does know about?
> >
> > Jonathan actually pointed this out to me earlier and I think the best
> > way to deal with this is to just ignore the refs that the server doesn't
> > know about instead of dying here. I mean its no worse than what we
> > already have and we shouldn't hit this case too often.  And that way the
> > fetch can still proceed.
> >
> >>
> >> Is there a way for client & server to gracefully recover from that?
> >> E.g. send "master" & "next" now, and when I pull again in a few seconds
> >> I get the new "pu"?
> >
> > I think in this case the client would just need to wait for some amount
> > of replication delay and attempt fetching at a later point.
> >
> >>
> >> Also, as a digression isn't that a problem shared with protocol v2 in
> >> general? I.e. without this extension isn't it going to make another
> >> connection to the naïve LB'd mirroring setup described above and find
> >> that SHA-1s as well as refs don't match?
> >
> > This is actually an issue with fetch using either v2 or v0.  Unless I'm
> > misunderstanding what you're asking here.
> 
> Isn't the whole dialog in v1 guaranteed to be with one server from
> intial ref advertisement to the client saying have/want, or is that just
> with ssh?

That's only guaranteed with statefull connections (git:// and ssh://),
http:// has this issue because its stateless.

> 
> In any case the reason the above is an issue here is because you're
> getting the advertisement from a different server than you're
> negotiating the pack with, right?

Yes correct, or even a different server on each negotiation round-trip.

> 
> >>
> >> BREAK.
> >>
> >> Also is if this E-Mail wasn't long enough, on a completely different
> >> topic, in an earlier discussion in
> >> https://public-inbox.org/git/87inaje1uv.fsf@evledraar.gmail.com/ I noted
> >> that it would be neat-o to have optional wildmatch/pcre etc. matching
> >> for the use case you're not caring about here (and I don't expect you
> >> to, you're solving a different problem).
> >>
> >> But let's say I want to add that after this, and being unfamiliar with
> >> the protocol v2 conventions. Would that be a whole new
> >> ref-in-want-wildmatch-prefix capability with a new
> >> want-ref-wildmatch-prefix verb, or is there some less verbose way we can
> >> anticipate that use-case and internally version / advertise
> >> sub-capabilities?
> >>
> >> I don't know if that makes any sense, and would be fine with just a
> >> ref-in-want-wildmatch-prefix if that's the way to do it. I just think
> >> it's inevitable that we'll have such a thing eventually, so it's worth
> >> thinking about how such a future extension fits in.
> >
> > Yes back when introducing the server-side ref filtering in ls-refs we
> > originally talked about included wildmatch or other forms of pattern
> > matching.  We opted to not over complicate things and favored prefix
> > matching because it didn't bake in some subset of globbing or regex and
> > it was easier to compute on the server side.
> >
> > Anyway back to your question.  Yes if at some point in the future we
> > wanted to add in wildmatch/pcre to the protocol for ls-refs or for
> > ref-in-want then it could be added as a feature or capability.  I don't
> > think it would require adding a whole new verb (it probably would for
> > the ls-refs case since the verb used there is "ref-prefix") but the
> > capability could mean that the "want-ref" verb now understands wildmatch
> > patterns in addition to fully qualified refs.
> 
> Probably still makes sense to have it be a different verb since some
> things in wildmatch / regex are metachars but may be valid in ref names.

Yeah we can leave that up to the designer of such a feature ;) 

> 
> Thanks!

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 0/8] ref-in-want
  2018-06-05 17:51 [PATCH 0/8] ref-in-want Brandon Williams
                   ` (7 preceding siblings ...)
  2018-06-05 17:51 ` [PATCH 8/8] fetch-pack: implement ref-in-want Brandon Williams
@ 2018-06-13 21:39 ` Brandon Williams
  2018-06-13 21:39   ` [PATCH v2 1/8] test-pkt-line: add unpack-sideband subcommand Brandon Williams
                     ` (8 more replies)
  2018-06-15 19:04 ` [PATCH " Jonathan Tan
  9 siblings, 9 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-13 21:39 UTC (permalink / raw)
  To: git; +Cc: avarab, ramsay, Brandon Williams

Changes in v2:
* issuing a want-ref line to a ref which doesn't exist is just ignored.
* fixed some typos 

Brandon Williams (8):
  test-pkt-line: add unpack-sideband subcommand
  upload-pack: implement ref-in-want
  upload-pack: test negotiation with changing repository
  fetch: refactor the population of peer ref OIDs
  fetch: refactor fetch_refs into two functions
  fetch: refactor to make function args narrower
  fetch-pack: put shallow info in output parameter
  fetch-pack: implement ref-in-want

 Documentation/config.txt                |   7 +
 Documentation/technical/protocol-v2.txt |  29 ++-
 builtin/clone.c                         |   4 +-
 builtin/fetch.c                         | 126 +++++++-----
 fetch-object.c                          |   2 +-
 fetch-pack.c                            |  52 +++--
 remote.c                                |   1 +
 remote.h                                |   1 +
 t/helper/test-pkt-line.c                |  37 ++++
 t/lib-httpd.sh                          |   1 +
 t/lib-httpd/apache.conf                 |   8 +
 t/lib-httpd/one-time-sed.sh             |  16 ++
 t/t5703-upload-pack-ref-in-want.sh      | 245 ++++++++++++++++++++++++
 transport-helper.c                      |   6 +-
 transport-internal.h                    |   9 +-
 transport.c                             |  34 +++-
 transport.h                             |   3 +-
 upload-pack.c                           |  64 +++++++
 18 files changed, 568 insertions(+), 77 deletions(-)
 create mode 100644 t/lib-httpd/one-time-sed.sh
 create mode 100755 t/t5703-upload-pack-ref-in-want.sh

-- 
2.18.0.rc1.242.g61856ae69a-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 1/8] test-pkt-line: add unpack-sideband subcommand
  2018-06-13 21:39 ` [PATCH v2 0/8] ref-in-want Brandon Williams
@ 2018-06-13 21:39   ` Brandon Williams
  2018-06-14 18:09     ` Stefan Beller
  2018-06-13 21:39   ` [PATCH v2 2/8] upload-pack: implement ref-in-want Brandon Williams
                     ` (7 subsequent siblings)
  8 siblings, 1 reply; 46+ messages in thread
From: Brandon Williams @ 2018-06-13 21:39 UTC (permalink / raw)
  To: git; +Cc: avarab, ramsay, Brandon Williams

Add an 'unpack-sideband' subcommand to the test-pkt-line helper to
enable unpacking packet line data sent multiplexed using a sideband.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/helper/test-pkt-line.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c
index 0f19e53c7..2a55ffff1 100644
--- a/t/helper/test-pkt-line.c
+++ b/t/helper/test-pkt-line.c
@@ -1,3 +1,4 @@
+#include "cache.h"
 #include "pkt-line.h"
 
 static void pack_line(const char *line)
@@ -48,6 +49,40 @@ static void unpack(void)
 	}
 }
 
+static void unpack_sideband(void)
+{
+	struct packet_reader reader;
+	packet_reader_init(&reader, 0, NULL, 0,
+			   PACKET_READ_GENTLE_ON_EOF |
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (packet_reader_read(&reader) != PACKET_READ_EOF) {
+		int band;
+		int fd;
+
+		switch (reader.status) {
+		case PACKET_READ_EOF:
+			break;
+		case PACKET_READ_NORMAL:
+			band = reader.line[0] & 0xff;
+			if (band == 1)
+				fd = 1;
+			else
+				fd = 2;
+
+			write_or_die(fd, reader.line+1, reader.pktlen-1);
+
+			if (band == 3)
+				die("sind-band error");
+			break;
+		case PACKET_READ_FLUSH:
+			return;
+		case PACKET_READ_DELIM:
+			break;
+		}
+	}
+}
+
 int cmd_main(int argc, const char **argv)
 {
 	if (argc < 2)
@@ -57,6 +92,8 @@ int cmd_main(int argc, const char **argv)
 		pack(argc - 2, argv + 2);
 	else if (!strcmp(argv[1], "unpack"))
 		unpack();
+	else if (!strcmp(argv[1], "unpack-sideband"))
+		unpack_sideband();
 	else
 		die("invalid argument '%s'", argv[1]);
 
-- 
2.18.0.rc1.242.g61856ae69a-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 2/8] upload-pack: implement ref-in-want
  2018-06-13 21:39 ` [PATCH v2 0/8] ref-in-want Brandon Williams
  2018-06-13 21:39   ` [PATCH v2 1/8] test-pkt-line: add unpack-sideband subcommand Brandon Williams
@ 2018-06-13 21:39   ` Brandon Williams
  2018-06-14 18:40     ` Stefan Beller
  2018-06-15 21:08     ` Junio C Hamano
  2018-06-13 21:39   ` [PATCH v2 3/8] upload-pack: test negotiation with changing repository Brandon Williams
                     ` (6 subsequent siblings)
  8 siblings, 2 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-13 21:39 UTC (permalink / raw)
  To: git; +Cc: avarab, ramsay, Brandon Williams

Currently, while performing packfile negotiation, clients are only
allowed to specify their desired objects using object ids.  This causes
a vulnerability to failure when an object turns non-existent during
negotiation, which may happen if, for example, the desired repository is
provided by multiple Git servers in a load-balancing arrangement.

In order to eliminate this vulnerability, implement the ref-in-want
feature for the 'fetch' command in protocol version 2.  This feature
enables the 'fetch' command to support requests in the form of ref names
through a new "want-ref <ref>" parameter.  At the conclusion of
negotiation, the server will send a list of all of the wanted references
(as provided by "want-ref" lines) in addition to the generated packfile.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/config.txt                |   7 ++
 Documentation/technical/protocol-v2.txt |  29 ++++-
 t/t5703-upload-pack-ref-in-want.sh      | 153 ++++++++++++++++++++++++
 upload-pack.c                           |  64 ++++++++++
 4 files changed, 252 insertions(+), 1 deletion(-)
 create mode 100755 t/t5703-upload-pack-ref-in-want.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ab641bf5a..fb1dd7428 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -3479,6 +3479,13 @@ Note that this configuration variable is ignored if it is seen in the
 repository-level config (this is a safety measure against fetching from
 untrusted repositories).
 
+uploadpack.allowRefInWant::
+	If this option is set, `upload-pack` will support the `ref-in-want`
+	feature of the protocol version 2 `fetch` command.  This feature
+	is intended for the benefit of load-balanced servers which may
+	not have the same view of what OIDs their refs point to due to
+	replication delay.
+
 url.<base>.insteadOf::
 	Any URL that starts with this value will be rewritten to
 	start, instead, with <base>. In cases where some site serves a
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 49bda76d2..6020632b4 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -299,12 +299,22 @@ included in the client's request:
 	for use with partial clone and partial fetch operations. See
 	`rev-list` for possible "filter-spec" values.
 
+If the 'ref-in-want' feature is advertised, the following argument can
+be included in the client's request as well as the potential addition of
+the 'wanted-refs' section in the server's response as explained below.
+
+    want-ref <ref>
+	Indicates to the server that the client wants to retrieve a
+	particular ref, where <ref> is the full name of a ref on the
+	server.  A server should ignore any "want-ref <ref>" lines where
+	<ref> doesn't exist on the server.
+
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
 header.
 
     output = *section
-    section = (acknowledgments | shallow-info | packfile)
+    section = (acknowledgments | shallow-info | wanted-refs | packfile)
 	      (flush-pkt | delim-pkt)
 
     acknowledgments = PKT-LINE("acknowledgments" LF)
@@ -319,6 +329,10 @@ header.
     shallow = "shallow" SP obj-id
     unshallow = "unshallow" SP obj-id
 
+    wanted-refs = PKT-LINE("wanted-refs" LF)
+		  *PKT-LINE(wanted-ref LF)
+    wanted-ref = obj-id SP refname
+
     packfile = PKT-LINE("packfile" LF)
 	       *PKT-LINE(%x01-03 *%x00-ff)
 
@@ -379,6 +393,19 @@ header.
 	* This section is only included if a packfile section is also
 	  included in the response.
 
+    wanted-refs section
+	* This section is only included if the client has requested a
+	  ref using a 'want-ref' line and if a packfile section is also
+	  included in the response.
+
+	* Always begins with the section header "wanted-refs"
+
+	* The server will send a ref listing ("<oid> <refname>") for
+	  each reference requested using 'want-ref' lines.
+
+	* The server MUST NOT send any refs which were not requested
+	  using 'want-ref' lines.
+
     packfile section
 	* This section is only included if the client has sent 'want'
 	  lines in its request and either requested that no more
diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh
new file mode 100755
index 000000000..0ef182970
--- /dev/null
+++ b/t/t5703-upload-pack-ref-in-want.sh
@@ -0,0 +1,153 @@
+#!/bin/sh
+
+test_description='upload-pack ref-in-want'
+
+. ./test-lib.sh
+
+get_actual_refs() {
+	sed -n '/wanted-refs/,/0001/p' <out | sed '1d;$d' | test-pkt-line unpack >actual_refs
+}
+
+get_actual_commits() {
+	sed -n '/packfile/,/0000/p' <out | sed '1d' | test-pkt-line unpack-sideband >o.pack &&
+	git index-pack o.pack &&
+	git verify-pack -v o.idx | grep commit | cut -c-40 | sort >actual_commits
+}
+
+check_output() {
+	get_actual_refs &&
+	test_cmp expected_refs actual_refs &&
+	get_actual_commits &&
+	test_cmp expected_commits actual_commits
+}
+
+# c(o/foo) d(o/bar)
+#        \ /
+#         b   e(baz)  f(master)
+#          \__  |  __/
+#             \ | /
+#               a
+test_expect_success 'setup repository' '
+	test_commit a &&
+	git checkout -b o/foo &&
+	test_commit b &&
+	test_commit c &&
+	git checkout -b o/bar b &&
+	test_commit d &&
+	git checkout -b baz a &&
+	test_commit e &&
+	git checkout master &&
+	test_commit f
+'
+
+test_expect_success 'config controls ref-in-want advertisement' '
+	git serve --advertise-capabilities >out &&
+	! grep -a ref-in-want out &&
+
+	git config uploadpack.allowRefInWant false &&
+	git serve --advertise-capabilities >out &&
+	! grep -a ref-in-want out &&
+
+	git config uploadpack.allowRefInWant true &&
+	git serve --advertise-capabilities >out &&
+	grep -a ref-in-want out
+'
+
+test_expect_success 'invalid want-ref line' '
+	test-pkt-line pack >in <<-EOF &&
+	command=fetch
+	0001
+	no-progress
+	want-ref refs/heads/non-existent
+	done
+	0000
+	EOF
+
+	test_must_fail git serve --stateless-rpc 2>out <in &&
+	grep "unknown ref" out
+'
+
+test_expect_success 'basic want-ref' '
+	cat >expected_refs <<-EOF &&
+	$(git rev-parse f) refs/heads/master
+	EOF
+	git rev-parse f | sort >expected_commits &&
+
+	test-pkt-line pack >in <<-EOF &&
+	command=fetch
+	0001
+	no-progress
+	want-ref refs/heads/master
+	have $(git rev-parse a)
+	done
+	0000
+	EOF
+
+	git serve --stateless-rpc >out <in &&
+	check_output
+'
+
+test_expect_success 'multiple want-ref lines' '
+	cat >expected_refs <<-EOF &&
+	$(git rev-parse c) refs/heads/o/foo
+	$(git rev-parse d) refs/heads/o/bar
+	EOF
+	git rev-parse c d | sort >expected_commits &&
+
+	test-pkt-line pack >in <<-EOF &&
+	command=fetch
+	0001
+	no-progress
+	want-ref refs/heads/o/foo
+	want-ref refs/heads/o/bar
+	have $(git rev-parse b)
+	done
+	0000
+	EOF
+
+	git serve --stateless-rpc >out <in &&
+	check_output
+'
+
+test_expect_success 'mix want and want-ref' '
+	cat >expected_refs <<-EOF &&
+	$(git rev-parse f) refs/heads/master
+	EOF
+	git rev-parse e f | sort >expected_commits &&
+
+	test-pkt-line pack >in <<-EOF &&
+	command=fetch
+	0001
+	no-progress
+	want-ref refs/heads/master
+	want $(git rev-parse e)
+	have $(git rev-parse a)
+	done
+	0000
+	EOF
+
+	git serve --stateless-rpc >out <in &&
+	check_output
+'
+
+test_expect_success 'want-ref with ref we already have commit for' '
+	cat >expected_refs <<-EOF &&
+	$(git rev-parse c) refs/heads/o/foo
+	EOF
+	>expected_commits &&
+
+	test-pkt-line pack >in <<-EOF &&
+	command=fetch
+	0001
+	no-progress
+	want-ref refs/heads/o/foo
+	have $(git rev-parse c)
+	done
+	0000
+	EOF
+
+	git serve --stateless-rpc >out <in &&
+	check_output
+'
+
+test_done
diff --git a/upload-pack.c b/upload-pack.c
index 87c6722ea..94b17c038 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -64,6 +64,7 @@ static const char *pack_objects_hook;
 
 static int filter_capability_requested;
 static int allow_filter;
+static int allow_ref_in_want;
 static struct list_objects_filter_options filter_options;
 
 static void reset_timeout(void)
@@ -1075,6 +1076,8 @@ static int upload_pack_config(const char *var, const char *value, void *unused)
 			return git_config_string(&pack_objects_hook, var, value);
 	} else if (!strcmp("uploadpack.allowfilter", var)) {
 		allow_filter = git_config_bool(var, value);
+	} else if (!strcmp("uploadpack.allowrefinwant", var)) {
+		allow_ref_in_want = git_config_bool(var, value);
 	}
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
@@ -1114,6 +1117,7 @@ void upload_pack(struct upload_pack_options *options)
 
 struct upload_pack_data {
 	struct object_array wants;
+	struct string_list wanted_refs;
 	struct oid_array haves;
 
 	struct object_array shallows;
@@ -1135,12 +1139,14 @@ struct upload_pack_data {
 static void upload_pack_data_init(struct upload_pack_data *data)
 {
 	struct object_array wants = OBJECT_ARRAY_INIT;
+	struct string_list wanted_refs = STRING_LIST_INIT_DUP;
 	struct oid_array haves = OID_ARRAY_INIT;
 	struct object_array shallows = OBJECT_ARRAY_INIT;
 	struct string_list deepen_not = STRING_LIST_INIT_DUP;
 
 	memset(data, 0, sizeof(*data));
 	data->wants = wants;
+	data->wanted_refs = wanted_refs;
 	data->haves = haves;
 	data->shallows = shallows;
 	data->deepen_not = deepen_not;
@@ -1149,6 +1155,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 static void upload_pack_data_clear(struct upload_pack_data *data)
 {
 	object_array_clear(&data->wants);
+	string_list_clear(&data->wanted_refs, 1);
 	oid_array_clear(&data->haves);
 	object_array_clear(&data->shallows);
 	string_list_clear(&data->deepen_not, 0);
@@ -1185,6 +1192,32 @@ static int parse_want(const char *line)
 	return 0;
 }
 
+static int parse_want_ref(const char *line, struct string_list *wanted_refs)
+{
+	const char *arg;
+	if (skip_prefix(line, "want-ref ", &arg)) {
+		struct object_id oid;
+		struct string_list_item *item;
+		struct object *o;
+
+		if (read_ref(arg, &oid))
+			return 1;
+
+		item = string_list_append(wanted_refs, arg);
+		item->util = oiddup(&oid);
+
+		o = parse_object_or_die(&oid, arg);
+		if (!(o->flags & WANTED)) {
+			o->flags |= WANTED;
+			add_object_array(o, NULL, &want_obj);
+		}
+
+		return 1;
+	}
+
+	return 0;
+}
+
 static int parse_have(const char *line, struct oid_array *haves)
 {
 	const char *arg;
@@ -1210,6 +1243,8 @@ static void process_args(struct packet_reader *request,
 		/* process want */
 		if (parse_want(arg))
 			continue;
+		if (allow_ref_in_want && parse_want_ref(arg, &data->wanted_refs))
+			continue;
 		/* process have line */
 		if (parse_have(arg, &data->haves))
 			continue;
@@ -1352,6 +1387,24 @@ static int process_haves_and_send_acks(struct upload_pack_data *data)
 	return ret;
 }
 
+static void send_wanted_ref_info(struct upload_pack_data *data)
+{
+	const struct string_list_item *item;
+
+	if (!data->wanted_refs.nr)
+		return;
+
+	packet_write_fmt(1, "wanted-refs\n");
+
+	for_each_string_list_item(item, &data->wanted_refs) {
+		packet_write_fmt(1, "%s %s\n",
+				 oid_to_hex(item->util),
+				 item->string);
+	}
+
+	packet_delim(1);
+}
+
 static void send_shallow_info(struct upload_pack_data *data)
 {
 	/* No shallow info needs to be sent */
@@ -1418,6 +1471,7 @@ int upload_pack_v2(struct repository *r, struct argv_array *keys,
 				state = FETCH_DONE;
 			break;
 		case FETCH_SEND_PACK:
+			send_wanted_ref_info(&data);
 			send_shallow_info(&data);
 
 			packet_write_fmt(1, "packfile\n");
@@ -1438,12 +1492,22 @@ int upload_pack_advertise(struct repository *r,
 {
 	if (value) {
 		int allow_filter_value;
+		int allow_ref_in_want;
+
 		strbuf_addstr(value, "shallow");
+
 		if (!repo_config_get_bool(the_repository,
 					 "uploadpack.allowfilter",
 					 &allow_filter_value) &&
 		    allow_filter_value)
 			strbuf_addstr(value, " filter");
+
+		if (!repo_config_get_bool(the_repository,
+					 "uploadpack.allowrefinwant",
+					 &allow_ref_in_want) &&
+		    allow_ref_in_want)
+			strbuf_addstr(value, " ref-in-want");
 	}
+
 	return 1;
 }
-- 
2.18.0.rc1.242.g61856ae69a-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 3/8] upload-pack: test negotiation with changing repository
  2018-06-13 21:39 ` [PATCH v2 0/8] ref-in-want Brandon Williams
  2018-06-13 21:39   ` [PATCH v2 1/8] test-pkt-line: add unpack-sideband subcommand Brandon Williams
  2018-06-13 21:39   ` [PATCH v2 2/8] upload-pack: implement ref-in-want Brandon Williams
@ 2018-06-13 21:39   ` Brandon Williams
  2018-06-14 19:23     ` Stefan Beller
  2018-06-13 21:39   ` [PATCH v2 4/8] fetch: refactor the population of peer ref OIDs Brandon Williams
                     ` (5 subsequent siblings)
  8 siblings, 1 reply; 46+ messages in thread
From: Brandon Williams @ 2018-06-13 21:39 UTC (permalink / raw)
  To: git; +Cc: avarab, ramsay, Brandon Williams

Add tests to check the behavior of fetching from a repository which
changes between rounds of negotiation (for example, when different
servers in a load-balancing agreement participate in the same stateless
RPC negotiation). This forms a baseline of comparison to the ref-in-want
functionality (which will be introduced to the client in subsequent
commits), and ensures that subsequent commits do not change existing
behavior.

As part of this effort, a mechanism to substitute strings in a single
HTTP response is added.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/lib-httpd.sh                     |  1 +
 t/lib-httpd/apache.conf            |  8 +++
 t/lib-httpd/one-time-sed.sh        | 16 ++++++
 t/t5703-upload-pack-ref-in-want.sh | 92 ++++++++++++++++++++++++++++++
 4 files changed, 117 insertions(+)
 create mode 100644 t/lib-httpd/one-time-sed.sh

diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index 435a37465..84f8efdd4 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -132,6 +132,7 @@ prepare_httpd() {
 	cp "$TEST_PATH"/passwd "$HTTPD_ROOT_PATH"
 	install_script broken-smart-http.sh
 	install_script error.sh
+	install_script one-time-sed.sh
 
 	ln -s "$LIB_HTTPD_MODULE_PATH" "$HTTPD_ROOT_PATH/modules"
 
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 724d9ae46..fe68d37bb 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -111,9 +111,14 @@ Alias /auth/dumb/ www/auth/dumb/
 	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
 	SetEnv GIT_HTTP_EXPORT_ALL
 </LocationMatch>
+<LocationMatch /one_time_sed/>
+	SetEnv GIT_EXEC_PATH ${GIT_EXEC_PATH}
+	SetEnv GIT_HTTP_EXPORT_ALL
+</LocationMatch>
 ScriptAliasMatch /smart_*[^/]*/(.*) ${GIT_EXEC_PATH}/git-http-backend/$1
 ScriptAlias /broken_smart/ broken-smart-http.sh/
 ScriptAlias /error/ error.sh/
+ScriptAliasMatch /one_time_sed/(.*) one-time-sed.sh/$1
 <Directory ${GIT_EXEC_PATH}>
 	Options FollowSymlinks
 </Directory>
@@ -123,6 +128,9 @@ ScriptAlias /error/ error.sh/
 <Files error.sh>
   Options ExecCGI
 </Files>
+<Files one-time-sed.sh>
+	Options ExecCGI
+</Files>
 <Files ${GIT_EXEC_PATH}/git-http-backend>
 	Options ExecCGI
 </Files>
diff --git a/t/lib-httpd/one-time-sed.sh b/t/lib-httpd/one-time-sed.sh
new file mode 100644
index 000000000..a9c4aa5f4
--- /dev/null
+++ b/t/lib-httpd/one-time-sed.sh
@@ -0,0 +1,16 @@
+#!/bin/sh
+
+if [ -e one-time-sed ]; then
+	"$GIT_EXEC_PATH/git-http-backend" >out
+
+	sed "$(cat one-time-sed)" <out >out_modified
+
+	if diff out out_modified >/dev/null; then
+		cat out
+	else
+		cat out_modified
+		rm one-time-sed
+	fi
+else
+	"$GIT_EXEC_PATH/git-http-backend"
+fi
diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh
index 0ef182970..979ab6d03 100755
--- a/t/t5703-upload-pack-ref-in-want.sh
+++ b/t/t5703-upload-pack-ref-in-want.sh
@@ -150,4 +150,96 @@ test_expect_success 'want-ref with ref we already have commit for' '
 	check_output
 '
 
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+REPO="$HTTPD_DOCUMENT_ROOT_PATH/repo"
+LOCAL_PRISTINE="$(pwd)/local_pristine"
+
+test_expect_success 'setup repos for change-while-negotiating test' '
+	(
+		git init "$REPO" &&
+		cd "$REPO" &&
+		>.git/git-daemon-export-ok &&
+		test_commit m1 &&
+		git tag -d m1 &&
+
+		# Local repo with many commits (so that negotiation will take
+		# more than 1 request/response pair)
+		git clone "http://127.0.0.1:$LIB_HTTPD_PORT/smart/repo" "$LOCAL_PRISTINE" &&
+		cd "$LOCAL_PRISTINE" &&
+		git checkout -b side &&
+		for i in $(seq 1 33); do test_commit s$i; done &&
+
+		# Add novel commits to upstream
+		git checkout master &&
+		cd "$REPO" &&
+		test_commit m2 &&
+		test_commit m3 &&
+		git tag -d m2 m3
+	) &&
+	git -C "$LOCAL_PRISTINE" remote set-url origin "http://127.0.0.1:$LIB_HTTPD_PORT/one_time_sed/repo" &&
+	git -C "$LOCAL_PRISTINE" config protocol.version 2
+'
+
+inconsistency() {
+	# Simulate that the server initially reports $2 as the ref
+	# corresponding to $1, and after that, $1 as the ref corresponding to
+	# $1. This corresponds to the real-life situation where the server's
+	# repository appears to change during negotiation, for example, when
+	# different servers in a load-balancing arrangement serve (stateless)
+	# RPCs during a single negotiation.
+	printf "s/%s/%s/" \
+	       $(git -C "$REPO" rev-parse $1 | tr -d "\n") \
+	       $(git -C "$REPO" rev-parse $2 | tr -d "\n") \
+	       >"$HTTPD_ROOT_PATH/one-time-sed"
+}
+
+test_expect_success 'server is initially ahead - no ref in want' '
+	git -C "$REPO" config uploadpack.allowRefInWant false &&
+	rm -rf local &&
+	cp -r "$LOCAL_PRISTINE" local &&
+	inconsistency master 1234567890123456789012345678901234567890 &&
+	test_must_fail git -C local fetch 2>err &&
+	grep "ERR upload-pack: not our ref" err
+'
+
+test_expect_failure 'server is initially ahead - ref in want' '
+	git -C "$REPO" config uploadpack.allowRefInWant true &&
+	rm -rf local &&
+	cp -r "$LOCAL_PRISTINE" local &&
+	inconsistency master 1234567890123456789012345678901234567890 &&
+	git -C local fetch &&
+
+	git -C "$REPO" rev-parse --verify master >expected &&
+	git -C local rev-parse --verify refs/remotes/origin/master >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'server is initially behind - no ref in want' '
+	git -C "$REPO" config uploadpack.allowRefInWant false &&
+	rm -rf local &&
+	cp -r "$LOCAL_PRISTINE" local &&
+	inconsistency master "master^" &&
+	git -C local fetch &&
+
+	git -C "$REPO" rev-parse --verify "master^" >expected &&
+	git -C local rev-parse --verify refs/remotes/origin/master >actual &&
+	test_cmp expected actual
+'
+
+test_expect_failure 'server is initially behind - ref in want' '
+	git -C "$REPO" config uploadpack.allowRefInWant true &&
+	rm -rf local &&
+	cp -r "$LOCAL_PRISTINE" local &&
+	inconsistency master "master^" &&
+	git -C local fetch &&
+
+	git -C "$REPO" rev-parse --verify "master" >expected &&
+	git -C local rev-parse --verify refs/remotes/origin/master >actual &&
+	test_cmp expected actual
+'
+
+stop_httpd
+
 test_done
-- 
2.18.0.rc1.242.g61856ae69a-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 4/8] fetch: refactor the population of peer ref OIDs
  2018-06-13 21:39 ` [PATCH v2 0/8] ref-in-want Brandon Williams
                     ` (2 preceding siblings ...)
  2018-06-13 21:39   ` [PATCH v2 3/8] upload-pack: test negotiation with changing repository Brandon Williams
@ 2018-06-13 21:39   ` Brandon Williams
  2018-06-13 21:39   ` [PATCH v2 5/8] fetch: refactor fetch_refs into two functions Brandon Williams
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-13 21:39 UTC (permalink / raw)
  To: git; +Cc: avarab, ramsay, Brandon Williams

Populate peer ref OIDs in get_ref_map instead of do_fetch. Besides
tightening scopes of variables in the code, this also prepares for
get_ref_map being able to be called multiple times within do_fetch.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ea5b9669a..545635448 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -351,6 +351,7 @@ static struct ref *get_ref_map(struct transport *transport,
 	/* opportunistically-updated references: */
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
+	struct string_list existing_refs = STRING_LIST_INIT_DUP;
 	const struct ref *remote_refs;
 
 	if (rs->nr)
@@ -458,7 +459,23 @@ static struct ref *get_ref_map(struct transport *transport,
 		tail = &rm->next;
 	}
 
-	return ref_remove_duplicates(ref_map);
+	ref_map = ref_remove_duplicates(ref_map);
+
+	for_each_ref(add_existing, &existing_refs);
+	for (rm = ref_map; rm; rm = rm->next) {
+		if (rm->peer_ref) {
+			struct string_list_item *peer_item =
+				string_list_lookup(&existing_refs,
+						   rm->peer_ref->name);
+			if (peer_item) {
+				struct object_id *old_oid = peer_item->util;
+				oidcpy(&rm->peer_ref->old_oid, old_oid);
+			}
+		}
+	}
+	string_list_clear(&existing_refs, 1);
+
+	return ref_map;
 }
 
 #define STORE_REF_ERROR_OTHER 1
@@ -1110,14 +1127,10 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map)
 static int do_fetch(struct transport *transport,
 		    struct refspec *rs)
 {
-	struct string_list existing_refs = STRING_LIST_INIT_DUP;
 	struct ref *ref_map;
-	struct ref *rm;
 	int autotags = (transport->remote->fetch_tags == 1);
 	int retcode = 0;
 
-	for_each_ref(add_existing, &existing_refs);
-
 	if (tags == TAGS_DEFAULT) {
 		if (transport->remote->fetch_tags == 2)
 			tags = TAGS_SET;
@@ -1136,18 +1149,6 @@ static int do_fetch(struct transport *transport,
 	if (!update_head_ok)
 		check_not_current_branch(ref_map);
 
-	for (rm = ref_map; rm; rm = rm->next) {
-		if (rm->peer_ref) {
-			struct string_list_item *peer_item =
-				string_list_lookup(&existing_refs,
-						   rm->peer_ref->name);
-			if (peer_item) {
-				struct object_id *old_oid = peer_item->util;
-				oidcpy(&rm->peer_ref->old_oid, old_oid);
-			}
-		}
-	}
-
 	if (tags == TAGS_DEFAULT && autotags)
 		transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1");
 	if (prune) {
@@ -1183,7 +1184,6 @@ static int do_fetch(struct transport *transport,
 	}
 
  cleanup:
-	string_list_clear(&existing_refs, 1);
 	return retcode;
 }
 
-- 
2.18.0.rc1.242.g61856ae69a-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 5/8] fetch: refactor fetch_refs into two functions
  2018-06-13 21:39 ` [PATCH v2 0/8] ref-in-want Brandon Williams
                     ` (3 preceding siblings ...)
  2018-06-13 21:39   ` [PATCH v2 4/8] fetch: refactor the population of peer ref OIDs Brandon Williams
@ 2018-06-13 21:39   ` Brandon Williams
  2018-06-13 21:39   ` [PATCH v2 6/8] fetch: refactor to make function args narrower Brandon Williams
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-13 21:39 UTC (permalink / raw)
  To: git; +Cc: avarab, ramsay, Brandon Williams

Refactor the fetch_refs function into a function that does the fetching
of refs and another function that stores them.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 545635448..ee8b87c78 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -967,10 +967,16 @@ static int fetch_refs(struct transport *transport, struct ref *ref_map)
 	int ret = quickfetch(ref_map);
 	if (ret)
 		ret = transport_fetch_refs(transport, ref_map);
-	if (!ret)
-		ret |= store_updated_refs(transport->url,
-				transport->remote->name,
-				ref_map);
+	if (ret)
+		transport_unlock_pack(transport);
+	return ret;
+}
+
+static int consume_refs(struct transport *transport, struct ref *ref_map)
+{
+	int ret = store_updated_refs(transport->url,
+				     transport->remote->name,
+				     ref_map);
 	transport_unlock_pack(transport);
 	return ret;
 }
@@ -1116,7 +1122,8 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map)
 	transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL);
 	transport_set_option(transport, TRANS_OPT_DEPTH, "0");
 	transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL);
-	fetch_refs(transport, ref_map);
+	if (!fetch_refs(transport, ref_map))
+		consume_refs(transport, ref_map);
 
 	if (gsecondary) {
 		transport_disconnect(gsecondary);
@@ -1165,7 +1172,7 @@ static int do_fetch(struct transport *transport,
 				   transport->url);
 		}
 	}
-	if (fetch_refs(transport, ref_map)) {
+	if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) {
 		free_refs(ref_map);
 		retcode = 1;
 		goto cleanup;
-- 
2.18.0.rc1.242.g61856ae69a-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 6/8] fetch: refactor to make function args narrower
  2018-06-13 21:39 ` [PATCH v2 0/8] ref-in-want Brandon Williams
                     ` (4 preceding siblings ...)
  2018-06-13 21:39   ` [PATCH v2 5/8] fetch: refactor fetch_refs into two functions Brandon Williams
@ 2018-06-13 21:39   ` Brandon Williams
  2018-06-14 19:32     ` Stefan Beller
  2018-06-13 21:39   ` [PATCH v2 7/8] fetch-pack: put shallow info in output parameter Brandon Williams
                     ` (2 subsequent siblings)
  8 siblings, 1 reply; 46+ messages in thread
From: Brandon Williams @ 2018-06-13 21:39 UTC (permalink / raw)
  To: git; +Cc: avarab, ramsay, Brandon Williams

Refactor find_non_local_tags and get_ref_map to only take the
information they need instead of the entire transport struct. Besides
improving code clarity, this also improves their flexibility, allowing
for a different set of refs to be used instead of relying on the ones
stored in the transport struct.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c | 52 ++++++++++++++++++++++++-------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ee8b87c78..b600e1f10 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -254,9 +254,9 @@ static int will_fetch(struct ref **head, const unsigned char *sha1)
 	return 0;
 }
 
-static void find_non_local_tags(struct transport *transport,
-			struct ref **head,
-			struct ref ***tail)
+static void find_non_local_tags(const struct ref *refs,
+				struct ref **head,
+				struct ref ***tail)
 {
 	struct string_list existing_refs = STRING_LIST_INIT_DUP;
 	struct string_list remote_refs = STRING_LIST_INIT_NODUP;
@@ -264,7 +264,7 @@ static void find_non_local_tags(struct transport *transport,
 	struct string_list_item *item = NULL;
 
 	for_each_ref(add_existing, &existing_refs);
-	for (ref = transport_get_remote_refs(transport, NULL); ref; ref = ref->next) {
+	for (ref = refs; ref; ref = ref->next) {
 		if (!starts_with(ref->name, "refs/tags/"))
 			continue;
 
@@ -338,7 +338,8 @@ static void find_non_local_tags(struct transport *transport,
 	string_list_clear(&remote_refs, 0);
 }
 
-static struct ref *get_ref_map(struct transport *transport,
+static struct ref *get_ref_map(struct remote *remote,
+			       const struct ref *remote_refs,
 			       struct refspec *rs,
 			       int tags, int *autotags)
 {
@@ -346,27 +347,11 @@ static struct ref *get_ref_map(struct transport *transport,
 	struct ref *rm;
 	struct ref *ref_map = NULL;
 	struct ref **tail = &ref_map;
-	struct argv_array ref_prefixes = ARGV_ARRAY_INIT;
 
 	/* opportunistically-updated references: */
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
 	struct string_list existing_refs = STRING_LIST_INIT_DUP;
-	const struct ref *remote_refs;
-
-	if (rs->nr)
-		refspec_ref_prefixes(rs, &ref_prefixes);
-	else if (transport->remote && transport->remote->fetch.nr)
-		refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
-
-	if (ref_prefixes.argc &&
-	    (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) {
-		argv_array_push(&ref_prefixes, "refs/tags/");
-	}
-
-	remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
-
-	argv_array_clear(&ref_prefixes);
 
 	if (rs->nr) {
 		struct refspec *fetch_refspec;
@@ -403,7 +388,7 @@ static struct ref *get_ref_map(struct transport *transport,
 		if (refmap.nr)
 			fetch_refspec = &refmap;
 		else
-			fetch_refspec = &transport->remote->fetch;
+			fetch_refspec = &remote->fetch;
 
 		for (i = 0; i < fetch_refspec->nr; i++)
 			get_fetch_map(ref_map, &fetch_refspec->items[i], &oref_tail, 1);
@@ -411,7 +396,6 @@ static struct ref *get_ref_map(struct transport *transport,
 		die("--refmap option is only meaningful with command-line refspec(s).");
 	} else {
 		/* Use the defaults */
-		struct remote *remote = transport->remote;
 		struct branch *branch = branch_get(NULL);
 		int has_merge = branch_has_merge_config(branch);
 		if (remote &&
@@ -450,7 +434,7 @@ static struct ref *get_ref_map(struct transport *transport,
 		/* also fetch all tags */
 		get_fetch_map(remote_refs, tag_refspec, &tail, 0);
 	else if (tags == TAGS_DEFAULT && *autotags)
-		find_non_local_tags(transport, &ref_map, &tail);
+		find_non_local_tags(remote_refs, &ref_map, &tail);
 
 	/* Now append any refs to be updated opportunistically: */
 	*tail = orefs;
@@ -1137,6 +1121,8 @@ static int do_fetch(struct transport *transport,
 	struct ref *ref_map;
 	int autotags = (transport->remote->fetch_tags == 1);
 	int retcode = 0;
+	const struct ref *remote_refs;
+	struct argv_array ref_prefixes = ARGV_ARRAY_INIT;
 
 	if (tags == TAGS_DEFAULT) {
 		if (transport->remote->fetch_tags == 2)
@@ -1152,7 +1138,21 @@ static int do_fetch(struct transport *transport,
 			goto cleanup;
 	}
 
-	ref_map = get_ref_map(transport, rs, tags, &autotags);
+	if (rs->nr)
+		refspec_ref_prefixes(rs, &ref_prefixes);
+	else if (transport->remote && transport->remote->fetch.nr)
+		refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
+
+	if (ref_prefixes.argc &&
+	    (tags == TAGS_SET || (tags == TAGS_DEFAULT && !rs->nr))) {
+		argv_array_push(&ref_prefixes, "refs/tags/");
+	}
+
+	remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
+	argv_array_clear(&ref_prefixes);
+
+	ref_map = get_ref_map(transport->remote, remote_refs, rs,
+			      tags, &autotags);
 	if (!update_head_ok)
 		check_not_current_branch(ref_map);
 
@@ -1184,7 +1184,7 @@ static int do_fetch(struct transport *transport,
 	if (tags == TAGS_DEFAULT && autotags) {
 		struct ref **tail = &ref_map;
 		ref_map = NULL;
-		find_non_local_tags(transport, &ref_map, &tail);
+		find_non_local_tags(remote_refs, &ref_map, &tail);
 		if (ref_map)
 			backfill_tags(transport, ref_map);
 		free_refs(ref_map);
-- 
2.18.0.rc1.242.g61856ae69a-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 7/8] fetch-pack: put shallow info in output parameter
  2018-06-13 21:39 ` [PATCH v2 0/8] ref-in-want Brandon Williams
                     ` (5 preceding siblings ...)
  2018-06-13 21:39   ` [PATCH v2 6/8] fetch: refactor to make function args narrower Brandon Williams
@ 2018-06-13 21:39   ` Brandon Williams
  2018-06-14 19:42     ` Stefan Beller
  2018-06-14 23:59     ` Jonathan Tan
  2018-06-13 21:39   ` [PATCH v2 8/8] fetch-pack: implement ref-in-want Brandon Williams
  2018-06-15 21:20   ` [PATCH v2 0/8] ref-in-want Junio C Hamano
  8 siblings, 2 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-13 21:39 UTC (permalink / raw)
  To: git; +Cc: avarab, ramsay, Brandon Williams

Expand the transport fetch method signature, by adding an output
parameter, to allow transports to return information about the refs they
have fetched.  Then communicate shallow status information through this
mechanism instead of by modifying the input list of refs.

This does require clients to sometimes generate the ref map twice: once
from the list of refs provided by the remote (as is currently done) and
potentially once from the new list of refs that the fetch mechanism
provides.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/clone.c      |  4 ++--
 builtin/fetch.c      | 23 +++++++++++++++++++----
 fetch-object.c       |  2 +-
 fetch-pack.c         | 17 +++++++++--------
 transport-helper.c   |  6 ++++--
 transport-internal.h |  9 ++++++++-
 transport.c          | 34 ++++++++++++++++++++++++++++------
 transport.h          |  3 ++-
 8 files changed, 73 insertions(+), 25 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 99e73dae8..8f86d99c5 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1155,7 +1155,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 			}
 
 		if (!is_local && !complete_refs_before_fetch)
-			transport_fetch_refs(transport, mapped_refs);
+			transport_fetch_refs(transport, mapped_refs, NULL);
 
 		remote_head = find_ref_by_name(refs, "HEAD");
 		remote_head_points_at =
@@ -1197,7 +1197,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (is_local)
 		clone_local(path, git_dir);
 	else if (refs && complete_refs_before_fetch)
-		transport_fetch_refs(transport, mapped_refs);
+		transport_fetch_refs(transport, mapped_refs, NULL);
 
 	update_remote_refs(refs, mapped_refs, remote_head_points_at,
 			   branch_top.buf, reflog_msg.buf, transport,
diff --git a/builtin/fetch.c b/builtin/fetch.c
index b600e1f10..ddf44ba1a 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -946,11 +946,13 @@ static int quickfetch(struct ref *ref_map)
 	return check_connected(iterate_ref_map, &rm, &opt);
 }
 
-static int fetch_refs(struct transport *transport, struct ref *ref_map)
+static int fetch_refs(struct transport *transport, struct ref *ref_map,
+		      struct ref **updated_remote_refs)
 {
 	int ret = quickfetch(ref_map);
 	if (ret)
-		ret = transport_fetch_refs(transport, ref_map);
+		ret = transport_fetch_refs(transport, ref_map,
+					   updated_remote_refs);
 	if (ret)
 		transport_unlock_pack(transport);
 	return ret;
@@ -1106,7 +1108,7 @@ static void backfill_tags(struct transport *transport, struct ref *ref_map)
 	transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, NULL);
 	transport_set_option(transport, TRANS_OPT_DEPTH, "0");
 	transport_set_option(transport, TRANS_OPT_DEEPEN_RELATIVE, NULL);
-	if (!fetch_refs(transport, ref_map))
+	if (!fetch_refs(transport, ref_map, NULL))
 		consume_refs(transport, ref_map);
 
 	if (gsecondary) {
@@ -1122,6 +1124,7 @@ static int do_fetch(struct transport *transport,
 	int autotags = (transport->remote->fetch_tags == 1);
 	int retcode = 0;
 	const struct ref *remote_refs;
+	struct ref *new_remote_refs = NULL;
 	struct argv_array ref_prefixes = ARGV_ARRAY_INIT;
 
 	if (tags == TAGS_DEFAULT) {
@@ -1172,7 +1175,19 @@ static int do_fetch(struct transport *transport,
 				   transport->url);
 		}
 	}
-	if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) {
+
+	if (fetch_refs(transport, ref_map, &new_remote_refs)) {
+		free_refs(ref_map);
+		retcode = 1;
+		goto cleanup;
+	}
+	if (new_remote_refs) {
+		free_refs(ref_map);
+		ref_map = get_ref_map(transport->remote, new_remote_refs, rs,
+				      tags, &autotags);
+		free_refs(new_remote_refs);
+	}
+	if (consume_refs(transport, ref_map)) {
 		free_refs(ref_map);
 		retcode = 1;
 		goto cleanup;
diff --git a/fetch-object.c b/fetch-object.c
index 853624f81..48fe63dd6 100644
--- a/fetch-object.c
+++ b/fetch-object.c
@@ -19,7 +19,7 @@ static void fetch_refs(const char *remote_name, struct ref *ref)
 
 	transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
 	transport_set_option(transport, TRANS_OPT_NO_DEPENDENTS, "1");
-	transport_fetch_refs(transport, ref);
+	transport_fetch_refs(transport, ref, NULL);
 	fetch_if_missing = original_fetch_if_missing;
 }
 
diff --git a/fetch-pack.c b/fetch-pack.c
index a320ce987..7799ee2cd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1470,12 +1470,13 @@ static int remove_duplicates_in_refs(struct ref **ref, int nr)
 }
 
 static void update_shallow(struct fetch_pack_args *args,
-			   struct ref **sought, int nr_sought,
+			   struct ref *refs,
 			   struct shallow_info *si)
 {
 	struct oid_array ref = OID_ARRAY_INIT;
 	int *status;
-	int i;
+	int i = 0;
+	struct ref *r;
 
 	if (args->deepen && alternate_shallow_file) {
 		if (*alternate_shallow_file == '\0') { /* --unshallow */
@@ -1517,8 +1518,8 @@ static void update_shallow(struct fetch_pack_args *args,
 	remove_nonexistent_theirs_shallow(si);
 	if (!si->nr_ours && !si->nr_theirs)
 		return;
-	for (i = 0; i < nr_sought; i++)
-		oid_array_append(&ref, &sought[i]->old_oid);
+	for (r = refs; r; r = r->next)
+		oid_array_append(&ref, &r->old_oid);
 	si->ref = &ref;
 
 	if (args->update_shallow) {
@@ -1552,12 +1553,12 @@ static void update_shallow(struct fetch_pack_args *args,
 	 * remote is also shallow, check what ref is safe to update
 	 * without updating .git/shallow
 	 */
-	status = xcalloc(nr_sought, sizeof(*status));
+	status = xcalloc(ref.nr, sizeof(*status));
 	assign_shallow_commits_to_refs(si, NULL, status);
 	if (si->nr_ours || si->nr_theirs) {
-		for (i = 0; i < nr_sought; i++)
+		for (r = refs; r; r = r->next, i++)
 			if (status[i])
-				sought[i]->status = REF_STATUS_REJECT_SHALLOW;
+				r->status = REF_STATUS_REJECT_SHALLOW;
 	}
 	free(status);
 	oid_array_clear(&ref);
@@ -1591,7 +1592,7 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
 					&si, pack_lockfile);
 	reprepare_packed_git(the_repository);
-	update_shallow(args, sought, nr_sought, &si);
+	update_shallow(args, ref_cpy, &si);
 	clear_shallow_info(&si);
 	return ref_cpy;
 }
diff --git a/transport-helper.c b/transport-helper.c
index 1f8ff7e94..8b5abca29 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -651,14 +651,16 @@ static int connect_helper(struct transport *transport, const char *name,
 }
 
 static int fetch(struct transport *transport,
-		 int nr_heads, struct ref **to_fetch)
+		 int nr_heads, struct ref **to_fetch,
+		 struct ref **fetched_refs)
 {
 	struct helper_data *data = transport->data;
 	int i, count;
 
 	if (process_connect(transport, 0)) {
 		do_take_over(transport);
-		return transport->vtable->fetch(transport, nr_heads, to_fetch);
+		return transport->vtable->fetch(transport, nr_heads, to_fetch,
+						fetched_refs);
 	}
 
 	count = 0;
diff --git a/transport-internal.h b/transport-internal.h
index 1cde6258a..eeb6c340e 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -36,11 +36,18 @@ struct transport_vtable {
 	 * Fetch the objects for the given refs. Note that this gets
 	 * an array, and should ignore the list structure.
 	 *
+	 * The transport *may* provide, in fetched_refs, the list of refs that
+	 * it fetched.  If the transport knows anything about the fetched refs
+	 * that the caller does not know (for example, shallow status), it
+	 * should provide that list of refs and include that information in the
+	 * list.
+	 *
 	 * If the transport did not get hashes for refs in
 	 * get_refs_list(), it should set the old_sha1 fields in the
 	 * provided refs now.
 	 **/
-	int (*fetch)(struct transport *transport, int refs_nr, struct ref **refs);
+	int (*fetch)(struct transport *transport, int refs_nr, struct ref **refs,
+		     struct ref **fetched_refs);
 
 	/**
 	 * Push the objects and refs. Send the necessary objects, and
diff --git a/transport.c b/transport.c
index a32da30de..8704c20f1 100644
--- a/transport.c
+++ b/transport.c
@@ -151,7 +151,8 @@ static struct ref *get_refs_from_bundle(struct transport *transport,
 }
 
 static int fetch_refs_from_bundle(struct transport *transport,
-			       int nr_heads, struct ref **to_fetch)
+			       int nr_heads, struct ref **to_fetch,
+			       struct ref **fetched_refs)
 {
 	struct bundle_transport_data *data = transport->data;
 	return unbundle(&data->header, data->fd,
@@ -287,7 +288,8 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 }
 
 static int fetch_refs_via_pack(struct transport *transport,
-			       int nr_heads, struct ref **to_fetch)
+			       int nr_heads, struct ref **to_fetch,
+			       struct ref **fetched_refs)
 {
 	int ret = 0;
 	struct git_transport_data *data = transport->data;
@@ -354,8 +356,12 @@ static int fetch_refs_via_pack(struct transport *transport,
 	if (report_unmatched_refs(to_fetch, nr_heads))
 		ret = -1;
 
+	if (fetched_refs)
+		*fetched_refs = refs;
+	else
+		free_refs(refs);
+
 	free_refs(refs_tmp);
-	free_refs(refs);
 	free(dest);
 	return ret;
 }
@@ -1215,19 +1221,31 @@ const struct ref *transport_get_remote_refs(struct transport *transport,
 	return transport->remote_refs;
 }
 
-int transport_fetch_refs(struct transport *transport, struct ref *refs)
+int transport_fetch_refs(struct transport *transport, struct ref *refs,
+			 struct ref **fetched_refs)
 {
 	int rc;
 	int nr_heads = 0, nr_alloc = 0, nr_refs = 0;
 	struct ref **heads = NULL;
+	struct ref *nop_head = NULL, **nop_tail = &nop_head;
 	struct ref *rm;
 
 	for (rm = refs; rm; rm = rm->next) {
 		nr_refs++;
 		if (rm->peer_ref &&
 		    !is_null_oid(&rm->old_oid) &&
-		    !oidcmp(&rm->peer_ref->old_oid, &rm->old_oid))
+		    !oidcmp(&rm->peer_ref->old_oid, &rm->old_oid)) {
+			/*
+			 * These need to be reported as fetched, but we don not
+			 * actually need to fetch them.
+			 */
+			if (fetched_refs) {
+				struct ref *nop_ref = copy_ref(rm);
+				*nop_tail = nop_ref;
+				nop_tail = &nop_ref->next;
+			}
 			continue;
+		}
 		ALLOC_GROW(heads, nr_heads + 1, nr_alloc);
 		heads[nr_heads++] = rm;
 	}
@@ -1245,7 +1263,11 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 			heads[nr_heads++] = rm;
 	}
 
-	rc = transport->vtable->fetch(transport, nr_heads, heads);
+	rc = transport->vtable->fetch(transport, nr_heads, heads, fetched_refs);
+	if (fetched_refs && nop_head) {
+		*nop_tail = *fetched_refs;
+		*fetched_refs = nop_head;
+	}
 
 	free(heads);
 	return rc;
diff --git a/transport.h b/transport.h
index 7792b0858..3dff767a8 100644
--- a/transport.h
+++ b/transport.h
@@ -218,7 +218,8 @@ int transport_push(struct transport *connection,
 const struct ref *transport_get_remote_refs(struct transport *transport,
 					    const struct argv_array *ref_prefixes);
 
-int transport_fetch_refs(struct transport *transport, struct ref *refs);
+int transport_fetch_refs(struct transport *transport, struct ref *refs,
+			 struct ref **fetched_refs);
 void transport_unlock_pack(struct transport *transport);
 int transport_disconnect(struct transport *transport);
 char *transport_anonymize_url(const char *url);
-- 
2.18.0.rc1.242.g61856ae69a-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v2 8/8] fetch-pack: implement ref-in-want
  2018-06-13 21:39 ` [PATCH v2 0/8] ref-in-want Brandon Williams
                     ` (6 preceding siblings ...)
  2018-06-13 21:39   ` [PATCH v2 7/8] fetch-pack: put shallow info in output parameter Brandon Williams
@ 2018-06-13 21:39   ` Brandon Williams
  2018-06-14 19:56     ` Stefan Beller
  2018-06-15 21:20   ` [PATCH v2 0/8] ref-in-want Junio C Hamano
  8 siblings, 1 reply; 46+ messages in thread
From: Brandon Williams @ 2018-06-13 21:39 UTC (permalink / raw)
  To: git; +Cc: avarab, ramsay, Brandon Williams

Implement ref-in-want on the client side so that when a server supports
the "ref-in-want" feature, a client will send "want-ref" lines for each
reference the client wants to fetch.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 fetch-pack.c                       | 35 +++++++++++++++++++++++++++---
 remote.c                           |  1 +
 remote.h                           |  1 +
 t/t5703-upload-pack-ref-in-want.sh |  4 ++--
 4 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 7799ee2cd..51e8356ba 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1102,9 +1102,10 @@ static void add_shallow_requests(struct strbuf *req_buf,
 
 static void add_wants(const struct ref *wants, struct strbuf *req_buf)
 {
+	int use_ref_in_want = server_supports_feature("fetch", "ref-in-want", 0);
+
 	for ( ; wants ; wants = wants->next) {
 		const struct object_id *remote = &wants->old_oid;
-		const char *remote_hex;
 		struct object *o;
 
 		/*
@@ -1122,8 +1123,10 @@ static void add_wants(const struct ref *wants, struct strbuf *req_buf)
 			continue;
 		}
 
-		remote_hex = oid_to_hex(remote);
-		packet_buf_write(req_buf, "want %s\n", remote_hex);
+		if (!use_ref_in_want || wants->exact_sha1)
+			packet_buf_write(req_buf, "want %s\n", oid_to_hex(remote));
+		else
+			packet_buf_write(req_buf, "want-ref %s\n", wants->name);
 	}
 }
 
@@ -1334,6 +1337,29 @@ static void receive_shallow_info(struct fetch_pack_args *args,
 	args->deepen = 1;
 }
 
+static void receive_wanted_refs(struct packet_reader *reader, struct ref *refs)
+{
+	process_section_header(reader, "wanted-refs", 0);
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		struct object_id oid;
+		const char *end;
+		struct ref *r = NULL;
+
+		if (parse_oid_hex(reader->line, &oid, &end) || *end++ != ' ')
+			die("expected wanted-ref, got '%s'", reader->line);
+
+		for (r = refs; r; r = r->next) {
+			if (!strcmp(end, r->name)) {
+				oidcpy(&r->old_oid, &oid);
+				break;
+			}
+		}
+	}
+
+	if (reader->status != PACKET_READ_DELIM)
+		die("error processing wanted refs: %d", reader->status);
+}
+
 enum fetch_state {
 	FETCH_CHECK_LOCAL = 0,
 	FETCH_SEND_REQUEST,
@@ -1408,6 +1434,9 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 			if (process_section_header(&reader, "shallow-info", 1))
 				receive_shallow_info(args, &reader);
 
+			if (process_section_header(&reader, "wanted-refs", 1))
+				receive_wanted_refs(&reader, ref);
+
 			/* get the pack */
 			process_section_header(&reader, "packfile", 0);
 			if (get_pack(args, fd, pack_lockfile))
diff --git a/remote.c b/remote.c
index abe80c139..c9d452ac0 100644
--- a/remote.c
+++ b/remote.c
@@ -1735,6 +1735,7 @@ int get_fetch_map(const struct ref *remote_refs,
 		if (refspec->exact_sha1) {
 			ref_map = alloc_ref(name);
 			get_oid_hex(name, &ref_map->old_oid);
+			ref_map->exact_sha1 = 1;
 		} else {
 			ref_map = get_remote_ref(remote_refs, name);
 		}
diff --git a/remote.h b/remote.h
index 45ecc6cef..e5338e368 100644
--- a/remote.h
+++ b/remote.h
@@ -73,6 +73,7 @@ struct ref {
 		force:1,
 		forced_update:1,
 		expect_old_sha1:1,
+		exact_sha1:1,
 		deletion:1;
 
 	enum {
diff --git a/t/t5703-upload-pack-ref-in-want.sh b/t/t5703-upload-pack-ref-in-want.sh
index 979ab6d03..b94a51380 100755
--- a/t/t5703-upload-pack-ref-in-want.sh
+++ b/t/t5703-upload-pack-ref-in-want.sh
@@ -204,7 +204,7 @@ test_expect_success 'server is initially ahead - no ref in want' '
 	grep "ERR upload-pack: not our ref" err
 '
 
-test_expect_failure 'server is initially ahead - ref in want' '
+test_expect_success 'server is initially ahead - ref in want' '
 	git -C "$REPO" config uploadpack.allowRefInWant true &&
 	rm -rf local &&
 	cp -r "$LOCAL_PRISTINE" local &&
@@ -228,7 +228,7 @@ test_expect_success 'server is initially behind - no ref in want' '
 	test_cmp expected actual
 '
 
-test_expect_failure 'server is initially behind - ref in want' '
+test_expect_success 'server is initially behind - ref in want' '
 	git -C "$REPO" config uploadpack.allowRefInWant true &&
 	rm -rf local &&
 	cp -r "$LOCAL_PRISTINE" local &&
-- 
2.18.0.rc1.242.g61856ae69a-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 1/8] test-pkt-line: add unpack-sideband subcommand
  2018-06-13 21:39   ` [PATCH v2 1/8] test-pkt-line: add unpack-sideband subcommand Brandon Williams
@ 2018-06-14 18:09     ` Stefan Beller
  2018-06-14 19:21       ` Brandon Williams
  0 siblings, 1 reply; 46+ messages in thread
From: Stefan Beller @ 2018-06-14 18:09 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Ævar Arnfjörð Bjarmason, Ramsay Jones

On Wed, Jun 13, 2018 at 2:39 PM Brandon Williams <bmwill@google.com> wrote:
>
> Add an 'unpack-sideband' subcommand to the test-pkt-line helper to
> enable unpacking packet line data sent multiplexed using a sideband.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  t/helper/test-pkt-line.c | 37 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
>
> diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c
> index 0f19e53c7..2a55ffff1 100644
> --- a/t/helper/test-pkt-line.c
> +++ b/t/helper/test-pkt-line.c
> @@ -1,3 +1,4 @@
> +#include "cache.h"
>  #include "pkt-line.h"
>
>  static void pack_line(const char *line)
> @@ -48,6 +49,40 @@ static void unpack(void)
>         }
>  }
>
> +static void unpack_sideband(void)
> +{
> +       struct packet_reader reader;
> +       packet_reader_init(&reader, 0, NULL, 0,
> +                          PACKET_READ_GENTLE_ON_EOF |
> +                          PACKET_READ_CHOMP_NEWLINE);
> +
> +       while (packet_reader_read(&reader) != PACKET_READ_EOF) {
> +               int band;
> +               int fd;
> +
> +               switch (reader.status) {
> +               case PACKET_READ_EOF:
> +                       break;
> +               case PACKET_READ_NORMAL:
> +                       band = reader.line[0] & 0xff;
> +                       if (band == 1)
> +                               fd = 1;
> +                       else
> +                               fd = 2;
> +
> +                       write_or_die(fd, reader.line+1, reader.pktlen-1);

white space around + and - ?

> +
> +                       if (band == 3)
> +                               die("sind-band error");

s/sind/side/ ?

What values for band are possible?
e.g. band==4 would also just write to fd=1;
but I suspect we don't want that, yet.

So maybe

    band = reader.line[0] & 0xff;
    if (band < 1 || band > 2)
        die("unexpected side band %d", band)
    fd = band;

instead?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 2/8] upload-pack: implement ref-in-want
  2018-06-13 21:39   ` [PATCH v2 2/8] upload-pack: implement ref-in-want Brandon Williams
@ 2018-06-14 18:40     ` Stefan Beller
  2018-06-14 18:52       ` Brandon Williams
  2018-06-15 21:08     ` Junio C Hamano
  1 sibling, 1 reply; 46+ messages in thread
From: Stefan Beller @ 2018-06-14 18:40 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Ævar Arnfjörð Bjarmason, Ramsay Jones

Hi Brandon,
On Wed, Jun 13, 2018 at 2:39 PM Brandon Williams <bmwill@google.com> wrote:
>
> Currently, while performing packfile negotiation, clients are only
> allowed to specify their desired objects using object ids.  This causes
> a vulnerability to failure when an object turns non-existent during

I stopped reading when stumbling upon 'vulnerability to failure' a I
found it hard to read. A quick search turns out this is insider slang
of civil engineers. :)

> negotiation, which may happen if, for example, the desired repository is
> provided by multiple Git servers in a load-balancing arrangement.

... and the repository is not replicated evenly to all servers, yet.

> In order to eliminate this vulnerability, implement the ref-in-want
> feature for the 'fetch' command in protocol version 2.  This feature
> enables the 'fetch' command to support requests in the form of ref names
> through a new "want-ref <ref>" parameter.  At the conclusion of
> negotiation, the server will send a list of all of the wanted references
> (as provided by "want-ref" lines) in addition to the generated packfile.

This paragraph makes it sound as if it can be combined technically,
i.e.

client:
    want 01234...
    want-ref master

.. usual back and forth + pack..

server:

  wanted-ref: master 2345..

What happens if the client "wants" a sha1 that is advertised,
but happens to be the same as a wanted-ref?

> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  Documentation/config.txt                |   7 ++
>  Documentation/technical/protocol-v2.txt |  29 ++++-
>  t/t5703-upload-pack-ref-in-want.sh      | 153 ++++++++++++++++++++++++
>  upload-pack.c                           |  64 ++++++++++
>  4 files changed, 252 insertions(+), 1 deletion(-)
>  create mode 100755 t/t5703-upload-pack-ref-in-want.sh
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index ab641bf5a..fb1dd7428 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -3479,6 +3479,13 @@ Note that this configuration variable is ignored if it is seen in the
>  repository-level config (this is a safety measure against fetching from
>  untrusted repositories).
>
> +uploadpack.allowRefInWant::
> +       If this option is set, `upload-pack` will support the `ref-in-want`
> +       feature of the protocol version 2 `fetch` command.  This feature
> +       is intended for the benefit of load-balanced servers which may
> +       not have the same view of what OIDs their refs point to due to
> +       replication delay.

Instead of saying who benefits, can we also say what the feature is about?
Didn't someone mention on the first round of this series, that technically
ref-in-want also provides smaller net work load as refs usually are shorter
than oids (specifically as oids will grow in the hash transisition plan later)?
Is that worth mentioning?

When using this feature is a ref advertisement still needed?

> +
>  url.<base>.insteadOf::
>         Any URL that starts with this value will be rewritten to
>         start, instead, with <base>. In cases where some site serves a
> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index 49bda76d2..6020632b4 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -299,12 +299,22 @@ included in the client's request:
>         for use with partial clone and partial fetch operations. See
>         `rev-list` for possible "filter-spec" values.
>
> +If the 'ref-in-want' feature is advertised, the following argument can
> +be included in the client's request as well as the potential addition of
> +the 'wanted-refs' section in the server's response as explained below.
> +
> +    want-ref <ref>
> +       Indicates to the server that the client wants to retrieve a
> +       particular ref, where <ref> is the full name of a ref on the
> +       server.  A server should ignore any "want-ref <ref>" lines where
> +       <ref> doesn't exist on the server.

Are patterns allowed?, e.g. I might want refs/tags/* at all times.

> @@ -319,6 +329,10 @@ header.
>      shallow = "shallow" SP obj-id
>      unshallow = "unshallow" SP obj-id
>
> +    wanted-refs = PKT-LINE("wanted-refs" LF)
> +                 *PKT-LINE(wanted-ref LF)
> +    wanted-ref = obj-id SP refname
> +
>      packfile = PKT-LINE("packfile" LF)
>                *PKT-LINE(%x01-03 *%x00-ff)
>
> @@ -379,6 +393,19 @@ header.
>         * This section is only included if a packfile section is also
>           included in the response.
>
> +    wanted-refs section
> +       * This section is only included if the client has requested a
> +         ref using a 'want-ref' line and if a packfile section is also
> +         included in the response.

Is it possible to fetch non-fast-forwarded refs this way? Or specifcially
refs that were reset to an older point in history such that no pack file
is needed to transfer; would we transfer an empty pack and then
the wanted-refs section for that use case?


> +
> +# c(o/foo) d(o/bar)
> +#        \ /
> +#         b   e(baz)  f(master)
> +#          \__  |  __/
> +#             \ | /
> +#               a

time is up in this diagram, most diagrams I looked at in tests
are sideways. Should be fine either way.

> +test_expect_success 'invalid want-ref line' '
> +       test-pkt-line pack >in <<-EOF &&
> +       command=fetch
> +       0001
> +       no-progress
> +       want-ref refs/heads/non-existent
> +       done
> +       0000
> +       EOF
> +
> +       test_must_fail git serve --stateless-rpc 2>out <in &&
> +       grep "unknown ref" out

The docs disagree with the test?
     A server should ignore any "want-ref <ref>" lines where
    <ref> doesn't exist on the server.


> +
> +test_expect_success 'mix want and want-ref' '

cool!

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 2/8] upload-pack: implement ref-in-want
  2018-06-14 18:40     ` Stefan Beller
@ 2018-06-14 18:52       ` Brandon Williams
  0 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-14 18:52 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Ævar Arnfjörð Bjarmason, Ramsay Jones

On 06/14, Stefan Beller wrote:
> Hi Brandon,
> On Wed, Jun 13, 2018 at 2:39 PM Brandon Williams <bmwill@google.com> wrote:
> > negotiation, which may happen if, for example, the desired repository is
> > provided by multiple Git servers in a load-balancing arrangement.
> 
> ... and the repository is not replicated evenly to all servers, yet.

I'll update the commit msg to also include this.

> 
> > In order to eliminate this vulnerability, implement the ref-in-want
> > feature for the 'fetch' command in protocol version 2.  This feature
> > enables the 'fetch' command to support requests in the form of ref names
> > through a new "want-ref <ref>" parameter.  At the conclusion of
> > negotiation, the server will send a list of all of the wanted references
> > (as provided by "want-ref" lines) in addition to the generated packfile.
> 
> This paragraph makes it sound as if it can be combined technically,
> i.e.
> 
> client:
>     want 01234...
>     want-ref master
> 
> .. usual back and forth + pack..
> 
> server:
> 
>   wanted-ref: master 2345..
> 
> What happens if the client "wants" a sha1 that is advertised,
> but happens to be the same as a wanted-ref?

This would be fine, same as sending a want line with the same sha1 lots
of times.  Though there would still be a wanted-ref section from the
server for the wanted-ref.

> 
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  Documentation/config.txt                |   7 ++
> >  Documentation/technical/protocol-v2.txt |  29 ++++-
> >  t/t5703-upload-pack-ref-in-want.sh      | 153 ++++++++++++++++++++++++
> >  upload-pack.c                           |  64 ++++++++++
> >  4 files changed, 252 insertions(+), 1 deletion(-)
> >  create mode 100755 t/t5703-upload-pack-ref-in-want.sh
> >
> > diff --git a/Documentation/config.txt b/Documentation/config.txt
> > index ab641bf5a..fb1dd7428 100644
> > --- a/Documentation/config.txt
> > +++ b/Documentation/config.txt
> > @@ -3479,6 +3479,13 @@ Note that this configuration variable is ignored if it is seen in the
> >  repository-level config (this is a safety measure against fetching from
> >  untrusted repositories).
> >
> > +uploadpack.allowRefInWant::
> > +       If this option is set, `upload-pack` will support the `ref-in-want`
> > +       feature of the protocol version 2 `fetch` command.  This feature
> > +       is intended for the benefit of load-balanced servers which may
> > +       not have the same view of what OIDs their refs point to due to
> > +       replication delay.
> 
> Instead of saying who benefits, can we also say what the feature is about?
> Didn't someone mention on the first round of this series, that technically
> ref-in-want also provides smaller net work load as refs usually are shorter
> than oids (specifically as oids will grow in the hash transisition plan later)?
> Is that worth mentioning?

Well I basically just took this from what a previous reviewer thought it
should say.  I think what you have listed here isn't really a big
benefit of using ref-in-want, its the issue with load-balanced servers
that this is trying to solve.

> 
> When using this feature is a ref advertisement still needed?

Maybe in the future no, but as of right now the code is structured to
still request a ref advertisement.

> 
> > +
> >  url.<base>.insteadOf::
> >         Any URL that starts with this value will be rewritten to
> >         start, instead, with <base>. In cases where some site serves a
> > diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> > index 49bda76d2..6020632b4 100644
> > --- a/Documentation/technical/protocol-v2.txt
> > +++ b/Documentation/technical/protocol-v2.txt
> > @@ -299,12 +299,22 @@ included in the client's request:
> >         for use with partial clone and partial fetch operations. See
> >         `rev-list` for possible "filter-spec" values.
> >
> > +If the 'ref-in-want' feature is advertised, the following argument can
> > +be included in the client's request as well as the potential addition of
> > +the 'wanted-refs' section in the server's response as explained below.
> > +
> > +    want-ref <ref>
> > +       Indicates to the server that the client wants to retrieve a
> > +       particular ref, where <ref> is the full name of a ref on the
> > +       server.  A server should ignore any "want-ref <ref>" lines where
> > +       <ref> doesn't exist on the server.
> 
> Are patterns allowed?, e.g. I might want refs/tags/* at all times.

Nope, "Where <ref> is the full name of a ref".  We can maybe allow this
at a later point in time.

> 
> > @@ -319,6 +329,10 @@ header.
> >      shallow = "shallow" SP obj-id
> >      unshallow = "unshallow" SP obj-id
> >
> > +    wanted-refs = PKT-LINE("wanted-refs" LF)
> > +                 *PKT-LINE(wanted-ref LF)
> > +    wanted-ref = obj-id SP refname
> > +
> >      packfile = PKT-LINE("packfile" LF)
> >                *PKT-LINE(%x01-03 *%x00-ff)
> >
> > @@ -379,6 +393,19 @@ header.
> >         * This section is only included if a packfile section is also
> >           included in the response.
> >
> > +    wanted-refs section
> > +       * This section is only included if the client has requested a
> > +         ref using a 'want-ref' line and if a packfile section is also
> > +         included in the response.
> 
> Is it possible to fetch non-fast-forwarded refs this way? Or specifcially
> refs that were reset to an older point in history such that no pack file
> is needed to transfer; would we transfer an empty pack and then
> the wanted-refs section for that use case?

Yeah there are cases where an empty packfile would be sent like you've
described.

> 
> 
> > +
> > +# c(o/foo) d(o/bar)
> > +#        \ /
> > +#         b   e(baz)  f(master)
> > +#          \__  |  __/
> > +#             \ | /
> > +#               a
> 
> time is up in this diagram, most diagrams I looked at in tests
> are sideways. Should be fine either way.
> 
> > +test_expect_success 'invalid want-ref line' '
> > +       test-pkt-line pack >in <<-EOF &&
> > +       command=fetch
> > +       0001
> > +       no-progress
> > +       want-ref refs/heads/non-existent
> > +       done
> > +       0000
> > +       EOF
> > +
> > +       test_must_fail git serve --stateless-rpc 2>out <in &&
> > +       grep "unknown ref" out
> 
> The docs disagree with the test?
>      A server should ignore any "want-ref <ref>" lines where
>     <ref> doesn't exist on the server.

I forgot to remove this when i updated the docs.  I'll remove this test
as it fails now :(

> 
> 
> > +
> > +test_expect_success 'mix want and want-ref' '
> 
> cool!

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 1/8] test-pkt-line: add unpack-sideband subcommand
  2018-06-14 18:09     ` Stefan Beller
@ 2018-06-14 19:21       ` Brandon Williams
  0 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-14 19:21 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Ævar Arnfjörð Bjarmason, Ramsay Jones

On 06/14, Stefan Beller wrote:
> On Wed, Jun 13, 2018 at 2:39 PM Brandon Williams <bmwill@google.com> wrote:
> >
> > Add an 'unpack-sideband' subcommand to the test-pkt-line helper to
> > enable unpacking packet line data sent multiplexed using a sideband.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  t/helper/test-pkt-line.c | 37 +++++++++++++++++++++++++++++++++++++
> >  1 file changed, 37 insertions(+)
> >
> > diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c
> > index 0f19e53c7..2a55ffff1 100644
> > --- a/t/helper/test-pkt-line.c
> > +++ b/t/helper/test-pkt-line.c
> > @@ -1,3 +1,4 @@
> > +#include "cache.h"
> >  #include "pkt-line.h"
> >
> >  static void pack_line(const char *line)
> > @@ -48,6 +49,40 @@ static void unpack(void)
> >         }
> >  }
> >
> > +static void unpack_sideband(void)
> > +{
> > +       struct packet_reader reader;
> > +       packet_reader_init(&reader, 0, NULL, 0,
> > +                          PACKET_READ_GENTLE_ON_EOF |
> > +                          PACKET_READ_CHOMP_NEWLINE);
> > +
> > +       while (packet_reader_read(&reader) != PACKET_READ_EOF) {
> > +               int band;
> > +               int fd;
> > +
> > +               switch (reader.status) {
> > +               case PACKET_READ_EOF:
> > +                       break;
> > +               case PACKET_READ_NORMAL:
> > +                       band = reader.line[0] & 0xff;
> > +                       if (band == 1)
> > +                               fd = 1;
> > +                       else
> > +                               fd = 2;
> > +
> > +                       write_or_die(fd, reader.line+1, reader.pktlen-1);
> 
> white space around + and - ?

Will fix.

> 
> > +
> > +                       if (band == 3)
> > +                               die("sind-band error");
> 
> s/sind/side/ ?

Thanks for catching this.

> 
> What values for band are possible?
> e.g. band==4 would also just write to fd=1;
> but I suspect we don't want that, yet.
> 
> So maybe
> 
>     band = reader.line[0] & 0xff;
>     if (band < 1 || band > 2)
>         die("unexpected side band %d", band)
>     fd = band;
> 
> instead?

Yeah that's must cleaner logic.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 3/8] upload-pack: test negotiation with changing repository
  2018-06-13 21:39   ` [PATCH v2 3/8] upload-pack: test negotiation with changing repository Brandon Williams
@ 2018-06-14 19:23     ` Stefan Beller
  0 siblings, 0 replies; 46+ messages in thread
From: Stefan Beller @ 2018-06-14 19:23 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Ævar Arnfjörð Bjarmason, Ramsay Jones

On Wed, Jun 13, 2018 at 2:39 PM Brandon Williams <bmwill@google.com> wrote:
>
> Add tests to check the behavior of fetching from a repository which
> changes between rounds of negotiation (for example, when different
> servers in a load-balancing agreement participate in the same stateless
> RPC negotiation). This forms a baseline of comparison to the ref-in-want
> functionality (which will be introduced to the client in subsequent
> commits), and ensures that subsequent commits do not change existing
> behavior.
>
> As part of this effort, a mechanism to substitute strings in a single
> HTTP response is added.

This patch looks good to me.
Thanks,
Stefan

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 6/8] fetch: refactor to make function args narrower
  2018-06-13 21:39   ` [PATCH v2 6/8] fetch: refactor to make function args narrower Brandon Williams
@ 2018-06-14 19:32     ` Stefan Beller
  0 siblings, 0 replies; 46+ messages in thread
From: Stefan Beller @ 2018-06-14 19:32 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Ævar Arnfjörð Bjarmason, Ramsay Jones

On Wed, Jun 13, 2018 at 2:39 PM Brandon Williams <bmwill@google.com> wrote:
>
> Refactor find_non_local_tags and get_ref_map to only take the
> information they need instead of the entire transport struct. Besides
> improving code clarity, this also improves their flexibility, allowing
> for a different set of refs to be used instead of relying on the ones
> stored in the transport struct.
>

This patch and the two prior refactoring patches are
Reviewed-by: Stefan Beller <sbeller@google.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 7/8] fetch-pack: put shallow info in output parameter
  2018-06-13 21:39   ` [PATCH v2 7/8] fetch-pack: put shallow info in output parameter Brandon Williams
@ 2018-06-14 19:42     ` Stefan Beller
  2018-06-14 23:59     ` Jonathan Tan
  1 sibling, 0 replies; 46+ messages in thread
From: Stefan Beller @ 2018-06-14 19:42 UTC (permalink / raw)
  To: bmwill; +Cc: git, avarab, ramsay

> +                   !oidcmp(&rm->peer_ref->old_oid, &rm->old_oid)) {
> +                       /*
> +                        * These need to be reported as fetched, but we don not

do not or don't; there is no middle way. ;)

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 8/8] fetch-pack: implement ref-in-want
  2018-06-13 21:39   ` [PATCH v2 8/8] fetch-pack: implement ref-in-want Brandon Williams
@ 2018-06-14 19:56     ` Stefan Beller
  2018-06-14 21:18       ` Brandon Williams
  0 siblings, 1 reply; 46+ messages in thread
From: Stefan Beller @ 2018-06-14 19:56 UTC (permalink / raw)
  To: bmwill; +Cc: git, avarab, ramsay

On Wed, Jun 13, 2018 at 2:39 PM Brandon Williams <bmwill@google.com> wrote:

> +static void receive_wanted_refs(struct packet_reader *reader, struct ref *refs)
> +{
...
> +
> +               for (r = refs; r; r = r->next) {
> +                       if (!strcmp(end, r->name)) {
> +                               oidcpy(&r->old_oid, &oid);
> +                               break;
> +                       }
> +               }

The server is documented as MUST NOT send additional refs,
which is fine here, as we'd have no way of storing them anyway.
Do we want to issue a warning, though?

    if (!r) /* never break'd */
        warning ("server send unexpected line '%s'", reader.line);



> diff --git a/remote.c b/remote.c
> index abe80c139..c9d452ac0 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -1735,6 +1735,7 @@ int get_fetch_map(const struct ref *remote_refs,
>                 if (refspec->exact_sha1) {
>                         ref_map = alloc_ref(name);
>                         get_oid_hex(name, &ref_map->old_oid);
> +                       ref_map->exact_sha1 = 1;
>                 } else {
>                         ref_map = get_remote_ref(remote_refs, name);
>                 }
> diff --git a/remote.h b/remote.h
> index 45ecc6cef..e5338e368 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -73,6 +73,7 @@ struct ref {
>                 force:1,
>                 forced_update:1,
>                 expect_old_sha1:1,
> +               exact_sha1:1,

Can we rename that to exact_oid ?
(bonus points for also converting expect_old_sha1)

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 8/8] fetch-pack: implement ref-in-want
  2018-06-14 19:56     ` Stefan Beller
@ 2018-06-14 21:18       ` Brandon Williams
  0 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-14 21:18 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, avarab, ramsay

On 06/14, Stefan Beller wrote:
> On Wed, Jun 13, 2018 at 2:39 PM Brandon Williams <bmwill@google.com> wrote:
> 
> > +static void receive_wanted_refs(struct packet_reader *reader, struct ref *refs)
> > +{
> ...
> > +
> > +               for (r = refs; r; r = r->next) {
> > +                       if (!strcmp(end, r->name)) {
> > +                               oidcpy(&r->old_oid, &oid);
> > +                               break;
> > +                       }
> > +               }
> 
> The server is documented as MUST NOT send additional refs,
> which is fine here, as we'd have no way of storing them anyway.
> Do we want to issue a warning, though?
> 
>     if (!r) /* never break'd */
>         warning ("server send unexpected line '%s'", reader.line);

Depends, does this warning help out the end user or do you think it
would confuse users to see this and still have their fetch succeed?

> 
> 
> 
> > diff --git a/remote.c b/remote.c
> > index abe80c139..c9d452ac0 100644
> > --- a/remote.c
> > +++ b/remote.c
> > @@ -1735,6 +1735,7 @@ int get_fetch_map(const struct ref *remote_refs,
> >                 if (refspec->exact_sha1) {
> >                         ref_map = alloc_ref(name);
> >                         get_oid_hex(name, &ref_map->old_oid);
> > +                       ref_map->exact_sha1 = 1;
> >                 } else {
> >                         ref_map = get_remote_ref(remote_refs, name);
> >                 }
> > diff --git a/remote.h b/remote.h
> > index 45ecc6cef..e5338e368 100644
> > --- a/remote.h
> > +++ b/remote.h
> > @@ -73,6 +73,7 @@ struct ref {
> >                 force:1,
> >                 forced_update:1,
> >                 expect_old_sha1:1,
> > +               exact_sha1:1,
> 
> Can we rename that to exact_oid ?

I'll fix this.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 7/8] fetch-pack: put shallow info in output parameter
  2018-06-13 21:39   ` [PATCH v2 7/8] fetch-pack: put shallow info in output parameter Brandon Williams
  2018-06-14 19:42     ` Stefan Beller
@ 2018-06-14 23:59     ` Jonathan Tan
  2018-06-19 17:41       ` Brandon Williams
  1 sibling, 1 reply; 46+ messages in thread
From: Jonathan Tan @ 2018-06-14 23:59 UTC (permalink / raw)
  To: bmwill; +Cc: git, avarab, ramsay, Jonathan Tan

> @@ -1122,6 +1124,7 @@ static int do_fetch(struct transport *transport,
>  	int autotags = (transport->remote->fetch_tags == 1);
>  	int retcode = 0;
>  	const struct ref *remote_refs;
> +	struct ref *new_remote_refs = NULL;

Above, you use the name "updated_remote_refs" - it's probably better to
standardize on one. I think "updated" is better.

(The transport calling it "fetched_refs" is fine, because that's what
they are from the perspective of the transport. From the perspective of
fetch-pack, it is indeed a new or updated set of remote refs.)

> -	if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) {
> +
> +	if (fetch_refs(transport, ref_map, &new_remote_refs)) {
> +		free_refs(ref_map);
> +		retcode = 1;
> +		goto cleanup;
> +	}
> +	if (new_remote_refs) {
> +		free_refs(ref_map);
> +		ref_map = get_ref_map(transport->remote, new_remote_refs, rs,
> +				      tags, &autotags);
> +		free_refs(new_remote_refs);
> +	}
> +	if (consume_refs(transport, ref_map)) {
>  		free_refs(ref_map);
>  		retcode = 1;
>  		goto cleanup;

Here, if we got updated remote refs, we need to regenerate ref_map,
since it is the source of truth.

Maybe add a comment in the "if (new_remote_refs)" block explaining this
- something like: Regenerate ref_map using the updated remote refs,
because the transport would place shallow (and other) information
there.

> -		for (i = 0; i < nr_sought; i++)
> +		for (r = refs; r; r = r->next, i++)
>  			if (status[i])
> -				sought[i]->status = REF_STATUS_REJECT_SHALLOW;
> +				r->status = REF_STATUS_REJECT_SHALLOW;

You use i here without initializing it to 0. t5703 also fails with this
patch - probably related to this, but I didn't check.

If you initialize i here, I don't think you need to initialize it to 0
at the top of this function.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/8] ref-in-want
  2018-06-05 17:51 [PATCH 0/8] ref-in-want Brandon Williams
                   ` (8 preceding siblings ...)
  2018-06-13 21:39 ` [PATCH v2 0/8] ref-in-want Brandon Williams
@ 2018-06-15 19:04 ` " Jonathan Tan
  2018-06-19 17:32   ` Brandon Williams
  2018-06-19 23:16   ` Brandon Williams
  9 siblings, 2 replies; 46+ messages in thread
From: Jonathan Tan @ 2018-06-15 19:04 UTC (permalink / raw)
  To: bmwill; +Cc: git, Jonathan Tan

(replying to the original since my e-mail is about design)

> This version of ref-in-want is a bit more restrictive than what Jonathan
> originally proposed (only full ref names are allowed instead of globs
> and OIDs), but it is meant to accomplish the same goal (solve the issues
> of refs changing during negotiation).

One question remains: are we planning to expand this feature (e.g. to
support patterns ending in *, or to support any pattern that can appear
on the LHS of a refspec), and if yes, are we OK with having 2 or more
versions of the service in the wild, each having different pattern
support?

Supporting patterns would mean that we would possibly be able to
eliminate the ls-refs step, thus saving at least a RTT. (Originally I
thought that supporting patterns would also allow us to tolerate refs
being removed during the fetch process, but I see that this is already
handled by the server ignoring "want-ref <ref>" wherein <ref> doesn't
exist on the server.)

However, after some in-office discussion, I see that eliminating the
ls-refs step means that we lose some optimizations that can only be done
when we see that we already have a sought remote ref. For example, in a
repo like this:

 A
 |
 O
 |
 O B C
 |/ /
 O O
 |/
 O

in which we have rarely-updated branches that we still want to fetch
(e.g. an annotated tag when we fetch refs/tags/* or a Gerrit
refs/changes/* branch), having the ref advertisement first means that we
can omit them from our "want" or "want-ref" list. But not having them
means that we send "want-ref refs/tags/*" to the server, and during
negotiation inform the server of our master branch (A), and since the
server knows of a common ancestor of all our wants (A, B, C), it will
terminate the negotiation and send the objects specific to branches B
and C even though it didn't need to.

So maybe we still need to keep the ls-refs step around, and thus, this
design of only accepting exact refs is perhaps good enough for now.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 2/8] upload-pack: implement ref-in-want
  2018-06-13 21:39   ` [PATCH v2 2/8] upload-pack: implement ref-in-want Brandon Williams
  2018-06-14 18:40     ` Stefan Beller
@ 2018-06-15 21:08     ` Junio C Hamano
  2018-06-15 21:14       ` Junio C Hamano
  2018-06-19 18:50       ` Brandon Williams
  1 sibling, 2 replies; 46+ messages in thread
From: Junio C Hamano @ 2018-06-15 21:08 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, avarab, ramsay

Brandon Williams <bmwill@google.com> writes:

> Currently, while performing packfile negotiation, clients are only
> allowed to specify their desired objects using object ids.  This causes
> a vulnerability to failure when an object turns non-existent during
> negotiation, which may happen if, for example, the desired repository is
> provided by multiple Git servers in a load-balancing arrangement.

In other words, your "git fetch refs/heads/master" initially
contacts a mirror A and learns that its refs/heads/master is at
commit X, starts negotiation, and end up with getting served by a
mirror B that is slightly beind mirror A, which wants to give you a
packfile that brings you up to say commit X~3.  You want to be able
to say "I want to update my refs/remotes/origin/master with whatever
is (close to) the latest at your refs/heads/master", not "I just
learned refs/heads/master is at X from one of you, I demand getting
updated to that exact commit."

For that, you'd say, after getting ref advertisement, "I want
the tip of refs/heads/master that one of you have, whoever ends up
serving me eventually".

> +    want-ref <ref>
> +	Indicates to the server that the client wants to retrieve a
> +	particular ref, where <ref> is the full name of a ref on the
> +	server.  A server should ignore any "want-ref <ref>" lines where
> +	<ref> doesn't exist on the server.

But when you said "I want some version of refs/heads/bo", is it
sensible for me to say "Oh, among 7 mirrors, you unluckily ended up
with me who is most behind and do not even have 'bo' branch yet, so
I won't be giving you that branch (or history leading to the commit
my 6 friends may have at refs/heads/bo)"?  

What is the end-user visible effect of "ignoring"?  Does your "git
fetch bo" that were unluckily served by me who do not yet have the
branch error out?  If so, preferrably the conversation should faile
before packfile generation and transfer.

I am guessing that you do not want to fail the negotiation if your
"git fetch bo" happen to contact me (who lack 'bo') first, as long
as the conversation continues and switches later to one of my
friends who can fulfill the request, and that is why you are
forbidding me (who got initial contact and saw "want-ref bo") from
failing the whole thing.  But it is unclear who is responsible for
erroring out the whole "git fetch bo" *if* unlucky you ended up
getting served by the most stale mirror that does not even have
'bo'.

The story would be different if your request were 

	git fetch refs/heads/*:refs/remotes/origin/*

in which case, you are not even saying "I want this and that ref";
you are saying "all refs in refs/heads/* whoever ends up serving me
happens to have".  You may initially contact one of my friends and
learn that there are 'master' and 'bo' branches (and probably
others), and after conversation end up talking with me who is stale
and lack 'bo'.  In such a case, I agree that it is not sensible for
me to fail the request as a whole and instead serve you whatever
branches I happen to have.  I may lack 'bo' branch due to mirroring
lag, but I may also have 'extra' branch that others no longer have
due to mirroring lag of deletion of that branch!

But then I think your "git fetch refs/heads/*:refs/remotes/origin/*"
should not fail not just because I do not have 'bo', but you also
should grab other old branches I have, which you didn't hear about
when you made the initial contact with my friend in the mirror pool.

So, given that, would it make sense for 'want-ref <ref>' request to
name "a particular ref" as the above document says?  I have a
feeling that it should allow a pattern to be matched at the server
side (and it is not an error if the pattern did not match anything),
in addition to asking for a particular ref (in which case, lack of
that ref should be a hard failure, at least for the mirror that ends
up serving the packfile and the final "here are the refs your
request ended up fetching, with their values").

>  The response of `fetch` is broken into a number of sections separated by
>  delimiter packets (0001), with each section beginning with its section
>  header.
>  
>      output = *section
> -    section = (acknowledgments | shallow-info | packfile)
> +    section = (acknowledgments | shallow-info | wanted-refs | packfile)
>  	      (flush-pkt | delim-pkt)

OK.  In my initial reading, I somehow failed to read this piece
before going on to the next hunk ...

> @@ -319,6 +329,10 @@ header.
>      shallow = "shallow" SP obj-id
>      unshallow = "unshallow" SP obj-id
>  
> +    wanted-refs = PKT-LINE("wanted-refs" LF)
> +		  *PKT-LINE(wanted-ref LF)
> +    wanted-ref = obj-id SP refname
> +

... and wondered how the reader knows where wanted-ref list ends; we
will see a flush (or is it delim?  either is accepted?) where the
list ends, which is good.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 2/8] upload-pack: implement ref-in-want
  2018-06-15 21:08     ` Junio C Hamano
@ 2018-06-15 21:14       ` Junio C Hamano
  2018-06-19 18:50       ` Brandon Williams
  1 sibling, 0 replies; 46+ messages in thread
From: Junio C Hamano @ 2018-06-15 21:14 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, avarab, ramsay

Junio C Hamano <gitster@pobox.com> writes:

> The story would be different if your request were 
>
> 	git fetch refs/heads/*:refs/remotes/origin/*
>
> in which case, you are not even saying "I want this and that ref";
> you are saying "all refs in refs/heads/* whoever ends up serving me
> happens to have".  You may initially contact one of my friends and
> learn that there are 'master' and 'bo' branches (and probably
> others), and after conversation end up talking with me who is stale
> and lack 'bo'.  In such a case, I agree that it is not sensible for
> me to fail the request as a whole and instead serve you whatever

s/whole and instead/whole.  I should instead/;

> branches I happen to have.  I may lack 'bo' branch due to mirroring
> lag, but I may also have 'extra' branch that others no longer have
> due to mirroring lag of deletion of that branch!


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 0/8] ref-in-want
  2018-06-13 21:39 ` [PATCH v2 0/8] ref-in-want Brandon Williams
                     ` (7 preceding siblings ...)
  2018-06-13 21:39   ` [PATCH v2 8/8] fetch-pack: implement ref-in-want Brandon Williams
@ 2018-06-15 21:20   ` Junio C Hamano
  2018-06-18 18:05     ` Brandon Williams
  8 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2018-06-15 21:20 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, avarab, ramsay

Brandon Williams <bmwill@google.com> writes:

> Changes in v2:
> * issuing a want-ref line to a ref which doesn't exist is just ignored.
> * fixed some typos 

Do I lock some (logical) prerequisite changes?  On 2.18-rc2 they
apply cleanly but the series fails its own test.

t5703-upload-pack-ref-in-want.sh
expecting success:
        test-pkt-line pack >in <<-EOF &&
        command=fetch
        0001
        no-progress
        want-ref refs/heads/non-existent
        done
        0000
        EOF

        test_must_fail git serve --stateless-rpc 2>out <in &&
        grep "unknown ref" out

test_must_fail: command succeeded: git serve --stateless-rpc
not ok 3 - invalid want-ref line

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 0/8] ref-in-want
  2018-06-15 21:20   ` [PATCH v2 0/8] ref-in-want Junio C Hamano
@ 2018-06-18 18:05     ` Brandon Williams
  0 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-18 18:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, avarab, ramsay

On 06/15, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > Changes in v2:
> > * issuing a want-ref line to a ref which doesn't exist is just ignored.
> > * fixed some typos 
> 
> Do I lock some (logical) prerequisite changes?  On 2.18-rc2 they
> apply cleanly but the series fails its own test.

No this is an error I made in this version of the series which another
reviewer pointed out, I have a local v3 which addresses this (by
removing the test since it isn't necessary anymore).  Sorry for the
mistake :)

> 
> t5703-upload-pack-ref-in-want.sh
> expecting success:
>         test-pkt-line pack >in <<-EOF &&
>         command=fetch
>         0001
>         no-progress
>         want-ref refs/heads/non-existent
>         done
>         0000
>         EOF
> 
>         test_must_fail git serve --stateless-rpc 2>out <in &&
>         grep "unknown ref" out
> 
> test_must_fail: command succeeded: git serve --stateless-rpc
> not ok 3 - invalid want-ref line

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/8] ref-in-want
  2018-06-15 19:04 ` [PATCH " Jonathan Tan
@ 2018-06-19 17:32   ` Brandon Williams
  2018-06-19 19:23     ` Jonathan Tan
  2018-06-19 23:16   ` Brandon Williams
  1 sibling, 1 reply; 46+ messages in thread
From: Brandon Williams @ 2018-06-19 17:32 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On 06/15, Jonathan Tan wrote:
> (replying to the original since my e-mail is about design)
> 
> > This version of ref-in-want is a bit more restrictive than what Jonathan
> > originally proposed (only full ref names are allowed instead of globs
> > and OIDs), but it is meant to accomplish the same goal (solve the issues
> > of refs changing during negotiation).
> 
> One question remains: are we planning to expand this feature (e.g. to
> support patterns ending in *, or to support any pattern that can appear
> on the LHS of a refspec), and if yes, are we OK with having 2 or more
> versions of the service in the wild, each having different pattern
> support?
> 
> Supporting patterns would mean that we would possibly be able to
> eliminate the ls-refs step, thus saving at least a RTT. (Originally I
> thought that supporting patterns would also allow us to tolerate refs
> being removed during the fetch process, but I see that this is already
> handled by the server ignoring "want-ref <ref>" wherein <ref> doesn't
> exist on the server.)
> 
> However, after some in-office discussion, I see that eliminating the
> ls-refs step means that we lose some optimizations that can only be done
> when we see that we already have a sought remote ref. For example, in a
> repo like this:
> 
>  A
>  |
>  O
>  |
>  O B C
>  |/ /
>  O O
>  |/
>  O
> 
> in which we have rarely-updated branches that we still want to fetch
> (e.g. an annotated tag when we fetch refs/tags/* or a Gerrit
> refs/changes/* branch), having the ref advertisement first means that we
> can omit them from our "want" or "want-ref" list. But not having them
> means that we send "want-ref refs/tags/*" to the server, and during
> negotiation inform the server of our master branch (A), and since the
> server knows of a common ancestor of all our wants (A, B, C), it will
> terminate the negotiation and send the objects specific to branches B
> and C even though it didn't need to.
> 
> So maybe we still need to keep the ls-refs step around, and thus, this
> design of only accepting exact refs is perhaps good enough for now.

I think that taking a smaller step first it probably better.  This is
something that we've done in the past with the shallow features and
later capabilities were added to add different ways to request shallow
fetches.

That being said, if we find that this feature doesn't work as-is and
needs the extra complexity of patterns from the start then they should
be added.  But it doesn't seem like there's a concrete reason at the
moment.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 7/8] fetch-pack: put shallow info in output parameter
  2018-06-14 23:59     ` Jonathan Tan
@ 2018-06-19 17:41       ` Brandon Williams
  0 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-19 17:41 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, avarab, ramsay

On 06/14, Jonathan Tan wrote:
> > @@ -1122,6 +1124,7 @@ static int do_fetch(struct transport *transport,
> >  	int autotags = (transport->remote->fetch_tags == 1);
> >  	int retcode = 0;
> >  	const struct ref *remote_refs;
> > +	struct ref *new_remote_refs = NULL;
> 
> Above, you use the name "updated_remote_refs" - it's probably better to
> standardize on one. I think "updated" is better.

Good catch I'll update the variable name.

> 
> (The transport calling it "fetched_refs" is fine, because that's what
> they are from the perspective of the transport. From the perspective of
> fetch-pack, it is indeed a new or updated set of remote refs.)
> 
> > -	if (fetch_refs(transport, ref_map) || consume_refs(transport, ref_map)) {
> > +
> > +	if (fetch_refs(transport, ref_map, &new_remote_refs)) {
> > +		free_refs(ref_map);
> > +		retcode = 1;
> > +		goto cleanup;
> > +	}
> > +	if (new_remote_refs) {
> > +		free_refs(ref_map);
> > +		ref_map = get_ref_map(transport->remote, new_remote_refs, rs,
> > +				      tags, &autotags);
> > +		free_refs(new_remote_refs);
> > +	}
> > +	if (consume_refs(transport, ref_map)) {
> >  		free_refs(ref_map);
> >  		retcode = 1;
> >  		goto cleanup;
> 
> Here, if we got updated remote refs, we need to regenerate ref_map,
> since it is the source of truth.
> 
> Maybe add a comment in the "if (new_remote_refs)" block explaining this
> - something like: Regenerate ref_map using the updated remote refs,
> because the transport would place shallow (and other) information
> there.

That's probably a good idea to give future readers more context into why
this is happening.

> 
> > -		for (i = 0; i < nr_sought; i++)
> > +		for (r = refs; r; r = r->next, i++)
> >  			if (status[i])
> > -				sought[i]->status = REF_STATUS_REJECT_SHALLOW;
> > +				r->status = REF_STATUS_REJECT_SHALLOW;
> 
> You use i here without initializing it to 0. t5703 also fails with this
> patch - probably related to this, but I didn't check.

Oh yeah that's definitely a bug, thanks for catching that.

> 
> If you initialize i here, I don't think you need to initialize it to 0
> at the top of this function.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 2/8] upload-pack: implement ref-in-want
  2018-06-15 21:08     ` Junio C Hamano
  2018-06-15 21:14       ` Junio C Hamano
@ 2018-06-19 18:50       ` Brandon Williams
  2018-06-19 20:37         ` Junio C Hamano
  1 sibling, 1 reply; 46+ messages in thread
From: Brandon Williams @ 2018-06-19 18:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, avarab, ramsay

On 06/15, Junio C Hamano wrote:
> The story would be different if your request were 
> 
> 	git fetch refs/heads/*:refs/remotes/origin/*
> 
> in which case, you are not even saying "I want this and that ref";
> you are saying "all refs in refs/heads/* whoever ends up serving me
> happens to have".  You may initially contact one of my friends and
> learn that there are 'master' and 'bo' branches (and probably
> others), and after conversation end up talking with me who is stale
> and lack 'bo'.  In such a case, I agree that it is not sensible for
> me to fail the request as a whole and instead serve you whatever
> branches I happen to have.  I may lack 'bo' branch due to mirroring
> lag, but I may also have 'extra' branch that others no longer have
> due to mirroring lag of deletion of that branch!
> 
> But then I think your "git fetch refs/heads/*:refs/remotes/origin/*"
> should not fail not just because I do not have 'bo', but you also
> should grab other old branches I have, which you didn't hear about
> when you made the initial contact with my friend in the mirror pool.
> 
> So, given that, would it make sense for 'want-ref <ref>' request to
> name "a particular ref" as the above document says?  I have a
> feeling that it should allow a pattern to be matched at the server
> side (and it is not an error if the pattern did not match anything),
> in addition to asking for a particular ref (in which case, lack of
> that ref should be a hard failure, at least for the mirror that ends
> up serving the packfile and the final "here are the refs your
> request ended up fetching, with their values").

After some more in-office discussion about this I think I should revert
back to making it a hard failure when a client requests a ref which
doesn't exist on the server.  This makes things more consistent with
what happens today if I request a non-existent ref (Although that error
is purely on the client).  Its no worse than we have today and even with
this solution not solving the issues of new/deleted refs (which are rare
operations wrt updates) we still can get the benefit of not failing due
to a ref updating.  This is also very valuable for the servers which
have to to ACL checks on individual ref names.

I also think that we should keep this first implementation of
ref-in-want simple and *not* include patterns, even if that's what we
may want someday down the road.  Adding a new capability in the future
for support of such patterns would be relatively simple and easy.  The
reason why I don't think we should add pattern support just yet is due
to a request for "want-ref refs/tags/*" or a like request resulting in a
larger than expected packfile every time "fetch --tags" is run.  The
issue being that in a fetch request "refs/tags/*" is too broad of a
request and could be requesting 100s of tags when all we really wanted
was to get the one or two new tags which are present on the remote
(because we already have all the other tags present locally).

So I think the best way is to limit these patterns to the ls-refs
request where we can then discover the few tags we're missing and then
request just those tags explicitly, keeping the resulting packfile
smaller.

Thoughts?

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/8] ref-in-want
  2018-06-19 17:32   ` Brandon Williams
@ 2018-06-19 19:23     ` Jonathan Tan
  0 siblings, 0 replies; 46+ messages in thread
From: Jonathan Tan @ 2018-06-19 19:23 UTC (permalink / raw)
  To: bmwill; +Cc: jonathantanmy, git

[snip]

> > in which we have rarely-updated branches that we still want to fetch
> > (e.g. an annotated tag when we fetch refs/tags/* or a Gerrit
> > refs/changes/* branch), having the ref advertisement first means that we
> > can omit them from our "want" or "want-ref" list. But not having them
> > means that we send "want-ref refs/tags/*" to the server, and during
> > negotiation inform the server of our master branch (A), and since the
> > server knows of a common ancestor of all our wants (A, B, C), it will
> > terminate the negotiation and send the objects specific to branches B
> > and C even though it didn't need to.
> > 
> > So maybe we still need to keep the ls-refs step around, and thus, this
> > design of only accepting exact refs is perhaps good enough for now.
> 
> I think that taking a smaller step first it probably better.  This is
> something that we've done in the past with the shallow features and
> later capabilities were added to add different ways to request shallow
> fetches.

I think we're agreeing that the smaller step first is better.

> That being said, if we find that this feature doesn't work as-is and
> needs the extra complexity of patterns from the start then they should
> be added.

I agree (although I would be OK too if we decide to do the small
exact-name step now and then the pattern step later guarded by a
capability, as long as the project understood that multiple support
levels would then exist in the wild).

> But it doesn't seem like there's a concrete reason at the
> moment.

I agree. I thought I had a reason, but not after thinking through the
ideas I explained in [1].

[1] https://public-inbox.org/git/20180615190458.147775-1-jonathantanmy@google.com/

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 2/8] upload-pack: implement ref-in-want
  2018-06-19 18:50       ` Brandon Williams
@ 2018-06-19 20:37         ` Junio C Hamano
  2018-06-19 23:14           ` Brandon Williams
  0 siblings, 1 reply; 46+ messages in thread
From: Junio C Hamano @ 2018-06-19 20:37 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, avarab, ramsay

Brandon Williams <bmwill@google.com> writes:

> I also think that we should keep this first implementation of
> ref-in-want simple and *not* include patterns, even if that's what we
> may want someday down the road.  Adding a new capability in the future
> for support of such patterns would be relatively simple and easy.

I am all for many-small-steps over a single-giant-step approach.

>  The
> reason why I don't think we should add pattern support just yet is due
> to a request for "want-ref refs/tags/*" or a like request resulting in a
> larger than expected packfile every time "fetch --tags" is run.  The
> issue being that in a fetch request "refs/tags/*" is too broad of a
> request and could be requesting 100s of tags when all we really wanted
> was to get the one or two new tags which are present on the remote
> (because we already have all the other tags present locally).

I do not quite get this.  Why does it have to result in a large
packfile?  Doesn't the requester of refs/tags/* still show what it
has via "have" exchange?




^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v2 2/8] upload-pack: implement ref-in-want
  2018-06-19 20:37         ` Junio C Hamano
@ 2018-06-19 23:14           ` Brandon Williams
  0 siblings, 0 replies; 46+ messages in thread
From: Brandon Williams @ 2018-06-19 23:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, avarab, ramsay

On 06/19, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > I also think that we should keep this first implementation of
> > ref-in-want simple and *not* include patterns, even if that's what we
> > may want someday down the road.  Adding a new capability in the future
> > for support of such patterns would be relatively simple and easy.
> 
> I am all for many-small-steps over a single-giant-step approach.
> 
> >  The
> > reason why I don't think we should add pattern support just yet is due
> > to a request for "want-ref refs/tags/*" or a like request resulting in a
> > larger than expected packfile every time "fetch --tags" is run.  The
> > issue being that in a fetch request "refs/tags/*" is too broad of a
> > request and could be requesting 100s of tags when all we really wanted
> > was to get the one or two new tags which are present on the remote
> > (because we already have all the other tags present locally).
> 
> I do not quite get this.  Why does it have to result in a large
> packfile?  Doesn't the requester of refs/tags/* still show what it
> has via "have" exchange?

Sorry Jonathan Tan said it much clearer here:
https://public-inbox.org/git/20180615190458.147775-1-jonathantanmy@google.com/

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/8] ref-in-want
  2018-06-15 19:04 ` [PATCH " Jonathan Tan
  2018-06-19 17:32   ` Brandon Williams
@ 2018-06-19 23:16   ` Brandon Williams
  2018-06-19 23:38     ` Jonathan Tan
  1 sibling, 1 reply; 46+ messages in thread
From: Brandon Williams @ 2018-06-19 23:16 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On 06/15, Jonathan Tan wrote:
> 
> Supporting patterns would mean that we would possibly be able to
> eliminate the ls-refs step, thus saving at least a RTT. (Originally I
> thought that supporting patterns would also allow us to tolerate refs
> being removed during the fetch process, but I see that this is already
> handled by the server ignoring "want-ref <ref>" wherein <ref> doesn't
> exist on the server.)

What's your opinion on this?  Should we keep it how it is in v2 of the
series where the server ignores refs it doesn't know about or revert to
what v1 of the series did and have it be a hard error?

I've gone back and forth on what I think we should do so I'd like to
hear at least one more opinion :)

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/8] ref-in-want
  2018-06-19 23:16   ` Brandon Williams
@ 2018-06-19 23:38     ` Jonathan Tan
  0 siblings, 0 replies; 46+ messages in thread
From: Jonathan Tan @ 2018-06-19 23:38 UTC (permalink / raw)
  To: bmwill; +Cc: jonathantanmy, git

> On 06/15, Jonathan Tan wrote:
> > 
> > Supporting patterns would mean that we would possibly be able to
> > eliminate the ls-refs step, thus saving at least a RTT. (Originally I
> > thought that supporting patterns would also allow us to tolerate refs
> > being removed during the fetch process, but I see that this is already
> > handled by the server ignoring "want-ref <ref>" wherein <ref> doesn't
> > exist on the server.)
> 
> What's your opinion on this?  Should we keep it how it is in v2 of the
> series where the server ignores refs it doesn't know about or revert to
> what v1 of the series did and have it be a hard error?

I think it should be like in v2 - the server should ignore "want-ref
<ref>" lines for refs it doesn't know about. And, after more thought, I
think that the client should die if "fetch <exact-ref-name>" was not
fulfilled, and ignore if a ref in "fetch <ref-with-wildcard>" was not
fulfilled.

The advantage of doing that is that we make the protocol a bit more
tolerant to adverse conditions (e.g. a rapidly changing repository or an
eventually consistent load-balancing setup), while having little-to-no
effect on regular conditions.

The disadvantage is that there is now one additional place where a
failure can silently occur, but I think that this is a minor
disadvantage. A naive script using "git fetch", in my mind, would assume
that refs/heads/exact exists if "fetch
refs/heads/exact:refs/heads/exact" succeeds, but would not assume that
refs/heads/wildcard-something exists if "fetch
refs/heads/wildcard*:refs/heads/wildcard*" succeeds, which fits in
nicely with the die/ignore behavior I outlined above.

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, back to index

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-05 17:51 [PATCH 0/8] ref-in-want Brandon Williams
2018-06-05 17:51 ` [PATCH 1/8] test-pkt-line: add unpack-sideband subcommand Brandon Williams
2018-06-05 17:51 ` [PATCH 2/8] upload-pack: implement ref-in-want Brandon Williams
2018-06-05 19:11   ` Ramsay Jones
2018-06-05 20:32   ` Ævar Arnfjörð Bjarmason
2018-06-06 21:32     ` Brandon Williams
2018-06-06 22:42       ` Ævar Arnfjörð Bjarmason
2018-06-06 22:45         ` Brandon Williams
2018-06-05 17:51 ` [PATCH 3/8] upload-pack: test negotiation with changing repository Brandon Williams
2018-06-05 17:51 ` [PATCH 4/8] fetch: refactor the population of peer ref OIDs Brandon Williams
2018-06-05 17:51 ` [PATCH 5/8] fetch: refactor fetch_refs into two functions Brandon Williams
2018-06-05 17:51 ` [PATCH 6/8] fetch: refactor to make function args narrower Brandon Williams
2018-06-05 17:51 ` [PATCH 7/8] fetch-pack: put shallow info in output parameter Brandon Williams
2018-06-05 17:51 ` [PATCH 8/8] fetch-pack: implement ref-in-want Brandon Williams
2018-06-13 21:39 ` [PATCH v2 0/8] ref-in-want Brandon Williams
2018-06-13 21:39   ` [PATCH v2 1/8] test-pkt-line: add unpack-sideband subcommand Brandon Williams
2018-06-14 18:09     ` Stefan Beller
2018-06-14 19:21       ` Brandon Williams
2018-06-13 21:39   ` [PATCH v2 2/8] upload-pack: implement ref-in-want Brandon Williams
2018-06-14 18:40     ` Stefan Beller
2018-06-14 18:52       ` Brandon Williams
2018-06-15 21:08     ` Junio C Hamano
2018-06-15 21:14       ` Junio C Hamano
2018-06-19 18:50       ` Brandon Williams
2018-06-19 20:37         ` Junio C Hamano
2018-06-19 23:14           ` Brandon Williams
2018-06-13 21:39   ` [PATCH v2 3/8] upload-pack: test negotiation with changing repository Brandon Williams
2018-06-14 19:23     ` Stefan Beller
2018-06-13 21:39   ` [PATCH v2 4/8] fetch: refactor the population of peer ref OIDs Brandon Williams
2018-06-13 21:39   ` [PATCH v2 5/8] fetch: refactor fetch_refs into two functions Brandon Williams
2018-06-13 21:39   ` [PATCH v2 6/8] fetch: refactor to make function args narrower Brandon Williams
2018-06-14 19:32     ` Stefan Beller
2018-06-13 21:39   ` [PATCH v2 7/8] fetch-pack: put shallow info in output parameter Brandon Williams
2018-06-14 19:42     ` Stefan Beller
2018-06-14 23:59     ` Jonathan Tan
2018-06-19 17:41       ` Brandon Williams
2018-06-13 21:39   ` [PATCH v2 8/8] fetch-pack: implement ref-in-want Brandon Williams
2018-06-14 19:56     ` Stefan Beller
2018-06-14 21:18       ` Brandon Williams
2018-06-15 21:20   ` [PATCH v2 0/8] ref-in-want Junio C Hamano
2018-06-18 18:05     ` Brandon Williams
2018-06-15 19:04 ` [PATCH " Jonathan Tan
2018-06-19 17:32   ` Brandon Williams
2018-06-19 19:23     ` Jonathan Tan
2018-06-19 23:16   ` Brandon Williams
2018-06-19 23:38     ` Jonathan Tan

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox