git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v3 0/8] Hiding refs
@ 2013-01-30 18:45 Junio C Hamano
  2013-01-30 18:45 ` [PATCH v3 1/8] upload-pack: share more code Junio C Hamano
                   ` (8 more replies)
  0 siblings, 9 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-01-30 18:45 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Shawn Pearce

The third round.

 - Multi-valued variable transfer.hiderefs lists prefixes of ref
   hierarchies to be hidden from the requests coming over the
   network.

 - A configuration optionally allows uploadpack to accept fetch
   requests for an object at the tip of a hidden ref.

Elsewhere, we discussed "delaying ref advertisement" (aka "expand
refs"), but it is an orthogonal feature and this "hiding refs
completely from advertisement" series does not attempt to address.

Patch #2 (simplify request validation), #4 (clarify the codeflow),
and #5 (use struct ref) are new.  The are all long overdue clean-ups
for these codepaths.

The last patch is an illustration why it wouldn't make sense to
optionally allow pushing into hidden refs, and not meant to be part
of the series proper.

For those who missed it, earlier rounds are at:

    http://thread.gmane.org/gmane.comp.version-control.git/213951
    http://thread.gmane.org/gmane.comp.version-control.git/214888

Junio C Hamano (8):
  upload-pack: share more code
  upload-pack: simplify request validation
  upload/receive-pack: allow hiding ref hierarchies
  parse_fetch_refspec(): clarify the codeflow a bit
  fetch: use struct ref to represent refs to be fetched
  upload-pack: optionally allow fetching from the tips of hidden refs
  fetch: fetch objects by their exact SHA-1 object names
  WIP: receive.allowupdatestohidden

 Documentation/config.txt |  23 +++++++++++
 builtin/fetch-pack.c     |  40 +++++++++++++++----
 builtin/receive-pack.c   |  31 +++++++++++++++
 cache.h                  |   3 +-
 fetch-pack.c             | 101 ++++++++++++++++++++++++++++++++---------------
 fetch-pack.h             |  11 +++---
 refs.c                   |  41 +++++++++++++++++++
 refs.h                   |   3 ++
 remote.c                 |  41 ++++++++++---------
 remote.h                 |   1 +
 t/t5512-ls-remote.sh     |   9 +++++
 t/t5516-fetch-push.sh    |  82 ++++++++++++++++++++++++++++++++++++++
 transport.c              |   9 +----
 upload-pack.c            |  86 ++++++++++++++++++++++++----------------
 14 files changed, 374 insertions(+), 107 deletions(-)

-- 
1.8.1.2.589.ga9b91ac

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH v3 1/8] upload-pack: share more code
  2013-01-30 18:45 [PATCH v3 0/8] Hiding refs Junio C Hamano
@ 2013-01-30 18:45 ` Junio C Hamano
  2013-01-30 18:45 ` [PATCH v3 2/8] upload-pack: simplify request validation Junio C Hamano
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-01-30 18:45 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Shawn Pearce

We mark the objects pointed at our refs with "OUR_REF" flag in two
functions (mark_our_ref() and send_ref()), but we can just use the
former as a helper for the latter.

Update the way mark_our_ref() prepares in-core object to use
lookup_unknown_object() to delay reading the actual object data,
just like we did in 435c833 (upload-pack: use peel_ref for ref
advertisements, 2012-10-04).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 upload-pack.c | 31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 95d8313..3dd220d 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -722,15 +722,28 @@ static void receive_needs(void)
 	free(shallows.objects);
 }
 
+static int mark_our_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
+{
+	struct object *o = lookup_unknown_object(sha1);
+	if (!o)
+		die("git upload-pack: cannot find object %s:", sha1_to_hex(sha1));
+	if (!(o->flags & OUR_REF)) {
+		o->flags |= OUR_REF;
+		nr_our_refs++;
+	}
+	return 0;
+}
+
 static int send_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
 {
 	static const char *capabilities = "multi_ack thin-pack side-band"
 		" side-band-64k ofs-delta shallow no-progress"
 		" include-tag multi_ack_detailed";
-	struct object *o = lookup_unknown_object(sha1);
 	const char *refname_nons = strip_namespace(refname);
 	unsigned char peeled[20];
 
+	mark_our_ref(refname, sha1, flag, cb_data);
+
 	if (capabilities)
 		packet_write(1, "%s %s%c%s%s agent=%s\n",
 			     sha1_to_hex(sha1), refname_nons,
@@ -740,27 +753,11 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 	else
 		packet_write(1, "%s %s\n", sha1_to_hex(sha1), refname_nons);
 	capabilities = NULL;
-	if (!(o->flags & OUR_REF)) {
-		o->flags |= OUR_REF;
-		nr_our_refs++;
-	}
 	if (!peel_ref(refname, peeled))
 		packet_write(1, "%s %s^{}\n", sha1_to_hex(peeled), refname_nons);
 	return 0;
 }
 
-static int mark_our_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
-{
-	struct object *o = parse_object(sha1);
-	if (!o)
-		die("git upload-pack: cannot find object %s:", sha1_to_hex(sha1));
-	if (!(o->flags & OUR_REF)) {
-		o->flags |= OUR_REF;
-		nr_our_refs++;
-	}
-	return 0;
-}
-
 static void upload_pack(void)
 {
 	if (advertise_refs || !stateless_rpc) {
-- 
1.8.1.2.589.ga9b91ac

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 2/8] upload-pack: simplify request validation
  2013-01-30 18:45 [PATCH v3 0/8] Hiding refs Junio C Hamano
  2013-01-30 18:45 ` [PATCH v3 1/8] upload-pack: share more code Junio C Hamano
@ 2013-01-30 18:45 ` Junio C Hamano
  2013-01-30 18:45 ` [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies Junio C Hamano
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-01-30 18:45 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Shawn Pearce

Long time ago, we used to punt on a large (read: asking for more
than 256 refs) fetch request and instead sent a full pack, because
we couldn't fit many refs on the command line of rev-list we run
internally to enumerate the objects to be sent.  To fix this,
565ebbf (upload-pack: tighten request validation., 2005-10-24),
added a check to count the number of refs in the request and matched
with the number of refs we advertised, and changed the invocation of
rev-list to pass "--all" to it, still keeping us under the command
line argument limit.

However, these days we feed the list of objects requested and the
list of objects the other end is known to have via standard input,
so there is no longer a valid reason to special case a full clone
request.  Remove the code associated with "create_full_pack" to
simplify the logic.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 upload-pack.c | 28 +++++++++++-----------------
 1 file changed, 11 insertions(+), 17 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 3dd220d..3a26a7b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -28,7 +28,7 @@ static const char upload_pack_usage[] = "git upload-pack [--strict] [--timeout=<
 
 static unsigned long oldest_have;
 
-static int multi_ack, nr_our_refs;
+static int multi_ack;
 static int no_done;
 static int use_thin_pack, use_ofs_delta, use_include_tag;
 static int no_progress, daemon_mode;
@@ -139,7 +139,6 @@ static void create_pack_file(void)
 {
 	struct async rev_list;
 	struct child_process pack_objects;
-	int create_full_pack = (nr_our_refs == want_obj.nr && !have_obj.nr);
 	char data[8193], progress[128];
 	char abort_msg[] = "aborting due to possible repository "
 		"corruption on the remote side.";
@@ -151,9 +150,7 @@ static void create_pack_file(void)
 	argv[arg++] = "pack-objects";
 	if (!shallow_nr) {
 		argv[arg++] = "--revs";
-		if (create_full_pack)
-			argv[arg++] = "--all";
-		else if (use_thin_pack)
+		if (use_thin_pack)
 			argv[arg++] = "--thin";
 	}
 
@@ -185,15 +182,15 @@ static void create_pack_file(void)
 	}
 	else {
 		FILE *pipe_fd = xfdopen(pack_objects.in, "w");
-		if (!create_full_pack) {
-			int i;
-			for (i = 0; i < want_obj.nr; i++)
-				fprintf(pipe_fd, "%s\n", sha1_to_hex(want_obj.objects[i].item->sha1));
-			fprintf(pipe_fd, "--not\n");
-			for (i = 0; i < have_obj.nr; i++)
-				fprintf(pipe_fd, "%s\n", sha1_to_hex(have_obj.objects[i].item->sha1));
-		}
+		int i;
 
+		for (i = 0; i < want_obj.nr; i++)
+			fprintf(pipe_fd, "%s\n",
+				sha1_to_hex(want_obj.objects[i].item->sha1));
+		fprintf(pipe_fd, "--not\n");
+		for (i = 0; i < have_obj.nr; i++)
+			fprintf(pipe_fd, "%s\n",
+				sha1_to_hex(have_obj.objects[i].item->sha1));
 		fprintf(pipe_fd, "\n");
 		fflush(pipe_fd);
 		fclose(pipe_fd);
@@ -727,10 +724,7 @@ static int mark_our_ref(const char *refname, const unsigned char *sha1, int flag
 	struct object *o = lookup_unknown_object(sha1);
 	if (!o)
 		die("git upload-pack: cannot find object %s:", sha1_to_hex(sha1));
-	if (!(o->flags & OUR_REF)) {
-		o->flags |= OUR_REF;
-		nr_our_refs++;
-	}
+	o->flags |= OUR_REF;
 	return 0;
 }
 
-- 
1.8.1.2.589.ga9b91ac

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies
  2013-01-30 18:45 [PATCH v3 0/8] Hiding refs Junio C Hamano
  2013-01-30 18:45 ` [PATCH v3 1/8] upload-pack: share more code Junio C Hamano
  2013-01-30 18:45 ` [PATCH v3 2/8] upload-pack: simplify request validation Junio C Hamano
@ 2013-01-30 18:45 ` Junio C Hamano
  2013-02-05  8:50   ` Jeff King
  2013-01-30 18:45 ` [PATCH v3 4/8] parse_fetch_refspec(): clarify the codeflow a bit Junio C Hamano
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 54+ messages in thread
From: Junio C Hamano @ 2013-01-30 18:45 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Shawn Pearce

Teach upload-pack and receive-pack to omit some refs from their
initial advertisements by paying attention to the transfer.hiderefs
multi-valued configuration variable.  Any ref that is under the
hierarchies listed on the value of this variable is excluded from
responses to requests made by "ls-remote", "fetch", "clone", "push",
etc.

A typical use case may be

	[transfer]
		hiderefs = refs/pull

to hide the refs that are internally used by the hosting site and
should not be exposed over the network.

Because these hidden refs do not count as OUR_REF, an attempt to
fetch objects at the tip of them will be rejected, and because these
refs do not get advertised, "git push :" will not see local branches
that have the same name as them as "matching" ones to be sent.

An attempt to update/delete these hidden refs with an explicit
refspec, e.g. "git push origin :refs/pull/11/head", is rejected.

This is not a new restriction.  To the pusher, it would appear that
there is no such ref, so its push request will conclude with "Now
that I sent you all the data, it is time for you to update the refs.
I saw that the ref did not exist when I started pushing, and I want
the result to point at this commit".  The receiving end will apply
the compare-and-swap rule to this request and rejects the push with
"Well, your update request conflicts with somebody else; I see there
is such a ref.", which is the right thing to do. Otherwise a push to
a hidden ref will always be "the last one wins", which is not a good
default.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt | 11 +++++++++++
 builtin/receive-pack.c   | 24 ++++++++++++++++++++++++
 refs.c                   | 41 +++++++++++++++++++++++++++++++++++++++++
 refs.h                   |  3 +++
 t/t5512-ls-remote.sh     |  9 +++++++++
 t/t5516-fetch-push.sh    | 24 ++++++++++++++++++++++++
 upload-pack.c            | 14 +++++++++++++-
 7 files changed, 125 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index ef45c99..f57c802 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2057,6 +2057,17 @@ transfer.fsckObjects::
 	not set, the value of this variable is used instead.
 	Defaults to false.
 
+transfer.hiderefs::
+	String(s) `upload-pack` and `receive-pack` use to decide
+	which refs to omit from their initial advertisement.  Use
+	more than one transfer.hiderefs configuration variables to
+	specify multiple prefix strings. A ref that are under the
+	hierarchies listed on the value of this variable is excluded,
+	and is hidden from `git ls-remote`, `git fetch`, `git push :`,
+	etc.  An attempt to update or delete a hidden ref by `git push`
+	is rejected, and an attempt to fetch a hidden ref by `git fetch`
+	will fail.
+
 transfer.unpackLimit::
 	When `fetch.unpackLimit` or `receive.unpackLimit` are
 	not set, the value of this variable is used instead.
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index ff781fe..a8248d9 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -59,6 +59,11 @@ static enum deny_action parse_deny_action(const char *var, const char *value)
 
 static int receive_pack_config(const char *var, const char *value, void *cb)
 {
+	int status = parse_hide_refs_config(var, value, cb);
+
+	if (status)
+		return status;
+
 	if (strcmp(var, "receive.denydeletes") == 0) {
 		deny_deletes = git_config_bool(var, value);
 		return 0;
@@ -119,6 +124,9 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 
 static void show_ref(const char *path, const unsigned char *sha1)
 {
+	if (ref_is_hidden(path))
+		return;
+
 	if (sent_capabilities)
 		packet_write(1, "%s %s\n", sha1_to_hex(sha1), path);
 	else
@@ -688,6 +696,20 @@ static int iterate_receive_command_list(void *cb_data, unsigned char sha1[20])
 	return -1; /* end of list */
 }
 
+static void reject_updates_to_hidden(struct command *commands)
+{
+	struct command *cmd;
+
+	for (cmd = commands; cmd; cmd = cmd->next) {
+		if (cmd->error_string || !ref_is_hidden(cmd->ref_name))
+			continue;
+		if (is_null_sha1(cmd->new_sha1))
+			cmd->error_string = "deny deleting a hidden ref";
+		else
+			cmd->error_string = "deny updating a hidden ref";
+	}
+}
+
 static void execute_commands(struct command *commands, const char *unpacker_error)
 {
 	struct command *cmd;
@@ -704,6 +726,8 @@ static void execute_commands(struct command *commands, const char *unpacker_erro
 				       0, &cmd))
 		set_connectivity_errors(commands);
 
+	reject_updates_to_hidden(commands);
+
 	if (run_receive_hook(commands, pre_receive_hook, 0)) {
 		for (cmd = commands; cmd; cmd = cmd->next) {
 			if (!cmd->error_string)
diff --git a/refs.c b/refs.c
index 541fec2..e3574ca 100644
--- a/refs.c
+++ b/refs.c
@@ -3,6 +3,7 @@
 #include "object.h"
 #include "tag.h"
 #include "dir.h"
+#include "string-list.h"
 
 /*
  * Make sure "ref" is something reasonable to have under ".git/refs/";
@@ -2556,3 +2557,43 @@ char *shorten_unambiguous_ref(const char *refname, int strict)
 	free(short_name);
 	return xstrdup(refname);
 }
+
+static struct string_list *hide_refs;
+
+int parse_hide_refs_config(const char *var, const char *value, void *unused)
+{
+	if (!strcmp("transfer.hiderefs", var)) {
+		char *ref;
+		int len;
+
+		if (!value)
+			return config_error_nonbool(var);
+		ref = xstrdup(value);
+		len = strlen(ref);
+		while (len && ref[len - 1] == '/')
+			ref[--len] = '\0';
+		if (!hide_refs) {
+			hide_refs = xcalloc(1, sizeof(*hide_refs));
+			hide_refs->strdup_strings = 1;
+		}
+		string_list_append(hide_refs, ref);
+	}
+	return 0;
+}
+
+int ref_is_hidden(const char *refname)
+{
+	struct string_list_item *item;
+
+	if (!hide_refs)
+		return 0;
+	for_each_string_list_item(item, hide_refs) {
+		int len;
+		if (prefixcmp(refname, item->string))
+			continue;
+		len = strlen(item->string);
+		if (!refname[len] || refname[len] == '/')
+			return 1;
+	}
+	return 0;
+}
diff --git a/refs.h b/refs.h
index d6c2fe2..50b233f 100644
--- a/refs.h
+++ b/refs.h
@@ -147,4 +147,7 @@ int update_ref(const char *action, const char *refname,
 		const unsigned char *sha1, const unsigned char *oldval,
 		int flags, enum action_on_err onerr);
 
+extern int parse_hide_refs_config(const char *var, const char *value, void *);
+extern int ref_is_hidden(const char *);
+
 #endif /* REFS_H */
diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
index d16e5d3..d0702ed 100755
--- a/t/t5512-ls-remote.sh
+++ b/t/t5512-ls-remote.sh
@@ -126,4 +126,13 @@ test_expect_success 'Report match with --exit-code' '
 	test_cmp expect actual
 '
 
+test_expect_success 'Hide some refs' '
+	test_config transfer.hiderefs refs/tags &&
+	git ls-remote . >actual &&
+	test_unconfig transfer.hiderefs &&
+	git ls-remote . |
+	sed -e "/	refs\/tags\//d" >expect &&
+	test_cmp expect actual
+'
+
 test_done
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 6009372..852efb6 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -1037,4 +1037,28 @@ test_expect_success 'push --prune refspec' '
 	! check_push_result $the_first_commit tmp/foo tmp/bar
 '
 
+test_expect_success 'push to update a hidden ref' '
+	mk_test heads/master hidden/one hidden/two hidden/three &&
+	(
+		cd testrepo &&
+		git config transfer.hiderefs refs/hidden
+	) &&
+
+	# push to unhidden ref succeeds normally
+	git push testrepo master:refs/heads/master &&
+	check_push_result $the_commit heads/master &&
+
+	# push to update a hidden ref should fail
+	test_must_fail git push testrepo master:refs/hidden/one &&
+	check_push_result $the_first_commit hidden/one &&
+
+	# push to delete a hidden ref should fail
+	test_must_fail git push testrepo :refs/hidden/two &&
+	check_push_result $the_first_commit hidden/two &&
+
+	# idempotent push to update a hidden ref should fail
+	test_must_fail git push testrepo $the_first_commit:refs/hidden/three &&
+	check_push_result $the_first_commit hidden/three
+'
+
 test_done
diff --git a/upload-pack.c b/upload-pack.c
index 3a26a7b..6b10843 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -12,6 +12,7 @@
 #include "run-command.h"
 #include "sigchain.h"
 #include "version.h"
+#include "string-list.h"
 
 static const char upload_pack_usage[] = "git upload-pack [--strict] [--timeout=<n>] <dir>";
 
@@ -719,9 +720,13 @@ static void receive_needs(void)
 	free(shallows.objects);
 }
 
+/* return non-zero if the ref is hidden, otherwise 0 */
 static int mark_our_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
 {
 	struct object *o = lookup_unknown_object(sha1);
+
+	if (ref_is_hidden(refname))
+		return 1;
 	if (!o)
 		die("git upload-pack: cannot find object %s:", sha1_to_hex(sha1));
 	o->flags |= OUR_REF;
@@ -736,7 +741,8 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 	const char *refname_nons = strip_namespace(refname);
 	unsigned char peeled[20];
 
-	mark_our_ref(refname, sha1, flag, cb_data);
+	if (mark_our_ref(refname, sha1, flag, cb_data))
+		return 0;
 
 	if (capabilities)
 		packet_write(1, "%s %s%c%s%s agent=%s\n",
@@ -773,6 +779,11 @@ static void upload_pack(void)
 	}
 }
 
+static int upload_pack_config(const char *var, const char *value, void *unused)
+{
+	return parse_hide_refs_config(var, value, unused);
+}
+
 int main(int argc, char **argv)
 {
 	char *dir;
@@ -824,6 +835,7 @@ int main(int argc, char **argv)
 		die("'%s' does not appear to be a git repository", dir);
 	if (is_repository_shallow())
 		die("attempt to fetch/clone from a shallow repository");
+	git_config(upload_pack_config, NULL);
 	if (getenv("GIT_DEBUG_SEND_PACK"))
 		debug_fd = atoi(getenv("GIT_DEBUG_SEND_PACK"));
 	upload_pack();
-- 
1.8.1.2.589.ga9b91ac

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 4/8] parse_fetch_refspec(): clarify the codeflow a bit
  2013-01-30 18:45 [PATCH v3 0/8] Hiding refs Junio C Hamano
                   ` (2 preceding siblings ...)
  2013-01-30 18:45 ` [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies Junio C Hamano
@ 2013-01-30 18:45 ` Junio C Hamano
  2013-01-30 18:45 ` [PATCH v3 5/8] fetch: use struct ref to represent refs to be fetched Junio C Hamano
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-01-30 18:45 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Shawn Pearce

Most parts of the cascaded if/else if/... checked an allowable
condition but some checked forbidden conditions.  This makes adding
new allowable conditions unnecessarily inconvenient.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 remote.c | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/remote.c b/remote.c
index 4b1153f..1b7828d 100644
--- a/remote.c
+++ b/remote.c
@@ -538,7 +538,7 @@ static struct refspec *parse_refspec_internal(int nr_refspec, const char **refsp
 
 		/*
 		 * Before going on, special case ":" (or "+:") as a refspec
-		 * for matching refs.
+		 * for pushing matching refs.
 		 */
 		if (!fetch && rhs == lhs && rhs[1] == '\0') {
 			rs[i].matching = 1;
@@ -565,26 +565,21 @@ static struct refspec *parse_refspec_internal(int nr_refspec, const char **refsp
 		flags = REFNAME_ALLOW_ONELEVEL | (is_glob ? REFNAME_REFSPEC_PATTERN : 0);
 
 		if (fetch) {
-			/*
-			 * LHS
-			 * - empty is allowed; it means HEAD.
-			 * - otherwise it must be a valid looking ref.
-			 */
+			/* LHS */
 			if (!*rs[i].src)
-				; /* empty is ok */
-			else if (check_refname_format(rs[i].src, flags))
+				; /* empty is ok; it means "HEAD" */
+			else if (!check_refname_format(rs[i].src, flags))
+				; /* valid looking ref is ok */
+			else
 				goto invalid;
-			/*
-			 * RHS
-			 * - missing is ok, and is same as empty.
-			 * - empty is ok; it means not to store.
-			 * - otherwise it must be a valid looking ref.
-			 */
+			/* RHS */
 			if (!rs[i].dst)
-				; /* ok */
+				; /* missing is ok; it is the same as empty */
 			else if (!*rs[i].dst)
-				; /* ok */
-			else if (check_refname_format(rs[i].dst, flags))
+				; /* empty is ok; it means "do not store" */
+			else if (!check_refname_format(rs[i].dst, flags))
+				; /* valid looking ref is ok */
+			else
 				goto invalid;
 		} else {
 			/*
-- 
1.8.1.2.589.ga9b91ac

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 5/8] fetch: use struct ref to represent refs to be fetched
  2013-01-30 18:45 [PATCH v3 0/8] Hiding refs Junio C Hamano
                   ` (3 preceding siblings ...)
  2013-01-30 18:45 ` [PATCH v3 4/8] parse_fetch_refspec(): clarify the codeflow a bit Junio C Hamano
@ 2013-01-30 18:45 ` Junio C Hamano
  2013-01-30 18:45 ` [PATCH v3 6/8] upload-pack: optionally allow fetching from the tips of hidden refs Junio C Hamano
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-01-30 18:45 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Shawn Pearce

Even though "git fetch" has full infrastructure to parse refspecs to
be fetched and match them against the list of refs to come up with
the final list of refs to be fetched, the list of refs that are
requested to be fetched were internally converted to a plain list of
strings at the transport layer and then passed to the underlying
fetch-pack driver.

Stop this conversion and instead pass around an array of refs.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/fetch-pack.c | 40 ++++++++++++++++++++------
 cache.h              |  3 +-
 fetch-pack.c         | 79 +++++++++++++++++++++++++++++++---------------------
 fetch-pack.h         | 11 ++++----
 transport.c          |  9 ++----
 5 files changed, 89 insertions(+), 53 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 940ae35..cc6bf8f 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -7,12 +7,31 @@ static const char fetch_pack_usage[] =
 "[--include-tag] [--upload-pack=<git-upload-pack>] [--depth=<n>] "
 "[--no-progress] [-v] [<host>:]<directory> [<refs>...]";
 
+static void add_sought_entry_mem(struct ref ***sought, int *nr, int *alloc,
+				 const char *name, int namelen)
+{
+	struct ref *ref = xcalloc(1, sizeof(*ref) + namelen + 1);
+
+	memcpy(ref->name, name, namelen);
+	ref->name[namelen] = '\0';
+	(*nr)++;
+	ALLOC_GROW(*sought, *nr, *alloc);
+	(*sought)[*nr - 1] = ref;
+}
+
+static void add_sought_entry(struct ref ***sought, int *nr, int *alloc,
+			     const char *string)
+{
+	add_sought_entry_mem(sought, nr, alloc, string, strlen(string));
+}
+
 int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 {
 	int i, ret;
 	struct ref *ref = NULL;
 	const char *dest = NULL;
-	struct string_list sought = STRING_LIST_INIT_DUP;
+	struct ref **sought;
+	int nr_sought = 0, alloc_sought = 0;
 	int fd[2];
 	char *pack_lockfile = NULL;
 	char **pack_lockfile_ptr = NULL;
@@ -94,7 +113,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	 * refs from the standard input:
 	 */
 	for (; i < argc; i++)
-		string_list_append(&sought, xstrdup(argv[i]));
+		add_sought_entry(&sought, &nr_sought, &alloc_sought, argv[i]);
 	if (args.stdin_refs) {
 		if (args.stateless_rpc) {
 			/* in stateless RPC mode we use pkt-line to read
@@ -107,14 +126,14 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 					break;
 				if (line[n-1] == '\n')
 					n--;
-				string_list_append(&sought, xmemdupz(line, n));
+				add_sought_entry_mem(&sought, &nr_sought,  &alloc_sought, line, n);
 			}
 		}
 		else {
 			/* read from stdin one ref per line, until EOF */
 			struct strbuf line = STRBUF_INIT;
 			while (strbuf_getline(&line, stdin, '\n') != EOF)
-				string_list_append(&sought, strbuf_detach(&line, NULL));
+				add_sought_entry(&sought, &nr_sought, &alloc_sought, line.buf);
 			strbuf_release(&line);
 		}
 	}
@@ -131,7 +150,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	get_remote_heads(fd[0], &ref, 0, NULL);
 
 	ref = fetch_pack(&args, fd, conn, ref, dest,
-			 &sought, pack_lockfile_ptr);
+			 sought, nr_sought, pack_lockfile_ptr);
 	if (pack_lockfile) {
 		printf("lock %s\n", pack_lockfile);
 		fflush(stdout);
@@ -141,7 +160,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	if (finish_connect(conn))
 		return 1;
 
-	ret = !ref || sought.nr;
+	ret = !ref;
 
 	/*
 	 * If the heads to pull were given, we should have consumed
@@ -149,8 +168,13 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	 * remote no-such-ref' would silently succeed without issuing
 	 * an error.
 	 */
-	for (i = 0; i < sought.nr; i++)
-		error("no such remote ref %s", sought.items[i].string);
+	for (i = 0; i < nr_sought; i++) {
+		if (!sought[i] || sought[i]->matched)
+			continue;
+		error("no such remote ref %s", sought[i]->name);
+		ret = 1;
+	}
+
 	while (ref) {
 		printf("%s %s\n",
 		       sha1_to_hex(ref->old_sha1), ref->name);
diff --git a/cache.h b/cache.h
index c257953..03b3285 100644
--- a/cache.h
+++ b/cache.h
@@ -1013,7 +1013,8 @@ struct ref {
 		nonfastforward:1,
 		not_forwardable:1,
 		update:1,
-		deletion:1;
+		deletion:1,
+		matched:1;
 	enum {
 		REF_STATUS_NONE = 0,
 		REF_STATUS_OK,
diff --git a/fetch-pack.c b/fetch-pack.c
index f0acdf7..915c0b7 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -520,47 +520,37 @@ static void mark_recent_complete_commits(struct fetch_pack_args *args,
 	}
 }
 
-static int non_matching_ref(struct string_list_item *item, void *unused)
-{
-	if (item->util) {
-		item->util = NULL;
-		return 0;
-	}
-	else
-		return 1;
-}
-
 static void filter_refs(struct fetch_pack_args *args,
-			struct ref **refs, struct string_list *sought)
+			struct ref **refs,
+			struct ref **sought, int nr_sought)
 {
 	struct ref *newlist = NULL;
 	struct ref **newtail = &newlist;
 	struct ref *ref, *next;
-	int sought_pos;
+	int i;
 
-	sought_pos = 0;
+	i = 0;
 	for (ref = *refs; ref; ref = next) {
 		int keep = 0;
 		next = ref->next;
+
 		if (!memcmp(ref->name, "refs/", 5) &&
 		    check_refname_format(ref->name + 5, 0))
 			; /* trash */
 		else {
-			while (sought_pos < sought->nr) {
-				int cmp = strcmp(ref->name, sought->items[sought_pos].string);
+			while (i < nr_sought) {
+				int cmp = strcmp(ref->name, sought[i]->name);
 				if (cmp < 0)
 					break; /* definitely do not have it */
 				else if (cmp == 0) {
 					keep = 1; /* definitely have it */
-					sought->items[sought_pos++].util = "matched";
-					break;
+					sought[i]->matched = 1;
 				}
-				else
-					sought_pos++; /* might have it; keep looking */
+				i++;
 			}
 		}
 
-		if (! keep && args->fetch_all &&
+		if (!keep && args->fetch_all &&
 		    (!args->depth || prefixcmp(ref->name, "refs/tags/")))
 			keep = 1;
 
@@ -573,7 +563,6 @@ static void filter_refs(struct fetch_pack_args *args,
 		}
 	}
 
-	filter_string_list(sought, 0, non_matching_ref, NULL);
 	*refs = newlist;
 }
 
@@ -583,7 +572,8 @@ static void mark_alternate_complete(const struct ref *ref, void *unused)
 }
 
 static int everything_local(struct fetch_pack_args *args,
-			    struct ref **refs, struct string_list *sought)
+			    struct ref **refs,
+			    struct ref **sought, int nr_sought)
 {
 	struct ref *ref;
 	int retval;
@@ -634,7 +624,7 @@ static int everything_local(struct fetch_pack_args *args,
 		}
 	}
 
-	filter_refs(args, refs, sought);
+	filter_refs(args, refs, sought, nr_sought);
 
 	for (retval = 1, ref = *refs; ref ; ref = ref->next) {
 		const unsigned char *remote = ref->old_sha1;
@@ -764,10 +754,17 @@ static int get_pack(struct fetch_pack_args *args,
 	return 0;
 }
 
+static int cmp_ref_by_name(const void *a_, const void *b_)
+{
+	const struct ref *a = *((const struct ref **)a_);
+	const struct ref *b = *((const struct ref **)b_);
+	return strcmp(a->name, b->name);
+}
+
 static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 				 int fd[2],
 				 const struct ref *orig_ref,
-				 struct string_list *sought,
+				 struct ref **sought, int nr_sought,
 				 char **pack_lockfile)
 {
 	struct ref *ref = copy_ref_list(orig_ref);
@@ -776,6 +773,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	int agent_len;
 
 	sort_ref_list(&ref, ref_compare_name);
+	qsort(sought, nr_sought, sizeof(*sought), cmp_ref_by_name);
 
 	if (is_repository_shallow() && !server_supports("shallow"))
 		die("Server does not support shallow clients");
@@ -824,7 +822,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 				agent_len, agent_feature);
 	}
 
-	if (everything_local(args, &ref, sought)) {
+	if (everything_local(args, &ref, sought, nr_sought)) {
 		packet_flush(fd[1]);
 		goto all_done;
 	}
@@ -887,11 +885,32 @@ static void fetch_pack_setup(void)
 	did_setup = 1;
 }
 
+static int remove_duplicates_in_refs(struct ref **ref, int nr)
+{
+	struct string_list names = STRING_LIST_INIT_NODUP;
+	int src, dst;
+
+	for (src = dst = 0; src < nr; src++) {
+		struct string_list_item *item;
+		item = string_list_insert(&names, ref[src]->name);
+		if (item->util)
+			continue; /* already have it */
+		item->util = ref[src];
+		if (src != dst)
+			ref[dst] = ref[src];
+		dst++;
+	}
+	for (src = dst; src < nr; src++)
+		ref[src] = NULL;
+	string_list_clear(&names, 0);
+	return dst;
+}
+
 struct ref *fetch_pack(struct fetch_pack_args *args,
 		       int fd[], struct child_process *conn,
 		       const struct ref *ref,
 		       const char *dest,
-		       struct string_list *sought,
+		       struct ref **sought, int nr_sought,
 		       char **pack_lockfile)
 {
 	struct stat st;
@@ -903,16 +922,14 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 			st.st_mtime = 0;
 	}
 
-	if (sought->nr) {
-		sort_string_list(sought);
-		string_list_remove_duplicates(sought, 0);
-	}
+	if (nr_sought)
+		nr_sought = remove_duplicates_in_refs(sought, nr_sought);
 
 	if (!ref) {
 		packet_flush(fd[1]);
 		die("no matching remote head");
 	}
-	ref_cpy = do_fetch_pack(args, fd, ref, sought, pack_lockfile);
+	ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought, pack_lockfile);
 
 	if (args->depth > 0) {
 		static struct lock_file lock;
diff --git a/fetch-pack.h b/fetch-pack.h
index cb14871..dc5266c 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -20,17 +20,16 @@ struct fetch_pack_args {
 };
 
 /*
- * sought contains the full names of remote references that should be
- * updated from.  On return, the names that were found on the remote
- * will have been removed from the list.  The util members of the
- * string_list_items are used internally; they must be NULL on entry
- * (and will be NULL on exit).
+ * sought represents remote references that should be updated from.
+ * On return, the names that were found on the remote will have been
+ * marked as such.
  */
 struct ref *fetch_pack(struct fetch_pack_args *args,
 		       int fd[], struct child_process *conn,
 		       const struct ref *ref,
 		       const char *dest,
-		       struct string_list *sought,
+		       struct ref **sought,
+		       int nr_sought,
 		       char **pack_lockfile);
 
 #endif
diff --git a/transport.c b/transport.c
index 2673d27..64ce651 100644
--- a/transport.c
+++ b/transport.c
@@ -518,11 +518,9 @@ static int fetch_refs_via_pack(struct transport *transport,
 			       int nr_heads, struct ref **to_fetch)
 {
 	struct git_transport_data *data = transport->data;
-	struct string_list sought = STRING_LIST_INIT_DUP;
 	const struct ref *refs;
 	char *dest = xstrdup(transport->url);
 	struct fetch_pack_args args;
-	int i;
 	struct ref *refs_tmp = NULL;
 
 	memset(&args, 0, sizeof(args));
@@ -536,9 +534,6 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.no_progress = !transport->progress;
 	args.depth = data->options.depth;
 
-	for (i = 0; i < nr_heads; i++)
-		string_list_append(&sought, to_fetch[i]->name);
-
 	if (!data->got_remote_heads) {
 		connect_setup(transport, 0, 0);
 		get_remote_heads(data->fd[0], &refs_tmp, 0, NULL);
@@ -547,7 +542,8 @@ static int fetch_refs_via_pack(struct transport *transport,
 
 	refs = fetch_pack(&args, data->fd, data->conn,
 			  refs_tmp ? refs_tmp : transport->remote_refs,
-			  dest, &sought, &transport->pack_lockfile);
+			  dest, to_fetch, nr_heads,
+			  &transport->pack_lockfile);
 	close(data->fd[0]);
 	close(data->fd[1]);
 	if (finish_connect(data->conn))
@@ -557,7 +553,6 @@ static int fetch_refs_via_pack(struct transport *transport,
 
 	free_refs(refs_tmp);
 
-	string_list_clear(&sought, 0);
 	free(dest);
 	return (refs ? 0 : -1);
 }
-- 
1.8.1.2.589.ga9b91ac

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 6/8] upload-pack: optionally allow fetching from the tips of hidden refs
  2013-01-30 18:45 [PATCH v3 0/8] Hiding refs Junio C Hamano
                   ` (4 preceding siblings ...)
  2013-01-30 18:45 ` [PATCH v3 5/8] fetch: use struct ref to represent refs to be fetched Junio C Hamano
@ 2013-01-30 18:45 ` Junio C Hamano
  2013-01-30 18:45 ` [PATCH v3 7/8] fetch: fetch objects by their exact SHA-1 object names Junio C Hamano
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-01-30 18:45 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Shawn Pearce

With uploadpack.allowtipsha1inwant configuration option set, future
versions of "git fetch" that allow an exact object name (likely to
have been obtained out of band) on the LHS of the fetch refspec can
make a request with a "want" line that names an object that may not
have been advertised due to transfer.hiderefs configuration.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt |  8 +++++++-
 upload-pack.c            | 25 +++++++++++++++++++------
 2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f57c802..2dce021 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2066,13 +2066,19 @@ transfer.hiderefs::
 	and is hidden from `git ls-remote`, `git fetch`, `git push :`,
 	etc.  An attempt to update or delete a hidden ref by `git push`
 	is rejected, and an attempt to fetch a hidden ref by `git fetch`
-	will fail.
+	will fail.  See also `uploadpack.allowtipsha1inwant`.
 
 transfer.unpackLimit::
 	When `fetch.unpackLimit` or `receive.unpackLimit` are
 	not set, the value of this variable is used instead.
 	The default value is 100.
 
+uploadpack.allowtipsha1inwant::
+	When `transfer.hiderefs` is in effect, allow `upload-pack`
+	to accept a fetch request that asks for an object at the tip
+	of a hidden ref (by default, such a request is rejected).
+	see also `transfer.hiderefs`.
+
 url.<base>.insteadOf::
 	Any URL that starts with this value will be rewritten to
 	start, instead, with <base>. In cases where some site serves a
diff --git a/upload-pack.c b/upload-pack.c
index 6b10843..37977e2 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -26,6 +26,7 @@ static const char upload_pack_usage[] = "git upload-pack [--strict] [--timeout=<
 #define SHALLOW		(1u << 16)
 #define NOT_SHALLOW	(1u << 17)
 #define CLIENT_SHALLOW	(1u << 18)
+#define HIDDEN_REF	(1u << 19)
 
 static unsigned long oldest_have;
 
@@ -33,6 +34,7 @@ static int multi_ack;
 static int no_done;
 static int use_thin_pack, use_ofs_delta, use_include_tag;
 static int no_progress, daemon_mode;
+static int allow_tip_sha1_in_want;
 static int shallow_nr;
 static struct object_array have_obj;
 static struct object_array want_obj;
@@ -487,6 +489,12 @@ static int get_common_commits(void)
 	}
 }
 
+static int is_our_ref(struct object *o)
+{
+	return o->flags &
+		((allow_tip_sha1_in_want ? HIDDEN_REF : 0) | OUR_REF);
+}
+
 static void check_non_tip(void)
 {
 	static const char *argv[] = {
@@ -523,7 +531,7 @@ static void check_non_tip(void)
 		o = get_indexed_object(--i);
 		if (!o)
 			continue;
-		if (!(o->flags & OUR_REF))
+		if (!is_our_ref(o))
 			continue;
 		memcpy(namebuf + 1, sha1_to_hex(o->sha1), 40);
 		if (write_in_full(cmd.in, namebuf, 42) < 0)
@@ -532,7 +540,7 @@ static void check_non_tip(void)
 	namebuf[40] = '\n';
 	for (i = 0; i < want_obj.nr; i++) {
 		o = want_obj.objects[i].item;
-		if (o->flags & OUR_REF)
+		if (is_our_ref(o))
 			continue;
 		memcpy(namebuf, sha1_to_hex(o->sha1), 40);
 		if (write_in_full(cmd.in, namebuf, 41) < 0)
@@ -566,7 +574,7 @@ error:
 	/* Pick one of them (we know there at least is one) */
 	for (i = 0; i < want_obj.nr; i++) {
 		o = want_obj.objects[i].item;
-		if (!(o->flags & OUR_REF))
+		if (!is_our_ref(o))
 			die("git upload-pack: not our ref %s",
 			    sha1_to_hex(o->sha1));
 	}
@@ -646,7 +654,7 @@ static void receive_needs(void)
 			    sha1_to_hex(sha1_buf));
 		if (!(o->flags & WANTED)) {
 			o->flags |= WANTED;
-			if (!(o->flags & OUR_REF))
+			if (!is_our_ref(o))
 				has_non_tip = 1;
 			add_object_array(o, NULL, &want_obj);
 		}
@@ -725,8 +733,10 @@ static int mark_our_ref(const char *refname, const unsigned char *sha1, int flag
 {
 	struct object *o = lookup_unknown_object(sha1);
 
-	if (ref_is_hidden(refname))
+	if (ref_is_hidden(refname)) {
+		o->flags |= HIDDEN_REF;
 		return 1;
+	}
 	if (!o)
 		die("git upload-pack: cannot find object %s:", sha1_to_hex(sha1));
 	o->flags |= OUR_REF;
@@ -745,9 +755,10 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 		return 0;
 
 	if (capabilities)
-		packet_write(1, "%s %s%c%s%s agent=%s\n",
+		packet_write(1, "%s %s%c%s%s%s agent=%s\n",
 			     sha1_to_hex(sha1), refname_nons,
 			     0, capabilities,
+			     allow_tip_sha1_in_want ? " allow-tip-sha1-in-want" : "",
 			     stateless_rpc ? " no-done" : "",
 			     git_user_agent_sanitized());
 	else
@@ -781,6 +792,8 @@ static void upload_pack(void)
 
 static int upload_pack_config(const char *var, const char *value, void *unused)
 {
+	if (!strcmp("uploadpack.allowtipsha1inwant", var))
+		allow_tip_sha1_in_want = git_config_bool(var, value);
 	return parse_hide_refs_config(var, value, unused);
 }
 
-- 
1.8.1.2.589.ga9b91ac

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 7/8] fetch: fetch objects by their exact SHA-1 object names
  2013-01-30 18:45 [PATCH v3 0/8] Hiding refs Junio C Hamano
                   ` (5 preceding siblings ...)
  2013-01-30 18:45 ` [PATCH v3 6/8] upload-pack: optionally allow fetching from the tips of hidden refs Junio C Hamano
@ 2013-01-30 18:45 ` Junio C Hamano
  2013-02-05  9:19   ` Jeff King
  2013-01-30 18:45 ` [PATCH v3 8/8] WIP: receive.allowupdatestohidden Junio C Hamano
  2013-02-05  8:04 ` [PATCH v3 0/8] Hiding refs Michael Haggerty
  8 siblings, 1 reply; 54+ messages in thread
From: Junio C Hamano @ 2013-01-30 18:45 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Shawn Pearce

Teach "git fetch" to accept an exact SHA-1 object name the user may
obtain out of band on the LHS of a pathspec, and send it on a "want"
message when the server side advertises the allow-tip-sha1-in-want
capability.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 fetch-pack.c          | 22 +++++++++++++++++++++-
 remote.c              | 12 +++++++++++-
 remote.h              |  1 +
 t/t5516-fetch-push.sh | 34 ++++++++++++++++++++++++++++++++++
 4 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 915c0b7..70db646 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -36,7 +36,7 @@ static int marked;
 #define MAX_IN_VAIN 256
 
 static struct commit_list *rev_list;
-static int non_common_revs, multi_ack, use_sideband;
+static int non_common_revs, multi_ack, use_sideband, allow_tip_sha1_in_want;
 
 static void rev_list_push(struct commit *commit, int mark)
 {
@@ -563,6 +563,21 @@ static void filter_refs(struct fetch_pack_args *args,
 		}
 	}
 
+	/* Append unmatched requests to the list */
+	if (allow_tip_sha1_in_want) {
+		for (i = 0; i < nr_sought; i++) {
+			ref = sought[i];
+			if (ref->matched)
+				continue;
+			if (get_sha1_hex(ref->name, ref->old_sha1))
+				continue;
+
+			ref->matched = 1;
+			*newtail = ref;
+			ref->next = NULL;
+			newtail = &ref->next;
+		}
+	}
 	*refs = newlist;
 }
 
@@ -803,6 +818,11 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 			fprintf(stderr, "Server supports side-band\n");
 		use_sideband = 1;
 	}
+	if (server_supports("allow-tip-sha1-in-want")) {
+		if (args->verbose)
+			fprintf(stderr, "Server supports allow-tip-sha1-in-want\n");
+		allow_tip_sha1_in_want = 1;
+	}
 	if (!server_supports("thin-pack"))
 		args->use_thin_pack = 0;
 	if (!server_supports("no-progress"))
diff --git a/remote.c b/remote.c
index 1b7828d..1118d05 100644
--- a/remote.c
+++ b/remote.c
@@ -15,6 +15,7 @@ static struct refspec s_tag_refspec = {
 	0,
 	1,
 	0,
+	0,
 	"refs/tags/*",
 	"refs/tags/*"
 };
@@ -565,9 +566,13 @@ static struct refspec *parse_refspec_internal(int nr_refspec, const char **refsp
 		flags = REFNAME_ALLOW_ONELEVEL | (is_glob ? REFNAME_REFSPEC_PATTERN : 0);
 
 		if (fetch) {
+			unsigned char unused[40];
+
 			/* LHS */
 			if (!*rs[i].src)
 				; /* empty is ok; it means "HEAD" */
+			else if (llen == 40 && !get_sha1_hex(rs[i].src, unused))
+				rs[i].exact_sha1 = 1; /* ok */
 			else if (!check_refname_format(rs[i].src, flags))
 				; /* valid looking ref is ok */
 			else
@@ -1495,7 +1500,12 @@ int get_fetch_map(const struct ref *remote_refs,
 	} else {
 		const char *name = refspec->src[0] ? refspec->src : "HEAD";
 
-		ref_map = get_remote_ref(remote_refs, name);
+		if (refspec->exact_sha1) {
+			ref_map = alloc_ref(name);
+			get_sha1_hex(name, ref_map->old_sha1);
+		} else {
+			ref_map = get_remote_ref(remote_refs, name);
+		}
 		if (!missing_ok && !ref_map)
 			die("Couldn't find remote ref %s", name);
 		if (ref_map) {
diff --git a/remote.h b/remote.h
index 251d8fd..f7b08f1 100644
--- a/remote.h
+++ b/remote.h
@@ -62,6 +62,7 @@ struct refspec {
 	unsigned force : 1;
 	unsigned pattern : 1;
 	unsigned matching : 1;
+	unsigned exact_sha1 : 1;
 
 	char *src;
 	char *dst;
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 852efb6..522056f 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -1037,6 +1037,40 @@ test_expect_success 'push --prune refspec' '
 	! check_push_result $the_first_commit tmp/foo tmp/bar
 '
 
+test_expect_success 'fetch exact SHA1' '
+	mk_test heads/master hidden/one &&
+	git push testrepo master:refs/hidden/one &&
+	(
+		cd testrepo &&
+		git config transfer.hiderefs refs/hidden
+	) &&
+	check_push_result $the_commit hidden/one &&
+
+	mk_child child &&
+	(
+		cd child &&
+
+		# make sure $the_commit does not exist here
+		git repack -a -d &&
+		git prune &&
+		test_must_fail git cat-file -t $the_commit &&
+
+		# fetching the hidden object should fail by default
+		test_must_fail git fetch -v ../testrepo $the_commit:refs/heads/copy &&
+		test_must_fail git rev-parse --verify refs/heads/copy &&
+
+		# the server side can allow it to succeed
+		(
+			cd ../testrepo &&
+			git config uploadpack.allowtipsha1inwant true
+		) &&
+
+		git fetch -v ../testrepo $the_commit:refs/heads/copy &&
+		result=$(git rev-parse --verify refs/heads/copy) &&
+		test "$the_commit" = "$result"
+	)
+'
+
 test_expect_success 'push to update a hidden ref' '
 	mk_test heads/master hidden/one hidden/two hidden/three &&
 	(
-- 
1.8.1.2.589.ga9b91ac

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH v3 8/8] WIP: receive.allowupdatestohidden
  2013-01-30 18:45 [PATCH v3 0/8] Hiding refs Junio C Hamano
                   ` (6 preceding siblings ...)
  2013-01-30 18:45 ` [PATCH v3 7/8] fetch: fetch objects by their exact SHA-1 object names Junio C Hamano
@ 2013-01-30 18:45 ` Junio C Hamano
  2013-02-05  8:04 ` [PATCH v3 0/8] Hiding refs Michael Haggerty
  8 siblings, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-01-30 18:45 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Shawn Pearce

This does not work yet, and for a good reason.  As the side that
pushes to a hidden ref never sees that the ref already exists, a
request to update such a ref will come in the form of "please
_create_ this ref and point it at this object", which will not pass
the compare-and-swap based anti-race safety at the receiving end.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt |  6 ++++++
 builtin/receive-pack.c   |  9 ++++++++-
 t/t5516-fetch-push.sh    | 24 ++++++++++++++++++++++++
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2dce021..8f13fc0 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1849,6 +1849,12 @@ receive.updateserverinfo::
 	If set to true, git-receive-pack will run git-update-server-info
 	after receiving data from git-push and updating refs.
 
+receive.allowupdatestohidden::
+	When `transfer.hiderefs` is in effect, allow `receive-pack`
+	to accept a push request that asks to update or delete a
+	hidden ref (by default, such a request is rejected).
+	see also `transfer.hiderefs`.
+
 remote.<name>.url::
 	The URL of a remote repository.  See linkgit:git-fetch[1] or
 	linkgit:git-push[1].
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index a8248d9..88500e7 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -41,6 +41,7 @@ static int auto_gc = 1;
 static const char *head_name;
 static void *head_name_to_free;
 static int sent_capabilities;
+static int allow_updates_to_hidden;
 
 static enum deny_action parse_deny_action(const char *var, const char *value)
 {
@@ -119,6 +120,11 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.allowupdatestohidden") == 0) {
+		allow_updates_to_hidden = git_config_bool(var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
@@ -726,7 +732,8 @@ static void execute_commands(struct command *commands, const char *unpacker_erro
 				       0, &cmd))
 		set_connectivity_errors(commands);
 
-	reject_updates_to_hidden(commands);
+	if (!allow_updates_to_hidden)
+		reject_updates_to_hidden(commands);
 
 	if (run_receive_hook(commands, pre_receive_hook, 0)) {
 		for (cmd = commands; cmd; cmd = cmd->next) {
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 522056f..4c8aef9 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -1095,4 +1095,28 @@ test_expect_success 'push to update a hidden ref' '
 	check_push_result $the_first_commit hidden/three
 '
 
+test_expect_failure 'allow push to update a hidden ref' '
+	mk_test heads/master hidden/one hidden/two hidden/three &&
+	(
+		cd testrepo &&
+		git config transfer.hiderefs refs/hidden &&
+		git config receive.allowupdatestohidden yes
+	) &&
+
+	# push to unhidden ref succeeds normally
+	git push testrepo master:refs/heads/master &&
+	check_push_result $the_commit heads/master &&
+
+	# push to update a hidden ref should succeed
+	git push testrepo master:refs/hidden/one &&
+	check_push_result $the_commit heads/master &&
+
+	# push to delete a hidden ref should succeed
+	git push testrepo :refs/hidden/two &&
+	(
+		cd testrepo &&
+		test_must_fail git show-ref -q refs/hidden/one
+	)
+'
+
 test_done
-- 
1.8.1.2.589.ga9b91ac

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-01-30 18:45 [PATCH v3 0/8] Hiding refs Junio C Hamano
                   ` (7 preceding siblings ...)
  2013-01-30 18:45 ` [PATCH v3 8/8] WIP: receive.allowupdatestohidden Junio C Hamano
@ 2013-02-05  8:04 ` Michael Haggerty
  2013-02-05  8:33   ` Jonathan Nieder
  2013-02-05 17:27   ` Junio C Hamano
  8 siblings, 2 replies; 54+ messages in thread
From: Michael Haggerty @ 2013-02-05  8:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Shawn Pearce

On 01/30/2013 07:45 PM, Junio C Hamano wrote:
> The third round.
> 
>  - Multi-valued variable transfer.hiderefs lists prefixes of ref
>    hierarchies to be hidden from the requests coming over the
>    network.
> 
>  - A configuration optionally allows uploadpack to accept fetch
>    requests for an object at the tip of a hidden ref.
> 
> Elsewhere, we discussed "delaying ref advertisement" (aka "expand
> refs"), but it is an orthogonal feature and this "hiding refs
> completely from advertisement" series does not attempt to address.
> 
> Patch #2 (simplify request validation), #4 (clarify the codeflow),
> and #5 (use struct ref) are new.  The are all long overdue clean-ups
> for these codepaths.
> 
> The last patch is an illustration why it wouldn't make sense to
> optionally allow pushing into hidden refs, and not meant to be part
> of the series proper.
> 
> For those who missed it, earlier rounds are at:
> 
>     http://thread.gmane.org/gmane.comp.version-control.git/213951
>     http://thread.gmane.org/gmane.comp.version-control.git/214888

I would again like to express my discomfort about this feature, which is
already listed as "will merge to next".  Frankly, I have the feeling
that this feature is being steamrolled in before a community consensus
has been reached and indeed before many valid points raised by other
members of the community have even been addressed.  For example:

* I didn't see a response to Peff's convincing arguments that this
should be a client-side feature rather than a server-side feature [1].

* I didn't see an answer to Duy's question [2] about what is different
between the proposed feature and gitnamespaces.

* I didn't see a response to my worries that this feature could be
abused [3].

I also think that the feature is poorly designed.  For example:

* Why should a repository have exactly one setting for what refs should
be hidden?  Wouldn't it make more sense to allow multiple "views" to be
defined?:

[view "official"]
	hiderefs = refs/pull
	hiderefs = refs/heads/??/*
[view "pu"]
	hiderefs = refs/pull
[view "current"]
	hiderefs = refs/tags/releases

with the view perhaps selected via a server-side environment variable?
This would allow multiple views to be published via different URLs but
referring to the same git repository.

* Is it enough to support only reference exclusion (as opposed to
exclusion and inclusion rules)?  Is it enough to support only reference
selection by hierarchy (for example, how would you hide contributed
branches from your repo)?  Can your configuration scheme be expanded in
a backwards-compatible way if these or other extensions are added later?

* Why should this feature only be available remotely?  It would be handy
to clone everything but usually only see some subset of references in my
daily work: "GIT_VIEW=official gitk --all &".  Or to hide some remote
branches most of the time without having to remove them from my repo:

[view "brief"]
	refs = refs
	refs = !refs/remotes
	refs = refs/remotes/origin
	refs = refs/remotes/my-boss

I think there are still more questions than answers about this feature
and FWIW vote -1 on merging it to next at this time.

Michael

[1] http://article.gmane.org/gmane.comp.version-control.git/214168
[2] http://article.gmane.org/gmane.comp.version-control.git/214070
[3] http://article.gmane.org/gmane.comp.version-control.git/213957

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-05  8:04 ` [PATCH v3 0/8] Hiding refs Michael Haggerty
@ 2013-02-05  8:33   ` Jonathan Nieder
  2013-02-05 10:29     ` Michael Haggerty
  2013-02-05 17:36     ` Junio C Hamano
  2013-02-05 17:27   ` Junio C Hamano
  1 sibling, 2 replies; 54+ messages in thread
From: Jonathan Nieder @ 2013-02-05  8:33 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, git, Jeff King, Shawn Pearce

Hi Michael,

Michael Haggerty wrote:

> I would again like to express my discomfort about this feature, which is
> already listed as "will merge to next".  Frankly, I have the feeling
> that this feature is being steamrolled in before a community consensus
> has been reached and indeed before many valid points raised by other
> members of the community have even been addressed.  For example:

In $dayjob I work with Gerrit, so I think I can start to answer some
of these questions.

> * I didn't see a response to Peff's convincing arguments that this
> should be a client-side feature rather than a server-side feature [1].

The client can't control the size of the ref advertisement.  That is
the main motivation if I understood correctly.

> * I didn't see an answer to Duy's question [2] about what is different
> between the proposed feature and gitnamespaces.

Namespaces are more complicated and don't sit well in existing setups
involving git repositories whose refs are not namespaced.

> * I didn't see a response to my worries that this feature could be
> abused [3].

Can you elaborate?  Do you mean that through social engineering an
attacker would convince the server admin to store secrets using a
hidden ref and enable the upload-archive service?

That does sound like a reasonable concern.  Perhaps the documentation
should be updated along these lines

	transfer.hiderefs::
		String(s) `upload-pack` and `receive-pack` use to decide
		which refs to omit from their initial advertisement.  Use
		more than one transfer.hiderefs configuration variables to
		specify multiple prefix strings. A ref that are under the
		hierarchies listed on the value of this variable is excluded,
		and is hidden from `git ls-remote`, `git fetch`, `git push :`,
		etc.  An attempt to update or delete a hidden ref by `git push`
		is rejected, and an attempt to fetch a hidden ref by `git fetch`
		will fail.
	+
	This setting does not currently affect the `upload-archive` service.

until someone interested implements the same for upload-archive.

> I also think that the feature is poorly designed.  For example:

That's another reasonable concern.  It's very hard to get a design
correct right away, which is presumably part of the motivation of
getting this into the hands of interested users who can give feedback
on it.  What would potentially be worth blocking even that is concerns
about the wire protocol, since it is hard to take back mistakes there.

> * Why should a repository have exactly one setting for what refs should
> be hidden?  Wouldn't it make more sense to allow multiple "views" to be
> defined?:

How do I request a different view of the repository at
/path/to/repo.git over the network?  How can we make the common case
of only one view easy to achieve?  Isn't the multiple-views case
exactly what gitnamespaces is for?

[...]
> * Is it enough to support only reference exclusion (as opposed to
> exclusion and inclusion rules)?

The motivating example is turning off advertisement of the
refs/changes hierarchy.  If and when more complicated cases come up,
that would presumably be the time to support more complicated
configuration.

[...]
> * Why should this feature only be available remotely?

It is about transport.  Ref namespaces have their own set of use cases
and are a distinct feature.

Hoping that clarifies,
Jonathan

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies
  2013-01-30 18:45 ` [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies Junio C Hamano
@ 2013-02-05  8:50   ` Jeff King
  2013-02-05 15:45     ` Junio C Hamano
  0 siblings, 1 reply; 54+ messages in thread
From: Jeff King @ 2013-02-05  8:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Shawn Pearce

On Wed, Jan 30, 2013 at 10:45:37AM -0800, Junio C Hamano wrote:

> Teach upload-pack and receive-pack to omit some refs from their
> initial advertisements by paying attention to the transfer.hiderefs
> multi-valued configuration variable.  Any ref that is under the
> hierarchies listed on the value of this variable is excluded from
> responses to requests made by "ls-remote", "fetch", "clone", "push",
> etc.
> 
> A typical use case may be
> 
> 	[transfer]
> 		hiderefs = refs/pull
> 
> to hide the refs that are internally used by the hosting site and
> should not be exposed over the network.

In the earlier review, I mentioned making this per-service, but I see
that is not the case here. Do you have an argument against doing so?

I'm specifically thinking of the way we do refs/pull at GitHub (which we
hide only from receive-pack).  I know that you think it would be cleaner
to hide those, and at some level I agree. But at the same time, the
current mechanism has been in place for some time; changing what we
present via upload-pack is likely to break people's workflows. And I
have not seen complaints about the current system. So unless there is a
compelling reason to do so, I'd rather let the fetcher make the
decision.

Gerrit's refs/changes may be a different story, if they have a large
enough number of them to make upload-pack's ref advertisement
overwhelming.

I'm happy to do the per-service patch on top, but I just expected it
here, so I'm wondering if you are against having the feature.

-Peff

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 7/8] fetch: fetch objects by their exact SHA-1 object names
  2013-01-30 18:45 ` [PATCH v3 7/8] fetch: fetch objects by their exact SHA-1 object names Junio C Hamano
@ 2013-02-05  9:19   ` Jeff King
  2013-02-05 11:18     ` Jeff King
  2013-02-05 15:55     ` Junio C Hamano
  0 siblings, 2 replies; 54+ messages in thread
From: Jeff King @ 2013-02-05  9:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Shawn Pearce

On Wed, Jan 30, 2013 at 10:45:41AM -0800, Junio C Hamano wrote:

> Teach "git fetch" to accept an exact SHA-1 object name the user may
> obtain out of band on the LHS of a pathspec, and send it on a "want"
> message when the server side advertises the allow-tip-sha1-in-want
> capability.

Hmm. The UI on this is a little less nice than I would have hoped. Right
now if you want a ref outside of refs/heads, it's up to you to configure
a refspec or do a one-off fetch of the ref:

  git config --add remote.origin.fetch '+refs/pull/*:refs/pull/*'
  git fetch
  git checkout refs/pull/123/head
  ... inspect the contents ...

Without advertisement, we have to learn that refs/pull/123/head exists
out of band. We can no longer fetch all of the refs/pull hierarchy
preemptively, but we can in theory grab at least that one ref like this:

  git fetch refs/pull/123/head
  git checkout FETCH_HEAD
  ... inspect the contents ...

But that does not work with your patch; instead you have to learn not
just the existence of the ref, but also its sha1. This may seem like a
little thing, since you are already learning of the ref out-of-band,
but:

  1. The full sha1 is more annoying to work with. You'd have to cut and
     paste or otherwise script getting it to fetch.  A human-readable
     ref, though, is much easier to remember. The "refs/pull/N/head"
     pattern is simple to learn and type.

  2. Related to (1) above, is that it may be easier to come up with a
     hidden ref name out of band than the full sha1. E.g., if I am
     looking at https://github.com/me/foo.git/pulls/123, I can easily
     construct the ref from that. Getting the sha1 will take extra
     steps.

  3. You have to do the out-of-band step, which may be inconvenient,
     every time the ref is updated. There is no way to say "just give me
     what is at the tip of refs/pull/123/head".

I think you could solve it by teaching upload-pack to understand refs on
"want" lines and convert them into the pointed-to object.

But taking a step back, this really seems quite inferior to an extension
that would allow the client to share its refspecs with the server. That
would solve the bandwidth efficiency problem for normal fetchers who are
looking at "refs/heads/*", while still giving people who are interested
in "refs/pull/*" (or even a specific refs/pull tip) the information they
need to fetch.

The obvious problem is that the server speaks first. But I recall
somebody suggested a combination of:

  1. For git-over-ssh and git-over-tcp, the server advertises
     tell-me-your-refspecs as it starts advertising.  Client interrupts
     advertisement with refspecs once it sees that it is OK to do so.

     We waste some bandwidth during the round-trip, but there will still
     be a benefit for repos with many refs (I wonder if we could even
     re-order the advertisement to show refs/heads/ first, as they are
     the most likely case to be requested). And as time goes on and the
     majority of clients support tell-me-your-refspecs, the server side
     can introduce a short delay after the first advertisement.

  2. For git-over-http, the client speaks first via the http protocol.
     We can stuff the refspecs into extra query parameters.

It's a little more complicated as a solution, but I feel like it gets
the efficiency without a loss of functionality. And it helps in more
situations than the hidden refs proposal (e.g., fetching refs/heads/foo
can avoid enumerating all of refs/heads/*).

-Peff

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-05  8:33   ` Jonathan Nieder
@ 2013-02-05 10:29     ` Michael Haggerty
  2013-02-05 17:38       ` Junio C Hamano
  2013-02-06 10:34       ` Duy Nguyen
  2013-02-05 17:36     ` Junio C Hamano
  1 sibling, 2 replies; 54+ messages in thread
From: Michael Haggerty @ 2013-02-05 10:29 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, git, Jeff King, Shawn Pearce

On 02/05/2013 09:33 AM, Jonathan Nieder wrote:
> Michael Haggerty wrote:
> 
>> I would again like to express my discomfort about this feature, which is
>> already listed as "will merge to next".  Frankly, I have the feeling
>> that this feature is being steamrolled in before a community consensus
>> has been reached and indeed before many valid points raised by other
>> members of the community have even been addressed.  For example:
> 
> In $dayjob I work with Gerrit, so I think I can start to answer some
> of these questions.
> 
>> * I didn't see a response to Peff's convincing arguments that this
>> should be a client-side feature rather than a server-side feature [1].
> 
> The client can't control the size of the ref advertisement.  That is
> the main motivation if I understood correctly.

Not according to Junio [4]:

  Look at this as a mechanism for the repository owner to control the
  clutter in what is shown to the intended audience of what s/he
  publishes in the repository.  Network bandwidth reduction of
  advertisement is a side effect of clutter reduction, and not
  necessarily the primary goal.

>> * I didn't see an answer to Duy's question [2] about what is different
>> between the proposed feature and gitnamespaces.
> 
> Namespaces are more complicated and don't sit well in existing setups
> involving git repositories whose refs are not namespaced.

Thanks.

>> * I didn't see a response to my worries that this feature could be
>> abused [3].
> 
> Can you elaborate?  Do you mean that through social engineering an
> attacker would convince the server admin to store secrets using a
> hidden ref and enable the upload-archive service?

The upload-archive service is not needed; after patch v3 "7/8" remote
clients are able to retrieve hidden content by SHA1.

Hiderefs creates a "dark" corner of a remote git repo that can hold
arbitrary content that is impossible for anybody to discover but
nevertheless possible for anybody to download (if they know the name of
a hidden reference).  In earlier versions of the patch series I believe
that it was possible to push to a hidden reference hierarchy, which made
it possible to upload dark content.  The new version appears (from the
code) to prohibit adding references in a hidden hierarchy, which would
close the main loophole that I was worried about.  But the documentation
and the unit tests only explicitly say that updates and deletes are
prohibited; nothing is said about adding references (unless "update" is
understood to include "add").  I think the true behavior should be
clarified and tested.

I was worried that somehow this "dark" content could be used for
malicious purposes; for example, pushing compromised code then
convincing somebody to download it by SHA1 with the implicit argument
"it's safe since it comes directly from the project's official
repository".  If it is indeed impossible to populate the dark namespace
remotely then I can't think of a way to exploit it.

> That does sound like a reasonable concern.  Perhaps the documentation
> should be updated along these lines
> 
> 	transfer.hiderefs::
> 		String(s) `upload-pack` and `receive-pack` use to decide
> 		which refs to omit from their initial advertisement.  Use
> 		more than one transfer.hiderefs configuration variables to
> 		specify multiple prefix strings. A ref that are under the
> 		hierarchies listed on the value of this variable is excluded,
> 		and is hidden from `git ls-remote`, `git fetch`, `git push :`,
> 		etc.  An attempt to update or delete a hidden ref by `git push`
> 		is rejected, and an attempt to fetch a hidden ref by `git fetch`
> 		will fail.
> 	+
> 	This setting does not currently affect the `upload-archive` service.
> 
> until someone interested implements the same for upload-archive.

Yes, this sounds reasonable.

>> I also think that the feature is poorly designed.  For example:
> 
> That's another reasonable concern.  It's very hard to get a design
> correct right away, which is presumably part of the motivation of
> getting this into the hands of interested users who can give feedback
> on it.  What would potentially be worth blocking even that is concerns
> about the wire protocol, since it is hard to take back mistakes there.
> 
>> * Why should a repository have exactly one setting for what refs should
>> be hidden?  Wouldn't it make more sense to allow multiple "views" to be
>> defined?:
> 
> How do I request a different view of the repository at
> /path/to/repo.git over the network?  How can we make the common case
> of only one view easy to achieve?  Isn't the multiple-views case
> exactly what gitnamespaces is for?

Hidden references can only be configured by somebody with local access
to the repository being served.  Somebody with that access could also
configure views.  He could also probably organize a mapping from URL ->
(GIT_PATH, GIT_VIEW) and offer several different views of the same
repository.  If the setting of environment variables is thought to
sometimes be too problematic, the default view could be defined via
configuration, too:

[view]
	default = official
[view "official"]
	hiderefs = refs/pull

Or perhaps, if views are made available locally too, then one view could
be designated as the default for transfer purposes:

[transfer]
	defaultview = official

gitnamespaces have the disadvantage that you mentioned yourself earlier.

> [...]
>> * Is it enough to support only reference exclusion (as opposed to
>> exclusion and inclusion rules)?
> 
> The motivating example is turning off advertisement of the
> refs/changes hierarchy.  If and when more complicated cases come up,
> that would presumably be the time to support more complicated
> configuration.

I suppose should the need arise we could later introduce
"transfer.showrefs" and let the longest match to a given reference
(i.e., "transfer.hiderefs" vs. "transfer.showrefs") win.

> [...]
>> * Why should this feature only be available remotely?
> 
> It is about transport.  Ref namespaces have their own set of use cases
> and are a distinct feature.

gitnamespaces offer multiple perspectives of a single object database,
using independent sets of references to define the subset.  Changing a
reference in one namespace has no effect on similarly-named references
in another namespace.

The "views" feature that I vaguely suggested, on the other hand, would
offer multiple subsets of a single set of references, a bit like an SQL
view.  Changing a reference in one view would automatically affect it in
other views that include the reference.  It could be a handy way to
reduce clutter *locally* in the same way that hiderefs would reduce
clutter *remotely*.  But more importantly, both features could be built
on the same foundation.

Michael

[4] http://article.gmane.org/gmane.comp.version-control.git/213984

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 7/8] fetch: fetch objects by their exact SHA-1 object names
  2013-02-05  9:19   ` Jeff King
@ 2013-02-05 11:18     ` Jeff King
  2013-02-05 15:55     ` Junio C Hamano
  1 sibling, 0 replies; 54+ messages in thread
From: Jeff King @ 2013-02-05 11:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Shawn Pearce

On Tue, Feb 05, 2013 at 04:19:38AM -0500, Jeff King wrote:

> But taking a step back, this really seems quite inferior to an
> extension that would allow the client to share its refspecs with the
> server.  That would solve the bandwidth efficiency problem for normal
> fetchers who are

I should have read your cover letter more closely, as I see you make the
point there that this is no longer about the efficiency, but about
uncluttering.

I'm not sure I like it as an uncluttering tool, though; for the reasons
I stated in my previous mail, it makes it much more awkward for the
moments when you actually do want to fetch some of the "clutter".

-Peff

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies
  2013-02-05  8:50   ` Jeff King
@ 2013-02-05 15:45     ` Junio C Hamano
  2013-02-06 11:31       ` Jeff King
  0 siblings, 1 reply; 54+ messages in thread
From: Junio C Hamano @ 2013-02-05 15:45 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Shawn Pearce

Jeff King <peff@peff.net> writes:

> On Wed, Jan 30, 2013 at 10:45:37AM -0800, Junio C Hamano wrote:
>
>> Teach upload-pack and receive-pack to omit some refs from their
>> initial advertisements by paying attention to the transfer.hiderefs
>> multi-valued configuration variable.  Any ref that is under the
>> hierarchies listed on the value of this variable is excluded from
>> responses to requests made by "ls-remote", "fetch", "clone", "push",
>> etc.
>> 
>> A typical use case may be
>> 
>> 	[transfer]
>> 		hiderefs = refs/pull
>> 
>> to hide the refs that are internally used by the hosting site and
>> should not be exposed over the network.
>
> In the earlier review, I mentioned making this per-service, but I see
> that is not the case here. Do you have an argument against doing so?

Perhaps then I misunderstood your intention.  By reminding me of the
receive-pack side, I thought you were hinting to unify these two
into one, which I did.  There is no argument against it.

> And I
> have not seen complaints about the current system.

Immediately after I added github to the set of places I push into,
which I think is long before you joined GitHub, I noticed that _my_
repository gets contaminated by second rate commits called pull
requests, and I may even have complained, but most likely I didn't,
as I could easily tell that, even though I know it is _not_ the only
way, nor even the best way [*1*], to implement the GitHub's pull
request workflow, I perfectly well understood that it would be the
most expedient way for GitHub folks to implement this feature.

I think you should take lack of complaints with a huge grain of
salt.  It does not suggest much.

> Gerrit's refs/changes may be a different story, if they have a large
> enough number of them to make upload-pack's ref advertisement
> overwhelming.

This is probably a stale count, but platform/frameworks/base part of
AOSP has 3200+ refs; the corresponding repository internal to Google
has 60k+ refs (this is because there are many in-between states
recorded in the internal repository, even though the end result
published to the open source repository may be the same) and results
in ~4MB advertisement.  Which is fairly significant when all you are
interested in doing is an "Am I up to date?" poll.


[Footnote]

*1* From the ownership point of view, objects that are only
reachable from these refs/pull/* refs do *not* belong to the
requestee, until the requestee chooses to accept the changes.

A malicious requestor can fork your repository, add an objectionable
blob to it, and throw a pull request at you.  GitHub shows that the
blob now belongs to your repository, so the requestor turns around
and file a DMCA takedown complaint against your repository.  A
clueful judge would then agree with the complaint after running a
"clone --mirror" and seeing the blob in your repository.  Oops?

A funny thing is that you cannot "push :refs/pull/1/head" to remove
it anymore (I think in the early days, I took them out by doing this
a few times, but I may be misremembering), so you cannot make
yourself into compliance, even though you are not the offending
party.  Your repository is held responsible for whatever the rogue
requestor added.  That is not very nice, is it?

In an ideal world, I would have chosen to create a dedicated fork
managed by the hosting company (i.e. GitHub) for your repository
whose only purpose is to house these refs/pull/ refs (the hosting
site is ultimately who has to respond to DMCA notices anyway, and an
arrangement like this makes it clear who is reponsible for what).

The e-mail sent to you to let you know about outstanding pull
requests and the web UI could just point at that forked repository,
not your own (you also could choose to leave the outging pull
requests in the requestor's repository, but that is only OK if you
do not worry about (1) a requestor sending a pull request, then
updating the branch the pull request talks about later, to trick you
with bait-and-switch, or (2) a requestor sending a pull request,
thinks he is done with the topic and removes the repository).

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 7/8] fetch: fetch objects by their exact SHA-1 object names
  2013-02-05  9:19   ` Jeff King
  2013-02-05 11:18     ` Jeff King
@ 2013-02-05 15:55     ` Junio C Hamano
  1 sibling, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-02-05 15:55 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Shawn Pearce

Jeff King <peff@peff.net> writes:

> On Wed, Jan 30, 2013 at 10:45:41AM -0800, Junio C Hamano wrote:
>
>> Teach "git fetch" to accept an exact SHA-1 object name the user may
>> obtain out of band on the LHS of a pathspec, and send it on a "want"
>> message when the server side advertises the allow-tip-sha1-in-want
>> capability.
>
> Hmm. The UI on this is a little less nice than I would have hoped.

Naming with unadvertised *refname*, not object name, needs protocol
extension for the serving side to expand the name to object name;
otherwise the receiving end wouldn't know what tip what it asked
resulted in.

And that belongs to a separate "expand refs" extension (aka
"delaying ref advertisement") that is outside the scope of this
series but can be built on top, as I said in the cover letter.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-05  8:04 ` [PATCH v3 0/8] Hiding refs Michael Haggerty
  2013-02-05  8:33   ` Jonathan Nieder
@ 2013-02-05 17:27   ` Junio C Hamano
  2013-02-06 10:17     ` Michael Haggerty
  1 sibling, 1 reply; 54+ messages in thread
From: Junio C Hamano @ 2013-02-05 17:27 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git, Jeff King, Shawn Pearce

Michael Haggerty <mhagger@alum.mit.edu> writes:

> I would again like to express my discomfort about this feature, which is
> already listed as "will merge to next".

Do not take "will merge to next" too literally.  One major purpose
of marking a topic as such is exactly to solicit comments like this
;-)

> * I didn't see a response to Peff's convincing arguments that this
> should be a client-side feature rather than a server-side feature [1].

Uncluttering is not about a choice client should make.  "delayed
advertisement" is an orthogonal issue and requires a larger protocol
update (it needs to make "git fetch" speak first instead of the
current protocol in which "upload-pack" speaks first).

> * I didn't see an answer to Duy's question [2] about what is different
> between the proposed feature and gitnamespaces.

I think Jonathan addressed this already.

> * I didn't see a response to my worries that this feature could be
> abused [3].

You can choose not to advertise allow-tip-sha1-in-want capability; I
do not think it is making things worse than the status quo.

> * Why should a repository have exactly one setting for what refs should
> be hidden?  Wouldn't it make more sense to allow multiple "views" to be
> defined?:

You are welcome to extend to have different views, but how would
your clients express which view they would want?

Giving a single view that the serving end decides gives us an
immediate benefit of showing an uncluttered set of refs of server's
choice, without making the problem space larger than necessary.

> * Is it enough to support only reference exclusion (as opposed to
> exclusion and inclusion rules)?

Again, I do not think you cannot extend it to do positive and
negative filtering "exclude these, but include those even though
they match the 'exclude these' patterns I gave you earlier".

> * Why should this feature only be available remotely?

The whole point is to give the server side a choice to show selected
refs, so that it can use hidden portion for its own use.  These refs
should not be hidden from local operations like "gc".

I appreciate the comments, but I do not think any point you raised
in this message is very much relevant as objections.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-05  8:33   ` Jonathan Nieder
  2013-02-05 10:29     ` Michael Haggerty
@ 2013-02-05 17:36     ` Junio C Hamano
  1 sibling, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-02-05 17:36 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Michael Haggerty, git, Jeff King, Shawn Pearce

Jonathan Nieder <jrnieder@gmail.com> writes:

>> * I didn't see a response to Peff's convincing arguments that this
>> should be a client-side feature rather than a server-side feature [1].
>
> The client can't control the size of the ref advertisement.  That is
> the main motivation if I understood correctly.

The answer to this question is more nuanced.

With the current protocol, it is upload-pack who speaks first, so
there is no way for the requestor to say "I am from an updated Git
suite and understand how to tell you to give me limited set of
refs", before upload-pack blasts 4MB of ref advertisement to it.

If the side that fetches is potentially interested in finding out
any and all refs, then an alternative solution would be to break
the current protocol, open a separate port and have upload-pack-2
listen to it, sit silently to let the requestor speak first when it
gets connection to that port.

But if the primary thing you are interested in is to hide the
references that:

 (1) the server side needs to keep track of for its own use; but
 (2) the requestors do not have to learn about from upload-pack,

we can do so without breaking older requestors.  That is what the
early part of this series is about.  We can view the last patch to
add the allow-tip-sha1-in-want as an icing on the cake.

It has the side effect of reducing the transfer overhead, because by
hiding the internal refs, the server side will stop blasting 4MB of
ref advertisements the requestors are not interested in, and that
would be the primary observable outcome from the end-user's point of
view (i.e. your "git pull --ff-only" will become a lot faster).

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-05 10:29     ` Michael Haggerty
@ 2013-02-05 17:38       ` Junio C Hamano
  2013-02-06 10:34       ` Duy Nguyen
  1 sibling, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-02-05 17:38 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jonathan Nieder, git, Jeff King, Shawn Pearce

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 02/05/2013 09:33 AM, Jonathan Nieder wrote:
>> Michael Haggerty wrote:
>> 
>>> I would again like to express my discomfort about this feature, which is
>>> already listed as "will merge to next".  Frankly, I have the feeling
>>> that this feature is being steamrolled in before a community consensus
>>> has been reached and indeed before many valid points raised by other
>>> members of the community have even been addressed.  For example:
>> 
>> In $dayjob I work with Gerrit, so I think I can start to answer some
>> of these questions.
>> 
>>> * I didn't see a response to Peff's convincing arguments that this
>>> should be a client-side feature rather than a server-side feature [1].
>> 
>> The client can't control the size of the ref advertisement.  That is
>> the main motivation if I understood correctly.
>
> Not according to Junio [4]:
>
>   Look at this as a mechanism for the repository owner to control the
>   clutter in what is shown to the intended audience of what s/he
>   publishes in the repository.  Network bandwidth reduction of
>   advertisement is a side effect of clutter reduction, and not
>   necessarily the primary goal.

See my response to Jonathan.

> Hiderefs creates a "dark" corner of a remote git repo that can hold
> arbitrary content that is impossible for anybody to discover but
> nevertheless possible for anybody to download (if they know the name of
> a hidden reference).

That is why allow-tip-sha1-in-want is a separate opt-in feature only
the server side controls.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-05 17:27   ` Junio C Hamano
@ 2013-02-06 10:17     ` Michael Haggerty
  2013-02-06 19:55       ` Jonathan Nieder
  2013-02-07 15:58       ` Jed Brown
  0 siblings, 2 replies; 54+ messages in thread
From: Michael Haggerty @ 2013-02-06 10:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Shawn Pearce

On 02/05/2013 06:27 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>> I would again like to express my discomfort about this feature, which is
>> already listed as "will merge to next".
> 
> Do not take "will merge to next" too literally.  One major purpose
> of marking a topic as such is exactly to solicit comments like this
> ;-)

I take "will merge to next" pretty seriously, because I know how hard it
is to get *my* patch series to this state :-)

>> * I didn't see a response to Peff's convincing arguments that this
>> should be a client-side feature rather than a server-side feature [1].
> 
> Uncluttering is not about a choice client should make.  "delayed
> advertisement" is an orthogonal issue and requires a larger protocol
> update (it needs to make "git fetch" speak first instead of the
> current protocol in which "upload-pack" speaks first).

There seem to be a few issues mixed up in this topic.  It is hard to
reason about your patch series without understanding which scenarios and
problems it is meant to address.  First the problems that we might like
to solve:

Clutter: The typical user is subjected to much unneeded clutter in the
         form of references that he/she will likely never use.

Bandwidth: Interactions with the remote repo (clone, fetch, etc) are
           slowed down by the large volume of unnecessary data.

Provenance: Users mistakenly think that content originates with the
            repository owner whereas it in fact came from some other
            (perhaps untrusted) source.


Now, what are some use-case scenarios in which these problems arise?  As
I understand it, there are a few:

Scenario 1: Some providers junk up their users' repositories with
content that is not created by the repository's owner and that the owner
doesn't want to appear to vouch for (e.g., GitHub pull requests).  These
references might sometimes be useful to fetch, singly or in bulk.

Scenario 2: Some systems junk up their users' repositories with
additional references that are not interesting to most pullers (e.g.,
Gerrit activity markers) though they don't add questionable content.

Scenario 3: Some repository owners might *themselves* want to push
references to their repository but hide them from most users (e.g.,
Junio's topic branches) or make them completely hidden from the rest of
the world (e.g., proprietary vs. open-source branches).

In most of these cases, it would be desirable for at least some users to
be able to fetch and/or push hidden content.

A first weakness of your proposal is that even though the hidden refs
are (optionally) fetchable, there is *no* way to discover them remotely
or to bulk-download them; they would have to be retrieved one by one
using out-of-band information.  And if I understand correctly, there
would be no way to push hidden references remotely (whether manually or
from some automated process).  Such processes would have to be local to
the machine holding the repository.

A second weakness of your proposal is that the repository owner would
*anyway* need local access to the repo server or the help of the
provider to implement reference hiding (since hidden references cannot
be configured remotely).  Who will choose what references to hide?  Most
likely each provider will pick a one-size-fits-all configuration and
apply it to all of the repos that they manage.  All users would be at
the mercy of their provider to make wise choices and would not be able
to override the choice via their client.

A third weakness of your hidden references proposal is that it is
schizophrenic: some references are hidden and undiscoverable, but their
content can nevertheless be made fetchable if the user happens to know
the SHA1.  This is more complicated to understand and reason about than
the rule "exactly the content that is referred to by published
references is fetchable".

What would be a better way?  Providers could expose multiple views of
the same repository; for example, one view with just the uncluttered
content, and a second view that includes *all* fetchable references.
Accessing the repository via the first view would give all of the
benefits provided by your hidden reference proposal.  Accessing it via
the second view would allow the hidden references to be fetched (even in
bulk) using purely git tools.  The documentation for the second view
could explain that it contains un-vetted content.

But your proposal does not admit two-tiered access to a single
repository.  You only support one hidden reference configuration that is
applied to all remote access [1].  See below for more ideas about
implementing multiple views.

>> * I didn't see a response to my worries that this feature could be
>> abused [3].
> 
> You can choose not to advertise allow-tip-sha1-in-want capability; I
> do not think it is making things worse than the status quo.

Yes, if the feature is turned off then it is not worse than the status
quo.  But what if the feature is turned on?

Actually, I'm still not clear about how these hidden references are
supposed to be created.  I know that you would forbid updating or
deleting hidden references via the remote protocol, but would you allow
them to be created?  If so, then it seems that any pusher can create
dark content.  Or can they only be created via a separate, local channel
to the repository?  In this case, it seems rather limiting that any
process that wants to create hidden references has to be local.

>> * Why should a repository have exactly one setting for what refs should
>> be hidden?  Wouldn't it make more sense to allow multiple "views" to be
>> defined?:
> 
> You are welcome to extend to have different views, but how would
> your clients express which view they would want?

There are several possibilities:

1. Assuming the cooperation of the provider, the provider could offer
two separate URLs: one for the uncluttered view and one for the
cluttered view.  The client would choose the view by choosing which URL
to clone from.  On the provider side, both of these URLs could refer to
the same Git repository but, for example, set an environment variable
GIT_VIEW differently depending on which URL was used.  This approach
would solve clutter, bandwidth and provenance but require cooperation
from the provider.

2a. Assuming no cooperation from the provider, the git client could have
options like "git fetch --view=uncluttered URL".  This would receive all
references from the server but discard any that are not included in the
client's "uncluttered" view definition.  This would solve clutter.

2b. Again assuming no cooperation from the provider, the user could
clone all references from the remote repo, but define a local
"uncluttered" view that hides the extra references on the local side.
The view could be selected by setting a local environment variable
GIT_VIEW or via configuration option "git config view.default
uncluttered".  This would solve clutter in a more flexible way because
the clutter would still be available locally for those occasions when
the user wants to see it.

Please note that none of the above options require a new remote protocol.

If/when a new protocol is implemented, then the client could tell the
server what view it wants and the server would only advertise those refs
to the client:

3a. The client could tell the host what reference namespaces it wants to
fetch.  Its choice would only be used for the single transaction and
would not be recorded on the server side.

3b. The client could pick a server-defined view by name.  The server
would look up the name in its own configuration to translate it into a
subset of references.  The views that a particular server supports would
be documented in the same place that the URL is documented and might
also be queryable by the client.  There should probably be some standard
views like "default" and "full" that every server would be expected to
implement.  Please note that this method can fall back to 2a when
communicating with a server that does not support the new protocol.

>> * Why should this feature only be available remotely?
> 
> The whole point is to give the server side a choice to show selected
> refs, so that it can use hidden portion for its own use.  These refs
> should not be hidden from local operations like "gc".

Certainly they shouldn't be hidden from "gc", but it would be useful to
be able to hide references from user-facing commands like "log --all",
"log --decorate", "gitk", "grep --all" etc.  For example, here are some
more scenarios where clutter is annoying:

Scenario 4: I occasionally share with colleague Foo, so I want to
configure his repo as a remote for mine and fetch his latest work:

    git remote add foo $URL
    git fetch foo

But now every time I do a "gitk --all" or "git log --decorate", the
output is cluttered with all of his references (most of which are just
old versions of references from the upstream repository that we both
use).  I would like to be able to hide his references most of the time
but turn them back on when I need them.

Scenario 5: Our upstream repository has gazillions of release tags under
"refs/tags/releases/...", sometimes including customer-specific
releases.  In my daily life these are just clutter.  (This scenario is
made worse by the fact that AFAIK there is no way to tell Git to fetch
some tags but not others others.)  But sometimes I need to track down a
bug in a particular release and need to access that release tag.  So it
would be nice to be able to hide and unhide them locally.

> I appreciate the comments, but I do not think any point you raised
> in this message is very much relevant as objections.

Tl;dr summary:

* Hidden refs don't give a way to offer two-tiered remote access to a
  repository (e.g., one uncluttered view and one full view), so

  * local access to the repository would (apparently) be required to
    put *anything* in the hidden namespaces.

  * they don't help in any scenario where you *sometimes*
    want to bulk fetch the hidden refs, and even make it awkward to
    fetch single hidden refs.

* Hidden refs introduce a confusing schizophrenia between "advertised"
  and "not advertised but nonetheless fetchable".

* Hidden refs require the cooperation of the provider to configure and
  will therefore be unusable by many repository owners.

* Some small improvements (e.g. allowing *multiple* views to be
  defined) would provide much more benefit for about the same effort,
  and would be a better base for building other features in the future
  (e.g., local views).

Thanks for listening.
Michael

[1] Theoretically one could support multiple views of a single
repository by using something like "GIT_CONFIG=view_1_config git
upload-pack ..." or "git -c transfer.hiderefs=... git upload-pack ...",
but this would be awkward.

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-05 10:29     ` Michael Haggerty
  2013-02-05 17:38       ` Junio C Hamano
@ 2013-02-06 10:34       ` Duy Nguyen
  2013-02-06 19:17         ` Junio C Hamano
  1 sibling, 1 reply; 54+ messages in thread
From: Duy Nguyen @ 2013-02-06 10:34 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Jonathan Nieder, Junio C Hamano, git, Jeff King, Shawn Pearce

On Tue, Feb 5, 2013 at 5:29 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> Hiderefs creates a "dark" corner of a remote git repo that can hold
> arbitrary content that is impossible for anybody to discover but
> nevertheless possible for anybody to download (if they know the name of
> a hidden reference).  In earlier versions of the patch series I believe
> that it was possible to push to a hidden reference hierarchy, which made
> it possible to upload dark content.  The new version appears (from the
> code) to prohibit adding references in a hidden hierarchy, which would
> close the main loophole that I was worried about.  But the documentation
> and the unit tests only explicitly say that updates and deletes are
> prohibited; nothing is said about adding references (unless "update" is
> understood to include "add").  I think the true behavior should be
> clarified and tested.
>
> I was worried that somehow this "dark" content could be used for
> malicious purposes; for example, pushing compromised code then
> convincing somebody to download it by SHA1 with the implicit argument
> "it's safe since it comes directly from the project's official
> repository".  If it is indeed impossible to populate the dark namespace
> remotely then I can't think of a way to exploit it.

Or you can think hiderefs is the first step to addressing the initial
ref advertisment problem. The series says hidden refs are to be
fetched out of band, but that's not the only way. A new extension can
be added to the protocol later to let the client explore this dark
space. It's only truly dark for old clients. We could even shed some
light to old clients by sending a dummy ref with some loud name like
PLEASE_UPDATE_TO_LATEST_GIT_TO_FETCH_REMAINING_REFS (new clients
silently drop this ref)
--
Duy

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies
  2013-02-05 15:45     ` Junio C Hamano
@ 2013-02-06 11:31       ` Jeff King
  2013-02-06 15:57         ` Junio C Hamano
  0 siblings, 1 reply; 54+ messages in thread
From: Jeff King @ 2013-02-06 11:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Shawn Pearce

On Tue, Feb 05, 2013 at 07:45:01AM -0800, Junio C Hamano wrote:

> > In the earlier review, I mentioned making this per-service, but I see
> > that is not the case here. Do you have an argument against doing so?
> 
> Perhaps then I misunderstood your intention.  By reminding me of the
> receive-pack side, I thought you were hinting to unify these two
> into one, which I did.  There is no argument against it.

What I meant was that there should be transfer.hiderefs, and an
individual {receive,uploadpack}.hiderefs, similar to the way we have
transfer.unpacklimit. That makes the easy case (hiding the refs
completely) easy, but leaves the flexibility for more.

Like this:

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index a8248d9..131c163 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -59,7 +59,7 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 
 static int receive_pack_config(const char *var, const char *value, void *cb)
 {
-	int status = parse_hide_refs_config(var, value, cb);
+	int status = parse_hide_refs_config(var, value, "receive");
 
 	if (status)
 		return status;
diff --git a/refs.c b/refs.c
index e3574ca..9bfea58 100644
--- a/refs.c
+++ b/refs.c
@@ -2560,9 +2560,13 @@ int parse_hide_refs_config(const char *var, const char *value, void *unused)
 
 static struct string_list *hide_refs;
 
-int parse_hide_refs_config(const char *var, const char *value, void *unused)
+int parse_hide_refs_config(const char *var, const char *value, void *vsection)
 {
-	if (!strcmp("transfer.hiderefs", var)) {
+	const char *section = vsection;
+
+	if (!strcmp("transfer.hiderefs", var) ||
+	    (!prefixcmp(var, section) &&
+	     !strcmp(var + strlen(section), ".hiderefs"))) {
 		char *ref;
 		int len;
 
diff --git a/upload-pack.c b/upload-pack.c
index 37977e2..c0390af 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -794,7 +794,7 @@ static int upload_pack_config(const char *var, const char *value, void *unused)
 {
 	if (!strcmp("uploadpack.allowtipsha1inwant", var))
 		allow_tip_sha1_in_want = git_config_bool(var, value);
-	return parse_hide_refs_config(var, value, unused);
+	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
 int main(int argc, char **argv)


As an aside, I wonder if there is any point to the void pointer
parameter of parse_hide_refs_config. It is not used as a git_config
callback anywhere.

> > And I
> > have not seen complaints about the current system.
> 
> Immediately after I added github to the set of places I push into,
> which I think is long before you joined GitHub, I noticed that _my_
> repository gets contaminated by second rate commits called pull
> requests, and I may even have complained, but most likely I didn't,
> as I could easily tell that, even though I know it is _not_ the only
> way, nor even the best way [*1*], to implement the GitHub's pull
> request workflow, I perfectly well understood that it would be the
> most expedient way for GitHub folks to implement this feature.
> 
> I think you should take lack of complaints with a huge grain of
> salt.  It does not suggest much.

Sure, I do not pretend that nobody cares. But it is certainly not a
pressing issue, or there would probably be more outcry. And we must also
weigh it against the silent majority that are perfectly happy with the
status quo, that lets them fetch refs/pull/* as any other ref.

In your case, I really think the problem is less that you have a problem
with PR refs in the repository, and more that you do not care about the
pull request feature at all. To you it is useless noise, both in the
repo and on the web. Your arguments about provenance could apply equally
well to PRs accessible via the web interface.

I think the refs/ clutter is only an issue if you want to do mirroring,
and then you have an obvious conflict: did the fetcher want to mirror
everything, including refs/pull, or do they consider that to be clutter?
Only the client knows, which is why I think refspecs are the right place
to deal with clutter (the fact that we cannot say "everything except
refs/pull/*" is a weakness in our refspecs).

> *1* From the ownership point of view, objects that are only
> reachable from these refs/pull/* refs do *not* belong to the
> requestee, until the requestee chooses to accept the changes.
> 
> A malicious requestor can fork your repository, add an objectionable
> blob to it, and throw a pull request at you.  GitHub shows that the
> blob now belongs to your repository, so the requestor turns around
> and file a DMCA takedown complaint against your repository.  A
> clueful judge would then agree with the complaint after running a
> "clone --mirror" and seeing the blob in your repository.  Oops?

I don't think this is a problem in practice. DMCA notices do not go to
the repository owner; they go to GitHub. And as far as I know, our
support staff deals with them on a case by case basis (and knows what a
pull request is, and who is responsible for the content in question). It
is not like they see a report of something in refs/pull and lock down
the parent repository; they can see where the request came from and deal
with it appropriately.

But again, such a notice could just as easily come from the list of open
PRs against your repo in the web interface.

> A funny thing is that you cannot "push :refs/pull/1/head" to remove
> it anymore (I think in the early days, I took them out by doing this
> a few times, but I may be misremembering),

We block updates to them explicitly in a hook; it looks like that went
in around mid-2011.

> The e-mail sent to you to let you know about outstanding pull
> requests and the web UI could just point at that forked repository,
> not your own (you also could choose to leave the outging pull
> requests in the requestor's repository, but that is only OK if you
> do not worry about (1) a requestor sending a pull request, then
> updating the branch the pull request talks about later, to trick you
> with bait-and-switch, or (2) a requestor sending a pull request,
> thinks he is done with the topic and removes the repository).

Yes, point (2) is the main reason they are not simply attached to the
sender's repository.

-Peff

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies
  2013-02-06 11:31       ` Jeff King
@ 2013-02-06 15:57         ` Junio C Hamano
  0 siblings, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-02-06 15:57 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Shawn Pearce

Jeff King <peff@peff.net> writes:

> On Tue, Feb 05, 2013 at 07:45:01AM -0800, Junio C Hamano wrote:
>
>> > In the earlier review, I mentioned making this per-service, but I see
>> > that is not the case here. Do you have an argument against doing so?
>> 
>> Perhaps then I misunderstood your intention.  By reminding me of the
>> receive-pack side, I thought you were hinting to unify these two
>> into one, which I did.  There is no argument against it.
>
> What I meant was that there should be transfer.hiderefs, and an
> individual {receive,uploadpack}.hiderefs, similar to the way we have
> transfer.unpacklimit.

Yes, as I said, I misunderstood your intention.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-06 10:34       ` Duy Nguyen
@ 2013-02-06 19:17         ` Junio C Hamano
  2013-02-06 19:45           ` Jonathan Nieder
                             ` (3 more replies)
  0 siblings, 4 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-02-06 19:17 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Michael Haggerty, Jonathan Nieder, git, Jeff King, Shawn Pearce

Duy Nguyen <pclouds@gmail.com> writes:

> On Tue, Feb 5, 2013 at 5:29 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>> Hiderefs creates a "dark" corner of a remote git repo that can hold
>> arbitrary content that is impossible for anybody to discover but
>> nevertheless possible for anybody to download (if they know the name of
>> a hidden reference).  In earlier versions of the patch series I believe
>> that it was possible to push to a hidden reference hierarchy, which made
>> it possible to upload dark content.  The new version appears (from the
>> code) to prohibit adding references in a hidden hierarchy, which would
>> close the main loophole that I was worried about.  But the documentation
>> and the unit tests only explicitly say that updates and deletes are
>> prohibited; nothing is said about adding references (unless "update" is
>> understood to include "add").  I think the true behavior should be
>> clarified and tested.
>>
>> I was worried that somehow this "dark" content could be used for
>> malicious purposes; for example, pushing compromised code then
>> convincing somebody to download it by SHA1 with the implicit argument
>> "it's safe since it comes directly from the project's official
>> repository".  If it is indeed impossible to populate the dark namespace
>> remotely then I can't think of a way to exploit it.
>
> Or you can think hiderefs is the first step to addressing the
> initial ref advertisment problem.  The series says hidden refs are
> to be fetched out of band, but that's not the only way.

Let me help unconfuse this thread.

I think the series as 8-patch series was poorly presented, and
separating it into two will help understanding what they are about.

The first three:

  upload-pack: share more code
  upload-pack: simplify request validation
  upload/receive-pack: allow hiding ref hierarchies

is _the_ topic of the series.  As far as I am concerned (I am not
speaking for Gerrit users, but am speaking as the Git maintainer),
the topic is solely about uncluttering.  There may be refs that the
server end may need to keep for its operation, but that remote users
have _no_ business knowing about.  Allowing the server to keep these
refs in the repository, while not showing these refs over the wire,
is the problem the series solves.

In other words, it is not about "these are *usually* not wanted by
clients, so do not show them by default".  It is about "these are
not to be shown, ever".

OK?

Now, there may be some refs that are not *usually* wanted by clients
but there may be cases where clients want to

 (1) learn about them via the same protocol; and/or
 (2) fetch them over the protocol.

If you want to solve both of these two issues generally, the
solution has to involve a separate protocol from the today's
protocol.  It would go like this:

 * The upload-pack-2 service sits on a port different from today's,
   waits for a ls-remote/fetch/clone client to connect to it, makes
   a default advertisement that only includes the refs that are
   usually wanted by clients with hints on what other refs the
   initial advertisement omitted, to let the client know that it is
   allowed to ask for them.

 * An updated client, if it sees that some refs are omitted from the
   initial advertisement *and* what the user told it to fetch or
   list may be one of the omitted ones (this is why the server gives
   hints in the previous step in the first step; when the server
   says it did not omit anything, or when it says it omitted only
   refs/pull/*, a client that wanted to fetch refs/heads/frotz will
   know the request will fail without continuing this step), then
   makes a "expand-refs" request to the server, asking for the refs
   it did not see and the server could supply.

 * When the server sees "expand-refs", it responds with additional
   advertisement.  "expand-refs refs/pull/*" may result in listing
   of all refs in that hierarchy.  "expand-refs refs/changes/1/1"
   would result in listing that single ref.  "expand-refs no-such"
   may result in nothing, indicating an error.

 * After the (possible) expand-refs exchange, the client knows
   exactly the same and necessary information as the current
   protocol gives it in order to go to the common ancestor discovery
   step, and the protocol can continue the same way as the current
   protocol.

Note that this cannot sit on the current port in general, as
existing clients will not be able to tell some refs are not
advertised, so unless you are hiding large and truly unused part of
the refspace, interoperability with older clients will render the
mechanism useless.  You cannot use this to delay the refs/tags/
hierarchy with this mechanism and have older client come to the
updated service that by default does not advertise tags, for
example.

The above is what I called the "delayed advertisement" in the
discussion, which was brought up several months ago but nothing
materialized as the result.  People who are interested in pursuing
this can volunteer and start discussing the design refinements now
and submit implementation for reviews.

But in the meantime, if there is a niche use case where a solution
to only the second problem is sufficient (and Gerrit and GitHub pull
requests could both be such use cases), the remainder of the series
can help, without waiting the solution to solve "usually not wanted
but may need to be learned" problem.  That is the latter 4 patches
(the very last one is a demonstration to illustrate why allowing a
push to hidden ref hierarchy would not and should not work, and is
not for application):

  parse_fetch_refspec(): clarify the codeflow a bit
  fetch: use struct ref to represent refs to be fetched
  upload-pack: optionally allow fetching from the tips of hidden refs
  fetch: fetch objects by their exact SHA-1 object names

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-06 19:17         ` Junio C Hamano
@ 2013-02-06 19:45           ` Jonathan Nieder
  2013-02-06 21:50           ` Michael Haggerty
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 54+ messages in thread
From: Jonathan Nieder @ 2013-02-06 19:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Michael Haggerty, git, Jeff King, Shawn Pearce

Junio C Hamano wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>> On Tue, Feb 5, 2013 at 5:29 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:

>>> Hiderefs creates a "dark" corner of a remote git repo
[...]
>> Or you can think hiderefs is the first step to addressing the
>> initial ref advertisment problem.  The series says hidden refs are
>> to be fetched out of band, but that's not the only way.
>
> Let me help unconfuse this thread.
>
> I think the series as 8-patch series was poorly presented, and
> separating it into two will help understanding what they are about.
>
> The first three:
>
>   upload-pack: share more code
>   upload-pack: simplify request validation
>   upload/receive-pack: allow hiding ref hierarchies
>
> is _the_ topic of the series.  As far as I am concerned (I am not
> speaking for Gerrit users, but am speaking as the Git maintainer),
> the topic is solely about uncluttering.  There may be refs that the
> server end may need to keep for its operation, but that remote users
> have _no_ business knowing about.

An obvious question when looking at that alone is, is there ever
actually need for such private refs?  If the refs are not meant to be
shared with users *at all*, why are they even refs?

An answer is "because refs force gc to keep the corresponding
objects".  For example, the sysadmin may want to keep refs/archived/
refs for dead branches that should not be advertised or accessible to
the user any more.  Seems sane, though not especially exciting.

What is more exciting to me is that it is a first step toward
addressing the complicated problem of offering access to more refs
than can be efficiently presented in the current ref advertisement.  I
think that's a harder problem but something like this would be needed
in order to support existing clients without performance degredation.

And in the meantime, it helps with the refs/archived case.

Thanks for explaining.
Jonathan

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-06 10:17     ` Michael Haggerty
@ 2013-02-06 19:55       ` Jonathan Nieder
  2013-02-06 22:01         ` Michael Haggerty
  2013-02-07 15:58       ` Jed Brown
  1 sibling, 1 reply; 54+ messages in thread
From: Jonathan Nieder @ 2013-02-06 19:55 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, git, Jeff King, Shawn Pearce

Michael Haggerty wrote:

> Scenario 1: Some providers junk up their users' repositories with
> content that is not created by the repository's owner and that the owner
> doesn't want to appear to vouch for (e.g., GitHub pull requests).  These
> references might sometimes be useful to fetch, singly or in bulk.
>
> Scenario 2: Some systems junk up their users' repositories with
> additional references that are not interesting to most pullers (e.g.,
> Gerrit activity markers) though they don't add questionable content.

Actually Gerrit's refs/changes refs are pretty similar to Github's
refs/pull.  Both are requests for code review.

[...]
> But now every time I do a "gitk --all" or "git log --decorate", the
> output is cluttered with all of his references (most of which are just
> old versions of references from the upstream repository that we both
> use).  I would like to be able to hide his references most of the time
> but turn them back on when I need them.
>
> Scenario 5: Our upstream repository has gazillions of release tags under
> "refs/tags/releases/...", sometimes including customer-specific
> releases.  In my daily life these are just clutter.

For both of these use cases, putting the refs somewhere other than
refs/heads, refs/tags, and refs/remotes should be enough to avoid
clutter.

I agree that a --decorate-glob along the lines of "git rev-parse"'s
--glob would be nice.

[...]
> * Some small improvements (e.g. allowing *multiple* views to be
>   defined) would provide much more benefit for about the same effort,
>   and would be a better base for building other features in the future
>   (e.g., local views).

Would advertising GIT_CONFIG_PARAMETERS and giving examples for server
admins to set it in inetd et al to provide different kinds of access
to a same repository through different URLs work?

> Thanks for listening.
> Michael
>
> [1] Theoretically one could support multiple views of a single
> repository by using something like "GIT_CONFIG=view_1_config git
> upload-pack ..." or "git -c transfer.hiderefs=... git upload-pack ...",
> but this would be awkward.

Ah, I missed this comment before.  What's awkward about that?  I
think it's a clean way to make many aspects of how a repository is
presented (including hook actions) configurable.

Thanks for your help clarifying this feature.  Hopefully some of the
discussion will filter into the documentation.

Jonathan

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-06 19:17         ` Junio C Hamano
  2013-02-06 19:45           ` Jonathan Nieder
@ 2013-02-06 21:50           ` Michael Haggerty
  2013-02-06 22:12             ` Junio C Hamano
  2013-02-06 22:26           ` Ævar Arnfjörð Bjarmason
  2013-02-06 22:56           ` Jeff King
  3 siblings, 1 reply; 54+ messages in thread
From: Michael Haggerty @ 2013-02-06 21:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Jonathan Nieder, git, Jeff King, Shawn Pearce

On 02/06/2013 08:17 PM, Junio C Hamano wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
> 
>> On Tue, Feb 5, 2013 at 5:29 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>>> Hiderefs creates a "dark" corner of a remote git repo that can hold
>>> arbitrary content that is impossible for anybody to discover but
>>> nevertheless possible for anybody to download (if they know the name of
>>> a hidden reference).  In earlier versions of the patch series I believe
>>> that it was possible to push to a hidden reference hierarchy, which made
>>> it possible to upload dark content.  The new version appears (from the
>>> code) to prohibit adding references in a hidden hierarchy, which would
>>> close the main loophole that I was worried about.  But the documentation
>>> and the unit tests only explicitly say that updates and deletes are
>>> prohibited; nothing is said about adding references (unless "update" is
>>> understood to include "add").  I think the true behavior should be
>>> clarified and tested.
>>>
>>> I was worried that somehow this "dark" content could be used for
>>> malicious purposes; for example, pushing compromised code then
>>> convincing somebody to download it by SHA1 with the implicit argument
>>> "it's safe since it comes directly from the project's official
>>> repository".  If it is indeed impossible to populate the dark namespace
>>> remotely then I can't think of a way to exploit it.
>>
>> Or you can think hiderefs is the first step to addressing the
>> initial ref advertisment problem.  The series says hidden refs are
>> to be fetched out of band, but that's not the only way.
> 
> Let me help unconfuse this thread.
> 
> I think the series as 8-patch series was poorly presented, and
> separating it into two will help understanding what they are about.
> 
> The first three:
> 
>   upload-pack: share more code
>   upload-pack: simplify request validation
>   upload/receive-pack: allow hiding ref hierarchies
> 
> is _the_ topic of the series.  As far as I am concerned (I am not
> speaking for Gerrit users, but am speaking as the Git maintainer),
> the topic is solely about uncluttering.  There may be refs that the
> server end may need to keep for its operation, but that remote users
> have _no_ business knowing about.  Allowing the server to keep these
> refs in the repository, while not showing these refs over the wire,
> is the problem the series solves.
> 
> In other words, it is not about "these are *usually* not wanted by
> clients, so do not show them by default".  It is about "these are
> not to be shown, ever".
> 
> OK?

Yes, the first three patches sound much more reasonable if this is the
goal.  Do you know of users who want the feature defined by the first
three patches, or is it only a stepping stone towards an actually useful
feature?  (I ask because I have trouble imagining a real-world scenario
where these alone would be useful.)

> Now, there may be some refs that are not *usually* wanted by clients
> but there may be cases where clients want to
> 
>  (1) learn about them via the same protocol; and/or
>  (2) fetch them over the protocol.
> 
> If you want to solve both of these two issues generally, the
> solution has to involve a separate protocol from the today's
> protocol.  It would go like this:
[... omitted clear explanation of how delayed advertisement could be
implemented via a new protocol ...]

> But in the meantime, if there is a niche use case where a solution
> to only the second problem is sufficient (and Gerrit and GitHub pull
> requests could both be such use cases), the remainder of the series
> can help, without waiting the solution to solve "usually not wanted
> but may need to be learned" problem.  That is the latter 4 patches
> (the very last one is a demonstration to illustrate why allowing a
> push to hidden ref hierarchy would not and should not work, and is
> not for application):

Given that some people *do* want to fetch all pull requests, is this a
feature that any hosting service would really turn on?  True, the
majority of users would be spared clutter, but at the cost of completely
preventing other users from fetching all pull requests, mirroring the
repository, etc.

In other words, I wonder whether your two incremental steps are useful
at all, in the real world, without yet-to-be-implemented future changes.
 If not, then it doesn't make sense to merge them without at least
imagining the final goal and gaining confidence that they are not false
starts.


I think that a more useful interim solution would be to make it easy to
have two URLs accessing a single git repository, with different levels
of reference visibility applied to each.  This is something that
providers could turn on without sacrificing any existing functionality.
 And it would solve all three problems: clutter, bandwidth, and provenance.

Your first three patches would allow two-tier access to be implemented,
for example by setting GIT_CONFIG or GIT_CONFIG_PARAMETERS or
command-line parameters differently for the processes serving the two
URLs, like:

    git upload-pack ...

vs.

    GIT_CONFIG=config-with-hidden-refs git upload-pack ...
or
    git -c transfer.hiderefs=refs/pull upload-pack ...

But this is a bit awkward because the admin would either have to
maintain two config files, or maintain the hiderefs configuration in the
script starting upload-pack rather than in the configuration file.

Therefore, I suggest a slight change to how hiderefs are configured to
make two-tier URLs easier to configure, such as

    # Define one or more views:
    [view "uncluttered"]
            hiderefs = refs/pull

    # This would set the default view for all services:
    [transfer]
            view = uncluttered

    # Peff also wanted the possibility to configure each service
    # independently which could be done like this:
    [receive]
            view = uncluttered
    [uploadpack]
            view = full

I also tentatively suggest that we add a git-level option "--view" and
an environment variable GIT_VIEW (similar to "--namespace" and
GIT_NAMESPACE) to override the default setting:

    GIT_VIEW=uncluttered git upload-pack ...

This way whoever starts the process only needs to choose a particular
view name; the actual definition would reside in the config file.

I think these changes would make it easier to support two-tier URLs and
would also leave the way open to use the "view" concept for other things
in the future.


I've said my piece now and am gratified that there has been more
discussion about your proposal, which was my main goal.  Therefore FWIW
I turn my -1 into a -0 and leave it up to the people experiencing more
clutter-induced pain to decide how to proceed.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-06 19:55       ` Jonathan Nieder
@ 2013-02-06 22:01         ` Michael Haggerty
  0 siblings, 0 replies; 54+ messages in thread
From: Michael Haggerty @ 2013-02-06 22:01 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, git, Jeff King, Shawn Pearce

On 02/06/2013 08:55 PM, Jonathan Nieder wrote:
> Michael Haggerty wrote:
> [...]
>> But now every time I do a "gitk --all" or "git log --decorate", the
>> output is cluttered with all of his references (most of which are just
>> old versions of references from the upstream repository that we both
>> use).  I would like to be able to hide his references most of the time
>> but turn them back on when I need them.
>>
>> Scenario 5: Our upstream repository has gazillions of release tags under
>> "refs/tags/releases/...", sometimes including customer-specific
>> releases.  In my daily life these are just clutter.
> 
> For both of these use cases, putting the refs somewhere other than
> refs/heads, refs/tags, and refs/remotes should be enough to avoid
> clutter.

Thanks, yes, for release tags in particular your suggestion might be
workable.  But I also like the idea of being able to turn subsets of
references on and off easily, and have the choice persist until I change it.

> [...]
>> * Some small improvements (e.g. allowing *multiple* views to be
>>   defined) would provide much more benefit for about the same effort,
>>   and would be a better base for building other features in the future
>>   (e.g., local views).
> 
> Would advertising GIT_CONFIG_PARAMETERS and giving examples for server
> admins to set it in inetd et al to provide different kinds of access
> to a same repository through different URLs work?
> 
>> Thanks for listening.
>> Michael
>>
>> [1] Theoretically one could support multiple views of a single
>> repository by using something like "GIT_CONFIG=view_1_config git
>> upload-pack ..." or "git -c transfer.hiderefs=... git upload-pack ...",
>> but this would be awkward.
> 
> Ah, I missed this comment before.  What's awkward about that?  I
> think it's a clean way to make many aspects of how a repository is
> presented (including hook actions) configurable.

Awkwardness using GIT_CONFIG: the admin would have to maintain two
separate config files with mostly overlapping content.

Awkwardness using GIT_CONFIG_PARAMETERS or "-c transfer.hiderefs=...":
the hiderefs configuration would have to be maintained in some Apache
config or inetd or ... (or multiple places!) rather than in the
repository's config file, where it belongs.

Additional awkwardness using "-c transfer.hiderefs=...": AFAIK there is
no way to turn *off* a configuration variable via a command-line option.

It's all doable, but I find it awkward.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-06 21:50           ` Michael Haggerty
@ 2013-02-06 22:12             ` Junio C Hamano
  0 siblings, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-02-06 22:12 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Duy Nguyen, Jonathan Nieder, git, Jeff King, Shawn Pearce

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 02/06/2013 08:17 PM, Junio C Hamano wrote:
> ...
>
> Yes, the first three patches sound much more reasonable if this is the
> goal.
> ...
> I think that a more useful interim solution would be to make it easy to
> have two URLs accessing a single git repository, with different levels
> of reference visibility applied to each.

I think you said "more reasonable" without understanding what you
are saying is "more reasonable", then.  The mechanism is about
server side wanting to use refs to protect its own metadata from gc
without having to expose them; there is no "different levels".

> ...
>     GIT_CONFIG=config-with-hidden-refs git upload-pack ...
> or
>     git -c transfer.hiderefs=refs/pull upload-pack ...
>
> But this is a bit awkward ...

It is awkward to use hammer to drive screws in wood, too.  You want
to use a screwdriver.  The first three patches are to drive a nail
with hammer, OK?  Screws you keep bringing up is to be handled by
delayed ref advertisement.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-06 19:17         ` Junio C Hamano
  2013-02-06 19:45           ` Jonathan Nieder
  2013-02-06 21:50           ` Michael Haggerty
@ 2013-02-06 22:26           ` Ævar Arnfjörð Bjarmason
  2013-02-07  0:12             ` Junio C Hamano
  2013-02-06 22:56           ` Jeff King
  3 siblings, 1 reply; 54+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2013-02-06 22:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Michael Haggerty, Jonathan Nieder, git, Jeff King,
	Shawn Pearce

On Wed, Feb 6, 2013 at 8:17 PM, Junio C Hamano <gitster@pobox.com> wrote:

Maybe this should be split up into a different thread, but:

> The upload-pack-2 service sits on a port different from today's
> [...].

I think there's a simpler way to do this, which is that:

 * New clients supporting v2 of the protocol send some piece of data
   that would break old servers.

 * If that fails the new client goes "oh jeeze, I guess it's an old
   server", and try again with the old protocol.

 * The client then saves a date (or the version the server gave us)
   indicating that it tried the new protocol on that remote, tries
   again sometime later.

We already covered in previous discussions how this would be simpler
with the HTTP protocol, since you could just send an extra header
inviting the server to speak the new protocol.

But for the other transports we can just try the new protocol and
retry with the old one as a fallback if it doesn't work. That'll allow
us to gracefully migrate without needing to change the git:// port.

Besides, I think the vast majority of users are using Git via http://
or ssh://, where we can't just change the port, but even so making
people change the port when we could handle this more gracefully would
be a big PITA. Adding new firewall holes is often a big bureaucratic
nightmare in some organizations.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-06 19:17         ` Junio C Hamano
                             ` (2 preceding siblings ...)
  2013-02-06 22:26           ` Ævar Arnfjörð Bjarmason
@ 2013-02-06 22:56           ` Jeff King
  3 siblings, 0 replies; 54+ messages in thread
From: Jeff King @ 2013-02-06 22:56 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Michael Haggerty, Jonathan Nieder, git, Shawn Pearce

On Wed, Feb 06, 2013 at 11:17:06AM -0800, Junio C Hamano wrote:

> Let me help unconfuse this thread.
> 
> I think the series as 8-patch series was poorly presented, and
> separating it into two will help understanding what they are about.
> 
> The first three:
> 
>   upload-pack: share more code
>   upload-pack: simplify request validation
>   upload/receive-pack: allow hiding ref hierarchies
> 
> is _the_ topic of the series.  As far as I am concerned (I am not
> speaking for Gerrit users, but am speaking as the Git maintainer),
> the topic is solely about uncluttering.  There may be refs that the
> server end may need to keep for its operation, but that remote users
> have _no_ business knowing about.  Allowing the server to keep these
> refs in the repository, while not showing these refs over the wire,
> is the problem the series solves.
> 
> In other words, it is not about "these are *usually* not wanted by
> clients, so do not show them by default".  It is about "these are
> not to be shown, ever".
> 
> OK?

Right. I am not opposed to this series, as it does have a use-case. And
if it helps Gerrit folks or other users unclutter, great. The fact that
I could throw away the custom receive.hiderefs patch we use at GitHub is
a bonus. If people want fancier things, they can do them separately.

_But_. As a potential user of the feature (to hide refs/pull/*), I do
not think it is sufficiently flexible for me to use transfer.hiderefs
(or uploadpack.hiderefs). We use "fetch" internally to migrate objects
between forks and our alternates repos. And in that case, we really do
want to see all refs. In other words, all fetches are not the same: we
would want upload-pack to understand the difference between a client
fetch and an internal administrative fetch. But this feature does not
provide that lee-way. Even if you tried:

  git fetch -u 'git -c uploadpack.hiderefs= upload-pack'

the list nature of the config variable means you cannot reset it.

This isn't a show-stopper for the series; it may just mean that it is
not a good fit for GitHub's use case, but others (like Gerrit) may
benefit. But since refs/pull is used as an example of where this could
be applied, I wanted to point out that it does not achieve that goal.

-Peff

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-06 22:26           ` Ævar Arnfjörð Bjarmason
@ 2013-02-07  0:12             ` Junio C Hamano
  2013-02-07  0:16               ` Jeff King
  2014-02-23  2:44               ` Duy Nguyen
  0 siblings, 2 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-02-07  0:12 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Duy Nguyen, Michael Haggerty, Jonathan Nieder, git, Jeff King,
	Shawn Pearce

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I think there's a simpler way to do this, which is that:
>
>  * New clients supporting v2 of the protocol send some piece of data
>    that would break old servers.
>
>  * If that fails the new client goes "oh jeeze, I guess it's an old
>    server", and try again with the old protocol.
>
>  * The client then saves a date (or the version the server gave us)
>    indicating that it tried the new protocol on that remote, tries
>    again sometime later.

For that to work, the new server needs to wait for the client to
speak first.  How would that server handle old clients who expect to
be spoken first?  Wait with a read timeout (no timeout is the right
timeout for everybody)?

> We already covered in previous discussions how this would be simpler
> with the HTTP protocol,...

Yes, that is a solved problem.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-07  0:12             ` Junio C Hamano
@ 2013-02-07  0:16               ` Jeff King
  2013-02-07 10:30                 ` Ævar Arnfjörð Bjarmason
  2013-02-07 18:25                 ` Junio C Hamano
  2014-02-23  2:44               ` Duy Nguyen
  1 sibling, 2 replies; 54+ messages in thread
From: Jeff King @ 2013-02-07  0:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Duy Nguyen,
	Michael Haggerty, Jonathan Nieder, git, Shawn Pearce

On Wed, Feb 06, 2013 at 04:12:10PM -0800, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> 
> > I think there's a simpler way to do this, which is that:
> >
> >  * New clients supporting v2 of the protocol send some piece of data
> >    that would break old servers.
> >
> >  * If that fails the new client goes "oh jeeze, I guess it's an old
> >    server", and try again with the old protocol.
> >
> >  * The client then saves a date (or the version the server gave us)
> >    indicating that it tried the new protocol on that remote, tries
> >    again sometime later.
> 
> For that to work, the new server needs to wait for the client to
> speak first.  How would that server handle old clients who expect to
> be spoken first?  Wait with a read timeout (no timeout is the right
> timeout for everybody)?

If the new client can handle the old-style server's response, then the
server can start blasting out refs (optionally after a timeout) and stop
when the client interrupts with "hey, wait, I can speak the new
protocol". The server just has to include "you can interrupt me" in its
capability advertisement (obviously it would have to send out at least
the first ref with the capabilities before the timeout).

-Peff

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-07  0:16               ` Jeff King
@ 2013-02-07 10:30                 ` Ævar Arnfjörð Bjarmason
  2013-02-07 18:25                 ` Junio C Hamano
  1 sibling, 0 replies; 54+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2013-02-07 10:30 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Duy Nguyen, Michael Haggerty, Jonathan Nieder,
	git, Shawn Pearce

On Thu, Feb 7, 2013 at 1:16 AM, Jeff King <peff@peff.net> wrote:
> On Wed, Feb 06, 2013 at 04:12:10PM -0800, Junio C Hamano wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>> > I think there's a simpler way to do this, which is that:
>> >
>> >  * New clients supporting v2 of the protocol send some piece of data
>> >    that would break old servers.
>> >
>> >  * If that fails the new client goes "oh jeeze, I guess it's an old
>> >    server", and try again with the old protocol.
>> >
>> >  * The client then saves a date (or the version the server gave us)
>> >    indicating that it tried the new protocol on that remote, tries
>> >    again sometime later.
>>
>> For that to work, the new server needs to wait for the client to
>> speak first.  How would that server handle old clients who expect to
>> be spoken first?  Wait with a read timeout (no timeout is the right
>> timeout for everybody)?
>
> If the new client can handle the old-style server's response, then the
> server can start blasting out refs (optionally after a timeout) and stop
> when the client interrupts with "hey, wait, I can speak the new
> protocol". The server just has to include "you can interrupt me" in its
> capability advertisement (obviously it would have to send out at least
> the first ref with the capabilities before the timeout).

Can't this also be handled by passing an extra argument to
upload-pack? Whether you're talking http, ssh + normal shell, ssh +
git-shell or git:// you pass some argument that older clients would
reject on but would cause newer clients that know about that argument
to wait for you to speak before blasting refs at you.

It would mean that older clients (e.g. older git-shell) would reject
your initial connection, but you could just try again, and save away
info about that remote's version.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-06 10:17     ` Michael Haggerty
  2013-02-06 19:55       ` Jonathan Nieder
@ 2013-02-07 15:58       ` Jed Brown
  2013-02-09 23:23         ` Junio C Hamano
  1 sibling, 1 reply; 54+ messages in thread
From: Jed Brown @ 2013-02-07 15:58 UTC (permalink / raw)
  To: Michael Haggerty, Junio C Hamano; +Cc: git, Jeff King, Shawn Pearce

Michael Haggerty <mhagger@alum.mit.edu> writes:

> A first weakness of your proposal is that even though the hidden refs
> are (optionally) fetchable, there is *no* way to discover them remotely
> or to bulk-download them; they would have to be retrieved one by one
> using out-of-band information.  And if I understand correctly, there
> would be no way to push hidden references remotely (whether manually or
> from some automated process).  Such processes would have to be local to
> the machine holding the repository.

I'm the author of git-fat [1], a smudge/clean filter system for managing
large files.  I currently store files in the file system
(.git/fat/objects) and transfer them via rsync because I want to be able
to transfer exact subsets requested by the user.  I would like to put
this data in a git repository so that I can take advantage of packfile
compression when applicable and so that I can use existing access
control, but I would need to store a separate reference to each blob (so
that I can transfer exact subsets).  My refs would be named like
'fat-<SHA1_OF_SMUDGED_DATA>' and are known on the client side because
they are in the cleaned blob (which contains only this SHA1 and the
number of bytes [2]).

I believe that my use case would be well supported if git could push and
pull unadvertised refs, as long as basic operations were not slowed down
by the existence of a very large number of such refs.


[1] https://github.com/jedbrown/git-fat

[2] We could eliminate the performance problem of needing to buffer the
entire file if the smudge filter could be passed the object size as an
argument and if we could forward that size in a stream to 'git
hash-object --stdin'.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-07  0:16               ` Jeff King
  2013-02-07 10:30                 ` Ævar Arnfjörð Bjarmason
@ 2013-02-07 18:25                 ` Junio C Hamano
  1 sibling, 0 replies; 54+ messages in thread
From: Junio C Hamano @ 2013-02-07 18:25 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, Duy Nguyen,
	Michael Haggerty, Jonathan Nieder, git, Shawn Pearce

Jeff King <peff@peff.net> writes:

> If the new client can handle the old-style server's response, then the
> server can start blasting out refs (optionally after a timeout) and stop
> when the client interrupts with "hey, wait, I can speak the new
> protocol". The server just has to include "you can interrupt me" in its
> capability advertisement (obviously it would have to send out at least
> the first ref with the capabilities before the timeout).

Yeah, I would prefer people to come up with a way to share the port
and autodetect.  It is *not* a requirement for the updated server to
run on a separate port at all.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-07 15:58       ` Jed Brown
@ 2013-02-09 23:23         ` Junio C Hamano
  2013-02-10  4:45           ` Jed Brown
  0 siblings, 1 reply; 54+ messages in thread
From: Junio C Hamano @ 2013-02-09 23:23 UTC (permalink / raw)
  To: Jed Brown; +Cc: Michael Haggerty, git, Jeff King, Shawn Pearce

Jed Brown <jed@59A2.org> writes:

> I believe that my use case would be well supported if git could push and
> pull unadvertised refs, as long as basic operations were not slowed down
> by the existence of a very large number of such refs.

I am not sure about "pushing" part, but the jc/fetch-raw-sha1 topic
(split from the main jc/hidden-refs topic) should allow your script,
after the client learns the set of smudged object names, to ask for

    git fetch $there $sha1_1 $sha1_2 ...

or

    git fetch $there $sha1_1:refs/fat/$sha1_1 $sha1_2:refs/fat/$sha1_2 ...

I think.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-09 23:23         ` Junio C Hamano
@ 2013-02-10  4:45           ` Jed Brown
  0 siblings, 0 replies; 54+ messages in thread
From: Jed Brown @ 2013-02-10  4:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git, Jeff King, Shawn Pearce

Junio C Hamano <gitster@pobox.com> writes:

> I am not sure about "pushing" part, but the jc/fetch-raw-sha1 topic
> (split from the main jc/hidden-refs topic) should allow your script,
> after the client learns the set of smudged object names, to ask for
>
>     git fetch $there $sha1_1 $sha1_2 ...

Well, my out-of-band knowledge is currently the sha1 of the data
contained in the blob I want, not the blob sha1 itself [1].  After
experimenting with jc/hidden-refs, I think it already does exactly what
I want. Specifically, I set this on the server

  git config uploadpack.hiderefs refs/fat

so that 'git ls-remote' no longer transfers these refs. Then on the
client, I do

  contentid=$(sha1sum thefile | cut -f1 -d \ )
  blobid=$(git hash-object -w thefile)
  git update-ref refs/fat/$contentid $blobid

  .... more like this

  git push the-remote refs/fat/$contentid ...

and later, I can fetch specific refs using

  git fetch the-remote refs/fat/$wanted:refs/fat/$wanted ...

The client knows the desired refs out-of-band so this looks okay. It
would be convenient to have '--stdin' options to 'git push' and 'git
fetch'. Would a patch for that be welcome?


[1] The reason for using $contentid instead of $blobid in the key here
is to avoid etching the backend=git detail into the cleaned commits.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2013-02-07  0:12             ` Junio C Hamano
  2013-02-07  0:16               ` Jeff King
@ 2014-02-23  2:44               ` Duy Nguyen
  2014-03-11  1:49                 ` Jeff King
  1 sibling, 1 reply; 54+ messages in thread
From: Duy Nguyen @ 2014-02-23  2:44 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð, Michael Haggerty, Jonathan Nieder,
	Git Mailing List, Jeff King, Shawn Pearce

(Digging up an old thread about initial refs listing in git protocol)

On Thu, Feb 7, 2013 at 7:12 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> I think there's a simpler way to do this, which is that:
>>
>>  * New clients supporting v2 of the protocol send some piece of data
>>    that would break old servers.
>>
>>  * If that fails the new client goes "oh jeeze, I guess it's an old
>>    server", and try again with the old protocol.
>>
>>  * The client then saves a date (or the version the server gave us)
>>    indicating that it tried the new protocol on that remote, tries
>>    again sometime later.
>
> For that to work, the new server needs to wait for the client to
> speak first.  How would that server handle old clients who expect to
> be spoken first?  Wait with a read timeout (no timeout is the right
> timeout for everybody)?

I think the client always speaks first when it asks for a remote
service. Earlier in this thread you described the new protocol
upload-pack-2. Why can't it be a new service "upload-pack-2" in
git-daemon?

So new client will try requesting "upload-pack-2" service with client
capability advertisement before ref listing. Old servers do not
recognize this service and disconnect so the new client falls back to
the good old "upload-pack" (one more round trip though, but you could
configure new client to use old protocol for certain "old" hosts).
Similar thing happens for ssh transport. "upload-pack" service is
always there for old clients.
-- 
Duy

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-02-23  2:44               ` Duy Nguyen
@ 2014-03-11  1:49                 ` Jeff King
  2014-03-11 19:32                   ` Junio C Hamano
  2014-03-15  1:23                   ` Duy Nguyen
  0 siblings, 2 replies; 54+ messages in thread
From: Jeff King @ 2014-03-11  1:49 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Junio C Hamano, Ævar Arnfjörð, Michael Haggerty,
	Jonathan Nieder, Git Mailing List, Shawn Pearce

On Sun, Feb 23, 2014 at 09:44:14AM +0700, Duy Nguyen wrote:

> (Digging up an old thread about initial refs listing in git protocol)

And now I am responding to it slowly. :)

> > For that to work, the new server needs to wait for the client to
> > speak first.  How would that server handle old clients who expect to
> > be spoken first?  Wait with a read timeout (no timeout is the right
> > timeout for everybody)?
> 
> I think the client always speaks first when it asks for a remote
> service. Earlier in this thread you described the new protocol
> upload-pack-2. Why can't it be a new service "upload-pack-2" in
> git-daemon?
> 
> So new client will try requesting "upload-pack-2" service with client
> capability advertisement before ref listing. Old servers do not
> recognize this service and disconnect so the new client falls back to
> the good old "upload-pack" (one more round trip though, but you could
> configure new client to use old protocol for certain "old" hosts).
> Similar thing happens for ssh transport. "upload-pack" service is
> always there for old clients.

Right, I recall the general feeling being that such a system would work,
and the transition would be managed by a config variable like
"remote.*.useUploadPack2". Probably with settings like:

  true:
    always try, but allow fall back to upload-pack

  false:
    never try, always use upload-pack

  auto:
    try, but if we fail, set remote.*.uploadPackTimestamp, and do not
    try again for N days

The default would start at false, and people who know their server is
very up-to-date can turn it on. And then when many server
implementations support it, flip the default to auto. And either leave
it there forever, or eventually just set it to "true" and drop "auto"
entirely as a code cleanup.

In theory we could do more radical protocol changes here, but I think
most people are just interested in adding an opportunity for the client
to speak before the ref advertisement in order to set a few
flags/variables.  That should be relatively simple, and for http we can
probably pass those flags via url parameters without any extra
compatibility/round-trip at all.

I think the main flag of interest is giving an fnmatch pattern to limit
the advertised refs. There could potentially be others, but I do not
know of any offhand.

-Peff

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-11  1:49                 ` Jeff King
@ 2014-03-11 19:32                   ` Junio C Hamano
  2014-03-11 20:05                     ` Jeff King
  2014-03-15  1:23                   ` Duy Nguyen
  1 sibling, 1 reply; 54+ messages in thread
From: Junio C Hamano @ 2014-03-11 19:32 UTC (permalink / raw)
  To: Jeff King
  Cc: Duy Nguyen, Ævar Arnfjörð, Michael Haggerty,
	Jonathan Nieder, Git Mailing List, Shawn Pearce

Jeff King <peff@peff.net> writes:

> I think the main flag of interest is giving an fnmatch pattern to limit
> the advertised refs. There could potentially be others, but I do not
> know of any offhand.

One thing that comes to mind is where symrefs point at, which we
failed to add the last time around because we ran out of the
hidden-space behind NUL.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-11 19:32                   ` Junio C Hamano
@ 2014-03-11 20:05                     ` Jeff King
  2014-03-11 20:25                       ` Junio C Hamano
  0 siblings, 1 reply; 54+ messages in thread
From: Jeff King @ 2014-03-11 20:05 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Ævar Arnfjörð, Michael Haggerty,
	Jonathan Nieder, Git Mailing List, Shawn Pearce

On Tue, Mar 11, 2014 at 12:32:37PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I think the main flag of interest is giving an fnmatch pattern to limit
> > the advertised refs. There could potentially be others, but I do not
> > know of any offhand.
> 
> One thing that comes to mind is where symrefs point at, which we
> failed to add the last time around because we ran out of the
> hidden-space behind NUL.

Yeah, good idea. I might be misremembering some complications, but we
can probably do it with:

  1. Teach the client to send an "advertise-symrefs" flag before the ref
     advertisement.

  2. Teach the server to include symrefs in the ref advertisement; we
     can invent a new syntax because we know the client has asked for
     it.

That does not have to come immediately, though. Done correctly,
upload-pack2 is not about implementing the fnmatch feature, but allowing
arbitrary capability strings from the client before the ref
advertisement starts. So this just becomes an extension that we can add
and advertise during that new phase.

-Peff

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-11 20:05                     ` Jeff King
@ 2014-03-11 20:25                       ` Junio C Hamano
  2014-03-11 20:36                         ` Jeff King
  0 siblings, 1 reply; 54+ messages in thread
From: Junio C Hamano @ 2014-03-11 20:25 UTC (permalink / raw)
  To: Jeff King
  Cc: Duy Nguyen, Ævar Arnfjörð, Michael Haggerty,
	Jonathan Nieder, Git Mailing List, Shawn Pearce

Jeff King <peff@peff.net> writes:

> On Tue, Mar 11, 2014 at 12:32:37PM -0700, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> > I think the main flag of interest is giving an fnmatch pattern to limit
>> > the advertised refs. There could potentially be others, but I do not
>> > know of any offhand.
>> 
>> One thing that comes to mind is where symrefs point at, which we
>> failed to add the last time around because we ran out of the
>> hidden-space behind NUL.
>
> Yeah, good idea. I might be misremembering some complications, but we
> can probably do it with:
>
>   1. Teach the client to send an "advertise-symrefs" flag before the ref
>      advertisement.
>
>   2. Teach the server to include symrefs in the ref advertisement; we
>      can invent a new syntax because we know the client has asked for
>      it.

I was thinking more about the underlying protocol, not advertisement
in particular, and I think we came to the same conclusion.

The capability advertisement deserves to have its own separate
packet message type, when both sides say that they understand it, so
that we do not have to be limited by the pkt-line length limit.  We
could do one message per capability, and at the same time can lift
the traditional "capability hidden after the NUL is purged every
time, so we need to repeat them if we want to later change it,
because that is how older clients and servers use that information"
insanity, for example.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-11 20:25                       ` Junio C Hamano
@ 2014-03-11 20:36                         ` Jeff King
  2014-03-14 12:37                           ` Duy Nguyen
  0 siblings, 1 reply; 54+ messages in thread
From: Jeff King @ 2014-03-11 20:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Ævar Arnfjörð, Michael Haggerty,
	Jonathan Nieder, Git Mailing List, Shawn Pearce

On Tue, Mar 11, 2014 at 01:25:23PM -0700, Junio C Hamano wrote:

> > Yeah, good idea. I might be misremembering some complications, but we
> > can probably do it with:
> >
> >   1. Teach the client to send an "advertise-symrefs" flag before the ref
> >      advertisement.
> >
> >   2. Teach the server to include symrefs in the ref advertisement; we
> >      can invent a new syntax because we know the client has asked for
> >      it.
> 
> I was thinking more about the underlying protocol, not advertisement
> in particular, and I think we came to the same conclusion.
> 
> The capability advertisement deserves to have its own separate
> packet message type, when both sides say that they understand it, so
> that we do not have to be limited by the pkt-line length limit.  We
> could do one message per capability, and at the same time can lift
> the traditional "capability hidden after the NUL is purged every
> time, so we need to repeat them if we want to later change it,
> because that is how older clients and servers use that information"
> insanity, for example.

So this may be entering the "more radical changes" realm I mentioned
earlier.

If the client is limited to setting a few flags, then something like
http can get away with:

  GET foo.git/info/refs?service=git-upload-pack&advertise-symrefs&refspec=refs/heads/*

And it does not need to worry about upload-pack2 at all. Either the
server recognizes and acts on them, or it ignores them.

But given that we do not have such a magic out-of-band method for
passing values over ssh and git, maybe it is not worth worrying about.
Http can move to upload-pack2 along with the rest.

One thing that _is_ worth considering for http is how the protocol
starts. We do not want to introduce an extra http round-trip to the
protocol if we can help it. If the initial GET becomes a POST, then it
could pass along the pkt-line of client capabilities with the initial
request, and the server would respond with the ref advertisement as
usual.

-Peff

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-11 20:36                         ` Jeff King
@ 2014-03-14 12:37                           ` Duy Nguyen
  2014-03-14 16:45                             ` Shawn Pearce
  0 siblings, 1 reply; 54+ messages in thread
From: Duy Nguyen @ 2014-03-14 12:37 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Ævar Arnfjörð, Michael Haggerty,
	Jonathan Nieder, Git Mailing List, Shawn Pearce

On Wed, Mar 12, 2014 at 3:36 AM, Jeff King <peff@peff.net> wrote:
> If the client is limited to setting a few flags, then something like
> http can get away with:
>
>   GET foo.git/info/refs?service=git-upload-pack&advertise-symrefs&refspec=refs/heads/*
>
> And it does not need to worry about upload-pack2 at all. Either the
> server recognizes and acts on them, or it ignores them.
>
> But given that we do not have such a magic out-of-band method for
> passing values over ssh and git, maybe it is not worth worrying about.

git could go the same if we lift the restriction in 73bb33a (daemon:
Strictly parse the "extra arg" part of the command - 2009-06-04). It's
been five years. Old daemons hopefully have all died out by now. For
ssh, I suppose upload-pack and receive-pack can take an extra argument
like "advertise-symrefs&refspec=refs/heads/*" (daemon would use it too
to pass the advertiment to upload-pack and receive-pack).

That would make all three not need to change the underlying protocol
for capability advertisement. Old git-daemon, upload-pack and
receive-pack will fail hard on the new advertisement though, unlike
http. But that's no worse than upload-pack2.

> Http can move to upload-pack2 along with the rest.

Or maybe http may lead the rest to another way.
-- 
Duy

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-14 12:37                           ` Duy Nguyen
@ 2014-03-14 16:45                             ` Shawn Pearce
  2014-03-14 23:30                               ` Duy Nguyen
  0 siblings, 1 reply; 54+ messages in thread
From: Shawn Pearce @ 2014-03-14 16:45 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Jeff King, Junio C Hamano, Ævar Arnfjörð,
	Michael Haggerty, Jonathan Nieder, Git Mailing List

On Fri, Mar 14, 2014 at 5:37 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Wed, Mar 12, 2014 at 3:36 AM, Jeff King <peff@peff.net> wrote:
>> If the client is limited to setting a few flags, then something like
>> http can get away with:
>>
>>   GET foo.git/info/refs?service=git-upload-pack&advertise-symrefs&refspec=refs/heads/*
>>
>> And it does not need to worry about upload-pack2 at all. Either the
>> server recognizes and acts on them, or it ignores them.
>>
>> But given that we do not have such a magic out-of-band method for
>> passing values over ssh and git, maybe it is not worth worrying about.
>
> git could go the same if we lift the restriction in 73bb33a (daemon:
> Strictly parse the "extra arg" part of the command - 2009-06-04). It's
> been five years. Old daemons hopefully have all died out by now. For
> ssh, I suppose upload-pack and receive-pack can take an extra argument
> like "advertise-symrefs&refspec=refs/heads/*" (daemon would use it too
> to pass the advertiment to upload-pack and receive-pack).

Heh. IIRC you are talking about the DoS attack for git-daemon where
you send an extra header and the process infinite loops forever? We
really don't want a modern client attempting to upgrade the protocol
with an ancient daemon to DoS attack that server.

> That would make all three not need to change the underlying protocol
> for capability advertisement. Old git-daemon, upload-pack and
> receive-pack will fail hard on the new advertisement though, unlike
> http. But that's no worse than upload-pack2.

You missed the SSH case. It doesn't have this slot to hide the data into.

>> Http can move to upload-pack2 along with the rest.
>
> Or maybe http may lead the rest to another way.
> --
> Duy

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-14 16:45                             ` Shawn Pearce
@ 2014-03-14 23:30                               ` Duy Nguyen
  2014-03-15  0:09                                 ` Shawn Pearce
  0 siblings, 1 reply; 54+ messages in thread
From: Duy Nguyen @ 2014-03-14 23:30 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Jeff King, Junio C Hamano, Ævar Arnfjörð,
	Michael Haggerty, Jonathan Nieder, Git Mailing List

On Fri, Mar 14, 2014 at 11:45 PM, Shawn Pearce <spearce@spearce.org> wrote:
> On Fri, Mar 14, 2014 at 5:37 AM, Duy Nguyen <pclouds@gmail.com> wrote:
>> On Wed, Mar 12, 2014 at 3:36 AM, Jeff King <peff@peff.net> wrote:
>>> If the client is limited to setting a few flags, then something like
>>> http can get away with:
>>>
>>>   GET foo.git/info/refs?service=git-upload-pack&advertise-symrefs&refspec=refs/heads/*
>>>
>>> And it does not need to worry about upload-pack2 at all. Either the
>>> server recognizes and acts on them, or it ignores them.
>>>
>>> But given that we do not have such a magic out-of-band method for
>>> passing values over ssh and git, maybe it is not worth worrying about.
>>
>> git could go the same if we lift the restriction in 73bb33a (daemon:
>> Strictly parse the "extra arg" part of the command - 2009-06-04). It's
>> been five years. Old daemons hopefully have all died out by now. For
>> ssh, I suppose upload-pack and receive-pack can take an extra argument
>> like "advertise-symrefs&refspec=refs/heads/*" (daemon would use it too
>> to pass the advertiment to upload-pack and receive-pack).
>
> Heh. IIRC you are talking about the DoS attack for git-daemon where
> you send an extra header and the process infinite loops forever? We
> really don't want a modern client attempting to upgrade the protocol
> with an ancient daemon to DoS attack that server.

Shouldn't vulnerable daemons be upgraded anyway? If they keep using
the vulnerable version for all these 5 years, I feel no sorry for new
clients DoSing them. Jeff's idea about "remote.*.useUploadPack2" still
applies here so after we attack the server once, it'll be black listed
for a while (or forever).

>> That would make all three not need to change the underlying protocol
>> for capability advertisement. Old git-daemon, upload-pack and
>> receive-pack will fail hard on the new advertisement though, unlike
>> http. But that's no worse than upload-pack2.
>
> You missed the SSH case. It doesn't have this slot to hide the data into.

Right now we run this for ssh case: "ssh <host> git-upload-pack
<repo-path>". New client can do this instead

ssh <host> git-upload-pack <repo-path> <client capability flags>
-- 
Duy

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-14 23:30                               ` Duy Nguyen
@ 2014-03-15  0:09                                 ` Shawn Pearce
  2014-03-18  4:17                                   ` Jeff King
  0 siblings, 1 reply; 54+ messages in thread
From: Shawn Pearce @ 2014-03-15  0:09 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Jeff King, Junio C Hamano, Ævar Arnfjörð,
	Michael Haggerty, Jonathan Nieder, Git Mailing List

On Fri, Mar 14, 2014 at 4:30 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Fri, Mar 14, 2014 at 11:45 PM, Shawn Pearce <spearce@spearce.org> wrote:
>>
>> You missed the SSH case. It doesn't have this slot to hide the data into.
>
> Right now we run this for ssh case: "ssh <host> git-upload-pack
> <repo-path>". New client can do this instead
>
> ssh <host> git-upload-pack <repo-path> <client capability flags>

Older servers will fail on this command, and the client must reconnect
over SSH, which may mean supplying their password/passphrase again.
But its remembered that the uploadPack2 didn't work so this can be
blacklisted and not retried for a while.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-11  1:49                 ` Jeff King
  2014-03-11 19:32                   ` Junio C Hamano
@ 2014-03-15  1:23                   ` Duy Nguyen
  2014-03-18  4:18                     ` Jeff King
  1 sibling, 1 reply; 54+ messages in thread
From: Duy Nguyen @ 2014-03-15  1:23 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Ævar Arnfjörð, Michael Haggerty,
	Jonathan Nieder, Git Mailing List, Shawn Pearce

On Tue, Mar 11, 2014 at 8:49 AM, Jeff King <peff@peff.net> wrote:
> Right, I recall the general feeling being that such a system would work,
> and the transition would be managed by a config variable like
> "remote.*.useUploadPack2". Probably with settings like:
>
>   true:
>     always try, but allow fall back to upload-pack
>
>   false:
>     never try, always use upload-pack
>
>   auto:
>     try, but if we fail, set remote.*.uploadPackTimestamp, and do not
>     try again for N days
>
> The default would start at false, and people who know their server is
> very up-to-date can turn it on. And then when many server
> implementations support it, flip the default to auto. And either leave
> it there forever, or eventually just set it to "true" and drop "auto"
> entirely as a code cleanup.

I would add that upload-pack also advertises about the availability of
upload-pack2 and the client may set the remote.*.useUploadPack2 to
either yes or auto so next time upload-pack2 will be used.
-- 
Duy

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-15  0:09                                 ` Shawn Pearce
@ 2014-03-18  4:17                                   ` Jeff King
  2014-03-18 14:27                                     ` Duy Nguyen
  0 siblings, 1 reply; 54+ messages in thread
From: Jeff King @ 2014-03-18  4:17 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Duy Nguyen, Junio C Hamano, Ævar Arnfjörð,
	Michael Haggerty, Jonathan Nieder, Git Mailing List

On Fri, Mar 14, 2014 at 05:09:45PM -0700, Shawn Pearce wrote:

> On Fri, Mar 14, 2014 at 4:30 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> > On Fri, Mar 14, 2014 at 11:45 PM, Shawn Pearce <spearce@spearce.org> wrote:
> >>
> >> You missed the SSH case. It doesn't have this slot to hide the data into.
> >
> > Right now we run this for ssh case: "ssh <host> git-upload-pack
> > <repo-path>". New client can do this instead
> >
> > ssh <host> git-upload-pack <repo-path> <client capability flags>
> 
> Older servers will fail on this command, and the client must reconnect
> over SSH, which may mean supplying their password/passphrase again.
> But its remembered that the uploadPack2 didn't work so this can be
> blacklisted and not retried for a while.

I wonder if we could use the environment for optional values. E.g., can
we run:

  ssh host GIT_CAPABILITIES=... git-upload-pack <repo-path>

That will not work everywhere, of course. Sites with git-shell will
fail, as will sites with custom ssh handler (GitHub, for example, and I
imagine Gerrit sites, if they support ssh). So we'd still need some
fallback, but it would work out-of-the-box in a reasonable number of
cases (and it is really not that different than the http case, which is
just stuffing the values into $QUERY_STRING anyway :) ).

-Peff

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-15  1:23                   ` Duy Nguyen
@ 2014-03-18  4:18                     ` Jeff King
  0 siblings, 0 replies; 54+ messages in thread
From: Jeff King @ 2014-03-18  4:18 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Junio C Hamano, Ævar Arnfjörð, Michael Haggerty,
	Jonathan Nieder, Git Mailing List, Shawn Pearce

On Sat, Mar 15, 2014 at 08:23:08AM +0700, Duy Nguyen wrote:

> > The default would start at false, and people who know their server is
> > very up-to-date can turn it on. And then when many server
> > implementations support it, flip the default to auto. And either leave
> > it there forever, or eventually just set it to "true" and drop "auto"
> > entirely as a code cleanup.
> 
> I would add that upload-pack also advertises about the availability of
> upload-pack2 and the client may set the remote.*.useUploadPack2 to
> either yes or auto so next time upload-pack2 will be used.

Good idea. If our auto probe is "try 1, learn to upgrade to 2 for next
time", we do not have to be so conservative about flipping it on (as
compared to my "try 2, fall back to 1").

-Peff

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-18  4:17                                   ` Jeff King
@ 2014-03-18 14:27                                     ` Duy Nguyen
  2014-03-18 14:36                                       ` Duy Nguyen
  0 siblings, 1 reply; 54+ messages in thread
From: Duy Nguyen @ 2014-03-18 14:27 UTC (permalink / raw)
  To: Jeff King
  Cc: Shawn Pearce, Junio C Hamano, Ævar Arnfjörð,
	Michael Haggerty, Jonathan Nieder, Git Mailing List

On Tue, Mar 18, 2014 at 12:17:39AM -0400, Jeff King wrote:
> On Fri, Mar 14, 2014 at 05:09:45PM -0700, Shawn Pearce wrote:
> 
> > On Fri, Mar 14, 2014 at 4:30 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> > > On Fri, Mar 14, 2014 at 11:45 PM, Shawn Pearce <spearce@spearce.org> wrote:
> > >>
> > >> You missed the SSH case. It doesn't have this slot to hide the data into.
> > >
> > > Right now we run this for ssh case: "ssh <host> git-upload-pack
> > > <repo-path>". New client can do this instead
> > >
> > > ssh <host> git-upload-pack <repo-path> <client capability flags>
> > 
> > Older servers will fail on this command, and the client must reconnect
> > over SSH, which may mean supplying their password/passphrase again.
> > But its remembered that the uploadPack2 didn't work so this can be
> > blacklisted and not retried for a while.
> 
> I wonder if we could use the environment for optional values. E.g., can
> we run:
> 
>   ssh host GIT_CAPABILITIES=... git-upload-pack <repo-path>
> 
> That will not work everywhere, of course. Sites with git-shell will
> fail, as will sites with custom ssh handler (GitHub, for example, and I
> imagine Gerrit sites, if they support ssh). So we'd still need some
> fallback, but it would work out-of-the-box in a reasonable number of
> cases (and it is really not that different than the http case, which is
> just stuffing the values into $QUERY_STRING anyway :) ).

Aggressively gc'ing linux-2.6 takes forever (and it's being timed so I
can't really do any heavy lifting), so I outlined what the new
protocol would be instead.

Note that at least for upload-pack client capabilities can be
advertised twice: the first time at transport connection level, the
second time in the first "want", like in v1. I think this will keep
the code change down when we have to support both protocols. Moving
all capabilities to the first negotiation may touch many places, but
that's for now a baseless guess.

The new capability negotiation is also added for push. We didn't pay
much attention to it so far.

I thought about "GIT_CAPABILITIES= git-upload-pack ..." (and actually
added it in pack-protocol.txt then deleted). The thing is, if you want
to new upload-pack, you would need new git-upload-pack at the remote
end that must understand "git-upload-pack <repo> <caps>"
already. Making it aware about GIT_CAPABILITIES is extra cost for
nothing. And we have to update git-shell to support it eventually.

Well, the "must understand" part is not entirely true. If you make
git-daemon pass the early capabilities via GIT_CAPABILITIES too,
upload-pack does not have to support "<repo> <caps>" syntax. The
upside is if old git-upload-pack ignores this GIT_CAPABILITIES, it'll
break the protocol (see below) and can print friendly error
messages. git-daemon has no way of printing friendly messages because
it can't negotiate side-band.

I'm still not sure. But we should support either way, not both. Anyway
the text for new protocols:

-- 8< --
diff --git a/Documentation/config.txt b/Documentation/config.txt
index 79e5768..c329eb1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2084,6 +2084,16 @@ remote.pushdefault::
 	`branch.<name>.remote` for all branches, and is overridden by
 	`branch.<name>.pushremote` for specific branches.
 
+remote.useUploadPack2::
+	Set to "always" to use only upload-pack version 2, "never" to
+	always use the original "upload-pack", "auto" to use the
+	original protocol, but if the remote claims it support version
+	2, then set "remote.<name>.useUploadPack2" to
+	"always". Default to "auto".
+
+remote.<name>.useUploadPack2::
+	Override remote.useUploadPack2 per remote.
+
 remote.<name>.url::
 	The URL of a remote repository.  See linkgit:git-fetch[1] or
 	linkgit:git-push[1].
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index 39c6410..3db4219 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -40,15 +40,22 @@ hostname parameter, terminated by a NUL byte.
 
    0032git-upload-pack /project.git\0host=myserver.com\0
 
+Some service may accept an extra argument (e.g. upload-pack version
+2). The extra argument must follow "host".
+
+   0042git-upload-pack /project.git\0host=myserver.com\0flags=someflags\0
+
 --
-   git-proto-request = request-command SP pathname NUL [ host-parameter NUL ]
+   git-proto-request = request-command SP pathname NUL
+		       [ host-parameter NUL [ flags-parameter NUL ] ]
    request-command   = "git-upload-pack" / "git-receive-pack" /
 		       "git-upload-archive"   ; case sensitive
    pathname          = *( %x01-ff ) ; exclude NUL
    host-parameter    = "host=" hostname [ ":" port ]
+   flags-parameter   = "flags=" *( %x01-ff ) ; exclude NULL
 --
 
-Only host-parameter is allowed in the git-proto-request. Clients
+No other parameters are allowed in the git-proto-request. Clients
 MUST NOT attempt to send additional parameters. It is used for the
 git-daemon name based virtual hosting.  See --interpolated-path
 option to git daemon, with the %H/%CH format characters.
@@ -77,6 +84,11 @@ It is basically equivalent to running this:
 
    $ ssh git.example.com "git-upload-pack '/project.git'"
 
+Some service may accept an extra argument (e.g. upload-pack version
+2). The extra argument is appended, e.g.
+
+   $ ssh git.example.com "git-upload-pack '/project.git' 'extra-flags'"
+
 For a server to support Git pushing and pulling for a given user over
 SSH, that user needs to be able to execute one or both of those
 commands via the SSH shell that they are provided on login.  On some
@@ -124,6 +136,20 @@ has, the first can 'fetch' from the second.  This operation determines
 what data the server has that the client does not then streams that
 data down to the client in packfile format.
 
+Initial capability negotiation
+------------------------------
+
+When the client connects to the server with the extra argument,
+upload-pack version 2 is used. Otherwise the original version is
+used. Unless explicitly stated, the original version is implied.
+
+When the client initially connects to the server using upload-pack
+version 2, the server MUST reply with one pkt-line describing its
+capabilities. Capabilities that are recognized by both ends are
+immediately effective.
+
+By default, upload-pack version 1's reference discovery will follow
+unless some capability makes it different.
 
 Reference Discovery
 -------------------
@@ -447,9 +473,14 @@ Reference Discovery
 
 The reference discovery phase is done nearly the same way as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
-in packet-line format to the client, followed by a flush-pkt.  The only
-real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+in packet-line format to the client, followed by a flush-pkt. Or with
+receive-pack version 2, a separate pkt-line containing capabilities is
+sent back, then followed by reference discovery unless some capability
+changes it.
+
+The only real difference is that the capability listing is different -
+the only possible values are 'report-status', 'delete-refs' and
+'ofs-delta'.
 
 Reference Update Request and Packfile Transfer
 ----------------------------------------------
diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt
index e174343..a165286 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -250,3 +250,8 @@ allow-tip-sha1-in-want
 If the upload-pack server advertises this capability, fetch-pack may
 send "want" lines with SHA-1s that exist at the server but are not
 advertised by upload-pack.
+
+uploadpack2
+-----------
+
+upload-pack version 2 is supported.
-- 8< --

^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH v3 0/8] Hiding refs
  2014-03-18 14:27                                     ` Duy Nguyen
@ 2014-03-18 14:36                                       ` Duy Nguyen
  0 siblings, 0 replies; 54+ messages in thread
From: Duy Nguyen @ 2014-03-18 14:36 UTC (permalink / raw)
  To: Jeff King
  Cc: Shawn Pearce, Junio C Hamano, Ævar Arnfjörð,
	Michael Haggerty, Jonathan Nieder, Git Mailing List

On Tue, Mar 18, 2014 at 9:27 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> I thought about "GIT_CAPABILITIES= git-upload-pack ..." (and actually
> added it in pack-protocol.txt then deleted). The thing is, if you want
> to new upload-pack, you would need new git-upload-pack at the remote
> end that must understand "git-upload-pack <repo> <caps>"
> already. Making it aware about GIT_CAPABILITIES is extra cost for
> nothing. And we have to update git-shell to support it eventually.
>
> Well, the "must understand" part is not entirely true. If you make
> git-daemon pass the early capabilities via GIT_CAPABILITIES too,
> upload-pack does not have to support "<repo> <caps>" syntax. The
> upside is if old git-upload-pack ignores this GIT_CAPABILITIES, it'll
> break the protocol (see below) and can print friendly error
> messages. git-daemon has no way of printing friendly messages because
> it can't negotiate side-band.

I should have read my mail one more time before sending. The
"git-upload-pack ignores..." sentence is wrong. If it's old, its
behavior is fixed and it cannot not send or do anything new.

But on the other hand, this is good. The new protocol expects
upload-pack to send its caps in a new pkt-line. The old upload-pack
does not follow this, which should be the indicator for the client
that this server does not support v2, so it could fall back to v1
gracefully. git:// still fails hard because git-daemon is likely old
too and rejects it from the beginning. But ssh:// (without git-shell)
should work, http:// too. This is a very good point for
GIT_CAPABILITIES.
-- 
Duy

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2014-03-18 14:36 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-30 18:45 [PATCH v3 0/8] Hiding refs Junio C Hamano
2013-01-30 18:45 ` [PATCH v3 1/8] upload-pack: share more code Junio C Hamano
2013-01-30 18:45 ` [PATCH v3 2/8] upload-pack: simplify request validation Junio C Hamano
2013-01-30 18:45 ` [PATCH v3 3/8] upload/receive-pack: allow hiding ref hierarchies Junio C Hamano
2013-02-05  8:50   ` Jeff King
2013-02-05 15:45     ` Junio C Hamano
2013-02-06 11:31       ` Jeff King
2013-02-06 15:57         ` Junio C Hamano
2013-01-30 18:45 ` [PATCH v3 4/8] parse_fetch_refspec(): clarify the codeflow a bit Junio C Hamano
2013-01-30 18:45 ` [PATCH v3 5/8] fetch: use struct ref to represent refs to be fetched Junio C Hamano
2013-01-30 18:45 ` [PATCH v3 6/8] upload-pack: optionally allow fetching from the tips of hidden refs Junio C Hamano
2013-01-30 18:45 ` [PATCH v3 7/8] fetch: fetch objects by their exact SHA-1 object names Junio C Hamano
2013-02-05  9:19   ` Jeff King
2013-02-05 11:18     ` Jeff King
2013-02-05 15:55     ` Junio C Hamano
2013-01-30 18:45 ` [PATCH v3 8/8] WIP: receive.allowupdatestohidden Junio C Hamano
2013-02-05  8:04 ` [PATCH v3 0/8] Hiding refs Michael Haggerty
2013-02-05  8:33   ` Jonathan Nieder
2013-02-05 10:29     ` Michael Haggerty
2013-02-05 17:38       ` Junio C Hamano
2013-02-06 10:34       ` Duy Nguyen
2013-02-06 19:17         ` Junio C Hamano
2013-02-06 19:45           ` Jonathan Nieder
2013-02-06 21:50           ` Michael Haggerty
2013-02-06 22:12             ` Junio C Hamano
2013-02-06 22:26           ` Ævar Arnfjörð Bjarmason
2013-02-07  0:12             ` Junio C Hamano
2013-02-07  0:16               ` Jeff King
2013-02-07 10:30                 ` Ævar Arnfjörð Bjarmason
2013-02-07 18:25                 ` Junio C Hamano
2014-02-23  2:44               ` Duy Nguyen
2014-03-11  1:49                 ` Jeff King
2014-03-11 19:32                   ` Junio C Hamano
2014-03-11 20:05                     ` Jeff King
2014-03-11 20:25                       ` Junio C Hamano
2014-03-11 20:36                         ` Jeff King
2014-03-14 12:37                           ` Duy Nguyen
2014-03-14 16:45                             ` Shawn Pearce
2014-03-14 23:30                               ` Duy Nguyen
2014-03-15  0:09                                 ` Shawn Pearce
2014-03-18  4:17                                   ` Jeff King
2014-03-18 14:27                                     ` Duy Nguyen
2014-03-18 14:36                                       ` Duy Nguyen
2014-03-15  1:23                   ` Duy Nguyen
2014-03-18  4:18                     ` Jeff King
2013-02-06 22:56           ` Jeff King
2013-02-05 17:36     ` Junio C Hamano
2013-02-05 17:27   ` Junio C Hamano
2013-02-06 10:17     ` Michael Haggerty
2013-02-06 19:55       ` Jonathan Nieder
2013-02-06 22:01         ` Michael Haggerty
2013-02-07 15:58       ` Jed Brown
2013-02-09 23:23         ` Junio C Hamano
2013-02-10  4:45           ` Jed Brown

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).