git@vger.kernel.org list mirror (unofficial, one of many)
* Cloning empty repository uses locally configured default branch name
@ 2020-12-08  1:31 Jonathan Tan
  2020-12-08  2:16 ` Junio C Hamano
                   ` (6 more replies)
  0 siblings, 7 replies; 109+ messages in thread
From: Jonathan Tan @ 2020-12-08  1:31 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

When cloning an empty repository, a local branch is created. But its
name is not the name of the branch that the remote HEAD points to - it
is the locally configured default branch name. This issue arose at
$DAYJOB and, from my memory, it is also not an uncommon workflow to
configure things online on a repo host and then use "git clone" so that
things like remotes are automatically configured.

Has anyone looked into solutions for this? Both protocol v0 and v2 do
not send symref information about unborn branches (v0 because, as
protocol-capabilities.txt says, "servers SHOULD include this capability
for the HEAD symref if it is one of the refs being sent"; v2 because
a symref is included only if it refers to one of the refs being sent).
In protocol v2, this could be done by adding a capability to ls-remote
(maybe, "unborn"), and in protocol v0, this could be done either by
updating the existing "symref" capability to be written even when the
target branch is unborn (which is potentially backwards incompatible) or
introducing a new capability which is like "symref".
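
As a rough illustration of the v0 mechanism discussed above, the sketch
below scans a space-separated capability string for an entry of the form
"symref=<name>:<target>", which is the shape that format_symref_info()
in upload-pack.c writes. The helper name find_symref_target() and its
signature are hypothetical; this is not Git's own client-side parser.

```c
#include <assert.h>
#include <string.h>

/*
 * Look for "symref=<name>:<target>" in a space-separated v0 capability
 * string and copy <target> into out (NUL-terminated).
 * Returns 1 if found and it fits, 0 otherwise.
 * Hypothetical helper for illustration only.
 */
static int find_symref_target(const char *caps, const char *name,
			      char *out, size_t outlen)
{
	size_t namelen;
	const char *p = caps;

	assert(caps && name && out);
	namelen = strlen(name);

	while (*p) {
		/* Length of the current token, up to the next space. */
		size_t toklen = strcspn(p, " ");

		if (toklen > 7 + namelen &&
		    !strncmp(p, "symref=", 7) &&
		    !strncmp(p + 7, name, namelen) &&
		    p[7 + namelen] == ':') {
			size_t tlen = toklen - (7 + namelen + 1);

			if (tlen == 0 || tlen >= outlen)
				return 0;
			memcpy(out, p + 7 + namelen + 1, tlen);
			out[tlen] = '\0';
			return 1;
		}
		/* Advance past this token and any separating spaces. */
		p += toklen;
		while (*p == ' ')
			p++;
	}
	return 0;
}
```

With a server that only advertises the symref when its target is one of
the refs being sent, a helper like this finds nothing for an empty
repository, which is the gap described above.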

A small issue is that upload-pack protocol v0 doesn't even write the
blank ref line ("000...000 capabilities^{}") if HEAD points to an unborn
branch, but that can be fixed as in the patch below.
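
For readers unfamiliar with the framing, the blank ref line travels as
an ordinary pkt-line: four hex digits giving the total length (payload
plus the four length bytes), then the payload. The sketch below shows
only that framing. format_pkt_line() is a hypothetical name; Git itself
goes through packet_write_fmt(), as in the patch below. The 65520-byte
cap is the limit given in Git's pkt-line documentation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write one pkt-line into out: a 4-hex-digit total length followed by
 * the payload. The payload may contain NUL bytes (the capability list
 * on the first ref line follows a NUL), so its length is passed in.
 * Returns the number of bytes written, or -1 if it does not fit.
 * Hypothetical helper for illustration only.
 */
static int format_pkt_line(char *out, size_t outlen,
			   const char *payload, size_t payload_len)
{
	size_t total = payload_len + 4;

	assert(out && payload);
	/* Keep room for a trailing NUL so out is printable when possible. */
	if (total > 65520 || total + 1 > outlen)
		return -1;
	snprintf(out, outlen, "%04x", (unsigned int)total);
	memcpy(out + 4, payload, payload_len);
	out[total] = '\0';
	return (int)total;
}
```

A flush-pkt is the literal "0000" and carries no payload, so it is not
produced by this helper.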

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 upload-pack.c | 40 +++++++++++++++++++++++++++-------------
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 1006bebd50..d2359a8560 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1179,18 +1179,15 @@ static void format_symref_info(struct strbuf *buf, struct string_list *symref)
 		strbuf_addf(buf, " symref=%s:%s", item->string, (char *)item->util);
 }
 
-static int send_ref(const char *refname, const struct object_id *oid,
-		    int flag, void *cb_data)
+static const char *capabilities = "multi_ack thin-pack side-band"
+	" side-band-64k ofs-delta shallow deepen-since deepen-not"
+	" deepen-relative no-progress include-tag multi_ack_detailed";
+
+static void write_ref_lines(const char *refname_nons,
+			    const struct object_id *oid,
+			    const struct object_id *peeled,
+			    struct upload_pack_data *data)
 {
-	static const char *capabilities = "multi_ack thin-pack side-band"
-		" side-band-64k ofs-delta shallow deepen-since deepen-not"
-		" deepen-relative no-progress include-tag multi_ack_detailed";
-	const char *refname_nons = strip_namespace(refname);
-	struct object_id peeled;
-	struct upload_pack_data *data = cb_data;
-
-	if (mark_our_ref(refname_nons, refname, oid))
-		return 0;
 
 	if (capabilities) {
 		struct strbuf symref_info = STRBUF_INIT;
@@ -1213,8 +1210,23 @@ static int send_ref(const char *refname, const struct object_id *oid,
 		packet_write_fmt(1, "%s %s\n", oid_to_hex(oid), refname_nons);
 	}
 	capabilities = NULL;
-	if (!peel_ref(refname, &peeled))
-		packet_write_fmt(1, "%s %s^{}\n", oid_to_hex(&peeled), refname_nons);
+	if (peeled)
+		packet_write_fmt(1, "%s %s^{}\n", oid_to_hex(peeled), refname_nons);
+}
+
+static int send_ref(const char *refname, const struct object_id *oid,
+		    int flag, void *cb_data)
+{
+	const char *refname_nons = strip_namespace(refname);
+	struct object_id peeled;
+	struct upload_pack_data *data = cb_data;
+
+	if (mark_our_ref(refname_nons, refname, oid))
+		return 0;
+	write_ref_lines(refname_nons,
+			oid,
+			peel_ref(refname, &peeled) ? NULL : &peeled,
+			data);
 	return 0;
 }
 
@@ -1332,6 +1344,8 @@ void upload_pack(struct upload_pack_options *options)
 		reset_timeout(data.timeout);
 		head_ref_namespaced(send_ref, &data);
 		for_each_namespaced_ref(send_ref, &data);
+		if (capabilities)
+			write_ref_lines("capabilities^{}", &null_oid, NULL, &data);
 		advertise_shallow_grafts(1);
 		packet_flush(1);
 	} else {
-- 
2.29.2.576.ga3fc446d84-goog


* Re: Cloning empty repository uses locally configured default branch name
  2020-12-08  1:31 Cloning empty repository uses locally configured default branch name Jonathan Tan
@ 2020-12-08  2:16 ` Junio C Hamano
  2020-12-08  2:32   ` brian m. carlson
  2020-12-08 18:55   ` Jonathan Tan
  2020-12-08 15:58 ` Jeff King
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 109+ messages in thread
From: Junio C Hamano @ 2020-12-08  2:16 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> Has anyone looked into solutions for this? Both protocol v0 and v2 do
> not send symref information about unborn branches (v0 because, as
> protocol-capabilities.txt says, "servers SHOULD include this capability
> for the HEAD symref if it is one of the refs being sent"; v2 because
> a symref is included only if it refers to one of the refs being sent).
> In protocol v2, this could be done by adding a capability to ls-remote
> (maybe, "unborn"), and in protocol v0, this could be done either by
> updating the existing "symref" capability to be written even when the
> target branch is unborn (which is potentially backwards incompatible) or
> introducing a new capability which is like "symref".

Thanks for looking into this (I think this came up again today
during my reviews of some topic).

It would be a backward incompatible change to add to v0, but at this
point shouldn't we be leaving v0 as-is and move everybody to v2?

If it is a simple and safe enough change, though, saying "why not"
is very tempting.

> A small issue is that upload-pack protocol v0 doesn't even write the
> blank ref line ("000...000 capabilities^{}") if HEAD points to an unborn
> branch, but that can be fixed as in the patch below.

I think the codepaths we have today in process_capabilities() and
process_dummy_ref() (both in connect.c) would do the right thing
when it sees a blank ref line even when nothing gets transported,
but I smell that the rewrite of this state machine is fairly recent
(say in the past few years) and I do not offhand know if clients
before the rewrite of the state machine (say in v2.18.0) would be OK
with the change.  It should be easy to check, though.

> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  upload-pack.c | 40 +++++++++++++++++++++++++++-------------
>  1 file changed, 27 insertions(+), 13 deletions(-)
>
> diff --git a/upload-pack.c b/upload-pack.c
> index 1006bebd50..d2359a8560 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -1179,18 +1179,15 @@ static void format_symref_info(struct strbuf *buf, struct string_list *symref)
>  		strbuf_addf(buf, " symref=%s:%s", item->string, (char *)item->util);
>  }
>  
> -static int send_ref(const char *refname, const struct object_id *oid,
> -		    int flag, void *cb_data)
> +static const char *capabilities = "multi_ack thin-pack side-band"
> +	" side-band-64k ofs-delta shallow deepen-since deepen-not"
> +	" deepen-relative no-progress include-tag multi_ack_detailed";
> +
> +static void write_ref_lines(const char *refname_nons,
> +			    const struct object_id *oid,
> +			    const struct object_id *peeled,
> +			    struct upload_pack_data *data)
>  {
> -	static const char *capabilities = "multi_ack thin-pack side-band"
> -		" side-band-64k ofs-delta shallow deepen-since deepen-not"
> -		" deepen-relative no-progress include-tag multi_ack_detailed";
> -	const char *refname_nons = strip_namespace(refname);
> -	struct object_id peeled;
> -	struct upload_pack_data *data = cb_data;
> -
> -	if (mark_our_ref(refname_nons, refname, oid))
> -		return 0;
>  
>  	if (capabilities) {
>  		struct strbuf symref_info = STRBUF_INIT;
> @@ -1213,8 +1210,23 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  		packet_write_fmt(1, "%s %s\n", oid_to_hex(oid), refname_nons);
>  	}
>  	capabilities = NULL;
> -	if (!peel_ref(refname, &peeled))
> -		packet_write_fmt(1, "%s %s^{}\n", oid_to_hex(&peeled), refname_nons);
> +	if (peeled)
> +		packet_write_fmt(1, "%s %s^{}\n", oid_to_hex(peeled), refname_nons);
> +}
> +
> +static int send_ref(const char *refname, const struct object_id *oid,
> +		    int flag, void *cb_data)
> +{
> +	const char *refname_nons = strip_namespace(refname);
> +	struct object_id peeled;
> +	struct upload_pack_data *data = cb_data;
> +
> +	if (mark_our_ref(refname_nons, refname, oid))
> +		return 0;
> +	write_ref_lines(refname_nons,
> +			oid,
> +			peel_ref(refname, &peeled) ? NULL : &peeled,
> +			data);
>  	return 0;
>  }
>  
> @@ -1332,6 +1344,8 @@ void upload_pack(struct upload_pack_options *options)
>  		reset_timeout(data.timeout);
>  		head_ref_namespaced(send_ref, &data);
>  		for_each_namespaced_ref(send_ref, &data);
> +		if (capabilities)
> +			write_ref_lines("capabilities^{}", &null_oid, NULL, &data);
>  		advertise_shallow_grafts(1);
>  		packet_flush(1);
>  	} else {

* Re: Cloning empty repository uses locally configured default branch name
  2020-12-08  2:16 ` Junio C Hamano
@ 2020-12-08  2:32   ` brian m. carlson
  2020-12-08 18:55   ` Jonathan Tan
  1 sibling, 0 replies; 109+ messages in thread
From: brian m. carlson @ 2020-12-08  2:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git

On 2020-12-08 at 02:16:07, Junio C Hamano wrote:
> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > Has anyone looked into solutions for this? Both protocol v0 and v2 do
> > not send symref information about unborn branches (v0 because, as
> > protocol-capabilities.txt says, "servers SHOULD include this capability
> > for the HEAD symref if it is one of the refs being sent"; v2 because
> > a symref is included only if it refers to one of the refs being sent).
> > In protocol v2, this could be done by adding a capability to ls-remote
> > (maybe, "unborn"), and in protocol v0, this could be done either by
> > updating the existing "symref" capability to be written even when the
> > target branch is unborn (which is potentially backwards incompatible) or
> > introducing a new capability which is like "symref".
> 
> Thanks for looking into this (I think this came up again today
> during my reviews of some topic).
> 
> It would be a backward incompatible change to add to v0, but at this
> point shouldn't we be leaving v0 as-is and move everybody to v2?
> 
> If it is a simple and safe enough change, though, saying "why not"
> is very tempting.

Yeah, I think this would be a nice thing to add to v2.  I've considered
adding a way to push symrefs (that is, update the head on the remote
side), but that would be a bit trickier.  Still, there's no reason the
fetch side couldn't learn a "symref" capability in the meantime.

I don't see a need for this in v0, since all versions of Git that
support this will also support v2.  I think it's okay if other clients
have to add support for v2 before they get the cool new features.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US


* Re: Cloning empty repository uses locally configured default branch name
  2020-12-08  1:31 Cloning empty repository uses locally configured default branch name Jonathan Tan
  2020-12-08  2:16 ` Junio C Hamano
@ 2020-12-08 15:58 ` Jeff King
  2020-12-08 20:06   ` Jonathan Tan
  2020-12-11 21:05 ` [PATCH] clone: in protocol v2, use remote's default branch Jonathan Tan
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 109+ messages in thread
From: Jeff King @ 2020-12-08 15:58 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Mon, Dec 07, 2020 at 05:31:20PM -0800, Jonathan Tan wrote:

> When cloning an empty repository, a local branch is created. But its
> name is not the name of the branch that the remote HEAD points to - it
> is the locally configured default branch name. This issue arose at
> $DAYJOB and, from my memory, it is also not an uncommon workflow to
> configure things online on a repo host and then use "git clone" so that
> things like remotes are automatically configured.
> 
> Has anyone looked into solutions for this? Both protocol v0 and v2 do
> not send symref information about unborn branches (v0 because, as
> protocol-capabilities.txt says, "servers SHOULD include this capability
> for the HEAD symref if it is one of the refs being sent"; v2 because
> a symref is included only if it refers to one of the refs being sent).
> In protocol v2, this could be done by adding a capability to ls-remote
> (maybe, "unborn"), and in protocol v0, this could be done either by
> updating the existing "symref" capability to be written even when the
> target branch is unborn (which is potentially backwards incompatible) or
> introducing a new capability which is like "symref".

We discussed this a few years ago, and I even wrote a small patch (for
v0 at the time, of course):

  https://lore.kernel.org/git/20170525155924.hk5jskennph6tta3@sigill.intra.peff.net/

A rebased version of that patch is below (it needed updating to handle
some namespacing stuff). Coupled with your patch here for the truly
empty repo case, it makes the server side of v0 do what you'd want.

But the client side needs to handle it, too. See the linked thread for
some discussion.

I wouldn't be too worried about the backwards incompatibility of sending
a symref line in the capabilities that doesn't point to a ref we're
sending. Old clients are quite likely to ignore it. But...

> A small issue is that upload-pack protocol v0 doesn't even write the
> blank ref line ("000...000 capabilities^{}") if HEAD points to an unborn
> branch, but that can be fixed as in the patch below.

I would worry how clients handle this bogus entry in the ref
advertisement. It looks like the actual Git client is OK, but what about
jgit, libgit2, etc? That's not necessarily a deal-breaker, but it would
be nice to know how they react.

It also only helps with v0 (and I agree with the sentiment that it would
be OK to ignore v0 at this point). For v2, we'd have to issue a HEAD
line like:

  0000000000000000000000000000000000000000 HEAD symref=refs/heads/foo

That probably would break clients, but the unborn capability should take
care of that.

Patch below (I think it only helps v0, but it could serve as a model for
doing the same thing in v2).

---
 upload-pack.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 1006bebd50..b0cc337dcb 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1218,20 +1218,21 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
-static int find_symref(const char *refname, const struct object_id *oid,
-		       int flag, void *cb_data)
+static void find_symref(const char *refname, struct string_list *out)
 {
 	const char *symref_target;
 	struct string_list_item *item;
+	struct strbuf namespaced = STRBUF_INIT;
+	int flag;
+
+	strbuf_addf(&namespaced, "%s%s", get_git_namespace(), refname);
+	symref_target = resolve_ref_unsafe(namespaced.buf, 0, NULL, &flag);
+	strbuf_release(&namespaced);
 
-	if ((flag & REF_ISSYMREF) == 0)
-		return 0;
-	symref_target = resolve_ref_unsafe(refname, 0, NULL, &flag);
 	if (!symref_target || (flag & REF_ISSYMREF) == 0)
-		die("'%s' is a symref but it is not?", refname);
-	item = string_list_append(cb_data, strip_namespace(refname));
+		return;
+	item = string_list_append(out, refname);
 	item->util = xstrdup(strip_namespace(symref_target));
-	return 0;
 }
 
 static int parse_object_filter_config(const char *var, const char *value,
@@ -1326,7 +1327,7 @@ void upload_pack(struct upload_pack_options *options)
 	data.daemon_mode = options->daemon_mode;
 	data.timeout = options->timeout;
 
-	head_ref_namespaced(find_symref, &data.symref);
+	find_symref("HEAD", &data.symref);
 
 	if (options->advertise_refs || !data.stateless_rpc) {
 		reset_timeout(data.timeout);
-- 
2.29.2.980.g00fe049108


* Re: Cloning empty repository uses locally configured default branch name
  2020-12-08  2:16 ` Junio C Hamano
  2020-12-08  2:32   ` brian m. carlson
@ 2020-12-08 18:55   ` Jonathan Tan
  2020-12-08 21:00     ` Junio C Hamano
  1 sibling, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2020-12-08 18:55 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > Has anyone looked into solutions for this? Both protocol v0 and v2 do
> > not send symref information about unborn branches (v0 because, as
> > protocol-capabilities.txt says, "servers SHOULD include this capability
> > for the HEAD symref if it is one of the refs being sent"; v2 because
> > a symref is included only if it refers to one of the refs being sent).
> > In protocol v2, this could be done by adding a capability to ls-remote
> > (maybe, "unborn"), and in protocol v0, this could be done either by
> > updating the existing "symref" capability to be written even when the
> > target branch is unborn (which is potentially backwards incompatible) or
> > introducing a new capability which is like "symref".
> 
> Thanks for looking into this (I think this came up again today
> during my reviews of some topic).
> 
> It would be a backward incompatible change to add to v0, but at this
> point shouldn't we be leaving v0 as-is and move everybody to v2?

That makes sense.

> If it is a simple and safe enough change, though, saying "why not"
> is very tempting.

I'll look into how simple and safe it is.

> > A small issue is that upload-pack protocol v0 doesn't even write the
> > blank ref line ("000...000 capabilities^{}") if HEAD points to an unborn
> > branch, but that can be fixed as in the patch below.
> 
> I think the codepaths we have today in process_capabilities() and
> process_dummy_ref() (both in connect.c) would do the right thing
> when it sees a blank ref line even when nothing gets transported,
> but I smell that the rewrite of this state machine is fairly recent
> (say in the past few years) and I do not offhand know if clients
> before the rewrite of the state machine (say in v2.18.0) would be OK
> with the change.  It should be easy to check, though.

Yes - I backported my patch to v2.17.0 and it works:

  $ GIT_TRACE_PACKET=1 ~/git/bin-wrappers/git ls-remote "file://$(pwd)/empty"
  10:49:33.474111 pkt-line.c:80           packet:  upload-pack> 0000000000000000000000000000000000000000 capabilities^{}\0multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed agent=git/2.17.0.dirty
  10:49:33.474182 pkt-line.c:80           packet:  upload-pack> 0000
  10:49:33.474243 pkt-line.c:80           packet:          git< 0000000000000000000000000000000000000000 capabilities^{}\0multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed agent=git/2.17.0.dirty
  10:49:33.474315 pkt-line.c:80           packet:          git< 0000
  10:49:33.474320 pkt-line.c:80           packet:          git> 0000
  10:49:33.474358 pkt-line.c:80           packet:  upload-pack< 0000

* Re: Cloning empty repository uses locally configured default branch name
  2020-12-08 15:58 ` Jeff King
@ 2020-12-08 20:06   ` Jonathan Tan
  2020-12-08 21:15     ` Jeff King
  0 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2020-12-08 20:06 UTC (permalink / raw)
  To: peff; +Cc: jonathantanmy, git

> On Mon, Dec 07, 2020 at 05:31:20PM -0800, Jonathan Tan wrote:
> 
> > When cloning an empty repository, a local branch is created. But its
> > name is not the name of the branch that the remote HEAD points to - it
> > is the locally configured default branch name. This issue arose at
> > $DAYJOB and, from my memory, it is also not an uncommon workflow to
> > configure things online on a repo host and then use "git clone" so that
> > things like remotes are automatically configured.
> > 
> > Has anyone looked into solutions for this? Both protocol v0 and v2 do
> > not send symref information about unborn branches (v0 because, as
> > protocol-capabilities.txt says, "servers SHOULD include this capability
> > for the HEAD symref if it is one of the refs being sent"; v2 because
> > a symref is included only if it refers to one of the refs being sent).
> > In protocol v2, this could be done by adding a capability to ls-remote
> > (maybe, "unborn"), and in protocol v0, this could be done either by
> > updating the existing "symref" capability to be written even when the
> > target branch is unborn (which is potentially backwards incompatible) or
> > introducing a new capability which is like "symref".
> 
> We discussed this a few years ago, and I even wrote a small patch (for
> v0 at the time, of course):
> 
>   https://lore.kernel.org/git/20170525155924.hk5jskennph6tta3@sigill.intra.peff.net/
> 
> A rebased version of that patch is below (it needed updating to handle
> some namespacing stuff). Coupled with your patch here for the truly
> empty repo case, it makes the server side of v0 do what you'd want.
> 
> But the client side needs to handle it, too. See the linked thread for
> some discussion.

Thanks for the pointer.

> I wouldn't be too worried about the backwards incompatibility of sending
> a symref line in the capabilities that doesn't point to a ref we're
> sending. Old clients are quite likely to ignore it. But...
> 
> > A small issue is that upload-pack protocol v0 doesn't even write the
> > blank ref line ("000...000 capabilities^{}") if HEAD points to an unborn
> > branch, but that can be fixed as in the patch below.
> 
> I would worry how clients handle this bogus entry in the ref
> advertisement. It looks like the actual Git client is OK, but what about
> jgit, libgit2, etc? That's not necessarily a deal-breaker, but it would
> be nice to know how they react.

That bogus entry is defined in the protocol and JGit both produces and
consumes that line. Consumption was verified by patching Git with my
patch and running the following commands in separate terminals:

~/git/bin-wrappers/git daemon --port=9425 --base-path=. .
sudo tcpdump -i any port 9425 -w -
~/jgit/bazel-bin/org.eclipse.jgit.pgm/jgit ls-remote git://localhost:9425/empty

And production:

~/jgit/bazel-bin/org.eclipse.jgit.pgm/jgit daemon --port=9426 .
GIT_TRACE_PACKET=1 git ls-remote git://localhost:9426/empty

(Note that the JGit CLI does not have a separate --base-path parameter.)

I have not checked libgit2, but quite a few servers use JGit out there,
so it presumably should be able to interoperate with them and hence
support the bogus entry.

> It also only helps with v0 (and I agree with the sentiment that it would
> be OK to ignore v0 at this point). For v2, we'd have to issue a HEAD
> line like:
> 
>   0000000000000000000000000000000000000000 HEAD symref=refs/heads/foo
> 
> That probably would break clients, but the unborn capability should take
> care of that.

Yes - or a special string like "unborn" in place of the 000...000.

* Re: Cloning empty repository uses locally configured default branch name
  2020-12-08 18:55   ` Jonathan Tan
@ 2020-12-08 21:00     ` Junio C Hamano
  0 siblings, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2020-12-08 21:00 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

>> > A small issue is that upload-pack protocol v0 doesn't even write the
>> > blank ref line ("000...000 capabilities^{}") if HEAD points to an unborn
>> > branch, but that can be fixed as in the patch below.
>> 
>> I think the codepaths we have today in process_capabilities() and
>> process_dummy_ref() (both in connect.c) would do the right thing
>> when it sees a blank ref line even when nothing gets transported,
>> but I smell that the rewrite of this state machine is fairly recent
>> (say in the past few years) and I do not offhand know if clients
>> before the rewrite of the state machine (say in v2.18.0) would be OK
>> with the change.  It should be easy to check, though.
>
> Yes - I backported my patch to v2.17.0 and it works:

I wouldn't be surprised if other reimplementations of Git (like
jgit, libgit2, and those in Go or Python or whatever your favorite
language is) barf, though.

* Re: Cloning empty repository uses locally configured default branch name
  2020-12-08 20:06   ` Jonathan Tan
@ 2020-12-08 21:15     ` Jeff King
  0 siblings, 0 replies; 109+ messages in thread
From: Jeff King @ 2020-12-08 21:15 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Tue, Dec 08, 2020 at 12:06:49PM -0800, Jonathan Tan wrote:

> > I would worry how clients handle this bogus entry in the ref
> > advertisement. It looks like the actual Git client is OK, but what about
> > jgit, libgit2, etc? That's not necessarily a deal-breaker, but it would
> > be nice to know how they react.
> 
> That bogus entry is defined in the protocol and JGit both produces and
> consumes that line. Consumption was verified by patching Git with my
> patch and running the following commands in separate terminals:

Ah, indeed.

I forgot that we went through all of this a few years ago for your
eb398797cd (connect: advertized capability is not a ref, 2016-09-09). I
stand behind the "it was probably originally an error in the protocol
documentation" from [1], but at this point I think we can say it's a
supported part of the protocol.

All of this is moot, of course, if we only do the v2 solution. :)

-Peff

[1] https://lore.kernel.org/git/20160902233547.mzgluioc7hhabalw@sigill.intra.peff.net/

* [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-08  1:31 Cloning empty repository uses locally configured default branch name Jonathan Tan
  2020-12-08  2:16 ` Junio C Hamano
  2020-12-08 15:58 ` Jeff King
@ 2020-12-11 21:05 ` Jonathan Tan
  2020-12-11 23:41   ` Junio C Hamano
                     ` (5 more replies)
  2021-01-26 18:55 ` [PATCH v5 " Jonathan Tan
                   ` (3 subsequent siblings)
  6 siblings, 6 replies; 109+ messages in thread
From: Jonathan Tan @ 2020-12-11 21:05 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

When cloning an empty repository, a default branch is created. However,
it is named after the locally configured init.defaultBranch, not the
default branch of the remote repository.

To solve this, the remote needs to communicate the target of the HEAD
symref, and "git clone" needs to use this information.

Currently, symrefs that have unborn targets (such as in this case) are
not communicated by the protocol. Teach Git to advertise and support the
"unborn" feature in "ls-refs" (guarded by the lsrefs.unborn config).
This feature indicates that "ls-refs" supports the "unborn" argument;
when it is specified, "ls-refs" will send the HEAD symref with the name
of its unborn target.

On the client side, Git will always send the "unborn" argument if it is
supported by the server. During "git clone", if cloning an empty
repository, Git will use the new information to determine the local
branch to create. In all other cases, Git will ignore it.
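
The client-side choice described above can be sketched roughly as
follows. This assumes the server line has exactly the form
"unborn HEAD symref-target:refs/heads/<branch>" with no further
attributes and with the trailing LF already removed. Both helper names
are hypothetical, and this is not the code in the patch, which threads
the target through transport_get_remote_refs() and uses skip_prefix().

```c
#include <assert.h>
#include <string.h>

/*
 * Return a pointer to the branch name if line is
 *   "unborn HEAD symref-target:refs/heads/<branch>"
 * and NULL otherwise. The returned pointer aliases line.
 * Hypothetical helper for illustration only.
 */
static const char *unborn_head_branch(const char *line)
{
	static const char prefix[] = "unborn HEAD symref-target:refs/heads/";
	size_t len = sizeof(prefix) - 1;

	assert(line);
	/* Reject a non-matching prefix or an empty branch name. */
	if (strncmp(line, prefix, len) || !line[len])
		return NULL;
	return line + len;
}

/* Fall back to the locally configured name when nothing usable was sent. */
static const char *pick_initial_branch(const char *line,
				       const char *local_default)
{
	const char *branch = unborn_head_branch(line);

	return branch ? branch : local_default;
}
```

A HEAD whose unborn target lies outside refs/heads/ falls through to
the local default, which mirrors the skip_prefix() check in the
builtin/clone.c hunk below.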

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/technical/protocol-v2.txt | 10 ++++-
 builtin/clone.c                         | 19 +++++++--
 builtin/fetch-pack.c                    |  3 +-
 builtin/fetch.c                         |  2 +-
 builtin/ls-remote.c                     |  2 +-
 builtin/remote.c                        |  2 +-
 connect.c                               | 29 +++++++++++--
 ls-refs.c                               | 54 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 remote.h                                |  3 +-
 serve.c                                 |  2 +-
 t/t5606-clone-options.sh                |  7 ++--
 t/t5702-protocol-v2.sh                  | 17 ++++++++
 transport-helper.c                      |  7 +++-
 transport-internal.h                    | 13 +++---
 transport.c                             | 29 ++++++++-----
 transport.h                             |  7 +++-
 17 files changed, 165 insertions(+), 42 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index e597b74da3..dfe03aa114 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -192,11 +192,19 @@ ls-refs takes in the following arguments:
 	When specified, only references having a prefix matching one of
 	the provided prefixes are displayed.
 
+If the 'unborn' feature is advertised the following argument can be
+included in the client's request.
+
+    unborn
+	The server may send symrefs pointing to unborn branches in the form
+	"unborn <refname> symref-target:<target>".
+
 The output of ls-refs is as follows:
 
     output = *ref
 	     flush-pkt
-    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+    obj-id-or-unborn = (obj-id | "unborn")
+    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
     ref-attribute = (symref | peeled)
     symref = "symref-target:" symref-target
     peeled = "peeled:" obj-id
diff --git a/builtin/clone.c b/builtin/clone.c
index a0841923cf..217c87fddf 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -980,6 +980,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	int submodule_progress;
 
 	struct strvec ref_prefixes = STRVEC_INIT;
+	char *unborn_head_target = NULL;
 
 	packet_trace_identity("clone");
 
@@ -1264,7 +1265,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (!option_no_tags)
 		strvec_push(&ref_prefixes, "refs/tags/");
 
-	refs = transport_get_remote_refs(transport, &ref_prefixes);
+	refs = transport_get_remote_refs(transport, &ref_prefixes,
+					 &unborn_head_target);
 
 	if (refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
@@ -1323,10 +1325,20 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		remote_head = NULL;
 		option_no_checkout = 1;
 		if (!option_bare) {
-			const char *branch = git_default_branch_name();
-			char *ref = xstrfmt("refs/heads/%s", branch);
+			const char *branch;
+			char *ref;
+
+			if (unborn_head_target &&
+			    skip_prefix(unborn_head_target, "refs/heads/", &branch)) {
+				ref = unborn_head_target;
+				unborn_head_target = NULL;
+			} else {
+				branch = git_default_branch_name();
+				ref = xstrfmt("refs/heads/%s", branch);
+			}
 
 			install_branch_config(0, branch, remote_name, ref);
+			create_symref("HEAD", ref, "");
 			free(ref);
 		}
 	}
@@ -1373,6 +1385,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	strbuf_release(&key);
 	junk_mode = JUNK_LEAVE_ALL;
 
+	free(unborn_head_target);
 	strvec_clear(&ref_prefixes);
 	return err;
 }
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 58b7c1fbdc..9f921dfab4 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	version = discover_version(&reader);
 	switch (version) {
 	case protocol_v2:
-		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
+		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
+				args.stateless_rpc, NULL);
 		break;
 	case protocol_v1:
 	case protocol_v0:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index ecf8537605..a7ef59acfc 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1393,7 +1393,7 @@ static int do_fetch(struct transport *transport,
 
 	if (must_list_refs) {
 		trace2_region_enter("fetch", "remote_refs", the_repository);
-		remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
+		remote_refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 		trace2_region_leave("fetch", "remote_refs", the_repository);
 	} else
 		remote_refs = NULL;
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index 092917eca2..4cf3f60b1b 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -118,7 +118,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (server_options.nr)
 		transport->server_options = &server_options;
 
-	ref = transport_get_remote_refs(transport, &ref_prefixes);
+	ref = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 	if (ref) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 		repo_set_hash_algo(the_repository, hash_algo);
diff --git a/builtin/remote.c b/builtin/remote.c
index c1b211b272..246e62f118 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -950,7 +950,7 @@ static int get_remote_ref_states(const char *name,
 	if (query) {
 		transport = transport_get(states->remote, states->remote->url_nr > 0 ?
 			states->remote->url[0] : NULL);
-		remote_refs = transport_get_remote_refs(transport, NULL);
+		remote_refs = transport_get_remote_refs(transport, NULL, NULL);
 		transport_disconnect(transport);
 
 		states->queried = 1;
diff --git a/connect.c b/connect.c
index 8b8f56cf6d..3c35324b4c 100644
--- a/connect.c
+++ b/connect.c
@@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 }
 
 /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
-static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
+static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
+			  char **unborn_head_target)
 {
 	int ret = 1;
 	int i = 0;
@@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
 		goto out;
 	}
 
+	if (!strcmp("unborn", line_sections.items[i].string)) {
+		i++;
+		if (unborn_head_target &&
+		    !strcmp("HEAD", line_sections.items[i++].string)) {
+			/*
+			 * Look for the symref target (if any). If found,
+			 * return it to the caller.
+			 */
+			for (; i < line_sections.nr; i++) {
+				const char *arg = line_sections.items[i].string;
+
+				if (skip_prefix(arg, "symref-target:", &arg)) {
+					*unborn_head_target = xstrdup(arg);
+					break;
+				}
+			}
+		}
+		goto out;
+	}
 	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
 	    *end) {
 		ret = 0;
@@ -455,7 +475,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct strvec *ref_prefixes,
 			     const struct string_list *server_options,
-			     int stateless_rpc)
+			     int stateless_rpc,
+			     char **unborn_head_target)
 {
 	int i;
 	const char *hash_name;
@@ -488,6 +509,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	if (!for_push)
 		packet_write_fmt(fd_out, "peel\n");
 	packet_write_fmt(fd_out, "symrefs\n");
+	if (server_supports_feature("ls-refs", "unborn", 0))
+		packet_write_fmt(fd_out, "unborn\n");
 	for (i = 0; ref_prefixes && i < ref_prefixes->nr; i++) {
 		packet_write_fmt(fd_out, "ref-prefix %s\n",
 				 ref_prefixes->v[i]);
@@ -496,7 +519,7 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
-		if (!process_ref_v2(reader, &list))
+		if (!process_ref_v2(reader, &list, unborn_head_target))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
 
diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..fdb644b482 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -32,6 +32,8 @@ struct ls_refs_data {
 	unsigned peel;
 	unsigned symrefs;
 	struct strvec prefixes;
+	unsigned allow_unborn : 1;
+	unsigned unborn : 1;
 };
 
 static int send_ref(const char *refname, const struct object_id *oid,
@@ -47,7 +49,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	if (!ref_match(&data->prefixes, refname_nons))
 		return 0;
 
-	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (oid)
+		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	else
+		strbuf_addf(&refline, "unborn %s", refname_nons);
 	if (data->symrefs && flag & REF_ISSYMREF) {
 		struct object_id unused;
 		const char *symref_target = resolve_ref_unsafe(refname, 0,
@@ -74,8 +79,28 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
-static int ls_refs_config(const char *var, const char *value, void *data)
+static void send_possibly_unborn_head(struct ls_refs_data *data)
 {
+	struct strbuf namespaced = STRBUF_INIT;
+	struct object_id oid;
+	int flag;
+	int null_oid;
+
+	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
+	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
+	null_oid = is_null_oid(&oid);
+	if (!null_oid || (data->symrefs && (flag & REF_ISSYMREF)))
+		send_ref(namespaced.buf, null_oid ? NULL : &oid, flag, data);
+	strbuf_release(&namespaced);
+}
+
+static int ls_refs_config(const char *var, const char *value, void *cb_data)
+{
+	struct ls_refs_data *data = cb_data;
+
+	if (!strcmp("lsrefs.unborn", var))
+		data->allow_unborn = !strcmp(value, "allow") ||
+			!strcmp(value, "advertise");
 	/*
 	 * We only serve fetches over v2 for now, so respect only "uploadpack"
 	 * config. This may need to eventually be expanded to "receive", but we
@@ -91,7 +116,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
 
 	memset(&data, 0, sizeof(data));
 
-	git_config(ls_refs_config, NULL);
+	git_config(ls_refs_config, &data);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
 		const char *arg = request->line;
@@ -103,14 +128,35 @@ int ls_refs(struct repository *r, struct strvec *keys,
 			data.symrefs = 1;
 		else if (skip_prefix(arg, "ref-prefix ", &out))
 			strvec_push(&data.prefixes, out);
+		else if (data.allow_unborn && !strcmp("unborn", arg))
+			data.unborn = 1;
 	}
 
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ls-refs arguments"));
 
-	head_ref_namespaced(send_ref, &data);
+	if (data.unborn)
+		send_possibly_unborn_head(&data);
+	else
+		head_ref_namespaced(send_ref, &data);
 	for_each_namespaced_ref(send_ref, &data);
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
 	return 0;
 }
+
+int ls_refs_advertise(struct repository *r, struct strbuf *value)
+{
+	if (value) {
+		char *str = NULL;
+
+		if (!repo_config_get_string(the_repository, "lsrefs.unborn",
+					    &str) &&
+		    !strcmp("advertise", str)) {
+			strbuf_addstr(value, "unborn");
+			free(str);
+		}
+	}
+
+	return 1;
+}
diff --git a/ls-refs.h b/ls-refs.h
index 7b33a7c6b8..a99e4be0bd 100644
--- a/ls-refs.h
+++ b/ls-refs.h
@@ -6,5 +6,6 @@ struct strvec;
 struct packet_reader;
 int ls_refs(struct repository *r, struct strvec *keys,
 	    struct packet_reader *request);
+int ls_refs_advertise(struct repository *r, struct strbuf *value);
 
 #endif /* LS_REFS_H */
diff --git a/remote.h b/remote.h
index 3211abdf05..967f2178d8 100644
--- a/remote.h
+++ b/remote.h
@@ -198,7 +198,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct strvec *ref_prefixes,
 			     const struct string_list *server_options,
-			     int stateless_rpc);
+			     int stateless_rpc,
+			     char **unborn_head_target);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 
diff --git a/serve.c b/serve.c
index f6341206c4..30cb56d507 100644
--- a/serve.c
+++ b/serve.c
@@ -62,7 +62,7 @@ struct protocol_capability {
 
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
-	{ "ls-refs", always_advertise, ls_refs },
+	{ "ls-refs", ls_refs_advertise, ls_refs },
 	{ "fetch", upload_pack_advertise, upload_pack_v2 },
 	{ "server-option", always_advertise, NULL },
 	{ "object-format", object_format_advertise, NULL },
diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
index 7f082fb23b..d3bd79987b 100755
--- a/t/t5606-clone-options.sh
+++ b/t/t5606-clone-options.sh
@@ -102,11 +102,12 @@ test_expect_success 'redirected clone -v does show progress' '
 '
 
 test_expect_success 'chooses correct default initial branch name' '
-	git init --bare empty &&
+	git -c init.defaultBranch=foo init --bare empty &&
+	test_config -C empty lsrefs.unborn advertise &&
 	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
 	git -c init.defaultBranch=up clone empty whats-up &&
-	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
-	test refs/heads/up = $(git -C whats-up config branch.up.merge)
+	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
+	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
 '
 
 test_expect_success 'guesses initial branch name correctly' '
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d5b17909b..380333b662 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -209,6 +209,23 @@ test_expect_success 'clone with file:// using protocol v2' '
 	grep "ref-prefix refs/tags/" log
 '
 
+test_expect_success 'clone of empty repo propagates name of default branch' '
+	git -c init.defaultbranch=mydefaultbranch init file_empty_parent &&
+	test_config -C file_empty_parent lsrefs.unborn advertise &&
+
+	git -c init.defaultbranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
+test_expect_success '...but not if it is not advertised' '
+	test_config -C file_empty_parent lsrefs.unborn none &&
+
+	git -c init.defaultbranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child_2 &&
+	grep "refs/heads/main" file_empty_child_2/.git/HEAD
+'
+
 test_expect_success 'fetch with file:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
diff --git a/transport-helper.c b/transport-helper.c
index 5f6e0b3bd8..5d97eba935 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1162,13 +1162,16 @@ static int has_attribute(const char *attrs, const char *attr)
 }
 
 static struct ref *get_refs_list(struct transport *transport, int for_push,
-				 const struct strvec *ref_prefixes)
+				 const struct strvec *ref_prefixes,
+				 char **unborn_head_target)
 {
 	get_helper(transport);
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push, ref_prefixes);
+		return transport->vtable->get_refs_list(transport, for_push,
+							ref_prefixes,
+							unborn_head_target);
 	}
 
 	return get_refs_list_using_list(transport, for_push);
diff --git a/transport-internal.h b/transport-internal.h
index 27c9daffc4..5037f6197d 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -18,19 +18,16 @@ struct transport_vtable {
 	 * the transport to try to share connections, for_push is a
 	 * hint as to whether the ultimate operation is a push or a fetch.
 	 *
-	 * If communicating using protocol v2 a list of prefixes can be
-	 * provided to be sent to the server to enable it to limit the ref
-	 * advertisement.  Since ref filtering is done on the server's end, and
-	 * only when using protocol v2, this list will be ignored when not
-	 * using protocol v2 meaning this function can return refs which don't
-	 * match the provided ref_prefixes.
-	 *
 	 * If the transport is able to determine the remote hash for
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
+	 *
+	 * See transport_get_remote_refs() for information on ref_prefixes and
+	 * unborn_head_target.
 	 **/
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
-				     const struct strvec *ref_prefixes);
+				     const struct strvec *ref_prefixes,
+				     char **unborn_head_target);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index 47da955e4f..815e175017 100644
--- a/transport.c
+++ b/transport.c
@@ -127,7 +127,8 @@ struct bundle_transport_data {
 
 static struct ref *get_refs_from_bundle(struct transport *transport,
 					int for_push,
-					const struct strvec *ref_prefixes)
+					const struct strvec *ref_prefixes,
+					char **unborn_head_target)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -163,7 +164,7 @@ static int fetch_refs_from_bundle(struct transport *transport,
 	int ret;
 
 	if (!data->get_refs_from_bundle_called)
-		get_refs_from_bundle(transport, 0, NULL);
+		get_refs_from_bundle(transport, 0, NULL, NULL);
 	ret = unbundle(the_repository, &data->header, data->fd,
 			   transport->progress ? BUNDLE_VERBOSE : 0);
 	transport->hash_algo = data->header.hash_algo;
@@ -281,7 +282,7 @@ static void die_if_server_options(struct transport *transport)
  */
 static struct ref *handshake(struct transport *transport, int for_push,
 			     const struct strvec *ref_prefixes,
-			     int must_list_refs)
+			     int must_list_refs, char **unborn_head_target)
 {
 	struct git_transport_data *data = transport->data;
 	struct ref *refs = NULL;
@@ -301,7 +302,8 @@ static struct ref *handshake(struct transport *transport, int for_push,
 			get_remote_refs(data->fd[1], &reader, &refs, for_push,
 					ref_prefixes,
 					transport->server_options,
-					transport->stateless_rpc);
+					transport->stateless_rpc,
+					unborn_head_target);
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -324,9 +326,11 @@ static struct ref *handshake(struct transport *transport, int for_push,
 }
 
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
-					const struct strvec *ref_prefixes)
+					const struct strvec *ref_prefixes,
+					char **unborn_head_target)
 {
-	return handshake(transport, for_push, ref_prefixes, 1);
+	return handshake(transport, for_push, ref_prefixes, 1,
+			 unborn_head_target);
 }
 
 static int fetch_refs_via_pack(struct transport *transport,
@@ -370,7 +374,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 				break;
 			}
 		}
-		refs_tmp = handshake(transport, 0, NULL, must_list_refs);
+		refs_tmp = handshake(transport, 0, NULL, must_list_refs, NULL);
 	}
 
 	if (data->version == protocol_unknown_version)
@@ -765,7 +769,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 		return -1;
 
 	if (!data->got_remote_heads)
-		get_refs_via_connect(transport, 1, NULL);
+		get_refs_via_connect(transport, 1, NULL, NULL);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
@@ -1251,7 +1255,8 @@ int transport_push(struct repository *r,
 
 		trace2_region_enter("transport_push", "get_refs_list", r);
 		remote_refs = transport->vtable->get_refs_list(transport, 1,
-							       &ref_prefixes);
+							       &ref_prefixes,
+							       NULL);
 		trace2_region_leave("transport_push", "get_refs_list", r);
 
 		strvec_clear(&ref_prefixes);
@@ -1370,12 +1375,14 @@ int transport_push(struct repository *r,
 }
 
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes)
+					    const struct strvec *ref_prefixes,
+					    char **unborn_head_target)
 {
 	if (!transport->got_remote_refs) {
 		transport->remote_refs =
 			transport->vtable->get_refs_list(transport, 0,
-							 ref_prefixes);
+							 ref_prefixes,
+							 unborn_head_target);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 24558c027d..65de0c9c00 100644
--- a/transport.h
+++ b/transport.h
@@ -241,9 +241,14 @@ int transport_push(struct repository *repo,
  * advertisement.  Since ref filtering is done on the server's end (and only
  * when using protocol v2), this can return refs which don't match the provided
  * ref_prefixes.
+ *
+ * If unborn_head_target is not NULL, and the remote reports HEAD as pointing
+ * to an unborn branch, this function stores the unborn branch in
+ * unborn_head_target. It should be freed by the caller.
  */
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes);
+					    const struct strvec *ref_prefixes,
+					    char **unborn_head_target);
 
 /*
  * Fetch the hash algorithm used by a remote.
-- 
2.29.2.576.ga3fc446d84-goog



* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-11 21:05 ` [PATCH] clone: in protocol v2, use remote's default branch Jonathan Tan
@ 2020-12-11 23:41   ` Junio C Hamano
  2020-12-14 12:38   ` Ævar Arnfjörð Bjarmason
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2020-12-11 23:41 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> When cloning an empty repository, a default branch is created. However,
> it is named after the locally configured init.defaultBranch, not the
> default branch of the remote repository.

The default branch of the remote repository and the current branch
pointed at by their HEAD in the remote repository can be different,
and we are interested in setting our HEAD to the latter.  So 

	..., not the current branch of the remote repository.

> To solve this, the remote needs to communicate the target of the HEAD
> symref, and "git clone" needs to use this information.

Yes.  That's a good change (I am on vacation today, so I won't be
reading the changes themselves today, but I agree with the intent
100%).

Thanks.


* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-11 21:05 ` [PATCH] clone: in protocol v2, use remote's default branch Jonathan Tan
  2020-12-11 23:41   ` Junio C Hamano
@ 2020-12-14 12:38   ` Ævar Arnfjörð Bjarmason
  2020-12-14 15:51     ` Felipe Contreras
                       ` (2 more replies)
  2020-12-15  1:27   ` Jeff King
                     ` (3 subsequent siblings)
  5 siblings, 3 replies; 109+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-12-14 12:38 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git


On Fri, Dec 11 2020, Jonathan Tan wrote:

> When cloning an empty repository, a default branch is created. However,
> it is named after the locally configured init.defaultBranch, not the
> default branch of the remote repository.
>
> To solve this, the remote needs to communicate the target of the HEAD
> symref, and "git clone" needs to use this information.
>
> Currently, symrefs that have unborn targets (such as in this case) are
> not communicated by the protocol. Teach Git to advertise and support the
> "unborn" feature in "ls-refs" (guarded by the lsrefs.unborn config).
> This feature indicates that "ls-refs" supports the "unborn" argument;
> when it is specified, "ls-refs" will send the HEAD symref with the name
> of its unborn target.
>
> On the client side, Git will always send the "unborn" argument if it is
> supported by the server. During "git clone", if cloning an empty
> repository, Git will use the new information to determine the local
> branch to create. In all other cases, Git will ignore it.
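
For illustration, the client-side handling described in the quoted text can be sketched as a standalone C function. This is a minimal sketch and not the patch's code: parse_unborn_head() is a hypothetical name, it assumes the pkt-line framing and trailing LF were already stripped, and unlike the patch it ignores an empty symref target.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a git function). Given one ls-refs response
 * line of the form "unborn HEAD [attr...] symref-target:<target> [attr...]",
 * return a malloc'd copy of <target>. Return NULL for any other line,
 * for a missing or empty target, or on allocation failure.
 */
static char *parse_unborn_head(const char *line)
{
	static const char prefix[] = "unborn HEAD";
	static const char attr[] = "symref-target:";
	const size_t attr_len = sizeof(attr) - 1;
	const char *p;

	assert(line != NULL);
	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return NULL;
	p = line + sizeof(prefix) - 1;

	/* Walk the space-separated ref attributes after the refname. */
	while (*p == ' ') {
		const char *end;
		size_t field_len;

		p++;
		end = strchr(p, ' ');
		if (!end)
			end = p + strlen(p);
		field_len = (size_t)(end - p);

		if (field_len > attr_len && !strncmp(p, attr, attr_len)) {
			size_t len = field_len - attr_len;
			char *out = malloc(len + 1);

			if (!out)
				return NULL;
			memcpy(out, p + attr_len, len);
			out[len] = '\0';
			return out;
		}
		p = end;
	}
	return NULL;
}
```

The caller owns the returned string and must free() it, which mirrors how the patch hands unborn_head_target back to cmd_clone().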

I'm not a fan of this change, not because of the whole s/master/whatever/
discussion, but because of the magic it adds for seemingly little gain &
without any documentation.

So if I have init.defaultBranch explicitly set that'll be ignored on
"clone", but on "init/git remote add/fetch" it won't?

I think so, and I swear I knew yesterday when I read this patch, but now
I can't remember. Anyway, the point is that I avoided re-reading the
patch to find out, because even if there's an on-list answer to that, it
should really be documented: I'll forget it next week, and our users
will never know :)

This patch also leaves Documentation/config/init.txt untouched, and now
under lsrefs.unborn it explicitly contradicts the behavior of git:

    Allows overriding the default branch name e.g. when initializing
    a new repository or when cloning an empty repository.

Shouldn't this at the very least be an
init.defaultBranchFromRemote=<bool> which if set overrides
init.defaultBranch? We could turn that to "true" by default and get the
same behavior as you have here, but with less inexplicable magic for the
user, no?
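
To make the suggestion concrete, here is a minimal sketch of the selection logic such a setting would imply. It is only an illustration under stated assumptions: pick_initial_branch() is a hypothetical function, from_remote stands in for the proposed init.defaultBranchFromRemote boolean, and none of this exists in git or in the patch.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical sketch: choose the short branch name for a clone of an
 * empty repository.
 *
 *   from_remote   - models the proposed init.defaultBranchFromRemote bool
 *   remote_target - unborn HEAD target reported by the server, or NULL
 *   local_default - the locally configured init.defaultBranch value
 *
 * The remote's name is used only when the setting is on and the target
 * is a non-empty name under refs/heads/; otherwise the local default wins.
 */
static const char *pick_initial_branch(int from_remote,
				       const char *remote_target,
				       const char *local_default)
{
	static const char heads[] = "refs/heads/";
	const size_t heads_len = sizeof(heads) - 1;

	assert(local_default != NULL);
	if (from_remote && remote_target &&
	    !strncmp(remote_target, heads, heads_len) &&
	    remote_target[heads_len] != '\0')
		return remote_target + heads_len;
	return local_default;
}
```

With from_remote defaulting to true this gives the same outcome as the patch for an empty repository, while from_remote set to false keeps today's behavior of always honoring init.defaultBranch.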

It seems if you're a user and wonder why a clone of a bare repo doesn't
give you "init" defaults the only way you'll find out is
GIT_TRACE_PACKET and the like.

Another reason I'm not a fan of it is because it's another piece of
magic "clone" does that you can't emulate in "init/fetch". We have
e.g. --single-branch as an existing case of that (although you can at
least do that with parse ls-remote -> init -> config -> fetch), and
that's a case that doesn't fit into a refspec.

But shouldn't there at least be a corresponding "fetch" option? On init
we'll create HEAD, but "git fetch --clobber-my-idea-of-HEAD-with-remote
..."?

Maybe not for reasons I haven't thought of, but I'd at least be much
happier with an updated commit message justifying another special-case
in clone that you can't do with "init/fetch".

And on the "little gain" side of things: I very much suspect that the
only users who'll ever use this will be some big hosting providers (but
maybe not, the commit doesn't suggest a use-case). Wouldn't those cases
be served even better by just a pre-receive hook on their side that
detects an initial push and refuses "master", and:

    git push -o yes-use-old-init-default <...>

Instead of a patch to git that does the same and would take $SOMEYEARS
to be rolled out, since it depends on client-side understanding.

Comment on the patch below (okay, I did read some of it :):

> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index e597b74da3..dfe03aa114 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -192,11 +192,19 @@ ls-refs takes in the following arguments:
>  	When specified, only references having a prefix matching one of
>  	the provided prefixes are displayed.
>  
> +If the 'unborn' feature is advertised the following argument can be
> +included in the client's request.
> +
> +    unborn
> +	The server may send symrefs pointing to unborn branches in the form
> +	"unborn <refname> symref-target:<target>".
> +
>  The output of ls-refs is as follows:
>  
>      output = *ref
>  	     flush-pkt
> -    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
> +    obj-id-or-unborn = (obj-id | "unborn")
> +    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
>      ref-attribute = (symref | peeled)
>      symref = "symref-target:" symref-target
>      peeled = "peeled:" obj-id
> diff --git a/builtin/clone.c b/builtin/clone.c
> index a0841923cf..217c87fddf 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -980,6 +980,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	int submodule_progress;
>  
>  	struct strvec ref_prefixes = STRVEC_INIT;
> +	char *unborn_head_target = NULL;
>  
>  	packet_trace_identity("clone");
>  
> @@ -1264,7 +1265,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	if (!option_no_tags)
>  		strvec_push(&ref_prefixes, "refs/tags/");
>  
> -	refs = transport_get_remote_refs(transport, &ref_prefixes);
> +	refs = transport_get_remote_refs(transport, &ref_prefixes,
> +					 &unborn_head_target);
>  
>  	if (refs) {
>  		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
> @@ -1323,10 +1325,20 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  		remote_head = NULL;
>  		option_no_checkout = 1;
>  		if (!option_bare) {
> -			const char *branch = git_default_branch_name();
> -			char *ref = xstrfmt("refs/heads/%s", branch);
> +			const char *branch;
> +			char *ref;
> +
> +			if (unborn_head_target &&
> +			    skip_prefix(unborn_head_target, "refs/heads/", &branch)) {
> +				ref = unborn_head_target;
> +				unborn_head_target = NULL;
> +			} else {
> +				branch = git_default_branch_name();
> +				ref = xstrfmt("refs/heads/%s", branch);
> +			}
>  
>  			install_branch_config(0, branch, remote_name, ref);
> +			create_symref("HEAD", ref, "");
>  			free(ref);
>  		}
>  	}
> @@ -1373,6 +1385,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	strbuf_release(&key);
>  	junk_mode = JUNK_LEAVE_ALL;
>  
> +	free(unborn_head_target);
>  	strvec_clear(&ref_prefixes);
>  	return err;
>  }
> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index 58b7c1fbdc..9f921dfab4 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>  	version = discover_version(&reader);
>  	switch (version) {
>  	case protocol_v2:
> -		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
> +		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
> +				args.stateless_rpc, NULL);
>  		break;
>  	case protocol_v1:
>  	case protocol_v0:
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index ecf8537605..a7ef59acfc 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1393,7 +1393,7 @@ static int do_fetch(struct transport *transport,
>  
>  	if (must_list_refs) {
>  		trace2_region_enter("fetch", "remote_refs", the_repository);
> -		remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
> +		remote_refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
>  		trace2_region_leave("fetch", "remote_refs", the_repository);
>  	} else
>  		remote_refs = NULL;
> diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
> index 092917eca2..4cf3f60b1b 100644
> --- a/builtin/ls-remote.c
> +++ b/builtin/ls-remote.c
> @@ -118,7 +118,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
>  	if (server_options.nr)
>  		transport->server_options = &server_options;
>  
> -	ref = transport_get_remote_refs(transport, &ref_prefixes);
> +	ref = transport_get_remote_refs(transport, &ref_prefixes, NULL);
>  	if (ref) {
>  		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
>  		repo_set_hash_algo(the_repository, hash_algo);
> diff --git a/builtin/remote.c b/builtin/remote.c
> index c1b211b272..246e62f118 100644
> --- a/builtin/remote.c
> +++ b/builtin/remote.c
> @@ -950,7 +950,7 @@ static int get_remote_ref_states(const char *name,
>  	if (query) {
>  		transport = transport_get(states->remote, states->remote->url_nr > 0 ?
>  			states->remote->url[0] : NULL);
> -		remote_refs = transport_get_remote_refs(transport, NULL);
> +		remote_refs = transport_get_remote_refs(transport, NULL, NULL);
>  		transport_disconnect(transport);
>  
>  		states->queried = 1;
> diff --git a/connect.c b/connect.c
> index 8b8f56cf6d..3c35324b4c 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
>  }
>  
>  /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
> -static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
> +static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
> +			  char **unborn_head_target)
>  {
>  	int ret = 1;
>  	int i = 0;
> @@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
>  		goto out;
>  	}
>  
> +	if (!strcmp("unborn", line_sections.items[i].string)) {
> +		i++;
> +		if (unborn_head_target &&
> +		    !strcmp("HEAD", line_sections.items[i++].string)) {
> +			/*
> +			 * Look for the symref target (if any). If found,
> +			 * return it to the caller.
> +			 */
> +			for (; i < line_sections.nr; i++) {
> +				const char *arg = line_sections.items[i].string;
> +
> +				if (skip_prefix(arg, "symref-target:", &arg)) {
> +					*unborn_head_target = xstrdup(arg);
> +					break;
> +				}
> +			}
> +		}
> +		goto out;
> +	}
>  	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
>  	    *end) {
>  		ret = 0;
> @@ -455,7 +475,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  			     struct ref **list, int for_push,
>  			     const struct strvec *ref_prefixes,
>  			     const struct string_list *server_options,
> -			     int stateless_rpc)
> +			     int stateless_rpc,
> +			     char **unborn_head_target)
>  {
>  	int i;
>  	const char *hash_name;
> @@ -488,6 +509,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  	if (!for_push)
>  		packet_write_fmt(fd_out, "peel\n");
>  	packet_write_fmt(fd_out, "symrefs\n");
> +	if (server_supports_feature("ls-refs", "unborn", 0))
> +		packet_write_fmt(fd_out, "unborn\n");
>  	for (i = 0; ref_prefixes && i < ref_prefixes->nr; i++) {
>  		packet_write_fmt(fd_out, "ref-prefix %s\n",
>  				 ref_prefixes->v[i]);
> @@ -496,7 +519,7 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  
>  	/* Process response from server */
>  	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
> -		if (!process_ref_v2(reader, &list))
> +		if (!process_ref_v2(reader, &list, unborn_head_target))
>  			die(_("invalid ls-refs response: %s"), reader->line);
>  	}
>  
> diff --git a/ls-refs.c b/ls-refs.c
> index a1e0b473e4..fdb644b482 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -32,6 +32,8 @@ struct ls_refs_data {
>  	unsigned peel;
>  	unsigned symrefs;
>  	struct strvec prefixes;
> +	unsigned allow_unborn : 1;
> +	unsigned unborn : 1;
>  };
>  
>  static int send_ref(const char *refname, const struct object_id *oid,
> @@ -47,7 +49,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  	if (!ref_match(&data->prefixes, refname_nons))
>  		return 0;
>  
> -	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
> +	if (oid)
> +		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
> +	else
> +		strbuf_addf(&refline, "unborn %s", refname_nons);
>  	if (data->symrefs && flag & REF_ISSYMREF) {
>  		struct object_id unused;
>  		const char *symref_target = resolve_ref_unsafe(refname, 0,
> @@ -74,8 +79,28 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  	return 0;
>  }
>  
> -static int ls_refs_config(const char *var, const char *value, void *data)
> +static void send_possibly_unborn_head(struct ls_refs_data *data)
>  {
> +	struct strbuf namespaced = STRBUF_INIT;
> +	struct object_id oid;
> +	int flag;
> +	int null_oid;
> +
> +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
> +	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
> +	null_oid = is_null_oid(&oid);
> +	if (!null_oid || (data->symrefs && (flag & REF_ISSYMREF)))
> +		send_ref(namespaced.buf, null_oid ? NULL : &oid, flag, data);
> +	strbuf_release(&namespaced);
> +}
> +
> +static int ls_refs_config(const char *var, const char *value, void *cb_data)
> +{
> +	struct ls_refs_data *data = cb_data;
> +
> +	if (!strcmp("lsrefs.unborn", var))
> +		data->allow_unborn = !strcmp(value, "allow") ||
> +			!strcmp(value, "advertise");
>  	/*
>  	 * We only serve fetches over v2 for now, so respect only "uploadpack"
>  	 * config. This may need to eventually be expanded to "receive", but we
> @@ -91,7 +116,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
>  
>  	memset(&data, 0, sizeof(data));
>  
> -	git_config(ls_refs_config, NULL);
> +	git_config(ls_refs_config, &data);
>  
>  	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
>  		const char *arg = request->line;
> @@ -103,14 +128,35 @@ int ls_refs(struct repository *r, struct strvec *keys,
>  			data.symrefs = 1;
>  		else if (skip_prefix(arg, "ref-prefix ", &out))
>  			strvec_push(&data.prefixes, out);
> +		else if (data.allow_unborn && !strcmp("unborn", arg))
> +			data.unborn = 1;
>  	}
>  
>  	if (request->status != PACKET_READ_FLUSH)
>  		die(_("expected flush after ls-refs arguments"));
>  
> -	head_ref_namespaced(send_ref, &data);
> +	if (data.unborn)
> +		send_possibly_unborn_head(&data);
> +	else
> +		head_ref_namespaced(send_ref, &data);
>  	for_each_namespaced_ref(send_ref, &data);
>  	packet_flush(1);
>  	strvec_clear(&data.prefixes);
>  	return 0;
>  }
> +
> +int ls_refs_advertise(struct repository *r, struct strbuf *value)
> +{
> +	if (value) {
> +		char *str = NULL;
> +
> +		if (!repo_config_get_string(the_repository, "lsrefs.unborn",
> +					    &str) &&
> +		    !strcmp("advertise", str)) {
> +			strbuf_addstr(value, "unborn");
> +			free(str);
> +		}
> +	}
> +
> +	return 1;
> +}
> diff --git a/ls-refs.h b/ls-refs.h
> index 7b33a7c6b8..a99e4be0bd 100644
> --- a/ls-refs.h
> +++ b/ls-refs.h
> @@ -6,5 +6,6 @@ struct strvec;
>  struct packet_reader;
>  int ls_refs(struct repository *r, struct strvec *keys,
>  	    struct packet_reader *request);
> +int ls_refs_advertise(struct repository *r, struct strbuf *value);
>  
>  #endif /* LS_REFS_H */
> diff --git a/remote.h b/remote.h
> index 3211abdf05..967f2178d8 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -198,7 +198,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  			     struct ref **list, int for_push,
>  			     const struct strvec *ref_prefixes,
>  			     const struct string_list *server_options,
> -			     int stateless_rpc);
> +			     int stateless_rpc,
> +			     char **unborn_head_target);
>  
>  int resolve_remote_symref(struct ref *ref, struct ref *list);
>  
> diff --git a/serve.c b/serve.c
> index f6341206c4..30cb56d507 100644
> --- a/serve.c
> +++ b/serve.c
> @@ -62,7 +62,7 @@ struct protocol_capability {
>  
>  static struct protocol_capability capabilities[] = {
>  	{ "agent", agent_advertise, NULL },
> -	{ "ls-refs", always_advertise, ls_refs },
> +	{ "ls-refs", ls_refs_advertise, ls_refs },
>  	{ "fetch", upload_pack_advertise, upload_pack_v2 },
>  	{ "server-option", always_advertise, NULL },
>  	{ "object-format", object_format_advertise, NULL },

All of this looks good to me, and re unrelated recent questions about
packfile-uri I had it's really nice to have a narrow example of adding a
simple ls-refs time verb / functionality like this to the protocol.

> diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
> index 7f082fb23b..d3bd79987b 100755
> --- a/t/t5606-clone-options.sh
> +++ b/t/t5606-clone-options.sh
> @@ -102,11 +102,12 @@ test_expect_success 'redirected clone -v does show progress' '
>  '
>  
>  test_expect_success 'chooses correct default initial branch name' '
> -	git init --bare empty &&
> +	git -c init.defaultBranch=foo init --bare empty &&
> +	test_config -C empty lsrefs.unborn advertise &&

Isn't this reducing test coverage? You're changing an existing
argument-less "init --bare" test's behavior.

>  	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
>  	git -c init.defaultBranch=up clone empty whats-up &&
> -	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
> -	test refs/heads/up = $(git -C whats-up config branch.up.merge)
> +	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
> +	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
>  '

Also re the above point about discoverability: Right below this we test
"init --initial-branch=guess". Wouldn't a way to unify the
fetch/init/clone functionality be to use that as a jump-off point,
i.e. clone having --use-remote-initial-branch, init optionally leaving
behind a (broken) empty/nonexisting HEAD, and "fetch" with an argument
also supporting --use-remote-initial-branch or something?

> +test_expect_success 'clone of empty repo propagates name of default branch' '
> +	git -c init.defaultbranch=mydefaultbranch init file_empty_parent &&
> +	test_config -C file_empty_parent lsrefs.unborn advertise &&
> +
> +	git -c init.defaultbranch=main -c protocol.version=2 \
> +		clone "file://$(pwd)/file_empty_parent" file_empty_child &&

Nit. Let's spell config.likeThis not config.likethis when not in the C
code.


* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-14 12:38   ` Ævar Arnfjörð Bjarmason
@ 2020-12-14 15:51     ` Felipe Contreras
  2020-12-14 16:30     ` Junio C Hamano
  2020-12-14 19:25     ` Jonathan Tan
  2 siblings, 0 replies; 109+ messages in thread
From: Felipe Contreras @ 2020-12-14 15:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Jonathan Tan; +Cc: git

Ævar Arnfjörð Bjarmason wrote:
> On Fri, Dec 11 2020, Jonathan Tan wrote:
> 
> > When cloning an empty repository, a default branch is created. However,
> > it is named after the locally configured init.defaultBranch, not the
> > default branch of the remote repository.
> >
> > To solve this, the remote needs to communicate the target of the HEAD
> > symref, and "git clone" needs to use this information.
> >
> > Currently, symrefs that have unborn targets (such as in this case) are
> > not communicated by the protocol. Teach Git to advertise and support the
> > "unborn" feature in "ls-refs" (guarded by the lsrefs.unborn config).
> > This feature indicates that "ls-refs" supports the "unborn" argument;
> > when it is specified, "ls-refs" will send the HEAD symref with the name
> > of its unborn target.
> >
> > On the client side, Git will always send the "unborn" argument if it is
> > supported by the server. During "git clone", if cloning an empty
> > repository, Git will use the new information to determine the local
> > branch to create. In all other cases, Git will ignore it.
> 
> I'm not a fan of this change not because of the whole s/master/whatever/
> discussion, but because of the magic it adds for seemingly little gain &
> without any documentation.

I am against the master rename, and yet I am in favor of this patch.

I have been running git with "init.defaultbranch=foobar" to prepare
myself for a future in which the Git project chooses an objectively
inferior default branch name.

When I clone an empty repository, I expect the branch name to be chosen
by the person who created that repository (not 'foobar'). If GitHub
chooses to name the default branch "main", they can tell their users to
always clone the empty repository, and the users don't need to be
instructed to do anything else (like "git init -b main").

This way the Git project could follow a simple maxim:

  He who creates the repository chooses the master branch name

In addition, this relieves the Git project of the burden of choosing a
particular default branch name.

> So if I have init.defaultBranch explicitly set that'll be ignored on
> "clone", but on "init/git remote add/fetch" it won't?

It is already ignored on clone... except when the repository is empty.

> I think so, and I swear I knew yesterday when I read this patch, but now
> I can't remember. Anyway, the point is that I avoided re-reading the patch
> to find out, because even if there's an on-list answer to that it should
> really be documented because I'll forget it next week, and our users
> will never know :)

I think the patch does bring the expected behavior. The current behavior
is the one that is unexpected, and has been unnoticed simply because
most repositories use the name "master".

Some people would call that unexpected behavior a bug.

By removing the bug we don't have to document it.

> This patch also leaves Documentation/config/init.txt untouched, and now
> under lsrefs.unborn it explicitly contradicts the behavior of git:
> 
>     Allows overriding the default branch name e.g. when initializing
>     a new repository or when cloning an empty repository.

That should be updated.

> Shouldn't this at the very least be a
> init.defaultBranchFromRemote=<bool> which if set overrides
> init.defaultBranch? We could turn that to "true" by default and get the
> same behavior as you have here, but with less inexplicable magic for the
> user, no?

I don't think init.defaultbranch has lived long enough for people to
rely on the "buggy" behavior.

> It seems if you're a user and wonder why a clone of a bare repo doesn't
> give you "init" defaults the only way you'll find out is
> GIT_TRACE_PACKET and the like.

Yeah, but who created that repository?

If you configure Git to use "master", and cloning a new repository from
GitHub fetches "main", you know who to blame.

Let them take the backlash.

I suspect they will eventually be forced to provide an option.

> Another reason I'm not a fan of it is because it's another piece of
> magic "clone" does that you can't emulate in "init/fetch". We have
> e.g. --single-branch as an existing case of that (although you can at
> least do that with parse ls-remote -> init -> config -> fetch), and
> that's a case that doesn't fit into a refspec.
> 
> But shouldn't there at least be a corresponding "fetch" option? On init
> we'll create head, but "git fetch --clobber-my-idea-of-HEAD-with-remote
> ..."?

That would be better, yes.

But let's not let the perfect be the enemy of the good.

> And on the "little gain" side of things: I very much suspect that the
> only users who'll ever use this will be some big hosting providers (but
> maybe not, the commit doesn't suggest a use-case).

Yes, and that's enough reason.

I say let's offload the branch name decision to them, and let *them*
deal with the fallout from their users.

> Wouldn't this be even more useful in those cases by just a pre-receive
> hook on their side detecting an initial push refusing "master", and:

But that's not what they want to do (I suspect).

They want the default branch name to be "main", but still the user could
rename the branch, do the initial commit, and push without problems.

Yes, it would take years to roll out this change, but it also takes years
for them to update their initial repository instructions too (they
haven't included "git init -b main" yet).

Cheers.

-- 
Felipe Contreras


* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-14 12:38   ` Ævar Arnfjörð Bjarmason
  2020-12-14 15:51     ` Felipe Contreras
@ 2020-12-14 16:30     ` Junio C Hamano
  2020-12-15  1:41       ` Ævar Arnfjörð Bjarmason
  2020-12-14 19:25     ` Jonathan Tan
  2 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2020-12-14 16:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Jonathan Tan, git

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> On the client side, Git will always send the "unborn" argument if it is
>> supported by the server. During "git clone", if cloning an empty
>> repository, Git will use the new information to determine the local
>> branch to create. In all other cases, Git will ignore it.
>
> I'm not a fan of this change not because of the whole s/master/whatever/
> discussion, but because of the magic it adds for seemingly little gain &
> without any documentation.
>
> So if I have init.defaultBranch explicitly set that'll be ignored on
> "clone", but on "init/git remote add/fetch" it won't?

That description is backwards.

To help interoperate with the repository you cloned from better, we
made it easy to use whatever your 'origin' uses. "git clone" does so
by (1) in the original implementation, by inferring where HEAD
points at over there by comparing the objects reported for HEAD and
tips of branches (2) later, by adding symref capability to the
protocol so that the sending repository can tell exactly which branch
its HEAD points at.  What was lacking was that symref capability is
not sent if there is nothing in the repository.  And I think this is
an attempt to bring that "cloning nothing" case in line with a clone
of a repository with contents.
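
Both mechanisms can be observed from the command line. The following is
a minimal sketch, assuming a throwaway repository under a temporary
directory; the branch name "trunk", the directory name "src" and the
placeholder identity are made up for illustration. "git ls-remote
--symref" prints the symref the remote reports, and the awk step
imitates the older guess by matching HEAD's object name against the
branch tips.

```shell
# Throwaway upstream with one commit on a branch called "trunk"
# (names and identity are illustrative assumptions).
tmp=$(mktemp -d)
git init -q "$tmp/src"
git -C "$tmp/src" symbolic-ref HEAD refs/heads/trunk
git -C "$tmp/src" -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m initial
url="file://$tmp/src"

# (2) symref reporting: ask the remote which branch its HEAD points at.
symref_line=$(git ls-remote --symref "$url" HEAD | grep '^ref:')
echo "$symref_line"

# (1) the older guess: find a branch whose tip matches HEAD's object.
head_oid=$(git ls-remote "$url" HEAD | cut -f1)
guessed=$(git ls-remote --heads "$url" |
    awk -v oid="$head_oid" '$1 == oid { print $2; exit }')
echo "$guessed"
```

The guess in (1) is ambiguous when several branches share a tip, and
neither approach has anything to report when the repository has no
commits, which is the gap under discussion here.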

> Shouldn't this at the very least be a
> init.defaultBranchFromRemote=<bool> which if set overrides
> init.defaultBranch? We could turn that to "true" by default and get the
> same behavior as you have here, but with less inexplicable magic for the
> user, no?

I view the change in the patch being discussed a bugfix (clone ought
to follow whatever the other side uses by default, unless you say -b,
and the case when cloning an empty repository was buggy).  I am OK
if we wanted to consider a _new_ feature to always use the name you
want locally (i.e. as if you added "-b $(git config init.defaultBranch)"
on your "git clone" command line), but that is a new feature that needs
to be discussed in a separate topic.

> Another reason I'm not a fan of it is because it's another piece of
> magic "clone" does that you can't emulate in "init/fetch".

That ship has sailed longlonglong time ago when dfeff66e (revamp
git-clone., 2006-03-20) started pointing our HEAD to match theirs.

> But shouldn't there at least be a corresponding "fetch" option? On init
> we'll create head, but "git fetch --clobber-my-idea-of-HEAD-with-remote
> ..."?

It may be nice to have a corresponding one, but again, that is a
separate topic on a new feature, and not relevant in the context of
this fix.

> Maybe not for reasons I haven't thought of, but I'd at least be much
> happier with an updated commit message justifying another special-case
> in clone that you can't do with "init/fetch".

This is *not* another special-case, but a 14-year-old outstanding
one, so I do not think there specifically needs such justification.
The log message DOES need to be clarified.  Your mistaking that this
is a new feature and not a bugfix may be a good indication that the
proposed log message is not doing its job.

> And on the "little gain" side of things: I very much suspect that the
> only users who'll ever use this will be some big hosting providers (but
> maybe not, the commit doesn't suggest a use-case).

Explorers who learn this new GitHub or GitLab thingy, create an
empty repository there and then clone it to their local disk, just
to dip their toes in the water, would most benefit.  Those of us who
are working on an already existing and populated projects won't be
helped or bothered.  We do sometimes create our own repositories and
publish to hosting sites, and I expect that many experienced Git
users follow the "local first and then push", and they won't be
helped or bothered.

But I expect some do "create a void at the hosting site and clone to
get a local playpen" for their real projects.  They would be helped,
and because Git userbase is populous enough that their number in
absolute terms would not be insignificant, even if they weren't in
percentage terms.


* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-14 12:38   ` Ævar Arnfjörð Bjarmason
  2020-12-14 15:51     ` Felipe Contreras
  2020-12-14 16:30     ` Junio C Hamano
@ 2020-12-14 19:25     ` Jonathan Tan
  2020-12-14 19:42       ` Felipe Contreras
  2 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2020-12-14 19:25 UTC (permalink / raw)
  To: avarab; +Cc: jonathantanmy, git

> I'm not a fan of this change not because of the whole s/master/whatever/
> discussion, but because of the magic it adds for seemingly little gain &
> without any documentation.
> 
> So if I have init.defaultBranch explicitly set that'll be ignored on
> "clone", but on "init/git remote add/fetch" it won't?
> 
> I think so, and I swear I knew yesterday when I read this patch, but now
> I can't remember. Anyway, the point is that I avoided re-reading the patch
> to find out, because even if there's an on-list answer to that it should
> really be documented because I'll forget it next week, and our users
> will never know :)

That's the plan - yes. It makes sense to me that "git clone" will not
use "init.defaultBranch" (especially since it has "init" in the name),
but "git init" will. (It also makes sense to me that "git remote add"
and "git fetch" will not change HEAD.)

> This patch also leaves Documentation/config/init.txt untouched, and now
> under lsrefs.unborn it explicitly contradicts the behavior of git:
> 
>     Allows overriding the default branch name e.g. when initializing
>     a new repository or when cloning an empty repository.

Ah...thanks for the pointer. I'll change it.

> Shouldn't this at the very least be a
> init.defaultBranchFromRemote=<bool> which if set overrides
> init.defaultBranch? We could turn that to "true" by default and get the
> same behavior as you have here, but with less inexplicable magic for the
> user, no?

I think you're starting from the idea that it is perfectly natural for
"git clone" to respect "init.defaultBranch", but that doesn't even
happen in the typical case wherein we clone a non-empty repository - so
I don't agree with that idea.

> It seems if you're a user and wonder why a clone of a bare repo doesn't
> give you "init" defaults the only way you'll find out is
> GIT_TRACE_PACKET and the like.

I assume you mean empty repo instead of bare repo? For me, I would find
it more surprising that the resulting local repo didn't have the same
HEAD as the remote.

> Another reason I'm not a fan of it is because it's another piece of
> magic "clone" does that you can't emulate in "init/fetch". We have
> e.g. --single-branch as an existing case of that (although you can at
> least do that with parse ls-remote -> init -> config -> fetch), and
> that's a case that doesn't fit into a refspec.

Same answer as above.

> But shouldn't there at least be a corresponding "fetch" option? On init
> we'll create head, but "git fetch --clobber-my-idea-of-HEAD-with-remote
> ..."?

I think that it's OK for "clone" to create HEAD, but not OK for "fetch"
to modify HEAD.

> Maybe not for reasons I haven't thought of, but I'd at least be much
> happier with an updated commit message justifying another special-case
> in clone that you can't do with "init/fetch".

Same answer as above - I don't think this is a special case.

> And on the "little gain" side of things: I very much suspect that the
> only users who'll ever use this will be some big hosting providers (but
> maybe not, the commit doesn't suggest a use-case). Wouldn't this be even
> more useful in those cases by just a pre-receive hook on their side
> detecting an initial push refusing "master", and:
> 
>     git push -o yes-use-old-init-default <...>
> 
> Instead of a patch to git to do the same & which would take $SOMEYEARS
> to be rolled out, since it depends on client-side understanding.

This would detect the problem only upon push.

> > @@ -62,7 +62,7 @@ struct protocol_capability {
> >  
> >  static struct protocol_capability capabilities[] = {
> >  	{ "agent", agent_advertise, NULL },
> > -	{ "ls-refs", always_advertise, ls_refs },
> > +	{ "ls-refs", ls_refs_advertise, ls_refs },
> >  	{ "fetch", upload_pack_advertise, upload_pack_v2 },
> >  	{ "server-option", always_advertise, NULL },
> >  	{ "object-format", object_format_advertise, NULL },
> 
> All of this looks good to me, and re unrelated recent questions about
> packfile-uri I had it's really nice to have a narrow example of adding a
> simple ls-refs time verb / functionality like this to the protocol.

Thanks.

> > diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
> > index 7f082fb23b..d3bd79987b 100755
> > --- a/t/t5606-clone-options.sh
> > +++ b/t/t5606-clone-options.sh
> > @@ -102,11 +102,12 @@ test_expect_success 'redirected clone -v does show progress' '
> >  '
> >  
> >  test_expect_success 'chooses correct default initial branch name' '
> > -	git init --bare empty &&
> > +	git -c init.defaultBranch=foo init --bare empty &&
> > +	test_config -C empty lsrefs.unborn advertise &&
> 
> Isn't this reducing test coverage? You're changing an existing
> argument-less "init --bare" test's behavior,

The test here is regarding "clone", not the behavior of "init". I'm
doing some textual comparison below, so I want to insulate this test
against future default branch name changes.

> 
> >  	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
> >  	git -c init.defaultBranch=up clone empty whats-up &&
> > -	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
> > -	test refs/heads/up = $(git -C whats-up config branch.up.merge)
> > +	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
> > +	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
> >  '
> 
> Also re the above point about discoverability: Right below this we test
> "init --initial-branch=guess". Wouldn't a way to unify the
> fetch/init/clone functionality be to use that as a jump-off point,
> i.e. clone having --use-remote-initial-branch, 

OK - this is already happening for non-empty repositories, and my patch
makes it also happen for empty repositories.

> init optionally leaving
> behind a (broken) empty/nonexisting HEAD, 

I'm not sure how this is superior to just using what the remote has
(upon "clone") and using init.defaultBranch when no remote is involved
(upon "init").

> and "fetch" with an argument
> also supporting --use-remote-initial-branch or something.

Again, I don't think that "fetch" should update HEAD.

> 
> > +test_expect_success 'clone of empty repo propagates name of default branch' '
> > +	git -c init.defaultbranch=mydefaultbranch init file_empty_parent &&
> > +	test_config -C file_empty_parent lsrefs.unborn advertise &&
> > +
> > +	git -c init.defaultbranch=main -c protocol.version=2 \
> > +		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
> 
> Nit. Let's spell config.likeThis not config.likethis when not in the C
> code.

OK - will do.


* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-14 19:25     ` Jonathan Tan
@ 2020-12-14 19:42       ` Felipe Contreras
  0 siblings, 0 replies; 109+ messages in thread
From: Felipe Contreras @ 2020-12-14 19:42 UTC (permalink / raw)
  To: Jonathan Tan, avarab; +Cc: jonathantanmy, git

Jonathan Tan wrote:
> > But shouldn't there at least be a corresponding "fetch" option? On init
> > we'll create head, but "git fetch --clobber-my-idea-of-HEAD-with-remote
> > ..."?
> 
> I think that it's OK for "clone" to create HEAD, but not OK for "fetch"
> to modify HEAD.

Not the local HEAD, the remote HEAD.

See my proposal to update the remote head in different scenarios:

https://lore.kernel.org/git/20201118091219.3341585-1-felipe.contreras@gmail.com/

-- 
Felipe Contreras


* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-11 21:05 ` [PATCH] clone: in protocol v2, use remote's default branch Jonathan Tan
  2020-12-11 23:41   ` Junio C Hamano
  2020-12-14 12:38   ` Ævar Arnfjörð Bjarmason
@ 2020-12-15  1:27   ` Jeff King
  2020-12-15 19:10     ` Jonathan Tan
  2020-12-16  2:07   ` [PATCH v2 0/3] Cloning with remote unborn HEAD Jonathan Tan
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 109+ messages in thread
From: Jeff King @ 2020-12-15  1:27 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Fri, Dec 11, 2020 at 01:05:08PM -0800, Jonathan Tan wrote:

> Subject: Re: [PATCH] clone: in protocol v2, use remote's default branch
> 
> When cloning an empty repository, a default branch is created. However,
> it is named after the locally configured init.defaultBranch, not the
> default branch of the remote repository.

Your subject line puzzled me at first, because I thought we already did
that. And indeed we do, but this is about adding the unborn case. I
think this contributed to Ævar's confusion.

Maybe:

  Subject: clone: respect unborn remote HEAD

  When cloning, we choose the default branch based on the remote HEAD.
  But if there is no remote HEAD, we'll fall back to using our local
  init.defaultBranch. Traditionally this hasn't been a big deal, because
  everybody used "master" as the default. But these days it is likely to
  cause confusion if the server and client implementations choose
  different values (e.g., if the remote started with "main", we may
  choose "master" locally, create commits there, and then the user is
  surprised when they push to "master" and not "main").

  To solve this...

makes the current state more clear, as well as motivating why we care.

It might also be worth breaking the patch up a bit. E.g., implement the
capability in upload-pack, then infrastructure for the client to use the
capability and surface the info to transport callers, and then finally
surface it in the program logic of ls-refs, then clone, etc.

Not strictly necessary, but it would make it easier to see what is being
changed at each step.

> Currently, symrefs that have unborn targets (such as in this case) are
> not communicated by the protocol. Teach Git to advertise and support the
> "unborn" feature in "ls-refs" (guarded by the lsrefs.unborn config).
> This feature indicates that "ls-refs" supports the "unborn" argument;
> when it is specified, "ls-refs" will send the HEAD symref with the name
> of its unborn target.

It's probably also worth mentioning that v0 won't get any support here,
and why.

-Peff


* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-14 16:30     ` Junio C Hamano
@ 2020-12-15  1:41       ` Ævar Arnfjörð Bjarmason
  2020-12-15  2:22         ` Junio C Hamano
                           ` (2 more replies)
  0 siblings, 3 replies; 109+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-12-15  1:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git, Felipe Contreras


On Mon, Dec 14 2020, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>> Maybe not for reasons I haven't thought of, but I'd at least be much
>> happier with an updated commit message justifying another special-case
>> in clone that you can't do with "init/fetch".
>
> This is *not* another special-case, but a 14-year-old outstanding
> one, so I do not think there specifically needs such justification.
> The log message DOES need to be clarified.  Your mistaking that this
> is a new feature and not a bugfix may be a good indication that the
> proposed log message is not doing its job.

For context: This clone feature has been there since early 2009, it
wasn't until late 2017/early 2018 that we had protocol v2 that gave us
the ability to fix the bug.

I suppose the distinction between what's new behavior and what's a
bugfix in something that was really meant to work a certain way all
along but didn't is too subtle for me to discern sometimes :)

86ac7518590 (Allow cloning an empty repository, 2009-01-23) which added
it seems to match my mental model of it being just a shortcut for some
of the URL config "init" otherwise wouldn't set up for you. At a time
when git-init.txt said:

    An initial `HEAD` file that references the HEAD of the master branch
    is also created.

> > Another reason I'm not a fan of it is because it's another piece of
> > magic "clone" does that you can't emulate in "init/fetch".
> 
> That ship has sailed longlonglong time ago when dfeff66e (revamp
> git-clone., 2006-03-20) started pointing our HEAD to match theirs.

Let me rephrase: I think it's unfortunate when we add new things
to porcelain commands that aren't easy or possible to emulate in
plumbing.

I.e. this feature seems like a candidate to be exposed by something like
an ls-remote flag if you'd like to do an init/config/fetch. AFAIK the
only way to do it is to mock a "clone" with GIT_TRACE_PACKET or get the
information out-of-band.
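
As a rough sketch of the GIT_TRACE_PACKET route, assuming an empty bare
repository in a temporary directory (the name "empty.git" and the branch
"foobar" are illustrative): the trace records the ref advertisement
exchange, and whether an "unborn" line shows up in it depends on the Git
version and on the server's lsrefs.unborn setting, so no output is shown
here.

```shell
tmp=$(mktemp -d)
git init -q --bare "$tmp/empty.git"
git -C "$tmp/empty.git" symbolic-ref HEAD refs/heads/foobar

# An absolute path makes Git append the packet trace to that file.
GIT_TRACE_PACKET="$tmp/trace" \
    git -c protocol.version=2 ls-remote "file://$tmp/empty.git"

# Look for an unborn HEAD line; report plainly if none was sent.
grep unborn "$tmp/trace" || echo "no unborn line in trace"
```

Reading a packet trace like this is exactly the kind of workaround a
dedicated plumbing-level option would make unnecessary.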

>> And on the "little gain" side of things: I very much suspect that the
>> only users who'll ever use this will be some big hosting providers (but
>> maybe not, the commit doesn't suggest a use-case).
>
> Explorers who learn this new GitHub or GitLab thingy, create an
> empty repository there and then clone it to their local disk, just
> to dip their toes in the water, would most benefit.  Those of us who
> are working on an already existing and populated projects won't be
> helped or bothered.  We do sometimes create our own repositories and
> publish to hosting sites, and I expect that many experienced Git
> users follow the "local first and then push", and they won't be
> helped or bothered.
>
> But I expect some do "create a void at the hosting site and clone to
> get a local playpen" for their real projects.  They would be helped,
> and because Git userbase is populous enough that their number in
> absolute terms would not be insignificant, even if they weren't in
> percentage terms.

That's how I've always used it. Seems from the above-referenced
5cd12b85fe8 that's what it was meant for to begin with.

Anyway, there's 3 replies to my E-Mail including yours insisting this
makes perfect sense, I'm happy to go along with the consensus. I wrote
my reply with the assumption that it was obvious that this was a change
in established behavior, but apparently that's not the prevailing view.

To borrow from Felipe Contreras's reply in the side-thread "I expect the
branch name to be chosen by the person who created that repository".

I suppose this comes down to a mental model of what it means to have
"created a repository". When I click "create repo" on those popular
hosting sites (e.g. github & gitlab) and clone it I was expecting it to
just be a shorthand init + a URL in my config (and refspecs...).

That's also what happens with this patch if you "git init --bare
/tmp/my.git", then edit the HEAD symref to point to "foobar" and clone
it with file:///, it'll be "master" in your clone (or whatever
init.defaultBranch is). Isn't that discrepancy a bug then?
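
The scenario can be reproduced with a short script. This is only a
sketch: the temporary directory and the names "my.git" and "child" are
assumptions, and which branch name the clone ends up with depends on the
Git version, the transport used and the lsrefs.unborn configuration, so
no expected output is given.

```shell
tmp=$(mktemp -d)
git init -q --bare "$tmp/my.git"
# Equivalent to editing the HEAD file by hand.
git -C "$tmp/my.git" symbolic-ref HEAD refs/heads/foobar

# Cloning an empty repository warns on stderr but still succeeds.
git clone -q "file://$tmp/my.git" "$tmp/child"

remote_head=$(git -C "$tmp/my.git" symbolic-ref HEAD)
local_head=$(git -C "$tmp/child" symbolic-ref HEAD)
echo "remote HEAD: $remote_head"
echo "local HEAD:  $local_head"
```

Comparing the two printed lines shows whether the unborn HEAD of the
remote made it into the clone.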

But of course then when you push your "foobar" as the first branch the
HEAD symref won't be updated. In the olden times when everyone ran their
own git server this was a common FAQ, "just run 'git symbolic-ref'".

On both of those big hosting sites (didn't test others) whatever their
preferred default name is they'll go with your idea and update HEAD's
pointer on the first such push. So this notion that the default unborn
symref isn't transported & it's up to the client to set it on-push (or
manually afterwards) has been reinforced by in-the-wild use.


* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-15  1:41       ` Ævar Arnfjörð Bjarmason
@ 2020-12-15  2:22         ` Junio C Hamano
  2020-12-15  2:38         ` Jeff King
  2020-12-15  3:22         ` Felipe Contreras
  2 siblings, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2020-12-15  2:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jonathan Tan, git, Felipe Contreras

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I.e. this feature seems like a candidate to be exposed by something like
> an ls-remote flag if you'd like to do an init/config/fetch. AFAIK the
> only way to do it is to mock a "clone" with GIT_TRACE_PACKET or get the
> information out-of-band.

Yes, I think the updated protocol should be able to help adding a
new feature usable by script writers, and I agree that ls-remote may
be the ideal home for such a feature.

> To borrow from Felipe Contreras's reply in the side-thread "I expect the
> branch name to be chosen by the person who created that repository".

I expect a bit differently.  My expectation is that "git clone"
tries to help you inter-operate well with the project you clone.
You may be creating your local repository by cloning theirs, but
because I do not expect "who created" matters more than how well
end-users' workflows would work in the resulting repository, I
do not expect local init.defaultBranch should matter here.

If the project you would eventually push back to designates one
branch as its primary branch everybody is expected to work off of
(which is what it means to point it with their HEAD), it is
convenient if your local clone names your primary branch to match.
The push.default settings like 'simple' and 'current' are designed
to work well when your local branch namespace matches what they
have.

> I suppose this comes down to a mental model of what it means to have
> "created a repository". When I click "create repo" on those popular
> hosting sites (e.g. github & gitlab) and clone it I was expecting it to
> just be a shorthand init + a URL in my config (and refspecs...).

So, no, I do not think "who created a repository" has much to do
with the objective of the patch in question.  It's really "what's
the upstream's view of the primary branch".

> That's also what happens with this patch if you "git init --bare
> /tmp/my.git", then edit the HEAD symref to point to "foobar" and clone
> it with file:///, it'll be "master" in your clone (or whatever
> init.defaultBranch is). Isn't that discrepancy a bug then?

Yes, I view it as the same bug to be fixed; JTan's protocol update
patch only deals with the transport based on the git protocol and
does not (yet?) address the --local short-cut.  In principle, it
should be a lot easier than the protocol update.  Any takers?

> On both of those big hosting sites (didn't test others) whatever their
> preferred default name is they'll go with your idea and update HEAD's
> pointer on the first such push. So this notion that the default unborn
> symref isn't transported & it's up to the client to set it on-push (or
> manually afterwards) has been reinforced by in-the-wild use.

I think it would be great if somebody comes up with a protocol
update for that "other" direction to push into an unborn HEAD.  I
haven't thought things through, but you may be right to point out
that the "clone learns and prepares local to match the other side"
we are discussing may not be complete with such a corresponding fix
in the opposite direction.

Thanks.

* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-15  1:41       ` Ævar Arnfjörð Bjarmason
  2020-12-15  2:22         ` Junio C Hamano
@ 2020-12-15  2:38         ` Jeff King
  2020-12-15  2:55           ` Junio C Hamano
  2020-12-15  3:22         ` Felipe Contreras
  2 siblings, 1 reply; 109+ messages in thread
From: Jeff King @ 2020-12-15  2:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Jonathan Tan, git, Felipe Contreras

On Tue, Dec 15, 2020 at 02:41:38AM +0100, Ævar Arnfjörð Bjarmason wrote:

> > > Another reason I'm not a fan of it is because it's another piece of
> > > magic "clone" does that you can't emulate in "init/fetch".
> > 
> > That ship has sailed longlonglong time ago when dfeff66e (revamp
> > git-clone., 2006-03-20) started pointing our HEAD to match theirs.
> 
> Let me rephrase: I think it's unfortunate when we add new things
> to porcelain commands that aren't easy or possible to emulate in
> plumbing.
> 
> I.e. this feature seems like a candidate to be exposed by something like
> an ls-remote flag if you'd like to do an init/config/fetch. AFAIK the
> only way to do it is to mock a "clone" with GIT_TRACE_PACKET or get the
> information out-of-band.

I think the situation is better than that. We are surfacing the remote
HEAD here, and there is already a command for copying that to our local
tracking symref: "git remote set-head origin -a", which will set up
refs/remotes/origin/HEAD.

I think there are two ways we could improve that further:

  - making it more natural to pick up or update the remote HEAD via
    fetch; Felipe's patches to git-fetch look good to me

  - it might be nice to be able to have some equivalent to the dwim "git
    checkout foo" that creates a new "foo" based off of origin/foo.
    Doing "git checkout origin/HEAD" will detach the HEAD. I think right
    now you'd have to do something like:

      tracking=$(git symbolic-ref refs/remotes/origin/HEAD)
      branch=${tracking#refs/remotes/origin/}
      git checkout -b $branch $tracking

    Or maybe not. It's not something people probably need to do a lot.
    And if the point is to have plumbing commands that can do the same,
    then maybe those commands are sufficient.

-Peff
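
As a rough sketch of the three-line snippet above, the prefix-stripping
step can be isolated in a small POSIX shell helper. The function name
branch_from_tracking and its failure handling are assumptions made for
this illustration; only "git symbolic-ref" and "git checkout -b" come
from the message itself.

```shell
# Hypothetical helper (not part of git): map a remote-tracking ref such as
# refs/remotes/origin/main to the local branch name "main".
branch_from_tracking () {
	remote=$1
	tracking=$2
	prefix="refs/remotes/$remote/"
	case "$tracking" in
	"$prefix"?*)
		# Strip the remote-tracking prefix, like ${tracking#...} above.
		printf '%s\n' "${tracking#"$prefix"}"
		;;
	*)
		# Not a tracking ref of this remote; let the caller decide.
		return 1
		;;
	esac
}
```

Assuming refs/remotes/origin/HEAD has already been set (for example by
"git remote set-head origin -a"), it could be used like this:

      tracking=$(git symbolic-ref refs/remotes/origin/HEAD) &&
      branch=$(branch_from_tracking origin "$tracking") &&
      git checkout -b "$branch" "$tracking"

Quoting the variables and chaining with "&&" means a missing or
unexpected origin/HEAD stops the sequence instead of creating an oddly
named branch.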

* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-15  2:38         ` Jeff King
@ 2020-12-15  2:55           ` Junio C Hamano
  2020-12-15  4:36             ` Jeff King
  0 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2020-12-15  2:55 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, Jonathan Tan, git,
	Felipe Contreras

Jeff King <peff@peff.net> writes:

> I think the situation is better than that. We are surfacing the remote
> HEAD here, and there is already a command for copying that to our local
> tracking symref: "git remote set-head origin -a", which will set up
> refs/remotes/origin/HEAD.
>
> I think there are two ways we could improve that further:
>
>   - making it more natural to pick up or update the remote HEAD via
>     fetch; Felipe's patches to git-fetch look good to me

I do not mind that as an option (not the default) to the "git fetch"
command.  But I think Ævar was driving at the lack of a scriptable
building block.

>   - it might be nice to be able to have some equivalent to the dwim "git
>     checkout foo" that creates a new "foo" based off of origin/foo.
>     Doing "git checkout origin/HEAD" will detach the HEAD. I think right
>     now you'd have to do something like:
>
>       tracking=$(git symbolic-ref refs/remotes/origin/HEAD)
>       branch=${tracking#refs/remotes/origin/}
>       git checkout -b $branch $tracking

Meaning "git checkout origin" would look at origin/HEAD and find the
remote-tracking branch it points at, and use that name?  I think
that does make quite a lot of sense.  You are correct to point out
that not just "git checkout origin/HEAD", but "git checkout origin",
currently detaches the HEAD at that commit, if you have origin/HEAD
pointing at one of the remote-tracking branches.

But if we were to make such a change, "git fetch" shouldn't
automatically update remotes/origin/HEAD, I would think.  It does
not matter too much if we are talking about a publishing repository
where the HEAD rarely changes (and when it does, it is a significant
event that everybody in the downstream should take notice), but if
you clone from a live repository with active development, you do not
want to lose a stable reference to what you consider as the primary
branch at your origin repository.

>     Or maybe not. It's not something people probably need to do a lot.
>     And if the point is to have plumbing commands that can do the same,
>     then maybe those commands are sufficient.


* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-15  1:41       ` Ævar Arnfjörð Bjarmason
  2020-12-15  2:22         ` Junio C Hamano
  2020-12-15  2:38         ` Jeff King
@ 2020-12-15  3:22         ` Felipe Contreras
  2 siblings, 0 replies; 109+ messages in thread
From: Felipe Contreras @ 2020-12-15  3:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Junio C Hamano
  Cc: Jonathan Tan, git, Felipe Contreras

Ævar Arnfjörð Bjarmason wrote:

> To borrow from Felipe Contreras's reply in the side-thread "I expect the
> branch name to be chosen by the person who created that repository".
> 
> I suppose this comes down to a mental model of what it means to have
> "created a repository". When I click "create repo" on those popular
> hosting sites (e.g. github & gitlab) and clone it I was expecting it to
> just be a shorthand init + a URL in my config (and refspecs...).

Indeed. But then it would be *them* taking away agency from the user (by
not allowing the user to choose the name of the branch), not us.

In Spanish we have a saying: "don't give extra change", which is similar to
the lawyery advice: "don't volunteer [information]".

Let's not volunteer user complaints. Let GitHub take some of those.

-- 
Felipe Contreras

* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-15  2:55           ` Junio C Hamano
@ 2020-12-15  4:36             ` Jeff King
  2020-12-16  3:09               ` Junio C Hamano
  0 siblings, 1 reply; 109+ messages in thread
From: Jeff King @ 2020-12-15  4:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Jonathan Tan, git,
	Felipe Contreras

On Mon, Dec 14, 2020 at 06:55:33PM -0800, Junio C Hamano wrote:

> >   - it might be nice to be able to have some equivalent to the dwim "git
> >     checkout foo" that creates a new "foo" based off of origin/foo.
> >     Doing "git checkout origin/HEAD" will detach the HEAD. I think right
> >     now you'd have to do something like:
> >
> >       tracking=$(git symbolic-ref refs/remotes/origin/HEAD)
> >       branch=${tracking#refs/remotes/origin/}
> >       git checkout -b $branch $tracking
> 
> Meaning "git checkout origin" would look at origin/HEAD and find the
> remote-tracking branch it points at, and use that name?  I think
> that does make quite a lot of sense.  You are correct to point out
> that not just "git checkout origin/HEAD", but "git checkout origin",
> currently detaches the HEAD at that commit, if you have origin/HEAD
> pointing at one of the remote-tracking branches.

I'm not sure if it's a good idea to change "git checkout origin" here or
not. It already does something useful. I was mostly suggesting that the
other thing might _also_ be useful, but I'm not sure if it is wise to
change the current behavior.

I was thinking more like an explicit way to trigger the dwim-behavior,
like:

  # same as "git checkout foo" magic that creates "foo", but we
  # have said explicitly both that we expect to make the new branch, and
  # also that we expect it to come from origin.
  git checkout --make-local origin/foo

  # similar, but because we are being explicit, we know it is reasonable
  # to dereference HEAD to find the actual branch name
  git checkout --make-local origin/HEAD

I dunno. I hate the name "--make-local", and in the non-dereferencing
form, it is not much different than just "git checkout -b foo
origin/foo". I'm mostly just thinking aloud here. :)

> But if we were to make such a change, "git fetch" shouldn't
> automatically update remotes/origin/HEAD, I would think.  It does
> not matter too much if we are talking about a publishing repository
> where the HEAD rarely changes (and when it does, it is a significant
> event that everybody in the downstream should take notice), but if
> you clone from a live repository with active development, you do not
> want to lose a stable reference to what you consider as the primary
> branch at your origin repository.

That seems orthogonal. Whether there is checkout magic or not, changing
what origin/HEAD points to would be disruptive to selecting it as a
tracking source, or doing diffs, or whatever. But that is why the
proposal in that series was to make the behavior configurable, and to
default to "fill it in if missing", not "always update on fetch".

-Peff

* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-15  1:27   ` Jeff King
@ 2020-12-15 19:10     ` Jonathan Tan
  0 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2020-12-15 19:10 UTC (permalink / raw)
  To: peff; +Cc: jonathantanmy, git

> > Subject: Re: [PATCH] clone: in protocol v2, use remote's default branch
> > 
> > When cloning an empty repository, a default branch is created. However,
> > it is named after the locally configured init.defaultBranch, not the
> > default branch of the remote repository.
> 
> Your subject line puzzled me at first, because I thought we already did
> that. And indeed we do, but this is about adding the unborn case. I
> think this contributed to Ævar's confusion.
> 
> Maybe:
> 
>   Subject: clone: respect unborn remote HEAD
> 
>   When cloning, we choose the default branch based on the remote HEAD.
>   But if there is no remote HEAD, we'll fall back to using our local
>   init.defaultBranch. Traditionally this hasn't been a big deal, because
>   everybody used "master" as the default. But these days it is likely to
>   cause confusion if the server and client implementations choose
>   different values (e.g., if the remote started with "main", we may
>   choose "master" locally, create commits there, and then the user is
>   surprised when they push to "master" and not "main").
> 
>   To solve this...
> 
> makes the current state more clear, as well as motivating why we care.
> 
> It might also be worth breaking the patch up a bit. E.g., implement the
> capability in upload-pack, then infrastructure for the client to use the
> capability and surface the info to transport callers, and then finally
> surface it in the program logic of ls-refs, then clone, etc.
> 
> Not strictly necessary, but it makes it easier to see what is being
> changed at each step.

All this sounds good.

> > Currently, symrefs that have unborn targets (such as in this case) are
> > not communicated by the protocol. Teach Git to advertise and support the
> > "unborn" feature in "ls-refs" (guarded by the lsrefs.unborn config).
> > This feature indicates that "ls-refs" supports the "unborn" argument;
> > when it is specified, "ls-refs" will send the HEAD symref with the name
> > of its unborn target.
> 
> It's probably also worth mentioning that v0 won't get any support here,
> and why.

OK - thanks for your comments. I'll send out an updated version soon.

* [PATCH v2 0/3] Cloning with remote unborn HEAD
  2020-12-11 21:05 ` [PATCH] clone: in protocol v2, use remote's default branch Jonathan Tan
                     ` (2 preceding siblings ...)
  2020-12-15  1:27   ` Jeff King
@ 2020-12-16  2:07   ` Jonathan Tan
  2020-12-16  2:07     ` [PATCH v2 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
                       ` (2 more replies)
  2020-12-21 22:30   ` [PATCH v3 0/3] Cloning with " Jonathan Tan
  2020-12-22 21:54   ` [PATCH v4 " Jonathan Tan
  5 siblings, 3 replies; 109+ messages in thread
From: Jonathan Tan @ 2020-12-16  2:07 UTC (permalink / raw)
  To: git; +Cc: peff, felipe.contreras, gitster, avarab, Jonathan Tan

Thanks everyone for your comments. Changes from v1:

 - Split into patches. Patch 1 has the server-side changes, and patch 2
   is a preparatory patch that just updates the API, so that reviewers
   can more clearly see the difference in logic in patch 3.
 - Updated commit message.
 - Updated test to use "-c init.defaultBranch" instead of "-c
   init.defaultbranch" (capitalization).

Jonathan Tan (3):
  ls-refs: report unborn targets of symrefs
  connect, transport: add no-op arg for future patch
  clone: respect remote unborn HEAD

 Documentation/config/init.txt           |  2 +-
 Documentation/technical/protocol-v2.txt | 10 ++++-
 builtin/clone.c                         | 19 +++++++--
 builtin/fetch-pack.c                    |  3 +-
 builtin/fetch.c                         |  2 +-
 builtin/ls-remote.c                     |  2 +-
 builtin/remote.c                        |  2 +-
 connect.c                               | 29 +++++++++++--
 ls-refs.c                               | 54 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 remote.h                                |  3 +-
 serve.c                                 |  2 +-
 t/t5606-clone-options.sh                |  7 ++--
 t/t5702-protocol-v2.sh                  | 17 ++++++++
 transport-helper.c                      |  7 +++-
 transport-internal.h                    | 13 +++---
 transport.c                             | 29 ++++++++-----
 transport.h                             |  7 +++-
 18 files changed, 166 insertions(+), 43 deletions(-)

-- 
2.29.2.684.gfbc64c5ab5-goog


* [PATCH v2 1/3] ls-refs: report unborn targets of symrefs
  2020-12-16  2:07   ` [PATCH v2 0/3] Cloning with remote unborn HEAD Jonathan Tan
@ 2020-12-16  2:07     ` Jonathan Tan
  2020-12-16  6:16       ` Junio C Hamano
  2020-12-16 18:23       ` Jeff King
  2020-12-16  2:07     ` [PATCH v2 2/3] connect, transport: add no-op arg for future patch Jonathan Tan
  2020-12-16  2:07     ` [PATCH v2 3/3] clone: respect remote unborn HEAD Jonathan Tan
  2 siblings, 2 replies; 109+ messages in thread
From: Jonathan Tan @ 2020-12-16  2:07 UTC (permalink / raw)
  To: git; +Cc: peff, felipe.contreras, gitster, avarab, Jonathan Tan

When cloning, we choose the default branch based on the remote HEAD.
But if there is no remote HEAD reported (which could happen if the
target of the remote HEAD is unborn), we'll fall back to using our local
init.defaultBranch. Traditionally this hasn't been a big deal, because
most repos used "master" as the default. But these days it is likely to
cause confusion if the server and client implementations choose
different values (e.g., if the remote started with "main", we may choose
"master" locally, create commits there, and then the user is surprised
when they push to "master" and not "main").

To solve this, the remote needs to communicate the target of the HEAD
symref, even if it is unborn, and "git clone" needs to use this
information.

Currently, symrefs that have unborn targets (such as in this case) are
not communicated by the protocol. Teach Git to advertise and support the
"unborn" feature in "ls-refs" (guarded by the lsrefs.unborn config).
This feature indicates that "ls-refs" supports the "unborn" argument;
when it is specified, "ls-refs" will send the HEAD symref with the name
of its unborn target.

This change is only for protocol v2. A similar change for protocol v0
would require independent protocol design (there being no analogous
position to signal support for "unborn") and client-side plumbing of the
data required, so the scope of this patch set is limited to protocol v2.

The client side will be updated to use this in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/technical/protocol-v2.txt | 10 ++++-
 ls-refs.c                               | 54 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 serve.c                                 |  2 +-
 4 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index e597b74da3..dfe03aa114 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -192,11 +192,19 @@ ls-refs takes in the following arguments:
 	When specified, only references having a prefix matching one of
 	the provided prefixes are displayed.
 
+If the 'unborn' feature is advertised the following argument can be
+included in the client's request.
+
+    unborn
+	The server may send symrefs pointing to unborn branches in the form
+	"unborn <refname> symref-target:<target>".
+
 The output of ls-refs is as follows:
 
     output = *ref
 	     flush-pkt
-    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+    obj-id-or-unborn = (obj-id | "unborn")
+    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
     ref-attribute = (symref | peeled)
     symref = "symref-target:" symref-target
     peeled = "peeled:" obj-id
diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..fdb644b482 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -32,6 +32,8 @@ struct ls_refs_data {
 	unsigned peel;
 	unsigned symrefs;
 	struct strvec prefixes;
+	unsigned allow_unborn : 1;
+	unsigned unborn : 1;
 };
 
 static int send_ref(const char *refname, const struct object_id *oid,
@@ -47,7 +49,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	if (!ref_match(&data->prefixes, refname_nons))
 		return 0;
 
-	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (oid)
+		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	else
+		strbuf_addf(&refline, "unborn %s", refname_nons);
 	if (data->symrefs && flag & REF_ISSYMREF) {
 		struct object_id unused;
 		const char *symref_target = resolve_ref_unsafe(refname, 0,
@@ -74,8 +79,28 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
-static int ls_refs_config(const char *var, const char *value, void *data)
+static void send_possibly_unborn_head(struct ls_refs_data *data)
 {
+	struct strbuf namespaced = STRBUF_INIT;
+	struct object_id oid;
+	int flag;
+	int null_oid;
+
+	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
+	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
+	null_oid = is_null_oid(&oid);
+	if (!null_oid || (data->symrefs && (flag & REF_ISSYMREF)))
+		send_ref(namespaced.buf, null_oid ? NULL : &oid, flag, data);
+	strbuf_release(&namespaced);
+}
+
+static int ls_refs_config(const char *var, const char *value, void *cb_data)
+{
+	struct ls_refs_data *data = cb_data;
+
+	if (!strcmp("lsrefs.unborn", var))
+		data->allow_unborn = !strcmp(value, "allow") ||
+			!strcmp(value, "advertise");
 	/*
 	 * We only serve fetches over v2 for now, so respect only "uploadpack"
 	 * config. This may need to eventually be expanded to "receive", but we
@@ -91,7 +116,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
 
 	memset(&data, 0, sizeof(data));
 
-	git_config(ls_refs_config, NULL);
+	git_config(ls_refs_config, &data);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
 		const char *arg = request->line;
@@ -103,14 +128,35 @@ int ls_refs(struct repository *r, struct strvec *keys,
 			data.symrefs = 1;
 		else if (skip_prefix(arg, "ref-prefix ", &out))
 			strvec_push(&data.prefixes, out);
+		else if (data.allow_unborn && !strcmp("unborn", arg))
+			data.unborn = 1;
 	}
 
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ls-refs arguments"));
 
-	head_ref_namespaced(send_ref, &data);
+	if (data.unborn)
+		send_possibly_unborn_head(&data);
+	else
+		head_ref_namespaced(send_ref, &data);
 	for_each_namespaced_ref(send_ref, &data);
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
 	return 0;
 }
+
+int ls_refs_advertise(struct repository *r, struct strbuf *value)
+{
+	if (value) {
+		char *str = NULL;
+
+		if (!repo_config_get_string(the_repository, "lsrefs.unborn",
+					    &str) &&
+		    !strcmp("advertise", str)) {
+			strbuf_addstr(value, "unborn");
+			free(str);
+		}
+	}
+
+	return 1;
+}
diff --git a/ls-refs.h b/ls-refs.h
index 7b33a7c6b8..a99e4be0bd 100644
--- a/ls-refs.h
+++ b/ls-refs.h
@@ -6,5 +6,6 @@ struct strvec;
 struct packet_reader;
 int ls_refs(struct repository *r, struct strvec *keys,
 	    struct packet_reader *request);
+int ls_refs_advertise(struct repository *r, struct strbuf *value);
 
 #endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index f6341206c4..30cb56d507 100644
--- a/serve.c
+++ b/serve.c
@@ -62,7 +62,7 @@ struct protocol_capability {
 
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
-	{ "ls-refs", always_advertise, ls_refs },
+	{ "ls-refs", ls_refs_advertise, ls_refs },
 	{ "fetch", upload_pack_advertise, upload_pack_v2 },
 	{ "server-option", always_advertise, NULL },
 	{ "object-format", object_format_advertise, NULL },
-- 
2.29.2.684.gfbc64c5ab5-goog

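To illustrate the line format this patch documents ("unborn <refname>
symref-target:<target>" alongside the usual "<obj-id> <refname> ..."),
here is a rough shell sketch of how a reader might split one ls-refs
response line into its fields once the pkt-line framing has been removed.
The function name parse_ls_refs_line and its output layout are made up
for this example; Git's real client handles these lines in C (see
process_ref_v2() in connect.c).

```shell
# Hypothetical helper: split one ls-refs line of the form
#   <obj-id-or-"unborn"> SP <refname> *(SP <ref-attribute>)
# and print "state=... ref=... target=..." on a single line.
parse_ls_refs_line () {
	line=$1
	case "$line" in
	*" "*) ;;
	*) return 1 ;;	# no SP separator: not a valid ref line
	esac
	first=${line%% *}
	rest=${line#* }
	refname=${rest%% *}
	target=
	# Walk the remaining space-separated attributes and keep only the
	# symref target; other attributes (e.g. "peeled:") are ignored here.
	attrs=${rest#"$refname"}
	for attr in $attrs
	do
		case "$attr" in
		symref-target:*) target=${attr#symref-target:} ;;
		esac
	done
	if test "$first" = unborn
	then
		state=unborn
	else
		state=born
	fi
	printf 'state=%s ref=%s target=%s\n' "$state" "$refname" "$target"
}
```

For the line "unborn HEAD symref-target:refs/heads/main" this prints
"state=unborn ref=HEAD target=refs/heads/main", which is the information
a client would need to name its initial branch after the remote's unborn
HEAD target.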

* [PATCH v2 2/3] connect, transport: add no-op arg for future patch
  2020-12-16  2:07   ` [PATCH v2 0/3] Cloning with remote unborn HEAD Jonathan Tan
  2020-12-16  2:07     ` [PATCH v2 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
@ 2020-12-16  2:07     ` Jonathan Tan
  2020-12-16  6:20       ` Junio C Hamano
  2020-12-16  2:07     ` [PATCH v2 3/3] clone: respect remote unborn HEAD Jonathan Tan
  2 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2020-12-16  2:07 UTC (permalink / raw)
  To: git; +Cc: peff, felipe.contreras, gitster, avarab, Jonathan Tan

A future patch will require transport_get_remote_refs() and
get_remote_refs() to gain a new argument. Add the argument in this
patch, with no effect on execution, so that the future patch only needs
to concern itself with new logic.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/clone.c      |  2 +-
 builtin/fetch-pack.c |  3 ++-
 builtin/fetch.c      |  2 +-
 builtin/ls-remote.c  |  2 +-
 builtin/remote.c     |  2 +-
 connect.c            |  5 ++++-
 remote.h             |  3 ++-
 transport-helper.c   |  7 +++++--
 transport-internal.h | 13 +++++--------
 transport.c          | 29 ++++++++++++++++++-----------
 transport.h          |  7 ++++++-
 11 files changed, 46 insertions(+), 29 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index a0841923cf..70f9450db4 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1264,7 +1264,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (!option_no_tags)
 		strvec_push(&ref_prefixes, "refs/tags/");
 
-	refs = transport_get_remote_refs(transport, &ref_prefixes);
+	refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 
 	if (refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 58b7c1fbdc..9f921dfab4 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	version = discover_version(&reader);
 	switch (version) {
 	case protocol_v2:
-		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
+		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
+				args.stateless_rpc, NULL);
 		break;
 	case protocol_v1:
 	case protocol_v0:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index ecf8537605..a7ef59acfc 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1393,7 +1393,7 @@ static int do_fetch(struct transport *transport,
 
 	if (must_list_refs) {
 		trace2_region_enter("fetch", "remote_refs", the_repository);
-		remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
+		remote_refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 		trace2_region_leave("fetch", "remote_refs", the_repository);
 	} else
 		remote_refs = NULL;
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index 092917eca2..4cf3f60b1b 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -118,7 +118,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (server_options.nr)
 		transport->server_options = &server_options;
 
-	ref = transport_get_remote_refs(transport, &ref_prefixes);
+	ref = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 	if (ref) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 		repo_set_hash_algo(the_repository, hash_algo);
diff --git a/builtin/remote.c b/builtin/remote.c
index c1b211b272..246e62f118 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -950,7 +950,7 @@ static int get_remote_ref_states(const char *name,
 	if (query) {
 		transport = transport_get(states->remote, states->remote->url_nr > 0 ?
 			states->remote->url[0] : NULL);
-		remote_refs = transport_get_remote_refs(transport, NULL);
+		remote_refs = transport_get_remote_refs(transport, NULL, NULL);
 		transport_disconnect(transport);
 
 		states->queried = 1;
diff --git a/connect.c b/connect.c
index 8b8f56cf6d..99d9052365 100644
--- a/connect.c
+++ b/connect.c
@@ -455,7 +455,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct strvec *ref_prefixes,
 			     const struct string_list *server_options,
-			     int stateless_rpc)
+			     int stateless_rpc,
+			     char **unborn_head_target)
 {
 	int i;
 	const char *hash_name;
@@ -496,6 +497,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		if (unborn_head_target)
+			BUG("NEEDSWORK: provide unborn HEAD target to caller while reading refs");
 		if (!process_ref_v2(reader, &list))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
diff --git a/remote.h b/remote.h
index 3211abdf05..967f2178d8 100644
--- a/remote.h
+++ b/remote.h
@@ -198,7 +198,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct strvec *ref_prefixes,
 			     const struct string_list *server_options,
-			     int stateless_rpc);
+			     int stateless_rpc,
+			     char **unborn_head_target);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 
diff --git a/transport-helper.c b/transport-helper.c
index 5f6e0b3bd8..5d97eba935 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1162,13 +1162,16 @@ static int has_attribute(const char *attrs, const char *attr)
 }
 
 static struct ref *get_refs_list(struct transport *transport, int for_push,
-				 const struct strvec *ref_prefixes)
+				 const struct strvec *ref_prefixes,
+				 char **unborn_head_target)
 {
 	get_helper(transport);
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push, ref_prefixes);
+		return transport->vtable->get_refs_list(transport, for_push,
+							ref_prefixes,
+							unborn_head_target);
 	}
 
 	return get_refs_list_using_list(transport, for_push);
diff --git a/transport-internal.h b/transport-internal.h
index 27c9daffc4..5037f6197d 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -18,19 +18,16 @@ struct transport_vtable {
 	 * the transport to try to share connections, for_push is a
 	 * hint as to whether the ultimate operation is a push or a fetch.
 	 *
-	 * If communicating using protocol v2 a list of prefixes can be
-	 * provided to be sent to the server to enable it to limit the ref
-	 * advertisement.  Since ref filtering is done on the server's end, and
-	 * only when using protocol v2, this list will be ignored when not
-	 * using protocol v2 meaning this function can return refs which don't
-	 * match the provided ref_prefixes.
-	 *
 	 * If the transport is able to determine the remote hash for
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
+	 *
+	 * See transport_get_remote_refs() for information on ref_prefixes and
+	 * unborn_head_target.
 	 **/
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
-				     const struct strvec *ref_prefixes);
+				     const struct strvec *ref_prefixes,
+				     char **unborn_head_target);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index 47da955e4f..815e175017 100644
--- a/transport.c
+++ b/transport.c
@@ -127,7 +127,8 @@ struct bundle_transport_data {
 
 static struct ref *get_refs_from_bundle(struct transport *transport,
 					int for_push,
-					const struct strvec *ref_prefixes)
+					const struct strvec *ref_prefixes,
+					char **unborn_head_target)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -163,7 +164,7 @@ static int fetch_refs_from_bundle(struct transport *transport,
 	int ret;
 
 	if (!data->get_refs_from_bundle_called)
-		get_refs_from_bundle(transport, 0, NULL);
+		get_refs_from_bundle(transport, 0, NULL, NULL);
 	ret = unbundle(the_repository, &data->header, data->fd,
 			   transport->progress ? BUNDLE_VERBOSE : 0);
 	transport->hash_algo = data->header.hash_algo;
@@ -281,7 +282,7 @@ static void die_if_server_options(struct transport *transport)
  */
 static struct ref *handshake(struct transport *transport, int for_push,
 			     const struct strvec *ref_prefixes,
-			     int must_list_refs)
+			     int must_list_refs, char **unborn_head_target)
 {
 	struct git_transport_data *data = transport->data;
 	struct ref *refs = NULL;
@@ -301,7 +302,8 @@ static struct ref *handshake(struct transport *transport, int for_push,
 			get_remote_refs(data->fd[1], &reader, &refs, for_push,
 					ref_prefixes,
 					transport->server_options,
-					transport->stateless_rpc);
+					transport->stateless_rpc,
+					unborn_head_target);
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -324,9 +326,11 @@ static struct ref *handshake(struct transport *transport, int for_push,
 }
 
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
-					const struct strvec *ref_prefixes)
+					const struct strvec *ref_prefixes,
+					char **unborn_head_target)
 {
-	return handshake(transport, for_push, ref_prefixes, 1);
+	return handshake(transport, for_push, ref_prefixes, 1,
+			 unborn_head_target);
 }
 
 static int fetch_refs_via_pack(struct transport *transport,
@@ -370,7 +374,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 				break;
 			}
 		}
-		refs_tmp = handshake(transport, 0, NULL, must_list_refs);
+		refs_tmp = handshake(transport, 0, NULL, must_list_refs, NULL);
 	}
 
 	if (data->version == protocol_unknown_version)
@@ -765,7 +769,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 		return -1;
 
 	if (!data->got_remote_heads)
-		get_refs_via_connect(transport, 1, NULL);
+		get_refs_via_connect(transport, 1, NULL, NULL);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
@@ -1251,7 +1255,8 @@ int transport_push(struct repository *r,
 
 		trace2_region_enter("transport_push", "get_refs_list", r);
 		remote_refs = transport->vtable->get_refs_list(transport, 1,
-							       &ref_prefixes);
+							       &ref_prefixes,
+							       NULL);
 		trace2_region_leave("transport_push", "get_refs_list", r);
 
 		strvec_clear(&ref_prefixes);
@@ -1370,12 +1375,14 @@ int transport_push(struct repository *r,
 }
 
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes)
+					    const struct strvec *ref_prefixes,
+					    char **unborn_head_target)
 {
 	if (!transport->got_remote_refs) {
 		transport->remote_refs =
 			transport->vtable->get_refs_list(transport, 0,
-							 ref_prefixes);
+							 ref_prefixes,
+							 unborn_head_target);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 24558c027d..65de0c9c00 100644
--- a/transport.h
+++ b/transport.h
@@ -241,9 +241,14 @@ int transport_push(struct repository *repo,
  * advertisement.  Since ref filtering is done on the server's end (and only
  * when using protocol v2), this can return refs which don't match the provided
  * ref_prefixes.
+ *
+ * If unborn_head_target is not NULL, and the remote reports HEAD as pointing
+ * to an unborn branch, this function stores the unborn branch in
+ * unborn_head_target. It should be freed by the caller.
  */
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes);
+					    const struct strvec *ref_prefixes,
+					    char **unborn_head_target);
 
 /*
  * Fetch the hash algorithm used by a remote.
-- 
2.29.2.684.gfbc64c5ab5-goog


* [PATCH v2 3/3] clone: respect remote unborn HEAD
  2020-12-16  2:07   ` [PATCH v2 0/3] Cloning with remote unborn HEAD Jonathan Tan
  2020-12-16  2:07     ` [PATCH v2 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
  2020-12-16  2:07     ` [PATCH v2 2/3] connect, transport: add no-op arg for future patch Jonathan Tan
@ 2020-12-16  2:07     ` Jonathan Tan
  2 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2020-12-16  2:07 UTC (permalink / raw)
  To: git; +Cc: peff, felipe.contreras, gitster, avarab, Jonathan Tan

Teach Git to use the "unborn" feature introduced in a previous patch as
follows: Git will always send the "unborn" argument if it is supported
by the server. During "git clone", if cloning an empty repository, Git
will use the new information to determine the local branch to create. In
all other cases, Git will ignore it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/config/init.txt |  2 +-
 builtin/clone.c               | 19 ++++++++++++++++---
 connect.c                     | 28 ++++++++++++++++++++++++----
 t/t5606-clone-options.sh      |  7 ++++---
 t/t5702-protocol-v2.sh        | 17 +++++++++++++++++
 5 files changed, 62 insertions(+), 11 deletions(-)

diff --git a/Documentation/config/init.txt b/Documentation/config/init.txt
index dc77f8c844..79c79d6617 100644
--- a/Documentation/config/init.txt
+++ b/Documentation/config/init.txt
@@ -4,4 +4,4 @@ init.templateDir::
 
 init.defaultBranch::
 	Allows overriding the default branch name e.g. when initializing
-	a new repository or when cloning an empty repository.
+	a new repository.
diff --git a/builtin/clone.c b/builtin/clone.c
index 70f9450db4..217c87fddf 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -980,6 +980,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	int submodule_progress;
 
 	struct strvec ref_prefixes = STRVEC_INIT;
+	char *unborn_head_target = NULL;
 
 	packet_trace_identity("clone");
 
@@ -1264,7 +1265,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (!option_no_tags)
 		strvec_push(&ref_prefixes, "refs/tags/");
 
-	refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
+	refs = transport_get_remote_refs(transport, &ref_prefixes,
+					 &unborn_head_target);
 
 	if (refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
@@ -1323,10 +1325,20 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		remote_head = NULL;
 		option_no_checkout = 1;
 		if (!option_bare) {
-			const char *branch = git_default_branch_name();
-			char *ref = xstrfmt("refs/heads/%s", branch);
+			const char *branch;
+			char *ref;
+
+			if (unborn_head_target &&
+			    skip_prefix(unborn_head_target, "refs/heads/", &branch)) {
+				ref = unborn_head_target;
+				unborn_head_target = NULL;
+			} else {
+				branch = git_default_branch_name();
+				ref = xstrfmt("refs/heads/%s", branch);
+			}
 
 			install_branch_config(0, branch, remote_name, ref);
+			create_symref("HEAD", ref, "");
 			free(ref);
 		}
 	}
@@ -1373,6 +1385,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	strbuf_release(&key);
 	junk_mode = JUNK_LEAVE_ALL;
 
+	free(unborn_head_target);
 	strvec_clear(&ref_prefixes);
 	return err;
 }
diff --git a/connect.c b/connect.c
index 99d9052365..3c35324b4c 100644
--- a/connect.c
+++ b/connect.c
@@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 }
 
 /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
-static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
+static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
+			  char **unborn_head_target)
 {
 	int ret = 1;
 	int i = 0;
@@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
 		goto out;
 	}
 
+	if (!strcmp("unborn", line_sections.items[i].string)) {
+		i++;
+		if (unborn_head_target &&
+		    !strcmp("HEAD", line_sections.items[i++].string)) {
+			/*
+			 * Look for the symref target (if any). If found,
+			 * return it to the caller.
+			 */
+			for (; i < line_sections.nr; i++) {
+				const char *arg = line_sections.items[i].string;
+
+				if (skip_prefix(arg, "symref-target:", &arg)) {
+					*unborn_head_target = xstrdup(arg);
+					break;
+				}
+			}
+		}
+		goto out;
+	}
 	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
 	    *end) {
 		ret = 0;
@@ -489,6 +509,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	if (!for_push)
 		packet_write_fmt(fd_out, "peel\n");
 	packet_write_fmt(fd_out, "symrefs\n");
+	if (server_supports_feature("ls-refs", "unborn", 0))
+		packet_write_fmt(fd_out, "unborn\n");
 	for (i = 0; ref_prefixes && i < ref_prefixes->nr; i++) {
 		packet_write_fmt(fd_out, "ref-prefix %s\n",
 				 ref_prefixes->v[i]);
@@ -497,9 +519,7 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
-		if (unborn_head_target)
-			BUG("NEEDSWORK: provide unborn HEAD target to caller while reading refs");
-		if (!process_ref_v2(reader, &list))
+		if (!process_ref_v2(reader, &list, unborn_head_target))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
 
diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
index 7f082fb23b..d3bd79987b 100755
--- a/t/t5606-clone-options.sh
+++ b/t/t5606-clone-options.sh
@@ -102,11 +102,12 @@ test_expect_success 'redirected clone -v does show progress' '
 '
 
 test_expect_success 'chooses correct default initial branch name' '
-	git init --bare empty &&
+	git -c init.defaultBranch=foo init --bare empty &&
+	test_config -C empty lsrefs.unborn advertise &&
 	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
 	git -c init.defaultBranch=up clone empty whats-up &&
-	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
-	test refs/heads/up = $(git -C whats-up config branch.up.merge)
+	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
+	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
 '
 
 test_expect_success 'guesses initial branch name correctly' '
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d5b17909b..6afd6bb482 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -209,6 +209,23 @@ test_expect_success 'clone with file:// using protocol v2' '
 	grep "ref-prefix refs/tags/" log
 '
 
+test_expect_success 'clone of empty repo propagates name of default branch' '
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+	test_config -C file_empty_parent lsrefs.unborn advertise &&
+
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
+test_expect_success '...but not if it is not advertised' '
+	test_config -C file_empty_parent lsrefs.unborn none &&
+
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child_2 &&
+	grep "refs/heads/main" file_empty_child_2/.git/HEAD
+'
+
 test_expect_success 'fetch with file:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
-- 
2.29.2.684.gfbc64c5ab5-goog


* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-15  4:36             ` Jeff King
@ 2020-12-16  3:09               ` Junio C Hamano
  2020-12-16 18:39                 ` Jeff King
  0 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2020-12-16  3:09 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, Jonathan Tan, git,
	Felipe Contreras

Jeff King <peff@peff.net> writes:

>> Meaning "git checkout origin" would look at origin/HEAD and find the
>> remote-tracking branch it points at, and uses that name?  I think
>> that does make quite a lot of sense.  You are correct to point out
>> that not just "git checkout origin/HEAD", but "git checkout origin",
>> currently detaches the HEAD at that commit, if you have origin/HEAD
>> pointing at one of the remote-tracking branches.
>
> I'm not sure if it's a good idea to change "git checkout origin" here or
> not. It already does something useful. I was mostly suggesting that the
> other thing might _also_ be useful, but I'm not sure if it is wise to
> change the current behavior.

Well, "git checkout origin/HEAD" would also do something useful,
which happens to be identical to "git checkout origin", to detach
HEAD at the commit.

> I was thinking more like an explicit way to trigger the dwim-behavior,
> like:
>
>   # same as "git checkout foo" magic that creates "foo", but we
>   # have said explicitly both that we expect to make the new branch, and
>   # also that we expect it to come from origin.
>   git checkout --make-local origin/foo

By default I think --guess (formerly known as --dwim) is enabled, so
"git checkout foo" is "git checkout --guess foo", which is making
local 'foo' out of the uniquely found remote-tracking branch.  This
new one is to reduce the "uniquely found" part from the magic and
let you be a bit more explicit, but not explicit enough to say "-t"
or "-b foo"?  I am not sure if this is all that useful.

If this were a slightly different proposal, I would see the
convenience value in it, though.  Currently what "--guess" does is:

      If the name 'foo' given does not exist as a local branch,
      and the name appears exactly once as a remote-tracking branch
      from some remote (i.e. 'refs/remotes/origin/foo' exists, but
      there is no other 'refs/remotes/*/foo'), create a local 'foo'
      that builds on that remote-tracking branch and check it out.

What would happen if we tweaked the existing "--guess" behaviour
slightly?

      "git checkout --guess origin/foo", even when there is a second
      remote 'publish' that also has a remote-tracking branch for
      its 'foo' (i.e. both 'refs/remotes/{origin,publish}/foo'
      exist), can be used to disambiguate among these remotes with
      'foo'.  You'd get local 'foo' that builds on 'foo' from the
      remote 'origin' and check it out.
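
A rough sketch of the two rules, not Git's actual code.  The helper
names are made up for illustration, refs are plain strings, and remote
names are assumed not to contain a slash.  The first function is the
current "exactly one remote has it" rule; the second is the tweaked
rule where the remote is named explicitly.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan "refs/remotes/<remote>/<branch>" entries
 * and report whether exactly one remote carries <branch>.  On success
 * the remote name is copied into remote_out.
 */
static int guess_unique_remote(const char **refs, size_t nr,
			       const char *branch,
			       char *remote_out, size_t outlen)
{
	static const char prefix[] = "refs/remotes/";
	size_t i, matches = 0;

	for (i = 0; i < nr; i++) {
		const char *rest, *slash;

		if (strncmp(refs[i], prefix, strlen(prefix)))
			continue;
		rest = refs[i] + strlen(prefix);
		slash = strchr(rest, '/');
		if (!slash || strcmp(slash + 1, branch))
			continue;
		/* Remember the first remote that has the branch. */
		if (++matches == 1)
			snprintf(remote_out, outlen, "%.*s",
				 (int)(slash - rest), rest);
	}
	return matches == 1;
}

/*
 * Hypothetical helper for the tweaked rule: the remote is given
 * explicitly, so other remotes with the same branch do not matter.
 */
static int guess_explicit_remote(const char **refs, size_t nr,
				 const char *remote, const char *branch)
{
	char want[256];
	size_t i;
	int n = snprintf(want, sizeof(want), "refs/remotes/%s/%s",
			 remote, branch);

	if (n < 0 || (size_t)n >= sizeof(want))
		return 0;
	for (i = 0; i < nr; i++)
		if (!strcmp(refs[i], want))
			return 1;
	return 0;
}
```

With both 'origin' and 'publish' having 'foo', the first helper
refuses to guess, while the second accepts 'origin' + 'foo'.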

>   # similar, but because we are being explicit, we know it is reasonable
>   # to dereference HEAD to find the actual branch name
>   git checkout --make-local origin/HEAD

The user does not need "git symbolic-ref refs/remotes/origin/HEAD"
if such a feature were available.  "git checkout --some-option origin"
without having to say /HEAD may be a better UI, though.

And since "checkout" is a Porcelain, and its always-on DWIM feature
is subject to improvement for human use, I do not see why
that --some-option cannot be --guess.  If I want to get the current
behaviour, I can explicitly say "git checkout --detach origin"
anyway, no?

> That seems orthogonal. Whether there is checkout magic or not, changing
> what origin/HEAD points to would be disruptive to selecting it as a
> tracking source, or doing diffs, or whatever. But that is why the
> proposal in that series was to make the behavior configurable, and
> default to "fill it in if missing" as the default, not "always update on
> fetch".

Ah, I totally forgot that the favoured variant was "fill in if
missing, but don't move once it is set".  Yes, I think that is a
sensible default.
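
As a sketch only, the two policies could be told apart like this.  The
enum and function names are hypothetical and do not come from that
series; a NULL pointer stands for "not set" or "not advertised".

```c
#include <stddef.h>

/* Hypothetical policy names, for illustration only. */
enum head_update_policy {
	HEAD_FILL_IF_MISSING,	/* set origin/HEAD once, then leave it */
	HEAD_ALWAYS_UPDATE	/* follow the remote on every fetch */
};

/*
 * Return the target origin/HEAD should have after a fetch.
 * "current" is the existing target (NULL if unset) and "advertised"
 * is what the remote reports (NULL if it reports nothing).
 */
static const char *next_origin_head(enum head_update_policy policy,
				    const char *current,
				    const char *advertised)
{
	if (!advertised)
		return current;
	if (policy == HEAD_ALWAYS_UPDATE || !current)
		return advertised;
	return current;
}
```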

Thanks.

* Re: [PATCH v2 1/3] ls-refs: report unborn targets of symrefs
  2020-12-16  2:07     ` [PATCH v2 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
@ 2020-12-16  6:16       ` Junio C Hamano
  2020-12-16 23:49         ` Jonathan Tan
  2020-12-16 18:23       ` Jeff King
  1 sibling, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2020-12-16  6:16 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff, felipe.contreras, avarab

Jonathan Tan <jonathantanmy@google.com> writes:

> diff --git a/ls-refs.c b/ls-refs.c
> index a1e0b473e4..fdb644b482 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -32,6 +32,8 @@ struct ls_refs_data {
>  	unsigned peel;
>  	unsigned symrefs;
>  	struct strvec prefixes;
> +	unsigned allow_unborn : 1;
> +	unsigned unborn : 1;
>  };

OK, so the idea is

 - lsrefs.unborn controls on the serving side if this new feature is
   advertised to the incoming clients;

 - the client side can ask "unborn" and that is recorded in
   data.unborn bit at the beginning of ls_refs();

 - the response may show an unborn symbolic ref when "unborn" was asked.

which looks internally consistent.

> @@ -47,7 +49,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  	if (!ref_match(&data->prefixes, refname_nons))
>  		return 0;
>  
> -	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
> +	if (oid)
> +		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
> +	else
> +		strbuf_addf(&refline, "unborn %s", refname_nons);

A line that has token "unborn" instead of a full object name in hex
is something existing clients are not prepared to receive and grok.
It is not immediately clear how this new code makes sure that
clients that did not ask for "unborn" in what was parsed at the
beginning of ls_refs() will not see such a line.  Presumably, no
existing caller of send_ref passes NULL in oid and the only one
that does so is the one in send_possibly_unborn_head() when the HEAD
points at an unborn branch.

OK.
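
To spell out the invariant being relied on, here is a minimal sketch.
It is not the patch's code: the function name and the
client_asked_unborn parameter are assumptions made for illustration,
and the patch itself does the gating in the caller rather than in the
formatter.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical formatter: write one ls-refs response line into buf.
 * oid_hex is NULL for an unborn ref.  Returns 1 if a line was
 * written, 0 if the ref must be skipped because the client never
 * asked for "unborn" and so cannot parse such a line.
 */
static int format_ref_line(char *buf, size_t len, const char *oid_hex,
			   const char *refname, int client_asked_unborn)
{
	if (oid_hex) {
		snprintf(buf, len, "%s %s", oid_hex, refname);
		return 1;
	}
	if (!client_asked_unborn)
		return 0;
	snprintf(buf, len, "unborn %s", refname);
	return 1;
}
```

The point is only that an "unborn ..." line is reachable solely when
the request carried the "unborn" argument.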

> @@ -74,8 +79,28 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  	return 0;
>  }
>  
> -static int ls_refs_config(const char *var, const char *value, void *data)
> +static void send_possibly_unborn_head(struct ls_refs_data *data)
>  {
> +	struct strbuf namespaced = STRBUF_INIT;
> +	struct object_id oid;
> +	int flag;
> +	int null_oid;

I'd suggest renaming this one, which masks the global null_oid of
"const struct object_id" type.  This code does not break only
because is_null_oid() happens to be implemented as a static inline,
and not as a C-preprocessor macro, right?
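
A stripped-down illustration of the shadowing, using a stand-in
struct rather than Git's real definitions.  head_is_unborn() is a
made-up name.

```c
#include <string.h>

/* Stand-in for Git's struct; the size is arbitrary here. */
struct object_id {
	unsigned char hash[32];
};

/* File-scope object, standing in for the global null_oid. */
static const struct object_id null_oid = { { 0 } };

static inline int is_null_oid(const struct object_id *oid)
{
	/*
	 * "null_oid" is looked up where this function is defined,
	 * so it always means the file-scope object above.
	 */
	return !memcmp(oid->hash, null_oid.hash, sizeof(oid->hash));
}

static int head_is_unborn(const struct object_id *oid)
{
	int null_oid;	/* shadows the global inside this function */

	/*
	 * Fine with an inline function.  If is_null_oid() were a
	 * macro expanding to "... null_oid.hash ...", the expansion
	 * would pick up the local int and fail to compile.
	 */
	null_oid = is_null_oid(oid);
	return null_oid;
}
```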

> +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
> +	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
> +	null_oid = is_null_oid(&oid);
> +	if (!null_oid || (data->symrefs && (flag & REF_ISSYMREF)))
> +		send_ref(namespaced.buf, null_oid ? NULL : &oid, flag, data);
> +	strbuf_release(&namespaced);
> +}
> +
> +static int ls_refs_config(const char *var, const char *value, void *cb_data)
> +{
> +	struct ls_refs_data *data = cb_data;
> +
> +	if (!strcmp("lsrefs.unborn", var))
> +		data->allow_unborn = !strcmp(value, "allow") ||
> +			!strcmp(value, "advertise");

Are there differences between allow and advertise?  Would
lsrefs.allowUnborn that is a boolean, thus allowing the value to be
parsed by git_config_bool(), make more sense here, I wonder.  Or is
this meant as some future enhancement, e.g. you plan to have some
servers that allow "unborn" request even though they do not actively
advertise the support of the feature?  Without documentation update
or an in-code comment, it is rather hard to guess the intention
here.
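
If the three-way distinction is intended, one way to make it explicit
is sketched below.  The enum and helper names are hypothetical; only
the values "allow" and "advertise" come from the patch.  The sketch
also guards against a NULL value, which a config callback can receive
for a valueless key.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tri-state for lsrefs.unborn. */
enum unborn_mode {
	UNBORN_IGNORE,		/* neither advertised nor accepted */
	UNBORN_ALLOW,		/* accepted if asked, not advertised */
	UNBORN_ADVERTISE	/* advertised and accepted */
};

static enum unborn_mode parse_unborn_mode(const char *value)
{
	if (value && !strcmp(value, "advertise"))
		return UNBORN_ADVERTISE;
	if (value && !strcmp(value, "allow"))
		return UNBORN_ALLOW;
	return UNBORN_IGNORE;
}

/* Should the server honor an "unborn" argument in a request? */
static int unborn_is_accepted(enum unborn_mode mode)
{
	return mode != UNBORN_IGNORE;
}

/* Should the server list "unborn" in its capability advertisement? */
static int unborn_is_advertised(enum unborn_mode mode)
{
	return mode == UNBORN_ADVERTISE;
}
```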

> @@ -91,7 +116,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
>  
>  	memset(&data, 0, sizeof(data));
>  
> -	git_config(ls_refs_config, NULL);
> +	git_config(ls_refs_config, &data);
>  
>  	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
>  		const char *arg = request->line;
> @@ -103,14 +128,35 @@ int ls_refs(struct repository *r, struct strvec *keys,
>  			data.symrefs = 1;
>  		else if (skip_prefix(arg, "ref-prefix ", &out))
>  			strvec_push(&data.prefixes, out);
> +		else if (data.allow_unborn && !strcmp("unborn", arg))
> +			data.unborn = 1;

Somehow, it appears to me that writing it along the lines of
this ...

		else if (!strcmp("unborn", arg))
			data.unborn = data.allow_unborn;

... would make more sense.  Whether we allowed the "unborn" request or
not, when the other side says "unborn", we are handling the request
for the unborn feature, and the condition with strcmp() alone
signals that better (in other words, when we acquire more request
types, we do not want to pass the control to "else if" clauses that
may come after this part when we see an "unborn" request and when we
are configured not to accept "unborn" requests).

It does not make any difference in the current code, of course, and
it is more about future-proofing the cleanness of the code.
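
A self-contained sketch of that shape follows.  The struct is a
simplified stand-in for ls_refs_data, parse_request_arg() is a made-up
name, and "ref-prefix" handling is left out.

```c
#include <string.h>

/* Simplified stand-in for struct ls_refs_data. */
struct ls_refs_request {
	unsigned peel : 1;
	unsigned symrefs : 1;
	unsigned allow_unborn : 1;	/* from configuration */
	unsigned unborn : 1;		/* effective for this request */
};

/*
 * Hypothetical per-argument parser.  Returns 1 if the argument was
 * recognized, 0 otherwise.  "unborn" is always consumed by its own
 * branch; configuration only decides whether it takes effect.
 */
static int parse_request_arg(struct ls_refs_request *data, const char *arg)
{
	if (!strcmp("peel", arg))
		data->peel = 1;
	else if (!strcmp("symrefs", arg))
		data->symrefs = 1;
	else if (!strcmp("unborn", arg))
		data->unborn = data->allow_unborn;
	else
		return 0;
	return 1;
}
```

With allow_unborn unset, an "unborn" argument is still recognized but
leaves data->unborn at 0, so no later "else if" clause ever sees it.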

> -	head_ref_namespaced(send_ref, &data);
> +	if (data.unborn)
> +		send_possibly_unborn_head(&data);
> +	else
> +		head_ref_namespaced(send_ref, &data);

I found it a bit disturbing that send_possibly_unborn_head()
duplicates about 70% of what the more generic head_ref_namespaced()
does.

>  	for_each_namespaced_ref(send_ref, &data);
>  	packet_flush(1);
>  	strvec_clear(&data.prefixes);
>  	return 0;
>  }
> +
> +int ls_refs_advertise(struct repository *r, struct strbuf *value)
> +{
> +	if (value) {
> +		char *str = NULL;
> +
> +		if (!repo_config_get_string(the_repository, "lsrefs.unborn",
> +					    &str) &&
> +		    !strcmp("advertise", str)) {
> +			strbuf_addstr(value, "unborn");
> +			free(str);
> +		}
> +	}
> +
> +	return 1;
> +}
> diff --git a/ls-refs.h b/ls-refs.h
> index 7b33a7c6b8..a99e4be0bd 100644
> --- a/ls-refs.h
> +++ b/ls-refs.h
> @@ -6,5 +6,6 @@ struct strvec;
>  struct packet_reader;
>  int ls_refs(struct repository *r, struct strvec *keys,
>  	    struct packet_reader *request);
> +int ls_refs_advertise(struct repository *r, struct strbuf *value);
>  
>  #endif /* LS_REFS_H */
> diff --git a/serve.c b/serve.c
> index f6341206c4..30cb56d507 100644
> --- a/serve.c
> +++ b/serve.c
> @@ -62,7 +62,7 @@ struct protocol_capability {
>  
>  static struct protocol_capability capabilities[] = {
>  	{ "agent", agent_advertise, NULL },
> -	{ "ls-refs", always_advertise, ls_refs },
> +	{ "ls-refs", ls_refs_advertise, ls_refs },
>  	{ "fetch", upload_pack_advertise, upload_pack_v2 },
>  	{ "server-option", always_advertise, NULL },
>  	{ "object-format", object_format_advertise, NULL },

Thanks.

* Re: [PATCH v2 2/3] connect, transport: add no-op arg for future patch
  2020-12-16  2:07     ` [PATCH v2 2/3] connect, transport: add no-op arg for future patch Jonathan Tan
@ 2020-12-16  6:20       ` Junio C Hamano
  0 siblings, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2020-12-16  6:20 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff, felipe.contreras, avarab

Jonathan Tan <jonathantanmy@google.com> writes:

> A future patch will require transport_get_remote_refs() and
> get_remote_refs() to gain a new argument. Add the argument in this
> patch, with no effect on execution, so that the future patch only needs
> to concern itself with new logic.

Please give at least a hint about what this "new argument" will be
passing around through the callchain.  E.g.

	In a future patch we plan to return the name of an unborn
	current branch from deep in the callchain to a caller via a
	new pointer parameter that points at a variable in the
	caller when the caller calls get_remote_refs() and
	transport_get_remote_refs().  Add the parameter to functions
	involved in the callchain, but no caller passes an actual
	argument yet in this step ...

or something like that.

>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  builtin/clone.c      |  2 +-
>  builtin/fetch-pack.c |  3 ++-
>  builtin/fetch.c      |  2 +-
>  builtin/ls-remote.c  |  2 +-
>  builtin/remote.c     |  2 +-
>  connect.c            |  5 ++++-
>  remote.h             |  3 ++-
>  transport-helper.c   |  7 +++++--
>  transport-internal.h | 13 +++++--------
>  transport.c          | 29 ++++++++++++++++++-----------
>  transport.h          |  7 ++++++-
>  11 files changed, 46 insertions(+), 29 deletions(-)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index a0841923cf..70f9450db4 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1264,7 +1264,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	if (!option_no_tags)
>  		strvec_push(&ref_prefixes, "refs/tags/");
>  
> -	refs = transport_get_remote_refs(transport, &ref_prefixes);
> +	refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
>  
>  	if (refs) {
>  		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index 58b7c1fbdc..9f921dfab4 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>  	version = discover_version(&reader);
>  	switch (version) {
>  	case protocol_v2:
> -		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
> +		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
> +				args.stateless_rpc, NULL);
>  		break;
>  	case protocol_v1:
>  	case protocol_v0:
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index ecf8537605..a7ef59acfc 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1393,7 +1393,7 @@ static int do_fetch(struct transport *transport,
>  
>  	if (must_list_refs) {
>  		trace2_region_enter("fetch", "remote_refs", the_repository);
> -		remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
> +		remote_refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
>  		trace2_region_leave("fetch", "remote_refs", the_repository);
>  	} else
>  		remote_refs = NULL;
> diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
> index 092917eca2..4cf3f60b1b 100644
> --- a/builtin/ls-remote.c
> +++ b/builtin/ls-remote.c
> @@ -118,7 +118,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
>  	if (server_options.nr)
>  		transport->server_options = &server_options;
>  
> -	ref = transport_get_remote_refs(transport, &ref_prefixes);
> +	ref = transport_get_remote_refs(transport, &ref_prefixes, NULL);
>  	if (ref) {
>  		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
>  		repo_set_hash_algo(the_repository, hash_algo);
> diff --git a/builtin/remote.c b/builtin/remote.c
> index c1b211b272..246e62f118 100644
> --- a/builtin/remote.c
> +++ b/builtin/remote.c
> @@ -950,7 +950,7 @@ static int get_remote_ref_states(const char *name,
>  	if (query) {
>  		transport = transport_get(states->remote, states->remote->url_nr > 0 ?
>  			states->remote->url[0] : NULL);
> -		remote_refs = transport_get_remote_refs(transport, NULL);
> +		remote_refs = transport_get_remote_refs(transport, NULL, NULL);
>  		transport_disconnect(transport);
>  
>  		states->queried = 1;
> diff --git a/connect.c b/connect.c
> index 8b8f56cf6d..99d9052365 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -455,7 +455,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  			     struct ref **list, int for_push,
>  			     const struct strvec *ref_prefixes,
>  			     const struct string_list *server_options,
> -			     int stateless_rpc)
> +			     int stateless_rpc,
> +			     char **unborn_head_target)
>  {
>  	int i;
>  	const char *hash_name;
> @@ -496,6 +497,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  
>  	/* Process response from server */
>  	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
> +		if (unborn_head_target)
> +			BUG("NEEDSWORK: provide unborn HEAD target to caller while reading refs");
>  		if (!process_ref_v2(reader, &list))
>  			die(_("invalid ls-refs response: %s"), reader->line);
>  	}
> diff --git a/remote.h b/remote.h
> index 3211abdf05..967f2178d8 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -198,7 +198,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  			     struct ref **list, int for_push,
>  			     const struct strvec *ref_prefixes,
>  			     const struct string_list *server_options,
> -			     int stateless_rpc);
> +			     int stateless_rpc,
> +			     char **unborn_head_target);
>  
>  int resolve_remote_symref(struct ref *ref, struct ref *list);
>  
> diff --git a/transport-helper.c b/transport-helper.c
> index 5f6e0b3bd8..5d97eba935 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -1162,13 +1162,16 @@ static int has_attribute(const char *attrs, const char *attr)
>  }
>  
>  static struct ref *get_refs_list(struct transport *transport, int for_push,
> -				 const struct strvec *ref_prefixes)
> +				 const struct strvec *ref_prefixes,
> +				 char **unborn_head_target)
>  {
>  	get_helper(transport);
>  
>  	if (process_connect(transport, for_push)) {
>  		do_take_over(transport);
> -		return transport->vtable->get_refs_list(transport, for_push, ref_prefixes);
> +		return transport->vtable->get_refs_list(transport, for_push,
> +							ref_prefixes,
> +							unborn_head_target);
>  	}
>  
>  	return get_refs_list_using_list(transport, for_push);
> diff --git a/transport-internal.h b/transport-internal.h
> index 27c9daffc4..5037f6197d 100644
> --- a/transport-internal.h
> +++ b/transport-internal.h
> @@ -18,19 +18,16 @@ struct transport_vtable {
>  	 * the transport to try to share connections, for_push is a
>  	 * hint as to whether the ultimate operation is a push or a fetch.
>  	 *
> -	 * If communicating using protocol v2 a list of prefixes can be
> -	 * provided to be sent to the server to enable it to limit the ref
> -	 * advertisement.  Since ref filtering is done on the server's end, and
> -	 * only when using protocol v2, this list will be ignored when not
> -	 * using protocol v2 meaning this function can return refs which don't
> -	 * match the provided ref_prefixes.
> -	 *
>  	 * If the transport is able to determine the remote hash for
>  	 * the ref without a huge amount of effort, it should store it
>  	 * in the ref's old_sha1 field; otherwise it should be all 0.
> +	 *
> +	 * See transport_get_remote_refs() for information on ref_prefixes and
> +	 * unborn_head_target.
>  	 **/
>  	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
> -				     const struct strvec *ref_prefixes);
> +				     const struct strvec *ref_prefixes,
> +				     char **unborn_head_target);
>  
>  	/**
>  	 * Fetch the objects for the given refs. Note that this gets
> diff --git a/transport.c b/transport.c
> index 47da955e4f..815e175017 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -127,7 +127,8 @@ struct bundle_transport_data {
>  
>  static struct ref *get_refs_from_bundle(struct transport *transport,
>  					int for_push,
> -					const struct strvec *ref_prefixes)
> +					const struct strvec *ref_prefixes,
> +					char **unborn_head_target)
>  {
>  	struct bundle_transport_data *data = transport->data;
>  	struct ref *result = NULL;
> @@ -163,7 +164,7 @@ static int fetch_refs_from_bundle(struct transport *transport,
>  	int ret;
>  
>  	if (!data->get_refs_from_bundle_called)
> -		get_refs_from_bundle(transport, 0, NULL);
> +		get_refs_from_bundle(transport, 0, NULL, NULL);
>  	ret = unbundle(the_repository, &data->header, data->fd,
>  			   transport->progress ? BUNDLE_VERBOSE : 0);
>  	transport->hash_algo = data->header.hash_algo;
> @@ -281,7 +282,7 @@ static void die_if_server_options(struct transport *transport)
>   */
>  static struct ref *handshake(struct transport *transport, int for_push,
>  			     const struct strvec *ref_prefixes,
> -			     int must_list_refs)
> +			     int must_list_refs, char **unborn_head_target)
>  {
>  	struct git_transport_data *data = transport->data;
>  	struct ref *refs = NULL;
> @@ -301,7 +302,8 @@ static struct ref *handshake(struct transport *transport, int for_push,
>  			get_remote_refs(data->fd[1], &reader, &refs, for_push,
>  					ref_prefixes,
>  					transport->server_options,
> -					transport->stateless_rpc);
> +					transport->stateless_rpc,
> +					unborn_head_target);
>  		break;
>  	case protocol_v1:
>  	case protocol_v0:
> @@ -324,9 +326,11 @@ static struct ref *handshake(struct transport *transport, int for_push,
>  }
>  
>  static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
> -					const struct strvec *ref_prefixes)
> +					const struct strvec *ref_prefixes,
> +					char **unborn_head_target)
>  {
> -	return handshake(transport, for_push, ref_prefixes, 1);
> +	return handshake(transport, for_push, ref_prefixes, 1,
> +			 unborn_head_target);
>  }
>  
>  static int fetch_refs_via_pack(struct transport *transport,
> @@ -370,7 +374,7 @@ static int fetch_refs_via_pack(struct transport *transport,
>  				break;
>  			}
>  		}
> -		refs_tmp = handshake(transport, 0, NULL, must_list_refs);
> +		refs_tmp = handshake(transport, 0, NULL, must_list_refs, NULL);
>  	}
>  
>  	if (data->version == protocol_unknown_version)
> @@ -765,7 +769,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
>  		return -1;
>  
>  	if (!data->got_remote_heads)
> -		get_refs_via_connect(transport, 1, NULL);
> +		get_refs_via_connect(transport, 1, NULL, NULL);
>  
>  	memset(&args, 0, sizeof(args));
>  	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
> @@ -1251,7 +1255,8 @@ int transport_push(struct repository *r,
>  
>  		trace2_region_enter("transport_push", "get_refs_list", r);
>  		remote_refs = transport->vtable->get_refs_list(transport, 1,
> -							       &ref_prefixes);
> +							       &ref_prefixes,
> +							       NULL);
>  		trace2_region_leave("transport_push", "get_refs_list", r);
>  
>  		strvec_clear(&ref_prefixes);
> @@ -1370,12 +1375,14 @@ int transport_push(struct repository *r,
>  }
>  
>  const struct ref *transport_get_remote_refs(struct transport *transport,
> -					    const struct strvec *ref_prefixes)
> +					    const struct strvec *ref_prefixes,
> +					    char **unborn_head_target)
>  {
>  	if (!transport->got_remote_refs) {
>  		transport->remote_refs =
>  			transport->vtable->get_refs_list(transport, 0,
> -							 ref_prefixes);
> +							 ref_prefixes,
> +							 unborn_head_target);
>  		transport->got_remote_refs = 1;
>  	}
>  
> diff --git a/transport.h b/transport.h
> index 24558c027d..65de0c9c00 100644
> --- a/transport.h
> +++ b/transport.h
> @@ -241,9 +241,14 @@ int transport_push(struct repository *repo,
>   * advertisement.  Since ref filtering is done on the server's end (and only
>   * when using protocol v2), this can return refs which don't match the provided
>   * ref_prefixes.
> + *
> + * If unborn_head_target is not NULL, and the remote reports HEAD as pointing
> + * to an unborn branch, this function stores the unborn branch in
> + * unborn_head_target. It should be freed by the caller.
>   */
>  const struct ref *transport_get_remote_refs(struct transport *transport,
> -					    const struct strvec *ref_prefixes);
> +					    const struct strvec *ref_prefixes,
> +					    char **unborn_head_target);
>  
>  /*
>   * Fetch the hash algorithm used by a remote.
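
The ownership contract described in the transport.h comment above (store
the unborn branch name only when the caller passed a non-NULL pointer, and
let the caller free it) could be sketched roughly as follows. This is a
standalone illustration, not code from the patch: struct fake_remote and
report_unborn_head() are hypothetical names, and resetting the output to
NULL when HEAD is not unborn is an assumption of the sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for what the remote reported about HEAD. */
struct fake_remote {
	const char *head_symref_target; /* e.g. "refs/heads/main", or NULL */
	int head_is_unborn;
};

/*
 * Store a heap-allocated copy of the unborn branch name in
 * *unborn_head_target. Callers that do not care pass NULL, as most call
 * sites in the quoted diff do. The caller owns (and frees) the result.
 */
void report_unborn_head(const struct fake_remote *remote,
			char **unborn_head_target)
{
	if (!unborn_head_target)
		return; /* caller did not ask for this information */

	*unborn_head_target = NULL; /* assumption: reset when not unborn */
	if (remote->head_is_unborn && remote->head_symref_target) {
		size_t len = strlen(remote->head_symref_target) + 1;
		char *copy = malloc(len);

		if (copy)
			memcpy(copy, remote->head_symref_target, len);
		*unborn_head_target = copy;
	}
}
```

A caller such as clone would pass the address of a local "char *" and
free() it when done; push-side callers would keep passing NULL.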

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v2 1/3] ls-refs: report unborn targets of symrefs
  2020-12-16  2:07     ` [PATCH v2 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
  2020-12-16  6:16       ` Junio C Hamano
@ 2020-12-16 18:23       ` Jeff King
  2020-12-16 23:54         ` Jonathan Tan
  1 sibling, 1 reply; 109+ messages in thread
From: Jeff King @ 2020-12-16 18:23 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, felipe.contreras, gitster, avarab

On Tue, Dec 15, 2020 at 06:07:56PM -0800, Jonathan Tan wrote:

> +static int ls_refs_config(const char *var, const char *value, void *cb_data)
> +{
> +	struct ls_refs_data *data = cb_data;
> +
> +	if (!strcmp("lsrefs.unborn", var))
> +		data->allow_unborn = !strcmp(value, "allow") ||
> +			!strcmp(value, "advertise");

What's the reason we would want this to be configurable? I would think
we would just want it always on for the server, and then clients can
choose to make use of it or not (and probably not by omitting the
capability; the question is what they want to do with the information
about HEAD, but that is true whether it is unborn or not, and is
controlled by options like "clone -b").

-Peff

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-16  3:09               ` Junio C Hamano
@ 2020-12-16 18:39                 ` Jeff King
  2020-12-16 20:56                   ` Junio C Hamano
  0 siblings, 1 reply; 109+ messages in thread
From: Jeff King @ 2020-12-16 18:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Jonathan Tan, git,
	Felipe Contreras

On Tue, Dec 15, 2020 at 07:09:50PM -0800, Junio C Hamano wrote:

> > I'm not sure if it's a good idea to change "git checkout origin" here or
> > not. It already does something useful. I was mostly suggesting that the
> > other thing might _also_ be useful, but I'm not sure if it is wise to
> > change the current behavior.
> 
> Well, "git checkout origin/HEAD" would also do something useful,
> which happens to be identical to "git checkout origin", to detach
> HEAD at the commit.

To be clear, I meant that both of those do the useful thing, and I'm not
sure if it would be confusing to users to change that (but see below).

> > I was thinking more like an explicit way to trigger the dwim-behavior,
> > like:
> >
> >   # same as "git checkout foo" magic that creates "foo", but we
> >   # have said explicitly both that we expect to make the new branch, and
> >   # also that we expect it to come from origin.
> >   git checkout --make-local origin/foo
> 
> By default I think --guess (formerly known as --dwim) is enabled, so
> "git checkout foo" is "git checkout --guess foo", which is making
> local 'foo' out of the uniquely found remote-tracking branch.  This
> new one is to reduce the "uniquely found" part from the magic and
> let you be a bit more explicit, but not explicit enough to say "-t"
> or "-b foo"?  I am not sure if this is all that useful.

I agree it's not all that useful in that example. What I was thinking is
that by giving the implicit/heuristic magic a more explicit verbose
name, then we make it natural to extend the explicit version to more
cases where it might be questionable to do it implicitly.

So no, I doubt anybody would normally type what I wrote above. But it
lets us explain it as:

  - there's a feature "--make-local" (I still hate the name) that makes
    a local branch from a remote one if it doesn't already exist

  - that feature knows how to resolve symrefs and create the branch from
    the pointed-to name

  - as a shortcut, "git checkout foo" is a synonym for "--make-local
    origin/foo" when "origin/foo" exists but "foo" does not

It's definitely not worth it, though, if we decide that there's an
implicit/heuristic syntax that should trigger the symref thing.

> If this were a slightly different proposal, I would see the
> convenience value in it, though.  Currently what "--guess" does is:
> 
>       If the name 'foo' given does not exist as a local branch,
>       and the name appears exactly once as a remote-tracking branch
>       from some remote (i.e. 'refs/remotes/origin/foo' exists, but
>       there is no other 'refs/remotes/*/foo'), create a local 'foo'
>       that builds on that remote-tracking branch and check it out.
> 
> What would happen if we tweaked the existing "--guess" behaviour
> slightly?
> 
>       "git checkout --guess origin/foo", even when there is a second
>       remote 'publish' that also has a remote-tracking branch for
>       its 'foo' (i.e. both 'refs/remotes/{origin,publish}/foo'
>       exists), can be used to disambiguate among these remotes with
>       'foo'.  You'd get local 'foo' that builds on 'foo' from the
>       remote 'origin' and check it out.

I forgot we had --guess. Piggy-backing on that might be sensible as a
stronger "explicit" signal that this is what the user wants (though
"--guess" is still a funny name here, because we're no longer guessing
at all; the user told us what they want).

But yeah, the semantics you outlined in the second paragraph match what
I was expecting "--make-local" to do.
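
For reference, the "appears exactly once as a remote-tracking branch" rule
quoted above can be sketched as a small standalone function. This is only
an illustration, not git's implementation: unique_tracking_ref() is a
made-up name, the refs are passed in as a plain array, remote names are
assumed not to contain a slash, and the "no local branch of that name"
check is left out.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the one ref of the form refs/remotes/<remote>/<name>, or NULL if
 * there is none or more than one (the ambiguous case).
 * Assumption for this sketch: <remote> contains no '/'.
 */
const char *unique_tracking_ref(const char **refs, size_t nr,
				const char *name)
{
	static const char prefix[] = "refs/remotes/";
	const char *found = NULL;
	size_t i;

	for (i = 0; i < nr; i++) {
		const char *rest, *slash;

		if (strncmp(refs[i], prefix, sizeof(prefix) - 1))
			continue; /* not a remote-tracking ref */
		rest = refs[i] + sizeof(prefix) - 1;
		slash = strchr(rest, '/');
		if (!slash || strcmp(slash + 1, name))
			continue; /* some other branch */
		if (found)
			return NULL; /* second match: ambiguous */
		found = refs[i];
	}
	return found;
}
```

With both refs/remotes/origin/foo and refs/remotes/publish/foo present the
function returns NULL, which is the case where spelling out "origin/foo"
would act as the disambiguator.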

> >   # similar, but because we are being explicit, we know it is reasonable
> >   # to dereference HEAD to find the actual branch name
> >   git checkout --make-local origin/HEAD
> 
> The user does not need "git symbolic-ref refs/remotes/origin/HEAD"
> if such a feature were available.  "git checkout --some-option origin"
> without having to say /HEAD may be a better UI, though.

Right. I'm assuming that "origin/HEAD" and "origin" could be used
interchangeably in my example.

> And "checkout" being a Porcelain, and the DWIM feature that is
> always on is subject to be improved for human use, I do not see why
> that --some-option cannot be --guess.  If I want to get the current
> behaviour, I can explicitly say "git checkout --detach origin"
> anyway, no?

I think:

  git checkout --guess origin

would make sense to dereference origin/HEAD to "foo", as if we had said
"git checkout foo". That's the explicit part that seems safe. My
question is whether:

  git checkout origin

should likewise do so. As you note, one can always use --detach to make
their intention clear, and checkout is a porcelain, so we are OK to
change it. But would users find that annoying? I frequently use "git
checkout origin" to get a detached HEAD pointing at your master (e.g.,
because I want to reproduce a bug, or do a "something like this..."
patch). I'm sure I could retrain my fingers, but I wonder if I'm not the
only one.

Doing it for only an explicit "--guess" turns that feature into a
tri-state (explicitly off, explicitly on, or "implicit, so be a little
more conservative"). Which perhaps is harder to explain, but I think
cleanly adds the new feature in a consistent way, without really
changing any existing behavior.

Related, I assume that:

  git checkout --guess origin/foo
  git checkout origin/foo

should behave the same as their "origin" or "origin/HEAD" counterparts
for consistency (obviously making "foo" in the former case, and either
detaching or making "foo" in the second case, depending on what you
think of the earlier paragraphs).
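
To spell out the tri-state idea, here is a minimal sketch of the decision
table. All names are hypothetical and nothing here is existing checkout
code. It assumes a bare name ("foo") has no local branch and exactly one
remote-tracking match, and that a "remote ref" argument is one of
"origin", "origin/HEAD" or "origin/foo".

```c
/* How "--guess" was specified on the command line (hypothetical). */
enum guess_mode { GUESS_OFF, GUESS_IMPLICIT, GUESS_EXPLICIT };

/* What kind of argument the user gave. */
enum arg_kind { KIND_BARE_NAME, KIND_REMOTE_REF };

enum checkout_action { ACTION_FAIL, ACTION_DETACH, ACTION_MAKE_LOCAL };

enum checkout_action decide_checkout(enum guess_mode mode, enum arg_kind kind)
{
	/* "git checkout foo": today's DWIM, unless guessing is turned off. */
	if (kind == KIND_BARE_NAME)
		return mode == GUESS_OFF ? ACTION_FAIL : ACTION_MAKE_LOCAL;

	/*
	 * "git checkout origin[/foo]": only an explicit --guess asks for a
	 * local branch; the implicit default stays conservative and detaches.
	 */
	return mode == GUESS_EXPLICIT ? ACTION_MAKE_LOCAL : ACTION_DETACH;
}
```

Under this table nothing changes for anyone who never types "--guess",
which is the point of being conservative in the implicit case.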

-Peff

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-16 18:39                 ` Jeff King
@ 2020-12-16 20:56                   ` Junio C Hamano
  2020-12-18  6:19                     ` Jeff King
  0 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2020-12-16 20:56 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, Jonathan Tan, git,
	Felipe Contreras

Jeff King <peff@peff.net> writes:

> I think:
>
>   git checkout --guess origin
>
> would make sense to dereference origin/HEAD to "foo", as if we had said
> "git checkout foo". That's the explicit part that seems safe. My
> question is whether:
>
>   git checkout origin
>
> should likewise do so.

I see.  I think "--guess" is by default true, so unless you have
checkout.guess=false configured, my answer to the above question is
yes.

> As you note, one can always use --detach to make
> their intention clear, and checkout is a porcelain, so we are OK to
> change it. But would users find that annoying? I frequently use "git
> checkout origin" to get a detached HEAD pointing at your master (e.g.,
> because I want to reproduce a bug, or do a "something like this..."
> patch). I'm sure I could retrain my fingers, but I wonder if I'm not the
> only one.

My fingers say "git checkout X^0" instead of "--detach" when I want
to detach for any value of X (e.g. "HEAD", "v2.28.0").  But I do
understand people like to be implicit when they can.

> Doing it for only an explicit "--guess" turns that feature into a
> tri-state (explicitly off, explicitly on, or "implicit, so be a little
> more conservative"). Which perhaps is harder to explain, but I think
> cleanly adds the new feature in a consistent way, without really
> changing any existing behavior.

Hmmmm...  I do not offhand know if that is a good idea or not.

> Related, I assume that:
>
>   git checkout --guess origin/foo
>   git checkout origin/foo
>
> should behave the same as their "origin" or "origin/HEAD" counterparts
> for consistency (obviously making "foo" in the former case, and either
> detaching or making "foo" in the second case, depending on what you
> think of the earlier paragraphs).

I think that is what I said in the "what would happen if we tweaked"
paragraph about using origin/ prefix as a disambiguator?  Then yes,
I think we are in agreement.

Thanks.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v2 1/3] ls-refs: report unborn targets of symrefs
  2020-12-16  6:16       ` Junio C Hamano
@ 2020-12-16 23:49         ` Jonathan Tan
  0 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2020-12-16 23:49 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git, peff, felipe.contreras, avarab

> > @@ -74,8 +79,28 @@ static int send_ref(const char *refname, const struct object_id *oid,
> >  	return 0;
> >  }
> >  
> > -static int ls_refs_config(const char *var, const char *value, void *data)
> > +static void send_possibly_unborn_head(struct ls_refs_data *data)
> >  {
> > +	struct strbuf namespaced = STRBUF_INIT;
> > +	struct object_id oid;
> > +	int flag;
> > +	int null_oid;
> 
> I'd suggest renaming this one, which masks the global null_oid of
> "const struct object_id" type.  This code does not break only
> because is_null_oid() happens to be implemented as a static inline,
> and not as a C-preprocessor macro, right?

OK - will rename.
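
For anyone following along, the shadowing can be reproduced outside git
with a few lines. This is a reduced stand-in, not the real definitions:
the struct layout and head_is_unborn() are made up for the example.

```c
#include <string.h>

struct object_id {
	unsigned char hash[32];
};

/* File-scope object, standing in for git's global null_oid. */
const struct object_id null_oid = { { 0 } };

/*
 * Because this is a function, the name null_oid is resolved here, at file
 * scope, no matter what the caller declares locally.
 */
static inline int is_null_oid(const struct object_id *oid)
{
	return !memcmp(oid->hash, null_oid.hash, sizeof(oid->hash));
}

int head_is_unborn(const struct object_id *oid)
{
	int null_oid; /* shadows the file-scope object inside this function */

	null_oid = is_null_oid(oid);
	/*
	 * Had is_null_oid() been a macro whose expansion mentions null_oid,
	 * the expansion would have picked up this int instead of the struct.
	 */
	return null_oid;
}
```

Renaming the local (oid_is_null, say) removes the dependence on how
is_null_oid() happens to be implemented.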

> > +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
> > +	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
> > +	null_oid = is_null_oid(&oid);
> > +	if (!null_oid || (data->symrefs && (flag & REF_ISSYMREF)))
> > +		send_ref(namespaced.buf, null_oid ? NULL : &oid, flag, data);
> > +	strbuf_release(&namespaced);
> > +}
> > +
> > +static int ls_refs_config(const char *var, const char *value, void *cb_data)
> > +{
> > +	struct ls_refs_data *data = cb_data;
> > +
> > +	if (!strcmp("lsrefs.unborn", var))
> > +		data->allow_unborn = !strcmp(value, "allow") ||
> > +			!strcmp(value, "advertise");
> 
> Are there differences between allow and advertise?  Would
> lsrefs.allowUnborn that is a boolean, thus allowing the value to be
> parsed by git_config_bool(), make more sense here, I wonder.  Or is
> this meant as some future enhancement, e.g. you plan to have some
> servers that allow "unborn" request even though they do not actively
> advertise the support of the feature?  Without documentation update
> or an in-code comment, it is rather hard to guess the intention
> here.

I'll update the documentation. With this current patch, yes, some
servers will allow "unborn" requests even though they do not actively
advertise it. This allows servers in load-balanced environments to first
be configured to support the feature, then after ensuring that the
configuration for all servers is complete, to turn on advertisement.

> > @@ -91,7 +116,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
> >  
> >  	memset(&data, 0, sizeof(data));
> >  
> > -	git_config(ls_refs_config, NULL);
> > +	git_config(ls_refs_config, &data);
> >  
> >  	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
> >  		const char *arg = request->line;
> > @@ -103,14 +128,35 @@ int ls_refs(struct repository *r, struct strvec *keys,
> >  			data.symrefs = 1;
> >  		else if (skip_prefix(arg, "ref-prefix ", &out))
> >  			strvec_push(&data.prefixes, out);
> > +		else if (data.allow_unborn && !strcmp("unborn", arg))
> > +			data.unborn = 1;
> 
> Somehow, it appears to me that writing it in a way along with this
> line ...
> 
> 		else if (!strcmp("unborn", arg))
> 			data.unborn = data.allow_unborn;
> 
> ... would make more sense.  Whether we allowed "unborn" request or
> not, when the other side says "unborn", we are handling the request
> for the unborn feature, and the condition with strcmp() alone
> signals that better (in other words, when we acquire more request
> types, we do not want to pass the control to "else if" clauses that
> may come after this part when we see "unborn" request and when we
> are configured not to accept "unborn" requests).
> 
> It does not make any difference in the current code, of course, and
> it is more about future-proofing the cleanness of the code.

Good point. I'll go ahead and write it as you describe.

I was following the style in upload-pack, where writing it my way versus
your way would make a difference because we die on invalid arguments at
the end. (It does raise the question whether we should die on invalid
arguments, but maybe that's for another time.)
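
To make the difference concrete, here is a standalone sketch of the two
styles with a final "unknown argument" branch of the kind upload-pack
has. The struct and function names are invented for the example; this is
not the ls-refs.c code.

```c
#include <string.h>

struct ls_refs_opts {
	int allow_unborn; /* from config */
	int unborn;       /* requested by the client and permitted */
	int symrefs;
};

enum parse_result { ARG_HANDLED, ARG_UNKNOWN };

/* Style A: the config check is part of the match. */
enum parse_result parse_arg_guarded(struct ls_refs_opts *o, const char *arg)
{
	if (!strcmp("symrefs", arg))
		o->symrefs = 1;
	else if (o->allow_unborn && !strcmp("unborn", arg))
		o->unborn = 1;
	else
		return ARG_UNKNOWN; /* a disallowed "unborn" lands here */
	return ARG_HANDLED;
}

/* Style B: match on the argument alone, then apply the config. */
enum parse_result parse_arg_matched(struct ls_refs_opts *o, const char *arg)
{
	if (!strcmp("symrefs", arg))
		o->symrefs = 1;
	else if (!strcmp("unborn", arg))
		o->unborn = o->allow_unborn;
	else
		return ARG_UNKNOWN;
	return ARG_HANDLED;
}
```

With allow_unborn unset, style A reports "unborn" as unknown (which would
be fatal if unknown arguments die), while style B recognizes it and
quietly leaves the feature off.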

> 
> > -	head_ref_namespaced(send_ref, &data);
> > +	if (data.unborn)
> > +		send_possibly_unborn_head(&data);
> > +	else
> > +		head_ref_namespaced(send_ref, &data);
> 
> I found the "send_possibly 70% duplicates what the more generic
> head_ref_namespaced() does" a bit disturbing.

There's more duplication in refs.c (e.g. head_ref_namespaced() and
refs_head_ref()) too. I'll see if I can refactor those into something
more generic.


^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v2 1/3] ls-refs: report unborn targets of symrefs
  2020-12-16 18:23       ` Jeff King
@ 2020-12-16 23:54         ` Jonathan Tan
  2020-12-17  1:32           ` Junio C Hamano
  0 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2020-12-16 23:54 UTC (permalink / raw)
  To: peff; +Cc: jonathantanmy, git, felipe.contreras, gitster, avarab

> On Tue, Dec 15, 2020 at 06:07:56PM -0800, Jonathan Tan wrote:
> 
> > +static int ls_refs_config(const char *var, const char *value, void *cb_data)
> > +{
> > +	struct ls_refs_data *data = cb_data;
> > +
> > +	if (!strcmp("lsrefs.unborn", var))
> > +		data->allow_unborn = !strcmp(value, "allow") ||
> > +			!strcmp(value, "advertise");
> 
> What's the reason we would want this to be configurable? I would think
> we would just want it always on for the server, and then clients can
> choose to make use of it or not (and probably not by omitting the
> capability; the question is what they want to do with the information
> about HEAD, but that is true whether it is unborn or not, and is
> controlled by options like "clone -b").

Firstly, this allows a staged rollout in load-balancing situations
wherein we turn on "allow" for all servers, then "advertise", so that we
don't end up with a client that sees the advertisement but then sends
the follow-up request to a server that has not received the latest
configuration yet.

Secondly, I wonder if some people purposely set HEAD to an unborn branch
just so that the repository would be presented as not having a HEAD.
Now, the name of the unborn branch would be revealed. I don't know if
that's a problem, though - but if it is, at least this configuration
variable is a way to solve that.

I'll include this in the documentation in a next version of this patch
set.
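
As a rough sketch of the two-step rollout (hypothetical names, not the
patch itself): "allow" makes a server accept the argument, and only
"advertise" makes it announce the capability as well. Treating any other
value, including a missing one, as "ignore" is an assumption of this
sketch.

```c
#include <string.h>

enum unborn_mode { UNBORN_IGNORE, UNBORN_ALLOW, UNBORN_ADVERTISE };

/* Map the lsrefs.unborn config value to a mode. */
enum unborn_mode parse_unborn_mode(const char *value)
{
	if (value && !strcmp(value, "advertise"))
		return UNBORN_ADVERTISE;
	if (value && !strcmp(value, "allow"))
		return UNBORN_ALLOW;
	return UNBORN_IGNORE;
}

/* Does the server honor an "unborn" argument from the client? */
int accepts_unborn(enum unborn_mode mode)
{
	return mode != UNBORN_IGNORE;
}

/* Does the server list "unborn" in its capability advertisement? */
int advertises_unborn(enum unborn_mode mode)
{
	return mode == UNBORN_ADVERTISE;
}
```

During a rollout every server is first switched to "allow"; once all of
them accept the argument, they are switched to "advertise", so a client
never sees the advertisement from one server and then has its request
rejected by another.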

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v2 1/3] ls-refs: report unborn targets of symrefs
  2020-12-16 23:54         ` Jonathan Tan
@ 2020-12-17  1:32           ` Junio C Hamano
  2020-12-18  6:16             ` Jeff King
  0 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2020-12-17  1:32 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: peff, git, felipe.contreras, avarab

Jonathan Tan <jonathantanmy@google.com> writes:

> Firstly, this allows a staged rollout in load-balancing situations
> wherein we turn on "allow" for all servers, then "advertise", so that we
> don't end up with a client that sees the advertisement but then sends
> the follow-up request to a server that has not received the latest
> configuration yet.

If this were the _first_ capability we are adding to the system, the
above makes quite a lot of sense, but I do not recall any existing
capability that can be configured this way.  How would one deploy a
set of servers that gradually start allowing fetching unadvertised
but reachable commits, for example?  I am not saying that the "I'll
accept if asked, but I won't actively advertise" is a bad feature; I
just find it disturbing that only this knob has that feature.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v2 1/3] ls-refs: report unborn targets of symrefs
  2020-12-17  1:32           ` Junio C Hamano
@ 2020-12-18  6:16             ` Jeff King
  0 siblings, 0 replies; 109+ messages in thread
From: Jeff King @ 2020-12-18  6:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git, felipe.contreras, avarab

On Wed, Dec 16, 2020 at 05:32:50PM -0800, Junio C Hamano wrote:

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > Firstly, this allows a staged rollout in load-balancing situations
> > wherein we turn on "allow" for all servers, then "advertise", so that we
> > don't end up with a client that sees the advertisement but then sends
> > the follow-up request to a server that has not received the latest
> > configuration yet.
> 
> If this were the _first_ capability we are adding to the system, the
> above makes quite a lot of sense, but I do not recall any existing
> capability that can be configured this way.  How would one deploy a
> set of servers that gradually start allowing fetching unadvertised
> but reachable commits, for example?  I am not saying that the "I'll
> accept if asked, but I won't actively advertise" is a bad feature; I
> just find it disturbing that only this knob has that feature.

Yeah, that was my thought. It is a real problem, but not one we've dealt
with before in this way (or at all, really). Two recent examples:

  - for fetching, the object-format=sha1 capability is always parroted
    back by v2.28 and higher clients. But there's no config to disable
    the advertisement, so during a partial deployment to a load-balanced
    cluster, a client may say "object-format=sha1" to a server that
    doesn't understand it.

  - for pushing, we added report-status-v2, which is likewise always
    repeated back by any v2.29 and higher client.

At GitHub we recently merged in v2.29, and just hacked report-status-v2
out of the advertisement within the code (leaving it in the "we can read
this but won't advertise it" state). Temporarily modifying the code is
definitely ugly, and I don't mind a cleaner solution, but:

  - it would be nice if this were done in a more consistent way for all
    new capabilities

  - one nice thing about the code change is that after the rollout is
    done, it's safe to make the code unconditional again, which makes
    it simpler to read/reason about.

    This may be oversimplifying it a bit, of course. On one platform, we
    know when the rollout is happening. But if it's something we ship
    upstream, then "rollout" may be on the jump from v2.28 to v2.29, or
    to v2.30, or v2.31, etc. You can never say "rollouts are done, and
    existing server versions know about this feature". So any upstream
    support like config has to stay forever.

So I dunno. My biggest complaint is that the config option defaults to
_off_.  So it's helping load-balanced rollouts, but creating complexity
for everyone else who might want to enable the feature.

(I know there was also an indication that some people might want it off
because they somehow want to have no HEAD at all. I don't find this
particularly compelling, but even if it were, I think we could leave
the config as an escape hatch for such folks, but still default it to
"on").

-Peff

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH] clone: in protocol v2, use remote's default branch
  2020-12-16 20:56                   ` Junio C Hamano
@ 2020-12-18  6:19                     ` Jeff King
  0 siblings, 0 replies; 109+ messages in thread
From: Jeff King @ 2020-12-18  6:19 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Jonathan Tan, git,
	Felipe Contreras

On Wed, Dec 16, 2020 at 12:56:22PM -0800, Junio C Hamano wrote:

> > I think:
> >
> >   git checkout --guess origin
> >
> > would make sense to dereference origin/HEAD to "foo", as if we had said
> > "git checkout foo". That's the explicit part that seems safe. My
> > question is whether:
> >
> >   git checkout origin
> >
> > should likewise do so.
> 
> I see.  I think "--guess" is by default true, so unless you have
> checkout.guess=false configured, my answer to the above question is
> yes.

Yes, I agree; with the current definition of "--guess", the two would be
the same. I'm just concerned that people will be unhappy with changing
the behavior of the latter, so everything else (the "tri-state --guess"
thing) is an attempt to band-aid over that.

If we decide it's not a concern worth addressing, then I agree the two
should behave the same. I'm just not convinced it won't annoy people who
are used to how "git checkout" works now with non-local branches.

-Peff

^ permalink raw reply	[flat|nested] 109+ messages in thread

* [PATCH v3 0/3] Cloning with remote unborn HEAD
  2020-12-11 21:05 ` [PATCH] clone: in protocol v2, use remote's default branch Jonathan Tan
                     ` (3 preceding siblings ...)
  2020-12-16  2:07   ` [PATCH v2 0/3] Cloning with remote unborn HEAD Jonathan Tan
@ 2020-12-21 22:30   ` Jonathan Tan
  2020-12-21 22:30     ` [PATCH v3 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
                       ` (4 more replies)
  2020-12-22 21:54   ` [PATCH v4 " Jonathan Tan
  5 siblings, 5 replies; 109+ messages in thread
From: Jonathan Tan @ 2020-12-21 22:30 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Thanks everyone for your comments. Changes from v2:

Patch 1:

 - I took a look at head_ref_namespaced(), hoping that it could be
   refactored to meet my new needs, but I don't think it's feasible.
   head_ref_namespaced() seems to have a precise use - for when we
   already have a callback that we're using with
   for_each_namespaced_ref(), and we want to use it with the HEAD ref as
   well. Instead what I did is to eliminate the use of
   head_ref_namespaced() here, and used send_possibly_unborn_head() for
   both the regular and the new unborn case.

 - Renamed null_oid to avoid shadowing, as pointed out by Junio.

 - Changed to a boolean lsrefs.allowUnborn, so no more advertise/allow
   distinction. Currently it still defaults to unallowed. Peff, what did
   you mean in [1] by the following:

> So I dunno. My biggest complaint is that the config option defaults to
> _off_.  So it's helping load-balanced rollouts, but creating complexity
> for everyone else who might want to enable the feature.

   So it seems like you're saying that it should default to "on", but at
   the same time you are talking about enabling the feature (which seems
   to imply switching it from "off" to "on"). (Also, note that this is a
   server-side thing; on the client-side, Git will always use what the
   server gives and there is no option to control this.)

 - Added documentation of lsrefs.allowUnborn, which I forgot.
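
Since send_possibly_unborn_head() now serves both cases, the condition
under which HEAD is sent (see the range-diff below) reduces to a small
truth table. The sketch isolates it as a pure function; should_send_head()
and its flattened arguments are made up for illustration, and REF_ISSYMREF
is redefined locally as a stand-in for the flag from refs.h.

```c
/* Local stand-in for the flag of the same name in git's refs.h. */
#define REF_ISSYMREF 0x01

/*
 * HEAD is sent if it resolves to an object, or if the client asked for
 * both "unborn" and "symrefs" and HEAD is a symref with an unborn target.
 */
int should_send_head(int oid_is_null, int flag, int symrefs, int unborn)
{
	return !oid_is_null ||
	       (unborn && symrefs && (flag & REF_ISSYMREF));
}
```

A born HEAD is therefore sent exactly as before, and a client that does
not pass "unborn" sees no change in behavior.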

Patch 2:

 - Used Junio's suggested commit message.

Patch 3:

 - No changes except what was necessary due to the config option change.

[1] https://lore.kernel.org/git/X9xJLWdFJfNJTn0p@coredump.intra.peff.net/

Jonathan Tan (3):
  ls-refs: report unborn targets of symrefs
  connect, transport: add no-op arg for future patch
  clone: respect remote unborn HEAD

 Documentation/config.txt                |  2 +
 Documentation/config/init.txt           |  2 +-
 Documentation/config/lsrefs.txt         |  3 ++
 Documentation/technical/protocol-v2.txt | 10 ++++-
 builtin/clone.c                         | 19 +++++++--
 builtin/fetch-pack.c                    |  3 +-
 builtin/fetch.c                         |  2 +-
 builtin/ls-remote.c                     |  2 +-
 builtin/remote.c                        |  2 +-
 connect.c                               | 29 ++++++++++++--
 ls-refs.c                               | 51 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 remote.h                                |  3 +-
 serve.c                                 |  2 +-
 t/t5606-clone-options.sh                |  7 ++--
 t/t5702-protocol-v2.sh                  |  9 +++++
 transport-helper.c                      |  7 +++-
 transport-internal.h                    | 13 +++----
 transport.c                             | 29 ++++++++------
 transport.h                             |  7 +++-
 20 files changed, 160 insertions(+), 43 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt

Range-diff against v2:
1:  7f5b50c7b2 ! 1:  7d20ec323a ls-refs: report unborn targets of symrefs
    @@ Commit message
     
         Currently, symrefs that have unborn targets (such as in this case) are
         not communicated by the protocol. Teach Git to advertise and support the
    -    "unborn" feature in "ls-refs" (guarded by the lsrefs.unborn config).
    -    This feature indicates that "ls-refs" supports the "unborn" argument;
    -    when it is specified, "ls-refs" will send the HEAD symref with the name
    -    of its unborn target.
    +    "unborn" feature in "ls-refs" (guarded by the lsrefs.allowunborn
    +    config). This feature indicates that "ls-refs" supports the "unborn"
    +    argument; when it is specified, "ls-refs" will send the HEAD symref with
    +    the name of its unborn target.
     
         This change is only for protocol v2. A similar change for protocol v0
         would require independent protocol design (there being no analogous
    @@ Commit message
     
         Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
     
    + ## Documentation/config.txt ##
    +@@ Documentation/config.txt: include::config/interactive.txt[]
    + 
    + include::config/log.txt[]
    + 
    ++include::config/lsrefs.txt[]
    ++
    + include::config/mailinfo.txt[]
    + 
    + include::config/mailmap.txt[]
    +
    + ## Documentation/config/lsrefs.txt (new) ##
    +@@
    ++lsrefs.allowUnborn::
    ++	Allow the server to send information about unborn symrefs during the
    ++	protocol v2 ref advertisement.
    +
      ## Documentation/technical/protocol-v2.txt ##
     @@ Documentation/technical/protocol-v2.txt: ls-refs takes in the following arguments:
      	When specified, only references having a prefix matching one of
    @@ ls-refs.c: static int send_ref(const char *refname, const struct object_id *oid,
     +	struct strbuf namespaced = STRBUF_INIT;
     +	struct object_id oid;
     +	int flag;
    -+	int null_oid;
    ++	int oid_is_null;
     +
     +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
     +	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
    -+	null_oid = is_null_oid(&oid);
    -+	if (!null_oid || (data->symrefs && (flag & REF_ISSYMREF)))
    -+		send_ref(namespaced.buf, null_oid ? NULL : &oid, flag, data);
    ++	oid_is_null = is_null_oid(&oid);
    ++	if (!oid_is_null ||
    ++	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
    ++		send_ref(namespaced.buf, oid_is_null ? NULL : &oid, flag, data);
     +	strbuf_release(&namespaced);
     +}
     +
    @@ ls-refs.c: static int send_ref(const char *refname, const struct object_id *oid,
     +{
     +	struct ls_refs_data *data = cb_data;
     +
    -+	if (!strcmp("lsrefs.unborn", var))
    -+		data->allow_unborn = !strcmp(value, "allow") ||
    -+			!strcmp(value, "advertise");
    ++	if (!strcmp("lsrefs.allowunborn", var))
    ++		data->allow_unborn = git_config_bool(var, value);
    ++
      	/*
      	 * We only serve fetches over v2 for now, so respect only "uploadpack"
      	 * config. This may need to eventually be expanded to "receive", but we
    @@ ls-refs.c: int ls_refs(struct repository *r, struct strvec *keys,
      		die(_("expected flush after ls-refs arguments"));
      
     -	head_ref_namespaced(send_ref, &data);
    -+	if (data.unborn)
    -+		send_possibly_unborn_head(&data);
    -+	else
    -+		head_ref_namespaced(send_ref, &data);
    ++	send_possibly_unborn_head(&data);
      	for_each_namespaced_ref(send_ref, &data);
      	packet_flush(1);
      	strvec_clear(&data.prefixes);
    @@ ls-refs.c: int ls_refs(struct repository *r, struct strvec *keys,
     +int ls_refs_advertise(struct repository *r, struct strbuf *value)
     +{
     +	if (value) {
    -+		char *str = NULL;
    ++		int allow_unborn_value;
     +
    -+		if (!repo_config_get_string(the_repository, "lsrefs.unborn",
    -+					    &str) &&
    -+		    !strcmp("advertise", str)) {
    ++		if (!repo_config_get_bool(the_repository,
    ++					 "lsrefs.allowunborn",
    ++					 &allow_unborn_value) &&
    ++		    allow_unborn_value)
     +			strbuf_addstr(value, "unborn");
    -+			free(str);
    -+		}
     +	}
     +
     +	return 1;
2:  e24fb6d746 ! 2:  b5a78857eb connect, transport: add no-op arg for future patch
    @@ Metadata
      ## Commit message ##
         connect, transport: add no-op arg for future patch
     
    -    A future patch will require transport_get_remote_refs() and
    -    get_remote_refs() to gain a new argument. Add the argument in this
    -    patch, with no effect on execution, so that the future patch only needs
    -    to concern itself with new logic.
    +    In a future patch we plan to return the name of an unborn current branch
    +    from deep in the callchain to a caller via a new pointer parameter that
    +    points at a variable in the caller when the caller calls
    +    get_remote_refs() and transport_get_remote_refs(). Add the parameter to
    +    functions involved in the callchain, but no caller passes an actual
    +    argument yet in this step. Thus, the future patch only needs to concern
    +    itself with new logic.
     
         Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
     
3:  6fcb3b16ce ! 3:  c2303dc976 clone: respect remote unborn HEAD
    @@ t/t5606-clone-options.sh: test_expect_success 'redirected clone -v does show pro
      test_expect_success 'chooses correct default initial branch name' '
     -	git init --bare empty &&
     +	git -c init.defaultBranch=foo init --bare empty &&
    -+	test_config -C empty lsrefs.unborn advertise &&
    ++	test_config -C empty lsrefs.allowUnborn true &&
      	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
      	git -c init.defaultBranch=up clone empty whats-up &&
     -	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
    @@ t/t5702-protocol-v2.sh: test_expect_success 'clone with file:// using protocol v
      
     +test_expect_success 'clone of empty repo propagates name of default branch' '
     +	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
    -+	test_config -C file_empty_parent lsrefs.unborn advertise &&
    ++	test_config -C file_empty_parent lsrefs.allowUnborn true &&
     +
     +	git -c init.defaultBranch=main -c protocol.version=2 \
     +		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
     +	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
     +'
    -+
    -+test_expect_success '...but not if it is not advertised' '
    -+	test_config -C file_empty_parent lsrefs.unborn none &&
    -+
    -+	git -c init.defaultBranch=main -c protocol.version=2 \
    -+		clone "file://$(pwd)/file_empty_parent" file_empty_child_2 &&
    -+	grep "refs/heads/main" file_empty_child_2/.git/HEAD
    -+'
     +
      test_expect_success 'fetch with file:// using protocol v2' '
      	test_when_finished "rm -f log" &&
-- 
2.29.2.729.g45daf8777d-goog



* [PATCH v3 1/3] ls-refs: report unborn targets of symrefs
  2020-12-21 22:30   ` [PATCH v3 0/3] Cloning with " Jonathan Tan
@ 2020-12-21 22:30     ` Jonathan Tan
  2020-12-21 22:31     ` [PATCH v3 2/3] connect, transport: add no-op arg for future patch Jonathan Tan
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2020-12-21 22:30 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

When cloning, we choose the default branch based on the remote HEAD.
But if there is no remote HEAD reported (which could happen if the
target of the remote HEAD is unborn), we'll fall back to using our local
init.defaultBranch. Traditionally this hasn't been a big deal, because
most repos used "master" as the default. But these days it is likely to
cause confusion if the server and client implementations choose
different values (e.g., if the remote started with "main", we may choose
"master" locally, create commits there, and then the user is surprised
when they push to "master" and not "main").

To solve this, the remote needs to communicate the target of the HEAD
symref, even if it is unborn, and "git clone" needs to use this
information.

Currently, symrefs that have unborn targets (such as in this case) are
not communicated by the protocol. Teach Git to advertise and support the
"unborn" feature in "ls-refs" (guarded by the lsrefs.allowunborn
config). This feature indicates that "ls-refs" supports the "unborn"
argument; when it is specified, "ls-refs" will send the HEAD symref with
the name of its unborn target.

This change is only for protocol v2. A similar change for protocol v0
would require independent protocol design (there being no analogous
position to signal support for "unborn") and client-side plumbing of the
data required, so the scope of this patch set is limited to protocol v2.

The client side will be updated to use this in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
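Not part of the commit message: a minimal, standalone sketch of how a
client might read the new line format described above
("unborn <refname> symref-target:<target>"). The helper name
parse_unborn_head() and its buffer-based interface are assumptions made
for illustration only; the real client-side parsing in this series lives
in connect.c and uses Git's own string_list and skip_prefix() helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of this series): parse one ls-refs
 * response line. Returns 1 and copies the symref target into `target`
 * when the line reports an unborn HEAD whose target fits in the buffer;
 * returns 0 for every other line.
 */
int parse_unborn_head(const char *line, char *target, size_t target_len)
{
	static const char prefix[] = "unborn HEAD ";
	static const char attr[] = "symref-target:";
	const char *p;

	assert(line && target && target_len > 0);

	if (strncmp(line, prefix, strlen(prefix)))
		return 0;
	p = line + strlen(prefix);

	/* Walk the space-separated ref attributes that follow the refname. */
	while (*p) {
		size_t len = strcspn(p, " \n");

		if (len > strlen(attr) && !strncmp(p, attr, strlen(attr))) {
			size_t n = len - strlen(attr);

			if (n >= target_len)
				return 0;	/* target too long for the buffer */
			memcpy(target, p + strlen(attr), n);
			target[n] = '\0';
			return 1;
		}
		p += len;
		p += strspn(p, " \n");
	}
	return 0;
}
```

A caller would pass each received line together with a buffer and treat
a return value of 1 as "the remote HEAD points at this unborn branch".
Lines that start with an object ID, or that name a ref other than HEAD,
are left to the normal ref-processing path.
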
 Documentation/config.txt                |  2 +
 Documentation/config/lsrefs.txt         |  3 ++
 Documentation/technical/protocol-v2.txt | 10 ++++-
 ls-refs.c                               | 51 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 serve.c                                 |  2 +-
 6 files changed, 63 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6ba50b1104..d08e83a148 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -398,6 +398,8 @@ include::config/interactive.txt[]
 
 include::config/log.txt[]
 
+include::config/lsrefs.txt[]
+
 include::config/mailinfo.txt[]
 
 include::config/mailmap.txt[]
diff --git a/Documentation/config/lsrefs.txt b/Documentation/config/lsrefs.txt
new file mode 100644
index 0000000000..dcbec11aaa
--- /dev/null
+++ b/Documentation/config/lsrefs.txt
@@ -0,0 +1,3 @@
+lsrefs.allowUnborn::
+	Allow the server to send information about unborn symrefs during the
+	protocol v2 ref advertisement.
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 85daeb5d9e..4707511c10 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -192,11 +192,19 @@ ls-refs takes in the following arguments:
 	When specified, only references having a prefix matching one of
 	the provided prefixes are displayed.
 
+If the 'unborn' feature is advertised the following argument can be
+included in the client's request.
+
+    unborn
+	The server may send symrefs pointing to unborn branches in the form
+	"unborn <refname> symref-target:<target>".
+
 The output of ls-refs is as follows:
 
     output = *ref
 	     flush-pkt
-    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+    obj-id-or-unborn = (obj-id | "unborn")
+    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
     ref-attribute = (symref | peeled)
     symref = "symref-target:" symref-target
     peeled = "peeled:" obj-id
diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..82c79895c3 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -32,6 +32,8 @@ struct ls_refs_data {
 	unsigned peel;
 	unsigned symrefs;
 	struct strvec prefixes;
+	unsigned allow_unborn : 1;
+	unsigned unborn : 1;
 };
 
 static int send_ref(const char *refname, const struct object_id *oid,
@@ -47,7 +49,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	if (!ref_match(&data->prefixes, refname_nons))
 		return 0;
 
-	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (oid)
+		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	else
+		strbuf_addf(&refline, "unborn %s", refname_nons);
 	if (data->symrefs && flag & REF_ISSYMREF) {
 		struct object_id unused;
 		const char *symref_target = resolve_ref_unsafe(refname, 0,
@@ -74,8 +79,29 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
-static int ls_refs_config(const char *var, const char *value, void *data)
+static void send_possibly_unborn_head(struct ls_refs_data *data)
 {
+	struct strbuf namespaced = STRBUF_INIT;
+	struct object_id oid;
+	int flag;
+	int oid_is_null;
+
+	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
+	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
+	oid_is_null = is_null_oid(&oid);
+	if (!oid_is_null ||
+	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
+		send_ref(namespaced.buf, oid_is_null ? NULL : &oid, flag, data);
+	strbuf_release(&namespaced);
+}
+
+static int ls_refs_config(const char *var, const char *value, void *cb_data)
+{
+	struct ls_refs_data *data = cb_data;
+
+	if (!strcmp("lsrefs.allowunborn", var))
+		data->allow_unborn = git_config_bool(var, value);
+
 	/*
 	 * We only serve fetches over v2 for now, so respect only "uploadpack"
 	 * config. This may need to eventually be expanded to "receive", but we
@@ -91,7 +117,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
 
 	memset(&data, 0, sizeof(data));
 
-	git_config(ls_refs_config, NULL);
+	git_config(ls_refs_config, &data);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
 		const char *arg = request->line;
@@ -103,14 +129,31 @@ int ls_refs(struct repository *r, struct strvec *keys,
 			data.symrefs = 1;
 		else if (skip_prefix(arg, "ref-prefix ", &out))
 			strvec_push(&data.prefixes, out);
+		else if (data.allow_unborn && !strcmp("unborn", arg))
+			data.unborn = 1;
 	}
 
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ls-refs arguments"));
 
-	head_ref_namespaced(send_ref, &data);
+	send_possibly_unborn_head(&data);
 	for_each_namespaced_ref(send_ref, &data);
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
 	return 0;
 }
+
+int ls_refs_advertise(struct repository *r, struct strbuf *value)
+{
+	if (value) {
+		int allow_unborn_value;
+
+		if (!repo_config_get_bool(the_repository,
+					 "lsrefs.allowunborn",
+					 &allow_unborn_value) &&
+		    allow_unborn_value)
+			strbuf_addstr(value, "unborn");
+	}
+
+	return 1;
+}
diff --git a/ls-refs.h b/ls-refs.h
index 7b33a7c6b8..a99e4be0bd 100644
--- a/ls-refs.h
+++ b/ls-refs.h
@@ -6,5 +6,6 @@ struct strvec;
 struct packet_reader;
 int ls_refs(struct repository *r, struct strvec *keys,
 	    struct packet_reader *request);
+int ls_refs_advertise(struct repository *r, struct strbuf *value);
 
 #endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index eec2fe6f29..ac20c72763 100644
--- a/serve.c
+++ b/serve.c
@@ -73,7 +73,7 @@ struct protocol_capability {
 
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
-	{ "ls-refs", always_advertise, ls_refs },
+	{ "ls-refs", ls_refs_advertise, ls_refs },
 	{ "fetch", upload_pack_advertise, upload_pack_v2 },
 	{ "server-option", always_advertise, NULL },
 	{ "object-format", object_format_advertise, NULL },
-- 
2.29.2.729.g45daf8777d-goog



* [PATCH v3 2/3] connect, transport: add no-op arg for future patch
  2020-12-21 22:30   ` [PATCH v3 0/3] Cloning with " Jonathan Tan
  2020-12-21 22:30     ` [PATCH v3 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
@ 2020-12-21 22:31     ` Jonathan Tan
  2020-12-21 22:31     ` [PATCH v3 3/3] clone: respect remote unborn HEAD Jonathan Tan
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2020-12-21 22:31 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

In a future patch we plan to return the name of an unborn current branch
from deep in the callchain to a caller via a new pointer parameter that
points at a variable in the caller when the caller calls
get_remote_refs() and transport_get_remote_refs(). Add the parameter to
functions involved in the callchain, but no caller passes an actual
argument yet in this step. Thus, the future patch only needs to concern
itself with new logic.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
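Not part of the commit message: a small standalone sketch of the
out-parameter convention described above, where passing NULL means "the
caller is not interested" and a non-NULL pointer receives a newly
allocated string that the caller must free. The function name
report_unborn_head() and its return codes are hypothetical; only the
"char **unborn_head_target" parameter shape is taken from this patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a function deep in the callchain.
 * Returns 0 if the caller passed NULL (nothing stored), 1 if a copy of
 * the target was stored in *unborn_head_target, and -1 on allocation
 * failure. A stored copy is owned by the caller.
 */
int report_unborn_head(const char *target_from_server,
		       char **unborn_head_target)
{
	size_t len;
	char *copy;

	assert(target_from_server);

	if (!unborn_head_target)
		return 0;	/* caller passed NULL: nothing to report */

	len = strlen(target_from_server) + 1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, target_from_server, len);
	*unborn_head_target = copy;
	return 1;
}
```

Every existing caller in this step corresponds to the NULL case, which
is why adding the parameter has no effect on execution yet.
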
 builtin/clone.c      |  2 +-
 builtin/fetch-pack.c |  3 ++-
 builtin/fetch.c      |  2 +-
 builtin/ls-remote.c  |  2 +-
 builtin/remote.c     |  2 +-
 connect.c            |  5 ++++-
 remote.h             |  3 ++-
 transport-helper.c   |  7 +++++--
 transport-internal.h | 13 +++++--------
 transport.c          | 29 ++++++++++++++++++-----------
 transport.h          |  7 ++++++-
 11 files changed, 46 insertions(+), 29 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index a0841923cf..70f9450db4 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1264,7 +1264,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (!option_no_tags)
 		strvec_push(&ref_prefixes, "refs/tags/");
 
-	refs = transport_get_remote_refs(transport, &ref_prefixes);
+	refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 
 	if (refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 58b7c1fbdc..9f921dfab4 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	version = discover_version(&reader);
 	switch (version) {
 	case protocol_v2:
-		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
+		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
+				args.stateless_rpc, NULL);
 		break;
 	case protocol_v1:
 	case protocol_v0:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index ecf8537605..a7ef59acfc 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1393,7 +1393,7 @@ static int do_fetch(struct transport *transport,
 
 	if (must_list_refs) {
 		trace2_region_enter("fetch", "remote_refs", the_repository);
-		remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
+		remote_refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 		trace2_region_leave("fetch", "remote_refs", the_repository);
 	} else
 		remote_refs = NULL;
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index 092917eca2..4cf3f60b1b 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -118,7 +118,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (server_options.nr)
 		transport->server_options = &server_options;
 
-	ref = transport_get_remote_refs(transport, &ref_prefixes);
+	ref = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 	if (ref) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 		repo_set_hash_algo(the_repository, hash_algo);
diff --git a/builtin/remote.c b/builtin/remote.c
index d11a5589e4..9a547240ab 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -950,7 +950,7 @@ static int get_remote_ref_states(const char *name,
 	if (query) {
 		transport = transport_get(states->remote, states->remote->url_nr > 0 ?
 			states->remote->url[0] : NULL);
-		remote_refs = transport_get_remote_refs(transport, NULL);
+		remote_refs = transport_get_remote_refs(transport, NULL, NULL);
 		transport_disconnect(transport);
 
 		states->queried = 1;
diff --git a/connect.c b/connect.c
index 8b8f56cf6d..99d9052365 100644
--- a/connect.c
+++ b/connect.c
@@ -455,7 +455,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct strvec *ref_prefixes,
 			     const struct string_list *server_options,
-			     int stateless_rpc)
+			     int stateless_rpc,
+			     char **unborn_head_target)
 {
 	int i;
 	const char *hash_name;
@@ -496,6 +497,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		if (unborn_head_target)
+			BUG("NEEDSWORK: provide unborn HEAD target to caller while reading refs");
 		if (!process_ref_v2(reader, &list))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
diff --git a/remote.h b/remote.h
index 3211abdf05..967f2178d8 100644
--- a/remote.h
+++ b/remote.h
@@ -198,7 +198,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct strvec *ref_prefixes,
 			     const struct string_list *server_options,
-			     int stateless_rpc);
+			     int stateless_rpc,
+			     char **unborn_head_target);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 
diff --git a/transport-helper.c b/transport-helper.c
index 5f6e0b3bd8..5d97eba935 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1162,13 +1162,16 @@ static int has_attribute(const char *attrs, const char *attr)
 }
 
 static struct ref *get_refs_list(struct transport *transport, int for_push,
-				 const struct strvec *ref_prefixes)
+				 const struct strvec *ref_prefixes,
+				 char **unborn_head_target)
 {
 	get_helper(transport);
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push, ref_prefixes);
+		return transport->vtable->get_refs_list(transport, for_push,
+							ref_prefixes,
+							unborn_head_target);
 	}
 
 	return get_refs_list_using_list(transport, for_push);
diff --git a/transport-internal.h b/transport-internal.h
index 27c9daffc4..5037f6197d 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -18,19 +18,16 @@ struct transport_vtable {
 	 * the transport to try to share connections, for_push is a
 	 * hint as to whether the ultimate operation is a push or a fetch.
 	 *
-	 * If communicating using protocol v2 a list of prefixes can be
-	 * provided to be sent to the server to enable it to limit the ref
-	 * advertisement.  Since ref filtering is done on the server's end, and
-	 * only when using protocol v2, this list will be ignored when not
-	 * using protocol v2 meaning this function can return refs which don't
-	 * match the provided ref_prefixes.
-	 *
 	 * If the transport is able to determine the remote hash for
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
+	 *
+	 * See transport_get_remote_refs() for information on ref_prefixes and
+	 * unborn_head_target.
 	 **/
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
-				     const struct strvec *ref_prefixes);
+				     const struct strvec *ref_prefixes,
+				     char **unborn_head_target);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index 679a35e7c1..396a601d78 100644
--- a/transport.c
+++ b/transport.c
@@ -127,7 +127,8 @@ struct bundle_transport_data {
 
 static struct ref *get_refs_from_bundle(struct transport *transport,
 					int for_push,
-					const struct strvec *ref_prefixes)
+					const struct strvec *ref_prefixes,
+					char **unborn_head_target)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -163,7 +164,7 @@ static int fetch_refs_from_bundle(struct transport *transport,
 	int ret;
 
 	if (!data->get_refs_from_bundle_called)
-		get_refs_from_bundle(transport, 0, NULL);
+		get_refs_from_bundle(transport, 0, NULL, NULL);
 	ret = unbundle(the_repository, &data->header, data->fd,
 			   transport->progress ? BUNDLE_VERBOSE : 0);
 	transport->hash_algo = data->header.hash_algo;
@@ -281,7 +282,7 @@ static void die_if_server_options(struct transport *transport)
  */
 static struct ref *handshake(struct transport *transport, int for_push,
 			     const struct strvec *ref_prefixes,
-			     int must_list_refs)
+			     int must_list_refs, char **unborn_head_target)
 {
 	struct git_transport_data *data = transport->data;
 	struct ref *refs = NULL;
@@ -305,7 +306,8 @@ static struct ref *handshake(struct transport *transport, int for_push,
 			get_remote_refs(data->fd[1], &reader, &refs, for_push,
 					ref_prefixes,
 					transport->server_options,
-					transport->stateless_rpc);
+					transport->stateless_rpc,
+					unborn_head_target);
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -334,9 +336,11 @@ static struct ref *handshake(struct transport *transport, int for_push,
 }
 
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
-					const struct strvec *ref_prefixes)
+					const struct strvec *ref_prefixes,
+					char **unborn_head_target)
 {
-	return handshake(transport, for_push, ref_prefixes, 1);
+	return handshake(transport, for_push, ref_prefixes, 1,
+			 unborn_head_target);
 }
 
 static int fetch_refs_via_pack(struct transport *transport,
@@ -380,7 +384,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 				break;
 			}
 		}
-		refs_tmp = handshake(transport, 0, NULL, must_list_refs);
+		refs_tmp = handshake(transport, 0, NULL, must_list_refs, NULL);
 	}
 
 	if (data->version == protocol_unknown_version)
@@ -775,7 +779,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 		return -1;
 
 	if (!data->got_remote_heads)
-		get_refs_via_connect(transport, 1, NULL);
+		get_refs_via_connect(transport, 1, NULL, NULL);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
@@ -1261,7 +1265,8 @@ int transport_push(struct repository *r,
 
 		trace2_region_enter("transport_push", "get_refs_list", r);
 		remote_refs = transport->vtable->get_refs_list(transport, 1,
-							       &ref_prefixes);
+							       &ref_prefixes,
+							       NULL);
 		trace2_region_leave("transport_push", "get_refs_list", r);
 
 		strvec_clear(&ref_prefixes);
@@ -1380,12 +1385,14 @@ int transport_push(struct repository *r,
 }
 
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes)
+					    const struct strvec *ref_prefixes,
+					    char **unborn_head_target)
 {
 	if (!transport->got_remote_refs) {
 		transport->remote_refs =
 			transport->vtable->get_refs_list(transport, 0,
-							 ref_prefixes);
+							 ref_prefixes,
+							 unborn_head_target);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 24558c027d..65de0c9c00 100644
--- a/transport.h
+++ b/transport.h
@@ -241,9 +241,14 @@ int transport_push(struct repository *repo,
  * advertisement.  Since ref filtering is done on the server's end (and only
  * when using protocol v2), this can return refs which don't match the provided
  * ref_prefixes.
+ *
+ * If unborn_head_target is not NULL, and the remote reports HEAD as pointing
+ * to an unborn branch, this function stores the unborn branch in
+ * unborn_head_target. It should be freed by the caller.
  */
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes);
+					    const struct strvec *ref_prefixes,
+					    char **unborn_head_target);
 
 /*
  * Fetch the hash algorithm used by a remote.
-- 
2.29.2.729.g45daf8777d-goog



* [PATCH v3 3/3] clone: respect remote unborn HEAD
  2020-12-21 22:30   ` [PATCH v3 0/3] Cloning with " Jonathan Tan
  2020-12-21 22:30     ` [PATCH v3 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
  2020-12-21 22:31     ` [PATCH v3 2/3] connect, transport: add no-op arg for future patch Jonathan Tan
@ 2020-12-21 22:31     ` Jonathan Tan
  2020-12-21 23:48     ` [PATCH v3 0/3] Cloning with " Junio C Hamano
  2021-01-21 20:14     ` Jeff King
  4 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2020-12-21 22:31 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Teach Git to use the "unborn" feature introduced in a previous patch as
follows: Git will always send the "unborn" argument if it is supported
by the server. During "git clone", if cloning an empty repository, Git
will use the new information to determine the local branch to create. In
all other cases, Git will ignore it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
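Not part of the commit message: a standalone sketch of the branch
selection that "git clone" performs for an empty repository after this
patch. If the remote reported an unborn HEAD target under refs/heads/,
that ref is used; otherwise the locally configured default is used. The
function choose_initial_ref() is hypothetical, and the plain
malloc()/snprintf() calls stand in for Git's xstrfmt() and
skip_prefix().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated "refs/heads/<branch>"
 * string (or NULL on allocation failure). `unborn_head_target` may be
 * NULL when the server did not report an unborn HEAD.
 */
char *choose_initial_ref(const char *unborn_head_target,
			 const char *default_branch)
{
	static const char heads[] = "refs/heads/";
	const char *branch = default_branch;
	size_t len;
	char *ref;

	assert(default_branch);

	/* Prefer the remote's unborn target, but only if it is a branch. */
	if (unborn_head_target &&
	    !strncmp(unborn_head_target, heads, strlen(heads)))
		branch = unborn_head_target + strlen(heads);

	len = strlen(heads) + strlen(branch) + 1;
	ref = malloc(len);
	if (ref)
		snprintf(ref, len, "%s%s", heads, branch);
	return ref;
}
```

With a remote initialized using init.defaultBranch=foo and a local
init.defaultBranch=up, this corresponds to the t5606 expectation below:
the clone ends up on refs/heads/foo rather than refs/heads/up.
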
 Documentation/config/init.txt |  2 +-
 builtin/clone.c               | 19 ++++++++++++++++---
 connect.c                     | 28 ++++++++++++++++++++++++----
 t/t5606-clone-options.sh      |  7 ++++---
 t/t5702-protocol-v2.sh        |  9 +++++++++
 5 files changed, 54 insertions(+), 11 deletions(-)

diff --git a/Documentation/config/init.txt b/Documentation/config/init.txt
index dc77f8c844..79c79d6617 100644
--- a/Documentation/config/init.txt
+++ b/Documentation/config/init.txt
@@ -4,4 +4,4 @@ init.templateDir::
 
 init.defaultBranch::
 	Allows overriding the default branch name e.g. when initializing
-	a new repository or when cloning an empty repository.
+	a new repository.
diff --git a/builtin/clone.c b/builtin/clone.c
index 70f9450db4..217c87fddf 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -980,6 +980,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	int submodule_progress;
 
 	struct strvec ref_prefixes = STRVEC_INIT;
+	char *unborn_head_target = NULL;
 
 	packet_trace_identity("clone");
 
@@ -1264,7 +1265,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (!option_no_tags)
 		strvec_push(&ref_prefixes, "refs/tags/");
 
-	refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
+	refs = transport_get_remote_refs(transport, &ref_prefixes,
+					 &unborn_head_target);
 
 	if (refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
@@ -1323,10 +1325,20 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		remote_head = NULL;
 		option_no_checkout = 1;
 		if (!option_bare) {
-			const char *branch = git_default_branch_name();
-			char *ref = xstrfmt("refs/heads/%s", branch);
+			const char *branch;
+			char *ref;
+
+			if (unborn_head_target &&
+			    skip_prefix(unborn_head_target, "refs/heads/", &branch)) {
+				ref = unborn_head_target;
+				unborn_head_target = NULL;
+			} else {
+				branch = git_default_branch_name();
+				ref = xstrfmt("refs/heads/%s", branch);
+			}
 
 			install_branch_config(0, branch, remote_name, ref);
+			create_symref("HEAD", ref, "");
 			free(ref);
 		}
 	}
@@ -1373,6 +1385,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	strbuf_release(&key);
 	junk_mode = JUNK_LEAVE_ALL;
 
+	free(unborn_head_target);
 	strvec_clear(&ref_prefixes);
 	return err;
 }
diff --git a/connect.c b/connect.c
index 99d9052365..3c35324b4c 100644
--- a/connect.c
+++ b/connect.c
@@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 }
 
 /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
-static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
+static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
+			  char **unborn_head_target)
 {
 	int ret = 1;
 	int i = 0;
@@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
 		goto out;
 	}
 
+	if (!strcmp("unborn", line_sections.items[i].string)) {
+		i++;
+		if (unborn_head_target &&
+		    !strcmp("HEAD", line_sections.items[i++].string)) {
+			/*
+			 * Look for the symref target (if any). If found,
+			 * return it to the caller.
+			 */
+			for (; i < line_sections.nr; i++) {
+				const char *arg = line_sections.items[i].string;
+
+				if (skip_prefix(arg, "symref-target:", &arg)) {
+					*unborn_head_target = xstrdup(arg);
+					break;
+				}
+			}
+		}
+		goto out;
+	}
 	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
 	    *end) {
 		ret = 0;
@@ -489,6 +509,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	if (!for_push)
 		packet_write_fmt(fd_out, "peel\n");
 	packet_write_fmt(fd_out, "symrefs\n");
+	if (server_supports_feature("ls-refs", "unborn", 0))
+		packet_write_fmt(fd_out, "unborn\n");
 	for (i = 0; ref_prefixes && i < ref_prefixes->nr; i++) {
 		packet_write_fmt(fd_out, "ref-prefix %s\n",
 				 ref_prefixes->v[i]);
@@ -497,9 +519,7 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
-		if (unborn_head_target)
-			BUG("NEEDSWORK: provide unborn HEAD target to caller while reading refs");
-		if (!process_ref_v2(reader, &list))
+		if (!process_ref_v2(reader, &list, unborn_head_target))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
 
diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
index 7f082fb23b..67a4c6d05b 100755
--- a/t/t5606-clone-options.sh
+++ b/t/t5606-clone-options.sh
@@ -102,11 +102,12 @@ test_expect_success 'redirected clone -v does show progress' '
 '
 
 test_expect_success 'chooses correct default initial branch name' '
-	git init --bare empty &&
+	git -c init.defaultBranch=foo init --bare empty &&
+	test_config -C empty lsrefs.allowUnborn true &&
 	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
 	git -c init.defaultBranch=up clone empty whats-up &&
-	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
-	test refs/heads/up = $(git -C whats-up config branch.up.merge)
+	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
+	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
 '
 
 test_expect_success 'guesses initial branch name correctly' '
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d5b17909b..4fbbe5aff5 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -209,6 +209,15 @@ test_expect_success 'clone with file:// using protocol v2' '
 	grep "ref-prefix refs/tags/" log
 '
 
+test_expect_success 'clone of empty repo propagates name of default branch' '
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+	test_config -C file_empty_parent lsrefs.allowUnborn true &&
+
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
 test_expect_success 'fetch with file:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
-- 
2.29.2.729.g45daf8777d-goog



* Re: [PATCH v3 0/3] Cloning with remote unborn HEAD
  2020-12-21 22:30   ` [PATCH v3 0/3] Cloning with " Jonathan Tan
                       ` (2 preceding siblings ...)
  2020-12-21 22:31     ` [PATCH v3 3/3] clone: respect remote unborn HEAD Jonathan Tan
@ 2020-12-21 23:48     ` Junio C Hamano
  2021-01-21 20:14     ` Jeff King
  4 siblings, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2020-12-21 23:48 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> Jonathan Tan (3):
>   ls-refs: report unborn targets of symrefs
>   connect, transport: add no-op arg for future patch
>   clone: respect remote unborn HEAD

This passes all of its tests standalone, but seems to break some tests
when merged into 'seen'.

https://travis-ci.org/github/git/git/builds/750936755 has just
started, but my local tests before publishing the day's integration
result failed 5702, 0031, and 5606 with this topic, and all passed
without the topic.

Thanks.


* [PATCH v4 0/3] Cloning with remote unborn HEAD
  2020-12-11 21:05 ` [PATCH] clone: in protocol v2, use remote's default branch Jonathan Tan
                     ` (4 preceding siblings ...)
  2020-12-21 22:30   ` [PATCH v3 0/3] Cloning with " Jonathan Tan
@ 2020-12-22 21:54   ` Jonathan Tan
  2020-12-22 21:54     ` [PATCH v4 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
                       ` (3 more replies)
  5 siblings, 4 replies; 109+ messages in thread
From: Jonathan Tan @ 2020-12-22 21:54 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Thanks Junio for informing me of the test failures. Turns out that it
was partly because I didn't memset oid (and in some code paths, it gets
read without being written to), and partly because I didn't set
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME. Here's an updated patch set with
the fixes.
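
To illustrate the first of the two fixes, here is a minimal,
self-contained sketch of the pattern. It is not the Git code: the
sketch_* names and the 20-byte hash size are made up for illustration,
and sketch_resolve() merely stands in for a resolver that can return
without writing to its output argument.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_RAWSZ 20 /* hypothetical hash size, for this sketch only */

struct sketch_oid {
	unsigned char hash[SKETCH_RAWSZ];
};

/*
 * Hypothetical resolver: fills `oid` only when `name` is "born". For any
 * other name it returns 0 and leaves `oid` untouched, standing in for the
 * code paths that never write to the output.
 */
static int sketch_resolve(const char *name, struct sketch_oid *oid)
{
	assert(name && oid);
	if (strcmp(name, "born"))
		return 0;
	memset(oid->hash, 0xab, sizeof(oid->hash));
	return 1;
}

/* Returns 1 if every byte of the hash is zero. */
static int sketch_is_null(const struct sketch_oid *oid)
{
	static const unsigned char zero[SKETCH_RAWSZ];

	return !memcmp(oid->hash, zero, sizeof(zero));
}

/* Returns 1 if `name` resolved to an object, 0 if it is unborn. */
static int sketch_head_is_born(const char *name)
{
	struct sketch_oid oid;

	/* Without this, the null check below could read indeterminate bytes. */
	memset(&oid, 0, sizeof(oid));
	sketch_resolve(name, &oid);
	return !sketch_is_null(&oid);
}
```

With the memset in place, the null check has a defined answer even when
the resolver returns early.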

Jonathan Tan (3):
  ls-refs: report unborn targets of symrefs
  connect, transport: add no-op arg for future patch
  clone: respect remote unborn HEAD

 Documentation/config.txt                |  2 +
 Documentation/config/init.txt           |  2 +-
 Documentation/config/lsrefs.txt         |  3 ++
 Documentation/technical/protocol-v2.txt | 10 ++++-
 builtin/clone.c                         | 19 +++++++--
 builtin/fetch-pack.c                    |  3 +-
 builtin/fetch.c                         |  2 +-
 builtin/ls-remote.c                     |  2 +-
 builtin/remote.c                        |  2 +-
 connect.c                               | 29 ++++++++++++--
 ls-refs.c                               | 52 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 remote.h                                |  3 +-
 serve.c                                 |  2 +-
 t/t5606-clone-options.sh                |  8 ++--
 t/t5702-protocol-v2.sh                  | 11 ++++++
 transport-helper.c                      |  7 +++-
 transport-internal.h                    | 13 +++----
 transport.c                             | 29 ++++++++------
 transport.h                             |  7 +++-
 20 files changed, 164 insertions(+), 43 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt

Range-diff against v3:
1:  7d20ec323a ! 1:  a66e50626e ls-refs: report unborn targets of symrefs
    @@ ls-refs.c: static int send_ref(const char *refname, const struct object_id *oid,
     +	int flag;
     +	int oid_is_null;
     +
    ++	memset(&oid, 0, sizeof(oid));
     +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
     +	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
     +	oid_is_null = is_null_oid(&oid);
2:  b5a78857eb = 2:  14f3962adc connect, transport: add no-op arg for future patch
3:  c2303dc976 ! 3:  e770fc46eb clone: respect remote unborn HEAD
    @@ t/t5606-clone-options.sh: test_expect_success 'redirected clone -v does show pro
      
      test_expect_success 'chooses correct default initial branch name' '
     -	git init --bare empty &&
    ++	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
     +	git -c init.defaultBranch=foo init --bare empty &&
     +	test_config -C empty lsrefs.allowUnborn true &&
      	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
    @@ t/t5702-protocol-v2.sh: test_expect_success 'clone with file:// using protocol v
      '
      
     +test_expect_success 'clone of empty repo propagates name of default branch' '
    ++	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
     +	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
     +	test_config -C file_empty_parent lsrefs.allowUnborn true &&
     +
    ++	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
     +	git -c init.defaultBranch=main -c protocol.version=2 \
     +		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
     +	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
-- 
2.29.2.729.g45daf8777d-goog



* [PATCH v4 1/3] ls-refs: report unborn targets of symrefs
  2020-12-22 21:54   ` [PATCH v4 " Jonathan Tan
@ 2020-12-22 21:54     ` Jonathan Tan
  2021-01-21 20:48       ` Jeff King
  2020-12-22 21:54     ` [PATCH v4 2/3] connect, transport: add no-op arg for future patch Jonathan Tan
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2020-12-22 21:54 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

When cloning, we choose the default branch based on the remote HEAD.
But if there is no remote HEAD reported (which could happen if the
target of the remote HEAD is unborn), we'll fall back to using our local
init.defaultBranch. Traditionally this hasn't been a big deal, because
most repos used "master" as the default. But these days it is likely to
cause confusion if the server and client implementations choose
different values (e.g., if the remote started with "main", we may choose
"master" locally, create commits there, and then the user is surprised
when they push to "master" and not "main").

To solve this, the remote needs to communicate the target of the HEAD
symref, even if it is unborn, and "git clone" needs to use this
information.

Currently, symrefs that have unborn targets (such as in this case) are
not communicated by the protocol. Teach Git to advertise and support the
"unborn" feature in "ls-refs" (guarded by the lsrefs.allowunborn
config). This feature indicates that "ls-refs" supports the "unborn"
argument; when it is specified, "ls-refs" will send the HEAD symref with
the name of its unborn target.
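
As a rough sketch of the server-side rule described above (not the code
in this patch), the decision and the resulting line can be modelled as
one pure function. format_head_line() and all of its parameters are
hypothetical names; the line is shown without pkt-line framing or the
trailing LF.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the server-side decision; none of these names
 * exist in Git. Writes the HEAD line into `out` and returns 1. Returns 0
 * when HEAD should be omitted, or when `out` is too small.
 *
 * oid_hex:       hex object name HEAD resolves to, or NULL if unborn
 * symref_target: ref that HEAD points at, or NULL if HEAD is not a symref
 * want_symrefs:  client sent the "symrefs" argument
 * want_unborn:   client sent "unborn" and lsrefs.allowUnborn is true
 */
static int format_head_line(char *out, size_t outlen,
			    const char *oid_hex, const char *symref_target,
			    int want_symrefs, int want_unborn)
{
	int n;

	assert(out && outlen);
	if (!oid_hex && !(want_unborn && want_symrefs && symref_target))
		return 0; /* unborn HEAD, and the client did not ask for it */

	if (want_symrefs && symref_target)
		n = snprintf(out, outlen, "%s HEAD symref-target:%s",
			     oid_hex ? oid_hex : "unborn", symref_target);
	else
		n = snprintf(out, outlen, "%s HEAD", oid_hex);
	return n > 0 && (size_t)n < outlen;
}
```

For an empty repository whose HEAD points at refs/heads/main, a client
that sent both "symrefs" and "unborn" would get
"unborn HEAD symref-target:refs/heads/main"; a client that did not send
"unborn" would get no HEAD line at all, as before.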

This change is only for protocol v2. A similar change for protocol v0
would require independent protocol design (there being no analogous
position to signal support for "unborn") and client-side plumbing of the
data required, so the scope of this patch set is limited to protocol v2.

The client side will be updated to use this in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/config.txt                |  2 +
 Documentation/config/lsrefs.txt         |  3 ++
 Documentation/technical/protocol-v2.txt | 10 ++++-
 ls-refs.c                               | 52 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 serve.c                                 |  2 +-
 6 files changed, 64 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6ba50b1104..d08e83a148 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -398,6 +398,8 @@ include::config/interactive.txt[]
 
 include::config/log.txt[]
 
+include::config/lsrefs.txt[]
+
 include::config/mailinfo.txt[]
 
 include::config/mailmap.txt[]
diff --git a/Documentation/config/lsrefs.txt b/Documentation/config/lsrefs.txt
new file mode 100644
index 0000000000..dcbec11aaa
--- /dev/null
+++ b/Documentation/config/lsrefs.txt
@@ -0,0 +1,3 @@
+lsrefs.allowUnborn::
+	Allow the server to send information about unborn symrefs during the
+	protocol v2 ref advertisement.
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 85daeb5d9e..4707511c10 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -192,11 +192,19 @@ ls-refs takes in the following arguments:
 	When specified, only references having a prefix matching one of
 	the provided prefixes are displayed.
 
+If the 'unborn' feature is advertised the following argument can be
+included in the client's request.
+
+    unborn
+	The server may send symrefs pointing to unborn branches in the form
+	"unborn <refname> symref-target:<target>".
+
 The output of ls-refs is as follows:
 
     output = *ref
 	     flush-pkt
-    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+    obj-id-or-unborn = (obj-id | "unborn")
+    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
     ref-attribute = (symref | peeled)
     symref = "symref-target:" symref-target
     peeled = "peeled:" obj-id
diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..ff61e704f1 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -32,6 +32,8 @@ struct ls_refs_data {
 	unsigned peel;
 	unsigned symrefs;
 	struct strvec prefixes;
+	unsigned allow_unborn : 1;
+	unsigned unborn : 1;
 };
 
 static int send_ref(const char *refname, const struct object_id *oid,
@@ -47,7 +49,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	if (!ref_match(&data->prefixes, refname_nons))
 		return 0;
 
-	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (oid)
+		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	else
+		strbuf_addf(&refline, "unborn %s", refname_nons);
 	if (data->symrefs && flag & REF_ISSYMREF) {
 		struct object_id unused;
 		const char *symref_target = resolve_ref_unsafe(refname, 0,
@@ -74,8 +79,30 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
-static int ls_refs_config(const char *var, const char *value, void *data)
+static void send_possibly_unborn_head(struct ls_refs_data *data)
 {
+	struct strbuf namespaced = STRBUF_INIT;
+	struct object_id oid;
+	int flag;
+	int oid_is_null;
+
+	memset(&oid, 0, sizeof(oid));
+	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
+	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
+	oid_is_null = is_null_oid(&oid);
+	if (!oid_is_null ||
+	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
+		send_ref(namespaced.buf, oid_is_null ? NULL : &oid, flag, data);
+	strbuf_release(&namespaced);
+}
+
+static int ls_refs_config(const char *var, const char *value, void *cb_data)
+{
+	struct ls_refs_data *data = cb_data;
+
+	if (!strcmp("lsrefs.allowunborn", var))
+		data->allow_unborn = git_config_bool(var, value);
+
 	/*
 	 * We only serve fetches over v2 for now, so respect only "uploadpack"
 	 * config. This may need to eventually be expanded to "receive", but we
@@ -91,7 +118,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
 
 	memset(&data, 0, sizeof(data));
 
-	git_config(ls_refs_config, NULL);
+	git_config(ls_refs_config, &data);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
 		const char *arg = request->line;
@@ -103,14 +130,31 @@ int ls_refs(struct repository *r, struct strvec *keys,
 			data.symrefs = 1;
 		else if (skip_prefix(arg, "ref-prefix ", &out))
 			strvec_push(&data.prefixes, out);
+		else if (data.allow_unborn && !strcmp("unborn", arg))
+			data.unborn = 1;
 	}
 
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ls-refs arguments"));
 
-	head_ref_namespaced(send_ref, &data);
+	send_possibly_unborn_head(&data);
 	for_each_namespaced_ref(send_ref, &data);
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
 	return 0;
 }
+
+int ls_refs_advertise(struct repository *r, struct strbuf *value)
+{
+	if (value) {
+		int allow_unborn_value;
+
+		if (!repo_config_get_bool(the_repository,
+					 "lsrefs.allowunborn",
+					 &allow_unborn_value) &&
+		    allow_unborn_value)
+			strbuf_addstr(value, "unborn");
+	}
+
+	return 1;
+}
diff --git a/ls-refs.h b/ls-refs.h
index 7b33a7c6b8..a99e4be0bd 100644
--- a/ls-refs.h
+++ b/ls-refs.h
@@ -6,5 +6,6 @@ struct strvec;
 struct packet_reader;
 int ls_refs(struct repository *r, struct strvec *keys,
 	    struct packet_reader *request);
+int ls_refs_advertise(struct repository *r, struct strbuf *value);
 
 #endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index eec2fe6f29..ac20c72763 100644
--- a/serve.c
+++ b/serve.c
@@ -73,7 +73,7 @@ struct protocol_capability {
 
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
-	{ "ls-refs", always_advertise, ls_refs },
+	{ "ls-refs", ls_refs_advertise, ls_refs },
 	{ "fetch", upload_pack_advertise, upload_pack_v2 },
 	{ "server-option", always_advertise, NULL },
 	{ "object-format", object_format_advertise, NULL },
-- 
2.29.2.729.g45daf8777d-goog



* [PATCH v4 2/3] connect, transport: add no-op arg for future patch
  2020-12-22 21:54   ` [PATCH v4 " Jonathan Tan
  2020-12-22 21:54     ` [PATCH v4 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
@ 2020-12-22 21:54     ` Jonathan Tan
  2021-01-21 20:55       ` Jeff King
  2020-12-22 21:54     ` [PATCH v4 3/3] clone: respect remote unborn HEAD Jonathan Tan
  2020-12-22 22:06     ` [PATCH v4 0/3] Cloning with " Junio C Hamano
  3 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2020-12-22 21:54 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

A future patch will return the name of an unborn current branch from
deep in the callchain to the caller of get_remote_refs() or
transport_get_remote_refs(), via a new pointer parameter that points at
a variable in that caller. Add the parameter to the functions in the
callchain; no caller passes an actual argument yet in this step, so the
future patch only needs to concern itself with the new logic.
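
The plumbing pattern can be sketched in isolation as follows. This is
an illustration only: sketch_read_refs() and sketch_list_refs() are
hypothetical names, and the precondition that the caller's variable
starts out as NULL is an assumption of the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical innermost function. If the caller asked for the unborn HEAD
 * target (non-NULL out-pointer) and one was advertised, hand back a
 * malloc'd copy that the caller must free. Returns 0 on success, -1 on
 * allocation failure.
 */
static int sketch_read_refs(const char *advertised_target,
			    char **unborn_head_target)
{
	/* Sketch-only precondition: the caller's variable starts as NULL. */
	assert(!unborn_head_target || !*unborn_head_target);

	if (unborn_head_target && advertised_target) {
		size_t n = strlen(advertised_target) + 1;
		char *copy = malloc(n);

		if (!copy)
			return -1;
		memcpy(copy, advertised_target, n);
		*unborn_head_target = copy;
	}
	return 0;
}

/* Hypothetical intermediate layer: it only forwards the pointer. */
static int sketch_list_refs(const char *advertised_target,
			    char **unborn_head_target)
{
	return sketch_read_refs(advertised_target, unborn_head_target);
}
```

Callers that do not care pass NULL and see no change in behavior, which
is what every call site does in this step.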

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 builtin/clone.c      |  2 +-
 builtin/fetch-pack.c |  3 ++-
 builtin/fetch.c      |  2 +-
 builtin/ls-remote.c  |  2 +-
 builtin/remote.c     |  2 +-
 connect.c            |  5 ++++-
 remote.h             |  3 ++-
 transport-helper.c   |  7 +++++--
 transport-internal.h | 13 +++++--------
 transport.c          | 29 ++++++++++++++++++-----------
 transport.h          |  7 ++++++-
 11 files changed, 46 insertions(+), 29 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index a0841923cf..70f9450db4 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1264,7 +1264,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (!option_no_tags)
 		strvec_push(&ref_prefixes, "refs/tags/");
 
-	refs = transport_get_remote_refs(transport, &ref_prefixes);
+	refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 
 	if (refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 58b7c1fbdc..9f921dfab4 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	version = discover_version(&reader);
 	switch (version) {
 	case protocol_v2:
-		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
+		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
+				args.stateless_rpc, NULL);
 		break;
 	case protocol_v1:
 	case protocol_v0:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index ecf8537605..a7ef59acfc 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1393,7 +1393,7 @@ static int do_fetch(struct transport *transport,
 
 	if (must_list_refs) {
 		trace2_region_enter("fetch", "remote_refs", the_repository);
-		remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
+		remote_refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 		trace2_region_leave("fetch", "remote_refs", the_repository);
 	} else
 		remote_refs = NULL;
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index 092917eca2..4cf3f60b1b 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -118,7 +118,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (server_options.nr)
 		transport->server_options = &server_options;
 
-	ref = transport_get_remote_refs(transport, &ref_prefixes);
+	ref = transport_get_remote_refs(transport, &ref_prefixes, NULL);
 	if (ref) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 		repo_set_hash_algo(the_repository, hash_algo);
diff --git a/builtin/remote.c b/builtin/remote.c
index d11a5589e4..9a547240ab 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -950,7 +950,7 @@ static int get_remote_ref_states(const char *name,
 	if (query) {
 		transport = transport_get(states->remote, states->remote->url_nr > 0 ?
 			states->remote->url[0] : NULL);
-		remote_refs = transport_get_remote_refs(transport, NULL);
+		remote_refs = transport_get_remote_refs(transport, NULL, NULL);
 		transport_disconnect(transport);
 
 		states->queried = 1;
diff --git a/connect.c b/connect.c
index 8b8f56cf6d..99d9052365 100644
--- a/connect.c
+++ b/connect.c
@@ -455,7 +455,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct strvec *ref_prefixes,
 			     const struct string_list *server_options,
-			     int stateless_rpc)
+			     int stateless_rpc,
+			     char **unborn_head_target)
 {
 	int i;
 	const char *hash_name;
@@ -496,6 +497,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		if (unborn_head_target)
+			BUG("NEEDSWORK: provide unborn HEAD target to caller while reading refs");
 		if (!process_ref_v2(reader, &list))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
diff --git a/remote.h b/remote.h
index 3211abdf05..967f2178d8 100644
--- a/remote.h
+++ b/remote.h
@@ -198,7 +198,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
 			     const struct strvec *ref_prefixes,
 			     const struct string_list *server_options,
-			     int stateless_rpc);
+			     int stateless_rpc,
+			     char **unborn_head_target);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 
diff --git a/transport-helper.c b/transport-helper.c
index 5f6e0b3bd8..5d97eba935 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1162,13 +1162,16 @@ static int has_attribute(const char *attrs, const char *attr)
 }
 
 static struct ref *get_refs_list(struct transport *transport, int for_push,
-				 const struct strvec *ref_prefixes)
+				 const struct strvec *ref_prefixes,
+				 char **unborn_head_target)
 {
 	get_helper(transport);
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push, ref_prefixes);
+		return transport->vtable->get_refs_list(transport, for_push,
+							ref_prefixes,
+							unborn_head_target);
 	}
 
 	return get_refs_list_using_list(transport, for_push);
diff --git a/transport-internal.h b/transport-internal.h
index 27c9daffc4..5037f6197d 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -18,19 +18,16 @@ struct transport_vtable {
 	 * the transport to try to share connections, for_push is a
 	 * hint as to whether the ultimate operation is a push or a fetch.
 	 *
-	 * If communicating using protocol v2 a list of prefixes can be
-	 * provided to be sent to the server to enable it to limit the ref
-	 * advertisement.  Since ref filtering is done on the server's end, and
-	 * only when using protocol v2, this list will be ignored when not
-	 * using protocol v2 meaning this function can return refs which don't
-	 * match the provided ref_prefixes.
-	 *
 	 * If the transport is able to determine the remote hash for
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
+	 *
+	 * See transport_get_remote_refs() for information on ref_prefixes and
+	 * unborn_head_target.
 	 **/
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
-				     const struct strvec *ref_prefixes);
+				     const struct strvec *ref_prefixes,
+				     char **unborn_head_target);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index 679a35e7c1..396a601d78 100644
--- a/transport.c
+++ b/transport.c
@@ -127,7 +127,8 @@ struct bundle_transport_data {
 
 static struct ref *get_refs_from_bundle(struct transport *transport,
 					int for_push,
-					const struct strvec *ref_prefixes)
+					const struct strvec *ref_prefixes,
+					char **unborn_head_target)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -163,7 +164,7 @@ static int fetch_refs_from_bundle(struct transport *transport,
 	int ret;
 
 	if (!data->get_refs_from_bundle_called)
-		get_refs_from_bundle(transport, 0, NULL);
+		get_refs_from_bundle(transport, 0, NULL, NULL);
 	ret = unbundle(the_repository, &data->header, data->fd,
 			   transport->progress ? BUNDLE_VERBOSE : 0);
 	transport->hash_algo = data->header.hash_algo;
@@ -281,7 +282,7 @@ static void die_if_server_options(struct transport *transport)
  */
 static struct ref *handshake(struct transport *transport, int for_push,
 			     const struct strvec *ref_prefixes,
-			     int must_list_refs)
+			     int must_list_refs, char **unborn_head_target)
 {
 	struct git_transport_data *data = transport->data;
 	struct ref *refs = NULL;
@@ -305,7 +306,8 @@ static struct ref *handshake(struct transport *transport, int for_push,
 			get_remote_refs(data->fd[1], &reader, &refs, for_push,
 					ref_prefixes,
 					transport->server_options,
-					transport->stateless_rpc);
+					transport->stateless_rpc,
+					unborn_head_target);
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -334,9 +336,11 @@ static struct ref *handshake(struct transport *transport, int for_push,
 }
 
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
-					const struct strvec *ref_prefixes)
+					const struct strvec *ref_prefixes,
+					char **unborn_head_target)
 {
-	return handshake(transport, for_push, ref_prefixes, 1);
+	return handshake(transport, for_push, ref_prefixes, 1,
+			 unborn_head_target);
 }
 
 static int fetch_refs_via_pack(struct transport *transport,
@@ -380,7 +384,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 				break;
 			}
 		}
-		refs_tmp = handshake(transport, 0, NULL, must_list_refs);
+		refs_tmp = handshake(transport, 0, NULL, must_list_refs, NULL);
 	}
 
 	if (data->version == protocol_unknown_version)
@@ -775,7 +779,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 		return -1;
 
 	if (!data->got_remote_heads)
-		get_refs_via_connect(transport, 1, NULL);
+		get_refs_via_connect(transport, 1, NULL, NULL);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
@@ -1261,7 +1265,8 @@ int transport_push(struct repository *r,
 
 		trace2_region_enter("transport_push", "get_refs_list", r);
 		remote_refs = transport->vtable->get_refs_list(transport, 1,
-							       &ref_prefixes);
+							       &ref_prefixes,
+							       NULL);
 		trace2_region_leave("transport_push", "get_refs_list", r);
 
 		strvec_clear(&ref_prefixes);
@@ -1380,12 +1385,14 @@ int transport_push(struct repository *r,
 }
 
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes)
+					    const struct strvec *ref_prefixes,
+					    char **unborn_head_target)
 {
 	if (!transport->got_remote_refs) {
 		transport->remote_refs =
 			transport->vtable->get_refs_list(transport, 0,
-							 ref_prefixes);
+							 ref_prefixes,
+							 unborn_head_target);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 24558c027d..65de0c9c00 100644
--- a/transport.h
+++ b/transport.h
@@ -241,9 +241,14 @@ int transport_push(struct repository *repo,
  * advertisement.  Since ref filtering is done on the server's end (and only
  * when using protocol v2), this can return refs which don't match the provided
  * ref_prefixes.
+ *
+ * If unborn_head_target is not NULL, and the remote reports HEAD as pointing
+ * to an unborn branch, this function stores the unborn branch in
+ * unborn_head_target. It should be freed by the caller.
  */
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes);
+					    const struct strvec *ref_prefixes,
+					    char **unborn_head_target);
 
 /*
  * Fetch the hash algorithm used by a remote.
-- 
2.29.2.729.g45daf8777d-goog



* [PATCH v4 3/3] clone: respect remote unborn HEAD
  2020-12-22 21:54   ` [PATCH v4 " Jonathan Tan
  2020-12-22 21:54     ` [PATCH v4 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
  2020-12-22 21:54     ` [PATCH v4 2/3] connect, transport: add no-op arg for future patch Jonathan Tan
@ 2020-12-22 21:54     ` Jonathan Tan
  2021-01-21 21:02       ` Jeff King
  2020-12-22 22:06     ` [PATCH v4 0/3] Cloning with " Junio C Hamano
  3 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2020-12-22 21:54 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Teach Git to use the "unborn" feature introduced in a previous patch as
follows: Git will always send the "unborn" argument if it is supported
by the server. During "git clone", if cloning an empty repository, Git
will use the new information to determine the local branch to create. In
all other cases, Git will ignore it.
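
The choice of local branch name when cloning an empty repository can be
sketched like this. sketch_initial_branch() is a hypothetical helper,
not a function in Git, and its second argument stands in for whatever
the local init.defaultBranch setting yields.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: pick the short branch name for a clone of an empty
 * repository.
 *
 * unborn_head_target: full ref name reported by the server, or NULL
 * local_default:      locally configured default branch name
 *
 * The returned pointer aliases one of the two arguments.
 */
static const char *sketch_initial_branch(const char *unborn_head_target,
					 const char *local_default)
{
	static const char heads[] = "refs/heads/";

	assert(local_default);
	/* Only a target under refs/heads/ names a usable local branch. */
	if (unborn_head_target &&
	    !strncmp(unborn_head_target, heads, strlen(heads)))
		return unborn_head_target + strlen(heads);
	return local_default;
}
```

A server that reports nothing, or reports a target outside refs/heads/,
leaves the client on its local default, so older servers keep the
previous behavior.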

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Documentation/config/init.txt |  2 +-
 builtin/clone.c               | 19 ++++++++++++++++---
 connect.c                     | 28 ++++++++++++++++++++++++----
 t/t5606-clone-options.sh      |  8 +++++---
 t/t5702-protocol-v2.sh        | 11 +++++++++++
 5 files changed, 57 insertions(+), 11 deletions(-)

diff --git a/Documentation/config/init.txt b/Documentation/config/init.txt
index dc77f8c844..79c79d6617 100644
--- a/Documentation/config/init.txt
+++ b/Documentation/config/init.txt
@@ -4,4 +4,4 @@ init.templateDir::
 
 init.defaultBranch::
 	Allows overriding the default branch name e.g. when initializing
-	a new repository or when cloning an empty repository.
+	a new repository.
diff --git a/builtin/clone.c b/builtin/clone.c
index 70f9450db4..217c87fddf 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -980,6 +980,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	int submodule_progress;
 
 	struct strvec ref_prefixes = STRVEC_INIT;
+	char *unborn_head_target = NULL;
 
 	packet_trace_identity("clone");
 
@@ -1264,7 +1265,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (!option_no_tags)
 		strvec_push(&ref_prefixes, "refs/tags/");
 
-	refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
+	refs = transport_get_remote_refs(transport, &ref_prefixes,
+					 &unborn_head_target);
 
 	if (refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
@@ -1323,10 +1325,20 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		remote_head = NULL;
 		option_no_checkout = 1;
 		if (!option_bare) {
-			const char *branch = git_default_branch_name();
-			char *ref = xstrfmt("refs/heads/%s", branch);
+			const char *branch;
+			char *ref;
+
+			if (unborn_head_target &&
+			    skip_prefix(unborn_head_target, "refs/heads/", &branch)) {
+				ref = unborn_head_target;
+				unborn_head_target = NULL;
+			} else {
+				branch = git_default_branch_name();
+				ref = xstrfmt("refs/heads/%s", branch);
+			}
 
 			install_branch_config(0, branch, remote_name, ref);
+			create_symref("HEAD", ref, "");
 			free(ref);
 		}
 	}
@@ -1373,6 +1385,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	strbuf_release(&key);
 	junk_mode = JUNK_LEAVE_ALL;
 
+	free(unborn_head_target);
 	strvec_clear(&ref_prefixes);
 	return err;
 }
diff --git a/connect.c b/connect.c
index 99d9052365..3c35324b4c 100644
--- a/connect.c
+++ b/connect.c
@@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 }
 
 /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
-static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
+static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
+			  char **unborn_head_target)
 {
 	int ret = 1;
 	int i = 0;
@@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
 		goto out;
 	}
 
+	if (!strcmp("unborn", line_sections.items[i].string)) {
+		i++;
+		if (unborn_head_target &&
+		    !strcmp("HEAD", line_sections.items[i++].string)) {
+			/*
+			 * Look for the symref target (if any). If found,
+			 * return it to the caller.
+			 */
+			for (; i < line_sections.nr; i++) {
+				const char *arg = line_sections.items[i].string;
+
+				if (skip_prefix(arg, "symref-target:", &arg)) {
+					*unborn_head_target = xstrdup(arg);
+					break;
+				}
+			}
+		}
+		goto out;
+	}
 	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
 	    *end) {
 		ret = 0;
@@ -489,6 +509,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	if (!for_push)
 		packet_write_fmt(fd_out, "peel\n");
 	packet_write_fmt(fd_out, "symrefs\n");
+	if (server_supports_feature("ls-refs", "unborn", 0))
+		packet_write_fmt(fd_out, "unborn\n");
 	for (i = 0; ref_prefixes && i < ref_prefixes->nr; i++) {
 		packet_write_fmt(fd_out, "ref-prefix %s\n",
 				 ref_prefixes->v[i]);
@@ -497,9 +519,7 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
-		if (unborn_head_target)
-			BUG("NEEDSWORK: provide unborn HEAD target to caller while reading refs");
-		if (!process_ref_v2(reader, &list))
+		if (!process_ref_v2(reader, &list, unborn_head_target))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
 
diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
index 7f082fb23b..0111d4e8bd 100755
--- a/t/t5606-clone-options.sh
+++ b/t/t5606-clone-options.sh
@@ -102,11 +102,13 @@ test_expect_success 'redirected clone -v does show progress' '
 '
 
 test_expect_success 'chooses correct default initial branch name' '
-	git init --bare empty &&
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=foo init --bare empty &&
+	test_config -C empty lsrefs.allowUnborn true &&
 	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
 	git -c init.defaultBranch=up clone empty whats-up &&
-	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
-	test refs/heads/up = $(git -C whats-up config branch.up.merge)
+	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
+	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
 '
 
 test_expect_success 'guesses initial branch name correctly' '
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d5b17909b..ed8750fadd 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -209,6 +209,17 @@ test_expect_success 'clone with file:// using protocol v2' '
 	grep "ref-prefix refs/tags/" log
 '
 
+test_expect_success 'clone of empty repo propagates name of default branch' '
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+	test_config -C file_empty_parent lsrefs.allowUnborn true &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
 test_expect_success 'fetch with file:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
-- 
2.29.2.729.g45daf8777d-goog



* Re: [PATCH v4 0/3] Cloning with remote unborn HEAD
  2020-12-22 21:54   ` [PATCH v4 " Jonathan Tan
                       ` (2 preceding siblings ...)
  2020-12-22 21:54     ` [PATCH v4 3/3] clone: respect remote unborn HEAD Jonathan Tan
@ 2020-12-22 22:06     ` Junio C Hamano
  3 siblings, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2020-12-22 22:06 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> Thanks Junio for informing me of the test failures. Turns out that it
> was partly because I didn't memset oid (and in some code paths, it gets
> read without being written to), and partly because I didn't set
> GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME. Here's an updated patch set with
> the fixes.
>
> Jonathan Tan (3):
>   ls-refs: report unborn targets of symrefs
>   connect, transport: add no-op arg for future patch
>   clone: respect remote unborn HEAD

Having to unset the GIT_TEST_* environment even when we have an
explicit "git -c init.defaultBranch=<val>" is a bit awkward and
frustrating, but hopefully this would futureproof the tests for
the current and future world ;-)

Will replace.

>
>  Documentation/config.txt                |  2 +
>  Documentation/config/init.txt           |  2 +-
>  Documentation/config/lsrefs.txt         |  3 ++
>  Documentation/technical/protocol-v2.txt | 10 ++++-
>  builtin/clone.c                         | 19 +++++++--
>  builtin/fetch-pack.c                    |  3 +-
>  builtin/fetch.c                         |  2 +-
>  builtin/ls-remote.c                     |  2 +-
>  builtin/remote.c                        |  2 +-
>  connect.c                               | 29 ++++++++++++--
>  ls-refs.c                               | 52 +++++++++++++++++++++++--
>  ls-refs.h                               |  1 +
>  remote.h                                |  3 +-
>  serve.c                                 |  2 +-
>  t/t5606-clone-options.sh                |  8 ++--
>  t/t5702-protocol-v2.sh                  | 11 ++++++
>  transport-helper.c                      |  7 +++-
>  transport-internal.h                    | 13 +++----
>  transport.c                             | 29 ++++++++------
>  transport.h                             |  7 +++-
>  20 files changed, 164 insertions(+), 43 deletions(-)
>  create mode 100644 Documentation/config/lsrefs.txt
>
> Range-diff against v3:
> 1:  7d20ec323a ! 1:  a66e50626e ls-refs: report unborn targets of symrefs
>     @@ ls-refs.c: static int send_ref(const char *refname, const struct object_id *oid,
>      +	int flag;
>      +	int oid_is_null;
>      +
>     ++	memset(&oid, 0, sizeof(oid));
>      +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
>      +	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
>      +	oid_is_null = is_null_oid(&oid);
> 2:  b5a78857eb = 2:  14f3962adc connect, transport: add no-op arg for future patch
> 3:  c2303dc976 ! 3:  e770fc46eb clone: respect remote unborn HEAD
>     @@ t/t5606-clone-options.sh: test_expect_success 'redirected clone -v does show pro
>       
>       test_expect_success 'chooses correct default initial branch name' '
>      -	git init --bare empty &&
>     ++	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
>      +	git -c init.defaultBranch=foo init --bare empty &&
>      +	test_config -C empty lsrefs.allowUnborn true &&
>       	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
>     @@ t/t5702-protocol-v2.sh: test_expect_success 'clone with file:// using protocol v
>       '
>       
>      +test_expect_success 'clone of empty repo propagates name of default branch' '
>     ++	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
>      +	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
>      +	test_config -C file_empty_parent lsrefs.allowUnborn true &&
>      +
>     ++	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
>      +	git -c init.defaultBranch=main -c protocol.version=2 \
>      +		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
>      +	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD

* Re: [PATCH v3 0/3] Cloning with remote unborn HEAD
  2020-12-21 22:30   ` [PATCH v3 0/3] Cloning with " Jonathan Tan
                       ` (3 preceding siblings ...)
  2020-12-21 23:48     ` [PATCH v3 0/3] Cloning with " Junio C Hamano
@ 2021-01-21 20:14     ` Jeff King
  4 siblings, 0 replies; 109+ messages in thread
From: Jeff King @ 2021-01-21 20:14 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

On Mon, Dec 21, 2020 at 02:30:58PM -0800, Jonathan Tan wrote:

> > So I dunno. My biggest complaint is that the config option defaults to
> > _off_.  So it's helping load-balanced rollouts, but creating complexity
> > for everyone else who might want to enable the feature.
> 
>    So it seems like you're saying that it should default to "on", but at
>    the same time you are talking about enabling the feature (which seems
>    to imply switching it from "off" to "on"). (Also, note that this is a
>    server-side thing; on the client-side, Git will always use what the
>    server gives and there is no option to control this.)

Sorry, I missed this question over the holidays. Yes, what I meant is
that everyone should really want this feature on, because it gives
strictly more information and lets the client be smarter.

But if it defaults to "off", server operators may well not bother to
turn it on (or even know it exists). And the clients who would benefit
may have trouble convincing server operators to do so.

So I would strongly prefer it default to "on", and the onus be on server
operators with non-atomic clusters, etc, to turn it off when deploying
in their environments.

-Peff

* Re: [PATCH v4 1/3] ls-refs: report unborn targets of symrefs
  2020-12-22 21:54     ` [PATCH v4 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
@ 2021-01-21 20:48       ` Jeff King
  2021-01-26 18:13         ` Jonathan Tan
  0 siblings, 1 reply; 109+ messages in thread
From: Jeff King @ 2021-01-21 20:48 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, gitster

On Tue, Dec 22, 2020 at 01:54:18PM -0800, Jonathan Tan wrote:

> -static int ls_refs_config(const char *var, const char *value, void *data)
> +static void send_possibly_unborn_head(struct ls_refs_data *data)
>  {
> +	struct strbuf namespaced = STRBUF_INIT;
> +	struct object_id oid;
> +	int flag;
> +	int oid_is_null;
> +
> +	memset(&oid, 0, sizeof(oid));
> +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
> +	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);

It feels weird to call resolve_ref_unsafe() without checking the return
value. How do we detect errors?

I think the logic is that we make assumptions about which fields it will
touch (i.e., zeroing the flags, and not touching our zero'd oid), and
then check those. That feels a bit non-obvious and intimate with the
implementation, though (and was presumably the source of the "oops, we
need to clear the oid" bug between v3 and v4).

I feel like that deserves a comment, but I also wonder if this would be better:

  refname = resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
  if (!refname)
	return; /* broken, bad name, not even a symref, etc */

  /*
   * now we can look at oid even if we didn't memset() it, because
   * a successful return from resolve_ref_unsafe() means that it
   * has cleared it if appropriate
   */
  oid_is_null = is_null_oid(&oid);
  ...etc...

> +	if (!oid_is_null ||
> +	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
> +		send_ref(namespaced.buf, oid_is_null ? NULL : &oid, flag, data);

It likewise feels a bit funny that we determine the symref name in the
earlier call to resolve_ref_unsafe(), but we do not pass it here (and in
fact, we'll end up looking it up again!).

But that is not much different than what we do for normal refs passed to
the send_ref() callback. It would be nice if the iteration could pass in
"by the way, here is the symref value" to avoid that. But in practice it
isn't a big deal, since we only do the lookup when we see the ISSYMREF
flag set. So typically it is only one or two extra ref resolutions.

> @@ -91,7 +118,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
>  
>  	memset(&data, 0, sizeof(data));
>  
> -	git_config(ls_refs_config, NULL);
> +	git_config(ls_refs_config, &data);

You will probably not be surprised that I would suggest defaulting
data->allow_unborn to 1 before this config call. :)

> @@ -103,14 +130,31 @@ int ls_refs(struct repository *r, struct strvec *keys,
>  			data.symrefs = 1;
>  		else if (skip_prefix(arg, "ref-prefix ", &out))
>  			strvec_push(&data.prefixes, out);
> +		else if (data.allow_unborn && !strcmp("unborn", arg))
> +			data.unborn = 1;
>  	}

So if we have not set allow_unborn, we will not accept the client saying
"unborn". Which makes sense, because we would not have advertised it in
that case.

But we use the same boolean for advertising, too. So this loses the
"allow us to accept it, but not advertise it" logic that your earlier
versions had, doesn't it? And that is the important element for making
things work across a non-atomic deploy of versions.

This straight-boolean version works as long as you can atomically update
the _config_ on each version. But that seems like roughly the same
problem (having dealt with this on GitHub servers, they are not
equivalent, and depending on your infrastructure, it definitely _can_ be
easier to do one versus the other. But it seems like a funny place to
leave this upstream feature).

Or is the intent that an unconfigured reader would silently ignore the
unborn flag in that case? That would at least not cause it to bail on
the client in a mixed-version environment. But it does feel like a
confusing result.

-Peff

* Re: [PATCH v4 2/3] connect, transport: add no-op arg for future patch
  2020-12-22 21:54     ` [PATCH v4 2/3] connect, transport: add no-op arg for future patch Jonathan Tan
@ 2021-01-21 20:55       ` Jeff King
  2021-01-26 18:16         ` Jonathan Tan
  0 siblings, 1 reply; 109+ messages in thread
From: Jeff King @ 2021-01-21 20:55 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, gitster

On Tue, Dec 22, 2020 at 01:54:19PM -0800, Jonathan Tan wrote:

> In a future patch we plan to return the name of an unborn current branch
> from deep in the callchain to a caller via a new pointer parameter that
> points at a variable in the caller when the caller calls
> get_remote_refs() and transport_get_remote_refs(). Add the parameter to
> functions involved in the callchain, but no caller passes an actual
> argument yet in this step. Thus, the future patch only needs to concern
> itself with new logic.

OK. Since the call stack is so deep, it's nice to get all of this diff
noise out of the way of the third patch.

It does make me wonder if we should be passing a struct like:

  struct transport_fetch_options {
	struct strvec ref_prefixes;
	char **unborn_head;
  };
  #define TRANSPORT_FETCH_OPTIONS_INIT { STRVEC_INIT }

which would solve this problem once for any future options.

> @@ -455,7 +455,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  			     struct ref **list, int for_push,
>  			     const struct strvec *ref_prefixes,
>  			     const struct string_list *server_options,
> -			     int stateless_rpc)
> +			     int stateless_rpc,
> +			     char **unborn_head_target)

Is a single string enough? The way the protocol is defined, I think the
server is free to tell us about other unborn symrefs, too (but of course
our implementation does not). And I'm not sure what we'd do with such
values (in a "--mirror" clone, I guess we could make local copies of
them).

Should we be prepared for that at the transport layer, or is it
over-engineering?

-Peff

* Re: [PATCH v4 3/3] clone: respect remote unborn HEAD
  2020-12-22 21:54     ` [PATCH v4 3/3] clone: respect remote unborn HEAD Jonathan Tan
@ 2021-01-21 21:02       ` Jeff King
  2021-01-26 18:22         ` Jonathan Tan
  0 siblings, 1 reply; 109+ messages in thread
From: Jeff King @ 2021-01-21 21:02 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, gitster

On Tue, Dec 22, 2020 at 01:54:20PM -0800, Jonathan Tan wrote:

> @@ -1323,10 +1325,20 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  		remote_head = NULL;
>  		option_no_checkout = 1;
>  		if (!option_bare) {
> -			const char *branch = git_default_branch_name();
> -			char *ref = xstrfmt("refs/heads/%s", branch);
> +			const char *branch;
> +			char *ref;
> +
> +			if (unborn_head_target &&
> +			    skip_prefix(unborn_head_target, "refs/heads/", &branch)) {
> +				ref = unborn_head_target;
> +				unborn_head_target = NULL;
> +			} else {
> +				branch = git_default_branch_name();
> +				ref = xstrfmt("refs/heads/%s", branch);
> +			}
>  
>  			install_branch_config(0, branch, remote_name, ref);
> +			create_symref("HEAD", ref, "");
>  			free(ref);
>  		}

In the old code, we never called create_symref() at all. It makes sense
that we'd do it now when unborn_head_target is not NULL. But what about
in the "else" clause there? Now we're adding an extra create_symref()
call. Who was setting up the HEAD symref before, and are we now doing it
twice?

If we have a valid unborn head, then we alias it to "ref" and we set the
original to NULL. And it gets cleaned up here via free(ref). Makes
sense. It confused me for a moment with this hunk...

> @@ -1373,6 +1385,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	strbuf_release(&key);
>  	junk_mode = JUNK_LEAVE_ALL;
>  
> +	free(unborn_head_target);

...since this line will almost always be free(NULL) as a result (either
there was no unborn head, or we consumed the string already). But it is
covering the case that somebody gave us an unborn_head_target but it
didn't start with refs/heads/. So it's useful to have.

-Peff

* Re: [PATCH v4 1/3] ls-refs: report unborn targets of symrefs
  2021-01-21 20:48       ` Jeff King
@ 2021-01-26 18:13         ` Jonathan Tan
  2021-01-26 23:16           ` Jeff King
  0 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2021-01-26 18:13 UTC (permalink / raw)
  To: peff; +Cc: jonathantanmy, git, gitster

> On Tue, Dec 22, 2020 at 01:54:18PM -0800, Jonathan Tan wrote:
> 
> > -static int ls_refs_config(const char *var, const char *value, void *data)
> > +static void send_possibly_unborn_head(struct ls_refs_data *data)
> >  {
> > +	struct strbuf namespaced = STRBUF_INIT;
> > +	struct object_id oid;
> > +	int flag;
> > +	int oid_is_null;
> > +
> > +	memset(&oid, 0, sizeof(oid));
> > +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
> > +	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
> 
> It feels weird to call resolve_ref_unsafe() without checking the return
> value. How do we detect errors?
> 
> I think the logic is that we make assumptions about which fields it will
> touch (i.e., zeroing the flags, and not touching our zero'd oid), and
> then check those. That feels a bit non-obvious and intimate with the
> implementation, though (and was presumably the source of the "oops, we
> need to clear the oid" bug between v3 and v4).
> 
> I feel like that deserves a comment, but I also wonder if:
> 
>   refname = resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
>   if (!refname)
> 	return; /* broken, bad name, not even a symref, etc */

From my reading of this part of refs_resolve_ref_unsafe():

                if (!(read_flags & REF_ISSYMREF)) {
                        if (*flags & REF_BAD_NAME) {
                                oidclr(oid);
                                *flags |= REF_ISBROKEN;
                        }
                        return refname;
                }

it seems that resolve_ref_unsafe() returns non-NULL if the ref is not a
symref but is otherwise valid. But this is exactly what we want -
send_possibly_unborn_head() must send HEAD in this situation anyway.
Thanks - I've switched to checking the return value.

(It was a bit confusing that refs_resolve_ref_unsafe() returns one of
its input arguments if it succeeds and NULL if it fails, but that's
outside the scope of this patch, I think.)

> > +	if (!oid_is_null ||
> > +	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
> > +		send_ref(namespaced.buf, oid_is_null ? NULL : &oid, flag, data);
> 
> It likewise feels a bit funny that we determine the symref name in the
> earlier call to resolve_ref_unsafe(), but we do not pass it here (and in
> fact, we'll end up looking it up again!).
> 
> But that is not much different than what we do for normal refs passed to
> the send_ref() callback. It would be nice if the iteration could pass in
> "by the way, here is the symref value" to avoid that.

Yes, that would be nice.

> But in practice it
> isn't a big deal, since we only do the lookup when we see the ISSYMREF
> flag set. So typically it is only one or two extra ref resolutions.

OK.

> > @@ -91,7 +118,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
> >  
> >  	memset(&data, 0, sizeof(data));
> >  
> > -	git_config(ls_refs_config, NULL);
> > +	git_config(ls_refs_config, &data);
> 
> You will probably not be surprised that I would suggest defaulting
> data->allow_unborn to 1 before this config call. :)

I don't think many people have made comments either way, so I'll go
ahead with defaulting it to true. I can see arguments for both sides.

> > @@ -103,14 +130,31 @@ int ls_refs(struct repository *r, struct strvec *keys,
> >  			data.symrefs = 1;
> >  		else if (skip_prefix(arg, "ref-prefix ", &out))
> >  			strvec_push(&data.prefixes, out);
> > +		else if (data.allow_unborn && !strcmp("unborn", arg))
> > +			data.unborn = 1;
> >  	}
> 
> So if we have not set allow_unborn, we will not accept the client saying
> "unborn". Which makes sense, because we would not have advertised it in
> that case.
> 
> But we use the same boolean for advertising, too. So this loses the
> "allow us to accept it, but not advertise it" logic that your earlier
> versions had, doesn't it?

Yes, it does.

> And that is the important element for making
> things work across a non-atomic deploy of versions.
> 
> This straight-boolean version works as long as you can atomically update
> the _config_ on each version. But that seems like roughly the same
> problem (having dealt with this on GitHub servers, they are not
> equivalent, and depending on your infrastructure, it definitely _can_ be
> easier to do one versus the other. But it seems like a funny place to
> leave this upstream feature).

Well, I was just agreeing with what you said [1]. :-)

[1] https://lore.kernel.org/git/X9xJLWdFJfNJTn0p@coredump.intra.peff.net/

> Or is the intent that an unconfigured reader would silently ignore the
> unborn flag in that case? That would at least not cause it to bail on
> the client in a mixed-version environment. But it does feel like a
> confusing result.

Right now, an old server would ignore "unborn", yes. I'm not sure of
what the intent should be - tightening ls-refs and fetch to forbid
unknown arguments seems like a good idea to me.

* Re: [PATCH v4 2/3] connect, transport: add no-op arg for future patch
  2021-01-21 20:55       ` Jeff King
@ 2021-01-26 18:16         ` Jonathan Tan
  0 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-01-26 18:16 UTC (permalink / raw)
  To: peff; +Cc: jonathantanmy, git, gitster

> On Tue, Dec 22, 2020 at 01:54:19PM -0800, Jonathan Tan wrote:
> 
> > In a future patch we plan to return the name of an unborn current branch
> > from deep in the callchain to a caller via a new pointer parameter that
> > points at a variable in the caller when the caller calls
> > get_remote_refs() and transport_get_remote_refs(). Add the parameter to
> > functions involved in the callchain, but no caller passes an actual
> > argument yet in this step. Thus, the future patch only needs to concern
> > itself with new logic.
> 
> OK. Since the call stack is so deep, it's nice to get all of this diff
> noise out of the way of the third patch.
> 
> It does make me wonder if we should be passing a struct like:
> 
>   struct transport_fetch_options {
> 	struct strvec ref_prefixes;
> 	char **unborn_head;
>   };
>   #define TRANSPORT_FETCH_OPTIONS_INIT { STRVEC_INIT }
> 
> which would solve this problem once for any future options.

That's a good idea, and I've switched patch 2 to doing this. It also
makes it easier to explain (no "unborn_head" dummy variable that does
nothing, since I can just introduce "unborn_head" in patch 3).

> > @@ -455,7 +455,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> >  			     struct ref **list, int for_push,
> >  			     const struct strvec *ref_prefixes,
> >  			     const struct string_list *server_options,
> > -			     int stateless_rpc)
> > +			     int stateless_rpc,
> > +			     char **unborn_head_target)
> 
> Is a single string enough? The way the protocol is defined, I think the
> server is free to tell us about other unborn symrefs, too (but of course
> our implementation does not). And I'm not sure what we'd do with such
> values (in a "--mirror" clone, I guess we could make local copies of
> them).
> 
> Should we be prepared for that at the transport layer, or is it
> over-engineering?

With the struct, I think we're prepared for it.

* Re: [PATCH v4 3/3] clone: respect remote unborn HEAD
  2021-01-21 21:02       ` Jeff King
@ 2021-01-26 18:22         ` Jonathan Tan
  2021-01-26 23:04           ` Jeff King
  0 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2021-01-26 18:22 UTC (permalink / raw)
  To: peff; +Cc: jonathantanmy, git, gitster

> On Tue, Dec 22, 2020 at 01:54:20PM -0800, Jonathan Tan wrote:
> 
> > @@ -1323,10 +1325,20 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
> >  		remote_head = NULL;
> >  		option_no_checkout = 1;
> >  		if (!option_bare) {
> > -			const char *branch = git_default_branch_name();
> > -			char *ref = xstrfmt("refs/heads/%s", branch);
> > +			const char *branch;
> > +			char *ref;
> > +
> > +			if (unborn_head_target &&
> > +			    skip_prefix(unborn_head_target, "refs/heads/", &branch)) {
> > +				ref = unborn_head_target;
> > +				unborn_head_target = NULL;
> > +			} else {
> > +				branch = git_default_branch_name();
> > +				ref = xstrfmt("refs/heads/%s", branch);
> > +			}
> >  
> >  			install_branch_config(0, branch, remote_name, ref);
> > +			create_symref("HEAD", ref, "");
> >  			free(ref);
> >  		}
> 
> In the old code, we never called create_symref() at all. It makes sense
> that we'd do it now when unborn_head_target is not NULL. But what about
> in the "else" clause there? Now we're adding an extra create_symref()
> call.

The "else" branch you're referring to is the one enclosing all of the
lines quoted above, I believe?

> Who was setting up the HEAD symref before, and are we now doing it
> twice?

It was init_db(). Yes, we are now setting it once in init_db() and
setting it again, but this is the same as in the regular clone (as can
be seen by the invocation of update_head() that sets HEAD in some
situations).

> If we have a valid unborn head, then we alias it to "ref" and we set the
> original to NULL. And it gets cleaned up here via free(ref). Makes
> sense. It confused me for a moment with this hunk...
> 
> > @@ -1373,6 +1385,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
> >  	strbuf_release(&key);
> >  	junk_mode = JUNK_LEAVE_ALL;
> >  
> > +	free(unborn_head_target);
> 
> ...since this line will almost always be free(NULL) as a result (either
> there was no unborn head, or we consumed the string already). But it is
> covering the case that somebody gave us an unborn_head_target but it
> didn't start with refs/heads/. So it's useful to have.

Yes.

Thanks for your review.

* [PATCH v5 0/3] Cloning with remote unborn HEAD
  2020-12-08  1:31 Cloning empty repository uses locally configured default branch name Jonathan Tan
                   ` (2 preceding siblings ...)
  2020-12-11 21:05 ` [PATCH] clone: in protocol v2, use remote's default branch Jonathan Tan
@ 2021-01-26 18:55 ` Jonathan Tan
  2021-01-26 18:55   ` [PATCH v5 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
                     ` (4 more replies)
  2021-02-02  2:14 ` [PATCH v6 " Jonathan Tan
                   ` (2 subsequent siblings)
  6 siblings, 5 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-01-26 18:55 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan

Thanks, Peff, for your review. I have addressed your comments (through
replies to your emails and here in this v5 patch set).

Jonathan Tan (3):
  ls-refs: report unborn targets of symrefs
  connect, transport: encapsulate arg in struct
  clone: respect remote unborn HEAD

 Documentation/config.txt                |  2 +
 Documentation/config/init.txt           |  2 +-
 Documentation/config/lsrefs.txt         |  3 ++
 Documentation/technical/protocol-v2.txt | 10 ++++-
 builtin/clone.c                         | 34 +++++++++++-----
 builtin/fetch-pack.c                    |  3 +-
 builtin/fetch.c                         | 18 +++++----
 builtin/ls-remote.c                     |  9 +++--
 connect.c                               | 32 +++++++++++++--
 ls-refs.c                               | 53 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 remote.h                                |  4 +-
 serve.c                                 |  2 +-
 t/t5606-clone-options.sh                |  8 ++--
 t/t5701-git-serve.sh                    |  2 +-
 t/t5702-protocol-v2.sh                  | 25 ++++++++++++
 transport-helper.c                      |  5 ++-
 transport-internal.h                    |  9 +----
 transport.c                             | 23 ++++++-----
 transport.h                             | 29 ++++++++++----
 20 files changed, 210 insertions(+), 64 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt

Range-diff against v4:
1:  d7d2ba597e ! 1:  32e16dfdbd ls-refs: report unborn targets of symrefs
    @@ Commit message
     
         Currently, symrefs that have unborn targets (such as in this case) are
         not communicated by the protocol. Teach Git to advertise and support the
    -    "unborn" feature in "ls-refs" (guarded by the lsrefs.allowunborn
    +    "unborn" feature in "ls-refs" (by default, this is advertised, but
    +    server administrators may turn this off through the lsrefs.allowunborn
         config). This feature indicates that "ls-refs" supports the "unborn"
         argument; when it is specified, "ls-refs" will send the HEAD symref with
         the name of its unborn target.
    @@ ls-refs.c: static int send_ref(const char *refname, const struct object_id *oid,
     +	int flag;
     +	int oid_is_null;
     +
    -+	memset(&oid, 0, sizeof(oid));
     +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
    -+	resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag);
    ++	if (!resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag))
    ++		return; /* bad ref */
     +	oid_is_null = is_null_oid(&oid);
     +	if (!oid_is_null ||
     +	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
    @@ ls-refs.c: int ls_refs(struct repository *r, struct strvec *keys,
      	memset(&data, 0, sizeof(data));
      
     -	git_config(ls_refs_config, NULL);
    ++	data.allow_unborn = 1;
     +	git_config(ls_refs_config, &data);
      
      	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
    @@ ls-refs.c: int ls_refs(struct repository *r, struct strvec *keys,
     +	if (value) {
     +		int allow_unborn_value;
     +
    -+		if (!repo_config_get_bool(the_repository,
    ++		if (repo_config_get_bool(the_repository,
     +					 "lsrefs.allowunborn",
    -+					 &allow_unborn_value) &&
    ++					 &allow_unborn_value) ||
     +		    allow_unborn_value)
     +			strbuf_addstr(value, "unborn");
     +	}
    @@ serve.c: struct protocol_capability {
      	{ "fetch", upload_pack_advertise, upload_pack_v2 },
      	{ "server-option", always_advertise, NULL },
      	{ "object-format", object_format_advertise, NULL },
    +
    + ## t/t5701-git-serve.sh ##
    +@@ t/t5701-git-serve.sh: test_expect_success 'test capability advertisement' '
    + 	cat >expect <<-EOF &&
    + 	version 2
    + 	agent=git/$(git version | cut -d" " -f3)
    +-	ls-refs
    ++	ls-refs=unborn
    + 	fetch=shallow
    + 	server-option
    + 	object-format=$(test_oid algo)
2:  51d8a359c7 < -:  ---------- connect, transport: add no-op arg for future patch
-:  ---------- > 2:  4eec551668 connect, transport: encapsulate arg in struct
3:  896be550f1 ! 3:  922e8c229c clone: respect remote unborn HEAD
    @@ Documentation/config/init.txt: init.templateDir::
     +	a new repository.
     
      ## builtin/clone.c ##
    -@@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
    - 	int submodule_progress;
    - 
    - 	struct strvec ref_prefixes = STRVEC_INIT;
    -+	char *unborn_head_target = NULL;
    - 
    - 	packet_trace_identity("clone");
    - 
    -@@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
    - 	if (!option_no_tags)
    - 		strvec_push(&ref_prefixes, "refs/tags/");
    - 
    --	refs = transport_get_remote_refs(transport, &ref_prefixes, NULL);
    -+	refs = transport_get_remote_refs(transport, &ref_prefixes,
    -+					 &unborn_head_target);
    - 
    - 	if (refs) {
    - 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
     @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
      		remote_head = NULL;
      		option_no_checkout = 1;
    @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
     +			const char *branch;
     +			char *ref;
     +
    -+			if (unborn_head_target &&
    -+			    skip_prefix(unborn_head_target, "refs/heads/", &branch)) {
    -+				ref = unborn_head_target;
    -+				unborn_head_target = NULL;
    ++			if (transport_ls_refs_options.unborn_head_target &&
    ++			    skip_prefix(transport_ls_refs_options.unborn_head_target,
    ++					"refs/heads/", &branch)) {
    ++				ref = transport_ls_refs_options.unborn_head_target;
    ++				transport_ls_refs_options.unborn_head_target = NULL;
     +			} else {
     +				branch = git_default_branch_name();
     +				ref = xstrfmt("refs/heads/%s", branch);
    @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
      		}
      	}
     @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
    - 	strbuf_release(&key);
      	junk_mode = JUNK_LEAVE_ALL;
      
    -+	free(unborn_head_target);
    - 	strvec_clear(&ref_prefixes);
    + 	strvec_clear(&transport_ls_refs_options.ref_prefixes);
    ++	free(transport_ls_refs_options.unborn_head_target);
      	return err;
      }
     
    @@ connect.c: static int process_ref_v2(struct packet_reader *reader, struct ref **
      	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
      	    *end) {
      		ret = 0;
    +@@ connect.c: struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
    + 	const char *hash_name;
    + 	struct strvec *ref_prefixes = transport_options ?
    + 		&transport_options->ref_prefixes : NULL;
    ++	char **unborn_head_target = transport_options ?
    ++		&transport_options->unborn_head_target : NULL;
    + 	*list = NULL;
    + 
    + 	if (server_supports_v2("ls-refs", 1))
     @@ connect.c: struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
      	if (!for_push)
      		packet_write_fmt(fd_out, "peel\n");
    @@ connect.c: struct ref **get_remote_refs(int fd_out, struct packet_reader *reader
      
      	/* Process response from server */
      	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
    --		if (unborn_head_target)
    --			BUG("NEEDSWORK: provide unborn HEAD target to caller while reading refs");
     -		if (!process_ref_v2(reader, &list))
     +		if (!process_ref_v2(reader, &list, unborn_head_target))
      			die(_("invalid ls-refs response: %s"), reader->line);
    @@ t/t5702-protocol-v2.sh: test_expect_success 'clone with file:// using protocol v
      '
      
     +test_expect_success 'clone of empty repo propagates name of default branch' '
    ++	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
    ++
     +	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
     +	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
    -+	test_config -C file_empty_parent lsrefs.allowUnborn true &&
     +
     +	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
     +	git -c init.defaultBranch=main -c protocol.version=2 \
     +		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
     +	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
     +'
    ++
    ++test_expect_success '...but not if explicitly forbidden by config' '
    ++	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
    ++
    ++	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
    ++	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
    ++	test_config -C file_empty_parent lsrefs.allowUnborn false &&
    ++
    ++	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
    ++	git -c init.defaultBranch=main -c protocol.version=2 \
    ++		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
    ++	! grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
    ++'
     +
      test_expect_success 'fetch with file:// using protocol v2' '
      	test_when_finished "rm -f log" &&
      
    +
    + ## transport.h ##
    +@@ transport.h: struct transport_ls_refs_options {
    + 	 * provided ref_prefixes.
    + 	 */
    + 	struct strvec ref_prefixes;
    ++
    ++	/*
    ++	 * If unborn_head_target is not NULL, and the remote reports HEAD as
    ++	 * pointing to an unborn branch, transport_get_remote_refs() stores the
    ++	 * unborn branch in unborn_head_target. It should be freed by the
    ++	 * caller.
    ++	 */
    ++	char *unborn_head_target;
    + };
    + #define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }
    + 
-- 
2.30.0.280.ga3ce27912f-goog



* [PATCH v5 1/3] ls-refs: report unborn targets of symrefs
  2021-01-26 18:55 ` [PATCH v5 " Jonathan Tan
@ 2021-01-26 18:55   ` Jonathan Tan
  2021-01-26 21:38     ` Junio C Hamano
  2021-01-27  1:28     ` Ævar Arnfjörð Bjarmason
  2021-01-26 18:55   ` [PATCH v5 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
                     ` (3 subsequent siblings)
  4 siblings, 2 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-01-26 18:55 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, Junio C Hamano

When cloning, we choose the default branch based on the remote HEAD.
But if there is no remote HEAD reported (which could happen if the
target of the remote HEAD is unborn), we'll fall back to using our local
init.defaultBranch. Traditionally this hasn't been a big deal, because
most repos used "master" as the default. But these days it is likely to
cause confusion if the server and client implementations choose
different values (e.g., if the remote started with "main", we may choose
"master" locally, create commits there, and then the user is surprised
when they push to "master" and not "main").

To solve this, the remote needs to communicate the target of the HEAD
symref, even if it is unborn, and "git clone" needs to use this
information.

Currently, symrefs that have unborn targets (such as in this case) are
not communicated by the protocol. Teach Git to advertise and support the
"unborn" feature in "ls-refs" (by default, this is advertised, but
server administrators may turn this off through the lsrefs.allowUnborn
config). This feature indicates that "ls-refs" supports the "unborn"
argument; when it is specified, "ls-refs" will send the HEAD symref with
the name of its unborn target.
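
As a rough illustration of the wire format described above, the sketch
below formats a single ls-refs response line. The helper name
format_ref_line() and its signature are hypothetical and are not part of
Git; the actual change is in send_ref() in ls-refs.c. A NULL object ID
stands for an unborn target, in which case the literal "unborn" takes
the place of the hex object ID.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not Git's send_ref()): format one ls-refs line.
 * A NULL oid_hex means the ref's target is unborn, so the literal
 * "unborn" is written in place of the object ID. symref_target may be
 * NULL when the ref is not a symref.
 * Returns what snprintf() returns.
 */
int format_ref_line(char *buf, size_t len, const char *oid_hex,
		    const char *refname, const char *symref_target)
{
	const char *id = oid_hex ? oid_hex : "unborn";

	if (symref_target)
		return snprintf(buf, len, "%s %s symref-target:%s\n",
				id, refname, symref_target);
	return snprintf(buf, len, "%s %s\n", id, refname);
}
```

Called with a NULL object ID, the refname "HEAD" and the target
"refs/heads/main", this writes the line
"unborn HEAD symref-target:refs/heads/main".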

This change is only for protocol v2. A similar change for protocol v0
would require independent protocol design (there being no analogous
position to signal support for "unborn") and client-side plumbing of the
data required, so the scope of this patch set is limited to protocol v2.

The client side will be updated to use this in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt                |  2 +
 Documentation/config/lsrefs.txt         |  3 ++
 Documentation/technical/protocol-v2.txt | 10 ++++-
 ls-refs.c                               | 53 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 serve.c                                 |  2 +-
 t/t5701-git-serve.sh                    |  2 +-
 7 files changed, 66 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6ba50b1104..d08e83a148 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -398,6 +398,8 @@ include::config/interactive.txt[]
 
 include::config/log.txt[]
 
+include::config/lsrefs.txt[]
+
 include::config/mailinfo.txt[]
 
 include::config/mailmap.txt[]
diff --git a/Documentation/config/lsrefs.txt b/Documentation/config/lsrefs.txt
new file mode 100644
index 0000000000..dcbec11aaa
--- /dev/null
+++ b/Documentation/config/lsrefs.txt
@@ -0,0 +1,3 @@
+lsrefs.allowUnborn::
+	Allow the server to send information about unborn symrefs during the
+	protocol v2 ref advertisement.
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 85daeb5d9e..4707511c10 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -192,11 +192,19 @@ ls-refs takes in the following arguments:
 	When specified, only references having a prefix matching one of
 	the provided prefixes are displayed.
 
+If the 'unborn' feature is advertised the following argument can be
+included in the client's request.
+
+    unborn
+	The server may send symrefs pointing to unborn branches in the form
+	"unborn <refname> symref-target:<target>".
+
 The output of ls-refs is as follows:
 
     output = *ref
 	     flush-pkt
-    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+    obj-id-or-unborn = (obj-id | "unborn")
+    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
     ref-attribute = (symref | peeled)
     symref = "symref-target:" symref-target
     peeled = "peeled:" obj-id
diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..4077adeb6a 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -32,6 +32,8 @@ struct ls_refs_data {
 	unsigned peel;
 	unsigned symrefs;
 	struct strvec prefixes;
+	unsigned allow_unborn : 1;
+	unsigned unborn : 1;
 };
 
 static int send_ref(const char *refname, const struct object_id *oid,
@@ -47,7 +49,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	if (!ref_match(&data->prefixes, refname_nons))
 		return 0;
 
-	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (oid)
+		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	else
+		strbuf_addf(&refline, "unborn %s", refname_nons);
 	if (data->symrefs && flag & REF_ISSYMREF) {
 		struct object_id unused;
 		const char *symref_target = resolve_ref_unsafe(refname, 0,
@@ -74,8 +79,30 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
-static int ls_refs_config(const char *var, const char *value, void *data)
+static void send_possibly_unborn_head(struct ls_refs_data *data)
 {
+	struct strbuf namespaced = STRBUF_INIT;
+	struct object_id oid;
+	int flag;
+	int oid_is_null;
+
+	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
+	if (!resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag))
+		return; /* bad ref */
+	oid_is_null = is_null_oid(&oid);
+	if (!oid_is_null ||
+	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
+		send_ref(namespaced.buf, oid_is_null ? NULL : &oid, flag, data);
+	strbuf_release(&namespaced);
+}
+
+static int ls_refs_config(const char *var, const char *value, void *cb_data)
+{
+	struct ls_refs_data *data = cb_data;
+
+	if (!strcmp("lsrefs.allowunborn", var))
+		data->allow_unborn = git_config_bool(var, value);
+
 	/*
 	 * We only serve fetches over v2 for now, so respect only "uploadpack"
 	 * config. This may need to eventually be expanded to "receive", but we
@@ -91,7 +118,8 @@ int ls_refs(struct repository *r, struct strvec *keys,
 
 	memset(&data, 0, sizeof(data));
 
-	git_config(ls_refs_config, NULL);
+	data.allow_unborn = 1;
+	git_config(ls_refs_config, &data);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
 		const char *arg = request->line;
@@ -103,14 +131,31 @@ int ls_refs(struct repository *r, struct strvec *keys,
 			data.symrefs = 1;
 		else if (skip_prefix(arg, "ref-prefix ", &out))
 			strvec_push(&data.prefixes, out);
+		else if (data.allow_unborn && !strcmp("unborn", arg))
+			data.unborn = 1;
 	}
 
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ls-refs arguments"));
 
-	head_ref_namespaced(send_ref, &data);
+	send_possibly_unborn_head(&data);
 	for_each_namespaced_ref(send_ref, &data);
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
 	return 0;
 }
+
+int ls_refs_advertise(struct repository *r, struct strbuf *value)
+{
+	if (value) {
+		int allow_unborn_value;
+
+		if (repo_config_get_bool(the_repository,
+					 "lsrefs.allowunborn",
+					 &allow_unborn_value) ||
+		    allow_unborn_value)
+			strbuf_addstr(value, "unborn");
+	}
+
+	return 1;
+}
diff --git a/ls-refs.h b/ls-refs.h
index 7b33a7c6b8..a99e4be0bd 100644
--- a/ls-refs.h
+++ b/ls-refs.h
@@ -6,5 +6,6 @@ struct strvec;
 struct packet_reader;
 int ls_refs(struct repository *r, struct strvec *keys,
 	    struct packet_reader *request);
+int ls_refs_advertise(struct repository *r, struct strbuf *value);
 
 #endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index eec2fe6f29..ac20c72763 100644
--- a/serve.c
+++ b/serve.c
@@ -73,7 +73,7 @@ struct protocol_capability {
 
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
-	{ "ls-refs", always_advertise, ls_refs },
+	{ "ls-refs", ls_refs_advertise, ls_refs },
 	{ "fetch", upload_pack_advertise, upload_pack_v2 },
 	{ "server-option", always_advertise, NULL },
 	{ "object-format", object_format_advertise, NULL },
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index a1f5fdc9fd..df29504161 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -12,7 +12,7 @@ test_expect_success 'test capability advertisement' '
 	cat >expect <<-EOF &&
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
-	ls-refs
+	ls-refs=unborn
 	fetch=shallow
 	server-option
 	object-format=$(test_oid algo)
-- 
2.30.0.280.ga3ce27912f-goog



* [PATCH v5 2/3] connect, transport: encapsulate arg in struct
  2021-01-26 18:55 ` [PATCH v5 " Jonathan Tan
  2021-01-26 18:55   ` [PATCH v5 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
@ 2021-01-26 18:55   ` Jonathan Tan
  2021-01-26 21:54     ` Junio C Hamano
  2021-01-26 18:55   ` [PATCH v5 3/3] clone: respect remote unborn HEAD Jonathan Tan
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2021-01-26 18:55 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, Junio C Hamano

A future patch will return the name of an unborn current branch from
deep in the callchain to the caller of get_remote_refs() and
transport_get_remote_refs(), via a new pointer parameter that points at
a variable in that caller.

In preparation for that, encapsulate the existing ref_prefixes
parameter into a struct. The aforementioned unborn current branch will
go into this new struct in the future patch.
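
The refactoring pattern can be sketched in isolation as follows. All
names here (prefix_list, list_refs_options, LIST_REFS_OPTIONS_INIT,
count_prefixes) are hypothetical stand-ins chosen for the example, not
Git's strvec or transport_ls_refs_options. The point is only that a
struct with an initializer macro lets later patches add fields without
touching every function signature again.

```c
#include <stddef.h>

/* Hypothetical stand-in for Git's strvec; only what the sketch needs. */
struct prefix_list {
	const char **v;
	size_t nr;
};

/*
 * Hypothetical options struct: today it wraps a single member, but new
 * members can be appended later without changing any caller's
 * argument list.
 */
struct list_refs_options {
	struct prefix_list ref_prefixes;
};
#define LIST_REFS_OPTIONS_INIT { { NULL, 0 } }

/*
 * A callee that accepts the struct. A NULL pointer means "no options",
 * mirroring how get_remote_refs() treats a NULL transport_options.
 */
size_t count_prefixes(const struct list_refs_options *opts)
{
	return opts ? opts->ref_prefixes.nr : 0;
}
```

A caller declares the struct with the initializer macro, fills in the
members it cares about, and passes a pointer down the callchain.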

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/clone.c      | 18 +++++++++++-------
 builtin/fetch-pack.c |  3 ++-
 builtin/fetch.c      | 18 +++++++++++-------
 builtin/ls-remote.c  |  9 +++++----
 connect.c            |  4 +++-
 remote.h             |  4 +++-
 transport-helper.c   |  5 +++--
 transport-internal.h |  9 +--------
 transport.c          | 23 ++++++++++++-----------
 transport.h          | 21 ++++++++++++++-------
 10 files changed, 65 insertions(+), 49 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index a5630337e4..211d4f54b0 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -979,7 +979,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	int err = 0, complete_refs_before_fetch = 1;
 	int submodule_progress;
 
-	struct strvec ref_prefixes = STRVEC_INIT;
+	struct transport_ls_refs_options transport_ls_refs_options =
+		TRANSPORT_LS_REFS_OPTIONS_INIT;
 
 	packet_trace_identity("clone");
 
@@ -1257,14 +1258,17 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		transport->smart_options->check_self_contained_and_connected = 1;
 
 
-	strvec_push(&ref_prefixes, "HEAD");
-	refspec_ref_prefixes(&remote->fetch, &ref_prefixes);
+	strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
+	refspec_ref_prefixes(&remote->fetch,
+			     &transport_ls_refs_options.ref_prefixes);
 	if (option_branch)
-		expand_ref_prefix(&ref_prefixes, option_branch);
+		expand_ref_prefix(&transport_ls_refs_options.ref_prefixes,
+				  option_branch);
 	if (!option_no_tags)
-		strvec_push(&ref_prefixes, "refs/tags/");
+		strvec_push(&transport_ls_refs_options.ref_prefixes,
+			    "refs/tags/");
 
-	refs = transport_get_remote_refs(transport, &ref_prefixes);
+	refs = transport_get_remote_refs(transport, &transport_ls_refs_options);
 
 	if (refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
@@ -1380,6 +1384,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	strbuf_release(&key);
 	junk_mode = JUNK_LEAVE_ALL;
 
-	strvec_clear(&ref_prefixes);
+	strvec_clear(&transport_ls_refs_options.ref_prefixes);
 	return err;
 }
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 58b7c1fbdc..c2d96f4c89 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	version = discover_version(&reader);
 	switch (version) {
 	case protocol_v2:
-		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
+		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
+				args.stateless_rpc);
 		break;
 	case protocol_v1:
 	case protocol_v0:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index ecf8537605..837382ef4f 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1349,7 +1349,8 @@ static int do_fetch(struct transport *transport,
 	int autotags = (transport->remote->fetch_tags == 1);
 	int retcode = 0;
 	const struct ref *remote_refs;
-	struct strvec ref_prefixes = STRVEC_INIT;
+	struct transport_ls_refs_options transport_ls_refs_options =
+		TRANSPORT_LS_REFS_OPTIONS_INIT;
 	int must_list_refs = 1;
 
 	if (tags == TAGS_DEFAULT) {
@@ -1369,7 +1370,7 @@ static int do_fetch(struct transport *transport,
 	if (rs->nr) {
 		int i;
 
-		refspec_ref_prefixes(rs, &ref_prefixes);
+		refspec_ref_prefixes(rs, &transport_ls_refs_options.ref_prefixes);
 
 		/*
 		 * We can avoid listing refs if all of them are exact
@@ -1383,22 +1384,25 @@ static int do_fetch(struct transport *transport,
 			}
 		}
 	} else if (transport->remote && transport->remote->fetch.nr)
-		refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
+		refspec_ref_prefixes(&transport->remote->fetch,
+				     &transport_ls_refs_options.ref_prefixes);
 
 	if (tags == TAGS_SET || tags == TAGS_DEFAULT) {
 		must_list_refs = 1;
-		if (ref_prefixes.nr)
-			strvec_push(&ref_prefixes, "refs/tags/");
+		if (transport_ls_refs_options.ref_prefixes.nr)
+			strvec_push(&transport_ls_refs_options.ref_prefixes,
+				    "refs/tags/");
 	}
 
 	if (must_list_refs) {
 		trace2_region_enter("fetch", "remote_refs", the_repository);
-		remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
+		remote_refs = transport_get_remote_refs(transport,
+							&transport_ls_refs_options);
 		trace2_region_leave("fetch", "remote_refs", the_repository);
 	} else
 		remote_refs = NULL;
 
-	strvec_clear(&ref_prefixes);
+	strvec_clear(&transport_ls_refs_options.ref_prefixes);
 
 	ref_map = get_ref_map(transport->remote, remote_refs, rs,
 			      tags, &autotags);
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index 092917eca2..ef604752a0 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -45,7 +45,8 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	int show_symref_target = 0;
 	const char *uploadpack = NULL;
 	const char **pattern = NULL;
-	struct strvec ref_prefixes = STRVEC_INIT;
+	struct transport_ls_refs_options transport_options =
+		TRANSPORT_LS_REFS_OPTIONS_INIT;
 	int i;
 	struct string_list server_options = STRING_LIST_INIT_DUP;
 
@@ -94,9 +95,9 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	}
 
 	if (flags & REF_TAGS)
-		strvec_push(&ref_prefixes, "refs/tags/");
+		strvec_push(&transport_options.ref_prefixes, "refs/tags/");
 	if (flags & REF_HEADS)
-		strvec_push(&ref_prefixes, "refs/heads/");
+		strvec_push(&transport_options.ref_prefixes, "refs/heads/");
 
 	remote = remote_get(dest);
 	if (!remote) {
@@ -118,7 +119,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (server_options.nr)
 		transport->server_options = &server_options;
 
-	ref = transport_get_remote_refs(transport, &ref_prefixes);
+	ref = transport_get_remote_refs(transport, &transport_options);
 	if (ref) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 		repo_set_hash_algo(the_repository, hash_algo);
diff --git a/connect.c b/connect.c
index 8b8f56cf6d..328c279250 100644
--- a/connect.c
+++ b/connect.c
@@ -453,12 +453,14 @@ void check_stateless_delimiter(int stateless_rpc,
 
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
-			     const struct strvec *ref_prefixes,
+			     struct transport_ls_refs_options *transport_options,
 			     const struct string_list *server_options,
 			     int stateless_rpc)
 {
 	int i;
 	const char *hash_name;
+	struct strvec *ref_prefixes = transport_options ?
+		&transport_options->ref_prefixes : NULL;
 	*list = NULL;
 
 	if (server_supports_v2("ls-refs", 1))
diff --git a/remote.h b/remote.h
index 3211abdf05..4ae676a11b 100644
--- a/remote.h
+++ b/remote.h
@@ -6,6 +6,8 @@
 #include "hashmap.h"
 #include "refspec.h"
 
+struct transport_ls_refs_options;
+
 /**
  * The API gives access to the configuration related to remotes. It handles
  * all three configuration mechanisms historically and currently used by Git,
@@ -196,7 +198,7 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 /* Used for protocol v2 in order to retrieve refs from a remote */
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
-			     const struct strvec *ref_prefixes,
+			     struct transport_ls_refs_options *transport_options,
 			     const struct string_list *server_options,
 			     int stateless_rpc);
 
diff --git a/transport-helper.c b/transport-helper.c
index 5f6e0b3bd8..49b7fb4dcb 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1162,13 +1162,14 @@ static int has_attribute(const char *attrs, const char *attr)
 }
 
 static struct ref *get_refs_list(struct transport *transport, int for_push,
-				 const struct strvec *ref_prefixes)
+				 struct transport_ls_refs_options *transport_options)
 {
 	get_helper(transport);
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push, ref_prefixes);
+		return transport->vtable->get_refs_list(transport, for_push,
+							transport_options);
 	}
 
 	return get_refs_list_using_list(transport, for_push);
diff --git a/transport-internal.h b/transport-internal.h
index 27c9daffc4..348daad3e4 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -18,19 +18,12 @@ struct transport_vtable {
 	 * the transport to try to share connections, for_push is a
 	 * hint as to whether the ultimate operation is a push or a fetch.
 	 *
-	 * If communicating using protocol v2 a list of prefixes can be
-	 * provided to be sent to the server to enable it to limit the ref
-	 * advertisement.  Since ref filtering is done on the server's end, and
-	 * only when using protocol v2, this list will be ignored when not
-	 * using protocol v2 meaning this function can return refs which don't
-	 * match the provided ref_prefixes.
-	 *
 	 * If the transport is able to determine the remote hash for
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
 	 **/
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
-				     const struct strvec *ref_prefixes);
+				     struct transport_ls_refs_options *transport_options);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index 679a35e7c1..b13fab5dc3 100644
--- a/transport.c
+++ b/transport.c
@@ -127,7 +127,7 @@ struct bundle_transport_data {
 
 static struct ref *get_refs_from_bundle(struct transport *transport,
 					int for_push,
-					const struct strvec *ref_prefixes)
+					struct transport_ls_refs_options *transport_options)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -280,7 +280,7 @@ static void die_if_server_options(struct transport *transport)
  * remote refs.
  */
 static struct ref *handshake(struct transport *transport, int for_push,
-			     const struct strvec *ref_prefixes,
+			     struct transport_ls_refs_options *options,
 			     int must_list_refs)
 {
 	struct git_transport_data *data = transport->data;
@@ -303,7 +303,7 @@ static struct ref *handshake(struct transport *transport, int for_push,
 			trace2_data_string("transfer", NULL, "server-sid", server_sid);
 		if (must_list_refs)
 			get_remote_refs(data->fd[1], &reader, &refs, for_push,
-					ref_prefixes,
+					options,
 					transport->server_options,
 					transport->stateless_rpc);
 		break;
@@ -334,9 +334,9 @@ static struct ref *handshake(struct transport *transport, int for_push,
 }
 
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
-					const struct strvec *ref_prefixes)
+					struct transport_ls_refs_options *options)
 {
-	return handshake(transport, for_push, ref_prefixes, 1);
+	return handshake(transport, for_push, options, 1);
 }
 
 static int fetch_refs_via_pack(struct transport *transport,
@@ -1252,19 +1252,20 @@ int transport_push(struct repository *r,
 		int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
 		int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
 		int push_ret, ret, err;
-		struct strvec ref_prefixes = STRVEC_INIT;
+		struct transport_ls_refs_options transport_options =
+			TRANSPORT_LS_REFS_OPTIONS_INIT;
 
 		if (check_push_refs(local_refs, rs) < 0)
 			return -1;
 
-		refspec_ref_prefixes(rs, &ref_prefixes);
+		refspec_ref_prefixes(rs, &transport_options.ref_prefixes);
 
 		trace2_region_enter("transport_push", "get_refs_list", r);
 		remote_refs = transport->vtable->get_refs_list(transport, 1,
-							       &ref_prefixes);
+							       &transport_options);
 		trace2_region_leave("transport_push", "get_refs_list", r);
 
-		strvec_clear(&ref_prefixes);
+		strvec_clear(&transport_options.ref_prefixes);
 
 		if (flags & TRANSPORT_PUSH_ALL)
 			match_flags |= MATCH_REFS_ALL;
@@ -1380,12 +1381,12 @@ int transport_push(struct repository *r,
 }
 
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes)
+					    struct transport_ls_refs_options *transport_options)
 {
 	if (!transport->got_remote_refs) {
 		transport->remote_refs =
 			transport->vtable->get_refs_list(transport, 0,
-							 ref_prefixes);
+							 transport_options);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 24558c027d..1f5b60e4d3 100644
--- a/transport.h
+++ b/transport.h
@@ -233,17 +233,24 @@ int transport_push(struct repository *repo,
 		   struct refspec *rs, int flags,
 		   unsigned int * reject_reasons);
 
+struct transport_ls_refs_options {
+	/*
+	 * Optionally, a list of ref prefixes can be provided which can be sent
+	 * to the server (when communicating using protocol v2) to enable it to
+	 * limit the ref advertisement.  Since ref filtering is done on the
+	 * server's end (and only when using protocol v2),
+	 * transport_get_remote_refs() could return refs which don't match the
+	 * provided ref_prefixes.
+	 */
+	struct strvec ref_prefixes;
+};
+#define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }
+
 /*
  * Retrieve refs from a remote.
- *
- * Optionally a list of ref prefixes can be provided which can be sent to the
- * server (when communicating using protocol v2) to enable it to limit the ref
- * advertisement.  Since ref filtering is done on the server's end (and only
- * when using protocol v2), this can return refs which don't match the provided
- * ref_prefixes.
  */
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes);
+					    struct transport_ls_refs_options *transport_options);
 
 /*
  * Fetch the hash algorithm used by a remote.
-- 
2.30.0.280.ga3ce27912f-goog



* [PATCH v5 3/3] clone: respect remote unborn HEAD
  2021-01-26 18:55 ` [PATCH v5 " Jonathan Tan
  2021-01-26 18:55   ` [PATCH v5 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
  2021-01-26 18:55   ` [PATCH v5 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
@ 2021-01-26 18:55   ` Jonathan Tan
  2021-01-26 22:24     ` Junio C Hamano
  2021-01-27  1:11   ` [PATCH v5 0/3] Cloning with " Junio C Hamano
  2021-01-27  1:41   ` Ævar Arnfjörð Bjarmason
  4 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2021-01-26 18:55 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, Junio C Hamano

Teach Git to use the "unborn" feature introduced in a previous patch as
follows: Git will always send the "unborn" argument if it is supported
by the server. During "git clone", if cloning an empty repository, Git
will use the new information to determine the local branch to create. In
all other cases, Git will ignore it.
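
The client-side behaviour can be approximated by the standalone sketch
below. Both function names are hypothetical; the real logic lives in
process_ref_v2() in connect.c and in cmd_clone(), and operates on
Git's own string_list and strbuf types rather than on raw C strings.
The sketch assumes a single space between fields, as in the protocol
description.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: if line has the form
 * "unborn HEAD [attrs...] symref-target:<target>", return a malloc'd
 * copy of <target>. Otherwise return NULL. The caller frees the result.
 */
char *parse_unborn_head_target(const char *line)
{
	static const char lead[] = "unborn HEAD";
	static const char attr[] = "symref-target:";
	const size_t lead_len = sizeof(lead) - 1;
	const size_t attr_len = sizeof(attr) - 1;
	const char *p, *end;

	if (strncmp(line, lead, lead_len) || line[lead_len] != ' ')
		return NULL;

	p = line + lead_len;
	while (*p == ' ') {
		p++;
		/* Each attribute ends at the next space, LF or NUL. */
		end = p + strcspn(p, " \n");
		if (!strncmp(p, attr, attr_len)) {
			size_t n = (size_t)(end - (p + attr_len));
			char *target = malloc(n + 1);

			if (!target)
				return NULL;
			memcpy(target, p + attr_len, n);
			target[n] = '\0';
			return target;
		}
		p = end;
	}
	return NULL;
}

/*
 * Hypothetical chooser: prefer the remote's unborn branch when it is
 * under refs/heads/, else fall back to the locally configured default.
 */
const char *choose_branch(const char *unborn_target,
			  const char *local_default)
{
	static const char heads[] = "refs/heads/";

	if (unborn_target &&
	    !strncmp(unborn_target, heads, sizeof(heads) - 1))
		return unborn_target + sizeof(heads) - 1;
	return local_default;
}
```

The fallback in choose_branch() corresponds to the "In all other cases"
wording above: when no usable unborn target was reported, the local
default branch name is used as before.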

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/init.txt |  2 +-
 builtin/clone.c               | 16 ++++++++++++++--
 connect.c                     | 28 ++++++++++++++++++++++++++--
 t/t5606-clone-options.sh      |  8 +++++---
 t/t5702-protocol-v2.sh        | 25 +++++++++++++++++++++++++
 transport.h                   |  8 ++++++++
 6 files changed, 79 insertions(+), 8 deletions(-)

diff --git a/Documentation/config/init.txt b/Documentation/config/init.txt
index dc77f8c844..79c79d6617 100644
--- a/Documentation/config/init.txt
+++ b/Documentation/config/init.txt
@@ -4,4 +4,4 @@ init.templateDir::
 
 init.defaultBranch::
 	Allows overriding the default branch name e.g. when initializing
-	a new repository or when cloning an empty repository.
+	a new repository.
diff --git a/builtin/clone.c b/builtin/clone.c
index 211d4f54b0..77fdc61f4d 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1330,10 +1330,21 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		remote_head = NULL;
 		option_no_checkout = 1;
 		if (!option_bare) {
-			const char *branch = git_default_branch_name();
-			char *ref = xstrfmt("refs/heads/%s", branch);
+			const char *branch;
+			char *ref;
+
+			if (transport_ls_refs_options.unborn_head_target &&
+			    skip_prefix(transport_ls_refs_options.unborn_head_target,
+					"refs/heads/", &branch)) {
+				ref = transport_ls_refs_options.unborn_head_target;
+				transport_ls_refs_options.unborn_head_target = NULL;
+			} else {
+				branch = git_default_branch_name();
+				ref = xstrfmt("refs/heads/%s", branch);
+			}
 
 			install_branch_config(0, branch, remote_name, ref);
+			create_symref("HEAD", ref, "");
 			free(ref);
 		}
 	}
@@ -1385,5 +1396,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	junk_mode = JUNK_LEAVE_ALL;
 
 	strvec_clear(&transport_ls_refs_options.ref_prefixes);
+	free(transport_ls_refs_options.unborn_head_target);
 	return err;
 }
diff --git a/connect.c b/connect.c
index 328c279250..879669df93 100644
--- a/connect.c
+++ b/connect.c
@@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 }
 
 /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
-static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
+static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
+			  char **unborn_head_target)
 {
 	int ret = 1;
 	int i = 0;
@@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
 		goto out;
 	}
 
+	if (!strcmp("unborn", line_sections.items[i].string)) {
+		i++;
+		if (unborn_head_target &&
+		    !strcmp("HEAD", line_sections.items[i++].string)) {
+			/*
+			 * Look for the symref target (if any). If found,
+			 * return it to the caller.
+			 */
+			for (; i < line_sections.nr; i++) {
+				const char *arg = line_sections.items[i].string;
+
+				if (skip_prefix(arg, "symref-target:", &arg)) {
+					*unborn_head_target = xstrdup(arg);
+					break;
+				}
+			}
+		}
+		goto out;
+	}
 	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
 	    *end) {
 		ret = 0;
@@ -461,6 +481,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	const char *hash_name;
 	struct strvec *ref_prefixes = transport_options ?
 		&transport_options->ref_prefixes : NULL;
+	char **unborn_head_target = transport_options ?
+		&transport_options->unborn_head_target : NULL;
 	*list = NULL;
 
 	if (server_supports_v2("ls-refs", 1))
@@ -490,6 +512,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	if (!for_push)
 		packet_write_fmt(fd_out, "peel\n");
 	packet_write_fmt(fd_out, "symrefs\n");
+	if (server_supports_feature("ls-refs", "unborn", 0))
+		packet_write_fmt(fd_out, "unborn\n");
 	for (i = 0; ref_prefixes && i < ref_prefixes->nr; i++) {
 		packet_write_fmt(fd_out, "ref-prefix %s\n",
 				 ref_prefixes->v[i]);
@@ -498,7 +522,7 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
-		if (!process_ref_v2(reader, &list))
+		if (!process_ref_v2(reader, &list, unborn_head_target))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
 
diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
index 7f082fb23b..0111d4e8bd 100755
--- a/t/t5606-clone-options.sh
+++ b/t/t5606-clone-options.sh
@@ -102,11 +102,13 @@ test_expect_success 'redirected clone -v does show progress' '
 '
 
 test_expect_success 'chooses correct default initial branch name' '
-	git init --bare empty &&
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=foo init --bare empty &&
+	test_config -C empty lsrefs.allowUnborn true &&
 	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
 	git -c init.defaultBranch=up clone empty whats-up &&
-	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
-	test refs/heads/up = $(git -C whats-up config branch.up.merge)
+	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
+	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
 '
 
 test_expect_success 'guesses initial branch name correctly' '
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d5b17909b..a8ef92b644 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -209,6 +209,31 @@ test_expect_success 'clone with file:// using protocol v2' '
 	grep "ref-prefix refs/tags/" log
 '
 
+test_expect_success 'clone of empty repo propagates name of default branch' '
+	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
+test_expect_success '...but not if explicitly forbidden by config' '
+	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+	test_config -C file_empty_parent lsrefs.allowUnborn false &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	! grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
 test_expect_success 'fetch with file:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
diff --git a/transport.h b/transport.h
index 1f5b60e4d3..24e15799e7 100644
--- a/transport.h
+++ b/transport.h
@@ -243,6 +243,14 @@ struct transport_ls_refs_options {
 	 * provided ref_prefixes.
 	 */
 	struct strvec ref_prefixes;
+
+	/*
+	 * If unborn_head_target is not NULL, and the remote reports HEAD as
+	 * pointing to an unborn branch, transport_get_remote_refs() stores the
+	 * unborn branch in unborn_head_target. It should be freed by the
+	 * caller.
+	 */
+	char *unborn_head_target;
 };
 #define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }
 
-- 
2.30.0.280.ga3ce27912f-goog



* Re: [PATCH v5 1/3] ls-refs: report unborn targets of symrefs
  2021-01-26 18:55   ` [PATCH v5 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
@ 2021-01-26 21:38     ` Junio C Hamano
  2021-01-26 23:03       ` Junio C Hamano
                         ` (2 more replies)
  2021-01-27  1:28     ` Ævar Arnfjörð Bjarmason
  1 sibling, 3 replies; 109+ messages in thread
From: Junio C Hamano @ 2021-01-26 21:38 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> When cloning, we choose the default branch based on the remote HEAD.
> But if there is no remote HEAD reported (which could happen if the
> target of the remote HEAD is unborn), we'll fall back to using our local
> init.defaultBranch. Traditionally this hasn't been a big deal, because
> most repos used "master" as the default. But these days it is likely to
> cause confusion if the server and client implementations choose
> different values (e.g., if the remote started with "main", we may choose
> "master" locally, create commits there, and then the user is surprised
> when they push to "master" and not "main").
>
> To solve this, the remote needs to communicate the target of the HEAD
> symref, even if it is unborn, and "git clone" needs to use this
> information.
>
> Currently, symrefs that have unborn targets (such as in this case) are
> not communicated by the protocol. Teach Git to advertise and support the
> "unborn" feature in "ls-refs" (by default, this is advertised, but
> server administrators may turn this off through the lsrefs.allowunborn
> config). This feature indicates that "ls-refs" supports the "unborn"
> argument; when it is specified, "ls-refs" will send the HEAD symref with
> the name of its unborn target.
>
> This change is only for protocol v2. A similar change for protocol v0
> would require independent protocol design (there being no analogous
> position to signal support for "unborn") and client-side plumbing of the
> data required, so the scope of this patch set is limited to protocol v2.
>
> The client side will be updated to use this in a subsequent commit.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  Documentation/config.txt                |  2 +
>  Documentation/config/lsrefs.txt         |  3 ++
>  Documentation/technical/protocol-v2.txt | 10 ++++-
>  ls-refs.c                               | 53 +++++++++++++++++++++++--
>  ls-refs.h                               |  1 +
>  serve.c                                 |  2 +-
>  t/t5701-git-serve.sh                    |  2 +-
>  7 files changed, 66 insertions(+), 7 deletions(-)
>  create mode 100644 Documentation/config/lsrefs.txt
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 6ba50b1104..d08e83a148 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -398,6 +398,8 @@ include::config/interactive.txt[]
>  
>  include::config/log.txt[]
>  
> +include::config/lsrefs.txt[]
> +
>  include::config/mailinfo.txt[]
>  
>  include::config/mailmap.txt[]
> diff --git a/Documentation/config/lsrefs.txt b/Documentation/config/lsrefs.txt
> new file mode 100644
> index 0000000000..dcbec11aaa
> --- /dev/null
> +++ b/Documentation/config/lsrefs.txt
> @@ -0,0 +1,3 @@
> +lsrefs.allowUnborn::
> +	Allow the server to send information about unborn symrefs during the
> +	protocol v2 ref advertisement.
> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index 85daeb5d9e..4707511c10 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -192,11 +192,19 @@ ls-refs takes in the following arguments:
>  	When specified, only references having a prefix matching one of
>  	the provided prefixes are displayed.
>  
> +If the 'unborn' feature is advertised the following argument can be
> +included in the client's request.
> +
> +    unborn
> +	The server may send symrefs pointing to unborn branches in the form
> +	"unborn <refname> symref-target:<target>".
> +
>  The output of ls-refs is as follows:
>  
>      output = *ref
>  	     flush-pkt
> -    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
> +    obj-id-or-unborn = (obj-id | "unborn")
> +    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
>      ref-attribute = (symref | peeled)
>      symref = "symref-target:" symref-target
>      peeled = "peeled:" obj-id
> diff --git a/ls-refs.c b/ls-refs.c
> index a1e0b473e4..4077adeb6a 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -32,6 +32,8 @@ struct ls_refs_data {
>  	unsigned peel;
>  	unsigned symrefs;
>  	struct strvec prefixes;
> +	unsigned allow_unborn : 1;
> +	unsigned unborn : 1;
>  };
>  
>  static int send_ref(const char *refname, const struct object_id *oid,
> @@ -47,7 +49,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  	if (!ref_match(&data->prefixes, refname_nons))
>  		return 0;
>  
> -	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
> +	if (oid)
> +		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
> +	else
> +		strbuf_addf(&refline, "unborn %s", refname_nons);

When a call is made to this helper with NULL "oid", it unconditionally
sends the "refname" out as an 'unborn' thing.  If data->symrefs is not
true, or flag does not have REF_ISSYMREF set, then we'd end up
sending

    "unborn" SP refname LF

without any ref-attribute.  The caller is responsible for ensuring
that it passes sensible data->symrefs and flag when it passes
oid==NULL to this function, but it is OK because this is a private
helper.

OK.

>  	if (data->symrefs && flag & REF_ISSYMREF) {
>  		struct object_id unused;
>  		const char *symref_target = resolve_ref_unsafe(refname, 0,
> @@ -74,8 +79,30 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  	return 0;
>  }
>  
> -static int ls_refs_config(const char *var, const char *value, void *data)
> +static void send_possibly_unborn_head(struct ls_refs_data *data)
>  {
> +	struct strbuf namespaced = STRBUF_INIT;
> +	struct object_id oid;
> +	int flag;
> +	int oid_is_null;
> +
> +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
> +	if (!resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag))
> +		return; /* bad ref */
> +	oid_is_null = is_null_oid(&oid);
> +	if (!oid_is_null ||
> +	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
> +		send_ref(namespaced.buf, oid_is_null ? NULL : &oid, flag, data);

And this caller makes sure that send_ref()'s expectation holds.

> +	strbuf_release(&namespaced);
> +}
> +
> +static int ls_refs_config(const char *var, const char *value, void *cb_data)
> +{
> +	struct ls_refs_data *data = cb_data;
> +
> +	if (!strcmp("lsrefs.allowunborn", var))
> +		data->allow_unborn = git_config_bool(var, value);
> +
>  	/*
>  	 * We only serve fetches over v2 for now, so respect only "uploadpack"
>  	 * config. This may need to eventually be expanded to "receive", but we
> @@ -91,7 +118,8 @@ int ls_refs(struct repository *r, struct strvec *keys,
>  
>  	memset(&data, 0, sizeof(data));
>  
> -	git_config(ls_refs_config, NULL);
> +	data.allow_unborn = 1;
> +	git_config(ls_refs_config, &data);

The above is a usual sequence of "an unspecified allow-unborn
defaults to true, but the configuration can turn it off".  OK.
>  
>  	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
>  		const char *arg = request->line;
> @@ -103,14 +131,31 @@ int ls_refs(struct repository *r, struct strvec *keys,
>  			data.symrefs = 1;
>  		else if (skip_prefix(arg, "ref-prefix ", &out))
>  			strvec_push(&data.prefixes, out);
> +		else if (data.allow_unborn && !strcmp("unborn", arg))
> +			data.unborn = 1;

I think the use of &&-cascade is iffy here.  Even when we are *not*
accepting a request for unborn, we should still parse it as such.
This does not matter in today's code, but it is a basic courtesy for
future developers who may add more "else if" after it.

IOW

		else if (!strcmp("unborn", arg)) {
			if (!data.allow_unborn)
				; /* we are not accepting the request */
			else
				data.unborn = 1;
		}

I wrote the above in longhand only for documentation purposes; in
practice, 

		else if (!strcmp("unborn", arg))
                	data.unborn = data.allow_unborn;

may suffice.
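
For readers following along outside the Git tree, the suggestion above
can be sketched as a standalone parser. This is only an illustration:
`struct request_flags` and `parse_ls_refs_arg()` are hypothetical names
that stand in for the relevant bits of `struct ls_refs_data` and the
argument loop in ls_refs(); they are not part of the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the relevant bits of struct ls_refs_data. */
struct request_flags {
	unsigned symrefs : 1;
	unsigned allow_unborn : 1; /* taken from configuration */
	unsigned unborn : 1;       /* requested by client and permitted */
};

/* Returns 1 if the argument was recognized, 0 otherwise. */
static int parse_ls_refs_arg(struct request_flags *flags, const char *arg)
{
	assert(flags && arg);

	if (!strcmp("symrefs", arg)) {
		flags->symrefs = 1;
		return 1;
	}
	if (!strcmp("unborn", arg)) {
		/* Always claim the argument; honor it only when allowed. */
		flags->unborn = flags->allow_unborn;
		return 1;
	}
	return 0;
}
```

The point of the shape is that "unborn" is consumed by its own branch
regardless of configuration, so a later branch added after it never
sees the argument by accident.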

>  	}
>  
>  	if (request->status != PACKET_READ_FLUSH)
>  		die(_("expected flush after ls-refs arguments"));
>  
> -	head_ref_namespaced(send_ref, &data);
> +	send_possibly_unborn_head(&data);
>  	for_each_namespaced_ref(send_ref, &data);

And here is another caller of send_ref().  Are we sure that
send_ref()'s expectation is satisfied by this caller when the
iteration encounters a broken ref (e.g. refs/heads/broken not a
symref but names an object that does not exist and get_sha1()
yielding 0{40}), or a dangling symref (e.g. refs/remotes/origin/HEAD
pointing at something that does not exist)?

>  	packet_flush(1);
>  	strvec_clear(&data.prefixes);
>  	return 0;
>  }
> +
> +int ls_refs_advertise(struct repository *r, struct strbuf *value)
> +{
> +	if (value) {
> +		int allow_unborn_value;
> +
> +		if (repo_config_get_bool(the_repository,
> +					 "lsrefs.allowunborn",
> +					 &allow_unborn_value) ||
> +		    allow_unborn_value)
> +			strbuf_addstr(value, "unborn");
> +	}

This reads "when not explicitly disabled, stuff "unborn" in there".

It feels somewhat brittle that we have to read the same variable and
apply the same "default to true" logic in two places and have to
keep them in sync.  Is this because the decision to advertize or not
has to be made way before the code that is specific to the
implementation of ls-refs is run?

If ls_refs_advertise() is always called first before ls_refs(), I
wonder if it makes sense to reuse what we found out about the
configured (or left unconfigured) state here and use it when
ls_refs() gets called?  I know that the way serve.c infrastructure
calls "do we advertise?" helper from each protocol-element handler
is too narrow and does not allow us to pass such a necessary piece
of information but I view it as a misdesign that can be corrected
(and until that happens, we could use file-local static limited to
ls-refs.c).

> +	return 1;
> +}
> diff --git a/ls-refs.h b/ls-refs.h
> index 7b33a7c6b8..a99e4be0bd 100644
> --- a/ls-refs.h
> +++ b/ls-refs.h
> @@ -6,5 +6,6 @@ struct strvec;
>  struct packet_reader;
>  int ls_refs(struct repository *r, struct strvec *keys,
>  	    struct packet_reader *request);
> +int ls_refs_advertise(struct repository *r, struct strbuf *value);
>  
>  #endif /* LS_REFS_H */
> diff --git a/serve.c b/serve.c
> index eec2fe6f29..ac20c72763 100644
> --- a/serve.c
> +++ b/serve.c
> @@ -73,7 +73,7 @@ struct protocol_capability {
>  
>  static struct protocol_capability capabilities[] = {
>  	{ "agent", agent_advertise, NULL },
> -	{ "ls-refs", always_advertise, ls_refs },
> +	{ "ls-refs", ls_refs_advertise, ls_refs },
>  	{ "fetch", upload_pack_advertise, upload_pack_v2 },
>  	{ "server-option", always_advertise, NULL },
>  	{ "object-format", object_format_advertise, NULL },
> diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
> index a1f5fdc9fd..df29504161 100755
> --- a/t/t5701-git-serve.sh
> +++ b/t/t5701-git-serve.sh
> @@ -12,7 +12,7 @@ test_expect_success 'test capability advertisement' '
>  	cat >expect <<-EOF &&
>  	version 2
>  	agent=git/$(git version | cut -d" " -f3)
> -	ls-refs
> +	ls-refs=unborn
>  	fetch=shallow
>  	server-option
>  	object-format=$(test_oid algo)


* Re: [PATCH v5 2/3] connect, transport: encapsulate arg in struct
  2021-01-26 18:55   ` [PATCH v5 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
@ 2021-01-26 21:54     ` Junio C Hamano
  2021-01-30  4:06       ` Jonathan Tan
  0 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2021-01-26 21:54 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

> In a future patch we plan to return the name of an unborn current branch
> from deep in the callchain to a caller via a new pointer parameter that
> points at a variable in the caller when the caller calls
> get_remote_refs() and transport_get_remote_refs().
>
> In preparation for that, encapsulate the existing ref_prefixes
> parameter into a struct. The aforementioned unborn current branch will
> go into this new struct in the future patch.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  builtin/clone.c      | 18 +++++++++++-------
>  builtin/fetch-pack.c |  3 ++-
>  builtin/fetch.c      | 18 +++++++++++-------
>  builtin/ls-remote.c  |  9 +++++----
>  connect.c            |  4 +++-
>  remote.h             |  4 +++-
>  transport-helper.c   |  5 +++--
>  transport-internal.h |  9 +--------
>  transport.c          | 23 ++++++++++++-----------
>  transport.h          | 21 ++++++++++++++-------
>  10 files changed, 65 insertions(+), 49 deletions(-)
>
> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index 58b7c1fbdc..c2d96f4c89 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>  	version = discover_version(&reader);
>  	switch (version) {
>  	case protocol_v2:
> -		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
> +		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
> +				args.stateless_rpc);
>  		break;

This seems to be an unrelated line-wrapping change, and the same
function still has overlong lines that are longer than this one.

Everything else looks sensible.  The patch drops the assumption that a
strvec is sufficient for the communication in these codepaths and makes
them easier to extend by passing an ls-refs-options structure, which
makes quite a lot of sense.

> diff --git a/transport.h b/transport.h
> index 24558c027d..1f5b60e4d3 100644
> --- a/transport.h
> +++ b/transport.h
> @@ -233,17 +233,24 @@ int transport_push(struct repository *repo,
>  		   struct refspec *rs, int flags,
>  		   unsigned int * reject_reasons);
>  
> +struct transport_ls_refs_options {
> +	/*
> +	 * Optionally, a list of ref prefixes can be provided which can be sent
> +	 * to the server (when communicating using protocol v2) to enable it to
> +	 * limit the ref advertisement.  Since ref filtering is done on the
> +	 * server's end (and only when using protocol v2),
> +	 * transport_get_remote_refs() could return refs which don't match the
> +	 * provided ref_prefixes.
> +	 */
> +	struct strvec ref_prefixes;
> +};
> +#define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }
> +

And of course, the first step only carries the strvec we have been
passing around, i.e. does not lose or gain features.

Looking good.  Thanks.



* Re: [PATCH v5 3/3] clone: respect remote unborn HEAD
  2021-01-26 18:55   ` [PATCH v5 3/3] clone: respect remote unborn HEAD Jonathan Tan
@ 2021-01-26 22:24     ` Junio C Hamano
  2021-01-30  4:27       ` Jonathan Tan
  0 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2021-01-26 22:24 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

>  init.defaultBranch::
>  	Allows overriding the default branch name e.g. when initializing
> -	a new repository or when cloning an empty repository.
> +	a new repository.

Looking good.

> diff --git a/builtin/clone.c b/builtin/clone.c
> index 211d4f54b0..77fdc61f4d 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1330,10 +1330,21 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  		remote_head = NULL;
>  		option_no_checkout = 1;
>  		if (!option_bare) {
> -			const char *branch = git_default_branch_name();
> -			char *ref = xstrfmt("refs/heads/%s", branch);
> +			const char *branch;
> +			char *ref;
> +
> +			if (transport_ls_refs_options.unborn_head_target &&
> +			    skip_prefix(transport_ls_refs_options.unborn_head_target,
> +					"refs/heads/", &branch)) {
> +				ref = transport_ls_refs_options.unborn_head_target;
> +				transport_ls_refs_options.unborn_head_target = NULL;
> +			} else {
> +				branch = git_default_branch_name();
> +				ref = xstrfmt("refs/heads/%s", branch);
> +			}
>  
>  			install_branch_config(0, branch, remote_name, ref);
> +			create_symref("HEAD", ref, "");
>  			free(ref);

OK, we used to say "point our HEAD always to the local default
name", and the code is still there in the else clause.  But when the
transport found what name the other side uses, we use that name
instead.

I presume that clearing transport_ls_refs_options.unborn_head_target
is to take ownership of this piece of memory ourselves?

We didn't call create_symref() in the original code, but now we do.
Is this a valid bugfix even if we did not have this "learn remote
symref even for unborn HEAD" feature?  Or has the original codepath
now somehow been broken by an extra create_symref() call that we did
not use to make?

> @@ -1385,5 +1396,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	junk_mode = JUNK_LEAVE_ALL;
>  
>  	strvec_clear(&transport_ls_refs_options.ref_prefixes);
> +	free(transport_ls_refs_options.unborn_head_target);
>  	return err;
>  }
> diff --git a/connect.c b/connect.c
> index 328c279250..879669df93 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
>  }
>  
>  /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
> -static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
> +static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
> +			  char **unborn_head_target)
>  {
>  	int ret = 1;
>  	int i = 0;
> @@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
>  		goto out;
>  	}
>  
> +	if (!strcmp("unborn", line_sections.items[i].string)) {
> +		i++;
> +		if (unborn_head_target &&
> +		    !strcmp("HEAD", line_sections.items[i++].string)) {
> +			/*
> +			 * Look for the symref target (if any). If found,
> +			 * return it to the caller.
> +			 */
> +			for (; i < line_sections.nr; i++) {
> +				const char *arg = line_sections.items[i].string;
> +
> +				if (skip_prefix(arg, "symref-target:", &arg)) {
> +					*unborn_head_target = xstrdup(arg);
> +					break;
> +				}
> +			}
> +		}
> +		goto out;
> +	}

We split the line and notice that the first token is "unborn"; if
the caller is not interested in the unborn head, we just skip the
rest, but otherwise, if it is about HEAD (i.e. we do not care if a
dangling symref that is not HEAD is reported), we record the target
in unborn_head_target.

OK.  We already saw how this is used in cmd_clone().
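
As an illustration of the parsing described above, here is a minimal
standalone sketch that extracts the target from a line of the form
"unborn HEAD symref-target:<target>".  It uses only the C standard
library rather than Git's string_list and skip_prefix() helpers, the
function name `unborn_head_target_from_line()` is hypothetical, and it
assumes the trailing LF has already been stripped from the line.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * If `line` is "unborn HEAD" followed by space-separated attributes
 * that include "symref-target:<target>", return a newly allocated copy
 * of <target>.  Otherwise return NULL.  The caller frees the result.
 */
static char *unborn_head_target_from_line(const char *line)
{
	static const char lead[] = "unborn HEAD";
	static const char attr[] = "symref-target:";
	const char *p;

	assert(line);
	if (strncmp(line, lead, strlen(lead)))
		return NULL; /* not an unborn HEAD line */
	p = line + strlen(lead);

	/* Walk the space-separated attributes that follow the refname. */
	while (*p == ' ') {
		const char *end;

		p++;
		end = strchr(p, ' ');
		if (!end)
			end = p + strlen(p);
		if (!strncmp(p, attr, strlen(attr))) {
			const char *target = p + strlen(attr);
			size_t len = (size_t)(end - target);
			char *out = malloc(len + 1);

			if (!out)
				return NULL;
			memcpy(out, target, len);
			out[len] = '\0';
			return out;
		}
		p = end;
	}
	return NULL; /* no symref-target attribute found */
}
```

As in the patch, an unborn line for any ref other than HEAD is ignored,
and ownership of the returned string passes to the caller.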

> @@ -461,6 +481,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
>  	const char *hash_name;
>  	struct strvec *ref_prefixes = transport_options ?
>  		&transport_options->ref_prefixes : NULL;
> +	char **unborn_head_target = transport_options ?
> +		&transport_options->unborn_head_target : NULL;

So any caller that passes transport_options will get the unborn head
information for free?  The other callers are in fetch-pack.c and
transport.c, which presumably are about fetching and not cloning.

I recall discussions on filling a missing refs/remotes/X/HEAD when
we fetch from X and learn where X points at.  Such an extension can
be done on top of this mechanism to pass transport_options from the
fetch codepath, I presume?


Thanks.  I tried to follow the thought in the patches aloud, and it
was mostly a pleasant read.



* Re: [PATCH v5 1/3] ls-refs: report unborn targets of symrefs
  2021-01-26 21:38     ` Junio C Hamano
@ 2021-01-26 23:03       ` Junio C Hamano
  2021-01-30  3:55         ` Jonathan Tan
  2021-01-26 23:20       ` Jeff King
  2021-01-29 20:23       ` Jonathan Tan
  2 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2021-01-26 23:03 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> It feels somewhat brittle that we have to read the same variable and
> apply the same "default to true" logic in two places and have to
> keep them in sync.  Is this because the decision to advertize or not
> has to be made way before the code that is specific to the
> implementation of ls-refs is run?
>
> If ls_refs_advertise() is always called first before ls_refs(), I
> wonder if it makes sense to reuse what we found out about the
> configured (or left unconfigured) state here and use it when
> ls_refs() gets called?  I know that the way serve.c infrastructure
> calls "do we advertise?" helper from each protocol-element handler
> is too narrow and does not allow us to pass such a necessary piece
> of information but I view it as a misdesign that can be corrected
> (and until that happens, we could use file-local static limited to
> ls-refs.c).

After giving the above a bit more thought, here are a few random
thoughts around the area.

 * As "struct protocol_capability" indicates, we have <name of
   service, the logic to advertise, the logic to serve> as a
   three-tuple for services.  The serving logic should know what
   advertising logic advertised (or more precisely, what information
   advertising logic used to make that decision) so that they can
   work consistently.

   For that, there should be a mechanism that advertising logic can
   use to leave a note to serving logic, perhaps by adding a "void
   *" to both of these functions.  The advertising function would
   allocate a piece of memory it wants to use and return the
   pointer to it to the caller in serve.c, and that pointer is given
   to the corresponding ls_refs() when it is called by serve.c.
   Then ls_refs_advertise can say "I found this configuration
   setting and decided to advertise" to later ls_refs() and the
   latter can say "ah, as you have advertised, I have to respond to
   such a request".

 * I am not sure if "lsrefs.allowunborn = yes/no" is a good way to
   configure this feature.  Wouldn't it be more natural to make this
   three-way, i.e. "lsrefs.unborn = advertise/serve/ignore", where
   the server operator can choose among (1) advertise the presence
   of the capability and respond to requests, (2) do not advertise
   the capability but if a request comes, respond to it, and (3) do
   not advertise and do not respond.  We could throw in 'deny' that
   causes the request to result in a failure but I do not care too
   deeply about that fourth option.

   Using such a configuration mechanism, ls_refs_advertise may leave
   the value of "lsrefs.unborn" (or lack thereof) it found and used
   to base its decision to advertise, for use by ls_refs.  ls_refs
   in turn can use the value found there to decide if it ignores or
   responds to the "unborn" request.
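
To make the three-way idea concrete, here is a minimal sketch of how
such a setting could be parsed once and then consulted by both the
advertising and the serving side.  Everything in it is hypothetical:
the enum, the function names, and the choice of "advertise" as the
default for an unset variable (picked to match the "defaults to true"
behavior of the patch under review) are assumptions, not existing Git
code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical modes for a three-way "lsrefs.unborn" setting. */
enum unborn_mode {
	UNBORN_IGNORE,    /* neither advertise nor respond */
	UNBORN_SERVE,     /* respond if asked, but do not advertise */
	UNBORN_ADVERTISE  /* advertise and respond */
};

/*
 * Parse a configured value; NULL means the variable is unset.
 * Returns 0 on success, -1 on an unrecognized value.
 */
static int parse_unborn_mode(const char *value, enum unborn_mode *mode)
{
	assert(mode);

	if (!value || !strcmp(value, "advertise"))
		*mode = UNBORN_ADVERTISE; /* assumed default */
	else if (!strcmp(value, "serve"))
		*mode = UNBORN_SERVE;
	else if (!strcmp(value, "ignore"))
		*mode = UNBORN_IGNORE;
	else
		return -1;
	return 0;
}

/* Used by the "do we advertise?" side. */
static int advertises_unborn(enum unborn_mode mode)
{
	return mode == UNBORN_ADVERTISE;
}

/* Used by the serving side when a client sends "unborn". */
static int serves_unborn(enum unborn_mode mode)
{
	return mode != UNBORN_IGNORE;
}
```

With one parsed value shared by both sides, the "default" logic lives
in a single place, which is the brittleness concern raised earlier.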



* Re: [PATCH v4 3/3] clone: respect remote unborn HEAD
  2021-01-26 18:22         ` Jonathan Tan
@ 2021-01-26 23:04           ` Jeff King
  2021-01-28  5:50             ` Junio C Hamano
  0 siblings, 1 reply; 109+ messages in thread
From: Jeff King @ 2021-01-26 23:04 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, gitster

On Tue, Jan 26, 2021 at 10:22:12AM -0800, Jonathan Tan wrote:

> > On Tue, Dec 22, 2020 at 01:54:20PM -0800, Jonathan Tan wrote:
> > 
> > > @@ -1323,10 +1325,20 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
> > >  		remote_head = NULL;
> > >  		option_no_checkout = 1;
> > >  		if (!option_bare) {
> > > -			const char *branch = git_default_branch_name();
> > > -			char *ref = xstrfmt("refs/heads/%s", branch);
> > > +			const char *branch;
> > > +			char *ref;
> > > +
> > > +			if (unborn_head_target &&
> > > +			    skip_prefix(unborn_head_target, "refs/heads/", &branch)) {
> > > +				ref = unborn_head_target;
> > > +				unborn_head_target = NULL;
> > > +			} else {
> > > +				branch = git_default_branch_name();
> > > +				ref = xstrfmt("refs/heads/%s", branch);
> > > +			}
> > >  
> > >  			install_branch_config(0, branch, remote_name, ref);
> > > +			create_symref("HEAD", ref, "");
> > >  			free(ref);
> > >  		}
> > 
> > In the old code, we never called create_symref() at all. It makes sense
> > that we'd do it now when unborn_head_target is not NULL. But what about
> > in the "else" clause there? Now we're adding an extra create_symref()
> > call.
> 
> The "else" branch you're referring to is the one enclosing all of the
> lines quoted above, I believe?

I meant this clause:

> > > +                 } else {
> > > +                         branch = git_default_branch_name();
> > > +                         ref = xstrfmt("refs/heads/%s", branch);
> > > +                 }

which used to be what we always did unconditionally. So in the original
code, we did not call create_symref() in this code path. Afterwards, we
call it for the unborn HEAD (which I can buy is necessary) but _also_
for that regular path. I.e., why is the new code not:

  if (unborn_head_target && ...) {
          ref = unborn_head_target;
	  unborn_head_target = NULL;
	  create_symref("HEAD", ref, "");
  } else {
          branch = git_default_branch_name();
	  ref = xstrfmt("refs/heads/%s", branch);
  }

I.e., I don't understand:

  - why create_symref() wasn't needed before (assuming it was not), and
    why it is OK to run it now in the non-unborn code path

  - why we need create_symref() in the unborn path (which is probably
    something mundane)

I can even buy the argument that it is simply for consistency, so that
all of the HEAD-setup commands are shared between the two paths. And
that it is OK to do so, because we are just overwriting what init-db did
before (even if sometimes it is the same thing). But I feel like that
deserves explanation in the commit message. :)

-Peff


* Re: [PATCH v4 1/3] ls-refs: report unborn targets of symrefs
  2021-01-26 18:13         ` Jonathan Tan
@ 2021-01-26 23:16           ` Jeff King
  0 siblings, 0 replies; 109+ messages in thread
From: Jeff King @ 2021-01-26 23:16 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, gitster

On Tue, Jan 26, 2021 at 10:13:58AM -0800, Jonathan Tan wrote:

> (It was a bit confusing that refs_resolve_ref_unsafe() returns one of
> its input arguments if it succeeds and NULL if it fails, but that's
> outside the scope of this patch, I think.)

Yep. It would probably be much nicer for it to return a numeric success
code, and to take an optional strbuf into which to write the resolved
symref name (if the caller even cares about it). But definitely out of
scope for your patch.

> > This straight-boolean version works as long as you can atomically update
> > the _config_ on each version. But that seems like roughly the same
> > problem (having dealt with this on GitHub servers, they are not
> > equivalent, and depending on your infrastructure, it definitely _can_ be
> > easier to do one versus the other. But it seems like a funny place to
> > leave this upstream feature).
> 
> Well, I was just agreeing with what you said [1]. :-)
> 
> [1] https://lore.kernel.org/git/X9xJLWdFJfNJTn0p@coredump.intra.peff.net/

Oh, I just need you to agree harder then. ;)

If we are not going to support config that helps you do an atomic
deploy, then I don't really see the point of having config at all.
Here are three plausible implementations I can conceive of:

  - allowUnborn is a tri-state for "accept-but-do-not-advertise",
    "accept-and-advertise", and "disallow". This helps with rollout in a
    cluster by setting it to accept-but-do-not-advertise.  The
    default would be accept-and-advertise, which is what most servers
    would want. I don't really know why anyone would want "disallow".

  - allowUnborn is a bool for "accept-and-advertise" or "disallow". This
    doesn't help cluster rollout. I don't know why anyone would want to
    switch away from the default of accept-and-advertise.

  - allowUnborn is always on.

The first one helps the cluster case, at the cost of introducing an
extra config knob. The third one doesn't help that case, but is one less
knob for server admins to think about. But the second one has a knob
that I don't understand why anybody would tweak. It seems like the worst
of both.

Perhaps there's a reason for setting "disallow" that I don't know. Or
perhaps you're happy to help the cluster case using a simple bool with
atomic config rollouts (which are outside the scope of Git itself).

> > Or is the intent that an unconfigured reader would silently ignore the
> > unborn flag in that case? That would at least not cause it to bail on
> > the client in a mixed-version environment. But it does feel like a
> > confusing result.
> 
> Right now, an old server would ignore "unborn", yes. I'm not sure of
> what the intent should be - tightening ls-refs and fetch to forbid
> unknown arguments seems like a good idea to me.

If we had just a bool (case 2 from above), and there was an
always-implied "accept unborn even if not advertised", then that _does_
let the config help out the cluster case (it just turns off
advertisements, basically making the bool "accept-but-do-not-advertise"
versus "disallow").

I don't love it. The protocol spec does say "don't ask for capability
foo if the server didn't say it knows about foo". We'd be loosening the
enforcement of that (if only for capabilities we _do_ in fact know
about), even though we don't know if it was due to a race, or if the
client is just misbehaving. But I wondered if that was the direction you
were going to try to solve your cluster-rollout problem.

-Peff


* Re: [PATCH v5 1/3] ls-refs: report unborn targets of symrefs
  2021-01-26 21:38     ` Junio C Hamano
  2021-01-26 23:03       ` Junio C Hamano
@ 2021-01-26 23:20       ` Jeff King
  2021-01-26 23:38         ` Junio C Hamano
  2021-01-29 20:23       ` Jonathan Tan
  2 siblings, 1 reply; 109+ messages in thread
From: Jeff King @ 2021-01-26 23:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git

On Tue, Jan 26, 2021 at 01:38:49PM -0800, Junio C Hamano wrote:

> > @@ -103,14 +131,31 @@ int ls_refs(struct repository *r, struct strvec *keys,
> >  			data.symrefs = 1;
> >  		else if (skip_prefix(arg, "ref-prefix ", &out))
> >  			strvec_push(&data.prefixes, out);
> > +		else if (data.allow_unborn && !strcmp("unborn", arg))
> > +			data.unborn = 1;
> 
> I think the use of &&-cascade is iffy here.  Even when we are *not*
> accepting request for unborn, we should still parse it as such.
> This does not matter in today's code, but it is a basic courtesy for
> future developers who may add more "else if" after it.
> 
> IOW
> 
> 		else if (!strcmp("unborn", arg)) {
> 			if (!data.allow_unborn)
> 				; /* we are not accepting the request */
> 			else
> 				data.unborn = 1;
> 		}
> 
> I wrote the above in longhand only for documentation purposes; in
> practice, 
> 
> 		else if (!strcmp("unborn", arg))
>                 	data.unborn = data.allow_unborn;
> 
> may suffice.

Doing it that way is friendlier, but loosens enforcement of:

  Client will then send a space separated list of capabilities it wants
  to be in effect. The client MUST NOT ask for capabilities the server
  did not say it supports.

from Documentation/technical/protocol-capabilities.txt.

It does solve Jonathan's racy cluster-deploy problem, though. See the
discussion in the v4 thread (sorry, seems not to have hit the archive
yet, but hopefully this link will work soon):

  https://lore.kernel.org/git/YBCitNb75rpnuW2L@coredump.intra.peff.net/

-Peff

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 1/3] ls-refs: report unborn targets of symrefs
  2021-01-26 23:20       ` Jeff King
@ 2021-01-26 23:38         ` Junio C Hamano
  0 siblings, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2021-01-26 23:38 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Tan, git

Jeff King <peff@peff.net> writes:

> See the
> discussion in the v4 thread (sorry, seems not to have hit the archive
> yet, but hopefully this link will work soon):
>
>   https://lore.kernel.org/git/YBCitNb75rpnuW2L@coredump.intra.peff.net/

I guess with vger having some sort of constipation, we (not you-and-me
but list participants as a whole) will be doing this kind of
back-and-forth while unable to read what others have said.

https://lore.kernel.org/git/xmqqmtwvz4g9.fsf@gitster.c.googlers.com/

will have the following.

> It feels somewhat brittle that we have to read the same variable and
> apply the same "default to true" logic in two places and have to
> keep them in sync.  Is this because the decision to advertize or not
> has to be made way before the code that is specific to the
> implementation of ls-refs is run?
>
> If ls_refs_advertise() is always called first before ls_refs(), I
> wonder if it makes sense to reuse what we found out about the
> configured (or left unconfigured) state here and use it when
> ls_refs() gets called?  I know that the way serve.c infrastructure
> calls "do we advertise?" helper from each protocol-element handler
> is too narrow and does not allow us to pass such a necessary piece
> of information but I view it as a misdesign that can be corrected
> (and until that happens, we could use file-local static limited to
> ls-refs.c).

After giving the above a bit more thought, here are a few random
thoughts around the area.

 * As "struct protocol_capability" indicates, we have <name of
   service, the logic to advertise, the logic to serve> as a
   three-tuple for services.  The serving logic should know what
   advertising logic advertised (or more precisely, what information
   advertising logic used to make that decision) so that they can
   work consistently.

   For that, there should be a mechanism that advertising logic can
   use to leave a note to serving logic, perhaps by adding a "void
   *" to both of these functions.  The advertising function would
   allocate a piece of memory it wants to use and returns the
   pointer to it to the caller in serve.c, and that pointer is given
   to the corresponding ls_refs() when it is called by serve.c.
   Then ls_refs_advertise can say "I found this configuration
   setting and decided to advertise" to later ls_refs() and the
   latter can say "ah, as you have advertised, I have to respond to
   such a request".

 * I am not sure if "lsrefs.allowunborn = yes/no" is a good way to
   configure this feature.  Wouldn't it be more natural to make this
   three-way, i.e. "lsrefs.unborn = advertise/serve/ignore", where
   the server operator can choose among (1) advertise the presence
   of the capability and respond to requests, (2) do not advertise
   the capability but if a request comes, respond to it, and (3) do
   not advertise and do not respond.  We could throw in 'deny' that
   causes the request to result in a failure but I do not care too
   deeply about that fourth option.

   Using such a configuration mechanism, ls_refs_advertise may leave
   the value of "lsrefs.unborn" (or lack thereof) it found and used
   to base its decision to advertise, for use by ls_refs.  ls_refs
   in turn can use the value found there to decide if it ignores or
   responds to the "unborn" request.
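
The three-way setting in the second bullet could be sketched roughly as
follows. The enum, its member names, and the helper functions are
assumptions made for illustration; only the values "advertise", "serve"
and "ignore" come from the proposal itself.

```c
#include <string.h>

/* Hypothetical names; the proposal only fixes the three values. */
enum unborn_policy {
	UNBORN_IGNORE,    /* do not advertise, do not respond */
	UNBORN_SERVE,     /* do not advertise, but respond if asked */
	UNBORN_ADVERTISE  /* advertise and respond */
};

/* Returns 0 and sets *out on success, -1 on a missing or unknown value. */
int parse_unborn_policy(const char *value, enum unborn_policy *out)
{
	if (!value)
		return -1;
	if (!strcmp(value, "advertise"))
		*out = UNBORN_ADVERTISE;
	else if (!strcmp(value, "serve"))
		*out = UNBORN_SERVE;
	else if (!strcmp(value, "ignore"))
		*out = UNBORN_IGNORE;
	else
		return -1;
	return 0;
}

/* Used by the advertising side. */
int should_advertise_unborn(enum unborn_policy p)
{
	return p == UNBORN_ADVERTISE;
}

/* Used by the serving side; "serve" and "advertise" both respond. */
int should_serve_unborn(enum unborn_policy p)
{
	return p != UNBORN_IGNORE;
}
```

Both sides derive their answer from the same parsed value, so the
"default" logic lives in one place.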


^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 0/3] Cloning with remote unborn HEAD
  2021-01-26 18:55 ` [PATCH v5 " Jonathan Tan
                     ` (2 preceding siblings ...)
  2021-01-26 18:55   ` [PATCH v5 3/3] clone: respect remote unborn HEAD Jonathan Tan
@ 2021-01-27  1:11   ` Junio C Hamano
  2021-01-27  4:25     ` Jeff King
  2021-01-27  1:41   ` Ævar Arnfjörð Bjarmason
  4 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2021-01-27  1:11 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Jeff King

Jonathan Tan <jonathantanmy@google.com> writes:

> Thanks, Peff, for your review. I have addressed your comments (through
> replies to your emails and here in this v5 patch set).
>
> Jonathan Tan (3):
>   ls-refs: report unborn targets of symrefs
>   connect, transport: encapsulate arg in struct
>   clone: respect remote unborn HEAD

Applying this alone to 'master' seems to pass all tests, but
the topic seems to have funny interactions with another topic
in flight, jk/peel-iterated-oid

There is textual conflict whose resolution seems trivial, but with
that resolved ...

diff --cc builtin/clone.c
index e335734b4c,77fdc61f4d..0000000000
--- i/builtin/clone.c
+++ w/builtin/clone.c
@@@ -1326,10 -1330,21 +1330,21 @@@ int cmd_clone(int argc, const char **ar
  		remote_head = NULL;
  		option_no_checkout = 1;
  		if (!option_bare) {
- 			const char *branch = git_default_branch_name(0);
- 			char *ref = xstrfmt("refs/heads/%s", branch);
+ 			const char *branch;
+ 			char *ref;
+ 
+ 			if (transport_ls_refs_options.unborn_head_target &&
+ 			    skip_prefix(transport_ls_refs_options.unborn_head_target,
+ 					"refs/heads/", &branch)) {
+ 				ref = transport_ls_refs_options.unborn_head_target;
+ 				transport_ls_refs_options.unborn_head_target = NULL;
+ 			} else {
 -				branch = git_default_branch_name();
++				branch = git_default_branch_name(0);
+ 				ref = xstrfmt("refs/heads/%s", branch);
+ 			}
  
  			install_branch_config(0, branch, remote_name, ref);
+ 			create_symref("HEAD", ref, "");
  			free(ref);
  		}
  	}


... numerous tests fail.

For example, t5702 dies like so:

expecting success of 5702.15 'clone of empty repo propagates name of default branch':
        test_when_finished "rm -rf file_empty_parent file_empty_child" &&

        GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
        git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&

        GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
        git -c init.defaultBranch=main -c protocol.version=2 \
                clone "file://$(pwd)/file_empty_parent" file_empty_child &&
        grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD

Initialized empty Git repository in /usr/local/google/home/jch/w/git.git/t/trash directory.t5702-protocol-v2/file_empty_parent/.git/
Cloning into 'file_empty_child'...
fatal: expected flush after ref listing
not ok 15 - clone of empty repo propagates name of default branch

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 1/3] ls-refs: report unborn targets of symrefs
  2021-01-26 18:55   ` [PATCH v5 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
  2021-01-26 21:38     ` Junio C Hamano
@ 2021-01-27  1:28     ` Ævar Arnfjörð Bjarmason
  2021-01-30  4:04       ` Jonathan Tan
  1 sibling, 1 reply; 109+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-01-27  1:28 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Junio C Hamano


On Tue, Jan 26 2021, Jonathan Tan wrote:

> +If the 'unborn' feature is advertised the following argument can be
> +included in the client's request.
> +
> +    unborn
> +	The server may send symrefs pointing to unborn branches in the form
> +	"unborn <refname> symref-target:<target>".
> +

"branches" as in things under refs/heads/*? What should happen if you
send this for a refs/tags/* or refs/xyz/*? Maybe overly pedantic, but it
seems we have no other explicit mention of refs/{heads,tags}/ in
protocol-v2.txt before this[1].

1. Although as I've learned from another recent thread include-tag is
   magical for refs/tags/* only.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 0/3] Cloning with remote unborn HEAD
  2021-01-26 18:55 ` [PATCH v5 " Jonathan Tan
                     ` (3 preceding siblings ...)
  2021-01-27  1:11   ` [PATCH v5 0/3] Cloning with " Junio C Hamano
@ 2021-01-27  1:41   ` Ævar Arnfjörð Bjarmason
  2021-01-30  4:41     ` Jonathan Tan
  2021-02-05 22:28     ` Junio C Hamano
  4 siblings, 2 replies; 109+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-01-27  1:41 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Jeff King, Felipe Contreras


On Tue, Jan 26 2021, Jonathan Tan wrote:

[For some reason the patches didn't reach my mailbox, but I see them in
the list archive, so I'm replying to the cover-letter]

>  Documentation/config.txt                |  2 +
>  Documentation/config/init.txt           |  2 +-

Good, now we have init.defaultBranch docs, but they say:
    
     init.defaultBranch::
            Allows overriding the default branch name e.g. when initializing
    -       a new repository or when cloning an empty repository.
    +       a new repository.

So this still only applies to file:// and other "protocol" clones, but
not "git clone /some/path"?

Re my reply to v1, do we consider that a bug, feature, something just
left unimplemented?

I really don't care much, but this really needs a corresponding
documentation update. I.e. something like:

    init.defaultBranch::
        Allows overriding the default branch name e.g. when initializing a
        new repository or when cloning an empty repository.
    
        When cloning a repository over protocol v2 (i.e. ssh://, https://,
        file://, but not a /some/path), and if that repository has
        init.defaultBranch configured, the server will advertise its
        preferred default branch name, and we'll take its configuration over
        ours.

Which, just in terms of implementation makes me think it would make more
sense if the server just had:

    uploadPack.sendConfig = "init.defaultBranch=xyz"

The client:

    receivePack.acceptConfig = "init.defaultBranch"

And in terms of things on the wire we'd say:

    "set-config init.defaultBranch=main"

You could have many such lines, but we'd just hardcode only accepting
"init.defaultBranch" by default for now.

I.e. we set "init.defaultBranch" on the server, and the client ends up
interpreting things as though "init.defaultBranch" was set to exactly
that value. So why not just ... send a line saying "you should set your
init.defaultBranch config to this".

Makes it future-extensible pretty much for free, and I think also much
easier to explain to users. I.e. instead of init.defaultBranch somehow
being magical when talking with a remote server we can talk about a
remote server being one source of config per git-config's documented
config order, for a very narrow whitelist of config keys.
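
To make the idea concrete, here is a rough sketch of how a client might
filter such lines. It is purely illustrative: "set-config" is only the
wire format floated above, and accept_set_config() is a hypothetical
helper, not an existing Git function.

```c
#include <string.h>

/*
 * If `line` has the form "set-config init.defaultBranch=<value>",
 * return a pointer to <value> inside `line`; otherwise return NULL.
 * Only the one hardcoded key is accepted, mirroring the "very narrow
 * whitelist" described above.
 */
const char *accept_set_config(const char *line)
{
	static const char prefix[] = "set-config ";
	static const char key[] = "init.defaultBranch";
	size_t plen = sizeof(prefix) - 1;
	size_t klen = sizeof(key) - 1;

	if (strncmp(line, prefix, plen))
		return NULL; /* not a set-config line */
	line += plen;
	if (strncmp(line, key, klen) || line[klen] != '=')
		return NULL; /* key is not on the whitelist */
	return line + klen + 1;
}
```

For instance, "set-config init.defaultBranch=main" yields "main", while
a line naming any other key is ignored.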

Or (not clear to me, should have waited with my other E-Mail) are we
ever expecting to send more than one of:

    "unborn <refname> symref-target:<target>"

Or is the reason closer to us being able to shoehorn this into the
existing ls-refs response, as opposed to some general "here's config for
you" response we don't have?

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 0/3] Cloning with remote unborn HEAD
  2021-01-27  1:11   ` [PATCH v5 0/3] Cloning with " Junio C Hamano
@ 2021-01-27  4:25     ` Jeff King
  2021-01-27  6:14       ` Junio C Hamano
  0 siblings, 1 reply; 109+ messages in thread
From: Jeff King @ 2021-01-27  4:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git

On Tue, Jan 26, 2021 at 05:11:42PM -0800, Junio C Hamano wrote:

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > Thanks, Peff, for your review. I have addressed your comments (through
> > replies to your emails and here in this v5 patch set).
> >
> > Jonathan Tan (3):
> >   ls-refs: report unborn targets of symrefs
> >   connect, transport: encapsulate arg in struct
> >   clone: respect remote unborn HEAD
> 
> Applying this alone to 'master' seems to pass all tests, but
> the topic seems to have funny interactions with another topic
> in flight, jk/peel-iterated-oid

I was worried at first I really screwed up something subtle, but it is
indeed just a funny local interaction.

Here's a fix which can be applied on top of jt/clone-unborn-head. It
could equally well be applied as part of the merge (with a minor
adjustment in the context), but I think it ought to be squashed into
Jonathan's patch 1 anyway.

The conflict you had to resolve was a red herring (it wasn't part of
jk/peel-iterated-oid at all, but rather other commits that got pulled in
because my topic is based on a more recent master).

-- >8 --
Subject: [PATCH] ls-refs: don't peel NULL oid

When the "unborn" feature is enabled, upload-pack serving an ls-refs
command will pass a NULL oid into send_ref(). In this case, there is no
point trying to peel the ref, since we know it points to nothing.

For now this is a harmless waste of cycles (we re-resolve HEAD and find
out that indeed, it points to nothing). But after merging with another
topic that contains 36a317929b (refs: switch peel_ref() to
peel_iterated_oid(), 2021-01-20), we'd actually end up passing NULL to
peel_object(), which segfaults!

Signed-off-by: Jeff King <peff@peff.net>
---
 ls-refs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ls-refs.c b/ls-refs.c
index 4077adeb6a..bc91f03653 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -66,7 +66,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 			    strip_namespace(symref_target));
 	}
 
-	if (data->peel) {
+	if (data->peel && oid) {
 		struct object_id peeled;
 		if (!peel_ref(refname, &peeled))
 			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));
-- 
2.30.0.724.gc858251c49


^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 0/3] Cloning with remote unborn HEAD
  2021-01-27  4:25     ` Jeff King
@ 2021-01-27  6:14       ` Junio C Hamano
  0 siblings, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2021-01-27  6:14 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Tan, git

Jeff King <peff@peff.net> writes:

> Here's a fix which can be applied on top of jt/clone-unborn-head. It
> could equally well be applied as part of the merge (with a minor
> adjustment in the context), but I think it ought to be squashed into
> Jonathan's patch 1 anyway.

Will queue but we are not merging the topic to 'next' yet, so I'll
ask Jonathan to remember making it a part of the series if it needs
to be updated later.

Thanks.

>
> -- >8 --
> Subject: [PATCH] ls-refs: don't peel NULL oid
>
> When the "unborn" feature is enabled, upload-pack serving an ls-refs
> command will pass a NULL oid into send_ref(). In this case, there is no
> point trying to peel the ref, since we know it points to nothing.
>
> For now this is a harmless waste of cycles (we re-resolve HEAD and find
> out that indeed, it points to nothing). But after merging with another
> topic that contains 36a317929b (refs: switch peel_ref() to
> peel_iterated_oid(), 2021-01-20), we'd actually end up passing NULL to
> peel_object(), which segfaults!
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  ls-refs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ls-refs.c b/ls-refs.c
> index 4077adeb6a..bc91f03653 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -66,7 +66,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  			    strip_namespace(symref_target));
>  	}
>  
> -	if (data->peel) {
> +	if (data->peel && oid) {
>  		struct object_id peeled;
>  		if (!peel_ref(refname, &peeled))
>  			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v4 3/3] clone: respect remote unborn HEAD
  2021-01-26 23:04           ` Jeff King
@ 2021-01-28  5:50             ` Junio C Hamano
  0 siblings, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2021-01-28  5:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Tan, git

Jeff King <peff@peff.net> writes:

> ... Afterwards, we
> call it for the unborn HEAD (which I can buy is necessary) but _also_
> for that regular path. I.e., why is the new code not:
>
>   if (unborn_head_target && ...) {
>           ref = unborn_head_target;
> 	  unborn_head_target = NULL;
> 	  create_symref("HEAD", ref, "");
>   } else {
>           branch = git_default_branch_name();
> 	  ref = xstrfmt("refs/heads/%s", branch);
>   }
>
> I.e., I don't understand:
>
>   - why create_symref() wasn't need before (assuming it was not), and
>     why it is OK to run it now in the non-unborn code path
>
>   - why we need create_symref() in the unborn path (which is probably
>     something mundane)
>
> I can even buy the argument that it is simply for consistency, so that
> all of the HEAD-setup commands are shared between the two paths. And
> that it is OK to do so, because we are just overwriting what init-db did
> before (even if sometimes it is the same thing). But I feel like that
> deserves explanation in the commit message. :)

Yes, during yesterday's communication glitch, I also independently
was wondering about this and am dying to know if this is an
unrelated "fix", applicable even without the "unborn" support, or
breaking the non "unborn" side of the codepath.

Thanks.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 1/3] ls-refs: report unborn targets of symrefs
  2021-01-26 21:38     ` Junio C Hamano
  2021-01-26 23:03       ` Junio C Hamano
  2021-01-26 23:20       ` Jeff King
@ 2021-01-29 20:23       ` Jonathan Tan
  2021-01-29 22:04         ` Junio C Hamano
  2 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2021-01-29 20:23 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git

> > @@ -47,7 +49,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
> >  	if (!ref_match(&data->prefixes, refname_nons))
> >  		return 0;
> >  
> > -	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
> > +	if (oid)
> > +		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
> > +	else
> > +		strbuf_addf(&refline, "unborn %s", refname_nons);
> 
> When a call is made to this helper with NULL "oid", it unconditionally
> sends the "refname" out as an 'unborn' thing.  If data->symrefs is not
> true, or flag does not have REF_ISSYMREF set, then we'd end up
> sending
> 
>     "unborn" SP refname LF
> 
> without any ref-attribute.  The caller is responsible for ensuring
> that it passes sensible data->symrefs and flag when it passes
> oid==NULL to this function, but it is OK because this is a private
> helper.
> 
> OK.

Thanks for checking.

> >  	if (data->symrefs && flag & REF_ISSYMREF) {
> >  		struct object_id unused;
> >  		const char *symref_target = resolve_ref_unsafe(refname, 0,
> > @@ -74,8 +79,30 @@ static int send_ref(const char *refname, const struct object_id *oid,
> >  	return 0;
> >  }
> >  
> > -static int ls_refs_config(const char *var, const char *value, void *data)
> > +static void send_possibly_unborn_head(struct ls_refs_data *data)
> >  {
> > +	struct strbuf namespaced = STRBUF_INIT;
> > +	struct object_id oid;
> > +	int flag;
> > +	int oid_is_null;
> > +
> > +	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
> > +	if (!resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag))
> > +		return; /* bad ref */
> > +	oid_is_null = is_null_oid(&oid);
> > +	if (!oid_is_null ||
> > +	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
> > +		send_ref(namespaced.buf, oid_is_null ? NULL : &oid, flag, data);
> 
> And this caller makes sure that send_ref()'s expectation holds.

Thanks for checking.

> > +	strbuf_release(&namespaced);
> > +}
> > +
> > +static int ls_refs_config(const char *var, const char *value, void *cb_data)
> > +{
> > +	struct ls_refs_data *data = cb_data;
> > +
> > +	if (!strcmp("lsrefs.allowunborn", var))
> > +		data->allow_unborn = git_config_bool(var, value);
> > +
> >  	/*
> >  	 * We only serve fetches over v2 for now, so respect only "uploadpack"
> >  	 * config. This may need to eventually be expanded to "receive", but we
> > @@ -91,7 +118,8 @@ int ls_refs(struct repository *r, struct strvec *keys,
> >  
> >  	memset(&data, 0, sizeof(data));
> >  
> > -	git_config(ls_refs_config, NULL);
> > +	data.allow_unborn = 1;
> > +	git_config(ls_refs_config, &data);
> 
> The above is a usual sequence of "an unspecified allow-unborn
> defaults to true, but the configuration can turn it off".  OK

Later, you address this issue again so I'll comment there.

> > @@ -103,14 +131,31 @@ int ls_refs(struct repository *r, struct strvec *keys,
> >  			data.symrefs = 1;
> >  		else if (skip_prefix(arg, "ref-prefix ", &out))
> >  			strvec_push(&data.prefixes, out);
> > +		else if (data.allow_unborn && !strcmp("unborn", arg))
> > +			data.unborn = 1;
> 
> I think the use of &&-cascade is iffy here.  Even when we are *not*
> accepting request for unborn, we should still parse it as such.
> This does not matter in today's code, but it is a basic courtesy for
> future developers who may add more "else if" after it.
> 
> IOW
> 
> 		else if (!strcmp("unborn", arg)) {
> 			if (!data.allow_unborn)
> 				; /* we are not accepting the request */
> 			else
> 				data.unborn = 1;
> 		}
> 
> I wrote the above in longhand only for documentation purposes; in
> practice, 
> 
> 		else if (!strcmp("unborn", arg))
>                 	data.unborn = data.allow_unborn;
> 
> may suffice.

My thinking was (and is) that falling through in the case of a
disallowed argument (as opposed to a completely unrecognized argument)
makes it more straightforward later if we ever decide to tighten
validation of the ls-refs request - we would only have to put some code
at the end that reports back to the user.

If we write it as you suggest, we would have to remember to replace the
"we are not accepting the request" part (as in the comment in your
suggested code) with an error report, but perhaps that is a good thing -
we would be able to insert a custom error message instead of an
information-hiding "argument not supported".

I'm OK either way.
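
For comparison, a small sketch of the "parse first, then decide" shape,
with the refusal made an explicit outcome so that a later tightening
only has to act on it. The enum and handle_arg() are hypothetical names
for illustration, not code from the patch.

```c
#include <string.h>

/* Hypothetical result codes; not part of ls-refs.c. */
enum arg_result {
	ARG_UNKNOWN,  /* not an argument this handler knows */
	ARG_ACCEPTED, /* recognized and honored */
	ARG_REFUSED   /* recognized, but disabled by configuration */
};

/*
 * Recognize "unborn" regardless of configuration, and only then
 * decide whether to honor it. A caller can ignore ARG_REFUSED today
 * and turn it into a tailored error message later.
 */
enum arg_result handle_arg(const char *arg, int allow_unborn, int *unborn)
{
	if (!strcmp(arg, "unborn")) {
		if (!allow_unborn)
			return ARG_REFUSED;
		*unborn = 1;
		return ARG_ACCEPTED;
	}
	return ARG_UNKNOWN;
}
```

With this shape a disallowed "unborn" never falls through to whatever
"else if" branches are added after it.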

> >  	}
> >  
> >  	if (request->status != PACKET_READ_FLUSH)
> >  		die(_("expected flush after ls-refs arguments"));
> >  
> > -	head_ref_namespaced(send_ref, &data);
> > +	send_possibly_unborn_head(&data);
> >  	for_each_namespaced_ref(send_ref, &data);
> 
> And here is another caller of send_ref().  Are we sure that
> send_ref()'s expectation is satisfied by this caller when the
> iteration encounters a broken ref (e.g. refs/heads/broken not a
> symref but names an object that does not exist and get_sha1()
> yielding 0{40}), or a dangling symref (e.g. refs/remotes/origin/HEAD
> pointing at something that does not exist)?

I assume that by "this caller" you mean for_each_namespaced_ref(), since
you mention an iteration. I believe so - send_ref has been changed to
tolerate a NULL (as in (void*)0, not 0{40}) oid, and that is the only
change, so if it worked previously, it should still work now.

> >  	packet_flush(1);
> >  	strvec_clear(&data.prefixes);
> >  	return 0;
> >  }
> > +
> > +int ls_refs_advertise(struct repository *r, struct strbuf *value)
> > +{
> > +	if (value) {
> > +		int allow_unborn_value;
> > +
> > +		if (repo_config_get_bool(the_repository,
> > +					 "lsrefs.allowunborn",
> > +					 &allow_unborn_value) ||
> > +		    allow_unborn_value)
> > +			strbuf_addstr(value, "unborn");
> > +	}
> 
> This reads "when not explicitly disabled, stuff "unborn" in there".
> 
> It feels somewhat brittle that we have to read the same variable and
> apply the same "default to true" logic in two places and have to
> keep them in sync.  Is this because the decision to advertize or not
> has to be made way before the code that is specific to the
> implementation of ls-refs is run?
> 
> If ls_refs_advertise() is always called first before ls_refs(), I
> wonder if it makes sense to reuse what we found out about the
> configured (or left unconfigured) state here and use it when
> ls_refs() gets called?  I know that the way serve.c infrastructure
> calls "do we advertise?" helper from each protocol-element handler
> is too narrow and does not allow us to pass such a necessary piece
> of information but I view it as a misdesign that can be corrected
> (and until that happens, we could use file-local static limited to
> ls-refs.c).

Perhaps what I could do is have a static variable that tracks whether
config has been read and what the config is (or if the default variable
is used), and have each function call another function that sets that
static variable if config has not yet been read. I think that will
address this concern.
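
Roughly, something like the sketch below. The fake_allow_unborn struct
is a stand-in for the real config lookup so that the example is
self-contained, and all names here are hypothetical.

```c
/* Stand-in for the configured state of "lsrefs.allowunborn". */
struct fake_config {
	int is_set;  /* nonzero if the variable is configured */
	int value;   /* configured boolean value */
	int lookups; /* how many times the config was consulted */
};
struct fake_config fake_allow_unborn;

static int config_read;  /* has the config been read yet? */
static int allow_unborn; /* cached result, valid once config_read is set */

/* Read the config at most once and apply the "default to true" rule. */
static void ensure_config_read(void)
{
	if (config_read)
		return;
	fake_allow_unborn.lookups++;
	allow_unborn = fake_allow_unborn.is_set ? fake_allow_unborn.value : 1;
	config_read = 1;
}

/* Called from the advertising side. */
int advertise_unborn(void)
{
	ensure_config_read();
	return allow_unborn;
}

/* Called from the serving side. */
int serve_unborn(void)
{
	ensure_config_read();
	return allow_unborn;
}
```

Whichever function runs first populates the cache, so the default
logic is written once and the two sides cannot drift apart.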

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 1/3] ls-refs: report unborn targets of symrefs
  2021-01-29 20:23       ` Jonathan Tan
@ 2021-01-29 22:04         ` Junio C Hamano
  2021-02-02  2:20           ` Jonathan Tan
  0 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2021-01-29 22:04 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

>> I think the use of &&-cascade is iffy here.  Even when we are *not*
>> accepting request for unborn, we should still parse it as such.
>> This does not matter in today's code, but it is a basic courtesy for
>> future developers who may add more "else if" after it.
>> 
>> IOW
>> 
>> 		else if (!strcmp("unborn", arg)) {
>> 			if (!data.allow_unborn)
>> 				; /* we are not accepting the request */
>> 			else
>> 				data.unborn = 1;
>> 		}
>> 
>> I wrote the above in longhand only for documentation purposes; in
>> practice, 
>> 
>> 		else if (!strcmp("unborn", arg))
>>                 	data.unborn = data.allow_unborn;
>> 
>> may suffice.
>
> My thinking was (and is) that falling through in the case of a
> disallowed argument (as opposed to a completely unrecognized argument)
> makes it more straightforward later if we ever decide to tighten
> validation of the ls-refs request - we would only have to put some code
> at the end that reports back to the user.

Sorry, I do not quite follow.  If "unborn" is conditionally allowed,
you can extend what I suggested above like so:

	if (we see we got an unborn request) {
-		if (allowed)
+		if (partially allowed)
+			record that we got unborn request and will
+			partially respond to it
+		else if (allowed)
			record that we got unborn request;
+		else
+			report that we don't accept unborn request;
	}

This will matter even more if you write more else-if.  The
downstream of else-if clauses are forced to interpret (and fail)
"unborn" request they are not interested in.

>> >  	if (request->status != PACKET_READ_FLUSH)
>> >  		die(_("expected flush after ls-refs arguments"));
>> >  
>> > -	head_ref_namespaced(send_ref, &data);
>> > +	send_possibly_unborn_head(&data);
>> >  	for_each_namespaced_ref(send_ref, &data);
>> 
>> And here is another caller of send_ref().  Are we sure that
>> send_ref()'s expectation is satisfied by this caller when the
>> iteration encounters a broken ref (e.g. refs/heads/broken not a
>> symref but names an object that does not exist and get_sha1()
>> yielding 0{40}), or a dangling symref (e.g. refs/remotes/origin/HEAD
>> pointing at something that does not exist)?
>
> I assume that by "this caller" you mean for_each_namespaced_ref(), since
> you mention an iteration. I believe so - send_ref has been changed to
> tolerate a NULL (as in (void*)0, not 0{40}) oid, and that is the only
> change, so if it worked previously, it should still work now.

So a dangling symref, e.g. "refs/remotes/origin/HEAD -> trunk" when
no "refs/remotes/origin/trunk" exists, is not reported to send_ref()
in the same way as an unborn "HEAD"?  I would have expected that we'd
report where it points at, and for that to work, you'd have to use
not just the vanilla send_ref() as the callback, but something that
knows how to do "are we expected to send unborn symrefs" logic, like
send_possibly_unborn_head does.

That "changed to tolerate ... should work" worries me.

If "for_each_namespaced_ref(send_ref, &data)" will never call send_ref()
with NULL (as in (void *)0) oid, then that would be OK, but if it
ends up calling with NULL somehow, it is responsible for ensuring that
data->symrefs is true and flag has REF_ISSYMREF set, or send_ref()
would misbehave, (see the first part of your message, which I am
responding to), no?
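
The expectation in question can be written down as a small predicate.
This is only a sketch: null_oid_call_is_valid() is a hypothetical
helper, and the REF_ISSYMREF definition below is a local stand-in so
that the example compiles on its own.

```c
#include <stddef.h>

#define REF_ISSYMREF 0x01 /* local stand-in for the flag in refs.h */

/*
 * Returns nonzero if a send_ref()-style call with these arguments
 * would emit a sensible line. A non-NULL oid is always fine; a NULL
 * oid is only fine when symrefs were requested and the ref is a
 * symref, because otherwise the "unborn" line would carry no
 * symref-target attribute.
 */
int null_oid_call_is_valid(const void *oid, int symrefs, int flag)
{
	if (oid)
		return 1;
	return symrefs && (flag & REF_ISSYMREF) != 0;
}
```

Any caller that can pass a NULL oid would need to satisfy this check,
the way send_possibly_unborn_head() does.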

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 1/3] ls-refs: report unborn targets of symrefs
  2021-01-26 23:03       ` Junio C Hamano
@ 2021-01-30  3:55         ` Jonathan Tan
  0 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-01-30  3:55 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > It feels somewhat brittle that we have to read the same variable and
> > apply the same "default to true" logic in two places and have to
> > keep them in sync.  Is this because the decision to advertize or not
> > has to be made way before the code that is specific to the
> > implementation of ls-refs is run?
> >
> > If ls_refs_advertise() is always called first before ls_refs(), I
> > wonder if it makes sense to reuse what we found out about the
> > configured (or left unconfigured) state here and use it when
> > ls_refs() gets called?  I know that the way serve.c infrastructure
> > calls "do we advertise?" helper from each protocol-element handler
> > is too narrow and does not allow us to pass such a necessary piece
> > of information but I view it as a misdesign that can be corrected
> > (and until that happens, we could use file-local static limited to
> > ls-refs.c).
> 
> After giving the above a bit more thought, here are a few random
> thoughts around the area.
> 
>  * As "struct protocol_capability" indicates, we have <name of
>    service, the logic to advertise, the logic to serve> as a
>    three-tuple for services.  The serving logic should know what
>    advertising logic advertised (or more precisely, what information
>    advertising logic used to make that decision) so that they can
>    work consistently.
> 
>    For that, there should be a mechanism that advertising logic can
>    use to leave a note to serving logic, perhaps by adding a "void
>    *" to both of these functions.  The advertising function would
>    allocate a piece of memory it wants to use and returns the
>    pointer to it to the caller in serve.c, and that pointer is given
>    to the corresponding ls_refs() when it is called by serve.c.
>    Then ls_refs_advertise can say "I found this configuration
>    setting and decided to advertise" to later ls_refs() and the
>    latter can say "ah, as you have advertised, I have to respond to
>    such a request".

Usually the advertising is in the same file as the serving (true for
ls-refs and fetch, so far) so I think it's easier if they just
communicate on their own instead of through this "void *". I agree with
the communication idea, though.
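
For comparison, the "void *" hand-off described in the quoted text could
be sketched roughly as below. This is only an illustration under stated
assumptions: struct ls_refs_note, sketch_advertise() and sketch_serve()
are hypothetical names, not the serve.c interface, and the config lookup
is reduced to two plain integers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical note that the advertising side leaves for the serving side. */
struct ls_refs_note {
	int allow_unborn;
};

/*
 * Decide whether to advertise "unborn" and record the decision in a
 * freshly allocated note. "configured" says whether the setting exists
 * at all; when it does not, the feature defaults to allowed.
 * Returns 1 if the feature should be advertised, 0 otherwise.
 */
int sketch_advertise(int configured, int config_value, void **note_out)
{
	struct ls_refs_note *note;

	assert(note_out);
	note = malloc(sizeof(*note));
	*note_out = note;
	if (!note)
		return 0;
	note->allow_unborn = configured ? !!config_value : 1;
	return note->allow_unborn;
}

/*
 * Honor an "unborn" request only if the note left by the advertising
 * side says that the feature is allowed.
 */
int sketch_serve(const void *note, int client_sent_unborn)
{
	const struct ls_refs_note *n = note;

	return n && n->allow_unborn && client_sent_unborn;
}
```

The caller owns the note and frees it once the request has been served.
With file-local statics the same information is shared without changing
any function signature, which is the simpler route when both functions
live in the same file.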

>  * I am not sure if "lsrefs.allowunborn = yes/no" is a good way to
>    configure this feature.  Wouldn't it be more natural to make this
>    three-way, i.e. "lsrefs.unborn = advertise/serve/ignore", where
>    the server operator can choose among (1) advertise the presence
>    of the capability and respond to requests, (2) do not advertise
>    the capability but if a request comes, respond to it, and (3) do
>    not advertise and do not respond.  We could throw in 'deny' that
>    causes the request to result in a failure but I do not care too
>    deeply about that fourth option.
> 
>    Using such a configuration mechanism, ls_refs_advertise may leave
>    the value of "lsrefs.unborn" (or lack thereof) it found and used
>    to base its decision to advertise, for use by ls_refs.  ls_refs
>    in turn can use the value found there to decide if it ignores or
>    responds to the "unborn" request.

lsrefs.unborn = advertise/serve/ignore was how it was in version 2 [1]
(with different names) and I changed it due to Peff's suggestion, but
perhaps I landed up in the unhappy middle [2]. I think we should just
pick a standard and use it for this feature and whatever future features
may come. The distinction between advertise and serve ("allow" in my
version 2) is useful in some cases but is useless once migration has
occurred (and thus clutters the code), but perhaps it could be argued
that all servers need to do the migration at least once.

[1] https://lore.kernel.org/git/cover.1608084282.git.jonathantanmy@google.com/
[2] https://lore.kernel.org/git/YBCitNb75rpnuW2L@coredump.intra.peff.net/
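
To make the three-way variant concrete, here is a minimal sketch of how
an advertise/serve/ignore value could be mapped to the two decisions the
server has to make. The enum and function names are hypothetical and the
value strings are taken from the suggestion quoted above; nothing here
is what the patch set implements.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical modes for a three-way "lsrefs.unborn" setting. */
enum unborn_mode {
	UNBORN_IGNORE,    /* do not advertise, do not respond */
	UNBORN_SERVE,     /* do not advertise, but respond if asked */
	UNBORN_ADVERTISE  /* advertise and respond */
};

/*
 * Map a config value to a mode. A NULL value (setting absent) is
 * treated as "advertise", matching the allow-by-default behavior of
 * the current patches. Returns 0 on success, -1 on an unknown value.
 */
int parse_unborn_mode(const char *value, enum unborn_mode *mode)
{
	assert(mode);
	if (!value || !strcmp(value, "advertise"))
		*mode = UNBORN_ADVERTISE;
	else if (!strcmp(value, "serve"))
		*mode = UNBORN_SERVE;
	else if (!strcmp(value, "ignore"))
		*mode = UNBORN_IGNORE;
	else
		return -1;
	return 0;
}

/* Should the capability advertisement mention "unborn"? */
int should_advertise_unborn(enum unborn_mode mode)
{
	return mode == UNBORN_ADVERTISE;
}

/* Should an "unborn" argument from the client be honored? */
int should_serve_unborn(enum unborn_mode mode)
{
	return mode != UNBORN_IGNORE;
}
```

The "serve" mode is the one that matters during a rollout: servers can
start answering the request before any of them advertises it.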

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 1/3] ls-refs: report unborn targets of symrefs
  2021-01-27  1:28     ` Ævar Arnfjörð Bjarmason
@ 2021-01-30  4:04       ` Jonathan Tan
  0 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-01-30  4:04 UTC (permalink / raw)
  To: avarab; +Cc: jonathantanmy, git, gitster

> 
> On Tue, Jan 26 2021, Jonathan Tan wrote:
> 
> > +If the 'unborn' feature is advertised the following argument can be
> > +included in the client's request.
> > +
> > +    unborn
> > +	The server may send symrefs pointing to unborn branches in the form
> > +	"unborn <refname> symref-target:<target>".
> > +
> 
> "branches" as in things under refs/heads/*? What should happen if you
> send this for a refs/tags/* or refs/xyz/*? Maybe overly pedantic, but it
> seems we have no other explicit mention of refs/{heads,tags}/ in
> protocol-v2.txt before this[1].
> 
> 1. Although as I've learned from another recent thread include-tag is
>    magical for refs/tags/* only.

Thanks for spotting this. Right now the server sends anything, but the
client only uses the information if it is a branch. I think this is the
most flexible approach so I'll keep it this way and document it
explicitly.
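
The client-side rule (accept whatever the server sends, but act only on
an unborn HEAD whose target is a branch) can be sketched as follows.
This is a standalone illustration, not the code in connect.c:
after_prefix() is a local stand-in for Git's skip_prefix(),
unborn_head_branch() is a hypothetical helper, and the line is assumed
to have had its trailing LF stripped already.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a pointer past "prefix" if "str" starts with it, else NULL. */
const char *after_prefix(const char *str, const char *prefix)
{
	size_t len = strlen(prefix);

	return strncmp(str, prefix, len) ? NULL : str + len;
}

/*
 * Parse a line of the form "unborn <refname> symref-target:<target>".
 * Returns a malloc'ed copy of the short branch name when <refname> is
 * HEAD and <target> lives under refs/heads/; returns NULL otherwise.
 */
char *unborn_head_branch(const char *line)
{
	const char *p;

	assert(line);
	p = after_prefix(line, "unborn HEAD ");
	if (!p)
		return NULL;

	/* Walk the space-separated attributes that follow the refname. */
	while (*p) {
		const char *end = strchr(p, ' ');
		size_t len = end ? (size_t)(end - p) : strlen(p);
		const char *target = after_prefix(p, "symref-target:");

		if (target) {
			size_t tlen = len - strlen("symref-target:");
			const char *branch = after_prefix(target, "refs/heads/");
			size_t blen;
			char *out;

			if (!branch || tlen <= strlen("refs/heads/"))
				return NULL; /* not a branch: ignore it */
			blen = tlen - strlen("refs/heads/");
			out = malloc(blen + 1);
			if (!out)
				return NULL;
			memcpy(out, branch, blen);
			out[blen] = '\0';
			return out;
		}
		p += len;
		if (*p == ' ')
			p++;
	}
	return NULL;
}
```

A target under refs/tags/ or any other hierarchy simply yields NULL, so
the caller falls back to the locally configured default branch name.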

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 2/3] connect, transport: encapsulate arg in struct
  2021-01-26 21:54     ` Junio C Hamano
@ 2021-01-30  4:06       ` Jonathan Tan
  0 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-01-30  4:06 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git

> > diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> > index 58b7c1fbdc..c2d96f4c89 100644
> > --- a/builtin/fetch-pack.c
> > +++ b/builtin/fetch-pack.c
> > @@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
> >  	version = discover_version(&reader);
> >  	switch (version) {
> >  	case protocol_v2:
> > -		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
> > +		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
> > +				args.stateless_rpc);
> >  		break;
> 
> This seems to be an unrelated line-wrapping, but there are overlong
> lines that are longer than this line in the same function.

Ah...I'll undo this in the next version.

> Everything else looks sensible to change the assumption that a strvec
> is sufficient for the communication in the codepaths and make it more
> easily extended by passing a ls-refs-options structure, which makes
> quite a lot of sense.
> 
> > diff --git a/transport.h b/transport.h
> > index 24558c027d..1f5b60e4d3 100644
> > --- a/transport.h
> > +++ b/transport.h
> > @@ -233,17 +233,24 @@ int transport_push(struct repository *repo,
> >  		   struct refspec *rs, int flags,
> >  		   unsigned int * reject_reasons);
> >  
> > +struct transport_ls_refs_options {
> > +	/*
> > +	 * Optionally, a list of ref prefixes can be provided which can be sent
> > +	 * to the server (when communicating using protocol v2) to enable it to
> > +	 * limit the ref advertisement.  Since ref filtering is done on the
> > +	 * server's end (and only when using protocol v2),
> > +	 * transport_get_remote_refs() could return refs which don't match the
> > +	 * provided ref_prefixes.
> > +	 */
> > +	struct strvec ref_prefixes;
> > +};
> > +#define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }
> > +
> 
> And of course, the first step only carries the strvec we have been
> passing around, i.e. does not lose or gain features.
> 
> Looking good.  Thanks.

Thanks for taking a look.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 3/3] clone: respect remote unborn HEAD
  2021-01-26 22:24     ` Junio C Hamano
@ 2021-01-30  4:27       ` Jonathan Tan
  0 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-01-30  4:27 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> >  init.defaultBranch::
> >  	Allows overriding the default branch name e.g. when initializing
> > -	a new repository or when cloning an empty repository.
> > +	a new repository.
> 
> Looking good.
> 
> > diff --git a/builtin/clone.c b/builtin/clone.c
> > index 211d4f54b0..77fdc61f4d 100644
> > --- a/builtin/clone.c
> > +++ b/builtin/clone.c
> > @@ -1330,10 +1330,21 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
> >  		remote_head = NULL;
> >  		option_no_checkout = 1;
> >  		if (!option_bare) {
> > -			const char *branch = git_default_branch_name();
> > -			char *ref = xstrfmt("refs/heads/%s", branch);
> > +			const char *branch;
> > +			char *ref;
> > +
> > +			if (transport_ls_refs_options.unborn_head_target &&
> > +			    skip_prefix(transport_ls_refs_options.unborn_head_target,
> > +					"refs/heads/", &branch)) {
> > +				ref = transport_ls_refs_options.unborn_head_target;
> > +				transport_ls_refs_options.unborn_head_target = NULL;
> > +			} else {
> > +				branch = git_default_branch_name();
> > +				ref = xstrfmt("refs/heads/%s", branch);
> > +			}
> >  
> >  			install_branch_config(0, branch, remote_name, ref);
> > +			create_symref("HEAD", ref, "");
> >  			free(ref);
> 
> OK, we used to say "point our HEAD always to the local default
> name", and the code is still there in the else clause.  But when the
> transport found what name the other side uses, we use that name
> instead.
> 
> I presume that clearing transport_ls_refs_options.unborn_head_target
> is to take ownership of this piece of memory ourselves?

Yes - just to be consistent with the other branch where "ref" needs to
be freed.

> We didn't call create_symref() in the original code, but now we do.
> Is this a valid bugfix even if we did not have this "learn remote
> symref even for unborn HEAD" feature?  Or has the original codepath
> now somehow been broken by an extra create_symref() call that we did
> not use to make?

Ah...now I think I see what you and Peff [1] were saying. Yes I think
the symref creation is not necessary when we use the default branch name
(like we currently do). I'll verify and write back with my findings in
the next version.

[1] https://lore.kernel.org/git/YBCf8SI3fK+rDyox@coredump.intra.peff.net/

> 
> > @@ -1385,5 +1396,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
> >  	junk_mode = JUNK_LEAVE_ALL;
> >  
> >  	strvec_clear(&transport_ls_refs_options.ref_prefixes);
> > +	free(transport_ls_refs_options.unborn_head_target);
> >  	return err;
> >  }
> > diff --git a/connect.c b/connect.c
> > index 328c279250..879669df93 100644
> > --- a/connect.c
> > +++ b/connect.c
> > @@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
> >  }
> >  
> >  /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
> > -static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
> > +static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
> > +			  char **unborn_head_target)
> >  {
> >  	int ret = 1;
> >  	int i = 0;
> > @@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
> >  		goto out;
> >  	}
> >  
> > +	if (!strcmp("unborn", line_sections.items[i].string)) {
> > +		i++;
> > +		if (unborn_head_target &&
> > +		    !strcmp("HEAD", line_sections.items[i++].string)) {
> > +			/*
> > +			 * Look for the symref target (if any). If found,
> > +			 * return it to the caller.
> > +			 */
> > +			for (; i < line_sections.nr; i++) {
> > +				const char *arg = line_sections.items[i].string;
> > +
> > +				if (skip_prefix(arg, "symref-target:", &arg)) {
> > +					*unborn_head_target = xstrdup(arg);
> > +					break;
> > +				}
> > +			}
> > +		}
> > +		goto out;
> > +	}
> 
> We split the line and notice that the first token is "unborn"; if
> the caller is not interested in the unborn head, we just skip the
> rest, but otherwise, if it is about HEAD (i.e. we do not care if a
> dangling symref that is not HEAD is reported), we notice the target
> in unborn_head_target.
> 
> OK.  We already saw how this is used in cmd_clone().
> 
> > @@ -461,6 +481,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> >  	const char *hash_name;
> >  	struct strvec *ref_prefixes = transport_options ?
> >  		&transport_options->ref_prefixes : NULL;
> > +	char **unborn_head_target = transport_options ?
> > +		&transport_options->unborn_head_target : NULL;
> 
> So any caller that passes transport_options will get the unborn head
> information for free?  The other callers are in fetch-pack.c and
> transport.c, which presumably are about fetching and not cloning.
> 
> I recall discussions on filling a missing refs/remotes/X/HEAD when
> we fetch from X and learn where X points at.  Such an extension can
> be done on top of this mechanism to pass transport_options from the
> fetch codepath, I presume?

I don't recall those discussions, but I think that we can do that (as
long as HEAD points to a branch that is part of the refspec we're
fetching, because the ref-prefix check still applies).

> Thanks.  I tried to follow the thought in the patches aloud, and it
> was mostly a pleasant read.

Thanks.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 0/3] Cloning with remote unborn HEAD
  2021-01-27  1:41   ` Ævar Arnfjörð Bjarmason
@ 2021-01-30  4:41     ` Jonathan Tan
  2021-01-30 11:13       ` Ævar Arnfjörð Bjarmason
  2021-02-02  2:22       ` Jonathan Tan
  2021-02-05 22:28     ` Junio C Hamano
  1 sibling, 2 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-01-30  4:41 UTC (permalink / raw)
  To: avarab; +Cc: jonathantanmy, git, peff, felipe.contreras

> On Tue, Jan 26 2021, Jonathan Tan wrote:
> 
> [For some reason the patches didn't reach my mailbox, but I see them in
> the list archive, so I'm replying to the cover-letter]
> 
> >  Documentation/config.txt                |  2 +
> >  Documentation/config/init.txt           |  2 +-
> 
> Good, now we have init.defaultBranch docs, but they say:
>     
>      init.defaultBranch::
>             Allows overriding the default branch name e.g. when initializing
>     -       a new repository or when cloning an empty repository.
>     +       a new repository.
> 
> So this still only applies to file:// and other "protocol" clones, but
> not "git clone /some/path"?

Ah...that's true.

> Re my reply to v1, do we consider that a bug, feature, something just
> left unimplemented?
> 
> I really don't care much, but this really needs a corresponding
> documentation update. I.e. something like:
> 
>     init.defaultBranch::
>         Allows overriding the default branch name e.g. when initializing a
>         new repository or when cloning an empty repository.
>     
>         When cloning a repository over protocol v2 (i.e. ssh://, https://,
>         file://, but not a /some/path), and if that repository has
>         init.defaultBranch configured, the server will advertise its
>         preferred default branch name, and we'll take its configuration over
>         ours.

Thanks - I'll use some of your wording, but I think it's best to leave
open the possibility that cloning using protocol v0 or the disk clone
(/some/path) copies over the current HEAD as well.

> Which, just in terms of implementation makes me think it would make more
> sense if the server just had:
> 
>     uploadPack.sendConfig = "init.defaultBranch=xyz"
> 
> The client:
> 
>     receivePack.acceptConfig = "init.defaultBranch"
> 
> And in terms of things on the wire we'd say:
> 
>     "set-config init.defaultBranch=main"
> 
> You could have many such lines, but we'd just hardcode only accepting
> "init.defaultBranch" by default for now.
> 
> I.e. we set "init.defaultBranch" on the server, and the client ends up
> interpreting things as though "init.defaultBranch" was set to exactly
> that value. So why not just ... send a line saying "you should set your
> init.defaultBranch config to this".
> 
> Makes it future-extensible pretty much for free, and I think also much
> easier to explain to users. I.e. instead of init.defaultBranch somehow
> being magical when talking with a remote server we can talk about a
> remote server being one source of config per git-config's documented
> config order, for a very narrow whitelist of config keys.
>
> Or (not clear to me, should have waited with my other E-Mail) are we
> ever expecting to send more than one of:
> 
>     "unborn <refname> symref-target:<target>"
> 
> Or is the reason closer to us being able to shoehorn this into the
> existing ls-refs response, as opposed to some general "here's config for
> you" response we don't have?

It's not the same - from what I understand, what you're suggesting is
setting a config in the repo that has just been cloned, but this patch
set does not set any such config. Also, it may be strange for the server
to be able to change the config of a currently running command - I would
expect such a thing to only take effect on future runs of Git on that
repo.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 0/3] Cloning with remote unborn HEAD
  2021-01-30  4:41     ` Jonathan Tan
@ 2021-01-30 11:13       ` Ævar Arnfjörð Bjarmason
  2021-02-02  2:22       ` Jonathan Tan
  1 sibling, 0 replies; 109+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-01-30 11:13 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff, felipe.contreras


On Sat, Jan 30 2021, Jonathan Tan wrote:

>> On Tue, Jan 26 2021, Jonathan Tan wrote:
>> 
>> [For some reason the patches didn't reach my mailbox, but I see them in
>> the list archive, so I'm replying to the cover-letter]
>> 
>> >  Documentation/config.txt                |  2 +
>> >  Documentation/config/init.txt           |  2 +-
>> 
>> Good, now we have init.defaultBranch docs, but they say:
>>     
>>      init.defaultBranch::
>>             Allows overriding the default branch name e.g. when initializing
>>     -       a new repository or when cloning an empty repository.
>>     +       a new repository.
>> 
>> So this still only applies to file:// and other "protocol" clones, but
>> not "git clone /some/path"?
>
> Ah...that's true.
>
>> Re my reply to v1, do we consider that a bug, feature, something just
>> left unimplemented?
>> 
>> I really don't care much, but this really needs a corresponding
>> documentation update. I.e. something like:
>> 
>>     init.defaultBranch::
>>         Allows overriding the default branch name e.g. when initializing a
>>         new repository or when cloning an empty repository.
>>     
>>         When cloning a repository over protocol v2 (i.e. ssh://, https://,
>>         file://, but not a /some/path), and if that repository has
>>         init.defaultBranch configured, the server will advertise its
>>         preferred default branch name, and we'll take its configuration over
>>         ours.
>
> Thanks - I'll use some of your wording, but I think it's best to leave
> open the possibility that cloning using protocol v0 or the disk clone
> (/some/path) copies over the current HEAD as well.

Sure, and maybe a test_expect_failure for those cases? I.e. to
explicitly say in the current docs/tests what does / doesn't work, and
if we consider that intentional or not.

>> Which, just in terms of implementation makes me think it would make more
>> sense if the server just had:
>> 
>>     uploadPack.sendConfig = "init.defaultBranch=xyz"
>> 
>> The client:
>> 
>>     receivePack.acceptConfig = "init.defaultBranch"
>> 
>> And in terms of things on the wire we'd say:
>> 
>>     "set-config init.defaultBranch=main"
>> 
>> You could have many such lines, but we'd just hardcode only accepting
>> "init.defaultBranch" by default for now.
>> 
>> I.e. we set "init.defaultBranch" on the server, and the client ends up
>> interpreting things as though "init.defaultBranch" was set to exactly
>> that value. So why not just ... send a line saying "you should set your
>> init.defaultBranch config to this".
>> 
>> Makes it future-extensible pretty much for free, and I think also much
>> easier to explain to users. I.e. instead of init.defaultBranch somehow
>> being magical when talking with a remote server we can talk about a
>> remote server being one source of config per git-config's documented
>> config order, for a very narrow whitelist of config keys.
>>
>> Or (not clear to me, should have waited with my other E-Mail) are we
>> ever expecting to send more than one of:
>> 
>>     "unborn <refname> symref-target:<target>"
>> 
>> Or is the reason closer to us being able to shoehorn this into the
>> existing ls-refs response, as opposed to some general "here's config for
>> you" response we don't have?
>
> It's not the same - from what I understand, what you're suggesting is
> setting a config in the repo that has just been cloned[...]

No, not to set config, i.e. during/after clone doing "git config
init.defaultBranch <remote>" wouldn't make any sense. Since that would
set config in .git/config, and that would (also?) apply /after/ the
clone, e.g. if you did "git init /tmp/somewhere/else" afterwards.

> [...]but this patch set does not set any such config[...].

It does, within the scope of the runtime of the process. I.e. just like
"git -c" or whatever. In builtin/clone.c you set "branch" from local
init.defaultBranch only if the remote did not provide us a value for it,
i.e. remote config for that config key overrides local config.

> Also, it may be strange for the server to be able to change the config
> of a currently running command - I would expect such a thing to only
> take effect on future runs of Git on that repo.

Yes, as I noted on v1 I think the semantics of this whole thing are a
bit strange :)

But if we're keeping the "strangeness" all I'm saying is that I think
it's more obvious to a user if we just declare the remote to be a
limited config source in terms of explaining this special-case.

And that once we're doing that it's also more obvious IMO to have that
be what's happening on the protocol level, if we're not expecting more
than one of these values.

I.e. if you ignore your current implementation internals and just view
git as a black box, then the functionality of this thing is
indistinguishable from the remote being a (limited) source of config.

So isn't it simpler to explain it to the user in those terms?

^ permalink raw reply	[flat|nested] 109+ messages in thread

* [PATCH v6 0/3] Cloning with remote unborn HEAD
  2020-12-08  1:31 Cloning empty repository uses locally configured default branch name Jonathan Tan
                   ` (3 preceding siblings ...)
  2021-01-26 18:55 ` [PATCH v5 " Jonathan Tan
@ 2021-02-02  2:14 ` Jonathan Tan
  2021-02-02  2:14   ` [PATCH v6 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
                     ` (2 more replies)
  2021-02-05  4:58 ` [PATCH v7 0/3] Cloning with " Jonathan Tan
  2021-02-05 20:48 ` [PATCH v8 " Jonathan Tan
  6 siblings, 3 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-02-02  2:14 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

I don't think we have consensus on whether the "unborn" config should be
advertise/allow/ignore or allow/ignore, so I have left that the same as
in version 5.

But as Junio suggested, I have isolated the reading of config into a
function and some global variables (so that one part can "leave a note"
to the other), so we could have advertise/allow/ignore (if we want)
without making the rest of the code much more complicated.

Other changes:
 - Updated the if/else cascade that parses the input to ls-refs to
   follow Junio's suggestion - "else if (!strcmp("unborn", arg)) {".
 - Moved create_symref() to cover the exact case it's needed.
 - Squashed Ramsay Jones's and Peff's patches.

Jonathan Tan (3):
  ls-refs: report unborn targets of symrefs
  connect, transport: encapsulate arg in struct
  clone: respect remote unborn HEAD

 Documentation/config.txt                |  2 +
 Documentation/config/init.txt           |  2 +-
 Documentation/config/lsrefs.txt         |  3 ++
 Documentation/technical/protocol-v2.txt | 10 ++++-
 builtin/clone.c                         | 34 ++++++++++----
 builtin/fetch-pack.c                    |  3 +-
 builtin/fetch.c                         | 18 +++++---
 builtin/ls-remote.c                     |  9 ++--
 connect.c                               | 32 ++++++++++++--
 ls-refs.c                               | 59 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 remote.h                                |  4 +-
 serve.c                                 |  2 +-
 t/t5606-clone-options.sh                |  8 ++--
 t/t5701-git-serve.sh                    |  2 +-
 t/t5702-protocol-v2.sh                  | 25 +++++++++++
 transport-helper.c                      |  5 ++-
 transport-internal.h                    | 10 +----
 transport.c                             | 23 +++++-----
 transport.h                             | 29 +++++++++---
 20 files changed, 218 insertions(+), 63 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt

Range-diff against v5:
1:  cb033f9abc ! 1:  411bbafe25 ls-refs: report unborn targets of symrefs
    @@ Documentation/technical/protocol-v2.txt: ls-refs takes in the following argument
          peeled = "peeled:" obj-id
     
      ## ls-refs.c ##
    +@@
    + #include "pkt-line.h"
    + #include "config.h"
    + 
    ++static int config_read;
    ++static int allow_unborn;
    ++
    ++static void ensure_config_read(void)
    ++{
    ++	if (config_read)
    ++		return;
    ++
    ++	if (repo_config_get_bool(the_repository, "lsrefs.allowunborn",
    ++				 &allow_unborn))
    ++		/*
    ++		 * If there is no such config, set it to 1 to allow it by
    ++		 * default.
    ++		 */
    ++		allow_unborn = 1;
    ++	config_read = 1;
    ++}
    ++
    + /*
    +  * Check if one of the prefixes is a prefix of the ref.
    +  * If no prefixes were provided, all refs match.
     @@ ls-refs.c: struct ls_refs_data {
      	unsigned peel;
      	unsigned symrefs;
      	struct strvec prefixes;
    -+	unsigned allow_unborn : 1;
     +	unsigned unborn : 1;
      };
      
    @@ ls-refs.c: static int send_ref(const char *refname, const struct object_id *oid,
      	if (data->symrefs && flag & REF_ISSYMREF) {
      		struct object_id unused;
      		const char *symref_target = resolve_ref_unsafe(refname, 0,
    +@@ ls-refs.c: static int send_ref(const char *refname, const struct object_id *oid,
    + 			    strip_namespace(symref_target));
    + 	}
    + 
    +-	if (data->peel) {
    ++	if (data->peel && oid) {
    + 		struct object_id peeled;
    + 		if (!peel_ref(refname, &peeled))
    + 			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));
     @@ ls-refs.c: static int send_ref(const char *refname, const struct object_id *oid,
      	return 0;
      }
      
    --static int ls_refs_config(const char *var, const char *value, void *data)
     +static void send_possibly_unborn_head(struct ls_refs_data *data)
    - {
    ++{
     +	struct strbuf namespaced = STRBUF_INIT;
     +	struct object_id oid;
     +	int flag;
    @@ ls-refs.c: static int send_ref(const char *refname, const struct object_id *oid,
     +	strbuf_release(&namespaced);
     +}
     +
    -+static int ls_refs_config(const char *var, const char *value, void *cb_data)
    -+{
    -+	struct ls_refs_data *data = cb_data;
    -+
    -+	if (!strcmp("lsrefs.allowunborn", var))
    -+		data->allow_unborn = git_config_bool(var, value);
    -+
    + static int ls_refs_config(const char *var, const char *value, void *data)
    + {
      	/*
    - 	 * We only serve fetches over v2 for now, so respect only "uploadpack"
    - 	 * config. This may need to eventually be expanded to "receive", but we
     @@ ls-refs.c: int ls_refs(struct repository *r, struct strvec *keys,
      
      	memset(&data, 0, sizeof(data));
      
    --	git_config(ls_refs_config, NULL);
    -+	data.allow_unborn = 1;
    -+	git_config(ls_refs_config, &data);
    ++	ensure_config_read();
    + 	git_config(ls_refs_config, NULL);
      
      	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
    - 		const char *arg = request->line;
     @@ ls-refs.c: int ls_refs(struct repository *r, struct strvec *keys,
      			data.symrefs = 1;
      		else if (skip_prefix(arg, "ref-prefix ", &out))
      			strvec_push(&data.prefixes, out);
    -+		else if (data.allow_unborn && !strcmp("unborn", arg))
    -+			data.unborn = 1;
    ++		else if (!strcmp("unborn", arg))
    ++			data.unborn = allow_unborn;
      	}
      
      	if (request->status != PACKET_READ_FLUSH)
    @@ ls-refs.c: int ls_refs(struct repository *r, struct strvec *keys,
     +int ls_refs_advertise(struct repository *r, struct strbuf *value)
     +{
     +	if (value) {
    -+		int allow_unborn_value;
    -+
    -+		if (repo_config_get_bool(the_repository,
    -+					 "lsrefs.allowunborn",
    -+					 &allow_unborn_value) ||
    -+		    allow_unborn_value)
    ++		ensure_config_read();
    ++		if (allow_unborn)
     +			strbuf_addstr(value, "unborn");
     +	}
     +
2:  0c7ab71872 ! 2:  fad1ebe6b6 connect, transport: encapsulate arg in struct
    @@ transport-helper.c: static int has_attribute(const char *attrs, const char *attr
      	return get_refs_list_using_list(transport, for_push);
     
      ## transport-internal.h ##
    +@@
    + struct ref;
    + struct transport;
    + struct strvec;
    ++struct transport_ls_refs_options;
    + 
    + struct transport_vtable {
    + 	/**
     @@ transport-internal.h: struct transport_vtable {
      	 * the transport to try to share connections, for_push is a
      	 * hint as to whether the ultimate operation is a push or a fetch.
3:  8015415c79 ! 3:  45a48ccc0d clone: respect remote unborn HEAD
    @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
     +					"refs/heads/", &branch)) {
     +				ref = transport_ls_refs_options.unborn_head_target;
     +				transport_ls_refs_options.unborn_head_target = NULL;
    ++				create_symref("HEAD", ref, reflog_msg.buf);
     +			} else {
     +				branch = git_default_branch_name();
     +				ref = xstrfmt("refs/heads/%s", branch);
     +			}
      
      			install_branch_config(0, branch, remote_name, ref);
    -+			create_symref("HEAD", ref, "");
      			free(ref);
    - 		}
    - 	}
     @@ builtin/clone.c: int cmd_clone(int argc, const char **argv, const char *prefix)
      	junk_mode = JUNK_LEAVE_ALL;
      
4:  1c06db6494 < -:  ---------- ls-refs: don't peel NULL oid
5:  30d83a9dfa < -:  ---------- transport-internal.h: fix a 'hdr-check' warning
-- 
2.30.0.365.g02bc693789-goog


^ permalink raw reply	[flat|nested] 109+ messages in thread

* [PATCH v6 1/3] ls-refs: report unborn targets of symrefs
  2021-02-02  2:14 ` [PATCH v6 " Jonathan Tan
@ 2021-02-02  2:14   ` Jonathan Tan
  2021-02-02 16:55     ` Junio C Hamano
  2021-02-02  2:15   ` [PATCH v6 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
  2021-02-02  2:15   ` [PATCH v6 3/3] clone: respect remote unborn HEAD Jonathan Tan
  2 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2021-02-02  2:14 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

When cloning, we choose the default branch based on the remote HEAD.
But if there is no remote HEAD reported (which could happen if the
target of the remote HEAD is unborn), we'll fall back to using our local
init.defaultBranch. Traditionally this hasn't been a big deal, because
most repos used "master" as the default. But these days it is likely to
cause confusion if the server and client implementations choose
different values (e.g., if the remote started with "main", we may choose
"master" locally, create commits there, and then the user is surprised
when they push to "master" and not "main").

To solve this, the remote needs to communicate the target of the HEAD
symref, even if it is unborn, and "git clone" needs to use this
information.

Currently, symrefs that have unborn targets (such as in this case) are
not communicated by the protocol. Teach Git to advertise and support the
"unborn" feature in "ls-refs" (by default, this is advertised, but
server administrators may turn this off through the lsrefs.allowunborn
config). This feature indicates that "ls-refs" supports the "unborn"
argument; when it is specified, "ls-refs" will send the HEAD symref with
the name of its unborn target.

This change is only for protocol v2. A similar change for protocol v0
would require independent protocol design (there being no analogous
position to signal support for "unborn") and client-side plumbing of the
data required, so the scope of this patch set is limited to protocol v2.

The client side will be updated to use this in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt                |  2 +
 Documentation/config/lsrefs.txt         |  3 ++
 Documentation/technical/protocol-v2.txt | 10 ++++-
 ls-refs.c                               | 59 +++++++++++++++++++++++--
 ls-refs.h                               |  1 +
 serve.c                                 |  2 +-
 t/t5701-git-serve.sh                    |  2 +-
 7 files changed, 73 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6ba50b1104..d08e83a148 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -398,6 +398,8 @@ include::config/interactive.txt[]
 
 include::config/log.txt[]
 
+include::config/lsrefs.txt[]
+
 include::config/mailinfo.txt[]
 
 include::config/mailmap.txt[]
diff --git a/Documentation/config/lsrefs.txt b/Documentation/config/lsrefs.txt
new file mode 100644
index 0000000000..dcbec11aaa
--- /dev/null
+++ b/Documentation/config/lsrefs.txt
@@ -0,0 +1,3 @@
+lsrefs.allowUnborn::
+	Allow the server to send information about unborn symrefs during the
+	protocol v2 ref advertisement.
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 85daeb5d9e..4707511c10 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -192,11 +192,19 @@ ls-refs takes in the following arguments:
 	When specified, only references having a prefix matching one of
 	the provided prefixes are displayed.
 
+If the 'unborn' feature is advertised the following argument can be
+included in the client's request.
+
+    unborn
+	The server may send symrefs pointing to unborn branches in the form
+	"unborn <refname> symref-target:<target>".
+
 The output of ls-refs is as follows:
 
     output = *ref
 	     flush-pkt
-    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+    obj-id-or-unborn = (obj-id | "unborn")
+    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
     ref-attribute = (symref | peeled)
     symref = "symref-target:" symref-target
     peeled = "peeled:" obj-id
diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..daf4e77a4a 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -7,6 +7,24 @@
 #include "pkt-line.h"
 #include "config.h"
 
+static int config_read;
+static int allow_unborn;
+
+static void ensure_config_read(void)
+{
+	if (config_read)
+		return;
+
+	if (repo_config_get_bool(the_repository, "lsrefs.allowunborn",
+				 &allow_unborn))
+		/*
+		 * If there is no such config, set it to 1 to allow it by
+		 * default.
+		 */
+		allow_unborn = 1;
+	config_read = 1;
+}
+
 /*
  * Check if one of the prefixes is a prefix of the ref.
  * If no prefixes were provided, all refs match.
@@ -32,6 +50,7 @@ struct ls_refs_data {
 	unsigned peel;
 	unsigned symrefs;
 	struct strvec prefixes;
+	unsigned unborn : 1;
 };
 
 static int send_ref(const char *refname, const struct object_id *oid,
@@ -47,7 +66,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	if (!ref_match(&data->prefixes, refname_nons))
 		return 0;
 
-	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (oid)
+		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	else
+		strbuf_addf(&refline, "unborn %s", refname_nons);
 	if (data->symrefs && flag & REF_ISSYMREF) {
 		struct object_id unused;
 		const char *symref_target = resolve_ref_unsafe(refname, 0,
@@ -61,7 +83,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 			    strip_namespace(symref_target));
 	}
 
-	if (data->peel) {
+	if (data->peel && oid) {
 		struct object_id peeled;
 		if (!peel_ref(refname, &peeled))
 			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));
@@ -74,6 +96,23 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
+static void send_possibly_unborn_head(struct ls_refs_data *data)
+{
+	struct strbuf namespaced = STRBUF_INIT;
+	struct object_id oid;
+	int flag;
+	int oid_is_null;
+
+	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
+	if (!resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag))
+		return; /* bad ref */
+	oid_is_null = is_null_oid(&oid);
+	if (!oid_is_null ||
+	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
+		send_ref(namespaced.buf, oid_is_null ? NULL : &oid, flag, data);
+	strbuf_release(&namespaced);
+}
+
 static int ls_refs_config(const char *var, const char *value, void *data)
 {
 	/*
@@ -91,6 +130,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
 
 	memset(&data, 0, sizeof(data));
 
+	ensure_config_read();
 	git_config(ls_refs_config, NULL);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
@@ -103,14 +143,27 @@ int ls_refs(struct repository *r, struct strvec *keys,
 			data.symrefs = 1;
 		else if (skip_prefix(arg, "ref-prefix ", &out))
 			strvec_push(&data.prefixes, out);
+		else if (!strcmp("unborn", arg))
+			data.unborn = allow_unborn;
 	}
 
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ls-refs arguments"));
 
-	head_ref_namespaced(send_ref, &data);
+	send_possibly_unborn_head(&data);
 	for_each_namespaced_ref(send_ref, &data);
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
 	return 0;
 }
+
+int ls_refs_advertise(struct repository *r, struct strbuf *value)
+{
+	if (value) {
+		ensure_config_read();
+		if (allow_unborn)
+			strbuf_addstr(value, "unborn");
+	}
+
+	return 1;
+}
diff --git a/ls-refs.h b/ls-refs.h
index 7b33a7c6b8..a99e4be0bd 100644
--- a/ls-refs.h
+++ b/ls-refs.h
@@ -6,5 +6,6 @@ struct strvec;
 struct packet_reader;
 int ls_refs(struct repository *r, struct strvec *keys,
 	    struct packet_reader *request);
+int ls_refs_advertise(struct repository *r, struct strbuf *value);
 
 #endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index eec2fe6f29..ac20c72763 100644
--- a/serve.c
+++ b/serve.c
@@ -73,7 +73,7 @@ struct protocol_capability {
 
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
-	{ "ls-refs", always_advertise, ls_refs },
+	{ "ls-refs", ls_refs_advertise, ls_refs },
 	{ "fetch", upload_pack_advertise, upload_pack_v2 },
 	{ "server-option", always_advertise, NULL },
 	{ "object-format", object_format_advertise, NULL },
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index a1f5fdc9fd..df29504161 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -12,7 +12,7 @@ test_expect_success 'test capability advertisement' '
 	cat >expect <<-EOF &&
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
-	ls-refs
+	ls-refs=unborn
 	fetch=shallow
 	server-option
 	object-format=$(test_oid algo)
-- 
2.30.0.365.g02bc693789-goog



* [PATCH v6 2/3] connect, transport: encapsulate arg in struct
  2021-02-02  2:14 ` [PATCH v6 " Jonathan Tan
  2021-02-02  2:14   ` [PATCH v6 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
@ 2021-02-02  2:15   ` Jonathan Tan
  2021-02-02  2:15   ` [PATCH v6 3/3] clone: respect remote unborn HEAD Jonathan Tan
  2 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-02-02  2:15 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

A future patch will return the name of an unborn current branch from deep
in the call chain to the callers of get_remote_refs() and
transport_get_remote_refs(), via a new pointer parameter that points at a
variable in the caller.

In preparation for that, encapsulate the existing ref_prefixes
parameter into a struct. The aforementioned unborn current branch will
go into this new struct in the future patch.
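
The general shape of this refactoring can be sketched as follows. This is
only an illustration under assumed names: prefix_list, list_refs_options and
count_prefixes() are hypothetical stand-ins, not the strvec or
transport_ls_refs_options types touched by the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a string vector: a borrowed array of strings. */
struct prefix_list {
	const char **v;
	size_t nr;
};

/*
 * Hypothetical options struct. Callers pass a pointer to it instead of the
 * bare prefix list, so new fields can be added later without changing any
 * function signature again.
 */
struct list_refs_options {
	struct prefix_list ref_prefixes;
};
#define LIST_REFS_OPTIONS_INIT { { NULL, 0 } }

/* The callee accepts NULL to mean "no options were given". */
size_t count_prefixes(const struct list_refs_options *opts)
{
	const struct prefix_list *prefixes = opts ? &opts->ref_prefixes : NULL;

	/* A non-empty list must have backing storage. */
	assert(!prefixes || prefixes->nr == 0 || prefixes->v);
	return prefixes ? prefixes->nr : 0;
}
```

Unpacking the struct at the top of the callee, as get_remote_refs() does in
the diff below, keeps the rest of the function body unchanged.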

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/clone.c      | 18 +++++++++++-------
 builtin/fetch-pack.c |  3 ++-
 builtin/fetch.c      | 18 +++++++++++-------
 builtin/ls-remote.c  |  9 +++++----
 connect.c            |  4 +++-
 remote.h             |  4 +++-
 transport-helper.c   |  5 +++--
 transport-internal.h | 10 ++--------
 transport.c          | 23 ++++++++++++-----------
 transport.h          | 21 ++++++++++++++-------
 10 files changed, 66 insertions(+), 49 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index a5630337e4..211d4f54b0 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -979,7 +979,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	int err = 0, complete_refs_before_fetch = 1;
 	int submodule_progress;
 
-	struct strvec ref_prefixes = STRVEC_INIT;
+	struct transport_ls_refs_options transport_ls_refs_options =
+		TRANSPORT_LS_REFS_OPTIONS_INIT;
 
 	packet_trace_identity("clone");
 
@@ -1257,14 +1258,17 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		transport->smart_options->check_self_contained_and_connected = 1;
 
 
-	strvec_push(&ref_prefixes, "HEAD");
-	refspec_ref_prefixes(&remote->fetch, &ref_prefixes);
+	strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
+	refspec_ref_prefixes(&remote->fetch,
+			     &transport_ls_refs_options.ref_prefixes);
 	if (option_branch)
-		expand_ref_prefix(&ref_prefixes, option_branch);
+		expand_ref_prefix(&transport_ls_refs_options.ref_prefixes,
+				  option_branch);
 	if (!option_no_tags)
-		strvec_push(&ref_prefixes, "refs/tags/");
+		strvec_push(&transport_ls_refs_options.ref_prefixes,
+			    "refs/tags/");
 
-	refs = transport_get_remote_refs(transport, &ref_prefixes);
+	refs = transport_get_remote_refs(transport, &transport_ls_refs_options);
 
 	if (refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
@@ -1380,6 +1384,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	strbuf_release(&key);
 	junk_mode = JUNK_LEAVE_ALL;
 
-	strvec_clear(&ref_prefixes);
+	strvec_clear(&transport_ls_refs_options.ref_prefixes);
 	return err;
 }
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 58b7c1fbdc..c2d96f4c89 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	version = discover_version(&reader);
 	switch (version) {
 	case protocol_v2:
-		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
+		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
+				args.stateless_rpc);
 		break;
 	case protocol_v1:
 	case protocol_v0:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index ecf8537605..837382ef4f 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1349,7 +1349,8 @@ static int do_fetch(struct transport *transport,
 	int autotags = (transport->remote->fetch_tags == 1);
 	int retcode = 0;
 	const struct ref *remote_refs;
-	struct strvec ref_prefixes = STRVEC_INIT;
+	struct transport_ls_refs_options transport_ls_refs_options =
+		TRANSPORT_LS_REFS_OPTIONS_INIT;
 	int must_list_refs = 1;
 
 	if (tags == TAGS_DEFAULT) {
@@ -1369,7 +1370,7 @@ static int do_fetch(struct transport *transport,
 	if (rs->nr) {
 		int i;
 
-		refspec_ref_prefixes(rs, &ref_prefixes);
+		refspec_ref_prefixes(rs, &transport_ls_refs_options.ref_prefixes);
 
 		/*
 		 * We can avoid listing refs if all of them are exact
@@ -1383,22 +1384,25 @@ static int do_fetch(struct transport *transport,
 			}
 		}
 	} else if (transport->remote && transport->remote->fetch.nr)
-		refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
+		refspec_ref_prefixes(&transport->remote->fetch,
+				     &transport_ls_refs_options.ref_prefixes);
 
 	if (tags == TAGS_SET || tags == TAGS_DEFAULT) {
 		must_list_refs = 1;
-		if (ref_prefixes.nr)
-			strvec_push(&ref_prefixes, "refs/tags/");
+		if (transport_ls_refs_options.ref_prefixes.nr)
+			strvec_push(&transport_ls_refs_options.ref_prefixes,
+				    "refs/tags/");
 	}
 
 	if (must_list_refs) {
 		trace2_region_enter("fetch", "remote_refs", the_repository);
-		remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
+		remote_refs = transport_get_remote_refs(transport,
+							&transport_ls_refs_options);
 		trace2_region_leave("fetch", "remote_refs", the_repository);
 	} else
 		remote_refs = NULL;
 
-	strvec_clear(&ref_prefixes);
+	strvec_clear(&transport_ls_refs_options.ref_prefixes);
 
 	ref_map = get_ref_map(transport->remote, remote_refs, rs,
 			      tags, &autotags);
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index 092917eca2..ef604752a0 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -45,7 +45,8 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	int show_symref_target = 0;
 	const char *uploadpack = NULL;
 	const char **pattern = NULL;
-	struct strvec ref_prefixes = STRVEC_INIT;
+	struct transport_ls_refs_options transport_options =
+		TRANSPORT_LS_REFS_OPTIONS_INIT;
 	int i;
 	struct string_list server_options = STRING_LIST_INIT_DUP;
 
@@ -94,9 +95,9 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	}
 
 	if (flags & REF_TAGS)
-		strvec_push(&ref_prefixes, "refs/tags/");
+		strvec_push(&transport_options.ref_prefixes, "refs/tags/");
 	if (flags & REF_HEADS)
-		strvec_push(&ref_prefixes, "refs/heads/");
+		strvec_push(&transport_options.ref_prefixes, "refs/heads/");
 
 	remote = remote_get(dest);
 	if (!remote) {
@@ -118,7 +119,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (server_options.nr)
 		transport->server_options = &server_options;
 
-	ref = transport_get_remote_refs(transport, &ref_prefixes);
+	ref = transport_get_remote_refs(transport, &transport_options);
 	if (ref) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 		repo_set_hash_algo(the_repository, hash_algo);
diff --git a/connect.c b/connect.c
index 8b8f56cf6d..328c279250 100644
--- a/connect.c
+++ b/connect.c
@@ -453,12 +453,14 @@ void check_stateless_delimiter(int stateless_rpc,
 
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
-			     const struct strvec *ref_prefixes,
+			     struct transport_ls_refs_options *transport_options,
 			     const struct string_list *server_options,
 			     int stateless_rpc)
 {
 	int i;
 	const char *hash_name;
+	struct strvec *ref_prefixes = transport_options ?
+		&transport_options->ref_prefixes : NULL;
 	*list = NULL;
 
 	if (server_supports_v2("ls-refs", 1))
diff --git a/remote.h b/remote.h
index 3211abdf05..4ae676a11b 100644
--- a/remote.h
+++ b/remote.h
@@ -6,6 +6,8 @@
 #include "hashmap.h"
 #include "refspec.h"
 
+struct transport_ls_refs_options;
+
 /**
  * The API gives access to the configuration related to remotes. It handles
  * all three configuration mechanisms historically and currently used by Git,
@@ -196,7 +198,7 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 /* Used for protocol v2 in order to retrieve refs from a remote */
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
-			     const struct strvec *ref_prefixes,
+			     struct transport_ls_refs_options *transport_options,
 			     const struct string_list *server_options,
 			     int stateless_rpc);
 
diff --git a/transport-helper.c b/transport-helper.c
index 5f6e0b3bd8..49b7fb4dcb 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1162,13 +1162,14 @@ static int has_attribute(const char *attrs, const char *attr)
 }
 
 static struct ref *get_refs_list(struct transport *transport, int for_push,
-				 const struct strvec *ref_prefixes)
+				 struct transport_ls_refs_options *transport_options)
 {
 	get_helper(transport);
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push, ref_prefixes);
+		return transport->vtable->get_refs_list(transport, for_push,
+							transport_options);
 	}
 
 	return get_refs_list_using_list(transport, for_push);
diff --git a/transport-internal.h b/transport-internal.h
index 27c9daffc4..b60f1ba907 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -4,6 +4,7 @@
 struct ref;
 struct transport;
 struct strvec;
+struct transport_ls_refs_options;
 
 struct transport_vtable {
 	/**
@@ -18,19 +19,12 @@ struct transport_vtable {
 	 * the transport to try to share connections, for_push is a
 	 * hint as to whether the ultimate operation is a push or a fetch.
 	 *
-	 * If communicating using protocol v2 a list of prefixes can be
-	 * provided to be sent to the server to enable it to limit the ref
-	 * advertisement.  Since ref filtering is done on the server's end, and
-	 * only when using protocol v2, this list will be ignored when not
-	 * using protocol v2 meaning this function can return refs which don't
-	 * match the provided ref_prefixes.
-	 *
 	 * If the transport is able to determine the remote hash for
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
 	 **/
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
-				     const struct strvec *ref_prefixes);
+				     struct transport_ls_refs_options *transport_options);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index 679a35e7c1..b13fab5dc3 100644
--- a/transport.c
+++ b/transport.c
@@ -127,7 +127,7 @@ struct bundle_transport_data {
 
 static struct ref *get_refs_from_bundle(struct transport *transport,
 					int for_push,
-					const struct strvec *ref_prefixes)
+					struct transport_ls_refs_options *transport_options)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -280,7 +280,7 @@ static void die_if_server_options(struct transport *transport)
  * remote refs.
  */
 static struct ref *handshake(struct transport *transport, int for_push,
-			     const struct strvec *ref_prefixes,
+			     struct transport_ls_refs_options *options,
 			     int must_list_refs)
 {
 	struct git_transport_data *data = transport->data;
@@ -303,7 +303,7 @@ static struct ref *handshake(struct transport *transport, int for_push,
 			trace2_data_string("transfer", NULL, "server-sid", server_sid);
 		if (must_list_refs)
 			get_remote_refs(data->fd[1], &reader, &refs, for_push,
-					ref_prefixes,
+					options,
 					transport->server_options,
 					transport->stateless_rpc);
 		break;
@@ -334,9 +334,9 @@ static struct ref *handshake(struct transport *transport, int for_push,
 }
 
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
-					const struct strvec *ref_prefixes)
+					struct transport_ls_refs_options *options)
 {
-	return handshake(transport, for_push, ref_prefixes, 1);
+	return handshake(transport, for_push, options, 1);
 }
 
 static int fetch_refs_via_pack(struct transport *transport,
@@ -1252,19 +1252,20 @@ int transport_push(struct repository *r,
 		int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
 		int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
 		int push_ret, ret, err;
-		struct strvec ref_prefixes = STRVEC_INIT;
+		struct transport_ls_refs_options transport_options =
+			TRANSPORT_LS_REFS_OPTIONS_INIT;
 
 		if (check_push_refs(local_refs, rs) < 0)
 			return -1;
 
-		refspec_ref_prefixes(rs, &ref_prefixes);
+		refspec_ref_prefixes(rs, &transport_options.ref_prefixes);
 
 		trace2_region_enter("transport_push", "get_refs_list", r);
 		remote_refs = transport->vtable->get_refs_list(transport, 1,
-							       &ref_prefixes);
+							       &transport_options);
 		trace2_region_leave("transport_push", "get_refs_list", r);
 
-		strvec_clear(&ref_prefixes);
+		strvec_clear(&transport_options.ref_prefixes);
 
 		if (flags & TRANSPORT_PUSH_ALL)
 			match_flags |= MATCH_REFS_ALL;
@@ -1380,12 +1381,12 @@ int transport_push(struct repository *r,
 }
 
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes)
+					    struct transport_ls_refs_options *transport_options)
 {
 	if (!transport->got_remote_refs) {
 		transport->remote_refs =
 			transport->vtable->get_refs_list(transport, 0,
-							 ref_prefixes);
+							 transport_options);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 24558c027d..1f5b60e4d3 100644
--- a/transport.h
+++ b/transport.h
@@ -233,17 +233,24 @@ int transport_push(struct repository *repo,
 		   struct refspec *rs, int flags,
 		   unsigned int * reject_reasons);
 
+struct transport_ls_refs_options {
+	/*
+	 * Optionally, a list of ref prefixes can be provided which can be sent
+	 * to the server (when communicating using protocol v2) to enable it to
+	 * limit the ref advertisement.  Since ref filtering is done on the
+	 * server's end (and only when using protocol v2),
+	 * transport_get_remote_refs() could return refs which don't match the
+	 * provided ref_prefixes.
+	 */
+	struct strvec ref_prefixes;
+};
+#define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }
+
 /*
  * Retrieve refs from a remote.
- *
- * Optionally a list of ref prefixes can be provided which can be sent to the
- * server (when communicating using protocol v2) to enable it to limit the ref
- * advertisement.  Since ref filtering is done on the server's end (and only
- * when using protocol v2), this can return refs which don't match the provided
- * ref_prefixes.
  */
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes);
+					    struct transport_ls_refs_options *transport_options);
 
 /*
  * Fetch the hash algorithm used by a remote.
-- 
2.30.0.365.g02bc693789-goog



* [PATCH v6 3/3] clone: respect remote unborn HEAD
  2021-02-02  2:14 ` [PATCH v6 " Jonathan Tan
  2021-02-02  2:14   ` [PATCH v6 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
  2021-02-02  2:15   ` [PATCH v6 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
@ 2021-02-02  2:15   ` Jonathan Tan
  2 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-02-02  2:15 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

Teach Git to use the "unborn" feature introduced in a previous patch as
follows: Git will always send the "unborn" argument if it is supported
by the server. During "git clone", if cloning an empty repository, Git
will use the new information to determine the local branch to create. In
all other cases, Git will ignore it.
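
The client-side decision can be sketched roughly as below. This is a
simplified illustration, not the code in connect.c or builtin/clone.c: the
helpers unborn_head_target() and pick_branch() are hypothetical, the input
line is assumed to have its trailing newline already stripped, and
"symref-target:" is assumed to be the last attribute on the line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if line is "unborn HEAD ... symref-target:<target>",
 * return a pointer to <target> inside line; otherwise return NULL.
 */
const char *unborn_head_target(const char *line)
{
	static const char lead[] = "unborn HEAD ";
	static const char attr[] = "symref-target:";
	const char *p;

	assert(line);
	if (strncmp(line, lead, strlen(lead)))
		return NULL;
	for (p = line + strlen(lead); *p; ) {
		if (!strncmp(p, attr, strlen(attr)))
			return p + strlen(attr);
		p += strcspn(p, " ");	/* skip the current attribute */
		p += strspn(p, " ");	/* skip the separator */
	}
	return NULL;
}

/*
 * Hypothetical helper: use the remote's unborn branch if it lives under
 * refs/heads/, else fall back to the locally configured default name.
 */
const char *pick_branch(const char *target, const char *fallback)
{
	static const char heads[] = "refs/heads/";

	if (target && !strncmp(target, heads, strlen(heads)))
		return target + strlen(heads);
	return fallback;
}
```

For example, a line of "unborn HEAD symref-target:refs/heads/main" with a
local default of "master" selects "main", while a missing or non-branch
target selects "master".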

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/init.txt |  2 +-
 builtin/clone.c               | 16 ++++++++++++++--
 connect.c                     | 28 ++++++++++++++++++++++++++--
 t/t5606-clone-options.sh      |  8 +++++---
 t/t5702-protocol-v2.sh        | 25 +++++++++++++++++++++++++
 transport.h                   |  8 ++++++++
 6 files changed, 79 insertions(+), 8 deletions(-)

diff --git a/Documentation/config/init.txt b/Documentation/config/init.txt
index dc77f8c844..79c79d6617 100644
--- a/Documentation/config/init.txt
+++ b/Documentation/config/init.txt
@@ -4,4 +4,4 @@ init.templateDir::
 
 init.defaultBranch::
 	Allows overriding the default branch name e.g. when initializing
-	a new repository or when cloning an empty repository.
+	a new repository.
diff --git a/builtin/clone.c b/builtin/clone.c
index 211d4f54b0..09dcd97a2e 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1330,8 +1330,19 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		remote_head = NULL;
 		option_no_checkout = 1;
 		if (!option_bare) {
-			const char *branch = git_default_branch_name();
-			char *ref = xstrfmt("refs/heads/%s", branch);
+			const char *branch;
+			char *ref;
+
+			if (transport_ls_refs_options.unborn_head_target &&
+			    skip_prefix(transport_ls_refs_options.unborn_head_target,
+					"refs/heads/", &branch)) {
+				ref = transport_ls_refs_options.unborn_head_target;
+				transport_ls_refs_options.unborn_head_target = NULL;
+				create_symref("HEAD", ref, reflog_msg.buf);
+			} else {
+				branch = git_default_branch_name();
+				ref = xstrfmt("refs/heads/%s", branch);
+			}
 
 			install_branch_config(0, branch, remote_name, ref);
 			free(ref);
@@ -1385,5 +1396,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	junk_mode = JUNK_LEAVE_ALL;
 
 	strvec_clear(&transport_ls_refs_options.ref_prefixes);
+	free(transport_ls_refs_options.unborn_head_target);
 	return err;
 }
diff --git a/connect.c b/connect.c
index 328c279250..879669df93 100644
--- a/connect.c
+++ b/connect.c
@@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 }
 
 /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
-static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
+static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
+			  char **unborn_head_target)
 {
 	int ret = 1;
 	int i = 0;
@@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
 		goto out;
 	}
 
+	if (!strcmp("unborn", line_sections.items[i].string)) {
+		i++;
+		if (unborn_head_target &&
+		    !strcmp("HEAD", line_sections.items[i++].string)) {
+			/*
+			 * Look for the symref target (if any). If found,
+			 * return it to the caller.
+			 */
+			for (; i < line_sections.nr; i++) {
+				const char *arg = line_sections.items[i].string;
+
+				if (skip_prefix(arg, "symref-target:", &arg)) {
+					*unborn_head_target = xstrdup(arg);
+					break;
+				}
+			}
+		}
+		goto out;
+	}
 	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
 	    *end) {
 		ret = 0;
@@ -461,6 +481,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	const char *hash_name;
 	struct strvec *ref_prefixes = transport_options ?
 		&transport_options->ref_prefixes : NULL;
+	char **unborn_head_target = transport_options ?
+		&transport_options->unborn_head_target : NULL;
 	*list = NULL;
 
 	if (server_supports_v2("ls-refs", 1))
@@ -490,6 +512,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	if (!for_push)
 		packet_write_fmt(fd_out, "peel\n");
 	packet_write_fmt(fd_out, "symrefs\n");
+	if (server_supports_feature("ls-refs", "unborn", 0))
+		packet_write_fmt(fd_out, "unborn\n");
 	for (i = 0; ref_prefixes && i < ref_prefixes->nr; i++) {
 		packet_write_fmt(fd_out, "ref-prefix %s\n",
 				 ref_prefixes->v[i]);
@@ -498,7 +522,7 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
-		if (!process_ref_v2(reader, &list))
+		if (!process_ref_v2(reader, &list, unborn_head_target))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
 
diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
index 7f082fb23b..0111d4e8bd 100755
--- a/t/t5606-clone-options.sh
+++ b/t/t5606-clone-options.sh
@@ -102,11 +102,13 @@ test_expect_success 'redirected clone -v does show progress' '
 '
 
 test_expect_success 'chooses correct default initial branch name' '
-	git init --bare empty &&
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=foo init --bare empty &&
+	test_config -C empty lsrefs.allowUnborn true &&
 	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
 	git -c init.defaultBranch=up clone empty whats-up &&
-	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
-	test refs/heads/up = $(git -C whats-up config branch.up.merge)
+	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
+	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
 '
 
 test_expect_success 'guesses initial branch name correctly' '
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d5b17909b..a8ef92b644 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -209,6 +209,31 @@ test_expect_success 'clone with file:// using protocol v2' '
 	grep "ref-prefix refs/tags/" log
 '
 
+test_expect_success 'clone of empty repo propagates name of default branch' '
+	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
+test_expect_success '...but not if explicitly forbidden by config' '
+	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+	test_config -C file_empty_parent lsrefs.allowUnborn false &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	! grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
 test_expect_success 'fetch with file:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
diff --git a/transport.h b/transport.h
index 1f5b60e4d3..24e15799e7 100644
--- a/transport.h
+++ b/transport.h
@@ -243,6 +243,14 @@ struct transport_ls_refs_options {
 	 * provided ref_prefixes.
 	 */
 	struct strvec ref_prefixes;
+
+	/*
+	 * If unborn_head_target is not NULL, and the remote reports HEAD as
+	 * pointing to an unborn branch, transport_get_remote_refs() stores the
+	 * unborn branch in unborn_head_target. It should be freed by the
+	 * caller.
+	 */
+	char *unborn_head_target;
 };
 #define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }
 
-- 
2.30.0.365.g02bc693789-goog



* Re: [PATCH v5 1/3] ls-refs: report unborn targets of symrefs
  2021-01-29 22:04         ` Junio C Hamano
@ 2021-02-02  2:20           ` Jonathan Tan
  2021-02-02  5:00             ` Junio C Hamano
  0 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2021-02-02  2:20 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git

> So a dangling symref, e.g. "refs/remotes/origin/HEAD -> trunk" when
> no "refs/remotes/origin/trunk" exists, is not reported to send_ref()
> in the same way as an unborn "HEAD"?

I've tried it, and yes, for_each_namespaced_ref() will not report it.
Looking at the code, I think it is packed_ref_iterator_advance() which
checks for broken objects and skips over them.

> I would have expected that we'd
> report where it points at, and for that to work, you'd have to use
> not just the vanilla send_ref() as the callback, but something that
> knows how to do "are we expected to send unborn symrefs" logic, like
> send_possibly_unborn_head does.
> 
> That "changed to tolerate ... should work" worries me.
> 
> If "for_each_namespaced_ref(send_ref, &data)" will never call send_ref()
> with NULL (as in (void *)0) oid, then that would be OK,

If it called send_ref() with (void *)0 OID, it would segfault with the
current code, which calls oid_to_hex() on the OID unconditionally.

> but if it
> ends up calling with NULL somehow, it is responsible to ensure that
> data->symrefs is true and flag has REF_ISSYMREF set, or send_ref()
> would misbehave, (see the first part of your message, which I am
> responding to), no?

If it did, then yes.


* Re: [PATCH v5 0/3] Cloning with remote unborn HEAD
  2021-01-30  4:41     ` Jonathan Tan
  2021-01-30 11:13       ` Ævar Arnfjörð Bjarmason
@ 2021-02-02  2:22       ` Jonathan Tan
  2021-02-03 14:23         ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2021-02-02  2:22 UTC (permalink / raw)
  To: jonathantanmy; +Cc: avarab, git

> > I really don't care much, but this really needs a corresponding
> > documentation update. I.e. something like:
> > 
> >     init.defaultBranch::
> >         Allows overriding the default branch name e.g. when initializing a
> >         new repository or when cloning an empty repository.
> >     
> >         When cloning a repository over protocol v2 (i.e. ssh://, https://,
> >         file://, but not a /some/path), and if that repository has
> >         init.defaultBranch configured, the server will advertise its
> >         preferred default branch name, and we'll take its configuration over
> >         ours.
> 
> Thanks - I'll use some of your wording, but I think it's best to leave
> open the possibility that cloning using protocol v0 or the disk clone
> (/some/path) copies over the current HEAD as well.

Looking back on this, I think that it's natural to think that both an
empty repository and a non-empty one have a HEAD that points somewhere,
and "git clone" would behave the same way in both cases. So I'll hold
off on the documentation change.

* Re: [PATCH v5 1/3] ls-refs: report unborn targets of symrefs
  2021-02-02  2:20           ` Jonathan Tan
@ 2021-02-02  5:00             ` Junio C Hamano
  0 siblings, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2021-02-02  5:00 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git

Jonathan Tan <jonathantanmy@google.com> writes:

>> So a dangling symref, e.g. "refs/remotes/origin/HEAD -> trunk" when
>> no "refs/remotes/origin/trunk" exists, is not reported to send_ref()
>> in the same way as an unborn "HEAD"?
>
> I've tried it, and yes, for_each_namespaced_ref() will not report it.
> Looking at the code, I think it is packed_ref_iterator_advance() which
> checks for broken objects and skips over them.
>
>> I would have expected that we'd
>> report where it points at, and for that to work, you'd have to use
>> not just the vanilla send_ref() as the callback, but something that
>> knows how to do "are we expected to send unborn symrefs" logic, like
>> send_possibly_unborn_head does.
>> 
>> That "changed to tolerate ... should work" worries me.
>> 
>> If "for_each_namespaced_ref(send_ref, &data)" will never call send_ref()
>> with NULL (as in (void *)0) oid, then that would be OK,
>
> If it called send_ref() with (void *)0 OID, it would segfault with the
> current code, which calls oid_to_hex() on the OID unconditionally.
>
>> but if it
>> ends up calling with NULL somehow, it is responsible to ensure that
>> data->symrefs is true and flag has REF_ISSYMREF set, or send_ref()
>> would misbehave, (see the first part of your message, which I am
>> responding to), no?
>
> If it did, then yes.

So, in other words, the series is *only* about HEAD and no other
symrefs are reported when dangling as "unborn"?

* Re: [PATCH v6 1/3] ls-refs: report unborn targets of symrefs
  2021-02-02  2:14   ` [PATCH v6 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
@ 2021-02-02 16:55     ` Junio C Hamano
  2021-02-02 18:34       ` Jonathan Tan
  0 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2021-02-02 16:55 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff

Jonathan Tan <jonathantanmy@google.com> writes:

> When cloning, we choose the default branch based on the remote HEAD.
> But if there is no remote HEAD reported (which could happen if the
> target of the remote HEAD is unborn), we'll fall back to using our local
> init.defaultBranch. Traditionally this hasn't been a big deal, because
> most repos used "master" as the default. But these days it is likely to
> cause confusion if the server and client implementations choose
> different values (e.g., if the remote started with "main", we may choose
> "master" locally, create commits there, and then the user is surprised
> when they push to "master" and not "main").
> ...
> The client side will be updated to use this in a subsequent commit.

Nicely explained.

> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index 85daeb5d9e..4707511c10 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -192,11 +192,19 @@ ls-refs takes in the following arguments:
>  	When specified, only references having a prefix matching one of
>  	the provided prefixes are displayed.
>  
> +If the 'unborn' feature is advertised the following argument can be
> +included in the client's request.
> +
> +    unborn
> +	The server may send symrefs pointing to unborn branches in the form
> +	"unborn <refname> symref-target:<target>".
> +

I somehow had an impression that this is done only for HEAD and no
other symrefs.

If this describes the ideal endgame state and the implementation at
the moment only covers what is practically the most useful (i.e.
HEAD), that is fine.

If we do not plan to support symrefs other than HEAD that are
dangling, that is fine as well, but then the description needs
updating, no?

* Re: [PATCH v6 1/3] ls-refs: report unborn targets of symrefs
  2021-02-02 16:55     ` Junio C Hamano
@ 2021-02-02 18:34       ` Jonathan Tan
  2021-02-02 22:17         ` Junio C Hamano
  0 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2021-02-02 18:34 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git, peff

> > diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> > index 85daeb5d9e..4707511c10 100644
> > --- a/Documentation/technical/protocol-v2.txt
> > +++ b/Documentation/technical/protocol-v2.txt
> > @@ -192,11 +192,19 @@ ls-refs takes in the following arguments:
> >  	When specified, only references having a prefix matching one of
> >  	the provided prefixes are displayed.
> >  
> > +If the 'unborn' feature is advertised the following argument can be
> > +included in the client's request.
> > +
> > +    unborn
> > +	The server may send symrefs pointing to unborn branches in the form
> > +	"unborn <refname> symref-target:<target>".
> > +
> 
> I somehow had an impression that this is done only for HEAD and no
> other symrefs.

Right now, that's true.

> If this describes the ideal endgame state and the implementation at
> the moment only covers what is practically the most useful (i.e.
> HEAD), that is fine.
> 
> If we do not plan to support symrefs other than HEAD that are
> dangling, that is fine as well, but then the description needs
> updating, no?

I'm not sure what the ideal endgame state is, but I could see how
sending all symrefs would be useful (e.g. if a client wanted to mirror
another repo with more fidelity). Right now I don't plan on adding
support for dangling symrefs other than HEAD, though. Maybe I'll update
it to something like:

  If HEAD is a symref pointing to an unborn branch, the server may send
  it in the form "unborn HEAD symref-target:<target>". In the future,
  this may be extended to other symrefs as well.

I think that there is a discussion point to be decided
(advertise/allow/ignore vs allow/ignore), so I'll wait for that before
sending v7.

* Re: [PATCH v6 1/3] ls-refs: report unborn targets of symrefs
  2021-02-02 18:34       ` Jonathan Tan
@ 2021-02-02 22:17         ` Junio C Hamano
  2021-02-03  1:04           ` Jonathan Tan
  0 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2021-02-02 22:17 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff

Jonathan Tan <jonathantanmy@google.com> writes:

> I'm not sure what the ideal endgame state is, but I could see how
> sending all symrefs would be useful (e.g. if a client wanted to mirror
> another repo with more fidelity). Right now I don't plan on adding
> support for dangling symrefs other than HEAD, though. Maybe I'll update
> it to something like:
>
>   If HEAD is a symref pointing to an unborn branch, the server may send
>   it in the form "unborn HEAD symref-target:<target>". In the future,
>   this may be extended to other symrefs as well.

Unless you plan to add support for symbolic refs that are not HEAD
in immediate future, "In the future, ..." is not even necessary to
say.  The users cannot expect to exploit the missing feature anyway,
and they cannot even plan to use it in near future.

I've been disturbed by the phrase "the server may send it" quite a
lot, actually.  In the original before the rewrite above, it was a
good cop-out excuse "no, we do not send symbolic refs other than
HEAD because we only say 'the server may' and do not promise
anything beyond that".  But now that we are tightening the description
to HEAD, which we do intend to support well, it is probably a good
idea to give users a promise a bit firmer than that.

    unborn If HEAD is a symref pointing to an unborn branch <b>, the
    server reports it as "unborn HEAD symref-target:refs/heads/<b>"
    in its response.

It would make it clear that by sending 'unborn' in the request, the
client is not just allowing the server to include the unborn
information in the response.  It is asking the server, which has
advertised that it is capable of doing so, to exercise the feature.
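
As a rough illustration of what a client could do with such a response
line, here is a minimal sketch. It is not code from this series:
parse_unborn_head() and its parameters are hypothetical names, and it
assumes the line has exactly the "unborn HEAD symref-target:<target>"
shape described above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical client-side helper (not part of the patch series).
 * Given one line of ls-refs output, check whether it reports an unborn
 * HEAD and, if so, copy the symref target into `target`.
 * Returns 1 if the line is "unborn HEAD symref-target:<target>",
 * 0 otherwise (including when the target does not fit in the buffer).
 */
int parse_unborn_head(const char *line, char *target, size_t target_len)
{
	static const char prefix[] = "unborn HEAD symref-target:";
	size_t prefix_len = sizeof(prefix) - 1;
	size_t len;

	if (strncmp(line, prefix, prefix_len))
		return 0;
	line += prefix_len;

	/* The target ends at the next attribute separator or newline. */
	len = strcspn(line, " \n");
	if (!len || len >= target_len)
		return 0;

	memcpy(target, line, len);
	target[len] = '\0';
	return 1;
}
```

A caller would pass each received line together with a buffer and, on a
return value of 1, could use the copied target (for example
refs/heads/main) as the name of the local branch to create.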

> I think that there is a discussion point to be decided
> (advertise/allow/ignore vs allow/ignore), so I'll wait for that before
> sending v7.

What is the downside of having three choices (which allows phased
deployment, where everybody starts as capable of responding without
advertising in the first phase, and once everybody becomes capable
of responding, they start advertising) and the reason we might
prefer just allow/ignore instead?  Too much complexity?  It does not
help the real deployment as much in practice as it seems on paper?

I am not advocating three-choice option; I am neutral, but do not
see a good reason to reject it (while I can easily see a reason to
reject the other one).

Thanks.

* Re: [PATCH v6 1/3] ls-refs: report unborn targets of symrefs
  2021-02-02 22:17         ` Junio C Hamano
@ 2021-02-03  1:04           ` Jonathan Tan
  0 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-02-03  1:04 UTC (permalink / raw)
  To: gitster; +Cc: jonathantanmy, git, peff

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > I'm not sure what the ideal endgame state is, but I could see how
> > sending all symrefs would be useful (e.g. if a client wanted to mirror
> > another repo with more fidelity). Right now I don't plan on adding
> > support for dangling symrefs other than HEAD, though. Maybe I'll update
> > it to something like:
> >
> >   If HEAD is a symref pointing to an unborn branch, the server may send
> >   it in the form "unborn HEAD symref-target:<target>". In the future,
> >   this may be extended to other symrefs as well.
> 
> Unless you plan to add support for symbolic refs that are not HEAD
> in immediate future, "In the future, ..." is not even necessary to
> say.  The users cannot expect to exploit the missing feature anyway,
> and they cannot even plan to use it in near future.
> 
> I've been disturbed by the phrase "the server may send it" quite a
> lot, actually.  In the original before the rewrite above, it was a
> good cop-out excuse "no, we do not send symbolic refs other than
> HEAD because we only say 'the server may' and do not promise
> anything beyond that".  But now that we are tightening the description
> to HEAD, which we do intend to support well, it is probably a good
> idea to give users a promise a bit firmer than that.
> 
>     unborn If HEAD is a symref pointing to an unborn branch <b>, the
>     server reports it as "unborn HEAD symref-target:refs/heads/<b>"
>     in its response.
> 
> It would make it clear that by sending 'unborn' in the request, the
> client is not just allowing the server to include the unborn
> information in the response.  It is asking the server, which has
> advertised that it is capable of doing so, to exercise the feature.

That makes sense. OK, I'll make the promise firmer.

> > I think that there is a discussion point to be decided
> > (advertise/allow/ignore vs allow/ignore), so I'll wait for that before
> > sending v7.
> 
> What is the downside of having three choices (which allows phased
> deployment, where everybody starts as capable of responding without
> advertising in the first phase, and once everybody becomes capable
> of responding, they start advertising) and the reason we might
> prefer just allow/ignore instead?  Too much complexity?  It does not
> help the real deployment as much in practice as it seems on paper?
> 
> I am not advocating three-choice option; I am neutral, but do not
> see a good reason to reject it (while I can easily see a reason to
> reject the other one).

Here's a reason from Peff's email [1] against advertise/allow/ignore (the "code
change" is a temporary hack that teaches Git to accept but not advertise
report-status-v2). Granted, he does say that this may be an oversimplification,
and in the overall email, he was arguing more for having this feature on by
default (whether we have advertise/allow/ignore, allow/ignore, or no config at
all) rather than for any specific configuration scheme.

  - one nice thing about the code change is that after the rollout is
    done, it's safe to make the code unconditional again, which makes
    it simpler to read/reason about.

    This may be oversimplifying it a bit, of course. On one platform, we
    know when the rollout is happening. But if it's something we ship
    upstream, then "rollout" may be on the jump from v2.28 to v2.29, or
    to v2.30, or v2.31, etc. You can never say "rollouts are done, and
    existing server versions know about this feature". So any upstream
    support like config has to stay forever.

To balance that out, from the same email [1], a slight argument against no
config at all:

  (I know there was also an indication that some people might want it off
  because they somehow want to have no HEAD at all. I don't find this
  particularly compelling, but even if it were, I think we could leave
  the config as an escape hatch for such folks, but still default it to
  "on").

[1] https://lore.kernel.org/git/X9xJLWdFJfNJTn0p@coredump.intra.peff.net/

And an argument against the allow/ignore [2]:

  If we are not going to support config that helps you do an atomic
  deploy, then I don't really see the point of having config at all.
  Here are three plausible implementations I can conceive of:
  
    - allowUnborn is a tri-state for "accept-but-do-not-advertise",
      "accept-and-advertise", and "disallow". This helps with rollout in a
      cluster by setting it to the accept-but-do-not-advertise.  The
      default would be accept-and-advertise, which is what most servers
      would want. I don't really know why anyone would want "disallow".
  
    - allowUnborn is a bool for "accept-and-advertise" or "disallow". This
      doesn't help cluster rollout. I don't know why anyone would want to
      switch away from the default of accept-and-advertise.
  
    - allowUnborn is always on.
  
  The first one helps the cluster case, at the cost of introducing an
  extra config knob. The third one doesn't help that case, but is one less
  knob for server admins to think about. But the second one has a knob
  that I don't understand why anybody would tweak. It seems like the worst
  of both.

[2] https://lore.kernel.org/git/YBCitNb75rpnuW2L@coredump.intra.peff.net/
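
To make the first of the three variants above concrete, here is a
minimal sketch of how an advertise/allow/ignore tri-state could map onto
two flags. The struct and function names are hypothetical and chosen
only for illustration, and an unset value is assumed to select the
accept-and-advertise default.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pair of flags derived from a tri-state config value. */
struct unborn_policy {
	int advertise; /* list "unborn" in the capability advertisement */
	int allow;     /* honor "unborn" when a client sends it */
};

/*
 * Map a config string to flags. A NULL value (config unset) is assumed
 * to mean the default, accept-and-advertise.
 * Returns 0 on success, -1 if the value is not recognized.
 */
int parse_unborn_policy(const char *value, struct unborn_policy *out)
{
	out->advertise = 0;
	out->allow = 0;

	if (!value || !strcmp(value, "advertise")) {
		out->advertise = 1;
		out->allow = 1;
	} else if (!strcmp(value, "allow")) {
		/* accept-but-do-not-advertise: the rollout phase */
		out->allow = 1;
	} else if (strcmp(value, "ignore")) {
		return -1;
	}
	return 0;
}
```

With this split, a cluster could run every node in "allow" until all of
them understand the argument, and only then switch to "advertise".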

* Re: [PATCH v5 0/3] Cloning with remote unborn HEAD
  2021-02-02  2:22       ` Jonathan Tan
@ 2021-02-03 14:23         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 109+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-03 14:23 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git


On Tue, Feb 02 2021, Jonathan Tan wrote:

>> > I really don't care much, but this really needs a corresponding
>> > documentation update. I.e. something like:
>> > 
>> >     init.defaultBranch::
>> >         Allows overriding the default branch name e.g. when initializing a
>> >         new repository or when cloning an empty repository.
>> >     
>> >         When cloning a repository over protocol v2 (i.e. ssh://, https://,
>> >         file://, but not a /some/path), and if that repository has
>> >         init.defaultBranch configured, the server will advertise its
>> >         preferred default branch name, and we'll take its configuration over
>> >         ours.
>> 
>> Thanks - I'll use some of your wording, but I think it's best to leave
>> open the possibility that cloning using protocol v0 or the disk clone
>> (/some/path) copies over the current HEAD as well.
>
> Looking back on this, I think that it's natural to think that both an
> empty repository and a non-empty one have a HEAD that points somewhere,
> and "git clone" would behave the same way in both cases. So I'll hold
> off on the documentation change.

You mean for a v6 it'll do the same thing in the local clone case too
and thus we won't need to document the exception? Sounds good.

I was mainly pointing out the need to document the current divergent
behavior.

Documenting that something isn't consistent shouldn't be seen as a
blessing that the divergence is a good idea, it's an aid to our users so
they can understand why their git version does X when they might be
expecting Y.

* [PATCH v7 0/3] Cloning with remote unborn HEAD
  2020-12-08  1:31 Cloning empty repository uses locally configured default branch name Jonathan Tan
                   ` (4 preceding siblings ...)
  2021-02-02  2:14 ` [PATCH v6 " Jonathan Tan
@ 2021-02-05  4:58 ` Jonathan Tan
  2021-02-05  4:58   ` [PATCH v7 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
                     ` (3 more replies)
  2021-02-05 20:48 ` [PATCH v8 " Jonathan Tan
  6 siblings, 4 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-02-05  4:58 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

For what it's worth, here's v7 with advertise/allow/ignore and by
default, advertise. I think that some server operators will have use for
this feature, and people who want to disable it for whatever reason can
still do so. The main disadvantage is complexity - the server knob that
server administrators will need to control (but between a simpler
allow/ignore knob and a more complicated advertise/allow/ignore knob, I
think we might as well go for the slightly more complex one) and
complexity in the code (but now that is constrained to one function and
a few global variables).

As you can see from the range-diff, not much has changed from v6.

I've also included Junio's suggestion of tightening the promise made by
the server (when the client says "unborn").
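
For readers skimming the series, the server-side rule for reporting HEAD
can be summarized by the following sketch. It is a simplification for
illustration only, not the code in the patch: should_send_head() and its
parameters are made-up names, and ref-prefix filtering is left out.

```c
/*
 * Decide whether ls-refs reports HEAD (illustrative sketch).
 *
 *  head_is_unborn: HEAD does not resolve to an object
 *  head_is_symref: HEAD is a symbolic ref
 *  client_unborn:  the client sent the "unborn" argument
 *  client_symrefs: the client sent the "symrefs" argument
 *  allow_unborn:   lsrefs.unborn is "advertise" or "allow"
 */
int should_send_head(int head_is_unborn, int head_is_symref,
		     int client_unborn, int client_symrefs,
		     int allow_unborn)
{
	/* A HEAD that points at a real object is reported as before. */
	if (!head_is_unborn)
		return 1;

	/* An unborn HEAD is reported only when everything lines up. */
	return allow_unborn && client_unborn && client_symrefs &&
	       head_is_symref;
}
```

In particular, with lsrefs.unborn set to "ignore" the last case always
yields 0, so an old-style response is produced even if the client asks.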

Jonathan Tan (3):
  ls-refs: report unborn targets of symrefs
  connect, transport: encapsulate arg in struct
  clone: respect remote unborn HEAD

 Documentation/config.txt                |  2 +
 Documentation/config/init.txt           |  2 +-
 Documentation/config/lsrefs.txt         |  9 +++
 Documentation/technical/protocol-v2.txt | 11 +++-
 builtin/clone.c                         | 34 +++++++++---
 builtin/fetch-pack.c                    |  3 +-
 builtin/fetch.c                         | 18 +++---
 builtin/ls-remote.c                     |  9 +--
 connect.c                               | 32 ++++++++++-
 ls-refs.c                               | 74 ++++++++++++++++++++++++-
 ls-refs.h                               |  1 +
 remote.h                                |  4 +-
 serve.c                                 |  2 +-
 t/t5606-clone-options.sh                |  8 ++-
 t/t5701-git-serve.sh                    |  2 +-
 t/t5702-protocol-v2.sh                  | 25 +++++++++
 transport-helper.c                      |  5 +-
 transport-internal.h                    | 10 +---
 transport.c                             | 23 ++++----
 transport.h                             | 29 +++++++---
 20 files changed, 240 insertions(+), 63 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt

Range-diff against v6:
1:  411bbafe25 ! 1:  2d35075369 ls-refs: report unborn targets of symrefs
    @@ Commit message
         Currently, symrefs that have unborn targets (such as in this case) are
         not communicated by the protocol. Teach Git to advertise and support the
         "unborn" feature in "ls-refs" (by default, this is advertised, but
    -    server administrators may turn this off through the lsrefs.allowunborn
    +    server administrators may turn this off through the lsrefs.unborn
         config). This feature indicates that "ls-refs" supports the "unborn"
         argument; when it is specified, "ls-refs" will send the HEAD symref with
         the name of its unborn target.
    @@ Documentation/config.txt: include::config/interactive.txt[]
     
      ## Documentation/config/lsrefs.txt (new) ##
     @@
    -+lsrefs.allowUnborn::
    -+	Allow the server to send information about unborn symrefs during the
    -+	protocol v2 ref advertisement.
    ++lsrefs.unborn::
    ++	May be "advertise" (the default), "allow", or "ignore". If "advertise",
    ++	the server will respond to the client sending "unborn" (as described in
    ++	protocol-v2.txt) and will advertise support for this feature during the
    ++	protocol v2 capability advertisement. "allow" is the same as
    ++	"advertise" except that the server will not advertise support for this
    ++	feature; this is useful for load-balanced servers that cannot be
    ++	updated atomically (for example), since the administrator could
    ++	configure "allow", then after a delay, configure "advertise".
     
      ## Documentation/technical/protocol-v2.txt ##
     @@ Documentation/technical/protocol-v2.txt: ls-refs takes in the following arguments:
    @@ Documentation/technical/protocol-v2.txt: ls-refs takes in the following argument
     +included in the client's request.
     +
     +    unborn
    -+	The server may send symrefs pointing to unborn branches in the form
    -+	"unborn <refname> symref-target:<target>".
    ++	The server will send information about HEAD even if it is a symref
    ++	pointing to an unborn branch in the form "unborn HEAD
    ++	symref-target:<target>".
     +
      The output of ls-refs is as follows:
      
    @@ ls-refs.c
      #include "config.h"
      
     +static int config_read;
    ++static int advertise_unborn;
     +static int allow_unborn;
     +
     +static void ensure_config_read(void)
     +{
    ++	char *str = NULL;
    ++
     +	if (config_read)
     +		return;
     +
    -+	if (repo_config_get_bool(the_repository, "lsrefs.allowunborn",
    -+				 &allow_unborn))
    ++	if (repo_config_get_string(the_repository, "lsrefs.unborn", &str)) {
     +		/*
    -+		 * If there is no such config, set it to 1 to allow it by
    ++		 * If there is no such config, advertise and allow it by
     +		 * default.
     +		 */
    ++		advertise_unborn = 1;
     +		allow_unborn = 1;
    ++	} else {
    ++		if (!strcmp(str, "advertise")) {
    ++			advertise_unborn = 1;
    ++			allow_unborn = 1;
    ++		} else if (!strcmp(str, "allow")) {
    ++			allow_unborn = 1;
    ++		} else if (!strcmp(str, "ignore")) {
    ++			/* do nothing */
    ++		} else {
    ++			die(_("invalid value '%s' for lsrefs.unborn"), str);
    ++		}
    ++	}
     +	config_read = 1;
     +}
     +
    @@ ls-refs.c: int ls_refs(struct repository *r, struct strvec *keys,
     +{
     +	if (value) {
     +		ensure_config_read();
    -+		if (allow_unborn)
    ++		if (advertise_unborn)
     +			strbuf_addstr(value, "unborn");
     +	}
     +
2:  fad1ebe6b6 = 2:  d4ed13d02e connect, transport: encapsulate arg in struct
3:  45a48ccc0d ! 3:  a3e5a0a7c5 clone: respect remote unborn HEAD
    @@ t/t5606-clone-options.sh: test_expect_success 'redirected clone -v does show pro
     -	git init --bare empty &&
     +	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
     +	git -c init.defaultBranch=foo init --bare empty &&
    -+	test_config -C empty lsrefs.allowUnborn true &&
    ++	test_config -C empty lsrefs.unborn advertise &&
      	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
      	git -c init.defaultBranch=up clone empty whats-up &&
     -	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
    @@ t/t5702-protocol-v2.sh: test_expect_success 'clone with file:// using protocol v
     +
     +	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
     +	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
    -+	test_config -C file_empty_parent lsrefs.allowUnborn false &&
    ++	test_config -C file_empty_parent lsrefs.unborn ignore &&
     +
     +	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
     +	git -c init.defaultBranch=main -c protocol.version=2 \
-- 
2.30.0.365.g02bc693789-goog


* [PATCH v7 1/3] ls-refs: report unborn targets of symrefs
  2021-02-05  4:58 ` [PATCH v7 0/3] Cloning with " Jonathan Tan
@ 2021-02-05  4:58   ` Jonathan Tan
  2021-02-05 16:10     ` Jeff King
  2021-02-05  4:58   ` [PATCH v7 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 109+ messages in thread
From: Jonathan Tan @ 2021-02-05  4:58 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

When cloning, we choose the default branch based on the remote HEAD.
But if there is no remote HEAD reported (which could happen if the
target of the remote HEAD is unborn), we'll fall back to using our local
init.defaultBranch. Traditionally this hasn't been a big deal, because
most repos used "master" as the default. But these days it is likely to
cause confusion if the server and client implementations choose
different values (e.g., if the remote started with "main", we may choose
"master" locally, create commits there, and then the user is surprised
when they push to "master" and not "main").

To solve this, the remote needs to communicate the target of the HEAD
symref, even if it is unborn, and "git clone" needs to use this
information.

Currently, symrefs that have unborn targets (such as in this case) are
not communicated by the protocol. Teach Git to advertise and support the
"unborn" feature in "ls-refs" (by default, this is advertised, but
server administrators may turn this off through the lsrefs.unborn
config). This feature indicates that "ls-refs" supports the "unborn"
argument; when it is specified, "ls-refs" will send the HEAD symref with
the name of its unborn target.

This change is only for protocol v2. A similar change for protocol v0
would require independent protocol design (there being no analogous
position to signal support for "unborn") and client-side plumbing of the
data required, so the scope of this patch set is limited to protocol v2.

The client side will be updated to use this in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt                |  2 +
 Documentation/config/lsrefs.txt         |  9 +++
 Documentation/technical/protocol-v2.txt | 11 +++-
 ls-refs.c                               | 74 ++++++++++++++++++++++++-
 ls-refs.h                               |  1 +
 serve.c                                 |  2 +-
 t/t5701-git-serve.sh                    |  2 +-
 7 files changed, 95 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6ba50b1104..d08e83a148 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -398,6 +398,8 @@ include::config/interactive.txt[]
 
 include::config/log.txt[]
 
+include::config/lsrefs.txt[]
+
 include::config/mailinfo.txt[]
 
 include::config/mailmap.txt[]
diff --git a/Documentation/config/lsrefs.txt b/Documentation/config/lsrefs.txt
new file mode 100644
index 0000000000..e003856c08
--- /dev/null
+++ b/Documentation/config/lsrefs.txt
@@ -0,0 +1,9 @@
+lsrefs.unborn::
+	May be "advertise" (the default), "allow", or "ignore". If "advertise",
+	the server will respond to the client sending "unborn" (as described in
+	protocol-v2.txt) and will advertise support for this feature during the
+	protocol v2 capability advertisement. "allow" is the same as
+	"advertise" except that the server will not advertise support for this
+	feature; this is useful for load-balanced servers that cannot be
+	updated atomically (for example), since the administrator could
+	configure "allow", then after a delay, configure "advertise".
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 85daeb5d9e..f772d90eaf 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -192,11 +192,20 @@ ls-refs takes in the following arguments:
 	When specified, only references having a prefix matching one of
 	the provided prefixes are displayed.
 
+If the 'unborn' feature is advertised the following argument can be
+included in the client's request.
+
+    unborn
+	The server will send information about HEAD even if it is a symref
+	pointing to an unborn branch in the form "unborn HEAD
+	symref-target:<target>".
+
 The output of ls-refs is as follows:
 
     output = *ref
 	     flush-pkt
-    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+    obj-id-or-unborn = (obj-id | "unborn")
+    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
     ref-attribute = (symref | peeled)
     symref = "symref-target:" symref-target
     peeled = "peeled:" obj-id
diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..e08fd43e7a 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -7,6 +7,39 @@
 #include "pkt-line.h"
 #include "config.h"
 
+static int config_read;
+static int advertise_unborn;
+static int allow_unborn;
+
+static void ensure_config_read(void)
+{
+	char *str = NULL;
+
+	if (config_read)
+		return;
+
+	if (repo_config_get_string(the_repository, "lsrefs.unborn", &str)) {
+		/*
+		 * If there is no such config, advertise and allow it by
+		 * default.
+		 */
+		advertise_unborn = 1;
+		allow_unborn = 1;
+	} else {
+		if (!strcmp(str, "advertise")) {
+			advertise_unborn = 1;
+			allow_unborn = 1;
+		} else if (!strcmp(str, "allow")) {
+			allow_unborn = 1;
+		} else if (!strcmp(str, "ignore")) {
+			/* do nothing */
+		} else {
+			die(_("invalid value '%s' for lsrefs.unborn"), str);
+		}
+	}
+	config_read = 1;
+}
+
 /*
  * Check if one of the prefixes is a prefix of the ref.
  * If no prefixes were provided, all refs match.
@@ -32,6 +65,7 @@ struct ls_refs_data {
 	unsigned peel;
 	unsigned symrefs;
 	struct strvec prefixes;
+	unsigned unborn : 1;
 };
 
 static int send_ref(const char *refname, const struct object_id *oid,
@@ -47,7 +81,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	if (!ref_match(&data->prefixes, refname_nons))
 		return 0;
 
-	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (oid)
+		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	else
+		strbuf_addf(&refline, "unborn %s", refname_nons);
 	if (data->symrefs && flag & REF_ISSYMREF) {
 		struct object_id unused;
 		const char *symref_target = resolve_ref_unsafe(refname, 0,
@@ -61,7 +98,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 			    strip_namespace(symref_target));
 	}
 
-	if (data->peel) {
+	if (data->peel && oid) {
 		struct object_id peeled;
 		if (!peel_ref(refname, &peeled))
 			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));
@@ -74,6 +111,23 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
+static void send_possibly_unborn_head(struct ls_refs_data *data)
+{
+	struct strbuf namespaced = STRBUF_INIT;
+	struct object_id oid;
+	int flag;
+	int oid_is_null;
+
+	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
+	if (!resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag))
+		return; /* bad ref */
+	oid_is_null = is_null_oid(&oid);
+	if (!oid_is_null ||
+	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
+		send_ref(namespaced.buf, oid_is_null ? NULL : &oid, flag, data);
+	strbuf_release(&namespaced);
+}
+
 static int ls_refs_config(const char *var, const char *value, void *data)
 {
 	/*
@@ -91,6 +145,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
 
 	memset(&data, 0, sizeof(data));
 
+	ensure_config_read();
 	git_config(ls_refs_config, NULL);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
@@ -103,14 +158,27 @@ int ls_refs(struct repository *r, struct strvec *keys,
 			data.symrefs = 1;
 		else if (skip_prefix(arg, "ref-prefix ", &out))
 			strvec_push(&data.prefixes, out);
+		else if (!strcmp("unborn", arg))
+			data.unborn = allow_unborn;
 	}
 
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ls-refs arguments"));
 
-	head_ref_namespaced(send_ref, &data);
+	send_possibly_unborn_head(&data);
 	for_each_namespaced_ref(send_ref, &data);
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
 	return 0;
 }
+
+int ls_refs_advertise(struct repository *r, struct strbuf *value)
+{
+	if (value) {
+		ensure_config_read();
+		if (advertise_unborn)
+			strbuf_addstr(value, "unborn");
+	}
+
+	return 1;
+}
diff --git a/ls-refs.h b/ls-refs.h
index 7b33a7c6b8..a99e4be0bd 100644
--- a/ls-refs.h
+++ b/ls-refs.h
@@ -6,5 +6,6 @@ struct strvec;
 struct packet_reader;
 int ls_refs(struct repository *r, struct strvec *keys,
 	    struct packet_reader *request);
+int ls_refs_advertise(struct repository *r, struct strbuf *value);
 
 #endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index eec2fe6f29..ac20c72763 100644
--- a/serve.c
+++ b/serve.c
@@ -73,7 +73,7 @@ struct protocol_capability {
 
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
-	{ "ls-refs", always_advertise, ls_refs },
+	{ "ls-refs", ls_refs_advertise, ls_refs },
 	{ "fetch", upload_pack_advertise, upload_pack_v2 },
 	{ "server-option", always_advertise, NULL },
 	{ "object-format", object_format_advertise, NULL },
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index a1f5fdc9fd..df29504161 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -12,7 +12,7 @@ test_expect_success 'test capability advertisement' '
 	cat >expect <<-EOF &&
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
-	ls-refs
+	ls-refs=unborn
 	fetch=shallow
 	server-option
 	object-format=$(test_oid algo)
-- 
2.30.0.365.g02bc693789-goog
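
The ls-refs.c hunks in patch 1/3 can be summarized in a small standalone sketch: when HEAD is sent, and what the resulting line looks like. This is not the code from the patch. should_send_head() and format_ref_line() are hypothetical stand-ins that use plain C strings instead of strbuf and struct object_id, and they assume the caller has already resolved HEAD.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical mirror of the condition in send_possibly_unborn_head():
 * a born HEAD is always sent; an unborn HEAD is sent only when the
 * client asked for both "unborn" and "symrefs" and HEAD is a symref.
 */
static int should_send_head(int oid_is_null, int unborn, int symrefs,
			    int is_symref)
{
	return !oid_is_null || (unborn && symrefs && is_symref);
}

/*
 * Hypothetical stand-in for the line formatting in send_ref().
 * oid_hex is NULL for an unborn ref; symref_target is NULL when no
 * symref information should be attached.
 * Returns the length written, or -1 if buf is too small.
 */
static int format_ref_line(char *buf, size_t len, const char *oid_hex,
			   const char *refname, const char *symref_target)
{
	const char *first = oid_hex ? oid_hex : "unborn";
	int n;

	assert(buf && refname);
	if (symref_target)
		n = snprintf(buf, len, "%s %s symref-target:%s",
			     first, refname, symref_target);
	else
		n = snprintf(buf, len, "%s %s", first, refname);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

With an unborn HEAD pointing at refs/heads/main, the formatted line is "unborn HEAD symref-target:refs/heads/main", which matches the form described in the protocol-v2.txt hunk of the same patch.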



* [PATCH v7 2/3] connect, transport: encapsulate arg in struct
  2021-02-05  4:58 ` [PATCH v7 0/3] Cloning with " Jonathan Tan
  2021-02-05  4:58   ` [PATCH v7 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
@ 2021-02-05  4:58   ` Jonathan Tan
  2021-02-05  4:58   ` [PATCH v7 3/3] clone: respect remote unborn HEAD Jonathan Tan
  2021-02-05  5:25   ` [PATCH v7 0/3] Cloning with " Junio C Hamano
  3 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-02-05  4:58 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

A future patch will return the name of an unborn current branch from
deep in the call chain to the callers of get_remote_refs() and
transport_get_remote_refs(), via a new pointer parameter that points at
a variable in the caller.

In preparation for that, encapsulate the existing ref_prefixes
parameter into a struct. The aforementioned unborn current branch will
go into this new struct in the future patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/clone.c      | 18 +++++++++++-------
 builtin/fetch-pack.c |  3 ++-
 builtin/fetch.c      | 18 +++++++++++-------
 builtin/ls-remote.c  |  9 +++++----
 connect.c            |  4 +++-
 remote.h             |  4 +++-
 transport-helper.c   |  5 +++--
 transport-internal.h | 10 ++--------
 transport.c          | 23 ++++++++++++-----------
 transport.h          | 21 ++++++++++++++-------
 10 files changed, 66 insertions(+), 49 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index a5630337e4..211d4f54b0 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -979,7 +979,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	int err = 0, complete_refs_before_fetch = 1;
 	int submodule_progress;
 
-	struct strvec ref_prefixes = STRVEC_INIT;
+	struct transport_ls_refs_options transport_ls_refs_options =
+		TRANSPORT_LS_REFS_OPTIONS_INIT;
 
 	packet_trace_identity("clone");
 
@@ -1257,14 +1258,17 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		transport->smart_options->check_self_contained_and_connected = 1;
 
 
-	strvec_push(&ref_prefixes, "HEAD");
-	refspec_ref_prefixes(&remote->fetch, &ref_prefixes);
+	strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
+	refspec_ref_prefixes(&remote->fetch,
+			     &transport_ls_refs_options.ref_prefixes);
 	if (option_branch)
-		expand_ref_prefix(&ref_prefixes, option_branch);
+		expand_ref_prefix(&transport_ls_refs_options.ref_prefixes,
+				  option_branch);
 	if (!option_no_tags)
-		strvec_push(&ref_prefixes, "refs/tags/");
+		strvec_push(&transport_ls_refs_options.ref_prefixes,
+			    "refs/tags/");
 
-	refs = transport_get_remote_refs(transport, &ref_prefixes);
+	refs = transport_get_remote_refs(transport, &transport_ls_refs_options);
 
 	if (refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
@@ -1380,6 +1384,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	strbuf_release(&key);
 	junk_mode = JUNK_LEAVE_ALL;
 
-	strvec_clear(&ref_prefixes);
+	strvec_clear(&transport_ls_refs_options.ref_prefixes);
 	return err;
 }
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 58b7c1fbdc..c2d96f4c89 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	version = discover_version(&reader);
 	switch (version) {
 	case protocol_v2:
-		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
+		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
+				args.stateless_rpc);
 		break;
 	case protocol_v1:
 	case protocol_v0:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index ecf8537605..837382ef4f 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1349,7 +1349,8 @@ static int do_fetch(struct transport *transport,
 	int autotags = (transport->remote->fetch_tags == 1);
 	int retcode = 0;
 	const struct ref *remote_refs;
-	struct strvec ref_prefixes = STRVEC_INIT;
+	struct transport_ls_refs_options transport_ls_refs_options =
+		TRANSPORT_LS_REFS_OPTIONS_INIT;
 	int must_list_refs = 1;
 
 	if (tags == TAGS_DEFAULT) {
@@ -1369,7 +1370,7 @@ static int do_fetch(struct transport *transport,
 	if (rs->nr) {
 		int i;
 
-		refspec_ref_prefixes(rs, &ref_prefixes);
+		refspec_ref_prefixes(rs, &transport_ls_refs_options.ref_prefixes);
 
 		/*
 		 * We can avoid listing refs if all of them are exact
@@ -1383,22 +1384,25 @@ static int do_fetch(struct transport *transport,
 			}
 		}
 	} else if (transport->remote && transport->remote->fetch.nr)
-		refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
+		refspec_ref_prefixes(&transport->remote->fetch,
+				     &transport_ls_refs_options.ref_prefixes);
 
 	if (tags == TAGS_SET || tags == TAGS_DEFAULT) {
 		must_list_refs = 1;
-		if (ref_prefixes.nr)
-			strvec_push(&ref_prefixes, "refs/tags/");
+		if (transport_ls_refs_options.ref_prefixes.nr)
+			strvec_push(&transport_ls_refs_options.ref_prefixes,
+				    "refs/tags/");
 	}
 
 	if (must_list_refs) {
 		trace2_region_enter("fetch", "remote_refs", the_repository);
-		remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
+		remote_refs = transport_get_remote_refs(transport,
+							&transport_ls_refs_options);
 		trace2_region_leave("fetch", "remote_refs", the_repository);
 	} else
 		remote_refs = NULL;
 
-	strvec_clear(&ref_prefixes);
+	strvec_clear(&transport_ls_refs_options.ref_prefixes);
 
 	ref_map = get_ref_map(transport->remote, remote_refs, rs,
 			      tags, &autotags);
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index 092917eca2..ef604752a0 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -45,7 +45,8 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	int show_symref_target = 0;
 	const char *uploadpack = NULL;
 	const char **pattern = NULL;
-	struct strvec ref_prefixes = STRVEC_INIT;
+	struct transport_ls_refs_options transport_options =
+		TRANSPORT_LS_REFS_OPTIONS_INIT;
 	int i;
 	struct string_list server_options = STRING_LIST_INIT_DUP;
 
@@ -94,9 +95,9 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	}
 
 	if (flags & REF_TAGS)
-		strvec_push(&ref_prefixes, "refs/tags/");
+		strvec_push(&transport_options.ref_prefixes, "refs/tags/");
 	if (flags & REF_HEADS)
-		strvec_push(&ref_prefixes, "refs/heads/");
+		strvec_push(&transport_options.ref_prefixes, "refs/heads/");
 
 	remote = remote_get(dest);
 	if (!remote) {
@@ -118,7 +119,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (server_options.nr)
 		transport->server_options = &server_options;
 
-	ref = transport_get_remote_refs(transport, &ref_prefixes);
+	ref = transport_get_remote_refs(transport, &transport_options);
 	if (ref) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 		repo_set_hash_algo(the_repository, hash_algo);
diff --git a/connect.c b/connect.c
index 8b8f56cf6d..328c279250 100644
--- a/connect.c
+++ b/connect.c
@@ -453,12 +453,14 @@ void check_stateless_delimiter(int stateless_rpc,
 
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
-			     const struct strvec *ref_prefixes,
+			     struct transport_ls_refs_options *transport_options,
 			     const struct string_list *server_options,
 			     int stateless_rpc)
 {
 	int i;
 	const char *hash_name;
+	struct strvec *ref_prefixes = transport_options ?
+		&transport_options->ref_prefixes : NULL;
 	*list = NULL;
 
 	if (server_supports_v2("ls-refs", 1))
diff --git a/remote.h b/remote.h
index 3211abdf05..4ae676a11b 100644
--- a/remote.h
+++ b/remote.h
@@ -6,6 +6,8 @@
 #include "hashmap.h"
 #include "refspec.h"
 
+struct transport_ls_refs_options;
+
 /**
  * The API gives access to the configuration related to remotes. It handles
  * all three configuration mechanisms historically and currently used by Git,
@@ -196,7 +198,7 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 /* Used for protocol v2 in order to retrieve refs from a remote */
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
-			     const struct strvec *ref_prefixes,
+			     struct transport_ls_refs_options *transport_options,
 			     const struct string_list *server_options,
 			     int stateless_rpc);
 
diff --git a/transport-helper.c b/transport-helper.c
index 5f6e0b3bd8..49b7fb4dcb 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1162,13 +1162,14 @@ static int has_attribute(const char *attrs, const char *attr)
 }
 
 static struct ref *get_refs_list(struct transport *transport, int for_push,
-				 const struct strvec *ref_prefixes)
+				 struct transport_ls_refs_options *transport_options)
 {
 	get_helper(transport);
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push, ref_prefixes);
+		return transport->vtable->get_refs_list(transport, for_push,
+							transport_options);
 	}
 
 	return get_refs_list_using_list(transport, for_push);
diff --git a/transport-internal.h b/transport-internal.h
index 27c9daffc4..b60f1ba907 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -4,6 +4,7 @@
 struct ref;
 struct transport;
 struct strvec;
+struct transport_ls_refs_options;
 
 struct transport_vtable {
 	/**
@@ -18,19 +19,12 @@ struct transport_vtable {
 	 * the transport to try to share connections, for_push is a
 	 * hint as to whether the ultimate operation is a push or a fetch.
 	 *
-	 * If communicating using protocol v2 a list of prefixes can be
-	 * provided to be sent to the server to enable it to limit the ref
-	 * advertisement.  Since ref filtering is done on the server's end, and
-	 * only when using protocol v2, this list will be ignored when not
-	 * using protocol v2 meaning this function can return refs which don't
-	 * match the provided ref_prefixes.
-	 *
 	 * If the transport is able to determine the remote hash for
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
 	 **/
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
-				     const struct strvec *ref_prefixes);
+				     struct transport_ls_refs_options *transport_options);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index 679a35e7c1..b13fab5dc3 100644
--- a/transport.c
+++ b/transport.c
@@ -127,7 +127,7 @@ struct bundle_transport_data {
 
 static struct ref *get_refs_from_bundle(struct transport *transport,
 					int for_push,
-					const struct strvec *ref_prefixes)
+					struct transport_ls_refs_options *transport_options)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -280,7 +280,7 @@ static void die_if_server_options(struct transport *transport)
  * remote refs.
  */
 static struct ref *handshake(struct transport *transport, int for_push,
-			     const struct strvec *ref_prefixes,
+			     struct transport_ls_refs_options *options,
 			     int must_list_refs)
 {
 	struct git_transport_data *data = transport->data;
@@ -303,7 +303,7 @@ static struct ref *handshake(struct transport *transport, int for_push,
 			trace2_data_string("transfer", NULL, "server-sid", server_sid);
 		if (must_list_refs)
 			get_remote_refs(data->fd[1], &reader, &refs, for_push,
-					ref_prefixes,
+					options,
 					transport->server_options,
 					transport->stateless_rpc);
 		break;
@@ -334,9 +334,9 @@ static struct ref *handshake(struct transport *transport, int for_push,
 }
 
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
-					const struct strvec *ref_prefixes)
+					struct transport_ls_refs_options *options)
 {
-	return handshake(transport, for_push, ref_prefixes, 1);
+	return handshake(transport, for_push, options, 1);
 }
 
 static int fetch_refs_via_pack(struct transport *transport,
@@ -1252,19 +1252,20 @@ int transport_push(struct repository *r,
 		int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
 		int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
 		int push_ret, ret, err;
-		struct strvec ref_prefixes = STRVEC_INIT;
+		struct transport_ls_refs_options transport_options =
+			TRANSPORT_LS_REFS_OPTIONS_INIT;
 
 		if (check_push_refs(local_refs, rs) < 0)
 			return -1;
 
-		refspec_ref_prefixes(rs, &ref_prefixes);
+		refspec_ref_prefixes(rs, &transport_options.ref_prefixes);
 
 		trace2_region_enter("transport_push", "get_refs_list", r);
 		remote_refs = transport->vtable->get_refs_list(transport, 1,
-							       &ref_prefixes);
+							       &transport_options);
 		trace2_region_leave("transport_push", "get_refs_list", r);
 
-		strvec_clear(&ref_prefixes);
+		strvec_clear(&transport_options.ref_prefixes);
 
 		if (flags & TRANSPORT_PUSH_ALL)
 			match_flags |= MATCH_REFS_ALL;
@@ -1380,12 +1381,12 @@ int transport_push(struct repository *r,
 }
 
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes)
+					    struct transport_ls_refs_options *transport_options)
 {
 	if (!transport->got_remote_refs) {
 		transport->remote_refs =
 			transport->vtable->get_refs_list(transport, 0,
-							 ref_prefixes);
+							 transport_options);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 24558c027d..1f5b60e4d3 100644
--- a/transport.h
+++ b/transport.h
@@ -233,17 +233,24 @@ int transport_push(struct repository *repo,
 		   struct refspec *rs, int flags,
 		   unsigned int * reject_reasons);
 
+struct transport_ls_refs_options {
+	/*
+	 * Optionally, a list of ref prefixes can be provided which can be sent
+	 * to the server (when communicating using protocol v2) to enable it to
+	 * limit the ref advertisement.  Since ref filtering is done on the
+	 * server's end (and only when using protocol v2),
+	 * transport_get_remote_refs() could return refs which don't match the
+	 * provided ref_prefixes.
+	 */
+	struct strvec ref_prefixes;
+};
+#define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }
+
 /*
  * Retrieve refs from a remote.
- *
- * Optionally a list of ref prefixes can be provided which can be sent to the
- * server (when communicating using protocol v2) to enable it to limit the ref
- * advertisement.  Since ref filtering is done on the server's end (and only
- * when using protocol v2), this can return refs which don't match the provided
- * ref_prefixes.
  */
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes);
+					    struct transport_ls_refs_options *transport_options);
 
 /*
  * Fetch the hash algorithm used by a remote.
-- 
2.30.0.365.g02bc693789-goog
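
The refactoring in patch 2/3 follows a common C pattern: replace a single parameter with a pointer to an options struct, so that later fields can be added without touching every call site again. Below is a minimal sketch of that pattern. The names prefix_list, ls_refs_options and count_requested_prefixes() are hypothetical and stand in for strvec, transport_ls_refs_options and the real callees.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct strvec. */
struct prefix_list {
	const char **v;
	size_t nr;
};
#define PREFIX_LIST_INIT { NULL, 0 }

/* Options struct: new inputs or outputs can be appended here later. */
struct ls_refs_options {
	struct prefix_list ref_prefixes;
};
#define LS_REFS_OPTIONS_INIT { PREFIX_LIST_INIT }

/*
 * Callee that accepts the options struct. As in get_remote_refs(),
 * a NULL options pointer is treated as "no prefixes requested".
 */
static size_t count_requested_prefixes(const struct ls_refs_options *options)
{
	const struct prefix_list *prefixes =
		options ? &options->ref_prefixes : NULL;

	/* A non-empty list must have backing storage. */
	assert(!prefixes || prefixes->v || prefixes->nr == 0);
	return prefixes ? prefixes->nr : 0;
}
```

Callers initialize the struct with the INIT macro and pass its address. Because C zero-initializes members that the initializer omits, a field added later starts out as 0 or NULL without any change to the macro.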



* [PATCH v7 3/3] clone: respect remote unborn HEAD
  2021-02-05  4:58 ` [PATCH v7 0/3] Cloning with " Jonathan Tan
  2021-02-05  4:58   ` [PATCH v7 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
  2021-02-05  4:58   ` [PATCH v7 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
@ 2021-02-05  4:58   ` Jonathan Tan
  2021-02-05  5:25   ` [PATCH v7 0/3] Cloning with " Junio C Hamano
  3 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-02-05  4:58 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster

Teach Git to use the "unborn" feature introduced in a previous patch as
follows: Git will always send the "unborn" argument if it is supported
by the server. During "git clone", if cloning an empty repository, Git
will use the new information to determine the local branch to create. In
all other cases, Git will ignore it.
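
As a rough illustration of "if it is supported by the server": the client only sends "unborn" after it has seen that feature in the value of the ls-refs capability (for example "ls-refs=unborn"). The helper below is a hypothetical standalone sketch, not the server_supports_feature() used in connect.c, and it assumes that features in a capability value are separated by spaces.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if the advertisement line cap_line (e.g. "ls-refs=unborn")
 * names capability cap and lists feature among its space-separated
 * values, 0 otherwise. Hypothetical helper for illustration only.
 */
static int capability_has_feature(const char *cap_line, const char *cap,
				  const char *feature)
{
	size_t cap_len, feat_len;
	const char *p;

	assert(cap_line && cap && feature);
	cap_len = strlen(cap);
	feat_len = strlen(feature);
	if (!feat_len)
		return 0;
	/* The line must be exactly "<cap>=<values>". */
	if (strncmp(cap_line, cap, cap_len) || cap_line[cap_len] != '=')
		return 0;

	p = cap_line + cap_len + 1;
	while (*p) {
		size_t len = strcspn(p, " ");

		if (len == feat_len && !memcmp(p, feature, len))
			return 1;
		p += len;
		while (*p == ' ')
			p++;
	}
	return 0;
}
```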

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/init.txt |  2 +-
 builtin/clone.c               | 16 ++++++++++++++--
 connect.c                     | 28 ++++++++++++++++++++++++++--
 t/t5606-clone-options.sh      |  8 +++++---
 t/t5702-protocol-v2.sh        | 25 +++++++++++++++++++++++++
 transport.h                   |  8 ++++++++
 6 files changed, 79 insertions(+), 8 deletions(-)

diff --git a/Documentation/config/init.txt b/Documentation/config/init.txt
index dc77f8c844..79c79d6617 100644
--- a/Documentation/config/init.txt
+++ b/Documentation/config/init.txt
@@ -4,4 +4,4 @@ init.templateDir::
 
 init.defaultBranch::
 	Allows overriding the default branch name e.g. when initializing
-	a new repository or when cloning an empty repository.
+	a new repository.
diff --git a/builtin/clone.c b/builtin/clone.c
index 211d4f54b0..09dcd97a2e 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1330,8 +1330,19 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		remote_head = NULL;
 		option_no_checkout = 1;
 		if (!option_bare) {
-			const char *branch = git_default_branch_name();
-			char *ref = xstrfmt("refs/heads/%s", branch);
+			const char *branch;
+			char *ref;
+
+			if (transport_ls_refs_options.unborn_head_target &&
+			    skip_prefix(transport_ls_refs_options.unborn_head_target,
+					"refs/heads/", &branch)) {
+				ref = transport_ls_refs_options.unborn_head_target;
+				transport_ls_refs_options.unborn_head_target = NULL;
+				create_symref("HEAD", ref, reflog_msg.buf);
+			} else {
+				branch = git_default_branch_name();
+				ref = xstrfmt("refs/heads/%s", branch);
+			}
 
 			install_branch_config(0, branch, remote_name, ref);
 			free(ref);
@@ -1385,5 +1396,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	junk_mode = JUNK_LEAVE_ALL;
 
 	strvec_clear(&transport_ls_refs_options.ref_prefixes);
+	free(transport_ls_refs_options.unborn_head_target);
 	return err;
 }
diff --git a/connect.c b/connect.c
index 328c279250..879669df93 100644
--- a/connect.c
+++ b/connect.c
@@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 }
 
 /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
-static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
+static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
+			  char **unborn_head_target)
 {
 	int ret = 1;
 	int i = 0;
@@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
 		goto out;
 	}
 
+	if (!strcmp("unborn", line_sections.items[i].string)) {
+		i++;
+		if (unborn_head_target &&
+		    !strcmp("HEAD", line_sections.items[i++].string)) {
+			/*
+			 * Look for the symref target (if any). If found,
+			 * return it to the caller.
+			 */
+			for (; i < line_sections.nr; i++) {
+				const char *arg = line_sections.items[i].string;
+
+				if (skip_prefix(arg, "symref-target:", &arg)) {
+					*unborn_head_target = xstrdup(arg);
+					break;
+				}
+			}
+		}
+		goto out;
+	}
 	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
 	    *end) {
 		ret = 0;
@@ -461,6 +481,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	const char *hash_name;
 	struct strvec *ref_prefixes = transport_options ?
 		&transport_options->ref_prefixes : NULL;
+	char **unborn_head_target = transport_options ?
+		&transport_options->unborn_head_target : NULL;
 	*list = NULL;
 
 	if (server_supports_v2("ls-refs", 1))
@@ -490,6 +512,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	if (!for_push)
 		packet_write_fmt(fd_out, "peel\n");
 	packet_write_fmt(fd_out, "symrefs\n");
+	if (server_supports_feature("ls-refs", "unborn", 0))
+		packet_write_fmt(fd_out, "unborn\n");
 	for (i = 0; ref_prefixes && i < ref_prefixes->nr; i++) {
 		packet_write_fmt(fd_out, "ref-prefix %s\n",
 				 ref_prefixes->v[i]);
@@ -498,7 +522,7 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
-		if (!process_ref_v2(reader, &list))
+		if (!process_ref_v2(reader, &list, unborn_head_target))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
 
diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
index 7f082fb23b..ca6339a5fb 100755
--- a/t/t5606-clone-options.sh
+++ b/t/t5606-clone-options.sh
@@ -102,11 +102,13 @@ test_expect_success 'redirected clone -v does show progress' '
 '
 
 test_expect_success 'chooses correct default initial branch name' '
-	git init --bare empty &&
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=foo init --bare empty &&
+	test_config -C empty lsrefs.unborn advertise &&
 	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
 	git -c init.defaultBranch=up clone empty whats-up &&
-	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
-	test refs/heads/up = $(git -C whats-up config branch.up.merge)
+	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
+	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
 '
 
 test_expect_success 'guesses initial branch name correctly' '
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d5b17909b..b2ead93af9 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -209,6 +209,31 @@ test_expect_success 'clone with file:// using protocol v2' '
 	grep "ref-prefix refs/tags/" log
 '
 
+test_expect_success 'clone of empty repo propagates name of default branch' '
+	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
+test_expect_success '...but not if explicitly forbidden by config' '
+	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+	test_config -C file_empty_parent lsrefs.unborn ignore &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	! grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
 test_expect_success 'fetch with file:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
diff --git a/transport.h b/transport.h
index 1f5b60e4d3..24e15799e7 100644
--- a/transport.h
+++ b/transport.h
@@ -243,6 +243,14 @@ struct transport_ls_refs_options {
 	 * provided ref_prefixes.
 	 */
 	struct strvec ref_prefixes;
+
+	/*
+	 * If unborn_head_target is not NULL, and the remote reports HEAD as
+	 * pointing to an unborn branch, transport_get_remote_refs() stores the
+	 * unborn branch in unborn_head_target. It should be freed by the
+	 * caller.
+	 */
+	char *unborn_head_target;
 };
 #define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }
 
-- 
2.30.0.365.g02bc693789-goog
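
The client side of patch 3/3 comes down to two steps: pull the symref target out of an "unborn HEAD ..." line, then use it as the local branch name if it lives under refs/heads/, falling back to the locally configured default otherwise. The sketch below shows both steps in standalone C. It is not the code from connect.c or builtin/clone.c: after_prefix(), parse_unborn_head_target() and local_branch_name() are hypothetical names, and the input line is assumed to have had its trailing newline stripped already.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a pointer just past prefix if str starts with it, else NULL. */
static const char *after_prefix(const char *str, const char *prefix)
{
	size_t n = strlen(prefix);

	return strncmp(str, prefix, n) ? NULL : str + n;
}

/*
 * If line has the form "unborn HEAD ... symref-target:<target>",
 * return a malloc'd copy of <target> (to be freed by the caller).
 * Return NULL for any other line or on allocation failure.
 */
static char *parse_unborn_head_target(const char *line)
{
	const char *p;

	assert(line);
	p = after_prefix(line, "unborn HEAD");
	while (p && *p == ' ') {
		const char *tok = p + 1;
		size_t len = strcspn(tok, " ");
		const char *target = after_prefix(tok, "symref-target:");

		if (target) {
			size_t tlen = len - strlen("symref-target:");
			char *out = malloc(tlen + 1);

			if (!out)
				return NULL;
			memcpy(out, target, tlen);
			out[tlen] = '\0';
			return out;
		}
		p = tok + len;
	}
	return NULL;
}

/*
 * Pick the local branch name: the part of unborn_target after
 * "refs/heads/" if present, otherwise the configured fallback.
 */
static const char *local_branch_name(const char *unborn_target,
				     const char *fallback)
{
	const char *branch = unborn_target ?
		after_prefix(unborn_target, "refs/heads/") : NULL;

	return branch ? branch : fallback;
}
```

This corresponds to the updated t5606 test: a server whose unborn HEAD points at refs/heads/foo yields the local branch "foo", even when the client has init.defaultBranch=up.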



* Re: [PATCH v7 0/3] Cloning with remote unborn HEAD
  2021-02-05  4:58 ` [PATCH v7 0/3] Cloning with " Jonathan Tan
                     ` (2 preceding siblings ...)
  2021-02-05  4:58   ` [PATCH v7 3/3] clone: respect remote unborn HEAD Jonathan Tan
@ 2021-02-05  5:25   ` Junio C Hamano
  2021-02-05 16:15     ` Jeff King
  2021-02-05 21:15     ` Ævar Arnfjörð Bjarmason
  3 siblings, 2 replies; 109+ messages in thread
From: Junio C Hamano @ 2021-02-05  5:25 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, Jeff King

Jonathan Tan <jonathantanmy@google.com> writes:

> For what it's worth, here's v7 with advertise/allow/ignore and by
> default, advertise. I think that some server operators will have use for
> this feature, and people who want to disable it for whatever reason can
> still do so. The main disadvantage is complexity - the server knob that
> server administrators will need to control (but between a simpler
> allow/ignore knob and a more complicated advertise/allow/ignore knob, I
> think we might as well go for the slightly more complex one) and
> complexity in the code (but now that is constrained to one function and
> a few global variables).
>
> As you can see from the range-diff, not much has changed from v6.
>
> I've also included Junio's suggestion of tightening the promise made by
> the server (when the client says "unborn").

This looks reasonable overall, especially with the feature turned on
by default, we'd hopefully get reasonable exposure from the get-go.

Let's mark the topic to be merged to 'next' soonish, unless people
object.

Thanks.


* Re: [PATCH v7 1/3] ls-refs: report unborn targets of symrefs
  2021-02-05  4:58   ` [PATCH v7 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
@ 2021-02-05 16:10     ` Jeff King
  0 siblings, 0 replies; 109+ messages in thread
From: Jeff King @ 2021-02-05 16:10 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, gitster

On Thu, Feb 04, 2021 at 08:58:31PM -0800, Jonathan Tan wrote:

> diff --git a/Documentation/config/lsrefs.txt b/Documentation/config/lsrefs.txt
> new file mode 100644
> index 0000000000..e003856c08
> --- /dev/null
> +++ b/Documentation/config/lsrefs.txt
> @@ -0,0 +1,9 @@
> +lsrefs.unborn::
> +	May be "advertise" (the default), "allow", or "ignore". If "advertise",
> +	the server will respond to the client sending "unborn" (as described in
> +	protocol-v2.txt) and will advertise support for this feature during the
> +	protocol v2 capability advertisement. "allow" is the same as
> +	"advertise" except that the server will not advertise support for this
> +	feature; this is useful for load-balanced servers that cannot be
> +	updated automatically (for example), since the administrator could
> +	configure "allow", then after a delay, configure "advertise".

Minor nit, but did you mean "updated atomically" in the second-to-last
line?

> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index 85daeb5d9e..f772d90eaf 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -192,11 +192,20 @@ ls-refs takes in the following arguments:
>  	When specified, only references having a prefix matching one of
>  	the provided prefixes are displayed.
>  
> +If the 'unborn' feature is advertised the following argument can be
> +included in the client's request.
> +
> +    unborn
> +	The server will send information about HEAD even if it is a symref
> +	pointing to an unborn branch in the form "unborn HEAD
> +	symref-target:<target>".

I saw there was some discussion back-and-forth on the wording here.
Reading this, I'm left with the impression that an implementation would
be wrong (i.e., violating the protocol) to send a non-HEAD symref with
the unborn value.

I'd have thought we would describe the protocol in the most liberal
fashion, so that clients would be made ready to handle any future
improvements on the server side.

That kind of sounds like the opposite of what Junio said in [1]. And I'm
not sure it's worth re-opening if it's going to derail or delay this
otherwise useful feature. So I offer it as my 2 cents, but not a real
objection to moving forward.

[1] https://lore.kernel.org/git/xmqq1rdyf71k.fsf@gitster.c.googlers.com/

> diff --git a/ls-refs.c b/ls-refs.c
> index a1e0b473e4..e08fd43e7a 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -7,6 +7,39 @@
>  #include "pkt-line.h"
>  #include "config.h"
>  
> +static int config_read;
> +static int advertise_unborn;
> +static int allow_unborn;

OK, we're back to the three-way config. I'm happy with it, as the
default is now "advertise+accept". :)
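
For readers following along, the three-way knob maps onto the two flags roughly as sketched below, going by the lsrefs.unborn documentation quoted above. This is a hypothetical standalone version, not the ensure_config_read() from the patch. The unborn_policy struct and the -1 return for an unrecognized value are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

struct unborn_policy {
	int advertise;	/* list "unborn" in the capability advertisement */
	int allow;	/* honor "unborn" when a client sends it */
};

/*
 * Translate an lsrefs.unborn value into flags. A NULL value means the
 * setting is absent, which is treated as the default ("advertise").
 * Returns 0 on success, -1 for an unrecognized value.
 */
static int parse_unborn_policy(const char *value, struct unborn_policy *out)
{
	assert(out);
	if (!value || !strcmp(value, "advertise")) {
		out->advertise = 1;
		out->allow = 1;
	} else if (!strcmp(value, "allow")) {
		out->advertise = 0;
		out->allow = 1;
	} else if (!strcmp(value, "ignore")) {
		out->advertise = 0;
		out->allow = 0;
	} else {
		return -1;
	}
	return 0;
}
```

The "allow" state is what makes a staged rollout possible: every server in a load-balanced pool can first be set to accept "unborn" without advertising it, and only later switched to "advertise".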

> +static void ensure_config_read(void)
> +{
> +	char *str = NULL;
> +
> +	if (config_read)
> +		return;
> +
> +	if (repo_config_get_string(the_repository, "lsrefs.unborn", &str)) {

This function leaks "str". I think you can use repo_config_get_string_tmp()
to avoid the allocation entirely.

(I think the initialization to NULL is also unnecessary, but I don't
mind it).

The rest of the patch looks good to me.

-Peff


* Re: [PATCH v7 0/3] Cloning with remote unborn HEAD
  2021-02-05  5:25   ` [PATCH v7 0/3] Cloning with " Junio C Hamano
@ 2021-02-05 16:15     ` Jeff King
  2021-02-05 21:15     ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 109+ messages in thread
From: Jeff King @ 2021-02-05 16:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git

On Thu, Feb 04, 2021 at 09:25:57PM -0800, Junio C Hamano wrote:

> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> > For what it's worth, here's v7 with advertise/allow/ignore and by
> > default, advertise. I think that some server operators will have use for
> > this feature, and people who want to disable it for whatever reason can
> > still do so. The main disadvantage is complexity - the server knob that
> > server administrators will need to control (but between a simpler
> > allow/ignore knob and a more complicated advertise/allow/ignore knob, I
> > think we might as well go for the slightly more complex one) and
> > complexity in the code (but now that is constrained to one function and
> > a few global variables).
> >
> > As you can see from the range-diff, not much has changed from v6.
> >
> > I've also included Junio's suggestion of tightening the promise made by
> > the server (when the client says "unborn").
> 
> This looks reasonable overall, especially with the feature turned on
> by default, we'd hopefully get reasonable exposure from the get-go.
> 
> Let's mark the topic to be merged to 'next' soonish, unless people
> object.

No objection here. I sent a few comments in response to patch 1; the doc
fix and the leak are probably worth addressing before it hits next. I
couldn't help expressing my thoughts on the protocol wording, but it may
be best to ignore me. ;)

Thanks for working on this, Jonathan. I think it's a very useful
feature.

-Peff


* [PATCH v8 0/3] Cloning with remote unborn HEAD
  2020-12-08  1:31 Cloning empty repository uses locally configured default branch name Jonathan Tan
                   ` (5 preceding siblings ...)
  2021-02-05  4:58 ` [PATCH v7 0/3] Cloning with " Jonathan Tan
@ 2021-02-05 20:48 ` Jonathan Tan
  2021-02-05 20:48   ` [PATCH v8 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
                     ` (3 more replies)
  6 siblings, 4 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-02-05 20:48 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

Peff sent a review (which I don't see in lore.kernel.org/git, but I do
see it in my inbox); here's v8 in response to that.

As you can see from the range-diff, there are just minor changes to v7
(wording in documentation and a memory leak fix).

Jonathan Tan (3):
  ls-refs: report unborn targets of symrefs
  connect, transport: encapsulate arg in struct
  clone: respect remote unborn HEAD

 Documentation/config.txt                |  2 +
 Documentation/config/init.txt           |  2 +-
 Documentation/config/lsrefs.txt         |  9 +++
 Documentation/technical/protocol-v2.txt | 11 +++-
 builtin/clone.c                         | 34 +++++++++---
 builtin/fetch-pack.c                    |  3 +-
 builtin/fetch.c                         | 18 +++---
 builtin/ls-remote.c                     |  9 +--
 connect.c                               | 32 ++++++++++-
 ls-refs.c                               | 74 ++++++++++++++++++++++++-
 ls-refs.h                               |  1 +
 remote.h                                |  4 +-
 serve.c                                 |  2 +-
 t/t5606-clone-options.sh                |  8 ++-
 t/t5701-git-serve.sh                    |  2 +-
 t/t5702-protocol-v2.sh                  | 25 +++++++++
 transport-helper.c                      |  5 +-
 transport-internal.h                    | 10 +---
 transport.c                             | 23 ++++----
 transport.h                             | 29 +++++++---
 20 files changed, 240 insertions(+), 63 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt

Range-diff against v7:
1:  2d35075369 ! 1:  8b0f55b5e4 ls-refs: report unborn targets of symrefs
    @@ Documentation/config/lsrefs.txt (new)
     +	protocol v2 capability advertisement. "allow" is the same as
     +	"advertise" except that the server will not advertise support for this
     +	feature; this is useful for load-balanced servers that cannot be
    -+	updated automatically (for example), since the administrator could
    ++	updated atomically (for example), since the administrator could
     +	configure "allow", then after a delay, configure "advertise".
     
      ## Documentation/technical/protocol-v2.txt ##
    @@ ls-refs.c
     +
     +static void ensure_config_read(void)
     +{
    -+	char *str = NULL;
    ++	const char *str = NULL;
     +
     +	if (config_read)
     +		return;
     +
    -+	if (repo_config_get_string(the_repository, "lsrefs.unborn", &str)) {
    ++	if (repo_config_get_string_tmp(the_repository, "lsrefs.unborn", &str)) {
     +		/*
     +		 * If there is no such config, advertise and allow it by
     +		 * default.
2:  d4ed13d02e = 2:  f09bd56d5f connect, transport: encapsulate arg in struct
3:  a3e5a0a7c5 = 3:  a5495a42f1 clone: respect remote unborn HEAD
-- 
2.30.0.365.g02bc693789-goog



* [PATCH v8 1/3] ls-refs: report unborn targets of symrefs
  2021-02-05 20:48 ` [PATCH v8 " Jonathan Tan
@ 2021-02-05 20:48   ` Jonathan Tan
  2021-02-05 20:48   ` [PATCH v8 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-02-05 20:48 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

When cloning, we choose the default branch based on the remote HEAD.
But if there is no remote HEAD reported (which could happen if the
target of the remote HEAD is unborn), we'll fall back to using our local
init.defaultBranch. Traditionally this hasn't been a big deal, because
most repos used "master" as the default. But these days it is likely to
cause confusion if the server and client implementations choose
different values (e.g., if the remote started with "main", we may choose
"master" locally, create commits there, and then the user is surprised
when they push to "master" and not "main").

To solve this, the remote needs to communicate the target of the HEAD
symref, even if it is unborn, and "git clone" needs to use this
information.

Currently, symrefs that have unborn targets (such as in this case) are
not communicated by the protocol. Teach Git to advertise and support the
"unborn" feature in "ls-refs" (by default, this is advertised, but
server administrators may turn this off through the lsrefs.unborn
config). This feature indicates that "ls-refs" supports the "unborn"
argument; when it is specified, "ls-refs" will send the HEAD symref with
the name of its unborn target.

This change is only for protocol v2. A similar change for protocol v0
would require independent protocol design (there being no analogous
position to signal support for "unborn") and client-side plumbing of the
data required, so the scope of this patch set is limited to protocol v2.

The client side will be updated to use this in a subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt                |  2 +
 Documentation/config/lsrefs.txt         |  9 +++
 Documentation/technical/protocol-v2.txt | 11 +++-
 ls-refs.c                               | 74 ++++++++++++++++++++++++-
 ls-refs.h                               |  1 +
 serve.c                                 |  2 +-
 t/t5701-git-serve.sh                    |  2 +-
 7 files changed, 95 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/config/lsrefs.txt
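
For reference while reading the lsrefs.txt and ensure_config_read()
hunks in this patch: the three settings reduce to two independent flags.
The following is a minimal standalone sketch of that mapping, not the
patch's code. struct unborn_policy and parse_unborn_policy() are
hypothetical names, and a NULL value stands in for an unset config.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct unborn_policy {
	int advertise; /* announce "unborn" in the capability advertisement */
	int allow;     /* honor a client's "unborn" argument */
};

/*
 * Hypothetical helper. Maps a config value to the two flags.
 * Returns 0 on success and -1 if the value is not recognized.
 */
static int parse_unborn_policy(const char *value, struct unborn_policy *out)
{
	assert(out != NULL);
	out->advertise = 0;
	out->allow = 0;

	if (!value || !strcmp(value, "advertise")) {
		/* Unset behaves like "advertise", the documented default. */
		out->advertise = 1;
		out->allow = 1;
	} else if (!strcmp(value, "allow")) {
		out->allow = 1;
	} else if (strcmp(value, "ignore")) {
		return -1;
	}
	return 0;
}
```

With "allow", a server honors "unborn" from clients that send it but
does not announce it, which is what makes the staged rollout described
in the documentation possible.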

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 6ba50b1104..d08e83a148 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -398,6 +398,8 @@ include::config/interactive.txt[]
 
 include::config/log.txt[]
 
+include::config/lsrefs.txt[]
+
 include::config/mailinfo.txt[]
 
 include::config/mailmap.txt[]
diff --git a/Documentation/config/lsrefs.txt b/Documentation/config/lsrefs.txt
new file mode 100644
index 0000000000..adeda0f24d
--- /dev/null
+++ b/Documentation/config/lsrefs.txt
@@ -0,0 +1,9 @@
+lsrefs.unborn::
+	May be "advertise" (the default), "allow", or "ignore". If "advertise",
+	the server will respond to the client sending "unborn" (as described in
+	protocol-v2.txt) and will advertise support for this feature during the
+	protocol v2 capability advertisement. "allow" is the same as
+	"advertise" except that the server will not advertise support for this
+	feature; this is useful for load-balanced servers that cannot be
+	updated atomically (for example), since the administrator could
+	configure "allow", then after a delay, configure "advertise".
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 85daeb5d9e..f772d90eaf 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -192,11 +192,20 @@ ls-refs takes in the following arguments:
 	When specified, only references having a prefix matching one of
 	the provided prefixes are displayed.
 
+If the 'unborn' feature is advertised the following argument can be
+included in the client's request.
+
+    unborn
+	The server will send information about HEAD even if it is a symref
+	pointing to an unborn branch in the form "unborn HEAD
+	symref-target:<target>".
+
 The output of ls-refs is as follows:
 
     output = *ref
 	     flush-pkt
-    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+    obj-id-or-unborn = (obj-id | "unborn")
+    ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF)
     ref-attribute = (symref | peeled)
     symref = "symref-target:" symref-target
     peeled = "peeled:" obj-id
diff --git a/ls-refs.c b/ls-refs.c
index a1e0b473e4..32deb7be44 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -7,6 +7,39 @@
 #include "pkt-line.h"
 #include "config.h"
 
+static int config_read;
+static int advertise_unborn;
+static int allow_unborn;
+
+static void ensure_config_read(void)
+{
+	const char *str = NULL;
+
+	if (config_read)
+		return;
+
+	if (repo_config_get_string_tmp(the_repository, "lsrefs.unborn", &str)) {
+		/*
+		 * If there is no such config, advertise and allow it by
+		 * default.
+		 */
+		advertise_unborn = 1;
+		allow_unborn = 1;
+	} else {
+		if (!strcmp(str, "advertise")) {
+			advertise_unborn = 1;
+			allow_unborn = 1;
+		} else if (!strcmp(str, "allow")) {
+			allow_unborn = 1;
+		} else if (!strcmp(str, "ignore")) {
+			/* do nothing */
+		} else {
+			die(_("invalid value '%s' for lsrefs.unborn"), str);
+		}
+	}
+	config_read = 1;
+}
+
 /*
  * Check if one of the prefixes is a prefix of the ref.
  * If no prefixes were provided, all refs match.
@@ -32,6 +65,7 @@ struct ls_refs_data {
 	unsigned peel;
 	unsigned symrefs;
 	struct strvec prefixes;
+	unsigned unborn : 1;
 };
 
 static int send_ref(const char *refname, const struct object_id *oid,
@@ -47,7 +81,10 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	if (!ref_match(&data->prefixes, refname_nons))
 		return 0;
 
-	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (oid)
+		strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	else
+		strbuf_addf(&refline, "unborn %s", refname_nons);
 	if (data->symrefs && flag & REF_ISSYMREF) {
 		struct object_id unused;
 		const char *symref_target = resolve_ref_unsafe(refname, 0,
@@ -61,7 +98,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 			    strip_namespace(symref_target));
 	}
 
-	if (data->peel) {
+	if (data->peel && oid) {
 		struct object_id peeled;
 		if (!peel_ref(refname, &peeled))
 			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));
@@ -74,6 +111,23 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
+static void send_possibly_unborn_head(struct ls_refs_data *data)
+{
+	struct strbuf namespaced = STRBUF_INIT;
+	struct object_id oid;
+	int flag;
+	int oid_is_null;
+
+	strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
+	if (!resolve_ref_unsafe(namespaced.buf, 0, &oid, &flag))
+		return; /* bad ref */
+	oid_is_null = is_null_oid(&oid);
+	if (!oid_is_null ||
+	    (data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
+		send_ref(namespaced.buf, oid_is_null ? NULL : &oid, flag, data);
+	strbuf_release(&namespaced);
+}
+
 static int ls_refs_config(const char *var, const char *value, void *data)
 {
 	/*
@@ -91,6 +145,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
 
 	memset(&data, 0, sizeof(data));
 
+	ensure_config_read();
 	git_config(ls_refs_config, NULL);
 
 	while (packet_reader_read(request) == PACKET_READ_NORMAL) {
@@ -103,14 +158,27 @@ int ls_refs(struct repository *r, struct strvec *keys,
 			data.symrefs = 1;
 		else if (skip_prefix(arg, "ref-prefix ", &out))
 			strvec_push(&data.prefixes, out);
+		else if (!strcmp("unborn", arg))
+			data.unborn = allow_unborn;
 	}
 
 	if (request->status != PACKET_READ_FLUSH)
 		die(_("expected flush after ls-refs arguments"));
 
-	head_ref_namespaced(send_ref, &data);
+	send_possibly_unborn_head(&data);
 	for_each_namespaced_ref(send_ref, &data);
 	packet_flush(1);
 	strvec_clear(&data.prefixes);
 	return 0;
 }
+
+int ls_refs_advertise(struct repository *r, struct strbuf *value)
+{
+	if (value) {
+		ensure_config_read();
+		if (advertise_unborn)
+			strbuf_addstr(value, "unborn");
+	}
+
+	return 1;
+}
diff --git a/ls-refs.h b/ls-refs.h
index 7b33a7c6b8..a99e4be0bd 100644
--- a/ls-refs.h
+++ b/ls-refs.h
@@ -6,5 +6,6 @@ struct strvec;
 struct packet_reader;
 int ls_refs(struct repository *r, struct strvec *keys,
 	    struct packet_reader *request);
+int ls_refs_advertise(struct repository *r, struct strbuf *value);
 
 #endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index eec2fe6f29..ac20c72763 100644
--- a/serve.c
+++ b/serve.c
@@ -73,7 +73,7 @@ struct protocol_capability {
 
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
-	{ "ls-refs", always_advertise, ls_refs },
+	{ "ls-refs", ls_refs_advertise, ls_refs },
 	{ "fetch", upload_pack_advertise, upload_pack_v2 },
 	{ "server-option", always_advertise, NULL },
 	{ "object-format", object_format_advertise, NULL },
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index a1f5fdc9fd..df29504161 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -12,7 +12,7 @@ test_expect_success 'test capability advertisement' '
 	cat >expect <<-EOF &&
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
-	ls-refs
+	ls-refs=unborn
 	fetch=shallow
 	server-option
 	object-format=$(test_oid algo)
-- 
2.30.0.365.g02bc693789-goog
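
The ref line grammar added to protocol-v2.txt in this patch can be
sketched as a formatter. This is an illustrative standalone function,
not the code in ls-refs.c: format_ref_line() is a hypothetical name, the
pkt-line length framing and trailing LF are left out, and the object ID
is passed in as an already hex-encoded string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper. Writes "<obj-id-or-unborn> SP <refname>" into buf
 * and, if symref_target is non-NULL, appends " symref-target:<target>".
 * A NULL hex_oid means the ref is unborn. Returns the snprintf() result.
 */
static int format_ref_line(char *buf, size_t size, const char *hex_oid,
			   const char *refname, const char *symref_target)
{
	const char *id = hex_oid ? hex_oid : "unborn";

	/* Fields are space-separated, so a refname must not contain SP. */
	assert(buf && refname && !strchr(refname, ' '));
	if (symref_target)
		return snprintf(buf, size, "%s %s symref-target:%s",
				id, refname, symref_target);
	return snprintf(buf, size, "%s %s", id, refname);
}
```

For an empty repository whose HEAD points at refs/heads/main, this
yields the line "unborn HEAD symref-target:refs/heads/main".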



* [PATCH v8 2/3] connect, transport: encapsulate arg in struct
  2021-02-05 20:48 ` [PATCH v8 " Jonathan Tan
  2021-02-05 20:48   ` [PATCH v8 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
@ 2021-02-05 20:48   ` Jonathan Tan
  2021-02-05 20:48   ` [PATCH v8 3/3] clone: respect remote unborn HEAD Jonathan Tan
  2021-02-06 18:51   ` [PATCH v8 0/3] Cloning with " Junio C Hamano
  3 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-02-05 20:48 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

A future patch will return the name of an unborn current branch from
deep in the call chain to the callers of get_remote_refs() and
transport_get_remote_refs(), via a new pointer parameter that points at
a variable in the caller.

In preparation for that, encapsulate the existing ref_prefixes
parameter into a struct. The aforementioned unborn current branch will
go into this new struct in the future patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/clone.c      | 18 +++++++++++-------
 builtin/fetch-pack.c |  3 ++-
 builtin/fetch.c      | 18 +++++++++++-------
 builtin/ls-remote.c  |  9 +++++----
 connect.c            |  4 +++-
 remote.h             |  4 +++-
 transport-helper.c   |  5 +++--
 transport-internal.h | 10 ++--------
 transport.c          | 23 ++++++++++++-----------
 transport.h          | 21 ++++++++++++++-------
 10 files changed, 66 insertions(+), 49 deletions(-)
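
The refactoring pattern used by this patch, shown in isolation: wrapping
a lone parameter in an options struct with a static initializer lets a
later patch add a field without touching every signature again. This is
a standalone sketch with hypothetical names (struct list_options,
LIST_OPTIONS_INIT, list_refs()), not the transport API itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct; a later change can append fields here. */
struct list_options {
	const char *const *prefixes; /* NULL-terminated, or NULL for "all" */
};
#define LIST_OPTIONS_INIT { NULL }

/*
 * Count the refs that match any prefix. A NULL opts, like a NULL prefix
 * list, means "no filtering", so callers with no options may pass NULL.
 */
static size_t list_refs(const char *const *refs, size_t nr,
			const struct list_options *opts)
{
	const char *const *prefixes = opts ? opts->prefixes : NULL;
	size_t i, matched = 0;

	assert(refs || !nr);
	for (i = 0; i < nr; i++) {
		const char *const *p;

		if (!prefixes) {
			matched++;
			continue;
		}
		for (p = prefixes; *p; p++) {
			if (!strncmp(refs[i], *p, strlen(*p))) {
				matched++;
				break;
			}
		}
	}
	return matched;
}
```

Adding, say, an output field to struct list_options would leave the
list_refs() prototype and all of its callers unchanged, which is the
point of the preparatory step.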

diff --git a/builtin/clone.c b/builtin/clone.c
index a5630337e4..211d4f54b0 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -979,7 +979,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	int err = 0, complete_refs_before_fetch = 1;
 	int submodule_progress;
 
-	struct strvec ref_prefixes = STRVEC_INIT;
+	struct transport_ls_refs_options transport_ls_refs_options =
+		TRANSPORT_LS_REFS_OPTIONS_INIT;
 
 	packet_trace_identity("clone");
 
@@ -1257,14 +1258,17 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		transport->smart_options->check_self_contained_and_connected = 1;
 
 
-	strvec_push(&ref_prefixes, "HEAD");
-	refspec_ref_prefixes(&remote->fetch, &ref_prefixes);
+	strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
+	refspec_ref_prefixes(&remote->fetch,
+			     &transport_ls_refs_options.ref_prefixes);
 	if (option_branch)
-		expand_ref_prefix(&ref_prefixes, option_branch);
+		expand_ref_prefix(&transport_ls_refs_options.ref_prefixes,
+				  option_branch);
 	if (!option_no_tags)
-		strvec_push(&ref_prefixes, "refs/tags/");
+		strvec_push(&transport_ls_refs_options.ref_prefixes,
+			    "refs/tags/");
 
-	refs = transport_get_remote_refs(transport, &ref_prefixes);
+	refs = transport_get_remote_refs(transport, &transport_ls_refs_options);
 
 	if (refs) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
@@ -1380,6 +1384,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	strbuf_release(&key);
 	junk_mode = JUNK_LEAVE_ALL;
 
-	strvec_clear(&ref_prefixes);
+	strvec_clear(&transport_ls_refs_options.ref_prefixes);
 	return err;
 }
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 58b7c1fbdc..c2d96f4c89 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -220,7 +220,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	version = discover_version(&reader);
 	switch (version) {
 	case protocol_v2:
-		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL, args.stateless_rpc);
+		get_remote_refs(fd[1], &reader, &ref, 0, NULL, NULL,
+				args.stateless_rpc);
 		break;
 	case protocol_v1:
 	case protocol_v0:
diff --git a/builtin/fetch.c b/builtin/fetch.c
index ecf8537605..837382ef4f 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1349,7 +1349,8 @@ static int do_fetch(struct transport *transport,
 	int autotags = (transport->remote->fetch_tags == 1);
 	int retcode = 0;
 	const struct ref *remote_refs;
-	struct strvec ref_prefixes = STRVEC_INIT;
+	struct transport_ls_refs_options transport_ls_refs_options =
+		TRANSPORT_LS_REFS_OPTIONS_INIT;
 	int must_list_refs = 1;
 
 	if (tags == TAGS_DEFAULT) {
@@ -1369,7 +1370,7 @@ static int do_fetch(struct transport *transport,
 	if (rs->nr) {
 		int i;
 
-		refspec_ref_prefixes(rs, &ref_prefixes);
+		refspec_ref_prefixes(rs, &transport_ls_refs_options.ref_prefixes);
 
 		/*
 		 * We can avoid listing refs if all of them are exact
@@ -1383,22 +1384,25 @@ static int do_fetch(struct transport *transport,
 			}
 		}
 	} else if (transport->remote && transport->remote->fetch.nr)
-		refspec_ref_prefixes(&transport->remote->fetch, &ref_prefixes);
+		refspec_ref_prefixes(&transport->remote->fetch,
+				     &transport_ls_refs_options.ref_prefixes);
 
 	if (tags == TAGS_SET || tags == TAGS_DEFAULT) {
 		must_list_refs = 1;
-		if (ref_prefixes.nr)
-			strvec_push(&ref_prefixes, "refs/tags/");
+		if (transport_ls_refs_options.ref_prefixes.nr)
+			strvec_push(&transport_ls_refs_options.ref_prefixes,
+				    "refs/tags/");
 	}
 
 	if (must_list_refs) {
 		trace2_region_enter("fetch", "remote_refs", the_repository);
-		remote_refs = transport_get_remote_refs(transport, &ref_prefixes);
+		remote_refs = transport_get_remote_refs(transport,
+							&transport_ls_refs_options);
 		trace2_region_leave("fetch", "remote_refs", the_repository);
 	} else
 		remote_refs = NULL;
 
-	strvec_clear(&ref_prefixes);
+	strvec_clear(&transport_ls_refs_options.ref_prefixes);
 
 	ref_map = get_ref_map(transport->remote, remote_refs, rs,
 			      tags, &autotags);
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index 092917eca2..ef604752a0 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -45,7 +45,8 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	int show_symref_target = 0;
 	const char *uploadpack = NULL;
 	const char **pattern = NULL;
-	struct strvec ref_prefixes = STRVEC_INIT;
+	struct transport_ls_refs_options transport_options =
+		TRANSPORT_LS_REFS_OPTIONS_INIT;
 	int i;
 	struct string_list server_options = STRING_LIST_INIT_DUP;
 
@@ -94,9 +95,9 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	}
 
 	if (flags & REF_TAGS)
-		strvec_push(&ref_prefixes, "refs/tags/");
+		strvec_push(&transport_options.ref_prefixes, "refs/tags/");
 	if (flags & REF_HEADS)
-		strvec_push(&ref_prefixes, "refs/heads/");
+		strvec_push(&transport_options.ref_prefixes, "refs/heads/");
 
 	remote = remote_get(dest);
 	if (!remote) {
@@ -118,7 +119,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (server_options.nr)
 		transport->server_options = &server_options;
 
-	ref = transport_get_remote_refs(transport, &ref_prefixes);
+	ref = transport_get_remote_refs(transport, &transport_options);
 	if (ref) {
 		int hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport));
 		repo_set_hash_algo(the_repository, hash_algo);
diff --git a/connect.c b/connect.c
index 8b8f56cf6d..328c279250 100644
--- a/connect.c
+++ b/connect.c
@@ -453,12 +453,14 @@ void check_stateless_delimiter(int stateless_rpc,
 
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
-			     const struct strvec *ref_prefixes,
+			     struct transport_ls_refs_options *transport_options,
 			     const struct string_list *server_options,
 			     int stateless_rpc)
 {
 	int i;
 	const char *hash_name;
+	struct strvec *ref_prefixes = transport_options ?
+		&transport_options->ref_prefixes : NULL;
 	*list = NULL;
 
 	if (server_supports_v2("ls-refs", 1))
diff --git a/remote.h b/remote.h
index 3211abdf05..4ae676a11b 100644
--- a/remote.h
+++ b/remote.h
@@ -6,6 +6,8 @@
 #include "hashmap.h"
 #include "refspec.h"
 
+struct transport_ls_refs_options;
+
 /**
  * The API gives access to the configuration related to remotes. It handles
  * all three configuration mechanisms historically and currently used by Git,
@@ -196,7 +198,7 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 /* Used for protocol v2 in order to retrieve refs from a remote */
 struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 			     struct ref **list, int for_push,
-			     const struct strvec *ref_prefixes,
+			     struct transport_ls_refs_options *transport_options,
 			     const struct string_list *server_options,
 			     int stateless_rpc);
 
diff --git a/transport-helper.c b/transport-helper.c
index 5f6e0b3bd8..49b7fb4dcb 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1162,13 +1162,14 @@ static int has_attribute(const char *attrs, const char *attr)
 }
 
 static struct ref *get_refs_list(struct transport *transport, int for_push,
-				 const struct strvec *ref_prefixes)
+				 struct transport_ls_refs_options *transport_options)
 {
 	get_helper(transport);
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push, ref_prefixes);
+		return transport->vtable->get_refs_list(transport, for_push,
+							transport_options);
 	}
 
 	return get_refs_list_using_list(transport, for_push);
diff --git a/transport-internal.h b/transport-internal.h
index 27c9daffc4..b60f1ba907 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -4,6 +4,7 @@
 struct ref;
 struct transport;
 struct strvec;
+struct transport_ls_refs_options;
 
 struct transport_vtable {
 	/**
@@ -18,19 +19,12 @@ struct transport_vtable {
 	 * the transport to try to share connections, for_push is a
 	 * hint as to whether the ultimate operation is a push or a fetch.
 	 *
-	 * If communicating using protocol v2 a list of prefixes can be
-	 * provided to be sent to the server to enable it to limit the ref
-	 * advertisement.  Since ref filtering is done on the server's end, and
-	 * only when using protocol v2, this list will be ignored when not
-	 * using protocol v2 meaning this function can return refs which don't
-	 * match the provided ref_prefixes.
-	 *
 	 * If the transport is able to determine the remote hash for
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
 	 **/
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
-				     const struct strvec *ref_prefixes);
+				     struct transport_ls_refs_options *transport_options);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index 679a35e7c1..b13fab5dc3 100644
--- a/transport.c
+++ b/transport.c
@@ -127,7 +127,7 @@ struct bundle_transport_data {
 
 static struct ref *get_refs_from_bundle(struct transport *transport,
 					int for_push,
-					const struct strvec *ref_prefixes)
+					struct transport_ls_refs_options *transport_options)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -280,7 +280,7 @@ static void die_if_server_options(struct transport *transport)
  * remote refs.
  */
 static struct ref *handshake(struct transport *transport, int for_push,
-			     const struct strvec *ref_prefixes,
+			     struct transport_ls_refs_options *options,
 			     int must_list_refs)
 {
 	struct git_transport_data *data = transport->data;
@@ -303,7 +303,7 @@ static struct ref *handshake(struct transport *transport, int for_push,
 			trace2_data_string("transfer", NULL, "server-sid", server_sid);
 		if (must_list_refs)
 			get_remote_refs(data->fd[1], &reader, &refs, for_push,
-					ref_prefixes,
+					options,
 					transport->server_options,
 					transport->stateless_rpc);
 		break;
@@ -334,9 +334,9 @@ static struct ref *handshake(struct transport *transport, int for_push,
 }
 
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
-					const struct strvec *ref_prefixes)
+					struct transport_ls_refs_options *options)
 {
-	return handshake(transport, for_push, ref_prefixes, 1);
+	return handshake(transport, for_push, options, 1);
 }
 
 static int fetch_refs_via_pack(struct transport *transport,
@@ -1252,19 +1252,20 @@ int transport_push(struct repository *r,
 		int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
 		int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
 		int push_ret, ret, err;
-		struct strvec ref_prefixes = STRVEC_INIT;
+		struct transport_ls_refs_options transport_options =
+			TRANSPORT_LS_REFS_OPTIONS_INIT;
 
 		if (check_push_refs(local_refs, rs) < 0)
 			return -1;
 
-		refspec_ref_prefixes(rs, &ref_prefixes);
+		refspec_ref_prefixes(rs, &transport_options.ref_prefixes);
 
 		trace2_region_enter("transport_push", "get_refs_list", r);
 		remote_refs = transport->vtable->get_refs_list(transport, 1,
-							       &ref_prefixes);
+							       &transport_options);
 		trace2_region_leave("transport_push", "get_refs_list", r);
 
-		strvec_clear(&ref_prefixes);
+		strvec_clear(&transport_options.ref_prefixes);
 
 		if (flags & TRANSPORT_PUSH_ALL)
 			match_flags |= MATCH_REFS_ALL;
@@ -1380,12 +1381,12 @@ int transport_push(struct repository *r,
 }
 
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes)
+					    struct transport_ls_refs_options *transport_options)
 {
 	if (!transport->got_remote_refs) {
 		transport->remote_refs =
 			transport->vtable->get_refs_list(transport, 0,
-							 ref_prefixes);
+							 transport_options);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 24558c027d..1f5b60e4d3 100644
--- a/transport.h
+++ b/transport.h
@@ -233,17 +233,24 @@ int transport_push(struct repository *repo,
 		   struct refspec *rs, int flags,
 		   unsigned int * reject_reasons);
 
+struct transport_ls_refs_options {
+	/*
+	 * Optionally, a list of ref prefixes can be provided which can be sent
+	 * to the server (when communicating using protocol v2) to enable it to
+	 * limit the ref advertisement.  Since ref filtering is done on the
+	 * server's end (and only when using protocol v2),
+	 * transport_get_remote_refs() could return refs which don't match the
+	 * provided ref_prefixes.
+	 */
+	struct strvec ref_prefixes;
+};
+#define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }
+
 /*
  * Retrieve refs from a remote.
- *
- * Optionally a list of ref prefixes can be provided which can be sent to the
- * server (when communicating using protocol v2) to enable it to limit the ref
- * advertisement.  Since ref filtering is done on the server's end (and only
- * when using protocol v2), this can return refs which don't match the provided
- * ref_prefixes.
  */
 const struct ref *transport_get_remote_refs(struct transport *transport,
-					    const struct strvec *ref_prefixes);
+					    struct transport_ls_refs_options *transport_options);
 
 /*
  * Fetch the hash algorithm used by a remote.
-- 
2.30.0.365.g02bc693789-goog



* [PATCH v8 3/3] clone: respect remote unborn HEAD
  2021-02-05 20:48 ` [PATCH v8 " Jonathan Tan
  2021-02-05 20:48   ` [PATCH v8 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
  2021-02-05 20:48   ` [PATCH v8 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
@ 2021-02-05 20:48   ` Jonathan Tan
  2021-02-06 18:51   ` [PATCH v8 0/3] Cloning with " Junio C Hamano
  3 siblings, 0 replies; 109+ messages in thread
From: Jonathan Tan @ 2021-02-05 20:48 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, peff

Teach Git to use the "unborn" feature introduced in a previous patch as
follows: Git will always send the "unborn" argument if it is supported
by the server. During "git clone", if cloning an empty repository, Git
will use the new information to determine the local branch to create. In
all other cases, Git will ignore it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config/init.txt |  2 +-
 builtin/clone.c               | 16 ++++++++++++++--
 connect.c                     | 28 ++++++++++++++++++++++++++--
 t/t5606-clone-options.sh      |  8 +++++---
 t/t5702-protocol-v2.sh        | 25 +++++++++++++++++++++++++
 transport.h                   |  8 ++++++++
 6 files changed, 79 insertions(+), 8 deletions(-)
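
The client-side behaviour described in the commit message can be
sketched in isolation. This is a simplified standalone illustration, not
the code in connect.c or builtin/clone.c: parse_unborn_head() and
pick_initial_branch() are hypothetical helpers, the ls-refs line is
taken as a plain NUL-terminated string with the pkt-line framing already
removed, and the fallback name is passed in by the caller rather than
read from init.defaultBranch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * If line is "unborn HEAD ..." carrying a "symref-target:<ref>"
 * attribute, return a pointer to <ref> inside line, else NULL.
 * Assumes the symref-target attribute is the last one on the line.
 */
static const char *parse_unborn_head(const char *line)
{
	static const char head[] = "unborn HEAD ";
	static const char attr[] = "symref-target:";
	const char *p;

	assert(line);
	if (strncmp(line, head, sizeof(head) - 1))
		return NULL;
	/* Walk the space-separated attributes that follow the refname. */
	for (p = line + sizeof(head) - 1; p; p = strchr(p, ' ')) {
		if (*p == ' ')
			p++;
		if (!strncmp(p, attr, sizeof(attr) - 1))
			return p + sizeof(attr) - 1;
	}
	return NULL;
}

/*
 * Pick the short branch name for a clone of an empty repository: use the
 * remote's unborn HEAD target if it is under refs/heads/, otherwise fall
 * back to the locally configured default.
 */
static const char *pick_initial_branch(const char *unborn_target,
				       const char *local_default)
{
	static const char heads[] = "refs/heads/";

	if (unborn_target && !strncmp(unborn_target, heads, sizeof(heads) - 1))
		return unborn_target + sizeof(heads) - 1;
	return local_default;
}
```

A server that does not send an "unborn HEAD" line, or one whose HEAD
points outside refs/heads/, leaves the client with its local default, as
before this series.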

diff --git a/Documentation/config/init.txt b/Documentation/config/init.txt
index dc77f8c844..79c79d6617 100644
--- a/Documentation/config/init.txt
+++ b/Documentation/config/init.txt
@@ -4,4 +4,4 @@ init.templateDir::
 
 init.defaultBranch::
 	Allows overriding the default branch name e.g. when initializing
-	a new repository or when cloning an empty repository.
+	a new repository.
diff --git a/builtin/clone.c b/builtin/clone.c
index 211d4f54b0..09dcd97a2e 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1330,8 +1330,19 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		remote_head = NULL;
 		option_no_checkout = 1;
 		if (!option_bare) {
-			const char *branch = git_default_branch_name();
-			char *ref = xstrfmt("refs/heads/%s", branch);
+			const char *branch;
+			char *ref;
+
+			if (transport_ls_refs_options.unborn_head_target &&
+			    skip_prefix(transport_ls_refs_options.unborn_head_target,
+					"refs/heads/", &branch)) {
+				ref = transport_ls_refs_options.unborn_head_target;
+				transport_ls_refs_options.unborn_head_target = NULL;
+				create_symref("HEAD", ref, reflog_msg.buf);
+			} else {
+				branch = git_default_branch_name();
+				ref = xstrfmt("refs/heads/%s", branch);
+			}
 
 			install_branch_config(0, branch, remote_name, ref);
 			free(ref);
@@ -1385,5 +1396,6 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	junk_mode = JUNK_LEAVE_ALL;
 
 	strvec_clear(&transport_ls_refs_options.ref_prefixes);
+	free(transport_ls_refs_options.unborn_head_target);
 	return err;
 }
diff --git a/connect.c b/connect.c
index 328c279250..879669df93 100644
--- a/connect.c
+++ b/connect.c
@@ -376,7 +376,8 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 }
 
 /* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
-static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
+static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
+			  char **unborn_head_target)
 {
 	int ret = 1;
 	int i = 0;
@@ -397,6 +398,25 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list)
 		goto out;
 	}
 
+	if (!strcmp("unborn", line_sections.items[i].string)) {
+		i++;
+		if (unborn_head_target &&
+		    !strcmp("HEAD", line_sections.items[i++].string)) {
+			/*
+			 * Look for the symref target (if any). If found,
+			 * return it to the caller.
+			 */
+			for (; i < line_sections.nr; i++) {
+				const char *arg = line_sections.items[i].string;
+
+				if (skip_prefix(arg, "symref-target:", &arg)) {
+					*unborn_head_target = xstrdup(arg);
+					break;
+				}
+			}
+		}
+		goto out;
+	}
 	if (parse_oid_hex_algop(line_sections.items[i++].string, &old_oid, &end, reader->hash_algo) ||
 	    *end) {
 		ret = 0;
@@ -461,6 +481,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	const char *hash_name;
 	struct strvec *ref_prefixes = transport_options ?
 		&transport_options->ref_prefixes : NULL;
+	char **unborn_head_target = transport_options ?
+		&transport_options->unborn_head_target : NULL;
 	*list = NULL;
 
 	if (server_supports_v2("ls-refs", 1))
@@ -490,6 +512,8 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 	if (!for_push)
 		packet_write_fmt(fd_out, "peel\n");
 	packet_write_fmt(fd_out, "symrefs\n");
+	if (server_supports_feature("ls-refs", "unborn", 0))
+		packet_write_fmt(fd_out, "unborn\n");
 	for (i = 0; ref_prefixes && i < ref_prefixes->nr; i++) {
 		packet_write_fmt(fd_out, "ref-prefix %s\n",
 				 ref_prefixes->v[i]);
@@ -498,7 +522,7 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
 
 	/* Process response from server */
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
-		if (!process_ref_v2(reader, &list))
+		if (!process_ref_v2(reader, &list, unborn_head_target))
 			die(_("invalid ls-refs response: %s"), reader->line);
 	}
 
diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
index 7f082fb23b..ca6339a5fb 100755
--- a/t/t5606-clone-options.sh
+++ b/t/t5606-clone-options.sh
@@ -102,11 +102,13 @@ test_expect_success 'redirected clone -v does show progress' '
 '
 
 test_expect_success 'chooses correct default initial branch name' '
-	git init --bare empty &&
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=foo init --bare empty &&
+	test_config -C empty lsrefs.unborn advertise &&
 	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
 	git -c init.defaultBranch=up clone empty whats-up &&
-	test refs/heads/up = $(git -C whats-up symbolic-ref HEAD) &&
-	test refs/heads/up = $(git -C whats-up config branch.up.merge)
+	test refs/heads/foo = $(git -C whats-up symbolic-ref HEAD) &&
+	test refs/heads/foo = $(git -C whats-up config branch.foo.merge)
 '
 
 test_expect_success 'guesses initial branch name correctly' '
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d5b17909b..b2ead93af9 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -209,6 +209,31 @@ test_expect_success 'clone with file:// using protocol v2' '
 	grep "ref-prefix refs/tags/" log
 '
 
+test_expect_success 'clone of empty repo propagates name of default branch' '
+	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
+test_expect_success '...but not if explicitly forbidden by config' '
+	test_when_finished "rm -rf file_empty_parent file_empty_child" &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=mydefaultbranch init file_empty_parent &&
+	test_config -C file_empty_parent lsrefs.unborn ignore &&
+
+	GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \
+	git -c init.defaultBranch=main -c protocol.version=2 \
+		clone "file://$(pwd)/file_empty_parent" file_empty_child &&
+	! grep "refs/heads/mydefaultbranch" file_empty_child/.git/HEAD
+'
+
 test_expect_success 'fetch with file:// using protocol v2' '
 	test_when_finished "rm -f log" &&
 
diff --git a/transport.h b/transport.h
index 1f5b60e4d3..24e15799e7 100644
--- a/transport.h
+++ b/transport.h
@@ -243,6 +243,14 @@ struct transport_ls_refs_options {
 	 * provided ref_prefixes.
 	 */
 	struct strvec ref_prefixes;
+
+	/*
+	 * If unborn_head_target is not NULL, and the remote reports HEAD as
+	 * pointing to an unborn branch, transport_get_remote_refs() stores the
+	 * unborn branch in unborn_head_target. It should be freed by the
+	 * caller.
+	 */
+	char *unborn_head_target;
 };
 #define TRANSPORT_LS_REFS_OPTIONS_INIT { STRVEC_INIT }
 
-- 
2.30.0.365.g02bc693789-goog


^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v7 0/3] Cloning with remote unborn HEAD
  2021-02-05  5:25   ` [PATCH v7 0/3] Cloning with " Junio C Hamano
  2021-02-05 16:15     ` Jeff King
@ 2021-02-05 21:15     ` Ævar Arnfjörð Bjarmason
  2021-02-05 23:07       ` Junio C Hamano
  1 sibling, 1 reply; 109+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-05 21:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git, Jeff King


On Fri, Feb 05 2021, Junio C Hamano wrote:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> For what it's worth, here's v7 with advertise/allow/ignore and by
>> default, advertise. I think that some server operators will have use for
>> this feature, and people who want to disable it for whatever reason can
>> still do so. The main disadvantage is complexity - the server knob that
>> server administrators will need to control (but between a simpler
>> allow/ignore knob and a more complicated advertise/allow/ignore knob, I
>> think we might as well go for the slightly more complex one) and
>> complexity in the code (but now that is constrained to one function and
>> a few global variables).
>>
>> As you can see from the range-diff, not much has changed from v6.
>>
>> I've also included Junio's suggestion of tightening the promise made by
>> the server (when the client says "unborn").
>
> This looks reasonable overall, especially with the feature turned on
> by default, we'd hopefully get reasonable exposure from the get-go.
>
> Let's mark the topic to be merged to 'next' soonish, unless people
> object.

FWIW, re [1], I'm still unclear on whether Jonathan thinks the semantics
of this shouldn't be documented for <reasons>, or whether he just
doesn't want to submit the patch and I should, or something else.

I still think that treating the remote as a config source without
actually explaining that it is one is a very glaring hole in the docs[2].

And in [3] I noted that we introduced the word "branches" into
protocol-v2.txt for the first time without defining what it means
(presumably just refs/heads/*, but then let's say so...). There was a
reply promising clarification, but I note that v7 still just says
"branches".

To be clear I'm perfectly fine with a note in a CL to the effect of "had
X feedback on last version, Ævar said Y and Z both of which I think are
stupid ideas, so I'm not doing them" :)

The only thing I mind is being left hanging to the effect of not knowing
if a diff you proposed is considered bad by the primary author, or if I
should just submit it myself as a patch on top or whatever.

It also saves time for other people following along with reviews, since
they can read later cover letters and get a brief summary of some
side-discussion in an earlier round without diving into it themselves.

Me too, honestly: sometimes I come back to these threads having completely
forgotten what I had to say in earlier rounds, and when I try to find out
it's hit-and-miss whether I agree with much/any of it :)





1. https://lore.kernel.org/git/87h7n3p363.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/878s8apthr.fsf@evledraar.gmail.com/
3. https://lore.kernel.org/git/87k0rzp3qx.fsf@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v5 0/3] Cloning with remote unborn HEAD
  2021-01-27  1:41   ` Ævar Arnfjörð Bjarmason
  2021-01-30  4:41     ` Jonathan Tan
@ 2021-02-05 22:28     ` Junio C Hamano
  1 sibling, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2021-02-05 22:28 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jonathan Tan, git, Jeff King, Felipe Contreras

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Tue, Jan 26 2021, Jonathan Tan wrote:
>
> [For some reason the patches didn't reach my mailbox, but I see them in
> the list archive, so I'm replying to the cover-letter]
>
>>  Documentation/config.txt                |  2 +
>>  Documentation/config/init.txt           |  2 +-
>
> Good, now we have init.defaultBranch docs, but they say:
>     
>      init.defaultBranch::
>             Allows overriding the default branch name e.g. when initializing
>     -       a new repository or when cloning an empty repository.
>     +       a new repository.
>
> So this still only applies to file:// and other "protocol" clones, but
> not "git clone /some/path"?

I agree with you that the new "unborn HEAD will also follow what the
upstream has" should be done for --local transport.  It is a bug
waiting to be complained about by users.

>     init.defaultBranch::
>         Allows overriding the default branch name e.g. when initializing a
>         new repository or when cloning an empty repository.
>     
>         When cloning a repository over protocol v2 (i.e. ssh://, https://,
>         file://, but not a /some/path), and if that repository has
>         init.defaultBranch configured, the server will advertise its
>         preferred default branch name, and we'll take its configuration over
>         ours.

I actually do not think that is what is going on.  What the other
side advertises is *NOT* their preferred default branch name and it
does not matter if they have init.defaultBranch configured or not.

What the new protocol extension gives us is that we can learn what
the other side is actually using (not their preferred default) as
their primary branch.  We've always done so since very early days of
"git clone" (even back when we failed to clone an empty repository),
by trying to guess which branch their HEAD points at.

The only thing that is new with this topic is that it now gives us
a reliable way to learn what their HEAD points at, even when it is
pointing at an unborn branch.

In general we do not let other side _dictate_ what our configuration
should look like, as that can have security implications, and this
is not sending any configuration at all.

Their HEAD may be pointing at a specific branch (which may or may
not be unborn) because that is what they configured their
init.defaultBranch to, or their version of Git created the branch
and they haven't changed it since repository creation, or the user
using that repository just started working with that branch with
"git checkout [--orphan]" (the repository being cloned does not have
to be a bare repository).  It does not matter how their HEAD ended
up pointing at a specific branch; we just try to mimic their
current status.  We do so because it would make it easier to give
our changes back to them if everybody involved used the same name
for the primary integration branch, and the repositories that people
clone from most often have their primary integration branch pointed
at by their HEAD.

And I do not consider it transferring any configuration.

So while I agree that the logic to choose the branch that gets
checked out in a new repository created by "git clone" needs to be
documented well, it has very little to do with "init.defaultBranch".
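
The selection rule described above can be sketched in a few lines of C.
This is only an illustration, not the code in builtin/clone.c: the
function name choose_initial_ref() and its parameters are hypothetical,
and it assumes the caller has already learned the remote's unborn HEAD
target (or NULL if the remote did not report one).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): pick the full refname that
 * HEAD of a freshly cloned empty repository should point at.
 *
 * unborn_head_target: what the remote reported for its unborn HEAD,
 *                     or NULL if it reported nothing
 * local_default:      branch name from the local configuration
 *
 * Returns a newly allocated string; the caller must free it.
 */
static char *choose_initial_ref(const char *unborn_head_target,
				const char *local_default)
{
	static const char prefix[] = "refs/heads/";
	size_t len;
	char *ref;

	assert(local_default);

	/* Mimic the remote, but only if its HEAD points at a branch. */
	if (unborn_head_target &&
	    !strncmp(unborn_head_target, prefix, strlen(prefix))) {
		len = strlen(unborn_head_target) + 1;
		ref = malloc(len);
		if (ref)
			memcpy(ref, unborn_head_target, len);
		return ref;
	}

	/* Otherwise fall back to the locally configured name. */
	len = strlen(prefix) + strlen(local_default) + 1;
	ref = malloc(len);
	if (ref)
		snprintf(ref, len, "%s%s", prefix, local_default);
	return ref;
}
```

Note that nothing here reads the remote's configuration; the only
input from the other side is where its HEAD currently points.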


^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v7 0/3] Cloning with remote unborn HEAD
  2021-02-05 21:15     ` Ævar Arnfjörð Bjarmason
@ 2021-02-05 23:07       ` Junio C Hamano
  0 siblings, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2021-02-05 23:07 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Jonathan Tan, git, Jeff King

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> And in [3] I noted that we introduced the word "branches" into
> protocol-v2.txt for the first time without defining what it means
> (presumably just refs/heads/*, but then let's say so...). There was a
> reply promising clarification, but I note that v7 still just says
> "branches".

I do not think it is so bad to mention "branch" in the part that
explains things to humans in the terminology they are used to.  It
is a different matter to introduce EBNF terminal <branch> without a
proper definition of the word, but I do not think we are doing so
here.

I however have to agree with the need to tighten what gets sent;
that is why a suggested replacement in my earlier review phrased it
this way:

    unborn

    If HEAD is a symref pointing to an unborn branch <b>, the
    server reports it as "unborn HEAD symref-target:refs/heads/<b>"
    in its response.

to make it clear that a full refname is sent for the ref that HEAD
points at.
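
As a rough illustration of that wire format, a client-side reader
could pull the refname out of such a line as below.  This is a sketch
under assumptions, not Git's parser: parse_unborn_head() is a
hypothetical name, the line is assumed to have had its trailing
newline stripped already, and attributes are assumed to be separated
by single spaces.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser (not Git's implementation): if `line` has the
 * form "unborn HEAD symref-target:<refname>", possibly with other
 * space-separated attributes, return a newly allocated copy of
 * <refname>.  Return NULL for any other line.  The caller frees.
 */
static char *parse_unborn_head(const char *line)
{
	static const char lead[] = "unborn HEAD";
	static const char key[] = "symref-target:";
	const size_t keylen = sizeof(key) - 1;
	const char *p;

	assert(line);
	if (strncmp(line, lead, strlen(lead)))
		return NULL;
	p = line + strlen(lead);

	/* Walk the remaining space-separated attributes. */
	while (*p == ' ') {
		const char *end;

		p++;
		end = strchr(p, ' ');
		if (!end)
			end = p + strlen(p);
		if ((size_t)(end - p) > keylen && !strncmp(p, key, keylen)) {
			size_t n = (size_t)(end - p) - keylen;
			char *out = malloc(n + 1);

			if (!out)
				return NULL;
			memcpy(out, p + keylen, n);
			out[n] = '\0';
			return out;
		}
		p = end;
	}
	return NULL;
}
```

Because the pointee is a full refname, the caller can check for the
"refs/heads/" prefix itself instead of guessing what "branch" means.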

Thanks.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v8 0/3] Cloning with remote unborn HEAD
  2021-02-05 20:48 ` [PATCH v8 " Jonathan Tan
                     ` (2 preceding siblings ...)
  2021-02-05 20:48   ` [PATCH v8 3/3] clone: respect remote unborn HEAD Jonathan Tan
@ 2021-02-06 18:51   ` Junio C Hamano
  2021-02-08 22:28     ` Junio C Hamano
  3 siblings, 1 reply; 109+ messages in thread
From: Junio C Hamano @ 2021-02-06 18:51 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff, Ævar Arnfjörð Bjarmason

Jonathan Tan <jonathantanmy@google.com> writes:

> Peff sent a review (which I don't see in lore.kernel.org/git, but I do
> see it in my inbox); here's v8 in response to that.
>
> As you can see from the range-diff, there are just minor changes to v7
> (wording in documentation and a memory leak fix).
>
> Jonathan Tan (3):
>   ls-refs: report unborn targets of symrefs
>   connect, transport: encapsulate arg in struct
>   clone: respect remote unborn HEAD

Queued and pushed out, but with vger.kernel.org seeming to delay
messages randomly, I'll hold off for a few days before merging them
down to 'next'.  To me this version looks good (at least good enough
to cook in 'next' and details can be tweaked incrementally).

Thanks.

>  Documentation/config.txt                |  2 +
>  Documentation/config/init.txt           |  2 +-
>  Documentation/config/lsrefs.txt         |  9 +++
>  Documentation/technical/protocol-v2.txt | 11 +++-
>  builtin/clone.c                         | 34 +++++++++---
>  builtin/fetch-pack.c                    |  3 +-
>  builtin/fetch.c                         | 18 +++---
>  builtin/ls-remote.c                     |  9 +--
>  connect.c                               | 32 ++++++++++-
>  ls-refs.c                               | 74 ++++++++++++++++++++++++-
>  ls-refs.h                               |  1 +
>  remote.h                                |  4 +-
>  serve.c                                 |  2 +-
>  t/t5606-clone-options.sh                |  8 ++-
>  t/t5701-git-serve.sh                    |  2 +-
>  t/t5702-protocol-v2.sh                  | 25 +++++++++
>  transport-helper.c                      |  5 +-
>  transport-internal.h                    | 10 +---
>  transport.c                             | 23 ++++----
>  transport.h                             | 29 +++++++---
>  20 files changed, 240 insertions(+), 63 deletions(-)
>  create mode 100644 Documentation/config/lsrefs.txt
>
> Range-diff against v7:
> 1:  2d35075369 ! 1:  8b0f55b5e4 ls-refs: report unborn targets of symrefs
>     @@ Documentation/config/lsrefs.txt (new)
>      +	protocol v2 capability advertisement. "allow" is the same as
>      +	"advertise" except that the server will not advertise support for this
>      +	feature; this is useful for load-balanced servers that cannot be
>     -+	updated automatically (for example), since the administrator could
>     ++	updated atomically (for example), since the administrator could
>      +	configure "allow", then after a delay, configure "advertise".
>      
>       ## Documentation/technical/protocol-v2.txt ##
>     @@ ls-refs.c
>      +
>      +static void ensure_config_read(void)
>      +{
>     -+	char *str = NULL;
>     ++	const char *str = NULL;
>      +
>      +	if (config_read)
>      +		return;
>      +
>     -+	if (repo_config_get_string(the_repository, "lsrefs.unborn", &str)) {
>     ++	if (repo_config_get_string_tmp(the_repository, "lsrefs.unborn", &str)) {
>      +		/*
>      +		 * If there is no such config, advertise and allow it by
>      +		 * default.
> 2:  d4ed13d02e = 2:  f09bd56d5f connect, transport: encapsulate arg in struct
> 3:  a3e5a0a7c5 = 3:  a5495a42f1 clone: respect remote unborn HEAD
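
The three lsrefs.unborn values mentioned in this thread (advertise,
allow, ignore, with advertise as the default) can be read as two
independent switches.  The following is only a sketch of that
reading, not the code in ls-refs.c: the struct, the function name and
the choice to return -1 for an unrecognized value are all assumptions
made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical representation of the server-side knob. */
struct unborn_policy {
	int advertise;	/* list "unborn" in the capability advertisement */
	int allow;	/* honor "unborn" when a client sends it */
};

/*
 * Map a lsrefs.unborn value onto the two switches.  A NULL value
 * stands for "not configured", which behaves like "advertise".
 * Returns 0 on success and -1 on an unrecognized value (an
 * assumption of this sketch).
 */
static int parse_unborn_policy(const char *value, struct unborn_policy *out)
{
	assert(out);

	if (!value || !strcmp(value, "advertise")) {
		out->advertise = 1;
		out->allow = 1;
	} else if (!strcmp(value, "allow")) {
		/* Same as "advertise", minus the advertisement. */
		out->advertise = 0;
		out->allow = 1;
	} else if (!strcmp(value, "ignore")) {
		out->advertise = 0;
		out->allow = 0;
	} else {
		return -1;
	}
	return 0;
}
```

Seen this way, the staged rollout described for load-balanced servers
is "allow" everywhere first, then "advertise" once every server
accepts the request.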

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [PATCH v8 0/3] Cloning with remote unborn HEAD
  2021-02-06 18:51   ` [PATCH v8 0/3] Cloning with " Junio C Hamano
@ 2021-02-08 22:28     ` Junio C Hamano
  0 siblings, 0 replies; 109+ messages in thread
From: Junio C Hamano @ 2021-02-08 22:28 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff, Ævar Arnfjörð Bjarmason

Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> Peff sent a review (which I don't see in lore.kernel.org/git, but I do
>> see it in my inbox); here's v8 in response to that.
>>
>> As you can see from the range-diff, there are just minor changes to v7
>> (wording in documentation and a memory leak fix).
>>
>> Jonathan Tan (3):
>>   ls-refs: report unborn targets of symrefs
>>   connect, transport: encapsulate arg in struct
>>   clone: respect remote unborn HEAD
>
> Queued and pushed out, but with vger.kernel.org seeming to delay
> messages randomly, I'll hold off for a few days before merging them
> down to 'next'.  To me this version looks good (at least good enough
> to cook in 'next' and details can be tweaked incrementally).

A few days have passed; let's merge it to 'next'.

Thanks.

^ permalink raw reply	[flat|nested] 109+ messages in thread

end of thread, other threads:[~2021-02-08 22:30 UTC | newest]

Thread overview: 109+ messages
2020-12-08  1:31 Cloning empty repository uses locally configured default branch name Jonathan Tan
2020-12-08  2:16 ` Junio C Hamano
2020-12-08  2:32   ` brian m. carlson
2020-12-08 18:55   ` Jonathan Tan
2020-12-08 21:00     ` Junio C Hamano
2020-12-08 15:58 ` Jeff King
2020-12-08 20:06   ` Jonathan Tan
2020-12-08 21:15     ` Jeff King
2020-12-11 21:05 ` [PATCH] clone: in protocol v2, use remote's default branch Jonathan Tan
2020-12-11 23:41   ` Junio C Hamano
2020-12-14 12:38   ` Ævar Arnfjörð Bjarmason
2020-12-14 15:51     ` Felipe Contreras
2020-12-14 16:30     ` Junio C Hamano
2020-12-15  1:41       ` Ævar Arnfjörð Bjarmason
2020-12-15  2:22         ` Junio C Hamano
2020-12-15  2:38         ` Jeff King
2020-12-15  2:55           ` Junio C Hamano
2020-12-15  4:36             ` Jeff King
2020-12-16  3:09               ` Junio C Hamano
2020-12-16 18:39                 ` Jeff King
2020-12-16 20:56                   ` Junio C Hamano
2020-12-18  6:19                     ` Jeff King
2020-12-15  3:22         ` Felipe Contreras
2020-12-14 19:25     ` Jonathan Tan
2020-12-14 19:42       ` Felipe Contreras
2020-12-15  1:27   ` Jeff King
2020-12-15 19:10     ` Jonathan Tan
2020-12-16  2:07   ` [PATCH v2 0/3] Cloning with remote unborn HEAD Jonathan Tan
2020-12-16  2:07     ` [PATCH v2 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
2020-12-16  6:16       ` Junio C Hamano
2020-12-16 23:49         ` Jonathan Tan
2020-12-16 18:23       ` Jeff King
2020-12-16 23:54         ` Jonathan Tan
2020-12-17  1:32           ` Junio C Hamano
2020-12-18  6:16             ` Jeff King
2020-12-16  2:07     ` [PATCH v2 2/3] connect, transport: add no-op arg for future patch Jonathan Tan
2020-12-16  6:20       ` Junio C Hamano
2020-12-16  2:07     ` [PATCH v2 3/3] clone: respect remote unborn HEAD Jonathan Tan
2020-12-21 22:30   ` [PATCH v3 0/3] Cloning with " Jonathan Tan
2020-12-21 22:30     ` [PATCH v3 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
2020-12-21 22:31     ` [PATCH v3 2/3] connect, transport: add no-op arg for future patch Jonathan Tan
2020-12-21 22:31     ` [PATCH v3 3/3] clone: respect remote unborn HEAD Jonathan Tan
2020-12-21 23:48     ` [PATCH v3 0/3] Cloning with " Junio C Hamano
2021-01-21 20:14     ` Jeff King
2020-12-22 21:54   ` [PATCH v4 " Jonathan Tan
2020-12-22 21:54     ` [PATCH v4 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
2021-01-21 20:48       ` Jeff King
2021-01-26 18:13         ` Jonathan Tan
2021-01-26 23:16           ` Jeff King
2020-12-22 21:54     ` [PATCH v4 2/3] connect, transport: add no-op arg for future patch Jonathan Tan
2021-01-21 20:55       ` Jeff King
2021-01-26 18:16         ` Jonathan Tan
2020-12-22 21:54     ` [PATCH v4 3/3] clone: respect remote unborn HEAD Jonathan Tan
2021-01-21 21:02       ` Jeff King
2021-01-26 18:22         ` Jonathan Tan
2021-01-26 23:04           ` Jeff King
2021-01-28  5:50             ` Junio C Hamano
2020-12-22 22:06     ` [PATCH v4 0/3] Cloning with " Junio C Hamano
2021-01-26 18:55 ` [PATCH v5 " Jonathan Tan
2021-01-26 18:55   ` [PATCH v5 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
2021-01-26 21:38     ` Junio C Hamano
2021-01-26 23:03       ` Junio C Hamano
2021-01-30  3:55         ` Jonathan Tan
2021-01-26 23:20       ` Jeff King
2021-01-26 23:38         ` Junio C Hamano
2021-01-29 20:23       ` Jonathan Tan
2021-01-29 22:04         ` Junio C Hamano
2021-02-02  2:20           ` Jonathan Tan
2021-02-02  5:00             ` Junio C Hamano
2021-01-27  1:28     ` Ævar Arnfjörð Bjarmason
2021-01-30  4:04       ` Jonathan Tan
2021-01-26 18:55   ` [PATCH v5 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
2021-01-26 21:54     ` Junio C Hamano
2021-01-30  4:06       ` Jonathan Tan
2021-01-26 18:55   ` [PATCH v5 3/3] clone: respect remote unborn HEAD Jonathan Tan
2021-01-26 22:24     ` Junio C Hamano
2021-01-30  4:27       ` Jonathan Tan
2021-01-27  1:11   ` [PATCH v5 0/3] Cloning with " Junio C Hamano
2021-01-27  4:25     ` Jeff King
2021-01-27  6:14       ` Junio C Hamano
2021-01-27  1:41   ` Ævar Arnfjörð Bjarmason
2021-01-30  4:41     ` Jonathan Tan
2021-01-30 11:13       ` Ævar Arnfjörð Bjarmason
2021-02-02  2:22       ` Jonathan Tan
2021-02-03 14:23         ` Ævar Arnfjörð Bjarmason
2021-02-05 22:28     ` Junio C Hamano
2021-02-02  2:14 ` [PATCH v6 " Jonathan Tan
2021-02-02  2:14   ` [PATCH v6 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
2021-02-02 16:55     ` Junio C Hamano
2021-02-02 18:34       ` Jonathan Tan
2021-02-02 22:17         ` Junio C Hamano
2021-02-03  1:04           ` Jonathan Tan
2021-02-02  2:15   ` [PATCH v6 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
2021-02-02  2:15   ` [PATCH v6 3/3] clone: respect remote unborn HEAD Jonathan Tan
2021-02-05  4:58 ` [PATCH v7 0/3] Cloning with " Jonathan Tan
2021-02-05  4:58   ` [PATCH v7 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
2021-02-05 16:10     ` Jeff King
2021-02-05  4:58   ` [PATCH v7 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
2021-02-05  4:58   ` [PATCH v7 3/3] clone: respect remote unborn HEAD Jonathan Tan
2021-02-05  5:25   ` [PATCH v7 0/3] Cloning with " Junio C Hamano
2021-02-05 16:15     ` Jeff King
2021-02-05 21:15     ` Ævar Arnfjörð Bjarmason
2021-02-05 23:07       ` Junio C Hamano
2021-02-05 20:48 ` [PATCH v8 " Jonathan Tan
2021-02-05 20:48   ` [PATCH v8 1/3] ls-refs: report unborn targets of symrefs Jonathan Tan
2021-02-05 20:48   ` [PATCH v8 2/3] connect, transport: encapsulate arg in struct Jonathan Tan
2021-02-05 20:48   ` [PATCH v8 3/3] clone: respect remote unborn HEAD Jonathan Tan
2021-02-06 18:51   ` [PATCH v8 0/3] Cloning with " Junio C Hamano
2021-02-08 22:28     ` Junio C Hamano
