git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jonathan Nieder <jrnieder@gmail.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/2] connect: know that zero-ID is not a ref
Date: Fri, 2 Sep 2016 12:37:14 -0700	[thread overview]
Message-ID: <20160902193714.GF14758@google.com> (raw)
In-Reply-To: <2bea354c6218a33db3972e42baa75676fdcbc598.1472836026.git.jonathantanmy@google.com>

Hi,

Jonathan Tan wrote:

> connect.c, when processing packfiles, treats a zero ID (with
> `capabilities^{}` in place of the refname) as an actual ref instead of a
> placeholder for a capability declaration, contrary to the specification
> in Reference Discovery in Documentation/technical/pack-protocol.txt.
> This is an issue when interacting with Git implementations that follow
> this specification. For example, `ls-remote` (against a git://
> repository served by such a Git implementation) will report a ref when
> there are none, and `clone` (against something similar) will fail its
> checkout.

For example, this means I get the following confusing message when
cloning an empty repository served by "jgit daemon":

 $ git clone https://googlers.googlesource.com/jrn/empty
 Cloning into 'empty'...
 Checking connectivity... done.
 warning: remote HEAD refers to nonexistent ref, unable to checkout.

 $

It also means that standard "git daemon" is not able to advertise
capabilities on a fetch from a repository without advertising some
refs, so I cannot fetch-by-sha1 from a C git server unless some refs
are advertised.

This fix should solve the former problem and paves the way for the
latter to be solved once a year or two passes and people's clients are
up to date (or earlier if a server operator chooses).

> Make connect.c follow the specification with respect to this, while
> maintaining compatibility with existing implementations that do not
> serve the zero ID when a repository has no refs.

This description focuses on the code.  When deciding whether to apply
the patch (or whether to revert it if it comes up while trying to
investigate another problem), I am likely to wonder "what effect will
the patch have on me?" instead of "what standards does it follow?"

Flipping the emphasis, we could say something like

	connect: treat ref advertisement with capabilities^{} line as one with no refs

	When cloning an empty repository served by standard git, "git clone"
	produces the following reassuring message:

		$ git clone git://localhost/tmp/empty
		Cloning into 'empty'...
		warning: You appear to have cloned an empty repository.
		Checking connectivity... done.
	
	Meanwhile when cloning an empty repository served by JGit, the output
	is more haphazard:

		$ git clone git://localhost/tmp/empty
		Cloning into 'empty'...
		Checking connectivity... done.
		warning: remote HEAD refers to nonexistent ref, unable to checkout.

	This is a common command to run immediately after creating a remote
	repository as preparation for adding content to populate it and pushing.
	The warning is confusing and needlessly worrying.

	The cause is that, since v3.1.0.201309270735-rc1~22 (Advertise capabilities
	with no refs in upload service., 2013-08-08), JGit's ref advertisement
	includes a ref named capabilities^{} to advertise its refs on, while git's
	ref advertisement is empty in this case.  This allows the client to learn
	about the server's capabilities and is need, for example, for fetch-by-sha1
	to work when no refs are advertised.

	Git advertises the same capabilities^{} ref in its ref advertisement for
	push but since it never remembered to do so for fetch, the client forgot
	to handle this case. Handle it.

	So now you can see the same friendly message whether the server runs C
	git or JGit.

	The pack-protocol.txt documentation gives some guidance about the expected
	server behavior: what JGit does is currently required, since a list-of-refs
	is required to be non-empty.

		  advertised-refs  =  (no-refs / list-of-refs)
				      *shallow
				      flush-pkt

		  no-refs          =  PKT-LINE(zero-id SP "capabilities^{}"
				      NUL capability-list)

		  list-of-refs     =  first-ref *other-ref

	Because git client versions without this fix are expected to exist in the
	wild for a while, we should not change the server to always send the
	capabilities^{} line when there are no refs to advertise yet.  A transition
	will take multiple steps:

	 1. This patch, which updates the client

	 2. Update pack-protocol to clarify that both server behaviors must be
	    tolerated.

	 3. Add a configuration variable to allow git upload-pack to advertise
	    capabilities when there are no refs to advertise.  Leave it disabled
	    by default since git clients can't be counted on to have this patch (1)
	    yet.

	 4. After a year or so, flip the default for that server configuration
	    variable to true.

[...]
> --- a/connect.c
> +++ b/connect.c
> @@ -165,6 +165,13 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  			continue;
>  		}
>  
> +		if (is_null_oid(&old_oid)) {
> +			if (strcmp(name, "capabilities^{}"))
> +				warning("zero object ID received that is not accompanied by a "
> +				        "capability declaration, ignoring and continuing anyway");
> +			continue;
> +		}

I'd rather this be stricter.  Though it's not your fault: the code
before it is very lax in letting each line specify capabilities again.

Could this behave more like the ".have" case?  That is, just

		if (!strcmp(name, "capabilities^{}"))
			continue;

Or if we want to sanity-check the line further,

		if (!strcmp(name, "capabilities^{}")) {
			if (len == name_len + GIT_SHA1_HEX_SZ + 1)
				die("protocol error: expected capabilities after nul, got none");
			if (!is_null_oid(&old_oid))
				die("protocol error: expected zero-id, got %s", oid_to_hex(&old_oid));
			continue;
		}

The protocol definition and other clients already handle capabilities^{} differently
from other cases of zero-id.

[...]
> --- a/t/t5512-ls-remote.sh
> +++ b/t/t5512-ls-remote.sh
> @@ -207,5 +207,27 @@ test_expect_success 'ls-remote --symref omits filtered-out matches' '
[...]
> +test_expect_success JGIT,GIT_DAEMON 'indicate no refs in standards-compliant empty remote' '
> +	JGIT_DAEMON_PID= &&
> +	git init --bare empty.git &&
> +	touch empty.git/git-daemon-export-ok &&
> +	{
> +		jgit daemon --port="$JGIT_DAEMON_PORT" . &
> +		JGIT_DAEMON_PID=$!
> +	} &&
> +	test_when_finished kill "$JGIT_DAEMON_PID" &&
> +	sleep 1 && # allow jgit daemon some time to set up

Can this sleep be avoided?  E.g. start_git_daemon in lib-git-daemon.sh
uses output from the daemon to tell when it is serving.

Thanks for writing this.  I'll be happy when this protocol blip is gone.

Sincerely,
Jonathan

  reply	other threads:[~2016-09-02 19:37 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-02 17:15 [PATCH 0/2] handle empty spec-compliant remote repos correctly Jonathan Tan
2016-09-02 17:15 ` [PATCH 1/2] tests: move test_lazy_prereq JGIT to test-lib.sh Jonathan Tan
2016-09-02 17:15 ` [PATCH 2/2] connect: know that zero-ID is not a ref Jonathan Tan
2016-09-02 19:37   ` Jonathan Nieder [this message]
2016-09-02 19:39   ` Shawn Pearce
2016-09-02 19:56     ` Stefan Beller
2016-09-02 20:00       ` Shawn Pearce
2016-09-02 20:13   ` Jeff King
2016-09-02 22:11     ` Jonathan Tan
2016-09-02 23:19       ` Jeff King
2016-09-03  2:03     ` Shawn Pearce
2016-09-03  2:17       ` Jeff King
2016-09-02 22:06 ` [PATCH v2 0/2] handle empty spec-compliant remote repos correctly Jonathan Tan
2016-09-02 22:06 ` [PATCH v2 1/2] tests: move test_lazy_prereq JGIT to test-lib.sh Jonathan Tan
2016-09-07 16:47   ` Junio C Hamano
2016-09-02 22:06 ` [PATCH v2 2/2] connect: advertized capability is not a ref Jonathan Tan
2016-09-02 22:40   ` Jonathan Nieder
2016-09-02 23:35   ` Jeff King
2016-09-02 23:48     ` Stefan Beller
2016-09-03  0:37       ` Jonathan Nieder
2016-09-02 23:51     ` Jonathan Nieder
2016-09-03  0:56       ` Jeff King
2016-09-07 17:02   ` Junio C Hamano
2016-09-07 17:10   ` Junio C Hamano
2016-09-07 20:38     ` Jonathan Nieder
2016-09-07 23:02       ` Junio C Hamano
2016-09-07 23:50 ` [PATCH v3 0/2] handle empty spec-compliant remote repos correctly Jonathan Tan
2016-09-07 23:50 ` [PATCH v3 1/2] tests: move test_lazy_prereq JGIT to test-lib.sh Jonathan Tan
2016-09-07 23:50 ` [PATCH v3 2/2] connect: advertized capability is not a ref Jonathan Tan
2016-09-08  1:34   ` Jonathan Nieder
2016-09-08  1:45     ` [PATCH] connect: tighten check for unexpected early hang up (Re: [PATCH v3 2/2] connect: advertized capability is not a ref) Jonathan Nieder
2016-09-08  1:46       ` Jonathan Nieder
2016-09-08  1:50       ` Jonathan Nieder
2016-09-08 16:42         ` Junio C Hamano
2016-09-08 16:28       ` Stefan Beller
2016-09-08 16:18     ` [PATCH v3 2/2] connect: advertized capability is not a ref Junio C Hamano
2016-09-09 17:36 ` [PATCH v4 0/3] handle empty spec-compliant remote repos correctly Jonathan Tan
2016-09-09 17:36 ` [PATCH v4 1/3] tests: move test_lazy_prereq JGIT to test-lib.sh Jonathan Tan
2016-09-10  5:51   ` Torsten Bögershausen
2016-09-10  6:00     ` Jeff King
2016-09-09 17:36 ` [PATCH v4 2/3] connect: tighten check for unexpected early hang up Jonathan Tan
2016-09-09 17:36 ` [PATCH v4 3/3] connect: advertized capability is not a ref Jonathan Tan
2016-09-09 19:40   ` Jonathan Nieder
2016-09-09 20:40     ` Junio C Hamano
2016-09-09 20:09   ` Junio C Hamano
2016-09-09 20:17 ` [PATCH v5 0/3] handle empty spec-compliant remote repos correctly Jonathan Tan
2016-09-09 21:07   ` Jonathan Nieder
2016-09-09 20:17 ` [PATCH v5 1/3] tests: move test_lazy_prereq JGIT to test-lib.sh Jonathan Tan
2016-09-09 20:17 ` [PATCH v5 2/3] connect: tighten check for unexpected early hang up Jonathan Tan
2016-09-09 20:17 ` [PATCH v5 3/3] connect: advertized capability is not a ref Jonathan Tan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160902193714.GF14758@google.com \
    --to=jrnieder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).