Re: [RFC 00/14] Allow fetch-pack to send ref names (globs allowed)

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

From: Jonathan Tan <jonathantanmy@google.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: [RFC 00/14] Allow fetch-pack to send ref names (globs allowed)
Date: Thu, 26 Jan 2017 16:26:47 -0800	[thread overview]
Message-ID: <67afbb3b-5d0b-8c0d-3f6e-3f559c68f4bd@google.com> (raw)
In-Reply-To: <20170126230046.aknesybfyzxhx3ia@sigill.intra.peff.net>

Thanks for your comments.

On 01/26/2017 03:00 PM, Jeff King wrote:
> On Wed, Jan 25, 2017 at 02:02:53PM -0800, Jonathan Tan wrote:
>
>> Negotiation currently happens by upload-pack initially sending a list of
>> refs with names and SHA-1 hashes, and then several request/response
>> pairs in which the request from fetch-pack consists of SHA-1 hashes
>> (selected from the initial list). Allowing the request to consist of
>> names instead of SHA-1 hashes increases tolerance to refs changing
>> (due to time, and due to having load-balanced servers without strong
>> consistency),
>
> Interesting. My big question is: what happens when a ref _does_ change?
> How does the client handle this?
>
> The existing uploadpack.allowReachableSHA1InWant is there to work around
> the problem that an http client may get a ref advertisement in one step,
> and then come back later to do the want/have negotiation, at which point
> the server has moved on (or maybe it's even a different server). There
> the client says "I want sha1 X", and the server needs to say "well, X
> isn't my tip now, but it's still acceptable for you to fetch".
>
> But this seems to go in the opposite direction. After the advertisement,
> the client decides "OK, I want to fetch refs/heads/master which is at
> SHA1 X". It connects to the server and says "I want refs/heads/master".
> Let's say the server has moved its version of the ref to SHA1 Y.
>
> What happens? I think the server will say "wanted-ref master Y". Does
> the client just decide to use "Y" then?  How does that interact with any
> decisions the client might have made about X? I guess things like
> fast-forwards have to be decided after we fetch the objects anyway
> (since we cannot compute them until we get the pack), so maybe there
> aren't any such decisions. I haven't checked.

Yes, the server will say "wanted-ref master Y". The relevant code 
regarding the decisions the client makes regarding X or Y is in do_fetch 
in builtin/fetch.c.

There, I can see these decisions done using X:
  - check_not_current_branch (forbidding fetching that modifies the
    current branch) (I just noticed that this has to be done for Y too,
    and will do so)
  - prune refs [*]
  - automatic tag following [*]

[*] X and Y may differ in that one relevant ref appears in one set but 
not in the other (because a ref was added or removed in the meantime), 
causing a different result if these decisions were to be done using Y, 
but I think that it is OK either way.

Fetch optimizations (for example, everything_local in fetch-pack.c) that 
check if the client really needs to fetch are also done using X, of 
course (and if the optimization succeeds, there is no Y).

Fast-forwards (and everything else in store_updated_refs) are decided 
using Y.

>> and is a step towards eliminating the need for the server
>> to send the list of refs first (possibly improving performance).
>
> I'm not sure it is all that useful towards that end. You still have to
> break compatibility so that the client tells the server to suppress the
> ref advertisement. After that, it is just a question of asking for the
> refs. And you have two options:
>
>   1. Ask the server to tell you the values of some subset of the refs,
>      pick what you want, and then do the want/have as normal.
>
>   2. Go straight to the want/have, but tell the server the refs you want
>      instead of their sha1s.
>
> I think your approach here would lead to (2).
>
> But (1), besides being closer to how the protocol works now, seems like
> it's more flexible. I can ask about the ref state without necessarily
> having to retrieve the objects. How would you write git-ls-remote with
> such a system?

Assuming a new protocol with the appropriate backwards compatibility 
(which would have to be done for both options), (2) does save a 
request/response pair (because we can send the ref names directly as 
"want-ref" in the initial request instead of sending ref names in the 
initial request and then confirming them using "want <SHA-1>" in the 
subsequent request). Also, (2) is more tolerant towards refs changing 
over time. (1) might be closer to the current protocol, but I think that 
the difference is not so significant (only in "want-ref" vs "want" 
request and the "wanted-ref" response).

As for git-ls-remote, I currently think that it would have to use the 
existing protocol.

>> [1] There has been some discussion about whether the server should
>> accept partial ref names, e.g. [2]. In this patch set, I have made the
>> server only accept full names, and it is the responsibility of the
>> client to send the multiple patterns which it wants to match. Quoting
>> from the commit message of the second patch:
>>
>>   For example, a client could reasonably expand an abbreviated
>>   name "foo" to "want-ref foo", "want-ref refs/heads/foo", "want-ref
>>   refs/tags/foo", among others, and ensure that at least one such ref has
>>   been fetched.
>
> That has a cost that scales linearly with the number of refs, because
> you have to ask for each name 6 times.  After the discussion you linked,
> I think my preference is more like:
>
>   1. Teach servers to accept a list of patterns from the client
>      which will be resolved in order. Unlike your system, the client
>      only needs to specify the list once per session, rather than once
>      per ref.
>
>   2. (Optional) Give a shorthand for the stock patterns that git has had
>      in place for years. That saves some bytes over specifying the
>      patterns completely (though it's really not _that_ many bytes, so
>      perhaps the complication isn't a big deal).

I envision that most fetches would be done on a single name (with or 
without globs), that expands to 5 (so it is not so crucial to factor out 
the list of patterns). Having said that, I am open to changing this.

next prev parent reply	other threads:[~2017-01-27  0:27 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-25 22:02 [RFC 00/14] Allow fetch-pack to send ref names (globs allowed) Jonathan Tan
2017-01-25 22:02 ` [RFC 01/14] upload-pack: move parsing of "want" line Jonathan Tan
2017-01-25 22:02 ` [RFC 02/14] upload-pack: allow ref name and glob requests Jonathan Tan
2017-01-26 22:23   ` Junio C Hamano
2017-01-27  0:35     ` Jonathan Tan
2017-01-27  1:54       ` Junio C Hamano
2017-01-25 22:02 ` [RFC 03/14] upload-pack: test negotiation with changing repo Jonathan Tan
2017-01-26 22:33   ` Junio C Hamano
2017-01-27  0:44     ` Jonathan Tan
2017-02-22 23:36       ` Junio C Hamano
2017-02-23 18:43         ` [PATCH] upload-pack: report "not our ref" to client Jonathan Tan
2017-02-23 20:14           ` Junio C Hamano
2017-01-25 22:02 ` [RFC 04/14] fetch: refactor the population of hashes Jonathan Tan
2017-01-25 22:02 ` [RFC 05/14] fetch: refactor fetch_refs into two functions Jonathan Tan
2017-01-25 22:02 ` [RFC 06/14] fetch: refactor to make function args narrower Jonathan Tan
2017-01-25 22:03 ` [RFC 07/14] fetch-pack: put shallow info in out param Jonathan Tan
2017-01-25 22:03 ` [RFC 08/14] fetch-pack: check returned refs for matches Jonathan Tan
2017-01-25 22:03 ` [RFC 09/14] transport: put ref oid in out param Jonathan Tan
2017-01-25 22:03 ` [RFC 10/14] fetch-pack: support partial names and globs Jonathan Tan
2017-01-25 22:03 ` [RFC 11/14] fetch-pack: support want-ref Jonathan Tan
2017-01-25 22:03 ` [RFC 12/14] fetch-pack: do not printf after closing stdout Jonathan Tan
2017-01-26  0:50   ` Stefan Beller
2017-01-26 18:18     ` Jonathan Tan
2017-01-25 22:03 ` [RFC 13/14] fetch: send want-ref and receive fetched refs Jonathan Tan
2017-01-25 22:03 ` [RFC 14/14] DONT USE advertise_ref_in_want=1 Jonathan Tan
2017-01-26 22:15 ` [RFC 00/14] Allow fetch-pack to send ref names (globs allowed) Stefan Beller
2017-01-26 23:00 ` Jeff King
2017-01-27  0:26   ` Jonathan Tan [this message]
2017-02-07 23:53 ` Jonathan Tan
2017-02-09  0:26   ` Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2018-05-20  8:24 Apinan Ponchan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=67afbb3b-5d0b-8c0d-3f6e-3f559c68f4bd@google.com \
    --to=jonathantanmy@google.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).