git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Jeff King <peff@peff.net>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>,
	Duy Nguyen <pclouds@gmail.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [RFC/WIP PATCH 00/11] Protocol version 2, again!
Date: Thu, 4 Jun 2015 09:44:49 -0700	[thread overview]
Message-ID: <CAGZ79ka1qjspbbE4ZKHrys9TOmKbS30KA1PJkwzA47ptjxQfSA@mail.gmail.com> (raw)
In-Reply-To: <20150604130902.GA12404@peff.net>

On Thu, Jun 4, 2015 at 6:09 AM, Jeff King <peff@peff.net> wrote:
> On Mon, Jun 01, 2015 at 10:49:45AM -0700, Stefan Beller wrote:
>
>> However the client side with builtin/fetch, builtin/fetch-pack, fetch-pack
>> is a bit of a mystery to me, as I cannot fully grasp the difference between
>>  * connect.{h,c}
>>  * remote.{h.c}
>>  * transport.{h.c}
>> there. All of it seems to be doing network related stuff, but I have trouble
>> getting the big picture. I am assuming all of these 3 are rather a low level,
>> used like a library, though there must be even more hierarchy in there,
>> connect is most low level judging from the header file and used by
>> the other two.
>> transport.h seems to provide the most toplevel library stuff as it includes
>> remote.h in its header?
>
> connect.c was originally "the git protocol", and was used by send-pack
> and fetch-pack. Other individual programs implemented other transports.
> Later, as the interface moved towards everybody running "fetch" and
> "push", and those delegating work to the individual transports, we got
> transport.c, which is an abstract interface for all transports. It
> delegates actual git-protocol work to the functions in connect.c (or
> bundle work elsewhere, or handles remote-helpers itself).
>
> And then remote.c contains routines for dealing with the remotes at a
> logical level. E.g., which refs to fetch or push, etc.
>
> So in theory, the flow is something like:
>
>   - fetch.c knows "the user wants to fetch from 'foo'"
>
>   - fetch asks remote.c: "who is remote 'foo'"; we get back a URL
>
>   - fetch asks transport.c: "what are the refs for $URL"
>
>   - it turns out to be a git URL. transport.c calls into connect.c to
>     implement get_refs_via_connect.

Currently the distinction which protocol to speak is made
here (in get_refs_via_connect), which may be a bit late. Though I updating
the git protocol only first would also be feasible.

So for the next git protocol

 - get_refs_via_connect first asks for the capabilities and gets an answer from
   upload-pack-2. Now what?

 - we could have a callback in struct transport, which must be set
accordingly by
   fetch in step 4 (it turns out to be a git URL. transport.c ...)
   This callback is called with each pkt-line such as

        void parse-capability(char *line, struct
*transport_capabilities, void *cdata);

The line would contain the pkt-line, while the transport_capabilities
would be a struct
similar as in "[RFCv2 06/16] remote.h: add new struct for options",
where the fetch
implementation must select the right bits. Looking at fetch-pack.{c,h}
we only expose
one do-it-all method there, so we currently don't have file wide
easily accessible variables,
but rather all in a struct fetch_pack_args, which carries important
information for the
selection process such as verbosity or desired options. This is why a
void* comes in
handy as well. (It will be easy later to adapt that to the sending
side as well).

Instead of a full grown line by line callback we could also just
collect all the capabilities
first in a string list and then only call back once into a

    void select_capabilities(struct string_list *available, struct
string_list *selected);

I think I'd find this second approach more handy as there are subtle
details you'd miss in
the first approach. Looking at fetch-pack.c, do_fetch_pack (line 790),
we have one
selection (no_done) conditioned on another (multi_ack_detailed), so
having the full list
there makes the code easier.

This second approach however might not be as future proof as the
first, because we store
all received capabilities (which may grow large in the future) and not
throw unknowns away
immediately.

I tend to rather implement the second one (easier to read/maintain trumps a
maybe-performance-problem-in-the-future).

This performance-problem-in-the-future could be mitigated easily by having a
preselection in transport.c get_capabilities, which ignores any capabilities not
white listed there (harder to maintain though, as we have a more than one spot
where to put a list)

By writing this mail I realized another thing. I have had the patch
    "[RFCv2 09/16] remote.h: add get_remote_capabilities, request_capabilities"
which has request_capabilities just translating from a struct
containing some bits
into a sequence of pkt-lines containing the actual protocol
capabilities. Maybe we
should not have that in the connect file, but rather as proposed in
this email, the
high level command directly selects the strings to put back on the
wire. (By having
"struct string_list *selected" as part of the select_capabilities arguments.)
then the request_capabilities in connect.c would be dumbed down to just:

    void request_capabilities(int out, struct string_list *list)
    {
        struct string_list_item *item;
        for_each_string_list_item(item, list) {
             packet_write(out, item->string);
        }
        packet_flush(out);
    }

I think that would be reasonable?

>
>   - after fetch gets back the list of refs, it uses routines in remote.c
>     to figure out which refs we are interested in, handle refspecs, etc
>
>   - now fetch asks transport.c: "OK, fetch just these refs"
>
>   - transport.c again calls into connect.c to handle the actual fetch
>
> Of course over the years a lot of cruft has grown around all of them. I
> wouldn't be surprised if there are functions which cross these
> abstractions, or other random functions inside each file that do not
> belong.
>
>> and the issue I am concerned about is the select_capabilities as well as
>> the request_capabilities function here. The select_capabilities functionality
>> is currently residing in the high level parts of the code as it both depends on
>> the advertised server capabilities and on the user input (--use-frotz or config
>> options), so the capability selection is done in fetchpack.c
>>
>> So there are 2 routes to go: Either we leave the select_capabilities in the
>> upper layers (near the actual high level command, fetch, fetchpack) or we put
>> it into the transport layer and just passing in a struct what the user desires.
>> And when the users desire doesn't meet the server capabilities we die deep down
>> in the transport layer.
>
> I think you have to leave it in the fetch-pack code. As you note, it's
> the place where we know about what the user is asking for and can
> manipulate the list. And not all transports even support capabilities
> like this.
>
> -Peff

Okay

      reply	other threads:[~2015-06-04 16:45 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-26 22:01 [RFC/WIP PATCH 00/11] Protocol version 2, again! Stefan Beller
2015-05-26 22:01 ` [RFC/WIP PATCH 01/11] upload-pack: make client capability parsing code a separate function Stefan Beller
2015-05-26 22:01 ` [RFC/WIP PATCH 02/11] upload-pack: only accept capabilities on the first "want" line Stefan Beller
2015-05-26 22:17   ` Junio C Hamano
2015-05-26 22:20     ` Stefan Beller
2015-05-26 22:01 ` [RFC/WIP PATCH 03/11] upload-pack: move capabilities out of send_ref Stefan Beller
2015-05-26 22:01 ` [RFC/WIP PATCH 04/11] upload-pack-2: Implement the version 2 of upload-pack Stefan Beller
2015-05-27  2:30   ` Eric Sunshine
2015-05-27  6:35   ` Jeff King
2015-05-27 17:30     ` Eric Sunshine
2015-05-27 20:14       ` Jeff King
2015-05-27 17:40     ` Stefan Beller
2015-05-27 20:34       ` Jeff King
2015-05-27 20:45         ` Stefan Beller
2015-05-27 21:46           ` Jeff King
2015-05-26 22:01 ` [RFC/WIP PATCH 05/11] transport: add infrastructure to support a protocol version number Stefan Beller
2015-05-27  6:39   ` Jeff King
2015-05-27 19:01     ` Stefan Beller
2015-05-27 20:17       ` Jeff King
2015-05-27 19:10     ` Junio C Hamano
2015-05-26 22:01 ` [RFC/WIP PATCH 06/11] remote.h: add get_remote_capabilities, request_capabilities Stefan Beller
2015-05-27  3:25   ` Eric Sunshine
2015-05-27  6:50     ` Jeff King
2015-05-27 17:19       ` Eric Sunshine
2015-05-27 20:09         ` Jeff King
2015-05-27  6:45   ` Jeff King
2015-05-29 19:39     ` Stefan Beller
2015-05-29 22:08       ` Jeff King
2015-05-26 22:01 ` [RFC/WIP PATCH 07/11] fetch-pack: use the configured transport protocol Stefan Beller
2015-05-26 22:19   ` Junio C Hamano
2015-05-26 22:23     ` Stefan Beller
2015-05-27  6:53   ` Jeff King
2015-05-26 22:01 ` [RFC/WIP PATCH 08/11] transport: connect_setup appends protocol version number Stefan Beller
2015-05-26 22:21   ` Junio C Hamano
2015-05-26 22:31     ` Stefan Beller
2015-05-27  5:09       ` Junio C Hamano
2015-05-27  6:56         ` Jeff King
2015-05-27  3:33   ` Eric Sunshine
2015-05-27  7:02   ` Jeff King
2015-05-26 22:01 ` [RFC/WIP PATCH 09/11] transport: get_refs_via_connect exchanges capabilities before refs Stefan Beller
2015-05-27  5:37   ` Eric Sunshine
2015-05-27  7:06   ` Jeff King
2015-05-26 22:01 ` [RFC/WIP PATCH 10/11] t5544: add a test case for the new protocol Stefan Beller
2015-05-27  5:34   ` Eric Sunshine
2015-05-27  7:12   ` Jeff King
2015-05-26 22:01 ` [RFC/WIP PATCH 11/11] Document protocol version 2 Stefan Beller
2015-05-29 20:35   ` Junio C Hamano
2015-05-29 21:36     ` Stefan Beller
2015-05-29 21:52       ` Junio C Hamano
2015-05-29 22:21         ` Jeff King
2015-06-01 23:14           ` Stefan Beller
2015-06-01 23:40             ` Stefan Beller
2015-06-04 13:18               ` Jeff King
2015-06-04 17:01                 ` Junio C Hamano
2015-06-02 17:06             ` Junio C Hamano
2015-05-27  6:18 ` [RFC/WIP PATCH 00/11] Protocol version 2, again! Jeff King
2015-05-27  7:08   ` Jeff King
2015-06-01 17:49     ` Stefan Beller
2015-06-02 10:10       ` Duy Nguyen
2015-06-04 13:09       ` Jeff King
2015-06-04 16:44         ` Stefan Beller [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGZ79ka1qjspbbE4ZKHrys9TOmKbS30KA1PJkwzA47ptjxQfSA@mail.gmail.com \
    --to=sbeller@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).