From: Jeff King <peff@peff.net>
To: David Turner <dturner@twopensource.com>
Cc: Shawn Pearce <spearce@spearce.org>,
	Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org
Subject: Re: [PATCH/RFC 4/6] transport: add refspec list parameters to functions
Date: Tue, 19 Apr 2016 19:22:43 -0400
Message-ID: <20160419232243.GF18255@sigill.intra.peff.net>
In-Reply-To: <1461102001.5540.125.camel@twopensource.com>

On Tue, Apr 19, 2016 at 05:40:01PM -0400, David Turner wrote:

> > I dunno, I am a bit negative on bringing new features to
> > Git-over-HTTP (which is already less efficient than the other
> > protocols!) without any plan for supporting them in the other
> > protocols.
> 
> Interesting -- can you expand on git-over-http being less efficient?
> This is the first I'd heard of it.  Is it documented somewhere?

I don't know offhand of a thorough discussion I can link to. But
basically, the issue is the tip negotiation that happens during a fetch.
In the normal git-over-ssh and git-over-tcp protocols, we're
full-duplex, and both sides remember all of the state. So the client is
spewing "want" and "have" lines at the server, which is responding
asynchronously with acks or naks until they reach a shared point to
generate the pack from.

In the HTTP protocol, this negotiation has to happen via synchronous
request/response pairs. So the client says "here are some haves; what do
you think?" and gets back the response. Then it prepares another batch of
haves, and so on, until the server says "OK, I've seen enough; here's
the pack". But because the server is stateless, each request has to
summarize the findings of the prior request. And so each request gets
slightly bigger as we iterate.
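
To make the growth concrete, here is a toy model of that loop. It is only a
sketch, not what fetch-pack actually does: the function name, the batch size,
and the "ready_after" threshold are all made up for illustration. The only
point is that a stateless server forces each request to repeat the common
commits found so far, on top of the new batch of haves.

```python
def simulate_stateless_negotiation(client_haves, server_commits,
                                   batch_size=16, ready_after=20):
    """Return the number of "have" lines sent in each request.

    Toy model only: batch_size and ready_after are hypothetical knobs,
    and the server is assumed to be satisfied once it has acked
    ready_after common commits.
    """
    common = []          # haves the server has acked in earlier rounds
    request_sizes = []
    for start in range(0, len(client_haves), batch_size):
        batch = client_haves[start:start + batch_size]
        # The server kept no state, so the request must restate every
        # earlier finding before adding the new batch.
        request = common + batch
        request_sizes.append(len(request))
        # The server acks whichever of the new haves it also has.
        common.extend(h for h in batch if h in server_commits)
        if len(common) >= ready_after:
            break        # "OK, I've seen enough; here's the pack"
    return request_sizes


haves = [f"c{i}" for i in range(100)]
server = {f"c{i}" for i in range(0, 100, 2)}   # server knows every other one
print(simulate_stateless_negotiation(haves, server))   # [16, 24, 32]
```

With a stateful, full-duplex transport the same conversation would only ever
send each batch once.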

There are some tunable parameters there (e.g., how many haves to send in
the first batch?), and the current settings are meant to be a mix of not
wasting too much time preparing a request, but also putting enough into
it that common requests can complete with only a single round trip.

I don't have numbers on how often we have to fall back to multiple
requests, or how big they can grow. I know I have very occasionally seen
pathological cases where we outgrew the HTTP buffer sizes, and re-trying
the fetch via ssh just worked.

I'm cc-ing Shawn, who designed all of this, and can probably give more
details (and may also have opinions on new http-only protocol features,
as he'd probably end up implementing them in JGit, too).

It would be nice if we could do a true full-duplex conversation over
HTTP. I looked into Websockets at one point, but IIRC there wasn't
libcurl support for them.

> > So I'd rather see something like:
> > 
> >   1. Support for v2 "capabilities only" initial negotiation, followed
> >      by ref advertisement.
> > 
> >   2. Support for refspec-limiting capability.
> > 
> >   3. HTTP-only option from client to trigger v2 on the server.
> > 
> > That's still HTTP-specific, but it has a clear path for converging
> > with the ssh and git protocols eventually, rather than having to
> > support magic out-of-band capabilities forever.
> > 
> > It does require an extra round of HTTP request/response, though.
> 
> This seems way more complicated to me, and not necessarily
> super-efficient.  That is, it seems like rather a lot of work to add
> a whole round of negotiation and a new protocol, when all we really
> need is one little tweak.

It is less efficient because of the extra round. If the new protocol
were truly client-speaks-first, we could drop that round (which is
essentially what your proposal is doing; you're just sticking the
first-speak part into HTTP parameters).

I don't know how much that round costs if it's part of the same TCP
session, or part of the same pipelined HTTP connection.

> I wonder if it would be possible to just add these tweaks to v1, and
> save the v2 work for when someone has the time to implement it?

I don't think it's possible for the non-HTTP protocols. The single
change in v2 is to add a phase before the ref advertisement starts.
Without that, the server is going to start spewing advertisements.

You can find previous discussion on the list, but I think the options
basically are:

  1. Something like v2, where the client gets a chance to speak before
     the advertisement.

  2. Some out-of-band way of getting values from the client to the
     server (so maybe extra command-line arguments for git-over-ssh, and
     maybe shoving something after the "\0" for git-daemon, and of
     course extra parameters for HTTP).

  3. The client saying "stop spewing refs at me, I want to give you a
     ref filter" asynchronously, and accepting a little spew at the
     beginning of each conversation. That obviously only works for the
     full-duplex transports, so you'd probably fall back to (2) for
     http.
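
As a rough illustration of what option (2) could look like for git-daemon,
the sketch below builds the initial request as a pkt-line (four hex digits of
length, counting those four bytes, followed by the payload) and appends extra
key=value fields after the host field's "\0". The "refspec" key and the
daemon_request() helper are hypothetical, invented here for illustration; no
existing server is assumed to understand them.

```python
def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a pkt-line: 4 hex digits of total length, then data."""
    length = len(payload) + 4          # the length prefix counts itself
    if length > 0xFFFF:
        raise ValueError("payload too large for a single pkt-line")
    return b"%04x" % length + payload


def daemon_request(path: str, host: str, extra=()) -> bytes:
    """Build a git-daemon style upload-pack request.

    The command, path and host fields follow the usual request layout.
    Anything passed in `extra` is a HYPOTHETICAL out-of-band parameter
    tacked on after the "\0" that ends the host field.
    """
    payload = b"git-upload-pack " + path.encode() + b"\0"
    payload += b"host=" + host.encode() + b"\0"
    for key, value in extra:
        payload += f"{key}={value}".encode() + b"\0"
    return pkt_line(payload)


req = daemon_request("/project.git", "example.com",
                     extra=[("refspec", "refs/heads/master")])
print(req)
```

Whether an older server would ignore the trailing fields or reject the
request is exactly the kind of compatibility question this approach has to
answer.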

-Peff
