From: Stefan Beller <sbeller@google.com>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>,
"git@vger.kernel.org" <git@vger.kernel.org>,
Duy Nguyen <pclouds@gmail.com>
Subject: Re: [RFC/WIP PATCH 11/11] Document protocol version 2
Date: Mon, 1 Jun 2015 16:40:54 -0700
Message-ID: <CAGZ79kYD--dZ_V=_X_Eo31KYTKXt2njuf56XqRRdaTJeLhDjaQ@mail.gmail.com>
In-Reply-To: <CAGZ79kaRTLX7eBCOA=yQHVwcN-H-o_aZFfQ1gw7Nx-NC82pbag@mail.gmail.com>
On Mon, Jun 1, 2015 at 4:14 PM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, May 29, 2015 at 3:21 PM, Jeff King <peff@peff.net> wrote:
>> On Fri, May 29, 2015 at 02:52:14PM -0700, Junio C Hamano wrote:
>>
>>> > Currently we can do a = as part of the line after the first ref, such as
>>> >
>>> > symref=HEAD:refs/heads/master agent=git/2:2.4.0
>>> >
>>> > so I thought we want to keep this.
>>>
>>> I do not understand that statement.
>>>
>>> Capability exchange in v2 is one packet per cap, so the above
>>> example would be expressed as:
>>>
>>> symref=HEAD:refs/heads/master
>>> agent=git/2:2.4.0
>>>
>>> right? Your "keyvaluepair" is limited to [a-z0-9-_=]*, and neither
>>> of the above two can be expressed with that, which was why I said
>>> you need two different set of characters before and after "=". Left
>>> hand side of "=" is tightly limited and that is OK. Right hand side
>>> may contain characters like ':', '.' and '/', so your alphabet need
>>> to be more lenient, even in v1 (which I would imagine would be "any
>>> octet other than SP, LF and NUL").
>
> I think the recent issue with the push certificates shows that having arbitrary
> data after the = is a bad idea. So we need to be very cautious when to allow
> which data after the =.
>
> I'll try split up the patch.
>
>>
>> Yes. See git_user_agent_sanitized(), for example, which allows basically
>> any printable ASCII except for SP.
>>
>> I think the v2 capabilities do not even need to have that restriction.
>> It can allow arbitrary binary data, because it has an 8bit-clean framing
>> mechanism (pkt-lines). Of course, that means such capabilities cannot be
>> represented in a v1 conversation (whose framing mechanism involves SP
>> and NUL). But it's probably acceptable to introduce new capabilities
>> which are only available in a v2 conversation. Old clients that do not
>> understand v2 would not understand the capability either. It does
>> require new clients implementing the capability to _also_ implement v2
>> if they have not done so, but I do not mind pushing people in that
>> direction.
>>
>> The initial v2 client implementation should probably do a few cautionary
>> things, then:
>>
>> 1. Do _not_ fold the per-pkt capabilities into a v1 string; that loses
>> the robust framing. I suggested string_list earlier, but probably
>> we want a list of ptr/len pair, so that it can remain NUL-clean.
>>
>> 2. Avoid holding on to unknown packets longer than necessary. Some
>> capability pkt-lines may be arbitrarily large (up to 64K). If we do
>> not understand them during the v2 read of the capabilities, there
>> is no point hanging on to them. It's not _wrong_ to do so, but just
>> inefficient; if we know that clients will just throw away unknown
>> packets, then we can later introduce new packets with large data,
>> without worrying about wasting the client's resources.
>>
>> I suspect it's not that big a deal either way, though. I have no
>> plans for sending a bunch of large packets, and anyway network
>> bandwidth is probably more precious than client memory.
>
> That's very sensible thoughts after rereading this email. The version
> I'll be sending out today will not follow those suggestions though. :(

Thinking about this further, maybe it is a good idea to require that
capabilities be advertised in alphabetical order?

The exchange would look like this:

    server:
        for capability in list:
            pkt_write(capability)
        pkt_flush()

    client:
        loop:
            line = recv_pkt()
            if line == flush:
                break
            parse_capability(line)

with parse_capability checking whether we know the capability and, if
so, setting some internal field.
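
To make the exchange above concrete, here is a rough sketch in Python (not
git code). It assumes the usual pkt-line framing: four hex digits giving the
total length including the length field itself, with "0000" as the flush
packet. The names recv_pkts, read_capabilities and KNOWN are made up for
illustration, and the "future-cap" capability is hypothetical.

```python
FLUSH = b"0000"  # flush packet: a bare length field of zero

# Hypothetical set of capabilities this client understands.
KNOWN = {b"symref", b"agent"}


def pkt_write(payload: bytes) -> bytes:
    """Frame one payload as a pkt-line (4 hex length digits + payload)."""
    length = len(payload) + 4  # the length counts its own 4 bytes
    if length > 0xFFFF:  # real implementations may enforce a lower limit
        raise ValueError("payload too large for one pkt-line")
    return b"%04x" % length + payload


def recv_pkts(stream: bytes):
    """Yield payloads from a buffer of pkt-lines until a flush packet."""
    pos = 0
    while True:
        length = int(stream[pos:pos + 4], 16)
        if length == 0:  # flush packet ends the advertisement
            return
        if length < 4:
            raise ValueError("invalid pkt-line length")
        yield stream[pos + 4:pos + length]
        pos += length


def read_capabilities(stream: bytes) -> dict:
    """Keep known capabilities as bytes; drop unknown ones immediately."""
    caps = {}
    for line in recv_pkts(stream):
        key, _, value = line.partition(b"=")
        if key in KNOWN:
            caps[key] = value  # bytes, so the value stays NUL-clean
    return caps


# Server side: one packet per capability, then a flush.
wire = b"".join(pkt_write(c) for c in [
    b"symref=HEAD:refs/heads/master",
    b"agent=git/2:2.4.0",
    b"future-cap=\x00\x01",  # unknown to this client, so it is discarded
]) + FLUSH

caps = read_capabilities(wire)
```

Values are kept as bytes rather than folded into a single v1-style string,
and unknown packets are not held on to, which is in the spirit of Peff's two
cautionary points quoted above.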

Now assume the number of capabilities grows a lot over time (someone may
"abuse" it for a cool feature, similar to what happened with refs; nobody
thought in advance about having so many refs).

So how does parse_capability scale w.r.t. the number of capabilities?
If parse_capability is just a linear search over the known capabilities,
each call is O(n), and with n advertised capabilities the client faces an
O(n^2) computation, which is bad. If we were to require alphabetical
order, the client could keep track of its position in its own sorted list
and the whole operation would be O(n). I just wonder if this is premature
optimization or something we need to think about.
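
A small sketch of the two strategies, again in Python and purely
illustrative (match_linear and match_sorted are made-up names, and the
capability lists are sample data). The sorted variant assumes both the
advertisement and the client's own list are in alphabetical order.

```python
def match_linear(advertised, known):
    """Scan the whole known list for every advertised capability: O(n * m)."""
    return [cap for cap in advertised if any(cap == k for k in known)]


def match_sorted(advertised, known):
    """Walk two alphabetically sorted lists in a single pass: O(n + m)."""
    matched = []
    i = 0  # position in the client's sorted list; it never moves backwards
    for cap in advertised:
        while i < len(known) and known[i] < cap:
            i += 1
        if i < len(known) and known[i] == cap:
            matched.append(cap)
    return matched


known = ["agent", "multi_ack", "side-band", "symref"]
advertised = ["agent", "future-cap", "symref", "thin-pack"]

linear_result = match_linear(advertised, known)
sorted_result = match_sorted(advertised, known)
```

Both calls return the same matches; only the amount of work differs. For
what it is worth, a hash-based lookup on the client would also make each
check expected O(1) without requiring any ordering on the wire.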

To prevent this problem from popping up, it must be easier to introduce
a new phase after the capabilities exchange than to just abuse the
capabilities phase for whatever you plan on doing.

Thanks,
Stefan
>
>>
>> -Peff