git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Masaya Suzuki <masayasuzuki@google.com>
To: peff@peff.net
Cc: Josh Steadmon <steadmon@google.com>,
	git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v3 1/4] pack-protocol.txt: accept error packets in any context
Date: Wed, 12 Dec 2018 17:17:01 -0800	[thread overview]
Message-ID: <CAJB1erXRqQW0yQyZutJAJKC7WbdVhBAYUMWM+8ZutxA-W-7S8w@mail.gmail.com> (raw)
In-Reply-To: <20181212110206.GA30673@sigill.intra.peff.net>

On Wed, Dec 12, 2018 at 3:02 AM Jeff King <peff@peff.net> wrote:
>
> On Tue, Dec 11, 2018 at 04:25:15PM -0800, Josh Steadmon wrote:
>
> > From: Masaya Suzuki <masayasuzuki@google.com>
> >
> > In the Git pack protocol definition, an error packet may appear only in
> > a certain context. However, servers can face a runtime error (e.g. I/O
> > error) at an arbitrary timing. This patch changes the protocol to allow
> > an error packet to be sent instead of any packet.
> >
> > Following this protocol spec change, the error packet handling code is
> > moved to pkt-line.c.
>
> This is a change in the spec with an accompanying change in the code,
> which raises the question: what do other implementations do with this
> change (both older Git, and implementations like JGit, libgit2, etc)?

JGit is similar to Git. It parses "ERR " in limited places. When it sees an ERR
packet in an unexpected place, it'll fail somewhere in the parsing code.

https://github.com/eclipse/jgit/blob/30c6c7542190c149e2aee792f992a312a5fc5793/org.eclipse.jgit/src/org/eclipse/jgit/transport/PacketLineIn.java#L145-L147
https://github.com/eclipse/jgit/blob/f40b39345cd9b54473ee871bff401fe3d394ffe3/org.eclipse.jgit/src/org/eclipse/jgit/transport/BasePackConnection.java#L208

I'm not familiar with libgit2 code, but it seems it handles this at a
lower level. An error type packet is parsed out at a low level, and
the error handling is done by the callers of the packet parser.

https://github.com/libgit2/libgit2/blob/bea65980c7a42e34edfafbdc40b199ba7b2a564e/src/transports/smart_pkt.c#L482-L483

I cannot find an ERR packet handling in go-git. It seems to me that if
an ERR packet appears it treats it as a parsing error.

https://github.com/src-d/go-git/blob/master/plumbing/protocol/packp/common.go#L60-L62

>
> I think the answer for older Git is "hang up unceremoniously", which is
> probably OK given the semantics of the change. And I'd suspect most
> other implementations would do the same. I just wonder if anybody tested
> it with other implementations.

I'm thinking aloud here. There would be two aspects of the protocol
compatibility: (1) new clients speak to old servers (2) old clients
speak to a new server that speaks the updated protocol.

For (1), I assume that in the Git pack protocol, a packet starting
from "ERR " does not appear naturally except for a very special case
that the server doesn't support sideband, but using the updated
protocol. As you mentioned, at first it looks like this can mistakenly
parse the pack file of git-receive-pack as an ERR packet, assuming
that git-receive-pack's pack file is packetized. Actually
git-receive-pack's pack file is not packetized in the Git pack
protocol (https://github.com/git/git/blob/master/builtin/receive-pack.c#L1695).
I recently wrote a Git protocol parser
(https://github.com/google/gitprotocolio), and I confirmed that this
is the case at least for the HTTP transport. git-upload-pack's pack
file is indeed packetized, but packetized with sideband. Except for
the case where sideband is not used, the packfiles wouldn't be
considered as an ERR packet accidentally.

For (2), if the old clients see an unexpected ERR packet, they cannot
parse it. They would handle this unparsable data as if the server is
not speaking Git protocol correctly. Even if the old clients just
ignore the packet, due to the nature of the ERR packet, the server
won't send further data. The client won't be able to proceed. Overall,
the clients anyway face an error, and the only difference would be
whether the clients can show an error nicely or not. The new clients
will show the errors nicely to users. Old clients will not.

>
> > +An error packet is a special pkt-line that contains an error string.
> > +
> > +----
> > +  error-line     =  PKT-LINE("ERR" SP explanation-text)
> > +----
> > +
> > +Throughout the protocol, where `PKT-LINE(...)` is expected, an error packet MAY
> > +be sent. Once this packet is sent by a client or a server, the data transfer
> > +process defined in this protocol is terminated.
>
> The packfile data is typically packetized, too, and contains arbitrary
> data (that could have "ERR" in it). It looks like we don't specifically
> say PKT-LINE() in that part of the protocol spec, though, so I think
> this is OK.

As I described above, as far as I can see, the packfile in
git-upload-pack is not packetized. The packfile in git-receive-pack is
packetized but typically with sideband. At least at the Git pack
protocol level, this should be OK.

>
> Likewise, in the implementation:
>
> > diff --git a/pkt-line.c b/pkt-line.c
> > index 04d10bbd03..ce9e42d10e 100644
> > --- a/pkt-line.c
> > +++ b/pkt-line.c
> > @@ -346,6 +346,10 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
> >               return PACKET_READ_EOF;
> >       }
> >
> > +     if (starts_with(buffer, "ERR ")) {
> > +             die(_("remote error: %s"), buffer + 4);
> > +     }
> > +
> >       if ((options & PACKET_READ_CHOMP_NEWLINE) &&
> >           len && buffer[len-1] == '\n')
> >               len--;
>
> This ERR handling has been moved to a very low level. What happens if
> we're passing arbitrary data via the packet_read() code? Could we
> erroneously trigger an error if a packfile happens to have the bytes
> "ERR " at a packet boundary?
>
> For packfiles via upload-pack, I _think_ we're OK, because we only
> packetize it when a sideband is in use. In which case this would never
> match, because we'd have "\1" in the first byte slot.
>
> But are there are other cases we need to worry about? Just
> brainstorming, I can think of:
>
>   1. We also pass packetized packfiles between git-remote-https and
>      the stateless-rpc mode of fetch-pack/send-pack. And I don't think
>      we use sidebands there.
>
>   2. The packet code is used for long-lived clean/smudge filters these
>      days, which also pass arbitrary data.
>
> So I think it's probably not a good idea to unconditionally have callers
> of packet_read_with_status() handle this. We'd need a flag like
> PACKET_READ_RESPECT_ERR, and to trigger it from the appropriate callers.

This is outside of the Git pack protocol so having a separate parsing
mode makes sense to me.

>
> -Peff

  reply	other threads:[~2018-12-13  1:17 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-12 22:44 [PATCH] remote-curl: die on server-side errors steadmon
2018-11-12 22:55 ` Stefan Beller
2018-11-13  2:52 ` Junio C Hamano
2018-11-13  3:02 ` Junio C Hamano
2018-11-13 22:15   ` Josh Steadmon
2018-11-13 14:26 ` Jeff King
2018-11-13 22:25   ` Josh Steadmon
2018-11-14  0:49     ` Jeff King
2018-11-14  7:00       ` Jeff King
2018-11-15 21:51         ` Josh Steadmon
2018-11-16  8:44           ` [PATCH 0/3] remote-curl smart-http discovery cleanup Jeff King
2018-11-16  8:47             ` [PATCH 1/3] remote-curl: refactor smart-http discovery Jeff King
2018-11-16 20:27               ` Josh Steadmon
2019-02-05 23:29               ` Junio C Hamano
2019-02-06 19:16                 ` Jeff King
2019-02-06 19:18                   ` Jeff King
2019-02-06 19:29                     ` Josh Steadmon
2019-02-06 20:42                       ` Junio C Hamano
2019-02-06 21:14                       ` Jeff King
2019-02-06 19:18                   ` [PATCH 2/3] remote-curl: tighten "version 2" check for smart-http Jeff King
2019-02-06 19:19                   ` [PATCH 3/3] t5551: test server-side ERR packet Jeff King
2019-02-06 22:19                   ` [PATCH 1/3] remote-curl: refactor smart-http discovery Junio C Hamano
2018-11-16  8:48             ` [PATCH 2/3] remote-curl: tighten "version 2" check for smart-http Jeff King
2018-11-16 20:28               ` Josh Steadmon
2018-11-16  8:49             ` [PATCH 3/3] remote-curl: die on server-side errors Jeff King
2018-11-16 20:04             ` [PATCH 0/3] remote-curl smart-http discovery cleanup Josh Steadmon
2018-12-12  0:25             ` [PATCH v3 0/4] Unify pkt-line error handling and refactor smart-http Josh Steadmon
2018-12-12  0:25               ` [PATCH v3 1/4] pack-protocol.txt: accept error packets in any context Josh Steadmon
2018-12-12 11:02                 ` Jeff King
2018-12-13  1:17                   ` Masaya Suzuki [this message]
2018-12-13  8:04                     ` Jeff King
2018-12-13 22:18                     ` Josh Steadmon
2018-12-17 21:33                       ` Jeff King
2018-12-19 23:30                         ` Josh Steadmon
2018-12-20 15:49                           ` Jeff King
2018-12-12  0:25               ` [PATCH v3 2/4] remote-curl: refactor smart-http discovery Josh Steadmon
2018-12-12  0:25               ` [PATCH v3 3/4] remote-curl: tighten "version 2" check for smart-http Josh Steadmon
2018-12-12  0:25               ` [PATCH v3 4/4] lib-httpd, t5551: check server-side HTTP errors Josh Steadmon
2018-12-12  8:43               ` [PATCH v3 0/4] Unify pkt-line error handling and refactor smart-http Junio C Hamano
2018-11-13 14:30 ` [PATCH] remote-curl: die on server-side errors Junio C Hamano
2018-11-13 22:28   ` Josh Steadmon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJB1erXRqQW0yQyZutJAJKC7WbdVhBAYUMWM+8ZutxA-W-7S8w@mail.gmail.com \
    --to=masayasuzuki@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=steadmon@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).