git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Farhan Khan" <farhan@farhan.codes>
Cc: git@vger.kernel.org
Subject: Re: How to determine when to stop receiving pack content
Date: Sun, 11 Aug 2019 15:38:50 -0700	[thread overview]
Message-ID: <xmqq7e7jcop1.fsf@gitster-ct.c.googlers.com> (raw)
In-Reply-To: <c1754835efe3aa8a5ac93ee2db4a99c5@farhan.codes> (Farhan Khan's message of "Sat, 10 Aug 2019 23:47:22 +0000")

"Farhan Khan" <farhan@farhan.codes> writes:

> I am trying to write an implementation of git clone over ssh and
> am a little confused how to determine a server response has
> ended. Specifically, after a client sends its requested 'want',
> the server sends the pack content over. However, how does the
> client know to stop reading data? If I run a simple read() of the
> file descriptor:
>
> A. If I use reading blocking, the client will wait until new data is available, potentially forever.
> B. If I use non-blocking, the client might terminate reading for new data, when in reality new data is in transit.

It's TCP stream, so blocking read will tell you when the the other
side finishes talking to you and disconnects.  Your read() will
signal EOF.  If you are paranoid and want to protect your reader
against malicious writer, then you cannot trust anything the other
side says (including possibly any "I have N megabyte of data" kind
of length information), so you'd need to set up a timeout to get
yourself out of a stuck read, but that is neither a news nor a
rocket surgery ;-)

The "upload-pack" (the component that talks with your "fetch" and
"clone"), after negotiating what objects to include in the data
transfer with the program on your side, produces a pack data stream,
and is allowed to send additional "garbage" after that.

The receiving end, after finishing the negotiation, reads the pack
data stream (there is only one packfile contents in it) and parses
it according to the packfile format so that it can find the end
(cf. Documentation/technical/pack-format.txt).

After seeing the end of the pack stream, anything that follows is
"garbage" and is generally passed through to the standard output.

There are two codepaths on the receiving end ("unpack-objects" and
"index-pack --stdin").  Most likely an initial "clone" would end up
following the latter, but for educational purposes, the unpack-objects
may be easier to follow.  These two codepaths are morally equivalent
at the higher conceptual levels.

  parent reply	other threads:[~2019-08-11 22:39 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-10 23:47 How to determine when to stop receiving pack content Farhan Khan
2019-08-11 15:04 ` Pratyush Yadav
2019-08-11 22:38 ` Junio C Hamano [this message]
2019-08-11 23:31 ` Farhan Khan
2019-08-12  0:22   ` Pratyush Yadav

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq7e7jcop1.fsf@gitster-ct.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=farhan@farhan.codes \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).