git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Stefan Beller <sbeller@google.com>
Cc: git@vger.kernel.org, peff@peff.net, jrnieder@gmail.com,
	johannes.schindelin@gmail.com, Jens.Lehmann@web.de,
	vlovich@gmail.com
Subject: Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
Date: Thu, 17 Sep 2015 09:13:40 -0700	[thread overview]
Message-ID: <xmqqtwqtja6j.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <1442453948-9885-2-git-send-email-sbeller@google.com> (Stefan Beller's message of "Wed, 16 Sep 2015 18:38:59 -0700")

Stefan Beller <sbeller@google.com> writes:

> Subject: Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock

s/Add/add/;

> We need to read from pipes without blocking in a later patch.

I am hoping that you are at least not spinning---i.e. do a poll 
first to make sure there is at least some progress to be made
before calling this.

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  strbuf.c | 25 +++++++++++++++++++++++--
>  strbuf.h |  6 ++++++
>  2 files changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/strbuf.c b/strbuf.c
> index cce5eed..4130ee2 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -357,7 +357,10 @@ size_t strbuf_fread(struct strbuf *sb, size_t size, FILE *f)
>  	return res;
>  }
>  
> -ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
> +#define IGNORE_EAGAIN (1)
> +
> +static ssize_t strbuf_read_internal(struct strbuf *sb, int fd,
> +				    size_t hint, int flags)
>  {
>  	size_t oldlen = sb->len;
>  	size_t oldalloc = sb->alloc;
> @@ -366,8 +369,16 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
>  	for (;;) {
>  		ssize_t cnt;
>  
> -		cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
> +		cnt = read(fd, sb->buf + sb->len, sb->alloc - sb->len - 1);
>  		if (cnt < 0) {
> +			if (errno == EINTR)
> +				continue;
> +			if (errno == EAGAIN) {
> +				if (flags & IGNORE_EAGAIN)
> +					break;
> +				else
> +					continue;
> +			}

In order to ensure that this is not breaking the normal case, I had
to look at the implementation of xread() to see it behaves identically
when the flags is not passed.  That one-time review burden implies
that this is adding a maintenance burden to keep this copied function
in sync.

We should also handle EWOULDBLOCK not just EAGAIN.

More importantly, I am not sure if this helper is even necessary.

Looking at xread (reproduced in its full glory):

/*
 * xread() is the same a read(), but it automatically restarts read()
 * operations with a recoverable error (EAGAIN and EINTR). xread()
 * DOES NOT GUARANTEE that "len" bytes is read even if the data is available.
 */
ssize_t xread(int fd, void *buf, size_t len)
{
	ssize_t nr;
	if (len > MAX_IO_SIZE)
	    len = MAX_IO_SIZE;
	while (1) {
		nr = read(fd, buf, len);
		if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
			continue;
		return nr;
	}
}

Are we doing the right thing for EAGAIN here?

Now, man read(2) says this:

       EAGAIN 
		The file descriptor fd refers to a file other than a
                socket and has been marked nonblocking (O_NONBLOCK),
                and the read would block.

       EAGAIN or EWOULDBLOCK
		The file descriptor fd refers to a socket and has
                been marked nonblocking (O_NONBLOCK), and the read
                would block.  POSIX.1-2001 allows either error to be
                returned for this case, and does not require these
                con‐ stants to have the same value, so a portable
                application should check for both possibilities.

If the fd is not marked nonblocking, then we will not get EAGAIN (or
EWOULDBLOCK).

When fd is set to nonblocking, the current xread() spins if read()
says that the operation would block.  What would we achieve by
spinning ourselves here, though?  The caller must have set the fd to
be nonblocking for a reason, and that reason cannot be "if the data
is not ready, we do not have anything better than just spinning for
new data to arrive"---if so, the caller wouldn't have set the fd to
be nonblocking in the first place.

Even if there is such a stupid caller that only wants us to loop,
because we explicitly allow xread() to return a short read, all of
our callers must call it in a loop if they know how much they want
to read.  We can just return from here and let them loop around us.

And your new caller that does O_NONBLOCK wants to do more than
looping upon EWOULDBLOCK.  It certainly would not want us to loop
here.

So I wonder if you can just O_NONBLOCK the fd and use the usual
strbuf_read(), i.e. without any change in this patch, and update
xread() to _unconditionally_ return when read(2) says EAGAIN or
EWOULDBLOCK.

What would that break?

>  			if (oldalloc == 0)
>  				strbuf_release(sb);
>  			else
> @@ -384,6 +395,16 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
>  	return sb->len - oldlen;
>  }
>  
> +ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint)
> +{
> +	return strbuf_read_internal(sb, fd, hint, 0);
> +}
> +
> +ssize_t strbuf_read_noblock(struct strbuf *sb, int fd, size_t hint)
> +{
> +	return strbuf_read_internal(sb, fd, hint, IGNORE_EAGAIN);
> +}
> +
>  #define STRBUF_MAXLINK (2*PATH_MAX)
>  
>  int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint)
> diff --git a/strbuf.h b/strbuf.h
> index aef2794..23ca7aa 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -367,6 +367,12 @@ extern size_t strbuf_fread(struct strbuf *, size_t, FILE *);
>  extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint);
>  
>  /**
> + * Same as strbuf_read, just returns non-blockingly by ignoring EAGAIN.
> + * The fd must have set O_NONBLOCK.
> + */
> +extern ssize_t strbuf_read_noblock(struct strbuf *, int fd, size_t hint);
> +
> +/**
>   * Read the contents of a file, specified by its path. The third argument
>   * can be used to give a hint about the file size, to avoid reallocs.
>   */

  reply	other threads:[~2015-09-17 16:13 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-17  1:38 [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Stefan Beller
2015-09-17  1:38 ` [PATCH 01/10] strbuf: Add strbuf_read_noblock Stefan Beller
2015-09-17 16:13   ` Junio C Hamano [this message]
2015-09-17 16:30     ` Jeff King
2015-09-17 16:44       ` Junio C Hamano
2015-09-17 16:51         ` Stefan Beller
2015-09-17 16:57         ` Jeff King
2015-09-17 16:58       ` Junio C Hamano
2015-09-17 17:13         ` Jeff King
2015-09-17 17:26           ` Stefan Beller
2015-09-17 17:35             ` Jeff King
2015-09-17 17:45               ` Stefan Beller
2015-09-17 17:50                 ` Jeff King
2015-09-17 17:53                   ` Stefan Beller
2015-09-17 17:57               ` Junio C Hamano
2015-09-17 17:54           ` Junio C Hamano
2015-09-17 18:02             ` Jeff King
2015-09-17 17:20         ` Stefan Beller
2015-09-17  1:39 ` [PATCH 02/10] run-command: factor out return value computation Stefan Beller
2015-09-17 10:33   ` Jeff King
2015-09-17  1:39 ` [PATCH 03/10] run-command: add an asynchronous parallel child processor Stefan Beller
2015-09-17 21:44   ` Junio C Hamano
2015-09-17 23:19     ` Stefan Beller
2015-09-18  1:05       ` Junio C Hamano
2015-09-18 16:36         ` Stefan Beller
2015-09-17  1:39 ` [PATCH 04/10] fetch_populated_submodules: use new parallel job processing Stefan Beller
2015-09-17  1:39 ` [PATCH 05/10] submodules: Allow parallel fetching, add tests and documentation Stefan Beller
2015-09-17  1:39 ` [PATCH 06/10] git submodule update: Redirect any output to stderr Stefan Beller
2015-09-17 20:31   ` Eric Sunshine
2015-09-17 20:38     ` Stefan Beller
2015-09-17  1:39 ` [PATCH 07/10] git submodule update: pass --prefix only with a non empty prefix Stefan Beller
2015-09-17 20:33   ` Eric Sunshine
2015-09-17  1:39 ` [PATCH 08/10] git submodule update: cmd_update_recursive Stefan Beller
2015-09-17  1:39 ` [PATCH 09/10] " Stefan Beller
2015-09-17 20:37   ` Eric Sunshine
2015-09-17  1:39 ` [PATCH 10/10] git submodule update: cmd_update_fetch Stefan Beller
2015-09-17 17:06 ` [PATCH 00/10] fetch submodules in parallel and a preview on parallel "submodule update" Jacob Keller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqtwqtja6j.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=Jens.Lehmann@web.de \
    --cc=git@vger.kernel.org \
    --cc=johannes.schindelin@gmail.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    --cc=vlovich@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).