git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/2] wrapper: xread/xwrite fixes for non-blocking FDs
@ 2016-06-26 23:21 Eric Wong
  2016-06-26 23:21 ` [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK Eric Wong
  2016-06-26 23:21 ` [PATCH 2/2] xwrite: poll on non-blocking FDs Eric Wong
  0 siblings, 2 replies; 17+ messages in thread
From: Eric Wong @ 2016-06-26 23:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, git, Johannes Sixt

1/2 fixes a bug introduced in commit 1079c4be0b720
("xread: poll on non blocking fds") where the "continue"
got dropped.

I noticed the 1/2 bug while working on 2/2 and intentionally
triggering EAGAIN on a custom HTTP server to test 100% CPU
usage.  I originally blindly copied the branch from xread
into xwrite without the "continue" and was greeted with a
failed clone.

Eric Wong (2):
      xread: retry after poll on EAGAIN/EWOULDBLOCK
      xwrite: poll on non-blocking FDs

 wrapper.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-26 23:21 [PATCH 0/2] wrapper: xread/xwrite fixes for non-blocking FDs Eric Wong
@ 2016-06-26 23:21 ` Eric Wong
  2016-06-26 23:42   ` Jeff King
  2016-06-26 23:51   ` Jeff King
  2016-06-26 23:21 ` [PATCH 2/2] xwrite: poll on non-blocking FDs Eric Wong
  1 sibling, 2 replies; 17+ messages in thread
From: Eric Wong @ 2016-06-26 23:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, git, Johannes Sixt

We should continue to loop after EAGAIN/EWOULDBLOCK as the
intent of xread is to read as much as possible until an
EOF or real error occurs.

Fixes: 1079c4be0b720 ("xread: poll on non blocking fds")

Signed-off-by: Eric Wong <e@80x24.org>
---
 wrapper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/wrapper.c b/wrapper.c
index 5dc4e15..f1155d0 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -255,6 +255,7 @@ ssize_t xread(int fd, void *buf, size_t len)
 				 * call to read(2).
 				 */
 				poll(&pfd, 1, -1);
+				continue;
 			}
 		}
 		return nr;

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/2] xwrite: poll on non-blocking FDs
  2016-06-26 23:21 [PATCH 0/2] wrapper: xread/xwrite fixes for non-blocking FDs Eric Wong
  2016-06-26 23:21 ` [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK Eric Wong
@ 2016-06-26 23:21 ` Eric Wong
  2016-06-26 23:49   ` Jeff King
  1 sibling, 1 reply; 17+ messages in thread
From: Eric Wong @ 2016-06-26 23:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, git, Johannes Sixt

write(2) can hit the same EAGAIN/EWOULDBLOCK errors as read(2),
so busy-looping on a non-blocking FD is a waste of resources.

Currently, I do not know of a way for this happen:

* the NonBlocking directive in systemd does not apply to stdin,
  stdout, or stderr.

* xinetd provides no way to set the non-blocking flag at all

But theoretically, it's possible a careless C10K HTTP server
could use pipe2(..., O_NONBLOCK) to setup a pipe for
git-http-backend with only the intent to use non-blocking reads;
but accidentally leave non-blocking set on the write end passed
as stdout to git-upload-pack.

Followup-to: 1079c4be0b720 ("xread: poll on non blocking fds")

Signed-off-by: Eric Wong <e@80x24.org>
---
 wrapper.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/wrapper.c b/wrapper.c
index f1155d0..d973f86 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -274,8 +274,26 @@ ssize_t xwrite(int fd, const void *buf, size_t len)
 	    len = MAX_IO_SIZE;
 	while (1) {
 		nr = write(fd, buf, len);
-		if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
-			continue;
+		if (nr < 0) {
+			if (errno == EINTR)
+				continue;
+			if (errno == EAGAIN || errno == EWOULDBLOCK) {
+				struct pollfd pfd;
+				pfd.events = POLLOUT;
+				pfd.fd = fd;
+				/*
+				 * it is OK if this poll() failed; we
+				 * want to leave this infinite loop
+				 * only when write() returns with
+				 * success, or an expected failure,
+				 * which would be checked by the next
+				 * call to write(2).
+				 */
+				poll(&pfd, 1, -1);
+				continue;
+			}
+		}
+
 		return nr;
 	}
 }

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-26 23:21 ` [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK Eric Wong
@ 2016-06-26 23:42   ` Jeff King
  2016-06-27 13:02     ` Junio C Hamano
  2016-06-26 23:51   ` Jeff King
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff King @ 2016-06-26 23:42 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Stefan Beller, git, Johannes Sixt

On Sun, Jun 26, 2016 at 11:21:11PM +0000, Eric Wong wrote:

> We should continue to loop after EAGAIN/EWOULDBLOCK as the
> intent of xread is to read as much as possible until an
> EOF or real error occurs.
> 
> Fixes: 1079c4be0b720 ("xread: poll on non blocking fds")
> 
> Signed-off-by: Eric Wong <e@80x24.org>
> ---
>  wrapper.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/wrapper.c b/wrapper.c
> index 5dc4e15..f1155d0 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -255,6 +255,7 @@ ssize_t xread(int fd, void *buf, size_t len)
>  				 * call to read(2).
>  				 */
>  				poll(&pfd, 1, -1);
> +				continue;
>  			}
>  		}
>  		return nr;

Eek. This is a real bug that could cause racy and apparently random
failures. I guess we haven't seen many reports of it because it only
triggers when you have unexpected non-blocking descriptors, which are
not all that common. But your fix is definitely the right thing to do.

I also wondered how we managed to miss such an obvious point in review
of the original patch. Sadly, we _did_ notice it[1] but it looks like we
never fixed the problem. That is even more disturbing.

-Peff

[1] http://mid.gmane.org/20151218031336.GA8467@sigill.intra.peff.net

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] xwrite: poll on non-blocking FDs
  2016-06-26 23:21 ` [PATCH 2/2] xwrite: poll on non-blocking FDs Eric Wong
@ 2016-06-26 23:49   ` Jeff King
  0 siblings, 0 replies; 17+ messages in thread
From: Jeff King @ 2016-06-26 23:49 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Stefan Beller, git, Johannes Sixt

On Sun, Jun 26, 2016 at 11:21:12PM +0000, Eric Wong wrote:

> write(2) can hit the same EAGAIN/EWOULDBLOCK errors as read(2),
> so busy-looping on a non-blocking FD is a waste of resources.
> 
> Currently, I do not know of a way for this happen:
> 
> * the NonBlocking directive in systemd does not apply to stdin,
>   stdout, or stderr.
> 
> * xinetd provides no way to set the non-blocking flag at all
> 
> But theoretically, it's possible a careless C10K HTTP server
> could use pipe2(..., O_NONBLOCK) to setup a pipe for
> git-http-backend with only the intent to use non-blocking reads;
> but accidentally leave non-blocking set on the write end passed
> as stdout to git-upload-pack.
> 
> Followup-to: 1079c4be0b720 ("xread: poll on non blocking fds")

I think this is sensible. I don't know of a way for it to happen either,
but who knows what environments people will spawn git in (especially
http-backend, as we are at the mercy of the webserver).

So it's a good defensive measure, it shouldn't affect normal use at all
(because blocking descriptors would never return EAGAIN), and it can't
possibly hurt anybody expecting to use xwrite() non-blockingly to get
EAGAIN (because we already spin on it; this just converts the spin to a
poll).

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-26 23:21 ` [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK Eric Wong
  2016-06-26 23:42   ` Jeff King
@ 2016-06-26 23:51   ` Jeff King
  2016-06-27  3:56     ` [PATCHv2 " Eric Wong
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff King @ 2016-06-26 23:51 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Stefan Beller, git, Johannes Sixt

On Sun, Jun 26, 2016 at 11:21:11PM +0000, Eric Wong wrote:

> We should continue to loop after EAGAIN/EWOULDBLOCK as the
> intent of xread is to read as much as possible until an
> EOF or real error occurs.

BTW, a minor nit here. xread() does _not_ read as much as possible until
EOF. It tries until it gets a real error or at least one byte.

I know you inherited this mistaken text from 1079c4be0b, but we should
probably not repeat it.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCHv2 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-26 23:51   ` Jeff King
@ 2016-06-27  3:56     ` Eric Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2016-06-27  3:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, git, Johannes Sixt, Jeff King

Jeff King <peff@peff.net> wrote:
> On Sun, Jun 26, 2016 at 11:21:11PM +0000, Eric Wong wrote:
> 
> > We should continue to loop after EAGAIN/EWOULDBLOCK as the
> > intent of xread is to read as much as possible until an
> > EOF or real error occurs.
> 
> BTW, a minor nit here. xread() does _not_ read as much as possible until
> EOF. It tries until it gets a real error or at least one byte.
> 
> I know you inherited this mistaken text from 1079c4be0b, but we should
> probably not repeat it.

Good catch, here's v2 of PATCH 1/2 reworded:

----------8<----------
Subject: [PATCH] xread: retry after poll on EAGAIN/EWOULDBLOCK

We should continue to loop after EAGAIN/EWOULDBLOCK as the
intent of xread is to try until there is available data,
EOF, or an unrecoverable error.

Fixes: 1079c4be0b720 ("xread: poll on non blocking fds")

Signed-off-by: Eric Wong <e@80x24.org>
---
 wrapper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/wrapper.c b/wrapper.c
index 5dc4e15..f1155d0 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -255,6 +255,7 @@ ssize_t xread(int fd, void *buf, size_t len)
 				 * call to read(2).
 				 */
 				poll(&pfd, 1, -1);
+				continue;
 			}
 		}
 		return nr;
-- 
EW

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-26 23:42   ` Jeff King
@ 2016-06-27 13:02     ` Junio C Hamano
  2016-06-27 14:36       ` Jeff King
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2016-06-27 13:02 UTC (permalink / raw)
  To: Jeff King; +Cc: Eric Wong, Stefan Beller, git, Johannes Sixt

Jeff King <peff@peff.net> writes:

> I also wondered how we managed to miss such an obvious point in review
> of the original patch. Sadly, we _did_ notice it[1] but it looks like we
> never fixed the problem. That is even more disturbing.

Yes indeed.

I try to pay attention to "this is broken because..."  comments in
discussions to make a note in my copy of "What's cooking" report for
a problematic topic, as that is where I work from when merging
topics down, but apparently that procedure failed work in this case.
There needs a stronger mechanism to stop a known-buggy patch from
going thru, but I am not sure offhand what that should be.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-27 13:02     ` Junio C Hamano
@ 2016-06-27 14:36       ` Jeff King
  2016-06-27 16:49         ` Stefan Beller
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2016-06-27 14:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Wong, Stefan Beller, git, Johannes Sixt

On Mon, Jun 27, 2016 at 06:02:12AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I also wondered how we managed to miss such an obvious point in review
> > of the original patch. Sadly, we _did_ notice it[1] but it looks like we
> > never fixed the problem. That is even more disturbing.
> 
> Yes indeed.
> 
> I try to pay attention to "this is broken because..."  comments in
> discussions to make a note in my copy of "What's cooking" report for
> a problematic topic, as that is where I work from when merging
> topics down, but apparently that procedure failed work in this case.
> There needs a stronger mechanism to stop a known-buggy patch from
> going thru, but I am not sure offhand what that should be.

I was the one who saw the bug. I could have followed the series more
closely to make sure my concern was addressed. Or possibly pointed out
the bug more prominently than an in a "PS" as part of the discussion.

I think part of the problem was that this particular series was
large-ish and involved a lot of re-rolls, and I got sick of looking at
it. I dunno.

It's also true that our error rate will never be 0%. So some bugs will
always slip through, some review comments will be forgotten, etc. Eric
did find and fix the bug just now, so the "many eyes" theory did work
here eventually.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-27 14:36       ` Jeff King
@ 2016-06-27 16:49         ` Stefan Beller
  2016-06-27 19:17           ` Jeff King
  2016-06-27 20:13           ` Eric Wong
  0 siblings, 2 replies; 17+ messages in thread
From: Stefan Beller @ 2016-06-27 16:49 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Eric Wong, git@vger.kernel.org, Johannes Sixt

On Mon, Jun 27, 2016 at 7:36 AM, Jeff King <peff@peff.net> wrote:
> On Mon, Jun 27, 2016 at 06:02:12AM -0700, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>>
>> > I also wondered how we managed to miss such an obvious point in review
>> > of the original patch. Sadly, we _did_ notice it[1] but it looks like we
>> > never fixed the problem. That is even more disturbing.
>>
>> Yes indeed.
>>
>> I try to pay attention to "this is broken because..."  comments in
>> discussions to make a note in my copy of "What's cooking" report for
>> a problematic topic, as that is where I work from when merging
>> topics down, but apparently that procedure failed work in this case.
>> There needs a stronger mechanism to stop a known-buggy patch from
>> going thru, but I am not sure offhand what that should be.
>
> I was the one who saw the bug. I could have followed the series more
> closely to make sure my concern was addressed. Or possibly pointed out
> the bug more prominently than an in a "PS" as part of the discussion.

I thought I would have fixed that bug, but apparently I did not.
(I agreed on the bug being there at the time of discussion [1], so I
guess I can be
blamed most for this failure)

[1] http://thread.gmane.org/gmane.comp.version-control.git/282514/focus=282694

>
> I think part of the problem was that this particular series was
> large-ish and involved a lot of re-rolls, and I got sick of looking at
> it. I dunno.

I haven't send patches to git for quite a while now, but writing smaller patches
is the way to go for me. (I am currently looking at the repo tool,
that has no test
suite, so there too I try to make very small patches.)

>
> It's also true that our error rate will never be 0%. So some bugs will
> always slip through, some review comments will be forgotten, etc. Eric
> did find and fix the bug just now, so the "many eyes" theory did work
> here eventually.

Eric, thanks for catching and fixing the bug!

Quite a while ago, when I started doing code reviews professionally, I wondered
if the code review procedure can be semi-automated, as automation helps keeping
the error rate low. By that I mean having a check list which I can
check off each point
for each patch. That seems to be very good in theory, but when trying
it I was finding
myself doing a lot of unneeded work as some points of such a check
list just do not
apply for a specific patch. So I did not follow through with that.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-27 16:49         ` Stefan Beller
@ 2016-06-27 19:17           ` Jeff King
  2016-06-27 19:43             ` Stefan Beller
  2016-06-27 20:13           ` Eric Wong
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff King @ 2016-06-27 19:17 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Eric Wong, git@vger.kernel.org, Johannes Sixt

On Mon, Jun 27, 2016 at 09:49:06AM -0700, Stefan Beller wrote:

> Quite a while ago, when I started doing code reviews professionally, I
> wondered if the code review procedure can be semi-automated, as
> automation helps keeping the error rate low. By that I mean having a
> check list which I can check off each point for each patch. That seems
> to be very good in theory, but when trying it I was finding myself
> doing a lot of unneeded work as some points of such a check list just
> do not apply for a specific patch. So I did not follow through with
> that.

I have wondered, too, if we could have better tooling to help us with
reviews. But one of the things I really _like_ about doing reviews for
git (versus other projects) is that doing review via email is
unconstrained. The primary recipient is human, and I can format and say
whatever I like in the way that best communicates to the human, without
worrying about fitting my comments into a pre-made form.

That being said, I suspect one could go a long way by picking out basic
patterns from emailed responses. For example, you could imagine a system
that makes a todo list of review comments (one comment per response to a
quoted section) and associates them with given bits of the code (by
seeing what's in the quoted section). That todo list can become a
checklist when sending out the next revision, or could even be used
interactively to see what happened to each code spot (did you fix it?
How? In which commit?). That would help reviewers, but also would help
submitters send out the cover letter for the next version (by reminding
them what to mention).

Of course, none of that would have helped my comment, which was in a
"PS" several emails deep in a discussion thread. ;)

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-27 19:17           ` Jeff King
@ 2016-06-27 19:43             ` Stefan Beller
  2016-06-27 19:51               ` Jeff King
  0 siblings, 1 reply; 17+ messages in thread
From: Stefan Beller @ 2016-06-27 19:43 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Eric Wong, git@vger.kernel.org, Johannes Sixt

On Mon, Jun 27, 2016 at 12:17 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Jun 27, 2016 at 09:49:06AM -0700, Stefan Beller wrote:
>
>> Quite a while ago, when I started doing code reviews professionally, I
>> wondered if the code review procedure can be semi-automated, as
>> automation helps keeping the error rate low. By that I mean having a
>> check list which I can check off each point for each patch. That seems
>> to be very good in theory, but when trying it I was finding myself
>> doing a lot of unneeded work as some points of such a check list just
>> do not apply for a specific patch. So I did not follow through with
>> that.
>
> I have wondered, too, if we could have better tooling to help us with
> reviews. But one of the things I really _like_ about doing reviews for
> git (versus other projects) is that doing review via email is
> unconstrained. The primary recipient is human, and I can format and say
> whatever I like in the way that best communicates to the human, without
> worrying about fitting my comments into a pre-made form.

Maybe I did not mean automation, but rather a more structured approach,
in the review itself. (I just realize this is a slightly different
problem than what
we saw here).

Here we had a case of:

   1) Jeff: "Wait, there is a bug, please fix!"
   2) Stefan: "ok will do."
   3) Stefan doesn't do it,
   4) Jeff doesn't notice or is tired of doing a review.
   5) Junio picks up the buggy version.

There are a few issues:

    1) How did you spot the bug? ("Experience/Logical thinking" is the hand
      wavy answer, but if you had a list like
      [ ] check for mem leak
      [ ] check for futureproof design
      [ ] check for failure modes (What if a syscall fails?")
      [ ] ... List is not complete, but has some made up points

    While "Experience/Logical thinking" is a very good approximation
    I find myself not thinking of all these details all the time, so sometimes
    I would fail at step 1 already.

    2) I used to send out only "done"s, i.e. when I already fixed the
issue, instead
    of acknowledging the problem and postponing the fix for later.
I'll revert to that.

    3) Optimally here I would have a list of all reviewers suggestions and can
    check these one by one if I really addressed all of them and not forgot one.

    4) If the list from 3) was available, it becomes much easier in a
follow up review
    to see how the author did address each issue.

    5) likewise as 4, just with less attention to the detail. (Junio
trusts Jeff that he checked
    that Stefan addressed all issues in Patch v1)

>
> That being said, I suspect one could go a long way by picking out basic
> patterns from emailed responses. For example, you could imagine a system
> that makes a todo list of review comments (one comment per response to a
> quoted section) and associates them with given bits of the code (by
> seeing what's in the quoted section). That todo list can become a
> checklist when sending out the next revision, or could even be used
> interactively to see what happened to each code spot (did you fix it?
> How? In which commit?). That would help reviewers, but also would help
> submitters send out the cover letter for the next version (by reminding
> them what to mention).

I agree that would help with 3 and 4 in the case above.

>
> Of course, none of that would have helped my comment, which was in a
> "PS" several emails deep in a discussion thread. ;)

Maybe a "P.S." would get its own point in the todo list not associated
with code?
Then it would still be on the radar.

Stefan

>
> -Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-27 19:43             ` Stefan Beller
@ 2016-06-27 19:51               ` Jeff King
  0 siblings, 0 replies; 17+ messages in thread
From: Jeff King @ 2016-06-27 19:51 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Eric Wong, git@vger.kernel.org, Johannes Sixt

On Mon, Jun 27, 2016 at 12:43:32PM -0700, Stefan Beller wrote:

> There are a few issues:
> 
>     1) How did you spot the bug? ("Experience/Logical thinking" is the hand
>       wavy answer, but if you had a list like
>       [ ] check for mem leak
>       [ ] check for futureproof design
>       [ ] check for failure modes (What if a syscall fails?")
>       [ ] ... List is not complete, but has some made up points

I use the hand-wavy approach to reviews, and I'm pretty sure that's how
I saw your bug (just thinking about what the code would do, how I would
write it, and then noticing a discrepancy).

I suspect a checklist could make things more thorough, but I think it
can also discourage deep thinking. Ideally we have some reviewers who
are checklist-oriented looking for those sorts of details and some who
are not.

>     2) I used to send out only "done"s, i.e. when I already fixed the
>     issue, instead of acknowledging the problem and postponing the fix
>     for later.  I'll revert to that.

Yes, I tend to use the review thread as a todo list, and do a single
session where I read over all of the review mails again (even if I've
read and responded to them already), fixing or adjusting the code. Sort
of a human version of the automated tool I described. :)

> > Of course, none of that would have helped my comment, which was in a
> > "PS" several emails deep in a discussion thread. ;)
> 
> Maybe a "P.S." would get its own point in the todo list not associated
> with code?  Then it would still be on the radar.

Yes, there are definitely design points that are not about a specific
part of the code, and you would want to somehow represent them in your
todo list.

There is a danger, though, of having a bunch of false positives added to
your list: tangent discussion that distracts you from the actual review.
And I think the bug here being in a PS is less relevant than being
buried amidst other discussion. That made it hard for a tool _or_ a
human to find.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-27 16:49         ` Stefan Beller
  2016-06-27 19:17           ` Jeff King
@ 2016-06-27 20:13           ` Eric Wong
  2016-06-27 21:49             ` Jeff King
  1 sibling, 1 reply; 17+ messages in thread
From: Eric Wong @ 2016-06-27 20:13 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Jeff King, Junio C Hamano, git@vger.kernel.org, Johannes Sixt

Stefan Beller <sbeller@google.com> wrote:
> On Mon, Jun 27, 2016 at 7:36 AM, Jeff King <peff@peff.net> wrote:
> > It's also true that our error rate will never be 0%. So some bugs will
> > always slip through, some review comments will be forgotten, etc. Eric
> > did find and fix the bug just now, so the "many eyes" theory did work
> > here eventually.
> 
> Eric, thanks for catching and fixing the bug!

No problem :)  I only noticed it because I was scanning emails
randomly and Duy and David's index-helper thread turned up.

> Quite a while ago, when I started doing code reviews professionally, I wondered
> if the code review procedure can be semi-automated, as automation helps keeping
> the error rate low. By that I mean having a check list which I can
> check off each point

Maybe a test case or even a small unit test would've helped.
I didn't notice the problem in xread until:

1) I copied the code into xwrite
2) s/POLLIN/POLLOUT/;
3) forced EAGAIN using a patched, home-baked HTTP server

The biggish comment before the poll() obscured the missing
"continue" for me.  I read xread() before and did not notice
the missing "continue".

Maybe the following optional patch on top of this series
improves readability:

----------8<--------
Subject: [PATCH 3/2] hoist out io_wait function for xread and xwrite

At least for me, this improves the readability of xread and
xwrite; hopefully allowing missing "continue" statements to
be spotted more easily.

Signed-off-by: Eric Wong <e@80x24.org>
---
 wrapper.c | 40 ++++++++++++++++------------------------
 1 file changed, 16 insertions(+), 24 deletions(-)

diff --git a/wrapper.c b/wrapper.c
index d973f86..04bb952 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -227,6 +227,20 @@ int xopen(const char *path, int oflag, ...)
 	}
 }
 
+static void io_wait(int fd, short poll_events)
+{
+	struct pollfd pfd;
+
+	pfd.fd = fd;
+	pfd.events = poll_events;
+
+	/*
+	 * no need to check for errors, here;
+	 * a subsequent read/write will detect unrecoverable errors
+	 */
+	poll(&pfd, 1, -1);
+}
+
 /*
  * xread() is the same a read(), but it automatically restarts read()
  * operations with a recoverable error (EAGAIN and EINTR). xread()
@@ -243,18 +257,7 @@ ssize_t xread(int fd, void *buf, size_t len)
 			if (errno == EINTR)
 				continue;
 			if (errno == EAGAIN || errno == EWOULDBLOCK) {
-				struct pollfd pfd;
-				pfd.events = POLLIN;
-				pfd.fd = fd;
-				/*
-				 * it is OK if this poll() failed; we
-				 * want to leave this infinite loop
-				 * only when read() returns with
-				 * success, or an expected failure,
-				 * which would be checked by the next
-				 * call to read(2).
-				 */
-				poll(&pfd, 1, -1);
+				io_wait(fd, POLLIN);
 				continue;
 			}
 		}
@@ -278,18 +281,7 @@ ssize_t xwrite(int fd, const void *buf, size_t len)
 			if (errno == EINTR)
 				continue;
 			if (errno == EAGAIN || errno == EWOULDBLOCK) {
-				struct pollfd pfd;
-				pfd.events = POLLOUT;
-				pfd.fd = fd;
-				/*
-				 * it is OK if this poll() failed; we
-				 * want to leave this infinite loop
-				 * only when write() returns with
-				 * success, or an expected failure,
-				 * which would be checked by the next
-				 * call to write(2).
-				 */
-				poll(&pfd, 1, -1);
+				io_wait(fd, POLLOUT);
 				continue;
 			}
 		}
-- 
EW


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-27 20:13           ` Eric Wong
@ 2016-06-27 21:49             ` Jeff King
  2016-06-27 22:22               ` Jeff King
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2016-06-27 21:49 UTC (permalink / raw)
  To: Eric Wong
  Cc: Stefan Beller, Junio C Hamano, git@vger.kernel.org, Johannes Sixt

On Mon, Jun 27, 2016 at 08:13:11PM +0000, Eric Wong wrote:

> > Quite a while ago, when I started doing code reviews professionally, I wondered
> > if the code review procedure can be semi-automated, as automation helps keeping
> > the error rate low. By that I mean having a check list which I can
> > check off each point
> 
> Maybe a test case or even a small unit test would've helped.
> I didn't notice the problem in xread until:
> 
> 1) I copied the code into xwrite
> 2) s/POLLIN/POLLOUT/;
> 3) forced EAGAIN using a patched, home-baked HTTP server
> 
> The biggish comment before the poll() obscured the missing
> "continue" for me.  I read xread() before and did not notice
> the missing "continue".

Here's an easier reproduction:

	nonblock () {
		perl -e '
			use Fcntl;
			fcntl(STDIN, F_SETFL, Fcntl::O_NONBLOCK);
			exec @ARGV;
		' "$@"
	}
	{
	  sleep 1
	  git pack-objects --all --stdout </dev/null
	} | nonblock git index-pack --stdin

Or even simpler:

        sleep 1 | nonblock git index-pack --stdin

The first one is nicer because it shows in the working case that we
actually do eventually read the input, as opposed to just getting EOF.
Prior to v2.8.0, or current master with your patch applied, it works
fine.

It turned out to be harder than I thought to find somebody who calls
xread() on stdin.  My first attempt was:

	{
		printf 'HE'
		sleep 1
		printf 'AD\n'
	} |
	nonblock git cat-file --batch-check

but this ends up in strbuf_getwholeline(), which uses getdelim(). Much to
my disappointment, getdelim() does not handle this case transparently;
it will quietly return the first two bytes (though with errno set to
EAGAIN), and we process bogus input.

So in general I would say that handing non-blocking descriptors to git
is not safe. I think it's possible to loop on getdelim() when we see
EAGAIN, but I'm not sure if it's worth it.

> Maybe the following optional patch on top of this series
> improves readability:
> 
> ----------8<--------
> Subject: [PATCH 3/2] hoist out io_wait function for xread and xwrite
> 
> At least for me, this improves the readability of xread and
> xwrite; hopefully allowing missing "continue" statements to
> be spotted more easily.

IMHO the end result is much nicer with this patch.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-27 21:49             ` Jeff King
@ 2016-06-27 22:22               ` Jeff King
  2016-06-27 23:19                 ` Eric Wong
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2016-06-27 22:22 UTC (permalink / raw)
  To: Eric Wong
  Cc: Stefan Beller, Junio C Hamano, git@vger.kernel.org, Johannes Sixt

On Mon, Jun 27, 2016 at 05:49:48PM -0400, Jeff King wrote:

> So in general I would say that handing non-blocking descriptors to git
> is not safe. I think it's possible to loop on getdelim() when we see
> EAGAIN, but I'm not sure if it's worth it.

The patch for that is below, for the curious. It works with even this:

  {
    for i in H E A D; do
      sleep 1
      printf $i
    done
    printf "\n"
  } | nonblock git cat-file --batch-check

Note that I folded the "did we see EAGAIN" check into my sub-function
(which is the equivalent of your io_wait). You might want to do that in
the xread() code path, too, as it makes the resulting code there a very
nice:

  if (errno == EINTR)
	continue;
  if (handle_nonblock(fd, POLLIN))
	continue;

---
diff --git a/strbuf.c b/strbuf.c
index 1ba600b..2147873 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -452,6 +452,38 @@ int strbuf_getcwd(struct strbuf *sb)
 	return -1;
 }
 
+int handle_nonblock(FILE *fp, short poll_events)
+{
+	if (ferror(fp) && (errno == EAGAIN || errno == EWOULDBLOCK)) {
+		struct pollfd pfd;
+
+		pfd.events = poll_events;
+		pfd.fd = fileno(fp);
+		poll(&pfd, 1, -1);
+		clearerr(fp);
+		return 1;
+	}
+	return 0;
+}
+
+static int getline_stdio_loop(struct strbuf *sb, FILE *fp, int term)
+{
+	int ch;
+	do {
+		errno = 0;
+		flockfile(fp);
+		while ((ch = getc_unlocked(fp)) != EOF) {
+			if (!strbuf_avail(sb))
+				strbuf_grow(sb, 1);
+			sb->buf[sb->len++] = ch;
+			if (ch == term)
+				break;
+		}
+		funlockfile(fp);
+	} while (handle_nonblock(fp, POLLIN));
+	return ch;
+}
+
 #ifdef HAVE_GETDELIM
 int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
 {
@@ -465,12 +497,31 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
 	/* Translate slopbuf to NULL, as we cannot call realloc on it */
 	if (!sb->alloc)
 		sb->buf = NULL;
+
+	errno = 0;
 	r = getdelim(&sb->buf, &sb->alloc, term, fp);
 
 	if (r > 0) {
 		sb->len = r;
-		return 0;
 	}
+
+	/*
+	 * getdelim will return characters to us even if it sees EAGAIN;
+	 * we want to read all the way to an actual delimiter or EOF.
+	 *
+	 * We can't just loop on getdelim, though, as it wants to write
+	 * the whole buffer itself. So fall back to a stdio loop, which
+	 * can incrementally add to our strbuf.
+	 */
+	if (handle_nonblock(fp, POLLIN)) {
+		getline_stdio_loop(sb, fp, term);
+		/* Propagate real errors */
+		if (ferror(fp))
+			r = -1;
+	}
+	if (r > 0)
+		return 0;
+
 	assert(r == -1);
 
 	/*
@@ -507,15 +558,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
 		return EOF;
 
 	strbuf_reset(sb);
-	flockfile(fp);
-	while ((ch = getc_unlocked(fp)) != EOF) {
-		if (!strbuf_avail(sb))
-			strbuf_grow(sb, 1);
-		sb->buf[sb->len++] = ch;
-		if (ch == term)
-			break;
-	}
-	funlockfile(fp);
+	ch = getline_stdio_loop(sb, fp, term);
 	if (ch == EOF && sb->len == 0)
 		return EOF;
 

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK
  2016-06-27 22:22               ` Jeff King
@ 2016-06-27 23:19                 ` Eric Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Wong @ 2016-06-27 23:19 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, Junio C Hamano, git@vger.kernel.org, Johannes Sixt

Jeff King <peff@peff.net> wrote:
> On Mon, Jun 27, 2016 at 05:49:48PM -0400, Jeff King wrote:
> 
> > So in general I would say that handing non-blocking descriptors to git
> > is not safe.

Indeed.  This also makes me wonder if our output to stdout/stderr
suffer from the same theoretical problems if given non-blocking
outputs; I suspect they do.

>>  I think it's possible to loop on getdelim() when we see
>> EAGAIN, but I'm not sure if it's worth it.

> The patch for that is below, for the curious. It works with even this:
> 
>   {
>     for i in H E A D; do
>       sleep 1
>       printf $i
>     done
>     printf "\n"
>   } | nonblock git cat-file --batch-check
> 
> Note that I folded the "did we see EAGAIN" check into my sub-function
> (which is the equivalent of your io_wait). You might want to do that in
> the xread() code path, too, as it makes the resulting code there a very
> nice:
> 
>   if (errno == EINTR)
> 	continue;
>   if (handle_nonblock(fd, POLLIN))
> 	continue;

Yes :)

> +int handle_nonblock(FILE *fp, short poll_events)
> +{
> +	if (ferror(fp) && (errno == EAGAIN || errno == EWOULDBLOCK)) {

<snip>

> +		clearerr(fp);
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +static int getline_stdio_loop(struct strbuf *sb, FILE *fp, int term)
> +{
> +	int ch;
> +	do {
> +		errno = 0;
> +		flockfile(fp);
> +		while ((ch = getc_unlocked(fp)) != EOF) {

<snip>

> +		}
> +		funlockfile(fp);
> +	} while (handle_nonblock(fp, POLLIN));

I haven't used stdio in a while and I'm glad :)
Error handling with ferror + clearerr + errno checking is
difficult and error-prone.

Linus said this back in 2006, too:
http://mid.gmane.org/Pine.LNX.4.64.0609141023130.4388@g5.osdl.org

So I wonder if we're better off relying entirely on xread/xwrite
+ strbuf for all our buffering.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-06-27 23:20 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-26 23:21 [PATCH 0/2] wrapper: xread/xwrite fixes for non-blocking FDs Eric Wong
2016-06-26 23:21 ` [PATCH 1/2] xread: retry after poll on EAGAIN/EWOULDBLOCK Eric Wong
2016-06-26 23:42   ` Jeff King
2016-06-27 13:02     ` Junio C Hamano
2016-06-27 14:36       ` Jeff King
2016-06-27 16:49         ` Stefan Beller
2016-06-27 19:17           ` Jeff King
2016-06-27 19:43             ` Stefan Beller
2016-06-27 19:51               ` Jeff King
2016-06-27 20:13           ` Eric Wong
2016-06-27 21:49             ` Jeff King
2016-06-27 22:22               ` Jeff King
2016-06-27 23:19                 ` Eric Wong
2016-06-26 23:51   ` Jeff King
2016-06-27  3:56     ` [PATCHv2 " Eric Wong
2016-06-26 23:21 ` [PATCH 2/2] xwrite: poll on non-blocking FDs Eric Wong
2016-06-26 23:49   ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).