git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Randall S. Becker" <rsbecker@nexbridge.com>
Cc: <git@vger.kernel.org>, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Bug/Solution] Git hangs in compat/poll/poll.c on HPE NonStop
Date: Fri, 29 Sep 2017 07:47:17 +0900	[thread overview]
Message-ID: <xmqqo9puzbii.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <009c01d3389c$f6c0cb00$e4426100$@nexbridge.com> (Randall S. Becker's message of "Thu, 28 Sep 2017 17:01:27 -0400")

[jch: a patch that hopefully is applicable is attached at the end;
 I'd appreciate input from Paolo, the contributor of the original
 upstream]

"Randall S. Becker" <rsbecker@nexbridge.com> writes:

> After a whole lot of investigating, we (it is a large "we") have discovered
> the reason for the hang we occasionally get in git-upload-pack on HPE
> NonStop servers - reported here well over a year ago. This resulted from a
> subtle check that the operating system does on file descriptors. When it
> sees random values in pfd[i].revents,...

What do you mean by "random values"?  Where do these random values
come from?  The original code the patch touches is _not_ setting
anything, so it is leaving it uninitialized and that is seen by the
system?  If that is the case perhaps should we clear it before we
call into this codepath?

> .... There is a non-destructive fix
> that I would like to propose for this that I have already tested.

I am not sure in what sense this is "non-destructive"; we left the
value as it was when we didn't notice anything happening in
compute_revents().  Now we explicitly destroy the value there was
(just like we do in the case where the corresponding file descriptor
was negative).  Maybe overwriting with 0 is the right thing, but I
wouldn't call it "non-destructive", though.  Puzzling...

> --- a/compat/poll/poll.c
> +++ b/compat/poll/poll.c
> @@ -438,6 +438,10 @@ poll (struct pollfd *pfd, nfds_t nfd, int timeout)
>             pfd[i].revents = happened;
>             rc++;
>           }
> +        else
> +          {
> +            pfd[i].revents = 0;
> +          }
>        }
>
>    return rc;

FYI, the upstream gnulib rewrites this part of the code with
d42461c3 ("poll: fixes for large fds", 2015-02-20) quite a bit, and
it reads like this:

+    {
+      pfd[i].revents = (pfd[i].fd < 0
+                        ? 0
+                        : compute_revents (pfd[i].fd, pfd[i].events,
+                                           &rfds, &wfds, &efds));
+      rc += pfd[i].revents != 0;
+    }

The code before your fix (and before d42461c3) used to assign to
pfd[i].revents only when compute_revents() gave us non-zero value,
but with d42461c3 in the upstream, it pfd[i].revents is assigned
the return value from compute_revents() even if the value returned
is 0.  And rc is incremented only when that value is non-zero.

The result of applying your patch is equivalent, so after all, I
tend to think that your patch is the right fix in the context of the
code we have (i.e. the older code we borrowed from them).

I wonder if we want to refresh the borrowed copy of poll.c to a
newer snapshot someday, but that is totally a separate topic.  At
least, I now know that your fix in this patch will not be broken due
to d42461c3 updating the code in a slightly different way ;-)

Randall, I forged your Sign-off in the patch below, but please say
it is OK, after reading Documentation/SubmittingPatches.

Thanks.

-- >8 --
From: Randall S. Becker <rsbecker@nexbridge.com>
Subject: poll.c: always set revents, even if to zero.

Match what happens to pfd[i].revents when compute_revents() returns
0 to the upstream gnulib's commit d42461c3 ("poll: fixes for large
fds", 2015-02-20).  The revents field is set to 0, without
incrementing the value rc to be returned from the function.

This fixes occasional hangs in git-upload-pack on NPE NonStop.

Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 compat/poll/poll.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/compat/poll/poll.c b/compat/poll/poll.c
index b10adc780f..ae03b74a6f 100644
--- a/compat/poll/poll.c
+++ b/compat/poll/poll.c
@@ -438,6 +438,10 @@ poll (struct pollfd *pfd, nfds_t nfd, int timeout)
 	    pfd[i].revents = happened;
 	    rc++;
 	  }
+	else
+	  {
+	    pfd[i].revents = 0;
+	  }
       }
 
   return rc;

  reply	other threads:[~2017-09-28 22:47 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-28 21:01 [Bug/Solution] Git hangs in compat/poll/poll.c on HPE NonStop Randall S. Becker
2017-09-28 22:47 ` Junio C Hamano [this message]
2017-09-29  8:41   ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqo9puzbii.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rsbecker@nexbridge.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).