git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Git mailing list <git@vger.kernel.org>
Cc: Jeff King <peff@peff.net>, Clemens Buchacher <drizzd@aon.at>
Subject: t5570-git-daemon fails with SIGPIPE on OSX
Date: Mon, 6 Aug 2018 17:11:13 +0200	[thread overview]
Message-ID: <CAM0VKj=MCS+cmOgzf_XyPeb+qZrFmuMH52-PV_NDMZA9X+rRoA@mail.gmail.com> (raw)

Travis CI changed its default OSX image to use XCode 9.4 on 2018-07-31
[1].  Since then OSX build jobs fail rather frequently because of a
SIGPIPE in the tests 'fetch notices corrupt pack' or 'fetch notices
corrupt idx' in 't5570-git-daemon.sh' [2].  I think this is a symptom
a real bug in Git affecting other platforms as well, but these tests
are too lax to catch it.

What it boils down to is this sequence:

  - The test first prepares a repository containing a corrupt pack,
    ready to be server via 'git daemon'.

  - Then the test runs 'test_must_fail git fetch ....', which connects
    to 'git daemon', which forks 'git upload-pack', which then
    advertises refs (only HEAD) and capabilities.  So far so good.

  - 'git fetch' eventually calls fetch-pack.c:find_common().  The
    first half of this function assembles a request consisting of a
    want and a flush pkt-line, and sends it via a send_request() call.

    At this point the scheduling becomes important: let's suppose that
    fetch is slow and upload-pack is fast.

  - 'git upload-pack' receives the request, parses the want line,
    notices the corrupt pack, responds with an 'ERR upload-pack: not
    our ref' pkt-line, and die()s right away.

  - 'git fetch' finally approaches the end of the function, where it
    attempts to send a done pkt-line via another send_request() call
    through the now closing TCP socket.

  - What happens now seems to depend on the platform:

    - On Linux, both on my machine and on Travis CI, it shows textbook
      example behaviour: write() returns with error and sets errno to
      ECONNRESET.  Since it happens in write_or_die(), 'git fetch'
      die()s with 'fatal: write error: Connection reset by peer', and
      doesn't show the error send by 'git upload-pack'; how could it,
      it doesn't even get as far to receive upload-pack's ERR
      pkt-line.

      The test only checks that 'git fetch' fails, but it doesn't
      check whether it failed with the right error message, so the
      test still succeeds.  Had it checked the error message as well,
      we most likely had noticed this issue already, it doesn't happen
      all that rarely.

    - On the new OSX images with XCode 9.4 on Travis CI the write()
      triggers SIGPIPE right away, and 'test_must_fail' notices it and
      fails the test.  I couldn't see any sign of an ECONNRESET or any
      other error that we could act upon to avoid the SIGPIPE.

    - On OSX with XCode 9.2 on Travis CI there is neither SIGPIPE, nor
      ECONNRESET, but sending the request actually succeeds even
      though there is no process on the other end of the socket
      anymore.  'git fetch' then simply continues execution, reads and
      parses the ERR pkt-line, and then dies()s with 'fatal: remote
      error: upload-pack: not our ref'.  So, on the face of it, it
      shows the desired behaviour, but I have no idea how that write()
      could succeed instead of returning error.

I don't know what happens on a real Mac as I don't have access to one;
I figured out all the above by enabling packet tracing, adding a
couple of well placed tracing printf() and sleep() calls, running a
bunch of builds on Travis CI, and looking through their logs.  But
without access to a debugger and netstat and what not I can't really
go any further.  So I would now happily pass the baton to those who
have a Mac and know a thing or two about its porting issues to first
check whether OSX on a real Mac shows the same behaviour as it does in
Travis CI's virtualized(?) environment.  And then they can pass the
baton to those who know all the intricacies of the pack protocol and
its implementation to decide what to do with this issue.

For a mostly reliable reproduction recipe you might want to fetch this
branch:

  https://github.com/szeder/git t5570-git-daemon-sigpipe

and then run 'make && cd t && ./t5570-git-daemon.sh -v -x'


Have fun! ;)


1 - https://blog.travis-ci.com/2018-07-19-xcode9-4-default-announce

2 - On git.git's master:
    https://travis-ci.org/git/git/jobs/411517552#L2717

             reply	other threads:[~2018-08-06 15:11 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-06 15:11 SZEDER Gábor [this message]
2018-08-06 15:31 ` t5570-git-daemon fails with SIGPIPE on OSX SZEDER Gábor
2019-02-08  8:32   ` Johannes Schindelin
2019-02-08 12:54     ` SZEDER Gábor
2018-08-14 22:32 ` Jeff King
2018-08-14 22:37   ` Jeff King
2019-02-08  9:02   ` Johannes Schindelin
2019-02-08  9:28     ` Johannes Schindelin
2019-02-08 19:54       ` Jeff King
2019-03-01 15:02         ` Johannes Schindelin
2019-03-01 19:00           ` Jeff King
2019-03-02 21:21             ` Johannes Schindelin
2019-03-03 16:54               ` Jeff King
2019-03-03 16:55                 ` [PATCH 1/2] fetch: avoid calling write_or_die() Jeff King
2019-03-04 13:42                   ` Duy Nguyen
2019-03-05  4:11                     ` Jeff King
2019-03-03 16:58                 ` [PATCH 2/2] fetch: ignore SIGPIPE during network operation Jeff King
2019-03-04  1:11                   ` Junio C Hamano
2019-03-05  4:11                     ` Jeff King
2019-03-03  1:21             ` t5570-git-daemon fails with SIGPIPE on OSX Junio C Hamano
2019-03-03 14:56               ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAM0VKj=MCS+cmOgzf_XyPeb+qZrFmuMH52-PV_NDMZA9X+rRoA@mail.gmail.com' \
    --to=szeder.dev@gmail.com \
    --cc=drizzd@aon.at \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).