From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Git mailing list <git@vger.kernel.org>, Jeff King <peff@peff.net>,
Clemens Buchacher <drizzd@gmx.net>
Subject: Re: t5570-git-daemon fails with SIGPIPE on OSX
Date: Fri, 8 Feb 2019 13:54:35 +0100
Message-ID: <20190208125435.GW10587@szeder.dev>
In-Reply-To: <nycvar.QRO.7.76.6.1902080928500.41@tvgsbejvaqbjf.bet>

On Fri, Feb 08, 2019 at 09:32:32AM +0100, Johannes Schindelin wrote:
> Team,
>
> On Mon, 6 Aug 2018, SZEDER Gábor wrote:
>
> > [Resending with Clemens' last used email address.
> > Clemens, please consider sending a patch to update our .mailmap file.]
> >
> >
> > On Mon, Aug 6, 2018 at 5:11 PM SZEDER Gábor <szeder.dev@gmail.com> wrote:
> > >
> > > Travis CI changed its default OSX image to use XCode 9.4 on 2018-07-31
> > > [1]. Since then OSX build jobs fail rather frequently because of a
> > > SIGPIPE in the tests 'fetch notices corrupt pack' or 'fetch notices
> > > corrupt idx' in 't5570-git-daemon.sh' [2]. I think this is a symptom
> > > of a real bug in Git affecting other platforms as well, but these tests
> > > are too lax to catch it.
>
> I am seeing this very frequently now; it feels like it fails in the
> Azure Pipeline about half of the time.

I was wondering whether it's only an issue on Travis CI, and was
waiting to see whether you would complain about it :) Evidently it is
not, but I would still like to see a report from macOS running directly
on someone's Mac hardware, i.e. without all the CI/cloud/whatnot magic.

> Any ideas how to fix this test, anyone?

I'm afraid that this is not merely an issue with the test, but a
platform issue that we should work around somehow. I would have
suggested following up on what Peff suggested, but I see that you
already did, and it didn't work out...

In the meantime, for lack of a better option, I started to skip the
two failure-prone tests in the OSX build jobs in my automated CI
builds with:

-- >8 --
Subject: [PATCH] travis-ci: skip flaky tests in 't5570-git-daemon.sh' in OSX
 build jobs

See: https://public-inbox.org/git/CAM0VKj=MCS+cmOgzf_XyPeb+qZrFmuMH52-PV_NDMZA9X+rRoA@mail.gmail.com/

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 ci/lib-travisci.sh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/ci/lib-travisci.sh b/ci/lib-travisci.sh
index 69dff4d1ec..1cd2c1db7d 100755
--- a/ci/lib-travisci.sh
+++ b/ci/lib-travisci.sh
@@ -121,7 +121,9 @@ osx-clang|osx-gcc)
 	# t9810 occasionally fails on Travis CI OS X
 	# t9816 occasionally fails with "TAP out of sequence errors" on
 	# Travis CI OS X
-	export GIT_SKIP_TESTS="t9810 t9816"
+	# In 't5570-git-daemon.sh', the tests 'fetch notices corrupt pack'
+	# and 'fetch notices corrupt idx' fail rather frequently.
+	export GIT_SKIP_TESTS="t9810 t9816 t5570.9 t5570.10"
 	;;
 GIT_TEST_GETTEXT_POISON)
 	export GIT_TEST_GETTEXT_POISON=YesPlease
--
2.20.1.940.g8404bb2d1a
-- >8 --

> > > What it boils down to is this sequence:
> > >
> > > - The test first prepares a repository containing a corrupt pack,
> > > ready to be served via 'git daemon'.
> > >
> > > - Then the test runs 'test_must_fail git fetch ....', which connects
> > > to 'git daemon', which forks 'git upload-pack', which then
> > > advertises refs (only HEAD) and capabilities. So far so good.
> > >
> > > - 'git fetch' eventually calls fetch-pack.c:find_common(). The
> > > first half of this function assembles a request consisting of a
> > > want and a flush pkt-line, and sends it via a send_request() call.
> > >
> > > At this point the scheduling becomes important: let's suppose that
> > > fetch is slow and upload-pack is fast.
> > >
> > > - 'git upload-pack' receives the request, parses the want line,
> > > notices the corrupt pack, responds with an 'ERR upload-pack: not
> > > our ref' pkt-line, and die()s right away.
> > >
> > > - 'git fetch' finally approaches the end of the function, where it
> > > attempts to send a done pkt-line via another send_request() call
> > > through the now closing TCP socket.
> > >
> > > - What happens now seems to depend on the platform:
> > >
> > > - On Linux, both on my machine and on Travis CI, it shows textbook
> > > example behaviour: write() returns with error and sets errno to
> > > ECONNRESET. Since it happens in write_or_die(), 'git fetch'
> > > die()s with 'fatal: write error: Connection reset by peer', and
> > > doesn't show the error sent by 'git upload-pack'; how could it,
> > > it doesn't even get as far as receiving upload-pack's ERR
> > > pkt-line.
> > >
> > > The test only checks that 'git fetch' fails, but it doesn't
> > > check whether it failed with the right error message, so the
> > > test still succeeds. Had it checked the error message as well,
> > > we would most likely have noticed this issue already; it doesn't
> > > happen all that rarely.
> > >
> > > - On the new OSX images with XCode 9.4 on Travis CI the write()
> > > triggers SIGPIPE right away, and 'test_must_fail' notices it and
> > > fails the test. I couldn't see any sign of an ECONNRESET or any
> > > other error that we could act upon to avoid the SIGPIPE.
> > >
> > > - On OSX with XCode 9.2 on Travis CI there is neither SIGPIPE, nor
> > > ECONNRESET, but sending the request actually succeeds even
> > > though there is no process on the other end of the socket
> > > anymore. 'git fetch' then simply continues execution, reads and
> > > parses the ERR pkt-line, and then die()s with 'fatal: remote
> > > error: upload-pack: not our ref'. So, on the face of it, it
> > > shows the desired behaviour, but I have no idea how that write()
> > > could succeed instead of returning error.
> > >
> > > I don't know what happens on a real Mac as I don't have access to one;
> > > I figured out all the above by enabling packet tracing, adding a
> > > couple of well placed tracing printf() and sleep() calls, running a
> > > bunch of builds on Travis CI, and looking through their logs. But
> > > without access to a debugger and netstat and what not I can't really
> > > go any further. So I would now happily pass the baton to those who
> > > have a Mac and know a thing or two about its porting issues to first
> > > check whether OSX on a real Mac shows the same behaviour as it does in
> > > Travis CI's virtualized(?) environment. And then they can pass the
> > > baton to those who know all the intricacies of the pack protocol and
> > > its implementation to decide what to do with this issue.
> > >
> > > For a mostly reliable reproduction recipe you might want to fetch this
> > > branch:
> > >
> > > https://github.com/szeder/git t5570-git-daemon-sigpipe
> > >
> > > and then run 'make && cd t && ./t5570-git-daemon.sh -v -x'
> > >
> > >
> > > Have fun! ;)
> > >
> > >
> > > 1 - https://blog.travis-ci.com/2018-07-19-xcode9-4-default-announce
> > >
> > > 2 - On git.git's master:
> > > https://travis-ci.org/git/git/jobs/411517552#L2717
> >
> >
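
The race described in the quoted sequence can be played out in miniature
without Git at all. What follows is only a sketch under stated
assumptions: write_after_peer_exit() is a hypothetical helper, not a
function in Git, and it uses a socketpair() in place of the TCP
connection to 'git daemon', so the errno it sees (typically EPIPE) need
not match the ECONNRESET reported for Linux above. It ignores SIGPIPE
around the write as one conceivable way to turn the signal into an
ordinary error, and then shows that the peer's ERR line can still be
read afterwards.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of Git.  The "server" end sends an ERR
 * line and closes right away; the "client" end then writes a "done"
 * pkt-line to the dead peer and finally reads whatever is pending.
 *
 * Returns the number of bytes read into buf (or -1 on failure) and
 * stores the errno of the failed write in *write_errno (0 if the write
 * succeeded).
 */
ssize_t write_after_peer_exit(char *buf, size_t buflen, int *write_errno)
{
	static const char err[] = "ERR upload-pack: not our ref";
	static const char done[] = "0009done\n";
	void (*old_handler)(int);
	int fd[2];
	ssize_t n;

	assert(buf && buflen > 0 && write_errno);

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd) < 0)
		return -1;

	/* "upload-pack": report the error, then go away. */
	if (write(fd[1], err, strlen(err)) < 0) {
		close(fd[0]);
		close(fd[1]);
		return -1;
	}
	close(fd[1]);

	/* With the default disposition the write below raises SIGPIPE. */
	old_handler = signal(SIGPIPE, SIG_IGN);

	/* "fetch": send "done" to a peer that no longer exists. */
	*write_errno = 0;
	if (write(fd[0], done, strlen(done)) < 0)
		*write_errno = errno;

	/* The ERR line queued before the close is still readable. */
	n = read(fd[0], buf, buflen - 1);
	if (n >= 0)
		buf[n] = '\0';

	signal(SIGPIPE, old_handler);
	close(fd[0]);
	return n;
}
```

A caller would pass a buffer and an int, then inspect both results: a
non-zero *write_errno together with a buffer holding the ERR line means
the write failed but the remote's message was not lost. Whether real
TCP sockets on macOS behave the same way is exactly the open question
in this thread, so treat this as an illustration only.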