From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 562011F461;
	Sun, 30 Jun 2019 07:41:46 +0000 (UTC)
Date: Sun, 30 Jun 2019 07:41:46 +0000
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH] examples/*@.service: sockets MUST be NonBlocking
Message-ID: <20190630074146.GA16199@dcvr>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
List-Id:

For users running multiple (-nntpd@1, -nntpd@2) instances of
either -httpd or -nntpd via systemd to implement zero-downtime
restarts, it's possible for a listen socket to become blocking
for a moment during an accept syscall and cause a daemon to
get stuck in a blocking accept() during
PublicInbox::Listener::event_step (event_read in previous
versions).

Since O_NONBLOCK is a file description flag, systemd clearing
O_NONBLOCK momentarily (before PublicInbox::Listener::new
re-enables it) creates a window for another instance of our
daemon to get stuck in accept().

cf. systemd.service(5)
---
Additional warnings and tests for this race coming; but I need food + sleep.
I kinda wished I'd pushed for accept4(..., SOCK_DONTWAIT):
https://lore.kernel.org/lkml/20150513023712.GA4206@dcvr.yhbt.net/

 examples/public-inbox-httpd@.service | 5 +++++
 examples/public-inbox-nntpd@.service | 5 +++++
 examples/unsubscribe-psgi@.service   | 5 +++++
 3 files changed, 15 insertions(+)

diff --git a/examples/public-inbox-httpd@.service b/examples/public-inbox-httpd@.service
index 56117ef..e811da4 100644
--- a/examples/public-inbox-httpd@.service
+++ b/examples/public-inbox-httpd@.service
@@ -20,7 +20,12 @@ ExecStartPre = /bin/mkdir -p -m 1777 /tmp/.pub-inline
 ExecStart = /usr/local/bin/public-inbox-httpd \
 -1 /var/log/public-inbox/httpd.out.log
 StandardError = syslog
+
+# NonBlocking is REQUIRED to avoid a race condition if running
+# simultaneous services
+NonBlocking = true
 Sockets = public-inbox-httpd.socket
+
 KillSignal = SIGQUIT
 User = nobody
 Group = nogroup
diff --git a/examples/public-inbox-nntpd@.service b/examples/public-inbox-nntpd@.service
index 62202c2..a879841 100644
--- a/examples/public-inbox-nntpd@.service
+++ b/examples/public-inbox-nntpd@.service
@@ -20,7 +20,12 @@ ExecStartPre = /bin/mkdir -p -m 1777 /tmp/.pub-inline
 ExecStart = /usr/local/bin/public-inbox-nntpd \
 -1 /var/log/public-inbox/nntpd.out.log
 StandardError = syslog
+
+# NonBlocking is REQUIRED to avoid a race condition if running
+# simultaneous services
+NonBlocking = true
 Sockets = public-inbox-nntpd.socket
+
 KillSignal = SIGQUIT
 User = nobody
 Group = nogroup
diff --git a/examples/unsubscribe-psgi@.service b/examples/unsubscribe-psgi@.service
index acc29e8..c8721fb 100644
--- a/examples/unsubscribe-psgi@.service
+++ b/examples/unsubscribe-psgi@.service
@@ -12,7 +12,12 @@ After = unsubscribe-psgi.socket
 # any PSGI server ought to work,
 # but public-inbox-httpd supports socket activation like unsubscribe.milter
 ExecStart = /usr/local/bin/public-inbox-httpd -W0 /etc/unsubscribe.psgi
+
+# NonBlocking is REQUIRED to avoid a race condition if running
+# simultaneous services
+NonBlocking = true
 Sockets = unsubscribe-psgi.socket
+
 # we need to modify the mlmmj spool
 User = mlmmj
 KillMode = process
-- 
EW
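
The point that O_NONBLOCK is a file description flag (and so is shared by
every descriptor that refers to the same listen socket) can be checked with a
small sketch. This is not code from public-inbox; it is a minimal Python
illustration in which os.dup() stands in for a second daemon instance
inheriting the socket, and the helper names is_nonblocking() and demo() are
made up for the example. It only inspects flags and never calls accept().

```python
import fcntl
import os
import socket


def is_nonblocking(fd: int) -> bool:
    """Report whether O_NONBLOCK is set on the open file description behind fd."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


def demo() -> dict:
    # A listener standing in for the socket that systemd hands to each instance.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    listener.setblocking(False)

    # Assumption for illustration: dup() models a second instance. Both
    # descriptors refer to the same open file description.
    other_fd = os.dup(listener.fileno())
    seen = {"before": is_nonblocking(other_fd)}

    # Clearing O_NONBLOCK through one descriptor changes it for the other too.
    os.set_blocking(listener.fileno(), True)
    seen["after_clear"] = is_nonblocking(other_fd)

    # Re-enabling it (as PublicInbox::Listener::new does, per the message
    # above) closes the window again.
    os.set_blocking(listener.fileno(), False)
    seen["after_restore"] = is_nonblocking(other_fd)

    os.close(other_fd)
    listener.close()
    return seen


if __name__ == "__main__":
    print(demo())
```

Between "after_clear" and "after_restore", an accept() on the other descriptor
would block if no connection were pending, which is the window the commit
message describes.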