From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS31976 209.132.180.0/23 X-Spam-Status: No, score=-3.8 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED,SPF_HELO_PASS,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id E6F9F1F463 for ; Fri, 20 Sep 2019 15:17:26 +0000 (UTC) DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-type; q=dns; s=default; b=oavH GYP2xsKPpnAA6KwR/M+y+KTeSfvV8j7e05BcCgAUg+50S65RUP0g4tPN1sfsPQy+ qAv1Bg3tRKo0RNBjswAJ8WgjucXtQ46rOmLVAuiD5tT4FKyOPkxgpSsoItekz1No OZF+d6qEpUaAenwnsl3pwXUuubr0Q9cIUc+/WMg= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-type; s=default; bh=kJcJzqmBgx skgJDCyWd110AYDSo=; b=s5S9yzXX+PrVqWTWC6otpPgwV2J7BTkgEgY5yCaUFi SX3iM1B/Yzno1y/fYOe2bbi6C+49DJJG2P+OD1nhkSJcfUdrzByvp8wZUjrxNgfm KTAcA+oXopJgP/IedgQ79glPn8bt3EmcRoxMWdYf+391rQ/gYO6+N6a36poYJK9h 4= Received: (qmail 98368 invoked by alias); 20 Sep 2019 15:17:22 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Received: (qmail 98317 invoked by uid 89); 20 Sep 2019 15:17:21 -0000 Authentication-Results: sourceware.org; auth=none X-HELO: mailbackend.panix.com MIME-Version: 1.0 References: <62d353ae-c6a8-16ce-01d5-56933cd18802@redhat.com> <205f17ba-a82b-7806-54fa-49b61268e64d@linux.ibm.com> In-Reply-To: <205f17ba-a82b-7806-54fa-49b61268e64d@linux.ibm.com> From: Zack Weinberg Date: Fri, 20 Sep 2019 11:17:04 -0400 Message-ID: Subject: Re: [PATCH] Speedup various nptl/tst-cancel20 and nptl/tst-cancel21 tests. To: Stefan Liebler Cc: "Carlos O'Donell" , libc-alpha Content-Type: text/plain; charset="UTF-8" On Fri, Sep 20, 2019 at 8:59 AM Stefan Liebler wrote: ... > I've tried to just remove the sleeps. This works fine with tst-cancel20. > But with tst-cancelx20 (same code as tst-cancel20.c but compiled with > -fexceptions -fasynchronous-unwind-tables) it sometimes fails with: > called cleanups 124 => cleanup-handler in tf_body was not called. > > In the "good" case, SIGHUP is received while tf_body is blocking in read. > In the "error" case, SIGHUP is received while tf_body is waiting in > pthread_barrier_wait. > (This occures on s390x in the same way as on e.g. x86_64; I've used gcc > version 9.1.1) ... > According to the .gcc_except_table, the region with our cleanup-handler > starts - after pthread_barrier_wait - directly before the read call, > even though pthread_barrier_wait is within pthread_cleanup_push / pop: ... > Is this behaviour intended? > > The difference between those calls is "nothrow": > extern ssize_t read (int __fd, void *__buf, size_t __nbytes) ; > vs > extern int pthread_barrier_wait (pthread_barrier_t *__barrier) > __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1))); This looks like GCC is assuming it can hoist a call to a nothrow function out of a cleanup region that lexically contains it. That's not true when an exception can be thrown by a signal handler. I can see an argument that this is an incorrect optimization when -fasynchronous-unwind-tables is in use, but the documentation makes it sound like -fasynchronous-unwind-tables is only intended to enable *stack tracing* from an arbitrary instruction, not exceptions. (See the documentation for -fnon-call-exceptions as well as for -fasynchronous-exception-tables.) It is not entirely clear to me what this test is supposed to test, but I think the intent is for the SIGHUP to be delivered *only* after we are sure tf_body is blocked on read. If it's delivered at any other point, the test might not be testing the right thing. Instead of your change to use a second pipe, therefore, I suggest use of pthread_sigmask and ppoll: remove the barriers; block SIGHUP using pthread_sigmask in do_test (this will inherit to all child threads); then change tf_body to something like this: static void __attribute__((noinline)) tf_body (void) { char c; pthread_cleanup_push (cl, (void *) 3L); sigset_t unblock_SIGHUP; sigemptyset(&unblock_SIGHUP); struct pollfd pfds[1]; pfds[0].fd = fd[0]; pfds[0].events = POLLIN; // this call is expected to be interrupted by a SIGHUP // before anything is written to the pipe if (ppoll (pfds, 1, 0, &unblock_SIGHUP) != -1) { puts ("read succeeded"); exit (1); } // drain the pipe read (fd[0], &c, 1); pthread_cleanup_pop (0); } ppoll _atomically_ sets the signal mask before waiting, and restores it afterward, so, this should ensure that the SIGHUP is delivered at exactly the right point. zw