Subject: Re: [PATCH v2 1/3] math: Remove fenvinline.h
To: Adhemerval Zanella, libc-alpha@sourceware.org
References: <20200309183234.11891-1-adhemerval.zanella@linaro.org> <196ea31e-f170-7f3c-6169-4c0ab95e0209@linux.ibm.com>
Message-ID: <9a986217-f5d1-6f43-450b-7bbff2965d54@linux.ibm.com>
Date: Fri, 27 Mar 2020 16:43:22 -0500
From: Paul E Murphy via Libc-alpha
Reply-To: Paul E Murphy

On 3/10/20 11:51 AM, Adhemerval Zanella wrote:
>
>
> On 09/03/2020 17:19, Paul E Murphy wrote:
>>
>>
>> On 3/9/20 1:32 PM, Adhemerval Zanella wrote:
>>> Changes from previous version:
>>>
>>>    - Mention in the commit message that x86 also exports a similar
>>>      optimization, but in a different header.
>>>
>>> --
>>>
>>> Similar to string2.h (18b10de7ce) and string3.h (09a596cc2c) this
>>> patch removes the fenvinline.h on all architectures.  Currently
>>> only powerpc implements some optimizations.  This kind of optimization
>>> is better implemented by the compiler (which handles the architecture
>>> ISA transparently).
>>>
>>> Also, for the specific optimized powerpc implementation the code is
>>> becoming convoluted and these micro-optimizations are hardly widely
>>> used, much less a possible hotspot in real-world cases
>>> (non-default rounding is used only in specific cases and exception
>>> handling is most likely done only on error paths).  That only x86
>>> implements a similar optimization (in fenv.h) also indicates that
>>> these should not be in libc.
>>>
>>> The math/test-fenv already covers all math/test-fenvinline tests,
>>> so it is safe to remove it.
>>>
>>> Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.
>>> ---
>>>   bits/fenvinline.h                 |   8 -
>>>   math/Makefile                     |   4 +-
>>>   math/fenv.h                       |   4 -
>>>   math/test-fenvinline.c            | 354 ------------------------------
>>>   sysdeps/powerpc/bits/fenvinline.h | 108 ---------
>>>   5 files changed, 2 insertions(+), 476 deletions(-)
>>>   delete mode 100644 bits/fenvinline.h
>>>   delete mode 100644 math/test-fenvinline.c
>>>   delete mode 100644 sysdeps/powerpc/bits/fenvinline.h
>
> Indeed, I misread the failures on powerpc64le-linux-gnu.  Below is
> an updated patch with the fegetround optimization moved to an internal
> header.

> diff --git a/sysdeps/powerpc/fpu/fegetround.c b/sysdeps/powerpc/fpu/fegetround.c
> index 00b4462624..9d7762f08b 100644
> --- a/sysdeps/powerpc/fpu/fegetround.c
> +++ b/sysdeps/powerpc/fpu/fegetround.c
> @@ -21,10 +21,8 @@
>  int
>  (__fegetround) (void)
>  {
> -  return __fegetround();
> +  return __fegetround_inline ();
>  }
> -#undef fegetround
> -#undef __fegetround
>  libm_hidden_def (__fegetround)
>  weak_alias (__fegetround, fegetround)
>  libm_hidden_weak (fegetround)
> diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
> index e888c6621c..09dbd3e2df 100644
> --- a/sysdeps/powerpc/fpu/fenv_libc.h
> +++ b/sysdeps/powerpc/fpu/fenv_libc.h
> @@ -68,6 +68,14 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
>      __fr; \
>    })
>
> +#define __fe_mffsl() \
> +  ({register fenv_union_t __fr; \
> +    __asm__ __volatile__ ( \
> +      ".machine push; .machine \"power9\"; mffsl %0; .machine pop" \
> +      : "=f" (__fr.fenv)); \
> +    __fr.l & 0x3; \
> +  })
> +
>  #define __fe_mffscrn(rn) \
>    ({register fenv_union_t __fr; \
>      if (__builtin_constant_p (rn)) \
> @@ -144,6 +152,20 @@ typedef union
>    unsigned long long l;
>  } fenv_union_t;
>
> +static inline int
> +__fegetround_inline (void)
> +{
> +#ifdef _ARCH_PWR9
> +  return __fe_mffsl ();
> +#else
> +  if (__glibc_likely (GLRO(dl_hwcap2) & PPC_FEATURE2_ARCH_3_00))
> +    return __fe_mffsl ();
> +

Can the checks above be removed, and fegetenv_register() be replaced with
fegetenv_control()?  That should work optimally on all ppc machines.

Otherwise, it LGTM.  I have mixed feelings about regressing these inlines
before compiler support arrives, but I suspect they are not used in
performance-critical places, so I am not objecting.

> +  fenv_union_t fe;
> +  fe.fenv = fegetenv_register ();
> +  return fe.l & 0x3;
> +#endif
> +}
>
>  static inline int
>  __fesetround_inline (int round)
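
For readers following along without a powerpc box: every path in the quoted __fegetround_inline ends in the same step, namely reinterpreting the FPSCR image (which arrives in a floating-point register) as an integer and keeping the low two bits, the RN field. Below is a portable sketch of just that step. It does not execute mffs or mffsl; the names fpscr_image_t, rounding_mode_from_image and image_from_bits are hypothetical stand-ins invented for illustration, and the union only mirrors the shape of fenv_union_t from the patch.

```c
/* Hypothetical stand-in for the patch's fenv_union_t: the FPSCR is read
   into a floating-point register, so its bits arrive typed as a double.  */
typedef union
{
  double fenv;
  unsigned long long l;
} fpscr_image_t;

/* PowerPC FPSCR RN encodings (the low two bits of the register).  */
enum { RN_NEAREST = 0, RN_TOWARDZERO = 1, RN_UPWARD = 2, RN_DOWNWARD = 3 };

/* Extract the rounding mode from an FPSCR image, as the quoted code does
   with "fe.l & 0x3".  */
static inline int
rounding_mode_from_image (double image)
{
  fpscr_image_t u;
  u.fenv = image;
  return (int) (u.l & 0x3);
}

/* Illustration-only helper: build a double whose bit pattern is BITS,
   standing in for the value an FPSCR read would leave in the register.  */
static inline double
image_from_bits (unsigned long long bits)
{
  fpscr_image_t u;
  u.l = bits;
  return u.fenv;
}
```

If the suggestion in the reply were adopted, the real function would presumably pass the result of fegetenv_control() through this same mask on every machine; that is my reading of the proposal, not something the thread confirms.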