From mboxrd@z Thu Jan 1 00:00:00 1970
From: Szabolcs Nagy
Newsgroups: gmane.comp.lib.glibc.alpha
Subject: Re: [PATCH] v11 Improves __ieee754_exp() performance by greater than 5x on sparc/x86.
Date: Wed, 14 Feb 2018 20:05:45 +0000
To: Joseph Myers, Patrick McGehearty
Cc: nd@arm.com, libc-alpha@sourceware.org

On 14/02/18 16:41, Joseph Myers wrote:
> On Tue, 13 Feb 2018, Patrick McGehearty wrote:
>
>> Any thoughts on general principles on how to decide which patch
>> to accept, given both seem much more better than the existing code?
>
> My understanding would be that Szabolcs intends (as per
> and
> ) to eliminate
> rounding mode changes from the present exp, and possibly make other
> speedups there.  Then the final result of such speedups would need

i won't have a patch that keeps the current algorithm and just
removes the rounding mode change.

i'm now trying various approaches to doing exp with < 0.51 ulp
worst-case error, < 0.2% misroundings, and < 4K table size
(i think these are reasonable parameters, and the proposed exp
is similar too).

i think the rounding mode change can be eliminated from such an
exp with a worst-case error of 1.0 ulp in non-nearest rounding
modes (or maybe 2).
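
To make the two accuracy figures above concrete, here is a minimal sketch of how "ulp error" and "misrounding" could be measured for a single result. It is not the harness used for the numbers in this thread. The helper names (ulp_at, ulp_error, is_misrounded) are hypothetical, and the reference value is assumed to be supplied as a pair ref_hi + ref_lo, where ref_hi is the correctly rounded double and ref_lo is the remainder obtained from some higher-precision computation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: size of one ulp at x.  To keep the sketch short
   it only handles finite x whose ulp is itself a normal double.  */
static double ulp_at(double x)
{
  uint64_t bits;
  memcpy(&bits, &x, sizeof bits);
  uint64_t e = (bits >> 52) & 0x7ff;   /* biased exponent of x */
  assert(e > 52 && e < 0x7ff);         /* ulp must be normal, x finite */
  uint64_t ubits = (e - 52) << 52;     /* bit pattern of 2^(e - 1023 - 52) */
  double u;
  memcpy(&u, &ubits, sizeof u);
  return u;
}

/* Error of got, in ulps, against the reference ref_hi + ref_lo.
   Assumption: ref_hi is the correctly rounded result and ref_lo the
   remainder, so |ref_lo| <= ulp_at(ref_hi) / 2.  */
static double ulp_error(double got, double ref_hi, double ref_lo)
{
  double err = (got - ref_hi) - ref_lo;  /* got - ref_hi is exact when close */
  if (err < 0)
    err = -err;
  return err / ulp_at(ref_hi);           /* division by a power of two */
}

/* A result counts as misrounded when it is not the correctly rounded one.  */
static int is_misrounded(double got, double ref_hi)
{
  return got != ref_hi;
}
```

A correctly rounded result never exceeds 0.5 ulp under this measure, so a 0.51 ulp bound allows a result to be the wrong neighbour only when the exact value lies within 0.01 ulp of a rounding boundary. The misrounding rate is then the fraction of inputs for which is_misrounded() returns 1.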
i haven't yet dealt with the subnormal range: it seems the proposed
exp and my prototype both have about 0.75 ulp error when the result
is between 0x1p-1023 and 0x1p-1022 (because the polynomial has one
rounding and then the final scaling does another rounding right at
the next bit; this can be fixed by doing the final add of the
polynomial differently, but i'm not yet sure it's worth fixing).

i also haven't yet looked at __exp1 (which should probably be moved
to e_pow.c, and if it can share tables with exp then those should be
in a separate file), but i think it should be possible to handle it
similarly.

> comparing with a version of your patch that also eliminates rounding mode
> changes (and updates libm-test-ulps expectations for other functions in
> non-default rounding modes as needed to avoid introducing failures).  It
> would be best to have a precise statement of what "both my throughput and
> latency benchmarks" are in
> , to make sure
> there is a common basis of comparison so we can see if it's really the
> case that one version is faster on some architectures and another on other
> architectures, or whether different people are measuring different things.
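
The double rounding described for results between 0x1p-1023 and 0x1p-1022 can be reproduced with a small sketch. The function names and the hi/lo inputs are hypothetical, and the "rounding once" variant is only one possible way to do the final add differently, not necessarily the fix intended in this message. It assumes round-to-nearest and plain IEEE double evaluation (no x87 excess precision, no flush-to-zero).

```c
#include <assert.h>

/* Results in [0x1p-1023, 0x1p-1022) keep only 52 significant bits.  */
static const double SCALE = 0x1p-1023;

/* The final add rounds to 53 bits, then the scaling rounds again to
   52 bits: two roundings at adjacent bit positions.  */
static double scale_rounding_twice(double hi, double lo)
{
  double sum = hi + lo;   /* first rounding */
  return sum * SCALE;     /* second rounding, one bit higher */
}

/* Hypothetical alternative: do the final add at a magnitude where the
   double ulp (2^-51 in [2, 4]) matches the ulp of the scaled result,
   so the add is the only rounding.  Assumes hi is in [1, 2) and is a
   multiple of 2^-51, so that 2.0 + hi is exact.  */
static double scale_rounding_once(double hi, double lo)
{
  assert(hi >= 1.0 && hi < 2.0 && (2.0 + hi) - 2.0 == hi);
  double t = (2.0 + hi) + lo;   /* the single rounding, at 2^-51 */
  return (t - 2.0) * SCALE;     /* subtraction and scaling are exact */
}
```

With hi = 0x1.0000000000002p+0 and lo = 0x1.8p-53, the exact sum lies 0.375 ulp (of the scaled result) above hi. The first function rounds the sum up to a tie and then up again, returning 0x1.0000000000004p-1023, which is 0.625 ulp away from the exact value. The second returns the correctly rounded 0x1.0000000000002p-1023. In general the first approach is bounded by 0.25 + 0.5 = 0.75 ulp, which matches the figure quoted above.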