Date: Wed, 26 May 2021 11:22:40 +0100
To: Naohiro Tamura
Subject: Re: [PATCH v2 4/6] aarch64: Added optimized memset for A64FX
Message-ID: <20210526102239.GA9028@arm.com>
References: <20210512092308.900998-1-naohirot@fujitsu.com>
 <20210512092842.901235-1-naohirot@fujitsu.com>
In-Reply-To: <20210512092842.901235-1-naohirot@fujitsu.com>
List-Id: Libc-alpha mailing list
From: Szabolcs Nagy via Libc-alpha
Reply-To: Szabolcs Nagy
Cc: Naohiro Tamura, libc-alpha@sourceware.org

The 05/12/2021 09:28, Naohiro Tamura wrote:
> From: Naohiro Tamura
>
> This patch optimizes the performance of memset for A64FX [1] which
> implements ARMv8-A SVE and has L1 64KB cache per core and L2 8MB cache
> per NUMA node.
>
> The performance optimization makes use of Scalable Vector Register
> with several techniques such as loop unrolling, memory access
> alignment, cache zero fill and prefetch.
>
> SVE assembler code for memset is implemented as Vector Length Agnostic
> code so theoretically it can be run on any SOC which supports ARMv8-A
> SVE standard.
>
> We confirmed that all testcases have been passed by running 'make
> check' and 'make xcheck' not only on A64FX but also on ThunderX2.
>
> And also we confirmed that the SVE 512 bit vector register performance
> is roughly 4 times better than Advanced SIMD 128 bit register and 8
> times better than scalar 64 bit register by running 'make bench'.
>
> [1] https://github.com/fujitsu/A64FX

thanks, this looks good, except for whitespace.

can you please send a version with fixed whitespaces?

> --- a/sysdeps/aarch64/multiarch/memset.c
> +++ b/sysdeps/aarch64/multiarch/memset.c
...
> - : __memset_generic)));
> +#if HAVE_AARCH64_SVE_ASM
> + : (IS_A64FX (midr)
> + ? __memset_a64fx
> + : __memset_generic))));
> +#else
> + : __memset_generic)));
> +#endif

replace 8 spaces with 1 tab.

> --- /dev/null
> +++ b/sysdeps/aarch64/multiarch/memset_a64fx.S
...
> + .arch armv8.2-a+sve
> +
> + .macro dc_zva times
> + dc zva, tmp1
> + add tmp1, tmp1, CACHE_LINE_SIZE
> + .if \times-1
> + dc_zva "(\times-1)"
> + .endif
> + .endm

use 1 tab indentation throughout.