Date: Wed, 26 May 2021 11:19:08 +0100
From: Szabolcs Nagy via Libc-alpha
Reply-To: Szabolcs Nagy
To: Naohiro Tamura
Cc: Naohiro Tamura, libc-alpha@sourceware.org
Subject: Re: [PATCH v2 3/6] aarch64: Added optimized memcpy and memmove for A64FX
Message-ID: <20210526101908.GZ9028@arm.com>
References: <20210512092308.900998-1-naohirot@fujitsu.com> <20210512092809.901182-1-naohirot@fujitsu.com>
In-Reply-To: <20210512092809.901182-1-naohirot@fujitsu.com>
List-Id: Libc-alpha mailing list

The 05/12/2021 09:28, Naohiro Tamura wrote:
> From: Naohiro Tamura
>
> This patch optimizes the performance of memcpy/memmove for A64FX [1]
> which implements ARMv8-A SVE and has L1 64KB cache per core and L2 8MB
> cache per NUMA node.
>
> The performance optimization makes use of Scalable Vector Register
> with several techniques such as loop unrolling, memory access
> alignment, cache zero fill, and software pipelining.
>
> SVE assembler code for memcpy/memmove is implemented as Vector Length
> Agnostic code so theoretically it can be run on any SOC which supports
> ARMv8-A SVE standard.
>
> We confirmed that all testcases have been passed by running 'make
> check' and 'make xcheck' not only on A64FX but also on ThunderX2.
>
> And also we confirmed that the SVE 512 bit vector register performance
> is roughly 4 times better than Advanced SIMD 128 bit register and 8
> times better than scalar 64 bit register by running 'make bench'.
>
> [1] https://github.com/fujitsu/A64FX

thanks. this looks ok, except for whitespace usage.

can you please send a version with fixed whitespaces?
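
To make the whitespace request concrete, here is a minimal sketch of the kind of conversion involved. It assumes the usual convention in the glibc tree that each full run of 8 leading columns is written as one tab, with any remaining columns kept as spaces. The helper name retab_leading is hypothetical and is not a glibc tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of glibc: copy LINE to OUT, rewriting
   the leading spaces so that every full run of 8 becomes one tab and
   the remainder stays as spaces.  Assumes LINE is indented with spaces
   only and that OUT has room for strlen (line) + 1 bytes; the result
   is never longer than the input.  */
static void
retab_leading (const char *line, char *out)
{
  assert (line != NULL && out != NULL);

  /* Count the leading spaces.  */
  size_t spaces = 0;
  while (line[spaces] == ' ')
    spaces++;

  /* Emit one tab per 8 columns, then the leftover spaces.  */
  size_t pos = 0;
  for (size_t i = 0; i < spaces / 8; i++)
    out[pos++] = '\t';
  for (size_t i = 0; i < spaces % 8; i++)
    out[pos++] = ' ';

  /* Copy the rest of the line unchanged.  */
  strcpy (out + pos, line + spaces);
}
```

With this sketch, a line indented by 10 spaces comes out as one tab followed by 2 spaces, and a line indented by fewer than 8 spaces is left unchanged.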
> --- a/sysdeps/aarch64/multiarch/memcpy.c
> +++ b/sysdeps/aarch64/multiarch/memcpy.c
> @@ -33,6 +33,9 @@ extern __typeof (__redirect_memcpy) __memcpy_simd attribute_hidden;
> extern __typeof (__redirect_memcpy) __memcpy_thunderx attribute_hidden;
> extern __typeof (__redirect_memcpy) __memcpy_thunderx2 attribute_hidden;
> extern __typeof (__redirect_memcpy) __memcpy_falkor attribute_hidden;
> +#if HAVE_AARCH64_SVE_ASM
> +extern __typeof (__redirect_memcpy) __memcpy_a64fx attribute_hidden;
> +#endif
>
> libc_ifunc (__libc_memcpy,
> (IS_THUNDERX (midr)
> @@ -44,8 +47,13 @@ libc_ifunc (__libc_memcpy,
> : (IS_NEOVERSE_N1 (midr) || IS_NEOVERSE_N2 (midr)
> || IS_NEOVERSE_V1 (midr)
> ? __memcpy_simd
> - : __memcpy_generic)))));
> -
> +#if HAVE_AARCH64_SVE_ASM
> + : (IS_A64FX (midr)
> + ? __memcpy_a64fx
> + : __memcpy_generic))))));
> +#else
> + : __memcpy_generic)))));
> +#endif

glibc uses a mix of tabs and spaces, you used space only.

> --- /dev/null
> +++ b/sysdeps/aarch64/multiarch/memcpy_a64fx.S
> @@ -0,0 +1,405 @@
> +/* Optimized memcpy for Fujitsu A64FX processor.
> + Copyright (C) 2012-2021 Free Software Foundation, Inc.
> +
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library. If not, see
> + . */
> +
> +#include
> +
> +#if HAVE_AARCH64_SVE_ASM
> +#if IS_IN (libc)
> +# define MEMCPY __memcpy_a64fx
> +# define MEMMOVE __memmove_a64fx
> +
> +/* Assumptions:
> + *
> + * ARMv8.2-a, AArch64, unaligned accesses, sve
> + *
> + */
> +
> +#define L2_SIZE (8*1024*1024)/2 // L2 8MB/2
> +#define CACHE_LINE_SIZE 256
> +#define ZF_DIST (CACHE_LINE_SIZE * 21) // Zerofill distance
> +#define dest x0
> +#define src x1
> +#define n x2 // size
> +#define tmp1 x3
> +#define tmp2 x4
> +#define tmp3 x5
> +#define rest x6
> +#define dest_ptr x7
> +#define src_ptr x8
> +#define vector_length x9
> +#define cl_remainder x10 // CACHE_LINE_SIZE remainder
> +
> + .arch armv8.2-a+sve
> +
> + .macro dc_zva times
> + dc zva, tmp1
> + add tmp1, tmp1, CACHE_LINE_SIZE
> + .if \times-1
> + dc_zva "(\times-1)"
> + .endif
> + .endm
> +
> + .macro ld1b_unroll8
> + ld1b z0.b, p0/z, [src_ptr, #0, mul vl]
> + ld1b z1.b, p0/z, [src_ptr, #1, mul vl]
> + ld1b z2.b, p0/z, [src_ptr, #2, mul vl]
> + ld1b z3.b, p0/z, [src_ptr, #3, mul vl]
> + ld1b z4.b, p0/z, [src_ptr, #4, mul vl]
> + ld1b z5.b, p0/z, [src_ptr, #5, mul vl]
> + ld1b z6.b, p0/z, [src_ptr, #6, mul vl]
> + ld1b z7.b, p0/z, [src_ptr, #7, mul vl]
> + .endm

...

please indent all asm code with one tab, see other asm files.

> --- a/sysdeps/aarch64/multiarch/memmove.c
> +++ b/sysdeps/aarch64/multiarch/memmove.c
> @@ -33,6 +33,9 @@ extern __typeof (__redirect_memmove) __memmove_simd attribute_hidden;
> extern __typeof (__redirect_memmove) __memmove_thunderx attribute_hidden;
> extern __typeof (__redirect_memmove) __memmove_thunderx2 attribute_hidden;
> extern __typeof (__redirect_memmove) __memmove_falkor attribute_hidden;
> +#if HAVE_AARCH64_SVE_ASM
> +extern __typeof (__redirect_memmove) __memmove_a64fx attribute_hidden;
> +#endif
>
> libc_ifunc (__libc_memmove,
> (IS_THUNDERX (midr)
> @@ -44,8 +47,13 @@ libc_ifunc (__libc_memmove,
> : (IS_NEOVERSE_N1 (midr) || IS_NEOVERSE_N2 (midr)
> || IS_NEOVERSE_V1 (midr)
> ? __memmove_simd
> - : __memmove_generic))))));
> - 
> +#if HAVE_AARCH64_SVE_ASM
> + : (IS_A64FX (midr)
> + ? __memmove_a64fx
> + : __memmove_generic))))));
> +#else
> + : __memmove_generic)))));
> +#endif

same as above.
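
For reference, the shape of the branch added to both selectors can be sketched stand-alone. This is only an illustration. The stub functions, select_memcpy and check_select_memcpy are hypothetical names. The MIDR field decoding and the A64FX identification (implementer 'F', part number 0x001) are assumptions of the sketch rather than copies of the glibc macros. The earlier ThunderX, Falkor and Neoverse cases are left out, and plain if statements stand in for the nested conditional expression.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed MIDR_EL1 layout: implementer in bits 31:24, part number in
   bits 15:4.  */
#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 0xff)
#define MIDR_PARTNUM(midr) (((midr) >> 4) & 0xfff)

/* Assumption of this sketch: A64FX is implementer 'F', part 0x001.  */
#define IS_A64FX(midr) \
  (MIDR_IMPLEMENTOR (midr) == 'F' && MIDR_PARTNUM (midr) == 0x001)

/* In glibc this comes from configure; default it here so the sketch
   builds on its own.  */
#ifndef HAVE_AARCH64_SVE_ASM
# define HAVE_AARCH64_SVE_ASM 1
#endif

typedef void *(*memcpy_fn) (void *, const void *, size_t);

/* Stand-ins for __memcpy_generic and __memcpy_a64fx.  */
static void *
memcpy_generic_stub (void *d, const void *s, size_t n)
{
  return memcpy (d, s, n);
}

static void *
memcpy_a64fx_stub (void *d, const void *s, size_t n)
{
  return memcpy (d, s, n);
}

/* The A64FX case only exists when the assembler supports SVE;
   otherwise the selector falls straight through to the generic
   routine, as in the #else arm of the patch.  */
static memcpy_fn
select_memcpy (uint64_t midr)
{
#if HAVE_AARCH64_SVE_ASM
  if (IS_A64FX (midr))
    return memcpy_a64fx_stub;
#endif
  return memcpy_generic_stub;
}

/* Small self-check of the sketch.  */
void
check_select_memcpy (void)
{
  uint64_t a64fx_midr = ((uint64_t) 'F' << 24) | (0x001u << 4);
  uint64_t other_midr = ((uint64_t) 'A' << 24) | (0xd0cu << 4);

  assert (select_memcpy (other_midr) == memcpy_generic_stub);
#if HAVE_AARCH64_SVE_ASM
  assert (select_memcpy (a64fx_midr) == memcpy_a64fx_stub);
#else
  assert (select_memcpy (a64fx_midr) == memcpy_generic_stub);
#endif
}
```

Building the sketch with -DHAVE_AARCH64_SVE_ASM=0 exercises the fallback arm, where every MIDR value selects the generic routine.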