Date: Tue, 10 Aug 2021 10:39:19 +0100
To: Wilco Dijkstra
Subject: Re: [PATCH v4 3/5] AArch64: Improve A64FX memset for remaining bytes
Message-ID: <20210810093918.GE20410@arm.com>
List-Id: Libc-alpha mailing list
From: Szabolcs Nagy via Libc-alpha
Reply-To: Szabolcs Nagy
Cc: 'GNU C Library'

The 08/09/2021 13:11, Wilco Dijkstra via Libc-alpha wrote:
> v4: no changes
>
> Simplify handling of remaining bytes. Avoid lots of taken branches and complex
> whilelo computations, instead unconditionally write vectors from the end.

OK for commit, but keep

Reviewed-by: Naohiro Tamura

>
> ---
>
> diff --git a/sysdeps/aarch64/multiarch/memset_a64fx.S b/sysdeps/aarch64/multiarch/memset_a64fx.S
> index 6bc8ef5e0c84dbb59a57d114ae6ec8e3fa3822ad..55f28b644defdffb140c88da0635ef099235546c 100644
> --- a/sysdeps/aarch64/multiarch/memset_a64fx.S
> +++ b/sysdeps/aarch64/multiarch/memset_a64fx.S
> @@ -130,38 +130,19 @@ L(unroll8):
>          b       1b
>
>  L(last):
> -        whilelo p0.b, xzr, rest
> -        whilelo p1.b, vector_length, rest
> -        b.last  1f
> -        st1b    z0.b, p0, [dst, #0, mul vl]
> -        st1b    z0.b, p1, [dst, #1, mul vl]
> -        ret
> -1:      lsl     tmp1, vector_length, 1  // vector_length * 2
> -        whilelo p2.b, tmp1, rest
> -        incb    tmp1
> -        whilelo p3.b, tmp1, rest
> -        b.last  1f
> -        st1b    z0.b, p0, [dst, #0, mul vl]
> -        st1b    z0.b, p1, [dst, #1, mul vl]
> -        st1b    z0.b, p2, [dst, #2, mul vl]
> -        st1b    z0.b, p3, [dst, #3, mul vl]
> -        ret
> -1:      lsl     tmp1, vector_length, 2  // vector_length * 4
> -        whilelo p4.b, tmp1, rest
> -        incb    tmp1
> -        whilelo p5.b, tmp1, rest
> -        incb    tmp1
> -        whilelo p6.b, tmp1, rest
> -        incb    tmp1
> -        whilelo p7.b, tmp1, rest
> -        st1b    z0.b, p0, [dst, #0, mul vl]
> -        st1b    z0.b, p1, [dst, #1, mul vl]
> -        st1b    z0.b, p2, [dst, #2, mul vl]
> -        st1b    z0.b, p3, [dst, #3, mul vl]
> -        st1b    z0.b, p4, [dst, #4, mul vl]
> -        st1b    z0.b, p5, [dst, #5, mul vl]
> -        st1b    z0.b, p6, [dst, #6, mul vl]
> -        st1b    z0.b, p7, [dst, #7, mul vl]
> +        cmp     count, vector_length, lsl 1
> +        b.ls    2f
> +        add     tmp2, vector_length, vector_length, lsl 2
> +        cmp     count, tmp2
> +        b.ls    5f
> +        st1b    z0.b, p0, [dstend, -8, mul vl]
> +        st1b    z0.b, p0, [dstend, -7, mul vl]
> +        st1b    z0.b, p0, [dstend, -6, mul vl]
> +5:      st1b    z0.b, p0, [dstend, -5, mul vl]
> +        st1b    z0.b, p0, [dstend, -4, mul vl]
> +        st1b    z0.b, p0, [dstend, -3, mul vl]
> +2:      st1b    z0.b, p0, [dstend, -2, mul vl]
> +        st1b    z0.b, p0, [dstend, -1, mul vl]
>          ret
>
>  L(L1_prefetch): // if rest >= L1_SIZE
> @@ -199,7 +180,6 @@ L(L2):
>          subs    count, count, CACHE_LINE_SIZE
>          b.hi    1b
>          add     count, count, CACHE_LINE_SIZE
> -        add     dst, dst, CACHE_LINE_SIZE
>          b       L(last)
>
>  END (MEMSET)

--
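
For readers following the quoted hunk, here is a rough C model of the control flow in the new L(last) block. It is only a sketch, not the glibc code. The helper names store_vec_from_end and set_tail are hypothetical, vl stands in for vector_length, rest for count, and memset stands in for a full-width st1b under an all-true predicate. It assumes rest <= 8 * vl, that at least 8 * vl bytes before dstend belong to the buffer, and that every byte before dstend - rest already holds the fill value, so the overlapping stores are harmless.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for "st1b z0.b, p0, [dstend, -k, mul vl]":
   stores one full vector of vl bytes, k vectors back from dstend. */
static void store_vec_from_end(unsigned char *dstend, size_t k, size_t vl, int c)
{
    memset(dstend - k * vl, c, vl);
}

/* Sketch of the new tail handling. Assumed preconditions (not checked
   beyond the assert): rest <= 8 * vl, the 8 * vl bytes before dstend are
   inside the buffer, and bytes before dstend - rest already hold c. */
static void set_tail(unsigned char *dstend, size_t rest, size_t vl, int c)
{
    assert(rest <= 8 * vl);

    size_t nvec;
    if (rest <= 2 * vl)
        nvec = 2;               /* "b.ls 2f": only the last two vectors */
    else if (rest <= 5 * vl)
        nvec = 5;               /* "b.ls 5f": the last five vectors */
    else
        nvec = 8;               /* fall through: all eight vectors */

    /* Unconditional full-width stores, counted back from the end. */
    for (size_t k = nvec; k >= 1; k--)
        store_vec_from_end(dstend, k, vl, c);
}
```

The model rounds the store count up to 2, 5 or 8 vectors, so it uses two compare-and-branch decisions and no per-vector predicate computation. The cost is that some bytes may be written more than once.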