To: Wilco Dijkstra
Subject: RE: [PATCH v3 2/5] AArch64: Improve A64FX memset
Date: Tue, 3 Aug 2021 03:08:01 +0000
From: naohirot--- via Libc-alpha
Reply-To: "naohirot@fujitsu.com"
Cc: 'GNU C Library'

Hi Wilco,

I found my typo in the original code comment.
Would you fix it with the following?
> > -#define ZF_DIST (CACHE_LINE_SIZE * 21) // Zerofill distance

> From: Tamura, Naohiro/田村 直広
> Sent: Monday, August 2, 2021 10:29 PM
> > diff --git a/sysdeps/aarch64/multiarch/memset_a64fx.S b/sysdeps/aarch64/multiarch/memset_a64fx.S
> > index f7fcc7b323e1553f50a2e005b8ccef344a08127d..608e0e2e2ff5259178e2fdadf1eea8816194d879 100644
> > --- a/sysdeps/aarch64/multiarch/memset_a64fx.S
> > +++ b/sysdeps/aarch64/multiarch/memset_a64fx.S
> > @@ -30,10 +30,8 @@
> > #define L2_SIZE (8*1024*1024) // L2 8MB - 1MB

Wrong: // L2 8MB - 1MB
Right: // L2 8MB

> > #define CACHE_LINE_SIZE 256
> > #define PF_DIST_L1 (CACHE_LINE_SIZE * 16) // Prefetch distance L1
> > -#define rest x8
> > +#define rest x2
> > #define vector_length x9
> > -#define vl_remainder x10 // vector_length remainder
> > -#define cl_remainder x11 // CACHE_LINE_SIZE remainder
> >

Thanks.
Naohiro