From: naohirot--- via Libc-alpha
Reply-To: "naohirot@fujitsu.com"
To: Szabolcs Nagy, Wilco Dijkstra
Cc: 'GNU C Library'
Subject: RE: [PATCH v3 1/5] AArch64: Improve A64FX memset
Date: Tue, 3 Aug 2021 02:57:35 +0000
References: <20210802145003.GH14854@arm.com>
In-Reply-To: <20210802145003.GH14854@arm.com>
List-Id: Libc-alpha mailing list
Sender: "Libc-alpha"

Hi Szabolcs, Wilco,

> From: Szabolcs Nagy
> Sent: Monday, August 2, 2021 11:50 PM
> The 08/02/2021 14:38, Wilco Dijkstra via Libc-alpha wrote:
> > > We discussed how should be defined BTI_C macro before, at that time conclusion
> > > was "NOP" rather than empty unless HAVE_AARCH64_BTI.
> > > Now the above code defines BTI_C as empty unconditionally.
> > > A64FX doesn't support BTI, so this code is OK.
> > > But I'm just interested in the reason why it is changed.
> >
> > We changed to NOP in the generic code, so that works for all string functions.
> > In this specific case removing the initial NOP as well allows all performance critical
> > code for <= 512 bytes to be perfectly aligned to 16-byte fetch blocks.
>
> yes, this makes sense:
>
> originally BTI_C was always hint 34, but since that can be
> slow it was changed for !HAVE_AARCH64_BTI. We don't want the
> layout of asm code to change based on toolchain configuration
> so BTI_C is defined as a place holder nop then.

Now I understand that the difference between nop and empty is the layout.
When we discussed BTI_C before, I didn't ask about the difference, so as
not to prolong the discussion, because there is no performance difference.

> but in a64fx specific code bti is never needed so we also
> don't need the place holder nop, BTI_C can be unconditionally
> empty.

Yes, I'd like to change __memcpy_a64fx and __memmove_a64fx in the same
way too.

Thanks.
Naohiro
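
For readers following the thread, the three BTI_C variants discussed above can be sketched with plain C preprocessor macros. This is only an illustration and is not copied from the glibc sources: the helper names (STR, generic_entry, a64fx_entry, entry_bytes, check_entry_sizes) are hypothetical, HAVE_AARCH64_BTI is defaulted to 0 here as an assumption, and the macros are stringified instead of assembled so the effect can be checked on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stringify a macro after expansion; an empty expansion yields "". */
#define STR_(...) #__VA_ARGS__
#define STR(...) STR_(__VA_ARGS__)

/* Assumption for this sketch: default to a toolchain without BTI. */
#ifndef HAVE_AARCH64_BTI
# define HAVE_AARCH64_BTI 0
#endif

/* Generic code: BTI_C always occupies one instruction slot, so the
   layout does not depend on the toolchain configuration. */
#if HAVE_AARCH64_BTI
# define BTI_C hint 34
#else
# define BTI_C nop
#endif

const char *generic_entry(void) { return STR(BTI_C); }

/* A64FX-specific code: BTI is never needed, so the placeholder is dropped. */
#undef BTI_C
#define BTI_C

const char *a64fx_entry(void) { return STR(BTI_C); }

/* Every A64 instruction is 4 bytes; an empty macro emits nothing. */
size_t entry_bytes(const char *insn) { return strlen(insn) ? 4 : 0; }

/* The generic entry costs 4 bytes either way; the A64FX entry costs 0,
   so the first real instruction can sit at the start of a fetch block. */
void check_entry_sizes(void)
{
  assert(entry_bytes(generic_entry()) == 4);
  assert(entry_bytes(a64fx_entry()) == 0);
}
```

Building the sketch with -DHAVE_AARCH64_BTI=1 changes what generic_entry() returns but not entry_bytes() for it, which is the layout-stability point made in the quoted reply. Whether the 4 bytes saved in the A64FX case put the hot path on a 16-byte fetch-block boundary also depends on the alignment of the function entry, which this sketch does not model.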