To: Wilco Dijkstra, Noah Goldstein
Subject: Re: [PATCH v2 2/5] benchtests: Add memset zero fill benchtest
Date: Wed, 28 Jul 2021 07:27:55 +0000
References: <20210713082214.307529-1-naohirot@fujitsu.com> <20210720063500.362313-1-naohirot@fujitsu.com>
From: naohirot--- via Libc-alpha
Reply-To: "naohirot@fujitsu.com"
Cc: GNU C Library

Hi Wilco, Noah,

> > There may be a miscommunication.
> > The * 16 is already in the outer loop (1).
>
> The outer loop is in test_main, and it determines 'n' in do_one_test:
>
>   for (i = ...)
>     {
>       do_test (&json_ctx, 0, c, i);
>     }
>
> > Let me copy the code from the mail [1] that I put in the previous mail [2].
>
> The key issue is that this loop:
>
>   for (j = 0; j < 16; j++)
>     CALL (impl, s + n * j, c2, n);
>
> is equivalent to:
>
>   CALL (impl, s, c2, n * 16);
>
> The loop we really want is something like bench-memset-large:
>
>   CALL (impl, s, c, n);
>   TIMING_NOW (start);
>   for (i = 0; i < iters; ++i)
>     {
>       CALL (impl, s, c, n);
>     }
>   TIMING_NOW (stop);
>
> This repeats CALL on data of size 'n' after an initial warmup of the caches.
>
> > It doesn't matter which memset is called; what matters is the
> > function name in the code, so that we can see it is not measured.
>
> Then using the standard name 'memset' would be best.
>

OK, I understand, thanks.

Taking Noah's comment [1] into account, the final code should look like
the code below. Can we agree on this code?

The two results, from the two-loop version in the mail [1] and the
one-loop version below, are almost the same for __memset_generic on
A64FX, as shown in the graph [2].

-----
static void
__attribute__((noinline, noclone))
do_one_test (json_ctx_t *json_ctx, impl_t *impl, CHAR *s,
             int c1 __attribute ((unused)), int c2 __attribute ((unused)),
             size_t n)
{
  size_t i, iters = 32;
  timing_t start, stop, cur, latency = 0;

  CALL (impl, s, c2, n); // warm up

  for (i = 0; i < iters; i++)
    {
      memset (s, c1, n); // alternation: refill with c1, not measured

      TIMING_NOW (start);

      CALL (impl, s, c2, n);

      TIMING_NOW (stop);
      TIMING_DIFF (cur, start, stop);
      TIMING_ACCUM (latency, cur);
    }

  json_element_double (json_ctx, (double) latency / (double) iters);
}
-----

[1] https://sourceware.org/pipermail/libc-alpha/2021-July/129486.html
[2] https://drive.google.com/file/d/1bptHqg5vvFAGoYgoR3w_pvclXFSP8Sr0/view?usp=sharing

Thanks.
Naohiro
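
P.S. For anyone who wants to try the measurement pattern outside the
benchtests harness, here is a rough standalone sketch. It is not the
benchtest itself. It assumes POSIX clock_gettime in place of TIMING_NOW,
TIMING_DIFF and TIMING_ACCUM, and a plain function pointer in place of
CALL (impl, ...). The names now_ns, memset_fn, bench_one and fill_chunked
are hypothetical and exist only for illustration. bench_one follows the
one-loop version: warm up once, refill with c1 outside the timed region,
then time only the call that writes c2. fill_chunked shows the point about
the inner loop: 16 calls on adjacent chunks of size n write the same bytes
as a single call of size n * 16.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* for clock_gettime */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for the benchtests impl_t: any memset-like function.  */
typedef void *(*memset_fn) (void *, int, size_t);

/* Hypothetical helper: monotonic clock reading in nanoseconds.  */
static uint64_t
now_ns (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return (uint64_t) ts.tv_sec * 1000000000ull + (uint64_t) ts.tv_nsec;
}

/* Return the average time in nanoseconds of fn (s, c2, n) over iters calls.
   The refill with c1 happens before the start timestamp, so it is not
   included in the accumulated latency.  */
static double
bench_one (memset_fn fn, unsigned char *s, int c1, int c2, size_t n,
           size_t iters)
{
  uint64_t latency = 0;

  assert (iters > 0);

  fn (s, c2, n); /* warm up */

  for (size_t i = 0; i < iters; i++)
    {
      memset (s, c1, n); /* alternation, not measured */

      uint64_t start = now_ns ();
      fn (s, c2, n);
      uint64_t stop = now_ns ();

      latency += stop - start;
    }

  return (double) latency / (double) iters;
}

/* 16 calls on adjacent chunks of size n; writes the same n * 16 bytes
   as a single fn (s, c, n * 16).  */
static void
fill_chunked (memset_fn fn, unsigned char *s, int c, size_t n)
{
  for (size_t j = 0; j < 16; j++)
    fn (s + n * j, c, n);
}
```

A call such as bench_one (memset, buf, 1, 0, sizeof buf, 32) would mirror
iters = 32 in the code above. With an optimizing compiler, some care may be
needed so that the untimed refill is not removed as a dead store; the
benchtest version avoids inlining with the noinline and noclone attributes.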