From: naohirot--- via Libc-alpha
Reply-To: "naohirot@fujitsu.com"
To: Noah Goldstein
Cc: GNU C Library, Wilco Dijkstra
Subject: Re: [PATCH v2 2/5] benchtests: Add memset zero fill benchtest
Date: Wed, 21 Jul 2021 13:07:00 +0000
References: <20210713082214.307529-1-naohirot@fujitsu.com> <20210720063500.362313-1-naohirot@fujitsu.com>
List-Id: Libc-alpha mailing list

Hi Noah,

One typo in the updated code.
Wrong:
  #define START_SIZE (16 * 1024)
Right:
  #define BUF1PAGES 16

Thanks
Naohiro
________________________________________
From: Tamura, Naohiro/田村 直広
Sent: Wednesday, 21 July 2021 21:56
To: Noah Goldstein
Cc: Wilco Dijkstra; Lucas A. M. Magalhaes; GNU C Library
Subject: RE: [PATCH v2 2/5] benchtests: Add memset zero fill benchtest

Hi Noah,

Thank you for the review.

> > +#define TEST_MAIN
> > +#define TEST_NAME "memset"
> > +#define START_SIZE (16 * 1024)
> > +#define MIN_PAGE_SIZE (getpagesize () + 64 * 1024 * 1024)
> > +#define TIMEOUT (20 * 60)
> > +#include "bench-string.h"
> > +
> > +#include "json-lib.h"
> > +
> > +void *generic_memset (void *, int, size_t);
> > +typedef void *(*proto_t) (void *, int, size_t);
> > +
> > +IMPL (MEMSET, 1)
> > +IMPL (generic_memset, 0)
> > +
> > +static void
> Do we want __attribute__((noinline, noclone))?

Yes, I'll add it.

> > +do_one_test (json_ctx_t *json_ctx, impl_t *impl, CHAR *s,
> > +             int c1 __attribute ((unused)), int c2 __attribute ((unused)),
> > +             size_t n)
> > +{
> > +  size_t i, iters = 16;
>
> I think 16 is probably too few iterations for reliable benchmarking.
> Maybe `INNER_LOOP_ITERS` which is 8192

I tried it.
If it is changed to 8192, it hits the TIMEOUT (20 * 60) on a64fx.
Please check the code below.

>
> > +  timing_t start, stop, cur;
> > +
> > +  TIMING_NOW (start);
> > +  for (i = 0; i < iters; i += 2)
> > +    {
> > +      CALL (impl, s, c1, n);
> I am a bit worried that the overhead from the first call with `c1` will distort the results.
> Is it possible to implement it with a nested loop where you fill `s` with `c1` for
> `n * inner_loop_iterations` in the outer loop and in the inner loop fill `c2` on `s + n * i`?
> In that case maybe 16 for inner loop iterations and 512 for outer loop iterations.

It seems that we have to set a smaller number, unless this implementation is wrong.
Extrapolating from the case where "iters = 32" took 23.3 seconds, the run
would take 99.4 minutes.
(8192/32*23.3/60=99.4)


#define START_SIZE (16 * 1024)
...
static void
__attribute__((noinline, noclone))
do_one_test (json_ctx_t *json_ctx, impl_t *impl, CHAR *s,
             int c1 __attribute ((unused)), int c2 __attribute ((unused)),
             size_t n)
{
  size_t i, j, iters = INNER_LOOP_ITERS; // 32;
  timing_t start, stop, cur, latency = 0;

  for (i = 0; i < 512; i++) // for (i = 0; i < 2; i++)
    {
      CALL (impl, s, c1, n * 16);
      TIMING_NOW (start);
      for (j = 0; j < 16; j++)
        CALL (impl, s + n * j, c2, n);
      TIMING_NOW (stop);
      TIMING_DIFF (cur, start, stop);
      TIMING_ACCUM (latency, cur);
    }

  json_element_double (json_ctx, (double) latency / (double) iters);
}

> > +      CALL (impl, s, c2, n);
> > +    }
> > +  TIMING_NOW (stop);
> > +
> > +  TIMING_DIFF (cur, start, stop);
> > +
> > +  json_element_double (json_ctx, (double) cur / (double) iters);
> > +}
> > +
> > +static void
> > +do_test (json_ctx_t *json_ctx, size_t align, int c1, int c2, size_t len)
> > +{
> > +  align &= 63;
> Can you make this `align &= getpagesize () - 1;`?

I'll change it.

Thanks.
Naohiro
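
For readers who want to try the nested-loop timing scheme and the run-time estimate from this mail outside the glibc benchtests tree, here is a minimal standalone sketch. It is not the benchtest code: `fill_fn`, `now_ns`, `bench_fill` and `estimate_minutes` are hypothetical names chosen for illustration, and POSIX `clock_gettime` stands in for the `TIMING_NOW`/`TIMING_DIFF` macros. The outer loop refills the buffer with `c1` untimed, and only the inner fills with `c2` are timed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Same shape as proto_t in the patch: a memset-like function.  */
typedef void *(*fill_fn) (void *, int, size_t);

/* Hypothetical helper: linear extrapolation of run time, mirroring the
   estimate 8192 / 32 * 23.3 / 60 from the mail.  */
static double
estimate_minutes (double target_iters, double measured_iters,
                  double measured_seconds)
{
  assert (measured_iters > 0);
  return target_iters / measured_iters * measured_seconds / 60.0;
}

/* Monotonic clock in nanoseconds; an assumption standing in for
   TIMING_NOW.  */
static double
now_ns (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return (double) ts.tv_sec * 1e9 + (double) ts.tv_nsec;
}

/* Return the mean latency in ns of one timed fill of N bytes with C2.
   BUF must hold at least N * INNER bytes.  */
static double
bench_fill (fill_fn fill, unsigned char *buf, int c1, int c2, size_t n,
            size_t outer, size_t inner)
{
  assert (fill != NULL && buf != NULL && outer > 0 && inner > 0);
  double total = 0.0;

  for (size_t i = 0; i < outer; i++)
    {
      /* Untimed: overwrite the whole region with c1 first.  */
      fill (buf, c1, n * inner);

      double start = now_ns ();
      for (size_t j = 0; j < inner; j++)
        fill (buf + n * j, c2, n);
      total += now_ns () - start;
    }

  /* outer * inner calls were timed in total.  */
  return total / (double) (outer * inner);
}
```

With 512 outer and 16 inner iterations, a call such as `bench_fill (memset, buf, 1, 0, n, 512, 16)` times 8192 fills in total, which matches dividing by `INNER_LOOP_ITERS` in the code above. `estimate_minutes (8192, 32, 23.3)` reproduces the roughly 99.4 minute figure.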