From: Wilco Dijkstra
To: Siddhesh Poyarekar, Adhemerval Zanella, "libc-alpha@sourceware.org"
CC: nd
Subject: Re: [PATCH] Improve string benchtest timing
Date: Wed, 22 May 2019 11:11:20 +0000
In-Reply-To: <3acd7a7f-c06a-6679-d526-e758d9ff30ab@gotplt.org>

Hi Siddhesh,

> On 21/05/19 7:28 PM, Wilco Dijkstra wrote:
>> Well the test doesn't actually test misaligned copies - both source and
>> destination are always mutually aligned, so any memcpy implementation which
>> aligns either source or destination will only do aligned copies.
>
> They were not intended to be mutually misaligned, that was not the
> intent of the benchmark since the target application it modeled did not
> have such inputs.
>
>> In any case I'm not sure what the test is supposed to measure - the scores are
>> identical across all memcpy implementations. The time taken for double the
>> copy size is exactly twice as much.
>
> Right, you'll probably only see differences in case of mutually
> misaligned inputs.

Well, if I force the copies to be mutually unaligned, there is only about a 1%
difference for a few of the memcpy implementations compared to them being
always aligned. The others show identical performance whether aligned or not.
This is not too surprising, since the test is basically waiting for DRAM most
of the time.

So if we wanted to measure something useful, we'd need to do it differently.
Maybe the goal was to measure DRAM bandwidth? If so, we could modify it to
compare copy bandwidth for just a few different sizes (corresponding to typical
L1/L2/L3 sizes).

Wilco
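
For illustration only, the bandwidth comparison suggested above could look
roughly like the sketch below. It is not taken from the glibc benchtests: the
names copy_sizes, copy_bandwidth and print_copy_bandwidth are hypothetical, the
four sizes are guesses at typical L1/L2/L3/DRAM working sets rather than
measured values, and the compiler barrier assumes GCC or Clang.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical buffer sizes intended to fit in L1, L2, L3 and DRAM.
   These are assumptions; adjust them to the machine under test.  */
static const size_t copy_sizes[] = {
  16 * 1024, 256 * 1024, 4 * 1024 * 1024, 64 * 1024 * 1024
};

static double
now_seconds (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return (double) ts.tv_sec + (double) ts.tv_nsec * 1e-9;
}

/* Copy SIZE bytes repeatedly until about TOTAL bytes have been moved.
   Returns bytes per second, or -1.0 if allocation fails.  */
static double
copy_bandwidth (size_t size, size_t total)
{
  char *src = malloc (size);
  char *dst = malloc (size);
  if (src == NULL || dst == NULL)
    {
      free (src);
      free (dst);
      return -1.0;
    }

  /* Touch both buffers so page faults are not part of the timing.  */
  memset (src, 1, size);
  memset (dst, 2, size);

  size_t iters = total / size;
  if (iters == 0)
    iters = 1;

  double start = now_seconds ();
  for (size_t i = 0; i < iters; i++)
    {
      memcpy (dst, src, size);
      /* Compiler barrier (GCC/Clang extension) so the copy is kept.  */
      __asm__ volatile ("" : : "r" (dst) : "memory");
    }
  double elapsed = now_seconds () - start;
  if (elapsed <= 0.0)
    elapsed = 1e-9;   /* Guard against a coarse timer.  */

  /* Sanity check that the data really was copied.  */
  assert (dst[0] == src[0] && dst[size - 1] == src[size - 1]);

  free (src);
  free (dst);
  return (double) size * (double) iters / elapsed;
}

/* Print one bandwidth figure per size, moving about 256 MiB each.  */
void
print_copy_bandwidth (void)
{
  size_t total = (size_t) 256 * 1024 * 1024;
  for (size_t i = 0; i < sizeof copy_sizes / sizeof copy_sizes[0]; i++)
    printf ("%10zu bytes: %8.2f GB/s\n", copy_sizes[i],
            copy_bandwidth (copy_sizes[i], total) / 1e9);
}
```

Calling print_copy_bandwidth () from main prints one line per size. A real
benchtest would also loop over the available memcpy implementations and repeat
each measurement, which this sketch leaves out.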