From: "Lucas A. M. Magalhaes via Libc-alpha"
Reply-To: "Lucas A. M. Magalhaes"
To: Naohiro Tamura, libc-alpha@sourceware.org
Subject: Re: [PATCH] benchtests: Add memset zero fill benchmark tests
Date: Tue, 13 Jul 2021 10:50:07 -0300
Message-ID: <162618420792.2738828.14663488304092122966@localhost.localdomain>
In-Reply-To: <20210713082214.307529-1-naohirot@fujitsu.com>
References: <20210713082214.307529-1-naohirot@fujitsu.com>

Hi Naohiro,

Thanks for working on this. I like the idea of a memset benchmark specific
to a fill value of 0. However, having two implementations seems like too
much. I would rather see just one bench-memset-zerofill.c. Even better, I
think, would be to have this performance test inside bench-memset.c and
bench-memset-large.c.

Quoting Naohiro Tamura via Libc-alpha (2021-07-13 05:22:14)
> Memset takes 0 as the second parameter in most cases.
> More than 95% of memset takes 0 as the second parameter in case of
> Linux Kernel source code.

The Linux kernel does not use glibc; it has its own memset implementation.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/string.c#n784
Therefore, IMO, this argument is not suited to this commit.

> However, we cannot measure the zero fill performance by
> bench-memset-zerofill.c and bench-memset-large-zerofill.c.
> This patch provides bench-memset-zerofill.c and
> bench-memset-large-zerofill.c which are suitable to see the
> performance of zero fill by fixing the second parameter to 0.

In the first sentence of this section, I guess you meant bench-memset.c and
bench-memset-large.c, not bench-memset-zerofill.c and
bench-memset-large-zerofill.c.
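To make the suggestion about folding this into the existing benchmarks a bit
more concrete: the idea is to treat 0 as one more fill value in a single
benchmark loop instead of adding separate files. Below is a minimal standalone
sketch of that idea. It is not the glibc benchtests harness (no IMPL, CALL or
json-lib), and the names time_memset and run_fill_benchmark are hypothetical,
made up for this illustration. It assumes a C11 libc that provides
timespec_get.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not part of glibc's benchtests): average time in
   nanoseconds of one memset (buf, c, len) call over ITERS iterations.  */
static double
time_memset (unsigned char *buf, int c, size_t len, size_t iters)
{
  /* Call through a volatile function pointer so the compiler cannot
     merge or drop the repeated calls.  */
  void *(*volatile fn) (void *, int, size_t) = memset;
  struct timespec start, stop;

  timespec_get (&start, TIME_UTC);
  for (size_t i = 0; i < iters; ++i)
    fn (buf, c, len);
  timespec_get (&stop, TIME_UTC);

  double ns = (double) (stop.tv_sec - start.tv_sec) * 1e9
	      + (double) (stop.tv_nsec - start.tv_nsec);
  return ns / (double) iters;
}

/* Hypothetical driver: time every power-of-two length up to MAX_LEN for
   every fill value in FILLS, so zero fill is just one more entry in the
   list.  Returns the number of measurements, or -1 if allocation fails.  */
static int
run_fill_benchmark (const int *fills, size_t nfills, size_t max_len)
{
  unsigned char *buf = malloc (max_len);
  if (buf == NULL)
    return -1;

  int count = 0;
  for (size_t f = 0; f < nfills; ++f)
    for (size_t len = 1; len <= max_len; len <<= 1)
      {
	double ns = time_memset (buf, fills[f], len, 16);
	printf ("char=%d length=%zu ns=%.1f\n", fills[f], len, ns);
	++count;
      }

  free (buf);
  return count;
}
```

Calling run_fill_benchmark with a list such as { 0, 65 } and a maximum length
of 1024 prints one line per (fill value, length) pair, so the zero-fill and
non-zero timings land side by side in the same output.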
> ---
>  benchtests/Makefile                      |   3 +-
>  benchtests/bench-memset-large-zerofill.c | 125 ++++++++++++++++++
>  benchtests/bench-memset-zerofill.c       | 156 +++++++++++++++++++++++
>  3 files changed, 283 insertions(+), 1 deletion(-)
>  create mode 100644 benchtests/bench-memset-large-zerofill.c
>  create mode 100644 benchtests/bench-memset-zerofill.c
>
> diff --git a/benchtests/Makefile b/benchtests/Makefile
> index 1530939a8ce8..1261f7650fc7 100644
> --- a/benchtests/Makefile
> +++ b/benchtests/Makefile
> @@ -53,7 +53,8 @@ string-benchset := memccpy memchr memcmp memcpy memmem memmove \
>  		   strncasecmp strncat strncmp strncpy strnlen strpbrk strrchr \
>  		   strspn strstr strcpy_chk stpcpy_chk memrchr strsep strtok \
>  		   strcoll memcpy-large memcpy-random memmove-large memset-large \
> -		   memcpy-walk memset-walk memmove-walk
> +		   memcpy-walk memset-walk memmove-walk memset-zerofill \
> +		   memset-large-zerofill
>
>  # Build and run locale-dependent benchmarks only if we're building natively.
>  ifeq (no,$(cross-compiling))
> diff --git a/benchtests/bench-memset-large-zerofill.c b/benchtests/bench-memset-large-zerofill.c
> new file mode 100644
> index 000000000000..d8eae9d9789f
> --- /dev/null
> +++ b/benchtests/bench-memset-large-zerofill.c
> @@ -0,0 +1,125 @@
> +/* Measure memset functions with large data sizes.

Please fix this description.

> +   Copyright (C) 2016-2021 Free Software Foundation, Inc.

Just 2021 here.

> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#define TEST_MAIN
> +#define TEST_NAME "memset"
> +#define START_SIZE (128 * 1024)
> +#define MIN_PAGE_SIZE (getpagesize () + 64 * 1024 * 1024)
> +#define TIMEOUT (20 * 60)
> +#include "bench-string.h"
> +
> +#include <assert.h>
> +#include "json-lib.h"
> +

This code doesn't need assert.h.

> +void *generic_memset (void *, int, size_t);
> +typedef void *(*proto_t) (void *, int, size_t);
> +
> +IMPL (MEMSET, 1)
> +IMPL (generic_memset, 0)
> +
> +static void
> +do_one_test (json_ctx_t *json_ctx, impl_t *impl, CHAR *s,
> +	     int c __attribute ((unused)), size_t n)
> +{
> +  size_t i, iters = 16;
> +  timing_t start, stop, cur;
> +
> +  TIMING_NOW (start);
> +  for (i = 0; i < iters; ++i)
> +    {
> +      CALL (impl, s, c, n);
> +    }
> +  TIMING_NOW (stop);
> +
> +  TIMING_DIFF (cur, start, stop);
> +
> +  json_element_double (json_ctx, (double) cur / (double) iters);
> +}
> +
> +static void
> +do_test (json_ctx_t *json_ctx, size_t align, int c, size_t len)
> +{
> +  align &= 63;
> +  if ((align + len) * sizeof (CHAR) > page_size)
> +    return;
> +
> +  json_element_object_begin (json_ctx);
> +  json_attr_uint (json_ctx, "length", len);
> +  json_attr_uint (json_ctx, "alignment", align);
> +  json_attr_int (json_ctx, "char", c);
> +  json_array_begin (json_ctx, "timings");
> +
> +  FOR_EACH_IMPL (impl, 0)
> +    {
> +      do_one_test (json_ctx, impl, (CHAR *) (buf1) + align, c, len);
> +      alloc_bufs ();
> +    }
> +
> +  json_array_end (json_ctx);
> +  json_element_object_end (json_ctx);
> +}
> +
> +int
> +test_main (void)
> +{
> +  json_ctx_t json_ctx;
> +  size_t i;
> +  int c;
> +
> +  test_init ();
> +
> +  json_init (&json_ctx, 0, stdout);
> +
> +  json_document_begin (&json_ctx);
> +  json_attr_string (&json_ctx, "timing_type", TIMING_TYPE);
> +
> +  json_attr_object_begin (&json_ctx, "functions");
> +  json_attr_object_begin (&json_ctx, TEST_NAME);
> +  json_attr_string (&json_ctx, "bench-variant", "large-zerofill");
> +
> +  json_array_begin (&json_ctx, "ifuncs");
> +  FOR_EACH_IMPL (impl, 0)
> +    json_element_string (&json_ctx, impl->name);
> +  json_array_end (&json_ctx);
> +
> +  json_array_begin (&json_ctx, "results");
> +
> +  c = 0;
> +  for (i = START_SIZE; i <= MIN_PAGE_SIZE; i <<= 1)
> +    {
> +      do_test (&json_ctx, 0, c, i);
> +      do_test (&json_ctx, 3, c, i);
> +    }
> +
> +  json_array_end (&json_ctx);
> +  json_attr_object_end (&json_ctx);
> +  json_attr_object_end (&json_ctx);
> +  json_document_end (&json_ctx);
> +
> +  return ret;
> +}
> +
> +#include
> +
> +#define libc_hidden_builtin_def(X)
> +#define libc_hidden_def(X)
> +#define libc_hidden_weak(X)
> +#define weak_alias(X,Y)
> +#undef MEMSET
> +#define MEMSET generic_memset
> +#include
> diff --git a/benchtests/bench-memset-zerofill.c b/benchtests/bench-memset-zerofill.c
> new file mode 100644
> index 000000000000..ac20ae4c6537
> --- /dev/null
> +++ b/benchtests/bench-memset-zerofill.c
> @@ -0,0 +1,156 @@
> +/* Measure memset functions.

Please fix this description as well.

> +   Copyright (C) 2013-2021 Free Software Foundation, Inc.

Only 2021 here.

> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#define TEST_MAIN
> +#ifndef WIDE
> +# define TEST_NAME "memset"
> +#else
> +# define TEST_NAME "wmemset"
> +# define generic_memset generic_wmemset
> +#endif /* WIDE */
> +#define MIN_PAGE_SIZE 131072
> +#include "bench-string.h"
> +
> +#include "json-lib.h"
> +
> +#ifdef WIDE
> +CHAR *generic_wmemset (CHAR *, CHAR, size_t);
> +#else
> +void *generic_memset (void *, int, size_t);
> +#endif
> +
> +typedef void *(*proto_t) (void *, int, size_t);
> +
> +IMPL (MEMSET, 1)
> +IMPL (generic_memset, 0)
> +
> +static void
> +do_one_test (json_ctx_t *json_ctx, impl_t *impl, CHAR *s,
> +	     int c __attribute ((unused)), size_t n)
> +{
> +  size_t i, iters = INNER_LOOP_ITERS;
> +  timing_t start, stop, cur;
> +
> +  TIMING_NOW (start);
> +  for (i = 0; i < iters; ++i)
> +    {
> +      CALL (impl, s, c, n);
> +    }
> +  TIMING_NOW (stop);
> +
> +  TIMING_DIFF (cur, start, stop);
> +
> +  json_element_double (json_ctx, (double) cur / (double) iters);
> +}
> +
> +static void
> +do_test (json_ctx_t *json_ctx, size_t align, int c, size_t len)
> +{
> +  align &= 4095;
> +  if ((align + len) * sizeof (CHAR) > page_size)
> +    return;
> +
> +  json_element_object_begin (json_ctx);
> +  json_attr_uint (json_ctx, "length", len);
> +  json_attr_uint (json_ctx, "alignment", align);
> +  json_attr_int (json_ctx, "char", c);
> +  json_array_begin (json_ctx, "timings");
> +
> +  FOR_EACH_IMPL (impl, 0)
> +    {
> +      do_one_test (json_ctx, impl, (CHAR *) (buf1) + align, c, len);
> +      alloc_bufs ();
> +    }
> +
> +  json_array_end (json_ctx);
> +  json_element_object_end (json_ctx);
> +}
> +
> +int
> +test_main (void)
> +{
> +  json_ctx_t json_ctx;
> +  size_t i;
> +  int c = 0;
> +
> +  test_init ();
> +
> +  json_init (&json_ctx, 0, stdout);
> +
> +  json_document_begin (&json_ctx);
> +  json_attr_string (&json_ctx, "timing_type", TIMING_TYPE);
> +
> +  json_attr_object_begin (&json_ctx, "functions");
> +  json_attr_object_begin (&json_ctx, TEST_NAME);
> +  json_attr_string (&json_ctx, "bench-variant", "default-zerofill");
> +
> +  json_array_begin (&json_ctx, "ifuncs");
> +  FOR_EACH_IMPL (impl, 0)
> +    json_element_string (&json_ctx, impl->name);
> +  json_array_end (&json_ctx);
> +
> +  json_array_begin (&json_ctx, "results");
> +
> +  c = 0;
> +  for (i = 0; i < 18; ++i)
> +    do_test (&json_ctx, 0, c, 1 << i);
> +  for (i = 1; i < 64; ++i)
> +    {
> +      do_test (&json_ctx, i, c, i);
> +      do_test (&json_ctx, 4096 - i, c, i);
> +      do_test (&json_ctx, 4095, c, i);
> +      if (i & (i - 1))
> +	do_test (&json_ctx, 0, c, i);
> +    }
> +  for (i = 32; i < 512; i+=32)
> +    {
> +      do_test (&json_ctx, 0, c, i);
> +      do_test (&json_ctx, i, c, i);
> +    }
> +  do_test (&json_ctx, 1, c, 14);
> +  do_test (&json_ctx, 3, c, 1024);
> +  do_test (&json_ctx, 4, c, 64);
> +  do_test (&json_ctx, 2, c, 25);
> +  for (i = 33; i <= 256; i += 4)
> +    {
> +      do_test (&json_ctx, 0, c, 32 * i);
> +      do_test (&json_ctx, i, c, 32 * i);
> +    }
> +
> +  json_array_end (&json_ctx);
> +  json_attr_object_end (&json_ctx);
> +  json_attr_object_end (&json_ctx);
> +  json_document_end (&json_ctx);
> +
> +  return ret;
> +}
> +
> +#include
> +
> +#define libc_hidden_builtin_def(X)
> +#define libc_hidden_def(X)
> +#define libc_hidden_weak(X)
> +#define weak_alias(X,Y)
> +#ifndef WIDE
> +# undef MEMSET
> +# define MEMSET generic_memset
> +# include
> +#else
> +# define WMEMSET generic_wmemset
> +# include
> +#endif
> --
> 2.17.1
>

---
Lucas A. M. Magalhães