From: naohirot--- via Libc-alpha <libc-alpha@sourceware.org>
To: Noah Goldstein <goldstein.w.n@gmail.com>
Cc: GNU C Library <libc-alpha@sourceware.org>,
	Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Subject: Re: [PATCH v2 2/5] benchtests: Add memset zero fill benchtest
Date: Mon, 26 Jul 2021 08:39:05 +0000
Message-ID: <TYAPR01MB6025720E6684DFC3BBEC7474DFE89@TYAPR01MB6025.jpnprd01.prod.outlook.com>
In-Reply-To: <CAFUsyf+ofuxmBj_jpMrgeQb48BB=1A43iSVCgTRN3x6p22e6cQ@mail.gmail.com>

Hi Noah,

> I see. I think 16 for the inner loop makes sense. From the x86_64
> perspective this will keep the loop from running out of the LSD which is
> necessary for accurate benchmarking. I guess then somewhere between [2, 8]
> is reasonable for the outer loop?
> 
> 
> > #define START_SIZE (16 * 1024)
> > ...
> > static void
> > __attribute__((noinline, noclone))
> > do_one_test (json_ctx_t *json_ctx, impl_t *impl, CHAR *s,
> >              int c1 __attribute ((unused)), int c2 __attribute ((unused)),
> >              size_t n)
> > {
> >   size_t i, j, iters = INNER_LOOP_ITERS; // 32;
> >   timing_t start, stop, cur, latency = 0;
> >
> >   for (i = 0; i < 512; i++) // for (i = 0; i < 2; i++)
> >     {
> >
> >       CALL (impl, s, c1, n * 16);
> >       TIMING_NOW (start);
> >       for (j = 0; j < 16; j++)
> >         CALL (impl, s + n * j, c2, n);
> >       TIMING_NOW (stop);
> >       TIMING_DIFF (cur, start, stop);
> >       TIMING_ACCUM (latency, cur);
> >     }
> >
> This looks good. But as you said, a much smaller value for outer loop.

I made one improvement: I replaced
  CALL (impl, s, c1, n * 16);
with
  __builtin_memset (s, c1, n * 16);
and tentatively set the outer loop to two iterations, as follows:

-----
static void
__attribute__((noinline, noclone))
do_one_test (json_ctx_t *json_ctx, impl_t *impl, CHAR *s,
             int c1 __attribute ((unused)), int c2 __attribute ((unused)),
             size_t n)
{
  size_t i, j, iters = 32;
  timing_t start, stop, cur, latency = 0;

  for (i = 0; i < 2; i++)
    {
      __builtin_memset (s, c1, n * 16);
      TIMING_NOW (start);
      for (j = 0; j < 16; j++)
        CALL (impl, s + n * j, c2, n);
      TIMING_NOW (stop);
      TIMING_DIFF (cur, start, stop);
      TIMING_ACCUM (latency, cur);
    }

  json_element_double (json_ctx, (double) latency / (double) iters);
}
-----
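
For readers without the glibc benchtests tree at hand, the same loop
structure can be sketched as standalone C. This is only an illustration
under assumptions: bench_zero_fill, now_ns, memset_fn and INNER_ITERS are
hypothetical names, clock_gettime stands in for the TIMING_NOW and
TIMING_DIFF macros, and the result is reported in nanoseconds per call
rather than in the benchtests timing units.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define INNER_ITERS 16            /* inner loop count used in this thread */

/* Hypothetical type for the memset implementation under test.  */
typedef void *(*memset_fn) (void *, int, size_t);

/* Monotonic clock in nanoseconds; a stand-in for TIMING_NOW.  */
static uint64_t
now_ns (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return (uint64_t) ts.tv_sec * 1000000000ull + (uint64_t) ts.tv_nsec;
}

/* Return the mean time in nanoseconds per timed call of IMPL.
   BUF must hold at least N * INNER_ITERS bytes.  */
static double
bench_zero_fill (memset_fn impl, unsigned char *buf, size_t n,
                 int c1, int c2, size_t outer)
{
  assert (impl != NULL && buf != NULL && outer > 0);
  uint64_t total = 0;

  for (size_t i = 0; i < outer; i++)
    {
      /* Untimed refill of the whole buffer with C1, done by the C
         library memset rather than by the implementation under test.  */
      memset (buf, c1, n * INNER_ITERS);

      uint64_t start = now_ns ();
      /* Timed part: INNER_ITERS calls, each on its own N-byte slice.  */
      for (size_t j = 0; j < INNER_ITERS; j++)
        impl (buf + n * j, c2, n);
      total += now_ns () - start;
    }

  /* Divide by the number of timed calls (2 * 16 = 32 in the code above).  */
  return (double) total / (double) (outer * INNER_ITERS);
}
```

Calling bench_zero_fill (memset, buf, n, 1, 0, 2) with a buffer of
n * 16 bytes mirrors the two outer iterations chosen above; the divisor
then matches iters = 32.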

For __memset_generic on A64FX, running the benchmark with 8 and with 2
outer loop iterations took the following times:

8 iterations
real    0m26.236s
user    0m18.806s
sys     0m6.562s

2 iterations
real    0m12.956s
user    0m5.081s
sys     0m6.594s
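
As a rough way to read the two measurements, one can assume that the run
time is a fixed cost plus a cost proportional to the outer loop count.
That linear model is my assumption, not something measured, and
fit_linear_cost and struct linear_cost below are hypothetical names used
only for this sketch.

```c
#include <assert.h>

/* Assumed model: time = fixed + per_iter * outer_iterations.  */
struct linear_cost
{
  double fixed;     /* cost independent of the outer loop count */
  double per_iter;  /* cost added by each outer loop iteration */
};

/* Fit the two-parameter model through two (iterations, time) points.  */
static struct linear_cost
fit_linear_cost (double iters_a, double time_a,
                 double iters_b, double time_b)
{
  assert (iters_a != iters_b);
  struct linear_cost m;
  m.per_iter = (time_b - time_a) / (iters_b - iters_a);
  m.fixed = time_a - m.per_iter * iters_a;
  return m;
}
```

With the real times above, fit_linear_cost (2, 12.956, 8, 26.236) gives
roughly 2.2 seconds per outer iteration and about 8.5 seconds of fixed
cost, which would explain why going from 8 iterations to 2 only halves
the wall-clock time.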

The comparison graph [1] shows the performance difference between the two
settings; they differ at 16KB.
This difference should not be critical if the performance data is used
mainly to compare "before" with "after", for example the master version of
memset with a patched version of memset.


The graph [1] can be drawn as follows:

$ cat 2times/bench-memset-zerofill.out 8times/bench-memset-zerofill.out | \
> merge_strings4graph.sh __memset_generic 2times 8times | \
> plot_strings.py -l -p thru -v -


To use __builtin_memset() and to create the comparison graph [1],
I submitted two groundwork patches [2][3].

[1] https://drive.google.com/file/d/1vD1VE3pdHLoYdaAMWXtImvDlGFDHYkyx/view?usp=sharing
[2] https://sourceware.org/pipermail/libc-alpha/2021-July/129459.html
[3] https://sourceware.org/pipermail/libc-alpha/2021-July/129460.html

Thanks.
Naohiro
