unofficial mirror of libc-alpha@sourceware.org
From: "H.J. Lu via Libc-alpha" <libc-alpha@sourceware.org>
To: liqingqing <liqingqing3@huawei.com>
Cc: Hushiyuan <hushiyuan@huawei.com>,
	"libc-alpha@sourceware.org" <libc-alpha@sourceware.org>
Subject: Re: [PATCH] x86: Add thresholds for "rep movsb/stosb" to tunables
Date: Thu, 28 May 2020 04:56:52 -0700
Message-ID: <CAMe9rOoMCW_frUoO6G4YC95q2U=EChdiRWbsSsOVy=d-OHSNqg@mail.gmail.com>
In-Reply-To: <CAMe9rOoYXMdOfedtZLx=GT-nFXThzoo7Q__H4vg=2vyOufGY6A@mail.gmail.com>

On Fri, May 22, 2020 at 9:37 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Fri, May 22, 2020 at 9:10 PM liqingqing <liqingqing3@huawei.com> wrote:
> >
> > Commit 830566307f038387ca0af3fd327706a8d1a2f595 optimizes the implementation of memset
> > and sets the default value of the macro REP_STOSB_THRESHOLD to 2KB. When the input length is less than 2KB, the data flow is unchanged; when the input length is larger than 2KB,
> > memset uses STOSB instead of MOVQ.
> >
> > But when I tested this API on an x86_64 platform,
> > I found that this default value is not appropriate for some input lengths. Here are the environment and the results:
> >
> > test suite: libMicro-0.4.0
> >         ./memset -E -C 200 -L -S -W -N "memset_4k"    -s 4k    -I 250
> >         ./memset -E -C 200 -L -S -W -N "memset_4k_uc" -s 4k    -u -I 400
> >         ./memset -E -C 200 -L -S -W -N "memset_1m"    -s 1m   -I 200000
> >         ./memset -E -C 200 -L -S -W -N "memset_10m"   -s 10m -I 2000000
> >
> > hardware platform:
> >         Intel(R) Xeon(R) Gold 6266C CPU @ 3.00GHz
> >         L1d cache: 32KB
> >         L1i cache: 32KB
> >         L2 cache: 1MB
> >         L3 cache: 60MB
> >
> > The result is that when the input length is between the processor's L1 data cache size and L2 cache size, REP_STOSB_THRESHOLD=2KB reduces performance.
> >
> >                 before this commit   after this commit
> >                 (cycles)             (cycles)
> > memset_4k       249                  96
> > memset_10k      657                  185
> > memset_36k      2773                 3767
> > memset_100k     7594                 10002
> > memset_500k     37678                52149
> > memset_1m       86780                108044
> > memset_10m      1307238              1148994
> >
> >                 before this commit   after this commit
> >                 MLC cache misses     MLC cache misses
> >                 (10 sec)             (10 sec)
> > memset_4k       1,09,33,823          1,01,79,270
> > memset_10k      1,23,78,958          1,05,41,087
> > memset_36k      3,61,64,244          4,07,22,429
> > memset_100k     8,25,33,052          9,31,81,253
> > memset_500k     37,32,55,449         43,56,70,395
> > memset_1m       75,16,28,239         88,29,90,237
> > memset_10m      9,36,61,67,397       8,96,69,49,522
> >
> >
> > Although REP_STOSB_THRESHOLD can be modified at build time with -DREP_STOSB_THRESHOLD=xxx,
> > I think the default value is not the best choice, because most processors' L2 caches are larger than 2KB. So I am submitting the patch below:
> >
> >
> >
> > From 44314a556239a7524b5a6451025737c1bdbb1cd0 Mon Sep 17 00:00:00 2001
> > From: liqingqing <liqingqing3@huawei.com>
> > Date: Thu, 21 May 2020 11:23:06 +0800
> > Subject: [PATCH] update REP_STOSB_THRESHOLD's default value from 2k to 1M
> >
> > The current value of the macro REP_STOSB_THRESHOLD reduces memset performance when the input length is between the processor's L1 data cache size and L2 cache size,
> > so update the default value to eliminate the regression.
> >
>
> There is no single threshold value which is good for all workloads.
> I don't think we should change REP_STOSB_THRESHOLD to 1MB.
> On the other hand, the fixed threshold isn't flexible.  Please try this
> patch to see if you can set the threshold for your specific workload.
>

Any comments, objections?

https://sourceware.org/pipermail/libc-alpha/2020-May/114281.html
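
For readers following the thread, the trade-off discussed above can be
sketched in C. This is only an illustration under stated assumptions, not
the code from the patch: the enum, the two function names and the
DEMO_STOSB_THRESHOLD environment variable are hypothetical. Only the
REP_STOSB_THRESHOLD macro, its 2KB default and the -D build-time override
come from the quoted report. The sketch shows a build-time default that a
run-time setting can replace, which is the flexibility a fixed threshold
lacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Build-time default, overridable with -DREP_STOSB_THRESHOLD=xxx as
   described in the quoted report.  2048 mirrors the 2KB default.  */
#ifndef REP_STOSB_THRESHOLD
# define REP_STOSB_THRESHOLD 2048
#endif

/* Hypothetical names, for illustration only.  */
enum fill_strategy { FILL_VECTOR_LOOP, FILL_REP_STOSB };

/* Lengths above the threshold take the "rep stosb" path; shorter
   lengths keep the existing store loop.  */
static enum fill_strategy
choose_fill_strategy (size_t len, size_t threshold)
{
  return len > threshold ? FILL_REP_STOSB : FILL_VECTOR_LOOP;
}

/* Read a run-time threshold from the hypothetical environment variable
   DEMO_STOSB_THRESHOLD, falling back to the build-time default when it
   is unset, empty or not a plain number.  */
static size_t
runtime_threshold (void)
{
  const char *s = getenv ("DEMO_STOSB_THRESHOLD");
  if (s == NULL || *s == '\0')
    return REP_STOSB_THRESHOLD;

  char *end;
  unsigned long long v = strtoull (s, &end, 0);
  assert (end != NULL);
  return *end == '\0' ? (size_t) v : REP_STOSB_THRESHOLD;
}
```

With the 2KB default, choose_fill_strategy (36 * 1024, runtime_threshold ())
selects the "rep stosb" path, the range where the report measured a
slowdown. Running the same program with DEMO_STOSB_THRESHOLD=1048576 would
keep the store loop for that length without rebuilding.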

-- 
H.J.


Thread overview: 32+ messages
2020-03-16  7:30 pthread_cond performence Discussion liqingqing
2020-03-18 12:12 ` Carlos O'Donell via Libc-alpha
2020-03-18 12:53   ` Torvald Riegel via Libc-alpha
2020-03-18 14:42     ` Carlos O'Donell via Libc-alpha
2020-05-23  4:04 ` liqingqing
2020-05-23  4:10   ` [PATCH]x86: update REP_STOSB_THRESHOLD's default value from 2k to 1M liqingqing
2020-05-23  4:37     ` [PATCH] x86: Add thresholds for "rep movsb/stosb" to tunables H.J. Lu via Libc-alpha
2020-05-28 11:56       ` H.J. Lu via Libc-alpha [this message]
2020-05-28 13:47         ` liqingqing
2020-05-29 13:13       ` Carlos O'Donell via Libc-alpha
2020-05-29 13:21         ` H.J. Lu via Libc-alpha
2020-05-29 16:18           ` Carlos O'Donell via Libc-alpha
2020-06-01 19:32             ` H.J. Lu via Libc-alpha
2020-06-01 19:38               ` Carlos O'Donell via Libc-alpha
2020-06-01 20:15                 ` H.J. Lu via Libc-alpha
2020-06-01 20:19                   ` H.J. Lu via Libc-alpha
2020-06-01 20:48                     ` Florian Weimer
2020-06-01 20:56                       ` Carlos O'Donell via Libc-alpha
2020-06-01 21:13                         ` H.J. Lu via Libc-alpha
2020-06-01 22:43                           ` H.J. Lu via Libc-alpha
2020-06-02  2:08                             ` Carlos O'Donell via Libc-alpha
2020-06-04 21:00                               ` [PATCH] libc.so: Add --list-tunables H.J. Lu via Libc-alpha
2020-06-05 22:45                                 ` V2 " H.J. Lu via Libc-alpha
2020-06-06 21:51                                   ` V3 [PATCH] libc.so: Add --list-tunables support to __libc_main H.J. Lu via Libc-alpha
2020-07-02 18:00                                     ` Carlos O'Donell via Libc-alpha
2020-07-02 19:08                                       ` [PATCH] Update tunable min/max values H.J. Lu via Libc-alpha
2020-07-03 16:14                                         ` Carlos O'Donell via Libc-alpha
2020-07-03 16:54                                           ` [PATCH] x86: Add thresholds for "rep movsb/stosb" to tunables H.J. Lu via Libc-alpha
2020-07-03 17:43                                             ` Carlos O'Donell via Libc-alpha
2020-07-03 17:53                                               ` H.J. Lu via Libc-alpha
2020-12-21  4:38     ` [PATCH]x86: update REP_STOSB_THRESHOLD's default value from 2k to 1M Siddhesh Poyarekar
2020-12-22  1:02       ` Qingqing Li
