Re: [PATCH] Add malloc micro benchmark

From: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To: DJ Delorie <dj@redhat.com>
Cc: "carlos@redhat.com" <carlos@redhat.com>,
	"libc-alpha@sourceware.org" <libc-alpha@sourceware.org>,
	nd <nd@arm.com>
Subject: Re: [PATCH] Add malloc micro benchmark
Date: Thu, 28 Dec 2017 14:09:29 +0000
Message-ID: <DB6PR0801MB2053D2E9BF766A9EF7F47E3B83040@DB6PR0801MB2053.eurprd08.prod.outlook.com>
In-Reply-To: <xnr2rrabd9.fsf@greed.delorie.com>

DJ Delorie wrote:
> Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> > Since DJ didn't seem keen on increasing the tcache size despite it
> > showing major gains across a wide range of benchmarks,
>
> It's not that I'm not keen on increasing the size, it's that there are
> drawbacks to doing so and I don't want to base such a change on a guess
> (even a good guess).  If you have benchmarks, let's collect them and add
> them to the trace corpus.  I can send you my corpus.  (We don't have a
> good solution for centrally storing such a corpus, yet) Let's run all
> the tests against all the options and make an informed decision, that's
> all.  If it shows gains for synthetic benchmarks, but makes qemu slower,
> we need to know that.

Yes, I'd be interested in the traces. I presume they are ISA-independent and
can just be replayed?

> Also, as Carlos noted, there are some downstream uses where a larger
> cache may be detrimental.  Sometimes there are no universally "better"
> defaults, and we provide tunables for those cases.

It depends. I've seen cases where returning pages to the OS too quickly
causes a huge performance loss. I think in many of these cases we can
be far smarter and use adaptive algorithms. If, say, 50% of your memory
ends up in the tcache and you can't allocate a new block, it seems a good
idea to consolidate first. If it's less than 1%, why worry about it?
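
To make the adaptive idea concrete, here is a minimal sketch in C. It is
not glibc code: the struct, its counters, and the threshold parameter are
all hypothetical, and glibc does not expose bookkeeping in this form. It
only shows the shape of the decision "consolidate first when a large share
of memory sits in the cache, otherwise don't bother".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping, assumed for illustration only. */
struct heap_stats {
    size_t cached_bytes;  /* bytes currently parked in the thread cache */
    size_t total_bytes;   /* bytes the allocator has obtained from the OS */
};

/* Called when a new block cannot be allocated: returns true if the cache
   holds at least threshold_percent of total memory, in which case it is
   worth consolidating before asking the OS for more pages. */
static bool should_consolidate(const struct heap_stats *s,
                               unsigned threshold_percent)
{
    assert(threshold_percent >= 1 && threshold_percent <= 100);
    if (s->total_bytes == 0)
        return false;

    /* limit = total * threshold / 100, rounded down, computed without
       risking overflow in the multiplication. */
    size_t limit = (s->total_bytes / 100) * threshold_percent
                 + (s->total_bytes % 100) * threshold_percent / 100;
    return s->cached_bytes >= limit;
}
```

With a 50% threshold, a heap of 1000 bytes with 500 cached would trigger
consolidation, while one with 5 bytes cached (under 1%) would not.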

So short term there may be simple ways to tune tcache, e.g. allow a larger
number of small blocks (trivial change), or limit total bytes in the tcache
(which could be dynamically increased as more memory is allocated).
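
The second option, a byte limit that grows with the amount of memory
allocated, could look roughly like the sketch below. This is an
illustration under stated assumptions, not the tcache implementation: the
struct, the field names, and the "fixed fraction via shift" policy are all
made up for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread cache budget. */
struct cache_budget {
    size_t cached_bytes;  /* bytes currently held in the cache */
    size_t min_limit;     /* floor for the byte limit */
    size_t heap_bytes;    /* bytes currently allocated overall */
    unsigned shift;       /* limit is heap_bytes >> shift (a fixed fraction) */
};

/* Current byte limit: a fraction of the heap, but never below min_limit. */
static size_t cache_limit(const struct cache_budget *b)
{
    assert(b->shift < sizeof(size_t) * CHAR_BIT);
    size_t scaled = b->heap_bytes >> b->shift;
    return scaled > b->min_limit ? scaled : b->min_limit;
}

/* Account for a freed chunk of `size` bytes. Returns true if it fits in
   the budget and may stay cached; false means the caller should hand the
   chunk back to the arena instead. */
static bool cache_try_put(struct cache_budget *b, size_t size)
{
    size_t limit = cache_limit(b);
    if (size > limit || b->cached_bytes > limit - size)
        return false;
    b->cached_bytes += size;
    return true;
}

/* Account for a chunk of `size` bytes leaving the cache. */
static void cache_take(struct cache_budget *b, size_t size)
{
    assert(size <= b->cached_bytes);
    b->cached_bytes -= size;
}
```

A small process stays at the floor, while a process with a large heap is
allowed to keep proportionally more bytes cached.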

Longer term we need to make arenas per-thread - see below.

> Again, tcache is intended to help the multi-threaded case.  Your patches
> help the single-threaded case.  If you recall, I ran your patch against
> my corpus of multi-threaded tests, and saw no regressions, which is
> good.

Arenas are already mostly per-thread. My observation was that the gains
from tcache are due to bypassing completely uncontended locks.
If an arena could be marked as owned by a thread, the fast single-threaded
paths could be used all of the time (you'd have to handle frees from other
threads of course but those could go in a separate bin for consolidation).
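
A rough sketch of the ownership idea, in C11. None of this is glibc's
arena code: the types, the integer thread id passed in by the caller, and
the function names are assumptions made for the example. The owner's path
uses plain loads and stores; frees from other threads are pushed onto a
separate atomic list that the owner drains later.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct chunk { struct chunk *next; };

/* Hypothetical arena owned by exactly one thread. */
struct owned_arena {
    int owner;                            /* id of the owning thread */
    struct chunk *local_free;             /* owner only: no lock, no atomics */
    _Atomic(struct chunk *) remote_free;  /* chunks freed by other threads */
};

static void arena_init(struct owned_arena *a, int owner)
{
    a->owner = owner;
    a->local_free = NULL;
    atomic_init(&a->remote_free, NULL);
}

/* `self` is the caller's thread id; real code would more likely compare
   a thread-local arena pointer. */
static void arena_free(struct owned_arena *a, struct chunk *c, int self)
{
    if (self == a->owner) {               /* fast single-threaded path */
        c->next = a->local_free;
        a->local_free = c;
        return;
    }
    /* Remote free: lock-free push onto the separate list. */
    struct chunk *head =
        atomic_load_explicit(&a->remote_free, memory_order_relaxed);
    do {
        c->next = head;
    } while (!atomic_compare_exchange_weak_explicit(
                 &a->remote_free, &head, c,
                 memory_order_release, memory_order_relaxed));
}

/* Owner only: detach the whole remote list in one exchange and merge it
   into the local list. Returns the number of chunks reclaimed. */
static size_t arena_reclaim_remote(struct owned_arena *a, int self)
{
    assert(self == a->owner);
    struct chunk *c = atomic_exchange_explicit(&a->remote_free, NULL,
                                               memory_order_acquire);
    size_t n = 0;
    while (c != NULL) {
        struct chunk *next = c->next;
        c->next = a->local_free;
        a->local_free = c;
        c = next;
        n++;
    }
    return n;
}
```

Because other threads only ever push and the owner only ever takes the
whole list at once, the remote list needs no lock in this sketch.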

> So our paranoia here is twofold...
>
> 1. Make sure that when someone says "some benchmarks" we have those
>    benchmarks available to us, either as a microbenchmark in glibc or as
>    a trace we can simulate and benchmark.  No more random benchmarks! :-)

Agreed, it's quite feasible to create more traces and more microbenchmarks.

> 2. When we say a patch "is faster", let's run all our benchmarks and
>    make sure that we don't mean "on some benchmarks."  The whole point
>    of the trace/sim stuff is to make sure key downstream users aren't
>    left out of the optimization work, and end up with worse performance.

Well, you can't expect gains on all benchmarks or have a "never regress
anything ever" rule. Minor changes in alignment of a heap block or allocation
of pages from the OS can have a large performance impact that's hard to
control. The smallest possible RSS isn't always better. The goal should be to
improve average performance across a wide range of applications.

> We probably should add "on all major architectures" too but that assumes
> we have machines on which we can run the benchmarks.

Szabolcs or I would be happy to run the traces on AArch64.

Wilco
