From: DJ Delorie
Newsgroups: gmane.comp.lib.glibc.alpha
Subject: Re: [PATCH] Add malloc micro benchmark
Date: Mon, 18 Dec 2017 18:02:10 -0500
To: Wilco Dijkstra
Cc: carlos@redhat.com, libc-alpha@sourceware.org, nd@arm.com
In-Reply-To: (message from Wilco Dijkstra on Mon, 18 Dec 2017 15:18:47 +0000)

Wilco Dijkstra writes:
> Since DJ didn't seem keen on increasing the tcache size despite it
> showing major gains across a wide range of benchmarks,

It's not that I'm not keen on increasing the size, it's that there are
drawbacks to doing so and I don't want to base such a change on a guess
(even a good guess). If you have benchmarks, let's collect them and add
them to the trace corpus. I can send you my corpus. (We don't have a
good solution for centrally storing such a corpus, yet.) Let's run all
the tests against all the options and make an informed decision, that's
all. If it shows gains for synthetic benchmarks, but makes qemu slower,
we need to know that.

Also, as Carlos noted, there are some downstream uses where a larger
cache may be detrimental. Sometimes there are no universally "better"
defaults, and we provide tunables for those cases.

And, as always, I can be out-voted if the consensus disagrees with me ;-)

> I decided to fix the performance for the single-threaded case at
> least. It's now 2.5x faster on a few sever benchmarks (of course the
> next question is whether tcache is actually useful in its current
> form).

Again, tcache is intended to help the multi-threaded case. Your patches
help the single-threaded case. If you recall, I ran your patch against
my corpus of multi-threaded tests, and saw no regressions, which is
good.

So our paranoia here is twofold...

1. Make sure that when someone says "some benchmarks" we have those
   benchmarks available to us, either as a microbenchmark in glibc or
   as a trace we can simulate and benchmark. No more random
   benchmarks! :-)

2. When we say a patch "is faster", let's run all our benchmarks and
   make sure that we don't mean "on some benchmarks." The whole point
   of the trace/sim stuff is to make sure key downstream users aren't
   left out of the optimization work, and end up with worse
   performance.

We probably should add "on all major architectures" too, but that
assumes we have machines on which we can run the benchmarks.

So we should be able to answer your question, not just wonder...

> I'd have to check how easy it is to force it to use the thread arena.

I'm guessing we could have a glibc-internal API to tag the heap as
"corrupt", which would preclude using it.

> If consolidation doesn't work that's a serious bug.

Sometimes it's not so much a case of "doesn't work" as a case of "not
attempted for performance reasons". If we can show that a different
design choice is universally better[*], we should change it.

[*] or at least, universally-enough for a "system" allocator like the
    one glibc must provide.