From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <libc-alpha-return-94969-e=80x24.org@sourceware.org>
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Unsubscribe: <mailto:libc-alpha-unsubscribe-e=80x24.org@sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Subject: Re: [RFC/PoC] malloc: use wfcqueue to speed up remote frees
To: Eric Wong <normalperson@yhbt.net>
Cc: libc-alpha@sourceware.org
References: <20180731084936.g4yw6wnvt677miti@dcvr>
 <0cfdccea-d173-486c-85f4-27e285a30a1a@redhat.com>
 <20180731231819.57xsqvdfdyfxrzy5@whir>
From: Carlos O'Donell <carlos@redhat.com>
Openpgp: preference=signencrypt
Message-ID: <c061de55-cc2a-88fe-564b-2ea9c4a7e632@redhat.com>
Date: Wed, 1 Aug 2018 00:41:01 -0400
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.8.0
MIME-Version: 1.0
In-Reply-To: <20180731231819.57xsqvdfdyfxrzy5@whir>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

On 07/31/2018 07:18 PM, Eric Wong wrote:
>> - Can you explain the RSS reduction given this patch? You
>> might think that just adding the frees to a queue wouldn't
>> result in any RSS gains.
> 
> At least two reasons I can see:
> 
> 1) With lock contention, the freeing thread can lose to the
>    allocating thread.  This makes the allocating thread hit
>    sysmalloc since it prevented the freeing thread from doing
>    its job.  sysmalloc is the slow path, so the lock gets held
>    even longer and the problem compounds from there.

How does this impact RSS? Contention would only keep the remote
thread from freeing in a timely fashion; that thread would still
eventually make progress.

> 2) thread caching - memory ends up in the wrong thread and
>    could never get used in some cases.  Fortunately this is
>    bounded, but still a waste.

We can't have memory end up in the wrong thread. The remote thread
computes the arena from the chunk it has, and then frees back to
the appropriate arena, even if it's not the arena that the thread
is attached to.
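
A rough sketch of that lookup, as a simplified model rather than the
actual glibc code: every name and size below is an illustrative
assumption. The idea is that a flag bit in the chunk header marks
non-main-arena chunks, and masking the chunk address down to the heap
alignment finds a header that points at the owning arena.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified model, NOT glibc's real definitions.  All names and
   sizes here are assumptions made for illustration.  */
#define HEAP_ALIGN ((size_t) 1 << 16)       /* stand-in for the heap size */
#define NON_MAIN_ARENA_BIT ((size_t) 0x4)   /* flag kept in the size field */

struct arena { int id; };

/* Sits at the start of every HEAP_ALIGN-aligned heap.  */
struct heap_header { struct arena *owner; };

/* Chunk header: the low bits of `size' carry flags.  */
struct chunk { size_t size; };

static struct arena main_arena_model = { 0 };

/* Recover the owning arena from the chunk alone, no matter which
   thread is doing the free.  */
static struct arena *
arena_for_chunk_model (struct chunk *c)
{
  if (!(c->size & NON_MAIN_ARENA_BIT))
    return &main_arena_model;
  /* Mask the address down to the heap boundary to find the header.  */
  struct heap_header *h
    = (struct heap_header *) ((uintptr_t) c & ~(uintptr_t) (HEAP_ALIGN - 1));
  return h->owner;
}

/* Build one aligned heap owned by `a' and return a chunk at `offset'
   bytes into it.  Returns NULL if the allocation fails.  */
static struct chunk *
make_chunk_in_heap (struct arena *a, size_t offset)
{
  void *mem = aligned_alloc (HEAP_ALIGN, HEAP_ALIGN);
  if (mem == NULL)
    return NULL;
  ((struct heap_header *) mem)->owner = a;
  struct chunk *c = (struct chunk *) ((char *) mem + offset);
  c->size = 64 | NON_MAIN_ARENA_BIT;
  return c;
}
```

A chunk created with make_chunk_in_heap (&some_arena, 4096) maps back
to &some_arena, while a chunk without the flag bit maps to the main
arena model.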

> I'm still new to the code, but it looks like threads are pinned
> to the arena and the memory used for arenas never gets released.
> Is that correct?

Threads are pinned to their arenas, but they can move in the event
of allocation failures, particularly to the main arena to attempt
sbrk to get more memory.

> I was wondering if there was another possibility: the allocating
> thread gives up the arena and creates a new one because the
> freeing thread locked it, but I don't think that's the case.

No.

> Also, if I spawn a bunch of threads and get a bunch of
> arenas early in the program lifetime; and then only have few
> threads later, there can be a lot of idle arenas.
 
Yes. That is true. We don't coalesce arenas to match the thread
demand.

>> However, you are calling _int_free a lot in a row and that
>> deinterleaving may help (you really want a vector free API here
>> so you don't walk all the lists so many times, tcache had the
>> same problem but in reverse for finding chunks). 
> 
> Maybe...  I think in the ideal case, the number of allocations
> and frees is close to 1:1, so the loop is kept short.
> 
> What may be worth trying is to bypass _int_free for cases where
> a chunk can fulfill the allocation which triggers it.  Delaying
> or avoiding consolidation could worsen fragmentation, though. 

Right.

>> - Adding urcu as a build-time dependency is not acceptable for
>> bootstrap, instead we would bundle a copy of urcu and keep it
>> in sync with upstream. Would that make your work easier?
> 
> Yes, bundling that sounds great.  I assume it's something for
> you or one of the regular contributors to work on (build systems
> scare me :x)

Yes, that is something we'd have to do.

>> - What problems are you having with `make -j4 check?' Try
>> master and report back.  We are about to release 2.28 so it
>> should build and pass.
> 
> My fault.  It seems like tests aren't automatically rerun when I
> change the code; so some of my broken work-in-progress changes
> ended up being false positives :x.  When working on this, I made
> the mistake of doing remote_free_step inside malloc_consolidate,
> which could recurse into _int_free or _int_malloc

Whether the tests get rerun automatically depends a bit on what you
touch.

> I guess I should remove the *.test-result files before rerunning
> tests?

Yes, that will definitely force the test to be re-run.

> I still get:
> 
> FAIL: nptl/tst-sched1
> 
> 	"Create failed"
> 
> 	I guess my system was overloaded.  pthread_create
> 	failures seemed to happen a lot for me when working
> 	on Ruby, too, and POSIX forcing EAGAIN makes it
> 	hard to diagnose :< (ulimit -u 47999 and 12GB RAM)
> 
> 	Removing the test-result and retrying seems OK.

OK. This one is new. There are a few tests where pthread_create
fails with EAGAIN because the kernel can't reap the children
fast enough.
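
For anyone diagnosing this in their own code, here is a minimal
sketch of treating EAGAIN from pthread_create as transient. The
helper create_with_retry, its retry count and the 10 ms back-off are
all hypothetical; this is not something glibc or its test suite does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper, not part of glibc or its tests: retry
   pthread_create up to `max_tries' times while it reports EAGAIN.
   Returns 0 on success or the last error number.  */
static int
create_with_retry (pthread_t *thr, void *(*fn) (void *), void *arg,
                   int max_tries)
{
  int err = EAGAIN;
  for (int i = 0; i < max_tries && err == EAGAIN; i++)
    {
      /* pthread_create returns the error number; it does not set
         errno.  */
      err = pthread_create (thr, NULL, fn, arg);
      if (err == EAGAIN)
        {
          /* Back off briefly so exiting threads can be cleaned up.  */
          struct timespec ts = { 0, 10 * 1000 * 1000 };  /* 10 ms */
          nanosleep (&ts, NULL);
        }
    }
  return err;
}

/* Trivial thread body used to exercise the helper.  */
static void *
set_flag (void *arg)
{
  *(int *) arg = 1;
  return NULL;
}
```

Calling create_with_retry (&t, set_flag, &flag, 5) and then joining
the thread is enough to try it out. A bounded retry only papers over
short-lived resource exhaustion; it does not help if a real limit
such as ulimit -u has been reached.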

> 
> FAIL: resolv/tst-resolv-ai_idn
> FAIL: resolv/tst-resolv-ai_idn-latin1
> 
> 	Not root, so no CLONE_NEWUTS
> 
> So I think that's expected...
> 

Agreed.

-- 
Cheers,
Carlos.