Subject: Re: i386: Lazy binding trampoline and vector register usage
To: Szabolcs Nagy, Florian Weimer, "H.J. Lu"
Cc: nd, "libc-alpha@sourceware.org"
References: <87lfra2as7.fsf@oldenburg2.str.redhat.com> <7648385b-ebd0-efa6-73e8-1a91cfc517f0@arm.com>
From: Carlos O'Donell
Message-ID: <8edb1937-b91b-99a1-2027-1de51914098c@redhat.com>
Date: Sat, 1 Feb 2020 01:14:35 -0500
In-Reply-To: <7648385b-ebd0-efa6-73e8-1a91cfc517f0@arm.com>

On 1/9/20 5:49 AM, Szabolcs Nagy wrote:
> On 18/12/2019 10:22, Florian Weimer wrote:
>> We have this in sysdeps/i386/Makefile:
>>
>> # Make sure no code in ld.so uses mm/xmm/ymm/zmm registers on i386 since
>> # the first 3 mm/xmm/ymm/zmm registers are used to pass vector parameters
>> # which must be preserved.
>> # With SSE disabled, ensure -fpmath is not set to use sse either.
>> rtld-CFLAGS += -mno-sse -mno-mmx -mfpmath=387
>> ifeq ($(subdir),elf)
>> CFLAGS-.os += $(if $(filter $(@F),$(patsubst %,%.os,$(all-rtld-routines))),\
>>                 $(rtld-CFLAGS))
>>
>> tests-special += $(objpfx)tst-ld-sse-use.out
>> $(objpfx)tst-ld-sse-use.out: ../sysdeps/i386/tst-ld-sse-use.sh $(objpfx)ld.so
>>         @echo "Checking ld.so for SSE register use. This will take a few seconds..."
>>         $(BASH) $< $(objpfx) '$(NM)' '$(OBJDUMP)' '$(READELF)' > $@; \
>>         $(evaluate-test)
>> else
>> CFLAGS-.os += $(if $(filter rtld-%.os,$(@F)), $(rtld-CFLAGS))
>> endif
>>
>> The idea is that we do not need to save and restore vector registers in
>> the trampoline (or align the stack) if we compile ld.so in such a way
>> that only general registers are used. But that does not actually work
>> in all cases because lazy binding can call malloc, which lives in
>> libc.so or might even be interposed, and is thus free to use vector
>> registers.
>>
>> What should we do about this? Calling malloc from _dl_fixup is unsafe
>> for other reasons because lazy binding can happen in signal handlers, so
>> maybe this would be fixed if we switched to a non-interposable
>> async-signal-safe allocator?
>
> note that ifunc resolvers can also run during lazy binding
> and those can execute arbitrary user code (even if the
> allocator issue is fixed).

(1) ifunc resolvers

I think ifunc resolvers are a unique problem that needs to be handled
by specific solutions for ifunc. I think the resolvers should not run
during lazy binding. They are effectively user code and should be
handled more like initializers, and that means processing them
up-front to ensure dlopen completes successfully.

(2) Non-interposable AS-safe allocator.

The history behind having a non-interposable AS-safe allocator looks
like this:

- Google proposed and wrote patches to create a no-interpose AS-safe
  allocator.
  https://www.sourceware.org/ml/libc-alpha/2013-09/msg00721.html

- We accepted the patches and they solved the "lazy TLS allocation on
  first use in a signal handler" bug, which can cause calloc to be
  called illegally from a signal handler if you happen to touch TLS
  for the first time in a signal handler.

- We subsequently had reports of tooling (I can't remember which, one
  of the sanitizers) losing track of TLS entirely because of this new
  internal allocator.

- We reverted the patches.

In the end we accepted that for TLS the allocation should just happen
upfront at dlopen time.

I'm not entirely sold on the idea of having to do all the allocations
upfront and I *like* the idea of a non-interposable AS-safe allocator
for ld/libc's own internal uses that a user never sees and can never
observe. So for example if we have internal bookkeeping to allocate
for TLS then we can use that allocator to create the details of the
bookkeeping.

However, this must be balanced against the users' desire to control
their own allocation strategy. Therefore they must be able to have
some control over larger allocations and their placement via the
malloc family APIs where possible.

In summary:

- We need to keep using malloc for users to be able to interpose.

- For internal bookkeeping we could use a non-interposable AS-safe
  allocator.

- I don't know if we can solve our current problems entirely with
  a non-interposable AS-safe allocator.

(3) Calling malloc API functions from _dl_fixup is wrong.

In the case of _dl_fixup we may need to call add_dependency and malloc
new link maps. This is all wrong for lazy binding. This is more work
that needs to be moved to *before* we commit to ever running anything
in that library. The new link maps are part of our scope tracking and
with lots of DSOs this could be quite a bit of memory hidden inside
a local allocator or allocated up-front.

--
Cheers,
Carlos.
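
For illustration only, here is a minimal sketch of what a
non-interposable allocator for internal bookkeeping, as discussed in
(2), could look like. It is not glibc code and not the reverted
patch set. The names internal_arena_init and internal_alloc, the
fixed 1 MiB arena and the 16-byte alignment are all assumptions made
up for the example. The arena is mapped once up-front; after that
the allocation path makes no library calls and takes no locks, only
a compare-and-swap, so it should be usable from a signal handler
provided the pointer-sized atomic is lock-free on the target. Static
linkage stands in for the hidden visibility a real implementation
would use to keep the symbols from being interposed.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS under strict -std=c11 */
#endif
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define ARENA_SIZE (1u << 20)   /* hypothetical fixed arena size */
#define ARENA_ALIGN 16u         /* hypothetical allocation granularity */

static_assert(ARENA_SIZE % ARENA_ALIGN == 0, "arena must be a multiple of the alignment");

static _Atomic(uintptr_t) arena_cur;    /* next free byte, 0 until initialised */
static uintptr_t arena_end;             /* one past the last byte of the arena */

/* Hypothetical: map the arena once, before any lazy binding can happen
   (for example at startup or dlopen time).  Returns 0 on success.  */
static int internal_arena_init(void)
{
  void *p = mmap(NULL, ARENA_SIZE, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  arena_end = (uintptr_t) p + ARENA_SIZE;
  atomic_store(&arena_cur, (uintptr_t) p);
  return 0;
}

/* Hypothetical: bump-pointer allocation.  Never calls malloc, never
   takes a lock, never frees.  Returns NULL when the arena is
   uninitialised or exhausted.  */
static void *internal_alloc(size_t size)
{
  if (size == 0 || size > ARENA_SIZE)
    return NULL;
  /* Round the request up to the allocation granularity.  */
  size_t need = (size + (ARENA_ALIGN - 1)) & ~(size_t) (ARENA_ALIGN - 1);
  uintptr_t old = atomic_load(&arena_cur);
  for (;;)
    {
      if (old == 0 || arena_end - old < need)
        return NULL;
      /* On failure OLD is reloaded with the current value and we retry.  */
      if (atomic_compare_exchange_weak(&arena_cur, &old, old + need))
        return (void *) old;
    }
}
```

A caller would run internal_arena_init() once before any lazy
binding can occur and then use internal_alloc() for small
bookkeeping records. The sketch never returns memory, which is the
main thing a real design would have to address, together with the
sanitizer visibility problem mentioned above.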