From: Florian Weimer
To: Rich Felker
Cc: libc-alpha@sourceware.org
Subject: Re: Removing longjmp error handling from the dynamic loader
Date: Tue, 12 Mar 2019 11:30:28 +0100
Message-ID: <87mum0pepn.fsf@oldenburg2.str.redhat.com>
In-Reply-To: <20190311225200.GA23599@brightrain.aerifal.cx> (Rich Felker's
 message of "Mon, 11 Mar 2019 18:52:00 -0400")
References: <871s3lgtvu.fsf@oldenburg2.str.redhat.com>
 <20190306154013.GQ23599@brightrain.aerifal.cx>
 <877ed5zfrq.fsf@oldenburg2.str.redhat.com>
 <20190311225200.GA23599@brightrain.aerifal.cx>

* Rich Felker:

> Assuming ifunc resolvers aren't "allowed" to do much beyond probing
> hwcaps/cpuid/etc. to pick an implementation, I don't see any reason
> that resolver failure during an ifunc resolver function should
> terminate the process.

Depending on which documentation you read, IFUNC resolvers must not
depend on run-time relocations themselves. In that case, lazy binding
failure during execution of an IFUNC resolver cannot possibly happen in
a valid program.

We would also have to disable signals while IFUNC resolvers are
running, so that lazy binding errors in signal handlers do not leak
into the dlopen call (after longjmp'ing out of the signal handler). Is
this really worth the trouble?

> Missing symbols at dlopen time with RTLD_NOW or DT_BINDNOW or whatever
> should never crash the application, but should report the error. With
> ifunc, I think (?) you have the possibility that the ifunc resolver
> code will call another function in the library being loaded (or one of
> its deps) via a plt slot that hasn't yet been initialized, because
> there's no way to know a dependency order for the relocations to avoid
> this.

We perform relocations in topological order, and IRELATIVE relocations
are sorted last by current binutils.
So in the absence of cyclic dependencies (perhaps as the result of
symbol interposition), IFUNC resolvers will not encounter uninitialized
PLT slots.

> This should probably longjmp back and make dlopen fail; I can't see
> any other way to make it work since there's no way to make forward
> progress past the impossible-to-satisfy call.

If you can detect at all that the relocation has not been processed,
you could longjmp out of the IFUNC resolver and try something else,
until there is definitely no way to make progress. (I'm not saying
that this longjmp is valid, but it is a possibility.)

But I really don't see how we can make this work reliably because there
are relocation dependencies that do not involve lazy binding or PLT
calls, and we cannot detect those. The IFUNC resolver would just use
uninitialized data or data that is later overwritten.

Two-phase relocation processing in topological order (first all
non-IFUNC relocations, then the IFUNC relocations) seems to cover all
the practical cases involving symbol interposition. It deals correctly
with glibc's internal uses of those (which can currently lead to
crashes, see bug 21041). It also covers, by design, all relocations
for data symbols because they cannot involve IFUNCs.

It does not deal with all cases where an IFUNC resolver uses a function
pointer variable that has been initialized by a relocation, and that
value is itself the result of an IFUNC resolver. It also cannot
support cases where an IFUNC resolver depends on ELF constructors
having run for one of its dependencies (which some people did, until
distributions started building with BIND_NOW).

> But maybe the relocations can just be ordered such that this isn't a
> concern (by checking all symbolic references prior to doing any ifunc
> resolvers?).

I think doing that would be excessive and it wouldn't cover the case
where the IFUNC resolver actually relies on lazy binding for choosing
the implementation (which could be construed as valid if IFUNC
resolvers may rely on relocations).

>> If we want to give users more precise control over binding errors, I
>> don't think anything based on SJLJ-style exception handling is the
>> answer.
>
> I don't see why there should be any expectation that you can use C++
> exception handling for this; the contract of dlopen is that it succeed
> or return an error, not that it might terminate via an exception.

I meant for a call that results in a lazy binding failure.

Thanks,
Florian
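
Illustrative sketch (not from the message, and not glibc code): the
two-phase ordering described above can be modelled with a toy loader
that first applies every non-IFUNC relocation across all objects, in
dependency order, and only then applies the IFUNC ones. All names here
(toy_reloc, toy_object, trace, apply_kind, relocate_two_phase) are
hypothetical, and the objects are assumed to be sorted topologically
already, dependencies first.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in glibc. */
enum reloc_kind { RELOC_PLAIN, RELOC_IFUNC };

struct toy_reloc {
    enum reloc_kind kind;
    int id;                     /* label recorded in the trace */
};

struct toy_object {
    const struct toy_reloc *relocs;
    size_t nrelocs;
};

#define TRACE_MAX 32

/* Records the order in which relocations were "applied". */
struct trace {
    int ids[TRACE_MAX];
    size_t len;
};

/* Apply every relocation of one kind, visiting objects in the given order. */
static void apply_kind(const struct toy_object *const *order, size_t nobjs,
                       enum reloc_kind kind, struct trace *t)
{
    for (size_t i = 0; i < nobjs; i++) {
        for (size_t j = 0; j < order[i]->nrelocs; j++) {
            if (order[i]->relocs[j].kind == kind) {
                assert(t->len < TRACE_MAX);
                t->ids[t->len++] = order[i]->relocs[j].id;
            }
        }
    }
}

/* `order` is assumed to be topologically sorted, dependencies first. */
static void relocate_two_phase(const struct toy_object *const *order,
                               size_t nobjs, struct trace *t)
{
    t->len = 0;
    apply_kind(order, nobjs, RELOC_PLAIN, t);   /* phase 1: non-IFUNC */
    apply_kind(order, nobjs, RELOC_IFUNC, t);   /* phase 2: IFUNC */
}
```

In this model, a resolver that runs in phase 2 sees every plain
relocation of every object already applied. It does nothing for the
cases listed above as unsupported, such as a resolver that reads a
function pointer which is itself produced by another IFUNC relocation.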