From: Florian Weimer
To: Carlos O'Donell
Cc: libc-alpha@sourceware.org
Subject: Re: Removing longjmp error handling from the dynamic loader
Date: Wed, 13 Mar 2019 16:51:45 +0100
In-Reply-To: <16606277-89a7-f288-ac69-257d01a2f1c5@redhat.com>
 (Carlos O'Donell's message of "Mon, 11 Mar 2019 15:30:38 -0400")
Message-ID: <87imwmdb72.fsf@oldenburg2.str.redhat.com>

* Carlos O'Donell:

>> One thing we have not considered much so far is what you call “free
>> lists”, and generally not doing any unrecoverable actions until we reach
>> the point of no return in dlopen (and eliminating the possibility of
>> failures in dlclose, by avoiding calling malloc there).  For example, we
>> could store the temporary file descriptors in the exception context (by
>> a callout to libc.so.6) and make sure that they are closed when
>> unwinding.  We can avoid memory leaks for temporary allocations in much
>> the same way.
>>
>> In a sense, we would provide a more approachable programming environment
>> for unwinding.
>
> OK, so let's expand on that a bit, if you went down this route what do
> the steps look like? If you enumerate them a bit more I think we'll
> all be able to comment on them first, and then agree that it's the
> right way forward.

I haven't read all of the loader code, but I think we need to manage at
least the following resources:

  * The loader lock
  * Temporary memory allocations (e.g. file name buffers)
  * Persistent memory allocations related to the link map
  * Other persistent memory allocations
  * One file descriptor
  * File mappings
  * How new link maps are hooked into the loader state
  * Updates to global loader state (various search paths)
  * Maybe TLS data in the future (if we allocate it eagerly)

Ideally, there would be a precise point, clearly indicated in the source
code, at which we know that dlopen cannot fail.  I tried to identify
this point in the existing sources, but it is entirely unclear where it
should be.  Conceptually, I think we all agree that it has to come
before we call the ELF constructors.

After the point of no return, we cannot do anything that might
potentially fail in a non-critical fashion, such as allocating memory.
For example, in the current code, we appear to call malloc in
add_to_global, when adding the new objects to the global scope.  We
would have to perform this allocation beforehand (and make sure that we
do not need more memory afterwards because some other thread made
further changes).

So overall, I think the longjmp vs no longjmp discussion may be a
distraction, and the harder problems lie elsewhere.

Please also consider Rich Felker's comments regarding lazy binding
failures in IFUNC resolvers; for those we would have to use longjmp
anyway if we want to keep them as non-fatal because the IFUNC resolver
may not have unwinding information.

Thanks,
Florian
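
To make the preallocation point about add_to_global concrete, here is a
rough sketch of a reserve-then-commit split.  It is only an illustration
under stated assumptions: struct scope_list, scope_reserve and
scope_commit are hypothetical names, not the actual glibc data
structures or functions, and the sketch assumes the caller holds the
loader lock across both phases so that no other thread can use up the
reserved slots in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the global scope list; this is not the
   structure glibc uses.  */
struct scope_list
{
  const char **names;   /* Entries currently visible to lookups.  */
  size_t used;          /* Number of valid entries.  */
  size_t capacity;      /* Number of allocated slots.  */
};

/* Fallible phase, to be run before the point of no return.  Makes room
   for EXTRA more entries.  Returns 0 on success and -1 on failure; on
   failure the list is unchanged, so the caller can still unwind.  */
static int
scope_reserve (struct scope_list *list, size_t extra)
{
  if (extra > SIZE_MAX - list->used)
    return -1;                  /* Entry count would overflow.  */
  size_t needed = list->used + extra;
  if (needed <= list->capacity)
    return 0;                   /* Enough room already.  */
  if (needed > SIZE_MAX / sizeof (*list->names))
    return -1;                  /* Byte count would overflow.  */
  const char **p = realloc (list->names, needed * sizeof (*list->names));
  if (p == NULL)
    return -1;
  list->names = p;
  list->capacity = needed;
  return 0;
}

/* Infallible phase, to be run after the point of no return.  Publishes
   COUNT entries into slots reserved earlier.  It performs no
   allocation, so it has no recoverable failure mode.  */
static void
scope_commit (struct scope_list *list, const char *const *new_names,
              size_t count)
{
  /* A violation here is a programming error, not a runtime failure.  */
  assert (count <= list->capacity - list->used);
  for (size_t i = 0; i < count; ++i)
    list->names[list->used++] = new_names[i];
}
```

A caller would invoke scope_reserve together with all other fallible
steps, report an error and release its temporary resources if any of
them fails, and only then call scope_commit.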