From: Florian Weimer via Libc-alpha
Reply-To: Florian Weimer
To: Jonathon Anderson
Cc: John Mellor-Crummey, libc-alpha@sourceware.org, "Mark W. Krentel", Xiaozhu Meng
Subject: Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes
Date: Wed, 17 Nov 2021 21:42:29 +0100
Message-ID: <87bl2i345m.fsf@oldenburg.str.redhat.com>
In-Reply-To: <0D3F0C5F-2586-42F9-916D-2F327432AF13@rice.edu> (John Mellor-Crummey via Libc-alpha's message of "Wed, 17 Nov 2021 12:08:17 -0600")

Thank you for the feedback.  At this time, I want to comment on the
l_addr aspect.

> >   3. For non-PIE executables the base address listed in
> >      link_map->l_addr for the main application binary is 0, even
> >      though dladdr is able to recover the correct offset.
> >      La_objopen is affected by this.
> >
> > This would require to change an internal semantic for
> > link_map->l_addr.  This is not straightforward and I am not sure
> > about the direct gains.
>
> Again, we are wholly sympathetic to the difficulty of refactoring
> complex code!
>
> The motivation for providing a consistent link_map->l_addr value is
> to unify the handling for the main executable with any other binary
> and to allow access to the ELF header of the main executable (which
> provides fields not available anywhere else: type, ABI, entry
> point...). An alternative would be to re-open the file from its path
> (link_map->l_name), however this is a serious performance concern for
> large-scale executions (metadata servers are known to be a bottleneck
> of parallel filesystems).

It also has its security issues because you might not get back the
binary you expect.

> dladdr is not always an option for
> auditors, as noted by another of our 'Tier 2' issues. Right now, we
> only require the program headers which we can obtain from
> getauxval(AT_PHDR), however this technique has questionable
> portability and robustness (getauxval returns an unsigned long, not a
> pointer).

A glibc port to an architecture where a long value cannot hold all
pointer values will have to provide an alternative interface similar to
getauxval, but that returns pointer values.  Of course that's not the
only interface with this problem (ElfW(Addr) is an integer as well).  It
makes the Morello glibc port quite interesting.  So I think *something*
like getauxval (AT_PHDR) will always be available, with pretty much
identical semantics.

> From an outside perspective the current l_addr semantic is fairly
> undocumented, the dladdr and dlinfo man pages define it vaguely as
> the "difference between the address in the ELF file and the address
> in memory." That sounds (to me at least) like l_addr should point to
> byte 0 in the file (the ELF header), and that seems to be correct in
> all but the non-PIE case.

I have struggled with this in the past.  I agree that it is confusing.

l_addr is the offset between virtual addresses in the program header of
the ELF object and the actual addresses in the process image.  This
offset happens to be 0 for ET_EXEC objects, and only there.

I think the sort-of-official way out of this conundrum is to call
dl_iterate_phdr and then look at the program headers.  This is what
_Unwind_Find_FDE in libgcc does to find the object boundaries and the
PT_GNU_EH_FRAME segment, see libgcc/unwind-dw2-fde-dip.c in the GCC
sources:

[…]
# define __RELOC_POINTER(ptr, base) ((ptr) + (base))
[…]
  _Unwind_Ptr load_base;
[…]
  load_base = info->dlpi_addr;
[…]
  /* See if PC falls into one of the loaded segments.  Find the eh_frame
     segment at the same time.  */
  for (n = info->dlpi_phnum; --n >= 0; phdr++)
    {
      if (phdr->p_type == PT_LOAD)
        {
          _Unwind_Ptr vaddr = (_Unwind_Ptr)
            __RELOC_POINTER (phdr->p_vaddr, load_base);
          if (data->pc >= vaddr && data->pc < vaddr + phdr->p_memsz)
            {
              match = 1;
              pc_low = vaddr;
              pc_high = vaddr + phdr->p_memsz;
            }
        }
[…]

This is not even glibc-specific, other systems have dl_iterate_phdr as
well.

getauxval (AT_PHDR) is definitely the more direct, Linux-specific
approach.  It's also much faster because it does not involve locking.

> dladdr gets its value from link_map->l_map_start instead of l_addr,
> so the semantic we want is already present in a private field. It
> seems to me these two fields could be swapped with little issue, if
> altering the public semantic is not acceptable we could also be sated
> if l_map_start was made public.

Applications which know about the current semantics of l_addr will
break, though.  l_addr is also exposed to debuggers via the _r_debug
interface.  I really do not think we can make changes to l_addr.

We have a similar issue around l_name being "" for the main program, and
unfortunately I will have to argue quite strongly against changing that.
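
To make the dl_iterate_phdr workaround mentioned above a bit more
concrete, here is a minimal sketch, assuming glibc on Linux.  The struct
main_info and the functions find_main_program and phdr_from_auxv are
hypothetical names made up for this illustration, not existing
interfaces.  The sketch relies on the documented behavior that the
first object passed to the callback is the main program, and it says
nothing about whether dl_iterate_phdr can safely be called from a given
point inside an audit module.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE            /* For dl_iterate_phdr in <link.h>.  */
#endif
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/auxv.h>

/* Hypothetical result type for this sketch, not a glibc interface.  */
struct main_info
{
  uintptr_t load_base;  /* dlpi_addr: 0 for ET_EXEC, load bias for PIE.  */
  uintptr_t phdr;       /* In-memory address of the program headers.  */
  uintptr_t low;        /* Lowest address covered by a PT_LOAD segment.  */
  uintptr_t high;       /* One past the highest PT_LOAD address.  */
};

/* Callback for dl_iterate_phdr.  Records the first object visited,
   which is the main program, and stops the iteration.  */
static int
main_program_callback (struct dl_phdr_info *info, size_t size, void *data)
{
  struct main_info *mi = data;
  (void) size;

  mi->load_base = (uintptr_t) info->dlpi_addr;
  mi->phdr = (uintptr_t) info->dlpi_phdr;
  mi->low = UINTPTR_MAX;
  mi->high = 0;

  for (size_t n = 0; n < info->dlpi_phnum; n++)
    {
      const ElfW(Phdr) *ph = &info->dlpi_phdr[n];
      if (ph->p_type != PT_LOAD)
        continue;
      /* Same relocation as __RELOC_POINTER in libgcc: p_vaddr + base.  */
      uintptr_t start = (uintptr_t) info->dlpi_addr + ph->p_vaddr;
      uintptr_t end = start + ph->p_memsz;
      if (start < mi->low)
        mi->low = start;
      if (end > mi->high)
        mi->high = end;
    }

  /* A non-zero return value ends the iteration after the first object.  */
  return 1;
}

/* Fills *OUT for the main program.  Returns 1 if an object was seen.  */
static int
find_main_program (struct main_info *out)
{
  assert (out != NULL);
  return dl_iterate_phdr (main_program_callback, out);
}

/* The more direct, Linux-specific route: the kernel-provided address
   of the program headers, as an integer.  */
static uintptr_t
phdr_from_auxv (void)
{
  return (uintptr_t) getauxval (AT_PHDR);
}
```

A caller would declare a struct main_info, pass its address to
find_main_program, and treat [low, high) as the mapped range of the
main executable regardless of whether it is ET_EXEC or PIE.  Under a
normal program launch I would expect phdr_from_auxv to return the same
address as the phdr field, but that is an expectation, not something
the sketch checks.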

We should perhaps collect these grievances and work on better interfaces
for the future.  However, that will be quite a long-term investment
because it will take years until these new interfaces will be available
on your users' systems.  Bug fixes we can roll out more quickly, but
glibc interface changes typically happen at major distribution release
boundaries only.  For the time being, it has to be workarounds for
missing interfaces, I'm afraid.

Thanks,
Florian