From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS3215 2.6.0.0/16 X-Spam-Status: No, score=-5.5 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,MAILING_LIST_MULTI,NICE_REPLY_A, RCVD_IN_DNSWL_HI,SPF_HELO_PASS,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id EEB4E1F9F4 for ; Thu, 18 Nov 2021 21:55:47 +0000 (UTC) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id CC5D53858015 for ; Thu, 18 Nov 2021 21:55:45 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org CC5D53858015 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1637272545; bh=IONHRpC5VjmQxrkLENZJWBk0/9gd98dbCZ6oknogQ1M=; h=Date:Subject:To:References:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=oVn6rJ1ezB+dtQ776yFWhq9Q7cXRgjCH+sImnDkFLm77fsSZp11Xb2hZox+tlOAo4 TfW37IkNqcaYTTiuJsG/o5zg+LEuI4y74dORYWnSQPtCla0Lg7iHQfEVX7X1gzgavt Wte49V7UGt7XkakI5F4tnkXXjAltCi6v1yz6pC38= Received: from mx0b-0010f301.pphosted.com (mx0b-0010f301.pphosted.com [148.163.153.244]) by sourceware.org (Postfix) with ESMTPS id 7035E3858422 for ; Thu, 18 Nov 2021 21:55:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7035E3858422 Received: from pps.filterd (m0102860.ppops.net [127.0.0.1]) by mx0b-0010f301.pphosted.com (8.16.1.2/8.16.1.2) with ESMTP id 1AIK4ctt020527; Thu, 18 Nov 2021 15:55:23 -0600 Received: from mx4.mail.rice.edu (mx4.mail.rice.edu [128.42.199.101]) by mx0b-0010f301.pphosted.com (PPS) with ESMTPS id 3cdcdqha6a-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 18 Nov 2021 15:55:22 -0600 Received: from mx4.mail.rice.edu (localhost [127.0.0.1]) by mx4.mail.rice.edu (Postfix) with ESMTP id 78DC041104C; Thu, 18 Nov 2021 15:55:22 -0600 (CST) Received: from localhost (localhost [127.0.0.1]) by mx4.mail.rice.edu (Postfix) with ESMTP id 771FB41104B; Thu, 18 Nov 2021 15:55:22 -0600 (CST) X-Virus-Scanned: by amavis-2.12.1 at mx4.mail.rice.edu, auth channel Received: from mx4.mail.rice.edu ([127.0.0.1]) by localhost (mx4.mail.rice.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id IlD8k2kE9wbj; Thu, 18 Nov 2021 15:55:12 -0600 (CST) Received: from [76.30.156.87] (c-76-30-156-87.hsd1.tx.comcast.net [76.30.156.87]) (using TLSv1.2 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: jma14) by mx4.mail.rice.edu (Postfix) with ESMTPSA id EB237209DB2; Thu, 18 Nov 2021 15:55:11 -0600 (CST) Message-ID: Date: Thu, 18 Nov 2021 15:55:11 -0600 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.3.0 Subject: Re: Fwd: [PATCH v5 00/22] Some rtld-audit fixes To: Florian Weimer References: <0D3F0C5F-2586-42F9-916D-2F327432AF13@rice.edu> <87bl2i345m.fsf@oldenburg.str.redhat.com> Content-Language: en-US In-Reply-To: <87bl2i345m.fsf@oldenburg.str.redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Proofpoint-ORIG-GUID: GkZxHLYHZFu5RgIkNg73N85fDVzFDaqJ X-Proofpoint-GUID: GkZxHLYHZFu5RgIkNg73N85fDVzFDaqJ X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-11-18_12,2021-11-17_01,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 adultscore=0 phishscore=0 spamscore=0 priorityscore=1501 mlxscore=0 lowpriorityscore=0 impostorscore=0 mlxlogscore=999 malwarescore=0 clxscore=1011 bulkscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111180113 X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Jonathon Anderson via Libc-alpha Reply-To: Jonathon Anderson Cc: John Mellor-Crummey , libc-alpha@sourceware.org, "Mark W. Krentel" , Xiaozhu Meng Errors-To: libc-alpha-bounces+e=80x24.org@sourceware.org Sender: "Libc-alpha" On 11/17/21 14:42, Florian Weimer wrote: > Thank you for the feedback. At this time, I want to comment on the > l_addr aspect. Thank you for the comments. I think I understand your points, below are my own opinions on this matter. >> An alternative would be to re-open the file from its path >> (link_map->l_name), however this is a serious performance concern for >> large-scale executions (metadata servers are known to be a bottleneck >> of parallel filesystems). > It also has its security issues because you might get back the binary > you expect. Agreed! >> Right now, we >> only require the program headers which we can obtain from >> getauxval(AT_PHDR), however this technique has questionable >> portability and robustness (getauxval returns an unsigned long, not a >> pointer). > A glibc port to an architecture where a long value cannot hold all > pointer values will have to provide an alternative interface similar to > getauxval, but that returns pointer values. I would go one step farther and say getauxval is already broken for any 64-bit architecture, unsigned long is only required to support 32 bits as per the C standards. One of my greater fears is that some exotic compiler will cleverly allocate only 4 bytes of stack space for the return value, and we wouldn't know except for a subtle bug (dependent on optimization flags!) that crashes our entire tool with SEGVs in the auditor (where GDB doesn't give properly unwound call stacks). > Of course that's not the > only interface with this problem (ElfW(Addr) is an integer as well). AFAICT ElfW(Addr) is fine, it should always be an integer large enough to store a pointer on the host architecture (i.e. a uintptr_t). Unless I missed some specific arch where this doesn't work out to be the case? > It > makes the Morello glibc port quite interesting. So I think *something* > like getauxval (AT_PHDR) will always be available, with pretty much > identical semantics. > >> From an outside perspective the current l_addr semantic is fairly >> undocumented, the dladdr and dlinfo man pages define it vaguely as >> the "difference between the address in the ELF file and the address >> in memory." That sounds (to me at least) like l_addr should point to >> byte 0 in the file (the ELF header), and that seems to be correct in >> all but the non-PIE case. > I have struggled with this in the past. I agree that it is confusing. > l_addr is the offset between virtual addresses in the program header of > the ELF object and the actual addresses in the process image. This > offset happens to be 0 for ET_EXEC objects, and only there. This is a much clearer description of the semantic, it would be very helpful the man pages used that sentence (or one like it) wherever the l_addr value is exposed in the API (link_map->l_addr and dl_phdr_info->dlpi_addr). It would also be very helpful if Dl_info.dli_fbase was clearly documented as *not* l_addr but instead byte 0/ELF header in the process image. > I think the sort-of-official way out of this conundrum is to call > dl_iterate_phdr and then look at the program headers. This is what > _Unwind_Find_FDE in libgcc does to find the object boundaries and the > PT_GNU_EH_FRAME segment, see libgcc/unwind-dw2-fde-dip.c in the GCC > sources: > > […] > # define __RELOC_POINTER(ptr, base) ((ptr) + (base)) > […] > _Unwind_Ptr load_base; > […] > load_base = info->dlpi_addr; > […] > /* See if PC falls into one of the loaded segments. Find the eh_frame > segment at the same time. */ > for (n = info->dlpi_phnum; --n >= 0; phdr++) > { > if (phdr->p_type == PT_LOAD) > { > _Unwind_Ptr vaddr = (_Unwind_Ptr) > __RELOC_POINTER (phdr->p_vaddr, load_base); > if (data->pc >= vaddr && data->pc < vaddr + phdr->p_memsz) > { > match = 1; > pc_low = vaddr; > pc_high = vaddr + phdr->p_memsz; > } > } > […] > > This is not even glibc-specific, other systems have dl_iterate_phdr as > well. That does sound like the "correct" way out, but dl_iterate_phdr operates on the caller's namespace so one would need to inject a shim library to do the actual call. Pulling this off is not trivial, I have a number of concerns with the approach:  - It seems to be fixed upstream, but with 2.32 and prior dlmopen will assert when used within the la_activity(CONSISTENT) for the target namespace. So the shim library needs to be loaded via other means (LD_PRELOAD), leaving the (much earlier) auditor without options until the shim gets loaded.  - The above also means that dlsym can't be used to resolve the shim function symbol, so it needs to be manually derived from the dynamic symbol table. Which is awkward and a bit dangerous, AFAIK there isn't a simple way to get the size of the dynamic symbol table so if the symbol isn't there it usually SEGVs in the auditor (running off the end).  - The above also means that init constructors can't be "promoted," so dl_iterate_phdr will be called before init constructors (Glibc and shim) have run for the target namespace. My mini-experiments on x86-64 pass so it doesn't seem like an issue, but it still makes me anxious.  - The above *also* means that you can only shim into the main namespace (with upstream Glibc, only into non-auditor namespaces), so this (complicated!) trick is only useful for getting data for the main executable. Which really is the only binary that needs to be worked around, but it still leaves oddly diverging control flow.  - dl_iterate_phdr will iterate the whole namespace and can't be stopped in the middle, so there are mild performance concerns with applications pulling in hundreds of libraries. A single call might be acceptable, but more than that probably will be a problem. In summary, ugly as heck but it probably works. With all these caveats though I have difficulty considering this workaround "official" by any stretch of the imagination. >> dladdr gets its value from link_map->l_map_start instead of l_addr, >> so the semantic we want is already present in a private field. It >> seems to me these two fields could be swapped with little issue, if >> altering the public semantic is not acceptable we could also be sated >> if l_map_start was made public. > Applications which know about the current semantics of l_addr will > break, though. l_addr is also exposed to debuggers via the _r_debug > interface. I really do not think we can make changes to l_addr. > We have a similar issue around l_name being "" for the main program, and > unfortuantely I will have to argue quite strongly against changing that. Is adding new public fields completely off the table? HPCToolkit (and other users in need) can use a simple configure test to check whether the fields are present, and AFAICT adding new fields shouldn't break backwards compatibility for older projects or binaries. This would also give a place to resolve the concerns around returning argv[0] instead of the actual binary path. The motivation for these requests is improved usability, the auditor interface (and _r_debug) are unique in that they *only* give the link_map and everything else must be derived from it. dladdr isn't always an option for auditors (another of our 'Tier 2' issues), dl_iterate_phdr is painful to use as noted above, the remaining dl* functions require dlopen handles (and/or crash in the auditor, another 'Tier 2' issue). In HPCToolkit we managed to scrounge up alternative paths for the missing data (which is why this isn't a 'Tier 1' issue), but the interface itself is still unacceptably unusable and has been since its first implementation. Any change to fill in this data is strictly an improvement. If I can humor the impossible for a few moments longer, I personally have a difficult time believing that anyone actually uses link_map->l_addr or link_map->l_name in a way that would break by changing their semantics for the main executable:  - The documentation hasn't improved for years so there can't be many users that care about (or even noticed) this case in particular.  - Every use case I can think of for obtaining a link_map from the dl* functions (dlinfo and dladdr1) will either already have the special handling, or won't operate on the main executable, or likely won't opt to use l_addr (vs. dlsym or dli_fbase) or l_name (vs. dli_fname).  - dl_iterate_phdr is a preferable alternative to manually iterating the link_map chain and provides l_addr itself, and the "better" dli_fname from dladdr(dlpi_phdr) is far easier to get than l_name.  - The users of the auditor interface can be counted on one hand (IIRC HPCToolkit, Spindle and sotruss), at least for HPCToolkit we are generally fine with changes if it means we get a *usable* audit interface out of the deal.  - And my cursory examination of the logic in GDB [1] and LLDB [2] (and immediately surrounding code) makes it seem plausible that debuggers treat the main executable very differently and generally ignore it in the link_map. I can't say with absolute confidence that nothing would break, but I currently consider it very rare and most likely those that do were already intentionally abusing the dl* interface. [1] https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/solib-svr4.c;hb=d3de0860104b8bb8d496527fbb042c3b4c5c82dc#l1225 [2] https://github.com/llvm/llvm-project/blob/ebf8d74e929d908829eda4ad8548ec21e2dbc6ae/lldb/source/Plugins/Process/POSIX/NativeProcessELF.cpp#L175-L176 > We should perhaps collect these grievances and work on better interfaces > for the future. However, that will be quite a long-term investment > because it will take years until these new interfaces will be available > on your users' systems. Bug fixes we can roll out more quickly, but > glibc interface changes typically happen at major distribution release > boundaries only. For the time being, it has to be workarounds for > missing interfaces, I'm afraid. I am keeping my hopes low for changing the public semantics, but I hope that adding new fields for the missing data at least could fall into the "bug fixes" category. From my perspective the public link_map interface was *never* actually usable, any change to fix it is strictly an (ABI-compatible) improvement. :) -Jonathon