From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-Status: No, score=-4.3 required=3.0 tests=AWL,BAYES_00,DKIM_SIGNED,
 DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,MAILING_LIST_MULTI,NICE_REPLY_A,
 SPF_HELO_PASS,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no
 version=3.4.2
Received: from sourceware.org (server2.sourceware.org [8.43.85.97])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
 (No client certificate requested)
 by dcvr.yhbt.net (Postfix) with ESMTPS id 7A2AB1F934
 for ; Fri, 9 Oct 2020 04:31:04 +0000 (UTC)
Received: from server2.sourceware.org (localhost [IPv6:::1])
 by sourceware.org (Postfix) with ESMTP id 45D63385483E;
 Fri, 9 Oct 2020 04:31:03 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 45D63385483E
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org;
 s=default; t=1602217863;
 bh=a03zj1EjXrucTiHl84Xj5rW6ONqZ+2HlecBod4ZNncA=;
 h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe:
 List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:
 From;
 b=M42AHoH0+PUPwDKSSqOpAAeYwTYvQjRX2w9xvJksnZHCFuOiczQ6wmzQTxAe42FoT
 altqMlQZcOocVMiXIssddbMcRELPpbnf35cI7GzbskE2Wf256FOUFS3muIvcYXqo+C
 svo9wQwRuN5e5r6k2HXrvv85Lzjt/ZfKtTURa3X4=
Received: from mr85p00im-zteg06011601.me.com (mr85p00im-zteg06011601.me.com [17.58.23.186])
 by sourceware.org (Postfix) with ESMTPS id 17ADA3857C55
 for ; Fri, 9 Oct 2020 04:31:00 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 17ADA3857C55
Received: from [192.168.0.18] (125-239-162-180-vdsl.sparkbb.co.nz [125.239.162.180])
 by mr85p00im-zteg06011601.me.com (Postfix) with ESMTPSA id 5B308920780;
 Fri, 9 Oct 2020 04:30:58 +0000 (UTC)
Subject: Re: [PATCH v2] ldd: revise trace output for left-aligned relative addresses
To: Adhemerval Zanella , libc-alpha@sourceware.org
References:
 <20201006054255.1676065-1-michaeljclark@mac.com>
 <20201006235648.1811725-1-michaeljclark@mac.com>
Message-ID: <29c6b035-131c-2bf3-4c5d-ad27cbbb942c@mac.com>
Date: Fri, 9 Oct 2020 17:30:55 +1300
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.235, 18.0.687
 definitions=2020-10-09_01:2020-10-09, 2020-10-09 signatures=0
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2
 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015
 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.0.1-2006250000 definitions=main-2010090032
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
From: Michael Clark via Libc-alpha
Reply-To: Michael Clark
Errors-To: libc-alpha-bounces@sourceware.org
Sender: "Libc-alpha"

On 10/9/20 1:09 AM, Adhemerval Zanella wrote:
>
>
> On 08/10/2020 02:44, Michael Clark wrote:
>>
>>
>> On 10/8/20 10:01 AM, Adhemerval Zanella wrote:
>>> On 06/10/2020 20:56, Michael Clark via Libc-alpha wrote:
>>>> This change updates ld.so trace for left-aligned relative addresses.
>>>> The primary goal of this change is to increase `ldd` readability by:
>>>
>>> I am not sure if we want to extend the loader to expose debug format
>>> printing where it could be achieved by extending the elf/ldd.bash.in
>>> itself to handle it. In fact, I would like to avoid such extra
>>> complexity in a core component of program loading.
>>>
>>>>
>>>>   - modifying trace output to use relative addresses by default.
>>>
>>> You can get similar information with setarch -R, which disables ASLR.
>>>
>>>>   - adding an alternative trace output mode with left-aligned addresses.
>>>
>>> And you can do it with some post-processing tool (elf/ldd.bash.in, although
>>> I grant you it might be cumbersome to accomplish with shell script).
>>>
>>>>
>>>> The relative addresses are composed by subtracting the ELF ehdr address
>>>> which makes the output constant under address space layout randomization.
>>>> This should be a safe change because the default format is preserved.
>>>>
>>>> The intention is to make `ldd` easier to cross reference with objdump.
>>>> Also, log files including `ldd` output will contain fewer differences as
>>>> the vdso is the only address that changes when using relative addresses.
>>>>
>>>
>>> Which information exactly are you trying to match from what you read in
>>> the ELF information through objdump? Afaik without prelink sections, it does
>>> not give any information about where the loader might place the DSO segments.
>>
>> Precisely the linked run-time relative offsets of DSOs.
>>
>> I have spent countless hours reading and cross referencing words and numbers from command line tools. For me it's a use case of trace output from a simulator (e.g. qemu -d in_asm,op_opt,out_asm) and a window beside me with objdump and ldd there. Time and cognitive load. An addend would be useful too, but one should be able to pipe cut to bc for that.
>
> But this relative offset only makes sense with ASLR disabled, which you can
> do by forcing it with a personality call. What I am trying to understand
> is why exactly you need to use a base address (__ehdr_start) and present
> the offset relative address (since this will be also subject to ASLR).
>
>>
>> So more words around whether to adopt "left-aligned relative addresses".
>>
>> I completely understand why it is difficult to change existing formats, which is why the patch does not change the default. musl ldd and FreeBSD ldd have also adopted that brain damaged format.
>>
>>>> * Aligned output *
>>>>
>>>> The new trace format is enabled with `LD_TRACE_ADDR_ALIGN=1`, otherwise
>>>> the default `ldd` trace format is selected for compatibility.
>>>>
>>>> * Relative addresses *
>>>>
>>>> `ldd` load addresses are displayed relative to the ld.so executable header
>>>> address. Relative addresses are enabled by default, given the output mimics
>>>> systems without ASLR, thus there should be minimal compatibility issues.
>>>> There is also an option to negate addresses as an aid in interpreting them,
>>>> seeing library addresses relative to the loader with negative offsets.
>>>>
>>>> The change adds three new ld.so flags accessible via environment variables:
>>>>
>>>>   - `LD_TRACE_ADDR_ALIGN=1` - Show addresses left-aligned
>>>>   - `LD_TRACE_ADDR_ABSOLUTE=1` - Show absolute addresses (backwards compat)
>>>>   - `LD_TRACE_ADDR_NEGATE=1` - Show negated addresses (combination option)
>>>
>>> What I would like is in fact to move ldd support *out* of the loader, where
>>> it would not need to process anything more than strictly required, and without
>>> committing any system resources (such as mmap). It will result in slightly less
>>> complex code and attack surface.
>>
>> That's kind of independent of this patch though. Kerckhoffs's principle. The rationale is not to hide ASLR. It's to reduce diffs in CI logs where we run ldd to check which lib our build system decided to link us to.
>
> Not really, because I also want to avoid making the loader code *more* complex
> and move all this format complexity on how to present the information to
> a helper script. This is similar to multiple trace/profile utilities on
> Linux, where the interface to *obtain* the information is as concise as
> possible.

That's reasonable.

> My rationale is this could be accomplished by changing the ldd script itself
> (or by using Python, if this makes it easier).

Understood. I agree.
An if statement switch here adds complexity and it is the wrong way to get this functionality. It should have been done this way from the beginning. It is a good example of a usability change for how tools should respond in a world with self-contained tests, where a change like this could be tested in CI, at least against the core system dependencies. The change should simply switch the order of columns; no options.

I am free to keep the patch locally. It suits me to patch ldd so I can use my version. That's the good thing about open source tools. It just happens to be a random thing that grabbed my attention while trying to visually parse load addresses. Their position didn't make sense to me from a tool user's perspective.

>> Making output more difficult for humans to read is not a good rationale. Backwards compatibility on the other hand is completely reasonable.
>>
>>> Carlos O'Donell started a project to accomplish it some time ago [1],
>>> but I haven't heard yet if it has been released. Maybe it is something we
>>> can work on on the glibc side as well.
>>
>> No worries. I didn't expect that anyone would pick up the patch. It just occurred to me how brain damaged the present layout is. Not that I wouldn't also make brain damaged layouts myself. If logging something for trace purposes, one probably does not think too much about column order. There is also the field separator and potentially spaces in filenames, which are not addressed. QEMU has good trace infra btw.
>>
>> On Windows we have Process Hacker, which has an easy to read scrollable table view but an ugly color scheme. There is another tool I use for dependency analysis on Windows. depends.exe iirc.
>>
>> It might be a bad idea to make the Linux tools look good.
>
> Again, I am not against a better tracing output of ldd and I do agree with
> you that presenting the information in different ways might help users
> parse the information.
> What I think is we should move this to a helper
> program/script/tool and make the loader as concise as possible.

Good. Hopefully it is a standard format that has easy to parse columns.

Count up the hours wasted by every user trying to find the pivot character to form a regex specific to the tool in use, because there is strictly no standard format for tabular output from the core utilities. cut may or may not work, so one moves on to grep, egrep or awk or whatever one knows best. No delimiter; space delimiter; comma delimiter; no delimiter but value in parentheses; has spaces in fields; single/double quotes, ... User googles it...

For tabular output, my preference leans towards Kernighan style:

- https://ampl.com/resources/the-ampl-book/
- https://ampl.com/resources/the-ampl-book/example-files/

...which is in fact implemented in GLPK.

Michael.
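
P.S. For what it's worth, the helper-script route discussed above might look roughly like the sketch below. It is not the patch and not anything in glibc: it only post-processes ordinary `ldd` text, puts the address in the first column, and subtracts a base address. The function names (`parse_ldd`, `find_base`, `relative_table`) are made up for illustration, the three line shapes in the regex are the ones I am used to seeing from glibc `ldd`, and treating "the entry printed as an absolute path without `=>`" as the loader is an assumption. Paths containing spaces are not handled.

```python
import re

# Assumed line shapes of glibc ldd output:
#   name => /path (0xADDR)
#   name (0xADDR)              e.g. the vdso
#   /path/to/ld.so (0xADDR)    the loader itself
LINE_RE = re.compile(
    r"^\s*(?P<name>\S+)(?:\s+=>\s+(?P<path>\S+))?\s+\((?P<addr>0x[0-9a-fA-F]+)\)\s*$"
)


def parse_ldd(text):
    """Return a list of (name, path_or_None, address) tuples."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:  # lines without an address (e.g. "not found") are skipped
            entries.append((m["name"], m["path"], int(m["addr"], 16)))
    return entries


def find_base(entries):
    """Assumption: the loader is the entry shown as an absolute path without '=>'."""
    for name, path, addr in entries:
        if path is None and name.startswith("/"):
            return addr
    raise ValueError("no loader line found")


def relative_table(text, negate=False):
    """Format rows as: signed offset from the loader, name, path (tab separated)."""
    entries = parse_ldd(text)
    base = find_base(entries)
    rows = []
    for name, path, addr in entries:
        off = addr - base
        if negate:
            off = -off
        sign = "-" if off < 0 else "+"
        rows.append(f"{sign}0x{abs(off):012x}\t{name}\t{path or ''}")
    return rows
```

One would feed it the captured output of `ldd some-binary` and print the rows one per line. The tab separator is a deliberate choice so that `cut -f1` works without hunting for a pivot character.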