Date: Mon, 18 Jan 2021 14:04:03 -0800
From: Fangrui Song
To: binutils@sourceware.org, libc-alpha@sourceware.org
Cc: Fangrui Song
Subject: ifunc resolving
Message-ID: <20210118220403.nzq6imfmaluuavfp@gmail.com>

I have seen ifunc relocation activity in glibc and ld recently.
https://sourceware.org/glibc/wiki/GNU_IFUNC is under-documented, some
aspects are not well known, and there are a lot of differences across
architectures supporting ifunc. I am sending this email hoping that these
aspects can be clarified, toolchain developers can get on the same page,
and documentation can be improved (if developers get confused at times,
how could regular users comfortably use them? :) )

1. An ifunc defined in the executable is called by a (link-time DT_NEEDED
or runtime) shared object.

From ld https://sourceware.org/bugzilla/show_bug.cgi?id=23169 (x86 only)
this looks desired. My understanding (comment 8) is that

(1) The main executable is relocated last.
(2) By converting the main executable's STT_GNU_IFUNC symbol to STT_FUNC,
when processing relocations in a DSO, the ifunc resolver will not be
called while the main executable has not been relocated yet.

ifunc calls from within the executable do not incur additional costs.
ifunc calls from DSOs go through the main executable's PLT and pay a
penalty.

When processing an ifunc relocation in a DSO, if the ifunc resolver is
defined in another DSO, according to comment 9 this is reported as an
error. This adds an executable-vs-shared difference to non-preemptible
ifunc, but so be it.

The above sounds reasonable. However, the top-of-tree ld does not make
-no-pie and -pie behaviors consistent (note: ld does not support -no-pie
yet).

cat > ./a.s < ./b.s
as a.s -o a.o
gcc -shared -fpic b.s -o b.so
~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o -o a && readelf -W -s a | grep ifunc
~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o ./b.so -o a && readelf -W -s a | grep ifunc
~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o -o a && readelf -W --dyn-syms a | grep ifunc
~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o ./b.so -o a && readelf -W --dyn-syms a | grep ifunc

~/Dev/binutils-gdb/Debug/ld/ld-new is a top-of-tree ld.

% ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o -o a && readelf -W -s a | grep ifunc
     7: 0000000000401008     0 IFUNC   GLOBAL DEFAULT    3 ifunc
% ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o ./b.so -o a && readelf -W -s a | grep ifunc
     5: 0000000000401010     0 FUNC    GLOBAL DEFAULT    7 ifunc
     8: 0000000000401010     0 FUNC    GLOBAL DEFAULT    7 ifunc
% ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o -o a && readelf -W --dyn-syms a | grep ifunc
     5: 0000000000001020     0 IFUNC   GLOBAL DEFAULT    8 ifunc
% ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o ./b.so -o a && readelf -W --dyn-syms a | grep ifunc
     5: 0000000000001020     0 IFUNC   GLOBAL DEFAULT    8 ifunc

Of the four combinations, only -no-pie a.o ./b.so does the conversion.
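
As a rough illustration of points (1) and (2), here is a toy Python model
of why exporting the executable's ifunc as STT_FUNC is safe while the
executable is still unrelocated. It is only a sketch of the reasoning, not
how ld.so is implemented: Symbol, Executable, bind_from_dso, outcome and
all addresses are hypothetical names made up for this example.

```python
# Toy model of the relocation-order argument in item 1.
# Every name and address here is hypothetical; this is not ld.so's code.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Symbol:
    name: str
    sym_type: str                               # "STT_GNU_IFUNC" or "STT_FUNC"
    value: int                                  # st_value seen by other modules
    resolver: Optional[Callable[[], int]] = None


@dataclass
class Executable:
    symbol: Symbol
    relocated: bool = False                     # the executable is relocated last


def bind_from_dso(exe: Executable) -> int:
    """Address a DSO relocation gets when it references exe.symbol."""
    sym = exe.symbol
    if sym.sym_type == "STT_GNU_IFUNC":
        # An ifunc reference is bound by calling the resolver right away.
        if not exe.relocated:
            raise RuntimeError("resolver would run in an unrelocated executable")
        return sym.resolver()
    # STT_FUNC: the DSO simply binds to st_value (the executable's PLT entry).
    return sym.value


def outcome(exe: Executable) -> str:
    """Return the bound address as hex, or "error" if binding is unsafe."""
    try:
        return hex(bind_from_dso(exe))
    except RuntimeError:
        return "error"


IMPL = 0x401000   # hypothetical implementation picked by the resolver
PLT = 0x401010    # hypothetical PLT entry in the executable

exported_as_ifunc = Executable(Symbol("ifunc", "STT_GNU_IFUNC", 0x401008, lambda: IMPL))
exported_as_func = Executable(Symbol("ifunc", "STT_FUNC", PLT))

if __name__ == "__main__":
    print(outcome(exported_as_ifunc))
    print(outcome(exported_as_func))
```

In this model the STT_GNU_IFUNC export cannot be bound from a DSO before
the executable is relocated, while the STT_FUNC export binds to the PLT
address without running any code in the executable.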
Once a resolution is agreed upon, it'd be good to make aarch64/ppc/x86/etc
consistent.

2. When to convert STT_GNU_IFUNC to STT_FUNC? (This is more of an ld
question.)

In LLD, for a non-GOT-generating, non-PLT-generating relocation
referencing an STT_GNU_IFUNC symbol, a canonical PLT entry is created and
the symbol type is changed to STT_FUNC. (An absolute relocation with 0
addend in a SHF_WRITE section used to not trigger a canonical PLT entry.
https://reviews.llvm.org/D65995 dropped that case.) References from other
modules will resolve to the PLT entry.

This approach has pros and cons:

* With a canonical PLT entry, the resolver of a symbol is called only once.
* If the relocation appears in a non-SHF_WRITE section, a text relocation
  can be avoided.
* Relocation types which are not valid dynamic relocation types are
  supported. GNU ld may report the error
  relocation R_X86_64_PC32 against STT_GNU_IFUNC symbol `ifunc' isn't supported
* References will bind to the canonical PLT entry. A function call needs to
  jump to the PLT, load the value from the GOT, then do an indirect call.

Last time I checked, GNU ld's architecture ports behaved quite differently.
This is an area where consistency across architectures should be improved.

3. Prefer .rela.dyn over .rela.plt for R_*_IRELATIVE?

ld powerpc produces R_*_IRELATIVE in .rela.dyn. glibc powerpc32/powerpc64
do not process R_*_IRELATIVE if they are not in
[DT_JMPREL, DT_JMPREL+DT_PLTRELSZ). This may be a good practice because
R_*_IRELATIVE is by nature eagerly resolved. The potentially lazy
.rela.plt is not suitable.

I think at least aarch64 and x86 are still using .rela.plt. In LLD I
followed .rela.dyn and it has been working well
https://reviews.llvm.org/D65651 .

4. When to define __rela_iplt_start and __rela_iplt_end?

Static pie and static no-pie relocation processing is very different in
glibc.

* Static no-pie uses special code to process a magic array delimited by
  __rela_iplt_start/__rela_iplt_end.
* Static pie uses self-relocation to take care of R_*_IRELATIVE. The above
  magic array code is executed as well. If
  __rela_iplt_start/__rela_iplt_end are defined, we will get
  0 < __rela_iplt_start < __rela_iplt_end in csu/libc-start.c, and
  ARCH_SETUP_IREL will crash when resolving the first relocation, which
  has already been processed.

LLD defines __rela_iplt_start/__rela_iplt_end in -pie mode (GNU ld
doesn't), so static pie elf/ldconfig segfaults. If we take the patch "Make
_dl_relocate_static_pie return an int indicating whether it applied
relocs." from
https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/maskray/lld ,
LLD-linked static-pie glibc programs will work well (with another cleanup
for an unrelated issue:
https://sourceware.org/pipermail/libc-alpha/2020-December/121144.html).

My idea is that defining __rela_iplt_start/__rela_iplt_end in -pie is
justified. I do see that GNU ld may not want a change (probably for a
couple of years) because it does not want to gratuitously break older
glibc, but to me taking the patch (probably with the description
rewritten) is a clarification of the glibc code. glibc maintainers can
follow up on "[PATCH 0/3] Make glibc build with LLD" if you accept that
patch.

In a few years, when compatibility with older glibc can be dropped, ld can
define __rela_iplt_start in -pie mode to drop the unneeded difference in
diff -u =(ld.bfd --verbose) =(ld.bfd -pie --verbose) output.
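
The double-processing problem in item 4 can be sketched with a small
Python toy model. This is a hedged illustration only: IRelative,
relocate_static_pie, apply_irel, startup and outcome are hypothetical
stand-ins, the real failure mechanism inside ARCH_SETUP_IREL is not
modeled (a second application simply raises), and the
skip_if_self_relocated flag is one plausible reading of the patch title
rather than the patch itself.

```python
# Toy model of item 4: R_*_IRELATIVE processed by self-relocation and then
# again by the __rela_iplt_start/__rela_iplt_end loop.
# All names are hypothetical; this is not glibc code.

class IRelative:
    def __init__(self, resolver):
        self.resolver = resolver
        self.applied = False
        self.slot = None

    def apply(self):
        # Stand-in for the crash: applying the same relocation twice fails.
        if self.applied:
            raise RuntimeError("R_*_IRELATIVE applied twice")
        self.slot = self.resolver()
        self.applied = True


def relocate_static_pie(relocs, is_pie):
    """Self-relocation; returns True if it applied the relocations."""
    if not is_pie:
        return False
    for r in relocs:
        r.apply()
    return True


def apply_irel(iplt_range):
    """Walk the array delimited by __rela_iplt_start/__rela_iplt_end."""
    for r in iplt_range:
        r.apply()


def startup(is_pie, linker_defines_iplt, skip_if_self_relocated):
    relocs = [IRelative(lambda: 0x1000), IRelative(lambda: 0x2000)]
    self_relocated = relocate_static_pie(relocs, is_pie)
    # If the linker does not define the symbols, the range is empty.
    iplt_range = relocs if linker_defines_iplt else []
    if not (skip_if_self_relocated and self_relocated):
        apply_irel(iplt_range)
    return [r.slot for r in relocs]


def outcome(is_pie, linker_defines_iplt, skip_if_self_relocated):
    try:
        return startup(is_pie, linker_defines_iplt, skip_if_self_relocated)
    except RuntimeError:
        return "crash"


if __name__ == "__main__":
    print("static no-pie, symbols defined:", outcome(False, True, False))
    print("static pie, symbols undefined: ", outcome(True, False, False))
    print("static pie, symbols defined:   ", outcome(True, True, False))
    print("static pie, defined + guard:   ", outcome(True, True, True))
```

Only the third configuration (static pie with the symbols defined and no
guard) fails in this model, which matches the LLD-linked static-pie case
described above; the guard in the fourth configuration makes the loop a
no-op once self-relocation has already run.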