Subject: Re: [PATCH] riscv: Resolve symbols directly for symbols with STO_RISCV_VARIANT_CC.
From: Jessica Clarke
Date: Wed, 11 Aug 2021 23:51:51 +0100
To: Palmer Dabbelt
Cc: libc-alpha@sourceware.org, Andrew Waterman, kito.cheng@sifive.com
List-Id: Libc-alpha mailing list

On 11 Aug 2021, at 23:11, Palmer Dabbelt wrote:
>
> On Sun, 08 Aug 2021 20:47:07 PDT (-0700), kai.wang@sifive.com wrote:
>> In some cases, we do not want to go through the resolver for function
>> calls. For example, functions with vector arguments will use vector
>> registers to pass arguments. In the resolver, we do not save/restore the
>> vector argument registers for lazy binding efficiency. To avoid ruining
>> the vector arguments, functions with vector arguments will not go
>> through the resolver.
>>
>> To achieve the goal, we will annotate the function symbols with
>> STO_RISCV_VARIANT_CC flag and add DT_RISCV_VARIANT_CC tag in the dynamic
>> section. In the first pass on PLT relocations, we do not set up to call
>> _dl_runtime_resolve. Instead, we resolve the functions directly.
>>
>> The related discussion could be found in
>> https://github.com/riscv/riscv-elf-psabi-doc/pull/190.
>
> I'm trying to stay away from the foundation stuff these days, but from reading that spec it doesn't look like the variant CC has anything to do specifically with the V extension but instead is about allowing for arbitrary calling conventions for X and F registers as well. Handling non-standard X-register calling conventions is really a whole different beast than handling non-standard F or V register calling conventions.
> It's not crazy to allow for these, but lumping them all into a single bit just makes this unnecessarily difficult.

Whilst increased granularity would be nice to have, the reality is that st_other is an 8-bit field, 2 bits of which are currently allocated by the gABI, with proposals to grow that to 3, so there are only 5 bits to play with. Anything more than just a single bit for this therefore needs an extremely good justification beyond “it would be less difficult to use”, unfortunately. This bit is sufficient to address the issue with the V calling convention, and is also sufficient for any other non-standard calling convention, whether it involves X registers, F registers or something that hasn’t even been thought up yet.

The second reason why it’s done this way is that I have a strong belief that RISC-V should not reinvent the wheel except for problems unique to RISC-V. AArch64 faced exactly the same problem with SVE a few years ago and this is the exact same solution they use, which is deployed everywhere, so by following their lead we can be confident it works and reuse existing implementation code for the various tools and runtimes affected.

> In theory we can handle arbitrary F or V register calling conventions with the standard resolver, we just need to ensure we deal with save/restore of those registers. IMO the right way to go is to just ban F and V register use from the resolver, as there's not much of a reason for them to be in there (none for F, and you probably don't want to spin up the V registers). That just requires sorting out the build system, and would still allow us to have lazy binding for the variants. At a bare minimum we can support arbitrary V register calling conventions for free until we start using the V registers in glibc, which is likely quite a way off.

It’s a nice idea in theory, but in practice it doesn’t work; the resolver path is far too complicated for that. Run-time linkers like to call into libc for various things (e.g. on FreeBSD, if you are in a multi-threaded program, all the locking and unlocking guarding the shared data structures ends up calling into libthr via hooks, and in glibc similarly it uses libpthread’s real mutex functions), so you end up needing to make sure those parts of your (extended) libc are also free from vectors and floats (including any auto-vectorisation). But the real kicker is IFUNCs; with lazy binding, the IFUNC resolver isn’t called until you first call the function, and you have no clue what the resolver is going to do other than that it’ll follow the ABI (we should specify that, but IFUNCs as a whole are underspecified in the psABI; we don’t currently document what the arguments are). That means the resolver is free to use F and V registers. Yes, all of these can be individually addressed via various extremely-targeted means, but experience suggests that it will break sooner or later. As a simple example, none of the libc functions called by the run-time linker could ever use memcpy, be it explicit or implicit (compiler-inserted instead of a large inline struct assignment), since that will eventually be vectorised, even if you compiled the directly-called libc functions themselves without V.

> Handling arbitrary X-register calling conventions is a lot trickier: for the fully arbitrary case we can't even have PLT entries, for example. Is there actually a use case for these? If so we'll have to support it, it just seems odd that one would care enough about the X ABI to want a different one while still being able to deal with the overhead of a dynamic call.

You’re correct that as it stands PLT entries are technically broken by this. We should fix that so even VARIANT_CC functions can’t use those registers as argument registers (AArch64 has that implicitly as it has IP0 and IP1 as reserved registers for PLTs and veneers, but since we do linker relaxation rather than range-extending with veneers we haven’t yet needed to explicitly reserve the PLT registers).

Jess

> So I'm not opposed to doing something like this, we just need some way to make sure it's actually solving a problem.
>
>> ---
>>  elf/elf.h                    |  7 +++++++
>>  sysdeps/riscv/dl-dtprocnum.h | 21 +++++++++++++++++++++
>>  sysdeps/riscv/dl-machine.h   | 26 ++++++++++++++++++++++++++
>>  3 files changed, 54 insertions(+)
>>  create mode 100644 sysdeps/riscv/dl-dtprocnum.h
>>
>> diff --git a/elf/elf.h b/elf/elf.h
>> index 4738dfa..0de29bf 100644
>> --- a/elf/elf.h
>> +++ b/elf/elf.h
>> @@ -3889,6 +3889,13 @@ enum
>>
>>  #define R_TILEGX_NUM 130
>>
>> +/* RISC-V specific values for the Dyn d_tag field. */
>> +#define DT_RISCV_VARIANT_CC (DT_LOPROC + 1)
>> +#define DT_RISCV_NUM 2
>> +
>> +/* RISC-V specific values for the st_other field. */
>> +#define STO_RISCV_VARIANT_CC 0x80
>> +
>>  /* RISC-V ELF Flags */
>>  #define EF_RISCV_RVC 0x0001
>>  #define EF_RISCV_FLOAT_ABI 0x0006
>> diff --git a/sysdeps/riscv/dl-dtprocnum.h b/sysdeps/riscv/dl-dtprocnum.h
>> new file mode 100644
>> index 0000000..f189fd7
>> --- /dev/null
>> +++ b/sysdeps/riscv/dl-dtprocnum.h
>> @@ -0,0 +1,21 @@
>> +/* Configuration of lookup functions. RISC-V version.
>> +   Copyright (C) 2019-2021 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library. If not, see
>> +   . */
>> +
>> +/* Number of extra dynamic section entries for this architecture. By
>> +   default there are none. */
>> +#define DT_THISPROCNUM DT_RISCV_NUM
>> diff --git a/sysdeps/riscv/dl-machine.h b/sysdeps/riscv/dl-machine.h
>> index aedf69f..6488483 100644
>> --- a/sysdeps/riscv/dl-machine.h
>> +++ b/sysdeps/riscv/dl-machine.h
>> @@ -54,6 +54,9 @@
>>  #define ELF_MACHINE_NO_REL 1
>>  #define ELF_MACHINE_NO_RELA 0
>>
>> +/* Translate a processor specific dynamic tag to the index in l_info array. */
>> +#define DT_RISCV(x) (DT_RISCV_##x - DT_LOPROC + DT_NUM)
>> +
>>  /* Return nonzero iff ELF header is compatible with the running host. */
>>  static inline int __attribute_used__
>>  elf_machine_matches_host (const ElfW(Ehdr) *ehdr)
>> @@ -299,6 +302,29 @@ elf_machine_lazy_rel (struct link_map *map, ElfW(Addr) l_addr,
>>    /* Check for unexpected PLT reloc type. */
>>    if (__glibc_likely (r_type == R_RISCV_JUMP_SLOT))
>>      {
>> +      if (__glibc_unlikely (map->l_info[DT_RISCV (VARIANT_CC)] != NULL))
>> +        {
>> +          /* Check the symbol table for variant CC symbols. */
>> +          const Elf_Symndx symndx = ELFW(R_SYM) (reloc->r_info);
>> +          const ElfW(Sym) *symtab =
>> +            (const void *)D_PTR (map, l_info[DT_SYMTAB]);
>> +          const ElfW(Sym) *sym = &symtab[symndx];
>> +          if (__glibc_unlikely (sym->st_other & STO_RISCV_VARIANT_CC))
>> +            {
>> +              /* Avoid lazy resolution of variant CC symbols. */
>> +              const struct r_found_version *version = NULL;
>> +              if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
>> +                {
>> +                  const ElfW(Half) *vernum =
>> +                    (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
>> +                  version = &map->l_versions[vernum[symndx] & 0x7fff];
>> +                }
>> +              elf_machine_rela (map, reloc, sym, version, reloc_addr,
>> +                                skip_ifunc);
>> +              return;
>> +            }
>> +        }
>> +
>>        if (__glibc_unlikely (map->l_mach.plt == 0))
>>          {
>>            if (l_addr)