From: Fangrui Song
Date: Mon, 25 Mar 2024 11:51:09 -0700
Subject: Re: CREL dynamic relocations
To: Florian Weimer
Cc: Fangrui Song, libc-alpha@sourceware.org
In-Reply-To: <87bk72a5pv.fsf@oldenburg.str.redhat.com>

On Mon, Mar 25, 2024 at 4:53 AM Florian Weimer wrote:
>
> * Fangrui Song:
>
> > I have proposed a compact relocation format CREL at
> > https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/eiBcYxSfAQAJ
> > (previously named RELLEB).
> >
> > CREL primarily targets static relocations, achieving significant .o
> > file size reduction for lld builds: 18.0% for x86-64/aarch64 and 34.3%
> > for riscv64.
> > CREL holds promise for dynamic relocations as well, surpassing
> > Android's packed relocation format.
>
> As I said elsewhere, I'm concerned about the use of the ULEB128
> encoding. It's unnecessarily difficult to decode.
>
> Thanks,
> Florian

Thanks.
I have seen your question at
https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/osMXhg5XAgAJ
and replied there that, since one-byte encodings dominate in our use cases, LEB128 is actually the best choice (in terms of both performance and simplicity).

I researched the dynamic relocation problem over the weekend and incorporated the following text into my blog post.

Traditionally, we have two dynamic relocation ranges for executables and shared objects (except static position-dependent executables):

* `.rela.dyn` (`[DT_RELA, DT_RELA + DT_RELASZ)`) or `.rel.dyn` (`[DT_REL, DT_REL + DT_RELSZ)`)
* `.rela.plt` (`[DT_JMPREL, DT_JMPREL + DT_PLTRELSZ)`): stores JUMP_SLOT relocations. `DT_PLTREL` specifies `DT_REL` or `DT_RELA`.

IRELATIVE relocations can be placed in either range, but preferably in `.rel[a].dyn`.

Some GNU ld ports (e.g. SPARC) treat `.rela.plt` as a subset of `.rela.dyn`, introducing complexity for dynamic loaders.

**CREL adoption considerations**

* New dynamic tag (`DT_CREL`): identifies CREL relocations, separate from the existing `DT_REL`/`DT_RELA`.
* No `DT_CRELSZ`: the relocation count can be derived from the CREL header.
* The output section description `.rela.dyn : { *(.rela.dyn) *(.rela.plt) }` is incompatible with CREL.

**Challenges with lazy binding**

glibc's lazy binding scheme relies on [random access to relocation entries within the `DT_JMPREL` table](https://maskray.me/blog/2021-09-19-all-about-procedure-linkage-table#:~:text=_dl_fixup). CREL's sequential nature prevents this. However, eager binding doesn't require random access. Therefore, when `-z now` (eager binding) is enabled, we can:

* Set `DT_PLTREL` to `DT_CREL`.
* Replace `.rel[a].plt` with `.crel.plt`.

**Challenges with statically linked position-dependent executables**

glibc introduces additional complexity for IRELATIVE relocations in statically linked position-dependent executables. They should only contain IRELATIVE relocations and no other dynamic relocations.
glibc's `csu/libc-start.c` processes IRELATIVE relocations in the range [`[__rela_iplt_start, __rela_iplt_end)`](https://maskray.me/blog/2021-01-18-gnu-indirect-function#non-preemptible-ifunc#rela_iplt_start-and-__rela_iplt_end) (or `[__rel_iplt_start, __rel_iplt_end)`, determined at build time through `ELF_MACHINE_IREL`). While CREL relocations cannot be decoded starting from the middle of the section, we can still place IRELATIVE relocations in `.crel.dyn` because there wouldn't be any other relocation types (position-dependent executables don't have RELATIVE relocations). When CREL is enabled, we can define `__crel_iplt_start` and `__crel_iplt_end` for statically linked position-dependent executables.

If glibc only intends to support `addend_bit==0`, the code can simply be:

```c
extern const uint8_t __crel_iplt_start[] __attribute__ ((weak));
extern const uint8_t __crel_iplt_end[] __attribute__ ((weak));
if (&__crel_iplt_start != &__crel_iplt_end) {
  const uint8_t *p = __crel_iplt_start;
  /* Header: the low 2 bits hold the offset shift, bits 3 and up the count.  */
  size_t offset = 0, count = read_uleb128 (&p), shift = count & 3;
  for (count >>= 3; count; count--) {
    uint8_t rel_head = *p++;
    /* Bits 2-7 of the first byte are added to the offset.  */
    offset += rel_head >> 2;
    /* Bit 7 is a continuation flag: add the remaining bits and subtract
       the 32 that the flag itself contributed above.  */
    if (rel_head & 128)
      offset += (read_uleb128 (&p) << 5) - 32;
    /* Bit 1 flags an optional SLEB128 field, which is skipped here.  */
    if (rel_head & 2)
      read_sleb128 (&p);
    elf_crel_irel ((ElfW (Addr) *) (offset << shift));
  }
}
```

**Considering implicit addends for CREL**

Many dynamic relocations have zero addends:

* COPY/GLOB_DAT/JUMP_SLOT relocations only use zero addends.
* Absolute relocations could use non-zero addends with an `STT_SECTION` symbol, but linkers convert them to relative relocations.

Usually only RELATIVE/IRELATIVE and potentially TPREL/TPOFF might require non-zero addends. Switching from `DT_RELA` to `DT_REL` offers a minor size advantage.

I considered defining two separate dynamic tags (`DT_CREL` and `DT_CRELA`) to distinguish between implicit and explicit addends.
However, this would have introduced complexity:

* Should `llvm-readelf -r` dump the zero addends for `DT_CRELA`?
* Should dynamic loaders support both dynamic tags?

I placed the delta addend bit next to offset bits so that it can be reused for offsets.

Thanks to Stefan O'Rear for convincing me that my original idea of reserving a single bit flag (`addend_bit`) within the CREL header is elegant. Dynamic loaders prioritizing simplicity can hardcode the desired `addend_bit` value.

`ld.lld -z crel` defaults to implicit addends (`addend_bit==0`), but the option of using in-relocation addends is available with `-z crel -z rela`.

**DT_AARCH64_AUTH_RELR vs CREL**

The AArch64 PAuth ABI introduces `DT_AARCH64_AUTH_RELR` as a variant of RELR for signed relocations. However, its benefit seems limited.

In a release build of Clang 16, using `-z crel -z rela` resulted in a `.crel.dyn` section size of only 1.0% of the file size. Notably, enabling implicit addends with `-z crel -z rel` further reduced the size to just 0.3%. While `DT_AARCH64_AUTH_RELR` will achieve a noticeably smaller relocation size if most relative relocations are encoded with it, the advantage seems less significant considering CREL's already compact size.

Furthermore, `DT_AARCH64_AUTH_RELR` introduces additional complexity to the linker due to its 32-bit addend limitation: the in-place 64-bit value encodes a 32-bit schema, giving just 32 bits to the implicit addend. If the addend does not fit into 32 bits, `DT_AARCH64_AUTH_RELR` cannot be used. CREL with addends would avoid this complexity.

I have filed [Quantifying the benefits of DT_AARCH64_AUTH_RELR](https://github.com/ARM-software/abi-aa/issues/252).

-- 
宋方睿
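
The loop earlier in this message calls `read_uleb128` and `read_sleb128` without defining them. As a rough illustration of how little code LEB128 decoding needs when one-byte values dominate, here is a minimal sketch. The helper bodies and the `crel_decode_offsets` wrapper are assumptions made for this example, not glibc code; the wrapper mirrors that loop for `addend_bit==0` and follows only the byte layout the loop implies, not the full CREL specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one unsigned LEB128 value and advance *pp past it.
   Values below 128 take the one-byte fast path.  */
uint64_t
read_uleb128 (const uint8_t **pp)
{
  const uint8_t *p = *pp;
  uint64_t value = *p++;
  if (value >= 128)
    {
      unsigned shift = 7;
      uint8_t byte;
      value &= 127;
      do
        {
          byte = *p++;
          assert (shift < 64);  /* over-long encodings are rejected */
          value |= (uint64_t) (byte & 127) << shift;
          shift += 7;
        }
      while (byte & 128);
    }
  *pp = p;
  return value;
}

/* Read one signed LEB128 value and advance *pp past it.  */
int64_t
read_sleb128 (const uint8_t **pp)
{
  const uint8_t *p = *pp;
  uint64_t value = 0;
  unsigned shift = 0;
  uint8_t byte;
  do
    {
      byte = *p++;
      assert (shift < 64);
      value |= (uint64_t) (byte & 127) << shift;
      shift += 7;
    }
  while (byte & 128);
  /* Sign-extend when bit 6 of the last byte is set.  */
  if (shift < 64 && (byte & 64))
    value |= ~(uint64_t) 0 << shift;
  *pp = p;
  return (int64_t) value;
}

/* Hypothetical wrapper around the loop from the message (addend_bit == 0):
   stores up to cap decoded addresses in out and returns the header count.  */
size_t
crel_decode_offsets (const uint8_t *p, uint64_t *out, size_t cap)
{
  uint64_t hdr = read_uleb128 (&p);
  unsigned shift = hdr & 3;
  size_t count = hdr >> 3, n = 0;
  uint64_t offset = 0;
  for (size_t i = 0; i < count; i++)
    {
      uint8_t rel_head = *p++;
      offset += rel_head >> 2;
      if (rel_head & 128)
        offset += (read_uleb128 (&p) << 5) - 32;
      if (rel_head & 2)
        read_sleb128 (&p);      /* optional field, ignored in this sketch */
      if (n < cap)
        out[n++] = offset << shift;
    }
  return count;
}
```

For example, under these assumptions the five bytes `13 82 40 25 04` decode to a count of 2 with addresses `0x4000` and `0x4008`: the header `0x13` gives count 2 and shift 3, the first entry carries one continuation byte plus one optional field, and the second entry is a single byte.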