unofficial mirror of libc-alpha@sourceware.org
From: Fangrui Song <maskray@google.com>
To: Florian Weimer <fweimer@redhat.com>
Cc: Fangrui Song <maskray@gcc.gnu.org>, libc-alpha@sourceware.org
Subject: Re: CREL dynamic relocations
Date: Mon, 25 Mar 2024 11:51:09 -0700
Message-ID: <CAFP8O3K=40N03GsgYCdp6ErOkc0XiAOKknUNsAR+58vOFcOeyw@mail.gmail.com>
In-Reply-To: <87bk72a5pv.fsf@oldenburg.str.redhat.com>

On Mon, Mar 25, 2024 at 4:53 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Fangrui Song:
>
> > I have proposed a compact relocation format CREL at
> > https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/eiBcYxSfAQAJ
> > (previously named RELLEB).
> >
> > CREL primarily targets static relocations, achieving significant .o
> > file size reduction for lld builds: 18.0% for x86-64/aarch64 and 34.3%
> > for riscv64.
> > CREL holds promise for dynamic relocations as well, surpassing
> > Android's packed relocation format.
>
> As I said elsewhere, I'm concerned about the use of the ULEB128
> encoding.  It's unnecessarily difficult to decode.
>
> Thanks,
> Florian

Thanks. I have seen your question at
https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/osMXhg5XAgAJ
and replied there that, since one-byte encodings dominate our use
cases, LEB128 is actually the best choice (in terms of both
performance and simplicity).
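
For reference, here is a minimal sketch of a ULEB128 decoder with a
one-byte fast path. The name `read_uleb128` matches the helper used in
the snippet later in this message, but the body is my illustration, not
glibc code. It does no bounds checking and assumes well-formed input.

```c
#include <stdint.h>

/* Decode one ULEB128 value at *pp and advance *pp past it.
   Assumption: the input is well-formed; no bounds checking is done.  */
static uint64_t read_uleb128(const uint8_t **pp) {
  const uint8_t *p = *pp;
  uint64_t byte = *p++;
  uint64_t value = byte & 0x7f;
  if (byte >= 0x80) {
    /* Slow path: only taken when the value does not fit in 7 bits.  */
    unsigned shift = 7;
    do {
      byte = *p++;
      value |= (byte & 0x7f) << shift;
      shift += 7;
    } while (byte >= 0x80 && shift < 64);
  }
  *pp = p;
  return value;
}
```

When most values fit in one byte, the common case is a load, a mask,
and one well-predicted branch.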

I researched the dynamic relocation problem over the weekend and
incorporated the following text into my blog post:


Traditionally, we have two dynamic relocation ranges for executables
and shared objects (except static position-dependent executables):

* `.rela.dyn` (`[DT_RELA, DT_RELA + DT_RELASZ)`) or `.rel.dyn`
(`[DT_REL, DT_REL + DT_RELSZ)`)
* `.rela.plt` (`[DT_JMPREL, DT_JMPREL + DT_PLTRELSZ)`): Stores
JUMP_SLOT relocations. `DT_PLTREL` specifies whether the entries use
the `DT_REL` or `DT_RELA` format.

IRELATIVE relocations can be placed in either range, but preferably
in `.rel[a].dyn`.

Some GNU ld ports (e.g. SPARC) treat `.rela.plt` as a subset of
`.rela.dyn`, introducing complexity for dynamic loaders.

**CREL adoption considerations**

* New dynamic tag (`DT_CREL`): To identify CREL relocations, separate
from existing `DT_REL`/`DT_RELA`.
* No `DT_CRELSZ`: Relocation count can be derived from the CREL header.
* The output section description
`.rela.dyn : { *(.rela.dyn) *(.rela.plt) }` is incompatible with CREL.
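
As a hedged sketch of why no size tag is needed: the section starts
with one ULEB128 header value that carries the relocation count. The
bit layout below follows the snippet later in this message (count in
the high bits, shift in the low 2 bits). Treating the remaining bit as
`addend_bit` is my assumption, and `struct crel_header` and
`crel_split_header` are hypothetical names.

```c
#include <stdint.h>

/* Fields packed into the single ULEB128 header value.  */
struct crel_header {
  uint64_t count;       /* number of relocations in the section */
  unsigned addend_bit;  /* assumed: 1 if relocations carry explicit addends */
  unsigned shift;       /* offsets are stored right-shifted by this amount */
};

/* Split an already-decoded header value into its fields.  */
static struct crel_header crel_split_header(uint64_t hdr) {
  struct crel_header h;
  h.count = hdr >> 3;
  h.addend_bit = (hdr >> 2) & 1;
  h.shift = hdr & 3;
  return h;
}
```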

**Challenges with lazy binding**

glibc's lazy binding scheme relies on [random access to relocation
entries within the `DT_JMPREL`
table](https://maskray.me/blog/2021-09-19-all-about-procedure-linkage-table#:~:text=_dl_fixup).
CREL's sequential nature prevents this. However, eager binding doesn't
require random access.
Therefore, when `-z now` (eager binding) is enabled, we can:

* Set `DT_PLTREL` to `DT_CREL`.
* Replace `.rel[a].plt` with `.crel.plt`.
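
To illustrate the random-access requirement, here is a minimal sketch.
`struct rela_entry` is a stand-in with the same three 64-bit fields as
`Elf64_Rela`, and `jmprel_entry` is a hypothetical helper, not glibc's
`_dl_fixup`. With fixed-size entries, the value passed by the PLT stub
(an index or byte offset, depending on the target) locates the entry
in constant time. CREL's variable-length entries offer no such lookup:
finding entry `i` means decoding entries `0..i-1` first.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Elf64_Rela: every entry has the same size.  */
struct rela_entry {
  uint64_t r_offset;
  uint64_t r_info;
  int64_t r_addend;
};

/* Constant-time lookup of the index-th entry in a DT_JMPREL-style table.  */
static const struct rela_entry *
jmprel_entry(const struct rela_entry *jmprel, size_t index) {
  return &jmprel[index];
}
```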

**Challenges with statically linked position-dependent executables**

glibc introduces additional complexity for IRELATIVE relocations in
statically linked position-dependent executables.
Such executables should contain only IRELATIVE relocations and no
other dynamic relocations.

glibc's `csu/libc-start.c` processes IRELATIVE relocations in the
range [`[__rela_iplt_start,
__rela_iplt_end)`](https://maskray.me/blog/2021-01-18-gnu-indirect-function#non-preemptible-ifunc#rela_iplt_start-and-__rela_iplt_end)
(or `[__rel_iplt_start, __rel_iplt_end)`, determined at build time
through `ELF_MACHINE_IREL`).
While CREL relocations cannot be decoded in the middle of the section,
we can still place IRELATIVE relocations in `.crel.dyn` because there
wouldn't be any other relocation types (position-dependent executables
don't have RELATIVE relocations).
When CREL is enabled, we can define `__crel_iplt_start` and
`__crel_iplt_end` for statically linked position-dependent
executables.

If glibc only intends to support `addend_bit==0`, the code can simply be:
```c
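  /* Weak symbols the linker would define around the CREL IRELATIVE relocations.  */
  extern const uint8_t __crel_iplt_start[] __attribute__ ((weak));
  extern const uint8_t __crel_iplt_end[] __attribute__ ((weak));
  if (&__crel_iplt_start != &__crel_iplt_end) {
    const uint8_t *p = __crel_iplt_start;
    /* Header: relocation count in the high bits, shift in the low 2 bits.  */
    size_t offset = 0, count = read_uleb128 (&p), shift = count & 3;
    for (count >>= 3; count; count--) {
      uint8_t rel_head = *p++;
      /* Offset delta bits held in the first byte.  */
      offset += rel_head >> 2;
      /* Continuation bit set: the remaining delta bits follow as ULEB128;
         the -32 cancels the continuation bit counted above.  */
      if (rel_head & 128)
        offset += (read_uleb128 (&p) << 5) - 32;
      /* Skip the relocation type delta; only IRELATIVE is expected here.  */
      if (rel_head & 2)
        read_sleb128 (&p);
      elf_crel_irel ((ElfW (Addr) *) (offset << shift));
    }
  }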
```
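
The snippet above relies on `read_uleb128` and `read_sleb128` helpers
and on linker-defined symbols. For experimenting outside glibc, here is
a self-contained sketch of the same loop that collects the decoded
addresses into an array instead of calling `elf_crel_irel`.
`crel_decode_offsets` is a hypothetical name, the helpers are my own
minimal versions without bounds checking, and the code assumes
`addend_bit==0` and well-formed input.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal ULEB128 decoder; assumes well-formed input.  */
static uint64_t read_uleb128(const uint8_t **pp) {
  const uint8_t *p = *pp;
  uint64_t value = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    byte = *p++;
    value |= (uint64_t)(byte & 0x7f) << shift;
    shift += 7;
  } while (byte >= 0x80 && shift < 64);
  *pp = p;
  return value;
}

/* Minimal SLEB128 decoder; assumes well-formed input.  */
static int64_t read_sleb128(const uint8_t **pp) {
  const uint8_t *p = *pp;
  uint64_t value = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    byte = *p++;
    value |= (uint64_t)(byte & 0x7f) << shift;
    shift += 7;
  } while (byte >= 0x80 && shift < 64);
  /* Sign-extend if the last byte has its sign bit (0x40) set.  */
  if (shift < 64 && (byte & 0x40))
    value |= ~(uint64_t)0 << shift;
  *pp = p;
  return (int64_t)value;
}

/* Decode a CREL stream (addend_bit==0 assumed) and store up to max
   relocation addresses in out. Returns the number stored.  */
static size_t crel_decode_offsets(const uint8_t *p, uint64_t *out, size_t max) {
  uint64_t hdr = read_uleb128(&p);
  unsigned shift = hdr & 3;
  uint64_t count = hdr >> 3, offset = 0;
  size_t n = 0;
  for (; count; count--) {
    uint8_t rel_head = *p++;
    offset += rel_head >> 2;
    if (rel_head & 128)
      offset += (read_uleb128(&p) << 5) - 32;
    if (rel_head & 2)
      read_sleb128(&p); /* skipped delta, as in the snippet above */
    if (n < max)
      out[n++] = offset << shift;
  }
  return n;
}
```

For example, the byte stream `1b 0a 25 04 90 03` has a header of 27
(count 3, shift 3) and decodes to the addresses 16, 24, and 824.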

**Considering implicit addends for CREL**

Many dynamic relocations have zero addends:

* COPY/GLOB_DAT/JUMP_SLOT relocations only use zero addends.
* Absolute relocations could use non-zero addends with `STT_SECTION`
symbol, but linkers convert them to relative relocations.

Usually only RELATIVE/IRELATIVE and potentially TPREL/TPOFF might
require non-zero addends.
Switching from `DT_RELA` to `DT_REL` offers a minor size advantage.

I considered defining two separate dynamic tags (`DT_CREL` and
`DT_CRELA`) to distinguish between implicit and explicit addends.
However, this would have introduced complexity:

* Should `llvm-readelf -r` dump the zero addends for `DT_CRELA`?
* Should dynamic loaders support both dynamic tags?

I placed the delta addend bit next to the offset bits so that it can
be reused for offsets.
Thanks to Stefan O'Rear for convincing me that my original idea of
reserving a single bit flag (`addend_bit`) within the CREL header is
elegant.
Dynamic loaders prioritizing simplicity can hardcode the desired
`addend_bit` value.

`ld.lld -z crel` defaults to implicit addends (`addend_bit==0`), but
the option of using in-relocation addends is available with `-z crel
-z rela`.

**DT_AARCH64_AUTH_RELR vs CREL**

The AArch64 PAuth ABI introduces `DT_AARCH64_AUTH_RELR` as a variant
of RELR for signed relocations.
However, its benefit seems limited.

In a release build of Clang 16, using `-z crel -z rela` resulted in a
`.crel.dyn` section size of only 1.0% of the file size.
Notably, enabling implicit addends with `-z crel -z rel` further
reduced the size to just 0.3%.
While `DT_AARCH64_AUTH_RELR` will achieve a noticeably smaller
relocation size if most relative relocations are encoded with it, the
advantage seems less significant considering CREL's already compact
size.

Furthermore, `DT_AARCH64_AUTH_RELR` introduces additional complexity
to the linker due to its 32-bit addend limitation: the in-place 64-bit
value encodes a 32-bit schema, leaving just 32 bits for the implicit
addend.
If the addend does not fit into 32 bits, `DT_AARCH64_AUTH_RELR` cannot be used.
CREL with addends would avoid this complexity.

I have filed [Quantifying the benefits of
DT_AARCH64_AUTH_RELR](https://github.com/ARM-software/abi-aa/issues/252).




-- 
宋方睿

