References: <PAWPR08MB89828687CE183D4EC04DAAF483072@PAWPR08MB8982.eurprd08.prod.outlook.com>
In-Reply-To: <PAWPR08MB89828687CE183D4EC04DAAF483072@PAWPR08MB8982.eurprd08.prod.outlook.com>
From: Fangrui Song <maskray@gcc.gnu.org>
Date: Wed, 10 Apr 2024 19:41:16 -0700
Message-ID: <CAN30aBGaKwqCYSEhpjL6LX0+T65dzgpKR4pkXTudWEs1qWQ68g@mail.gmail.com>
Subject: Re: CREL dynamic relocations
To: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Cc: GNU C Library <libc-alpha@sourceware.org>,
 Florian Weimer <fweimer@redhat.com>
List-Id: Libc-alpha mailing list <libc-alpha.sourceware.org>

Thank you for your interest in the CREL relocation format.

On Tue, Apr 9, 2024 at 8:33 AM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> I like the general idea of more compact relocations, however what I don't get is
> what the overall goal is. If the goal is more compact object files, why don't we just
> add a (de)compress pass using a fast compression algorithm? CPU time is cheap
> today, and real compression easily gives 2-4x reduction of object file size, far more
> than you could achieve by just compressing relocations.

My primary goal is to make relocatable files smaller (see
https://sourceware.org/pipermail/binutils/2024-March/133229.html for a
few use cases).
Smaller files benefit applications in several ways, including less
I/O and lower linker memory usage (for linkers like gold, lld,
and mold that map input files into memory).

Generic data compression formats (like zlib or zstd) applied at the
filesystem level won't achieve this goal because they don't decrease
memory usage: the linker still maps the full-size file contents.
In addition, filesystem compression does not appear to be widely used.

Interestingly, I measured a 5.9% size reduction in .o files even after
zstd compression when comparing two Clang builds with and without
CREL.

    % ruby -e 'require "zstd-ruby"; un=com=0;
      Dir.glob("/tmp/out/s2-custom0/**/*.o").each{|f|
        x = File.open(f,"rb"){|h|h.read}; un+=x.size; com+=Zstd.compress(x).size};
      puts "uncompressed: #{un}\ncompressed: #{com}"'
    uncompressed: 136086784
    compressed: 37173381

    % ruby -e 'require "zstd-ruby"; un=com=0;
      Dir.glob("/tmp/out/s2-custom1/**/*.o").each{|f|
        x = File.open(f,"rb"){|h|h.read}; un+=x.size; com+=Zstd.compress(x).size};
      puts "uncompressed: #{un}\ncompressed: #{com}"'
    uncompressed: 111655952
    compressed: 34964421

    1-111655952/136086784 ~= 18.0% (uncompressed)
    1-34964421/37173381 ~= 5.9%    (zstd)

Another objective is to minimize the size of dynamic relocations.
Android achieves this through ld.lld --pack-dyn-relocs=android+relr,
which compacts RELA relocations in their packed format.
While effective, CREL offers a simpler approach that delivers even
greater size reductions.

> Alternatively, if we wanted to access and process ELF files without any decompression,
> we could define compact relocations as fixed-size entries. Using 64 bits for a compact
> RELA relocation gives a straightforward 4x compression. Out of range values could
> use the next entry to extend the ranges.

64 bits per relocation is quite large. CREL typically needs just one to
three bytes per relocation.
How do you design a format that is generic enough to work with all
relocation types and symbol indexes?

> So my main issue with the proposal is that it tries too hard to compress relocations.
> For example using offset compression for relocations, symbol indices and even addends
> seems to have little value: the signed offset means you lose one bit, and if out of range
> values are rare or not grouped together, offset encodings are actually less efficient.

I actually use an unsigned delta for the offset to save one bit, and
signed deltas for symidx and addend.
I have analyzed how many bits typical relocations require.
Quoting https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf#crel-relocation-format:

    Absolute symbol indexes allow one-byte encoding for symbols in the
    range [0,128) and offer a minor size advantage for static relocations
    when the symbol table is sorted by usage frequency. Delta encoding, on
    the other hand, might optimize for the scenario when the symbol table
    presents locality: neighbor symbols are frequently mutually called.

    Delta symbol index enables one-byte encoding for GOT/PLT dynamic
    relocations when .got/.got.plt entries are ordered by symbol index.
    For example, R_*_GLOB_DAT and R_*_JUMP_SLOT relocations can typically
    be encoded with repeated 0x05 0x01 (when addend_bit==0 && shift==3,
    offset++, symidx++). Delta encoding does give up the
    sorted-by-frequency advantage of absolute indexes, though it can
    partially reclaim that optimization by arranging symbols in a
    "cold0 hot cold1" pattern.

    In my experiments, absolute encoding with ULEB128 results in
    slightly larger .o file sizes for both x86-64 and AArch64 builds.
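
(Not part of the quoted blog text.) To make the absolute-versus-delta trade-off concrete, here is a minimal sketch that counts how many bytes a run of symbol indexes would cost under each encoding. The helper names (uleb128Size, sleb128Size, absoluteBytes, deltaBytes) are invented for this illustration and do not come from any linker. The sketch counts only the symbol index field and ignores the flag byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes ULEB128 needs for an unsigned value (7 payload bits per byte).
// One byte covers [0, 128).
inline size_t uleb128Size(uint64_t v) {
  size_t n = 1;
  while (v >= 0x80) {
    v >>= 7;
    ++n;
  }
  return n;
}

// Bytes SLEB128 needs for a signed value. One byte covers [-64, 64).
inline size_t sleb128Size(int64_t v) {
  size_t n = 1;
  while (v < -64 || v > 63) {
    v >>= 7;  // arithmetic shift (guaranteed since C++20)
    ++n;
  }
  return n;
}

// Total bytes when every symbol index is stored as an absolute ULEB128.
inline size_t absoluteBytes(const std::vector<uint32_t> &symidx) {
  size_t total = 0;
  for (uint32_t s : symidx) total += uleb128Size(s);
  return total;
}

// Total bytes when each symbol index is stored as a SLEB128 delta from the
// previous one (the first delta is taken from 0).
inline size_t deltaBytes(const std::vector<uint32_t> &symidx) {
  size_t total = 0;
  int64_t prev = 0;
  for (uint32_t s : symidx) {
    total += sleb128Size(int64_t(s) - prev);
    prev = s;
  }
  return total;
}
```

With the hypothetical run {200, 201, 202, 203} (entries ordered by symbol index), absoluteBytes returns 8 and deltaBytes returns 5. With hot symbols scattered inside [0,128), such as {5, 100, 7, 120}, absoluteBytes returns 4 and deltaBytes returns 7, which is the case where absolute indexes win.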

For a decoder that only supports in-reloc addends (recommended for
relocatable files), the C++ implementation is as simple as:

  // Header: relocation count (hdr / 8), addend bit (bit 2), shift (low 2 bits).
  const auto hdr = decodeULEB128(p);
  const size_t count = hdr / 8, shift = hdr % 4;
  Elf_Addr offset = 0, addend = 0;
  uint32_t symidx = 0, type = 0;
  for (size_t i = 0; i != count; ++i) {
    // Low 3 bits of b are flags; bits 3..6 hold the low bits of the offset delta.
    const uint8_t b = *p++;
    offset += b >> 3;
    // Bit 7 set: the remaining offset delta bits follow as ULEB128.
    if (b >= 0x80) offset += (decodeULEB128(p) << 4) - 0x10;
    if (b & 1) symidx += decodeSLEB128(p);
    if (b & 2) type += decodeSLEB128(p);
    if (b & 4) addend += decodeSLEB128(p);
    rels[i] = {offset << shift, symidx, type, addend};
  }

Using += for all of symidx/type/addend is for consistency, but the
choice also turns out to work very well.
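
For illustration, the matching encoder can be sketched as the inverse of the decoder above. This is a sketch under stated assumptions, not the lld implementation: the names Rel, putULEB128, putSLEB128 and encodeCrel are invented here, addends are always in-reloc (addend bit set), offsets are non-decreasing multiples of 1 << shift, and addend deltas are assumed to fit in int64_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rel {  // hypothetical in-memory relocation record
  uint64_t offset;
  uint32_t symidx, type;
  int64_t addend;
};

inline void putULEB128(std::vector<uint8_t> &out, uint64_t v) {
  do {
    uint8_t b = v & 0x7f;
    v >>= 7;
    out.push_back(v ? b | 0x80 : b);
  } while (v);
}

inline void putSLEB128(std::vector<uint8_t> &out, int64_t v) {
  for (;;) {
    uint8_t b = v & 0x7f;
    v >>= 7;  // arithmetic shift (guaranteed since C++20)
    bool done = (v == 0 && !(b & 0x40)) || (v == -1 && (b & 0x40));
    out.push_back(done ? b : b | 0x80);
    if (done) return;
  }
}

inline std::vector<uint8_t> encodeCrel(const std::vector<Rel> &rels,
                                       unsigned shift) {
  assert(shift < 4);
  std::vector<uint8_t> out;
  putULEB128(out, rels.size() * 8 + 4 + shift);  // count, addend bit, shift
  uint64_t offset = 0;  // previous offset, in units of 1 << shift
  uint32_t symidx = 0, type = 0;
  int64_t addend = 0;
  for (const Rel &r : rels) {
    assert((r.offset >> shift) << shift == r.offset);
    assert((r.offset >> shift) >= offset);
    const uint64_t dOff = (r.offset >> shift) - offset;
    // Low 3 bits are flags; bits 3..6 hold the low 4 bits of the offset delta.
    uint8_t b = (dOff & 15) << 3;
    if (r.symidx != symidx) b |= 1;
    if (r.type != type) b |= 2;
    if (r.addend != addend) b |= 4;
    if (dOff >= 16) {  // bit 7: more offset delta bits follow as ULEB128
      out.push_back(b | 0x80);
      putULEB128(out, dOff >> 4);
    } else {
      out.push_back(b);
    }
    // Only changed members are emitted, each as a signed delta.
    if (b & 1) putSLEB128(out, int64_t(r.symidx) - int64_t(symidx));
    if (b & 2) putSLEB128(out, int64_t(r.type) - int64_t(type));
    if (b & 4) putSLEB128(out, r.addend - addend);
    offset += dOff;
    symidx = r.symidx;
    type = r.type;
    addend = r.addend;
  }
  return out;
}
```

As a usage example with made-up values, encodeCrel({{0x10, 1, 4, -4}, {0x15, 2, 4, -4}, {0x115, 2, 2, 0}}, 0) yields the 12 bytes 1c 87 01 01 04 7c 29 01 86 10 7e 04, whereas three Elf64_Rela entries take 72 bytes.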

> I don't get the discussion about relocation numbers on AArch64 - 4 or 5 bits would
> handle all frequently used relocations, so we'd just remap them to fit in the short
> encoding. Hence I don't see a need at all for a signed offset encoding.

The common static relocation types are within [257,313] (before
R_AARCH64_PLT32).
Delta encoding allows the type of nearly every relocation except the
first to be encoded in a single byte.
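
The single-byte claim follows from the SLEB128 range: one byte holds a value in [-64, 64), and any two types in [257,313] differ by at most 56. A minimal sketch, where sleb128Size and typeFieldBytes are names invented for this illustration and the type numbers in the usage note are arbitrary sample values from that range:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes SLEB128 needs for a signed value. One byte covers [-64, 64).
inline size_t sleb128Size(int64_t v) {
  size_t n = 1;
  while (v < -64 || v > 63) {
    v >>= 7;  // arithmetic shift (guaranteed since C++20)
    ++n;
  }
  return n;
}

// Bytes spent on the type field when each relocation stores a SLEB128 delta
// from the previous relocation's type, starting from 0. This sketch charges a
// delta for every relocation; an encoder that omits unchanged types would
// spend even less.
inline std::vector<size_t> typeFieldBytes(const std::vector<uint32_t> &types) {
  std::vector<size_t> out;
  int64_t prev = 0;
  for (uint32_t t : types) {
    out.push_back(sleb128Size(int64_t(t) - prev));
    prev = t;
  }
  return out;
}
```

For the sample sequence {275, 283, 277, 311, 257}, typeFieldBytes returns {2, 1, 1, 1, 1}: only the first type, encoded as a delta from 0, needs a second byte.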

How do you design a compression scheme without baked-in knowledge (a
dictionary)?
We don't want the generic encoding scheme to hard-code the relocation
type range for each architecture.