unofficial mirror of libc-alpha@sourceware.org
From: "Zack Weinberg" <zack@owlfolio.org>
To: <libc-alpha@sourceware.org>
Subject: Maybe we should get rid of ifuncs
Date: Tue, 23 Apr 2024 14:14:23 -0400
Message-ID: <D0RPG6348D0S.1F9SCYCGKZ3VI@owlfolio.org>

I've been thinking about the XZ exploit (two versions of the compression
library `liblzma` included Trojan horse code that injected a back
door into sshd; see https://research.swtch.com/xz-timeline) and what
it means for glibc, and my conclusion is that we should reconsider the
entire idea of ifuncs.

The SSH protocol does not use XZ compression.  liblzma.so was loaded
into sshd's address space because some Linux distributions patched
sshd to use libsystemd, and some libsystemd functions (having to do
with systemd's "journal" logging subsystem, IIUC) do use liblzma.  By
itself that wouldn't have been enough to give the exploit control,
because the patched sshd doesn't use any of those functions.  But
these same Linux distributions also link sshd with -z now
(ironically, as a hardening measure, together with -z relro), and that
means the resolvers for all the ifuncs in *all* the loaded shared
libraries will be invoked, early enough in process startup that the
PLT and GOT are still writable.  The XZ exploit used an ifunc resolver
to rewrite a whole bunch of PLT entries, intercepting both calls
within sshd proper, and calls from sshd to libcrypto.so
(i.e. OpenSSL's general-purpose cryptography library).

Ifuncs were already a problem -- resolvers are arbitrary application
code that gets called from deep within the guts of the dynamic loader,
possibly while internal locks are held (I don't know for sure).
In -z now mode, they are called not just before the core C library is
fully initialized, but before symbol resolution is complete, meaning
that they can't necessarily make *any* function calls; we've had any
number of bug reports about this.  They seem to be less troublesome in
lazy binding mode, as far as I can tell, but I can still imagine them
causing trouble (e.g. due to recursive invocation of the lazy symbol
resolution machinery, or due to injecting non-async-signal-safe code
into a call, from a signal handler, to a function that's *supposed* to
be async-signal-safe).  The glibc wiki page for ifuncs
(https://sourceware.org/glibc/wiki/GNU_IFUNC) warns readers that ifunc
resolvers are subject to severe restrictions that aren't documented or
even agreed upon.
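
For readers who have not written one, here is a minimal sketch of what a
conventional ifunc looks like.  It assumes x86-64, GCC or Clang, and a
glibc-based system; the names my_copy, resolve_my_copy, copy_generic and
copy_avx2 are hypothetical, and both "implementations" are stand-ins
that simply call memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: a real library would supply distinct,
   CPU-specific implementations here. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

static void *copy_avx2(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

typedef void *(*copy_fn)(void *, const void *, size_t);

/* The resolver: ordinary C code that the dynamic loader calls while it
   is processing relocations.  Nothing stops it from doing far more
   than returning a function pointer. */
static copy_fn resolve_my_copy(void)
{
    /* GCC documents that __builtin_cpu_init must be called first when
       __builtin_cpu_supports is used inside an ifunc resolver. */
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? copy_avx2 : copy_generic;
}

/* Calls to my_copy are bound to whatever resolve_my_copy returns. */
void *my_copy(void *dst, const void *src, size_t n)
    __attribute__((ifunc("resolve_my_copy")));
```

Callers simply call my_copy(dst, src, n).  The point of the sketch is
that the selection logic is arbitrary code, which is what makes the
restrictions on resolvers so hard to pin down.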

As far as I know, the only legitimate (non-malicious) use case anyone
wants for ifuncs is to allow a library to select one of several
implementations of a single function, based on the characteristics of
the CPU -- such as how glibc itself selects the best available
implementation of `memcpy` for the CPU.  It seems to me that we ought
to be able to come up with a completely declarative mechanism for this
use case.  Perhaps a library could supply an array of candidate
implementations of a function, each paired with a bit vector that
declares all of the CPU capabilities that that implementation
requires, sorted from most to least stringent, and the dynamic loader
could run down the list and pick the first one that will work.  This
would avoid all the problems with calling application code from the
guts of the loader.  And, in -z relro -z now mode, it would mean that
no application code could run before the PLT and GOT are made
read-only, closing the path that the XZ trojan used to hook itself
into sshd.  We'd have to keep STT_GNU_IFUNC support around for at
least a few releases, but we could officially deprecate it and provide
a tunable and/or a build-time switch to disable it.
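
The table-driven selection described above could be sketched roughly as
follows.  This is only an illustration under stated assumptions: the
CAP_* bits, struct candidate, select_candidate and the copy_* functions
are all hypothetical names, the capability word is a plain 64-bit
mask, and nothing here corresponds to an existing glibc or ELF
interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capability bits; a real design would define these
   per architecture. */
#define CAP_SSE2    (UINT64_C(1) << 0)
#define CAP_AVX2    (UINT64_C(1) << 1)
#define CAP_AVX512F (UINT64_C(1) << 2)

typedef void *(*copy_fn)(void *, const void *, size_t);

/* One candidate: an implementation plus every capability it needs. */
struct candidate {
    uint64_t required;
    copy_fn impl;
};

/* Stand-in implementations; all three just call memcpy here. */
static void *copy_generic(void *d, const void *s, size_t n) { return memcpy(d, s, n); }
static void *copy_avx2(void *d, const void *s, size_t n)    { return memcpy(d, s, n); }
static void *copy_avx512(void *d, const void *s, size_t n)  { return memcpy(d, s, n); }

/* Sorted from most to least stringent; the last entry needs nothing. */
static const struct candidate copy_candidates[] = {
    { CAP_AVX512F | CAP_AVX2 | CAP_SSE2, copy_avx512 },
    { CAP_AVX2 | CAP_SSE2,               copy_avx2 },
    { 0,                                 copy_generic },
};

/* What the loader would do: pure data comparison, no library code run.
   Returns the first candidate whose requirements are all present in
   cpu_caps, or NULL if none qualifies. */
static copy_fn select_candidate(const struct candidate *table,
                                size_t count, uint64_t cpu_caps)
{
    assert(table != NULL);
    for (size_t i = 0; i < count; i++) {
        if ((table[i].required & ~cpu_caps) == 0)
            return table[i].impl;
    }
    return NULL;
}

/* Convenience wrapper for the table above. */
static copy_fn select_copy(uint64_t cpu_caps)
{
    return select_candidate(copy_candidates,
                            sizeof copy_candidates / sizeof copy_candidates[0],
                            cpu_caps);
}
```

With this layout, select_copy(CAP_SSE2 | CAP_AVX2) yields copy_avx2 and
select_copy(0) falls through to copy_generic.  Whether a single bit
vector per candidate is expressive enough is exactly what question (2)
below asks.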

To figure out if this is a workable idea, questions for you all:
(1) Are there other use cases for ifuncs that I don't know about?
(2) Are there existing ifuncs that perform CPU-capability-based
function selection that *could not* be replaced with an array of bit
vectors like what I sketched in the previous paragraph?

zw
