unofficial mirror of libc-alpha@sourceware.org
* Are ifuncs intended to be allowed to resolve to symbols in another DSO?
@ 2020-01-09 18:29 Zack Weinberg
  2020-01-09 20:37 ` Florian Weimer
  0 siblings, 1 reply; 7+ messages in thread
From: Zack Weinberg @ 2020-01-09 18:29 UTC (permalink / raw)
  To: GNU C Library

Suppose I have two major versions of the same shared library
(libfoo.so.1 and libfoo.so.2) and the only difference is that
libfoo.so.2 drops a whole bunch of compatibility aliases.  For
instance, libfoo.so.1 defines two names, `blurf` and `xblurf`, for the
same function, but libfoo.so.2 defines only the `blurf` name.

Any program that winds up loading both shared libraries (via
transitive dependencies) is going to have two copies of the actual
code for `blurf` in memory.  I could eliminate this duplication by
having libfoo.so.1 be a thin wrapper around libfoo.so.2, providing
only a definition for `xblurf` that calls `blurf`.  Good so far, but
now old applications are making two jumps through the PLT whenever
they call `xblurf`.  It occurred to me to wonder whether I could
eliminate the extra indirection (on the second and subsequent calls)
by making xblurf an ifunc:

extern int blurf(char *arg1, int arg2); // defined in libfoo.so.2
static int (*resolve_xblurf(void))(char *, int)
{
  return blurf;
}
int xblurf(char *, int) __attribute__((ifunc("resolve_xblurf")));

GCC 9.2 is perfectly happy to compile this and link it against a test
shared library containing an almost-trivial definition of 'blurf' (all
it does is call strlen and do some arithmetic), and a test program
that calls 'xblurf' will also compile, link, and even run to
completion correctly (using the dynamic loader from glibc 2.29).
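
For readers who want to try the mechanism without building two
libraries, here is a minimal single-file sketch.  It is not the
two-DSO test described above: `blurf` is defined in the same
translation unit as a stand-in for the libfoo.so.2 function, and its
body (strlen plus some arithmetic) is an assumption made purely for
illustration.  It needs a toolchain and C library with GNU IFUNC
support (for example GCC with glibc on Linux).

```c
#include <string.h>

/* Stand-in for the function that would live in libfoo.so.2.
   The body is hypothetical; only the signature matters here. */
int blurf(char *arg1, int arg2)
{
  return (int) strlen(arg1) * arg2 + 1;
}

/* Resolver: called by the loader while processing relocations.
   It returns the address that calls to xblurf should bind to. */
static int (*resolve_xblurf(void))(char *, int)
{
  return blurf;
}

/* xblurf has no body of its own; its address comes from the resolver. */
int xblurf(char *, int) __attribute__((ifunc("resolve_xblurf")));
```

Because the resolver and its target share one object here, this says
nothing about cross-DSO relocation order, which is the actual question.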

However, reading https://sourceware.org/glibc/wiki/GNU_IFUNC gives me
the impression that this is probably *intended* to work but isn't
reliable in the current implementation, for reasons which I don't
follow.

So, my actual questions are: Is this intended to work?  If so, is this
actually reliable with current versions of the dynamic loader?  If so,
what is the oldest version of glibc where it's reliable?  Does it make
any difference if one or both of the libraries are linked with -z now?

zw


* Re: Are ifuncs intended to be allowed to resolve to symbols in another DSO?
  2020-01-09 18:29 Are ifuncs intended to be allowed to resolve to symbols in another DSO? Zack Weinberg
@ 2020-01-09 20:37 ` Florian Weimer
  2020-02-01  6:27   ` Carlos O'Donell
  0 siblings, 1 reply; 7+ messages in thread
From: Florian Weimer @ 2020-01-09 20:37 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: GNU C Library

* Zack Weinberg:

> Suppose I have two major versions of the same shared library
> (libfoo.so.1 and libfoo.so.2) and the only difference is that
> libfoo.so.2 drops a whole bunch of compatibility aliases.  For
> instance, libfoo.so.1 defines two names, `blurf` and `xblurf`, for the
> same function, but libfoo.so.2 defines only the `blurf` name.
>
> Any program that winds up loading both shared libraries (via
> transitive dependencies) is going to have two copies of the actual
> code for `blurf` in memory.  I could eliminate this duplication by
> having libfoo.so.1 be a thin wrapper around libfoo.so.2, providing
> only a definition for `xblurf` that calls `blurf`.  Good so far, but
> now old applications are making two jumps through the PLT whenever
> they call `xblurf`.  It occurred to me to wonder whether I could
> eliminate the extra indirection (on the second and subsequent calls)
> by making xblurf an ifunc:
>
> extern int blurf(char *arg1, int arg2); // defined in libfoo.so.2
> static int (*resolve_xblurf(void))(char *, int)
> {
>   return blurf;
> }
> int xblurf(char *, int) __attribute__((ifunc("resolve_xblurf")));

This only works if libfoo.so.1 has already been relocated when the IFUNC
resolver is called.  Old glibcs always relocated objects in the wrong
order (bug 12892, fixed in glibc 2.15, probably not widely backported).

Even with that fixed, missing DT_NEEDED entries and LD_PRELOAD can
result in calls to yet-unrelocated IFUNC resolvers.  I had a patch to
mostly fix this, but it added quite a bit of complexity (delayed
relocation processing), and there are still difficult-to-describe
failure cases for complex dependency trees.  The community wasn't
enthusiastic about it.

Lazy binding obscures some of these problems, to the degree that some
programmers call getenv in IFUNC resolvers.

Thanks,
Florian



* Re: Are ifuncs intended to be allowed to resolve to symbols in another DSO?
  2020-01-09 20:37 ` Florian Weimer
@ 2020-02-01  6:27   ` Carlos O'Donell
  2020-02-03 20:06     ` Florian Weimer
  0 siblings, 1 reply; 7+ messages in thread
From: Carlos O'Donell @ 2020-02-01  6:27 UTC (permalink / raw)
  To: Florian Weimer, Zack Weinberg; +Cc: GNU C Library

On 1/9/20 3:37 PM, Florian Weimer wrote:
> * Zack Weinberg:
> 
>> Suppose I have two major versions of the same shared library
>> (libfoo.so.1 and libfoo.so.2) and the only difference is that
>> libfoo.so.2 drops a whole bunch of compatibility aliases.  For
>> instance, libfoo.so.1 defines two names, `blurf` and `xblurf`, for the
>> same function, but libfoo.so.2 defines only the `blurf` name.
>>
>> Any program that winds up loading both shared libraries (via
>> transitive dependencies) is going to have two copies of the actual
>> code for `blurf` in memory.  I could eliminate this duplication by
>> having libfoo.so.1 be a thin wrapper around libfoo.so.2, providing
>> only a definition for `xblurf` that calls `blurf`.  Good so far, but
>> now old applications are making two jumps through the PLT whenever
>> they call `xblurf`.  It occurred to me to wonder whether I could
>> eliminate the extra indirection (on the second and subsequent calls)
>> by making xblurf an ifunc:
>>
>> extern int blurf(char *arg1, int arg2); // defined in libfoo.so.2
>> static int (*resolve_xblurf(void))(char *, int)
>> {
>>   return blurf;
>> }
>> int xblurf(char *, int) __attribute__((ifunc("resolve_xblurf")));
> 
> This only works if libfoo.so.1 has already been relocated when the IFUNC
> resolver is called.  Old glibcs always relocated objects in the wrong
> order (bug 12892, fixed in glibc 2.15, probably not widely backported).
> 
> Even with that fixed, missing DT_NEEDED entries and LD_PRELOAD can
> result in calls to yet-unrelocated IFUNC resolvers.  I had a patch to
> mostly fix this, but it added quite a bit of complexity (delayed
> relocation processing), and there are still difficult-to-describe
> failure cases for complex dependency trees.  The community wasn't
> enthusiastic about it.
> 
> Lazy binding obscures some of these problems, to the degree that some
> programmers call getenv in IFUNC resolvers.

If an IFUNC resolver calls a function which is not yet resolved then
isn't that a defect in the IFUNC resolver?

The code which contains such an IFUNC should have had a DT_NEEDED entry
on all objects to which the resolver called into?

In Zack's case it really means that the new libfoo.so.2 needs to depend
on libfoo.so.1 so the latter is loaded and relocated first. There are
some hairy scenarios with circular dependencies which can mean libfoo.so.1
is *not* relocated first, and Chung-Lin Tang's patches to fix the sort
order are a step in the right direction towards solving this, but arguably
you have undefined behaviour with circular dependencies which we're trying
to make more deterministic.

I like solutions that ensure that IFUNC resolvers, which are just foreign
functions, run at a stage *after* their own dependencies have been resolved,
but before their own initializers have run.

In summary:
- Zack's solution works if libfoo.so.2 depends on libfoo.so.1 and also doesn't
  invoke undefined behaviour with circular dependencies.
- Yes, lazy binding hides these problems by deferring the complex ordering
  requirements until a point where it doesn't matter (all DSO relocation is
  complete). We should still fix the problem for BIND_NOW though.

-- 
Cheers,
Carlos.



* Re: Are ifuncs intended to be allowed to resolve to symbols in another DSO?
  2020-02-01  6:27   ` Carlos O'Donell
@ 2020-02-03 20:06     ` Florian Weimer
  2020-02-03 22:03       ` Carlos O'Donell
  0 siblings, 1 reply; 7+ messages in thread
From: Florian Weimer @ 2020-02-03 20:06 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: Zack Weinberg, GNU C Library

* Carlos O'Donell:

> If an IFUNC resolver calls a function which is not yet resolved then
> isn't that a defect in the IFUNC resolver?

There is a position that an IFUNC resolver must not use any non-local
relocations.  (We must support some relocations for IFUNCs on
!PI_STATIC_AND_HIDDEN targets.)  In this case, yes, such dependencies
would be a bug because all external dependencies are bugs.

On the other hand, in glibc, we could get rid of many IFUNC resolvers
which violate this rule only after changing the loader that eliminated
their need for them.  Other IFUNCs that do not follow this rule do not
have this luxury.

> The code which contains such an IFUNC should have had a DT_NEEDED entry
> on all objects to which the resolver called into?

This does not help with symbol interposition.

> I like solutions that ensure that IFUNC resolvers, which are just foreign
> functions, run at a stage *after* their own dependencies have been resolved,
> but before their own initializers have run.

Lazy binding achieves this because it implicitly tracks this dependency
information.  For eager binding, we simply do not have this information.
Hence my patch for delayed relocation processing.  But it does not solve
the problem completely, of course.

Thanks,
Florian



* Re: Are ifuncs intended to be allowed to resolve to symbols in another DSO?
  2020-02-03 20:06     ` Florian Weimer
@ 2020-02-03 22:03       ` Carlos O'Donell
  2020-02-04  7:15         ` Fangrui Song
  2020-02-21 13:28         ` Florian Weimer
  0 siblings, 2 replies; 7+ messages in thread
From: Carlos O'Donell @ 2020-02-03 22:03 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Zack Weinberg, GNU C Library

On 2/3/20 3:06 PM, Florian Weimer wrote:
> * Carlos O'Donell:
> 
>> If an IFUNC resolver calls a function which is not yet resolved then
>> isn't that a defect in the IFUNC resolver?
> 
> There is a position that an IFUNC resolver must not use any non-local
> relocations.  (We must support some relocations for IFUNCs on
> !PI_STATIC_AND_HIDDEN targets.)  In this case, yes, such dependencies
> would be a bug because all external dependencies are bugs.

Just so I understand: the position is, to simplify the implementation, that
IFUNC resolvers must not use any non-local relocations?

What does non-local relocations mean? Are we saying the target of all
relocations must be within the same linkmap as the resolver? Are we saying
then that resolvers can only manipulate local data and make local function
calls?

I don't think that would be useful since you'd want to call many libc functions
from the resolver itself and so you would have calls in the resolver that
could call through a GOT entry that has a relocation against an external
symbol in another DSO e.g. libc.so.6.

If an IFUNC resolver has relocations against non-local symbols
then those symbols must have been a dependency of the object and listed
in DT_NEEDED and therefore relocated before the current IFUNC resolver
is running.

> On the other hand, in glibc, we could get rid of many IFUNC resolvers
> which violate this rule only after changing the loader that eliminated
> their need for them.  Other IFUNCs that do not follow this rule do not
> have this luxury.

I don't quite parse your first sentence here, could you expand on that please?

When you speak about luxury, you mean to say that IFUNCs in other projects
would not have the same possibility to solve such a problem like we do in
glibc?

>> The code which contains such an IFUNC should have had a DT_NEEDED entry
>> on all objects to which the resolver called into?
> 
> This does not help with symbol interposition.

What exact problem does symbol interposition cause?

>> I like solutions that ensure that IFUNC resolvers, which are just foreign
>> functions, run at a stage *after* their own dependencies have been resolved,
>> but before their own initializers have run.
> 
> Lazy binding achieves this because it implicitly tracks this dependency
> information.  For eager binding, we simply do not have this information.
> Hence my patch for delayed relocation processing.  But it does not solve
> the problem completely, of course.

The dependencies are explicitly tracked by DT_NEEDED.

Then when we have underlinked libraries we try to supplement this and reorder
by relocation dependencies. This may be causing more problems than it solves?

If we had good visibility into the load order manipulations because of the
relocations I think users would understand the consequences of these problems
and we would also be able to better identify underlinked applications and file
bugs to fix them (if possible).

-- 
Cheers,
Carlos.



* Re: Are ifuncs intended to be allowed to resolve to symbols in another DSO?
  2020-02-03 22:03       ` Carlos O'Donell
@ 2020-02-04  7:15         ` Fangrui Song
  2020-02-21 13:28         ` Florian Weimer
  1 sibling, 0 replies; 7+ messages in thread
From: Fangrui Song @ 2020-02-04  7:15 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: Florian Weimer, Zack Weinberg, GNU C Library

On 2020-02-03, Carlos O'Donell wrote:
>On 2/3/20 3:06 PM, Florian Weimer wrote:
>> * Carlos O'Donell:
>>
>>> If an IFUNC resolver calls a function which is not yet resolved then
>>> isn't that a defect in the IFUNC resolver?
>>
>> There is a position that an IFUNC resolver must not use any non-local
>> relocations.  (We must support some relocations for IFUNCs on
>> !PI_STATIC_AND_HIDDEN targets.)  In this case, yes, such dependencies
>> would be a bug because all external dependencies are bugs.
>
>Just so I understand: the position is, to simplify the implementation, that
>IFUNC resolvers must not use any non-local relocations?
>
>What does non-local relocations mean? Are we saying the target of all
>relocations must be within the same linkmap as the resolver? Are we saying
>then that resolvers can only manipulate local data and make local function
>calls?
>
>I don't think that would be useful since you'd want to call many libc functions
>from the resolver itself and so you would have calls in the resolver that
>could call through a GOT entry that has a relocation against an external
>symbol in another DSO e.g. libc.so.6.
>
>If an IFUNC resolver has relocations against non-local symbols
>then those symbols must have been a dependency of the object and listed
>in DT_NEEDED and therefore relocated before the current IFUNC resolver
>is running.
>
>> On the other hand, in glibc, we could get rid of many IFUNC resolvers
>> which violate this rule only after changing the loader that eliminated
>> their need for them.  Other IFUNCs that do not follow this rule do not
>> have this luxury.
>
>I don't quite parse your first sentence here, could you expand on that please?
>
>When you speak about luxury, you mean to say that IFUNCs in other projects
>would not have the same possibility to solve such a problem like we do in
>glibc?
>
>>> The code which contains such an IFUNC should have had a DT_NEEDED entry
>>> on all objects to which the resolver called into?
>>
>> This does not help with symbol interposition.
>
>What exact problem does symbol interposition cause?
>
>>> I like solutions that ensure that IFUNC resolvers, which are just foreign
>>> functions, run at a stage *after* their own dependencies have been resolved,
>>> but before their own initializers have run.
>>
>> Lazy binding achieves this because it implicitly tracks this dependency
>> information.  For eager binding, we simply do not have this information.
>> Hence my patch for delayed relocation processing.  But it does not solve
>> the problem completely, of course.
>
>The dependencies are explicitly tracked by DT_NEEDED.
>
>Then when we have underlinked libraries we try to supplement this and reorder
>by relocation dependencies. This may be causing more problems than it solves?
>
>If we had good visibility into the load order manipulations because of the
>relocations I think users would understand the consequences of these problems
>and we would also be able to better identify underlinked applications and file
>bugs to fix them (if possible).

On powerpc64: A is in one module, B and C are in another module. C is a
non-preemptible ifunc. A calls B, B tail calls C.

The GNU ld generated IPLT code sequence starts with `std r2,24(r1)` .
When B tail calls C, it has popped its stack frame. `std r2,24(r1)`
overwrites the TOC pointer from A's module with the TOC pointer from
B,C's module. When C returns to the call site in A, the TOC pointer
incorrectly restores to the value from B,C's module.

The lld generated IPLT code sequence does not use `std r2,24(r1)`.
The above scenario works, but the resolver and the implementation cannot
be in different modules.

So for powerpc64, an IFUNC resolver returning the implementation in
another module may be unreliable.


On powerpc32 Secure PLT: A is in one translation unit, B and C are in
another translation unit.  A, B and C are in the same module. C is a
non-preemptible ifunc. B takes the address of C and passes it to A. A
calls the function pointer.

The IPLT code sequence assumes r30 = .got2(B)+0x8000. Calling it from A
(.got2(A)!=.got2(B)) will be wrong.

So for powerpc, even different translation units can be unreliable.


* Re: Are ifuncs intended to be allowed to resolve to symbols in another DSO?
  2020-02-03 22:03       ` Carlos O'Donell
  2020-02-04  7:15         ` Fangrui Song
@ 2020-02-21 13:28         ` Florian Weimer
  1 sibling, 0 replies; 7+ messages in thread
From: Florian Weimer @ 2020-02-21 13:28 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: Zack Weinberg, GNU C Library

* Carlos O'Donell:

> On 2/3/20 3:06 PM, Florian Weimer wrote:
>> * Carlos O'Donell:
>> 
>>> If an IFUNC resolver calls a function which is not yet resolved then
>>> isn't that a defect in the IFUNC resolver?
>> 
>> There is a position that an IFUNC resolver must not use any non-local
>> relocations.  (We must support some relocations for IFUNCs on
>> !PI_STATIC_AND_HIDDEN targets.)  In this case, yes, such dependencies
>> would be a bug because all external dependencies are bugs.
>
> Just so I understand: the position is, to simplify the implementation, that
> IFUNC resolvers must not use any non-local relocations?

It means no relocation dependencies at all for PI_STATIC_AND_HIDDEN
targets.  I think this is actually the current consensus, as expressed
in the implementation.
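
As a rough illustration of what a resolver with no relocation
dependencies might look like, the sketch below picks between two
implementations using only file-local functions and data.  All names
(`count_bytes`, `use_pairs`, and so on) are hypothetical, and whether
the compiler really emits no dynamic relocations for the resolver
depends on the target and flags, so checking with `readelf -r` is
advisable.

```c
#include <stddef.h>

/* Two interchangeable, file-local implementations. */
static size_t count_bytes_simple(const char *s)
{
  size_t n = 0;
  while (s[n] != '\0')
    n++;
  return n;
}

static size_t count_bytes_pairs(const char *s)
{
  size_t n = 0;
  /* Advance two bytes at a time while both are non-NUL. */
  while (s[n] != '\0' && s[n + 1] != '\0')
    n += 2;
  return s[n] != '\0' ? n + 1 : n;
}

/* Hypothetical selector; file-local, so no external symbol is needed. */
static const int use_pairs = 1;

/* The resolver touches only symbols defined in this file: no libc
   calls, no getenv, nothing that could bind to another DSO. */
static size_t (*resolve_count_bytes(void))(const char *)
{
  return use_pairs ? count_bytes_pairs : count_bytes_simple;
}

size_t count_bytes(const char *) __attribute__((ifunc("resolve_count_bytes")));
```

A real resolver would normally base its choice on hardware
capabilities rather than a constant; the constant just keeps the
sketch free of external references.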

I don't know how to express this requirement for !PI_STATIC_AND_HIDDEN
targets.  Fangrui Song explained potential problems there.

We probably should not add IFUNC support for targets that are not
PI_STATIC_AND_HIDDEN.  If every function call and data reference goes
through a GOT-like construct, that can never work with IFUNCs with
symbol references that do not strictly follow the order of dependencies
and thus relocation processing in the dynamic loader.

> I don't think that would be useful since you'd want to call many libc
> functions from the resolver itself and so you would have calls in the
> resolver that could call through a GOT entry that has a relocation
> against an external symbol in another DSO e.g. libc.so.6.

ELF constructors are another concern here.  Relocation processing
happens before they run, so such calls will see uninitialized libraries.
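
A small sketch of that ordering, assuming GCC-style `constructor` and
`ifunc` attributes; every name in it is made up for illustration.  The
resolver records whether the constructor in the same object had
already run when it was invoked.  With eager processing of the IFUNC
relocation one would expect it to record 0, but the sketch does not
rely on that, since lazy binding can defer the resolver to the first
call.

```c
/* Set to 1 by the constructor below. */
static int constructor_ran = 0;

/* -1 until the resolver runs; afterwards a snapshot of constructor_ran. */
static int resolver_saw_constructor = -1;

__attribute__((constructor))
static void init_object(void)
{
  constructor_ran = 1;
}

static int answer_impl(void)
{
  return 42;
}

/* Records the initialization state visible at resolution time. */
static int (*resolve_answer(void))(void)
{
  resolver_saw_constructor = constructor_ran;
  return answer_impl;
}

int answer(void) __attribute__((ifunc("resolve_answer")));
```

Printing `resolver_saw_constructor` after calling `answer()` shows
which order was used on a given system.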

We can make this work for libc.so.6, but currently, it's not very
reliable, especially for things that users may actually want to use, like
getenv.

>> On the other hand, in glibc, we could get rid of many IFUNC resolvers
>> which violate this rule only after changing the loader that eliminated
>> their need for them.  Other IFUNCs that do not follow this rule do not
>> have this luxury.
>
> I don't quite parse your first sentence here, could you expand on that
> please?

The toolchain currently does not permit IFUNC resolvers which depend on
most run-time relocations to work.  libpthread violated that
requirement for vfork (and earlier for other functions).

>>> The code which contains such an IFUNC should have had a DT_NEEDED entry
>>> on all objects to which the resolver called into?
>> 
>> This does not help with symbol interposition.
>
> What exact problem does symbol interposition cause?

It introduces symbol dependencies which are not reflected in the
DT_NEEDED dependencies.  This means that the relocation order chosen by
the dynamic loader does not meet the actual requirements of IFUNC
resolvers.

>>> I like solutions that ensure that IFUNC resolvers, which are just foreign
>>> functions, run at a stage *after* their own dependencies have been resolved,
>>> but before their own initializers have run.
>> 
>> Lazy binding achieves this because it implicitly tracks this dependency
>> information.  For eager binding, we simply do not have this information.
>> Hence my patch for delayed relocation processing.  But it does not solve
>> the problem completely, of course.
>
> The dependencies are explicitly tracked by DT_NEEDED.

Not for symbol interposition.

Thanks,
Florian


