unofficial mirror of libc-alpha@sourceware.org
* CPython vs libstdc++
@ 2019-07-11 16:13 Zack Weinberg
  2019-07-11 16:50 ` Szabolcs Nagy
  2019-07-12  2:05 ` Carlos O'Donell
  0 siblings, 2 replies; 4+ messages in thread
From: Zack Weinberg @ 2019-07-11 16:13 UTC (permalink / raw)
  To: GNU C Library, libstdc++; +Cc: Sumana Harihareswara

I have been investigating a mysterious problem with Python extension
modules that use C++ internally.  If I'm right about the cause, I
suspect that it can't be fixed without changes to the dynamic linker,
and I think we may need to have a dialogue between Python core
maintainers and GNU toolchain maintainers to figure out what Python
wants to be possible and how much of that is feasible for GCC and
glibc to support.

The surface symptoms of the problem are that, if you load two
unrelated modules, both of which use "enough" C++ features internally,
into the same process, the entire interpreter crashes, with stack
traces pointing at the guts of libstdc++.  It is unclear exactly which
C++ features trigger the crash and it is also unclear whether it
matters what version or versions of G++ the modules were compiled by.
I have not had any luck constructing a minimal test case.

People who are deeply familiar with the internals of the Python
interpreter tell me that this "should be impossible" because each
module is loaded into its own ELF namespace.  I can't actually verify
that for myself -- I don't see any references to dlmopen() in CPython
3.7's source code, and as far as I know, that's the only way to do
that.  But assuming it's true, it immediately raises a red flag for
me, because I do know that both g++-compiled C++ in general, and
critical bits of libstdc++ in particular (e.g. the exception unwinder)
rely on certain data objects being unique within the entire address
space (process).

On the hypothesis that the problem is caused by two copies of
libstdc++.so and/or libgcc_s.so being loaded into a single address
space, which cannot reasonably be made to work, even if they're the
exact same version: we need some way of loading a shared object such
that only one copy will be loaded, and reused for each ELF namespace
that needs it.  As far as I can tell, this is currently not possible.
Ideally the trigger for this behavior would be an annotation on each
shared object that needs it, rather than requiring all programs that
use ELF namespaces to be aware of the issue; however, we might _also_
want a way for a program that uses ELF namespaces to request this
behavior, in case it's trying to support old libraries that don't have
the annotation even though they ought to.
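
One way to test this hypothesis on a crashing process is to check whether more than one distinct libstdc++.so or libgcc_s.so file is mapped. The following is a minimal diagnostic sketch, assuming Linux and a readable /proc/self/maps; the helper names (`find_runtime_copies`, `report_current_process`) are hypothetical and not part of CPython or glibc. It only detects copies that live at different paths (for example, a system copy plus one bundled with a module). It would not detect a copy linked in with -static-libstdc++, nor the same file loaded twice into separate namespaces.

```python
import os
from collections import defaultdict

# File-name prefixes of the runtime libraries under suspicion.
RUNTIME_PREFIXES = ("libstdc++.so", "libgcc_s.so")


def find_runtime_copies(maps_text, prefixes=RUNTIME_PREFIXES):
    """Return {prefix: sorted distinct mapped paths} from /proc/<pid>/maps text."""
    copies = defaultdict(set)
    for line in maps_text.splitlines():
        # Each line is: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing file
        path = fields[5].strip()
        base = os.path.basename(path)
        for prefix in prefixes:
            if base.startswith(prefix):
                copies[prefix].add(path)
    return {prefix: sorted(paths) for prefix, paths in copies.items()}


def report_current_process():
    """Print the runtime libraries mapped into this interpreter (Linux only)."""
    with open("/proc/self/maps") as maps:
        found = find_runtime_copies(maps.read())
    for prefix, paths in found.items():
        status = "MULTIPLE COPIES" if len(paths) > 1 else "single copy"
        print(f"{prefix}: {status}")
        for path in paths:
            print(f"  {path}")
    return found
```

Calling `report_current_process()` after importing the two suspect modules, and before the crash, would show whether they pulled in different copies of the runtime.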

I have very limited time to work on this myself and I'm not even fully
confident I understand the problem.  I'm writing this message as a
call for volunteers from the toolchain side who have the time and
understanding to tackle the problem; I can put you in touch with the
appropriate people from the Python side.

Thanks,
zw


* Re: CPython vs libstdc++
  2019-07-11 16:13 CPython vs libstdc++ Zack Weinberg
@ 2019-07-11 16:50 ` Szabolcs Nagy
  2019-07-12  2:32   ` Carlos O'Donell
  2019-07-12  2:05 ` Carlos O'Donell
  1 sibling, 1 reply; 4+ messages in thread
From: Szabolcs Nagy @ 2019-07-11 16:50 UTC (permalink / raw)
  To: Zack Weinberg, GNU C Library, libstdc++@gcc.gnu.org
  Cc: nd, Sumana Harihareswara

On 11/07/2019 17:13, Zack Weinberg wrote:
> I have been investigating a mysterious problem with Python extension
> modules that use C++ internally.  If I'm right about the cause, I
> suspect that it can't be fixed without changes to the dynamic linker,
> and I think we may need to have a dialogue between Python core
> maintainers and GNU toolchain maintainers to figure out what Python
> wants to be possible and how much of that is feasible for GCC and
> glibc to support.
> 
> The surface symptoms of the problem are that, if you load two
> unrelated modules, both of which use "enough" C++ features internally,
> into the same process, the entire interpreter crashes, with stack
> traces pointing at the guts of libstdc++.  It is unclear exactly which
> C++ features trigger the crash and it is also unclear whether it
> matters what version or versions of G++ the modules were compiled by.
> I have not had any luck constructing a minimal test case.
> 
> People who are deeply familiar with the internals of the Python
> interpreter tell me that this "should be impossible" because each
> module is loaded into its own ELF namespace.  I can't actually verify
> that for myself -- I don't see any references to dlmopen() in CPython
> 3.7's source code, and as far as I know, that's the only way to do
> that.  But assuming it's true, it immediately raises a red flag for

they don't use dlmopen, but dlopen with RTLD_LOCAL
(well they only pass RTLD_NOW by default but that means local,
you can check/change this by sys.getdlopenflags() and sys.set...)
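
a minimal sketch of that check, assuming a unix build of python 3 where the os module exposes the RTLD_* constants; the helper name `describe_dlopen_flags` is made up for illustration. it decodes the flags the interpreter currently passes to dlopen() for extension modules and reports whether they imply local or global symbol visibility (RTLD_GLOBAL not set means local).

```python
import os
import sys

# Flag constants to look for; some are platform-specific and may be absent.
FLAG_NAMES = (
    "RTLD_LAZY",
    "RTLD_NOW",
    "RTLD_GLOBAL",
    "RTLD_NODELETE",
    "RTLD_NOLOAD",
    "RTLD_DEEPBIND",
)


def describe_dlopen_flags(flags):
    """Return (names of the flags that are set, 'global' or 'local')."""
    set_names = []
    for name in FLAG_NAMES:
        value = getattr(os, name, 0)  # 0 if this platform lacks the constant
        if value and flags & value == value:
            set_names.append(name)
    # Without RTLD_GLOBAL, dlopen() loads the object with local visibility.
    visibility = "global" if flags & os.RTLD_GLOBAL else "local"
    return set_names, visibility


if __name__ == "__main__":
    names, visibility = describe_dlopen_flags(sys.getdlopenflags())
    print("dlopen flags:", " | ".join(names) or "(none)")
    print("symbol visibility for extension modules:", visibility)
```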

> me, because I do know that both g++-compiled C++ in general, and
> critical bits of libstdc++ in particular (e.g. the exception unwinder)
> rely on certain data objects being unique within the entire address
> space (process).
> 
> On the hypothesis that the problem is caused by two copies of
> libstdc++.so and/or libgcc_s.so being loaded into a single address
> space, which cannot reasonably be made to work, even if they're the
> exact same version: we need some way of loading a shared object such
> that only one copy will be loaded, and reused for each ELF namespace
> that needs it.  As far as I can tell, this is currently not possible.
> Ideally the trigger for this behavior would be an annotation on each
> shared object that needs it, rather than requiring all programs that
> use ELF namespaces to be aware of the issue; however, we might _also_
> want a way for a program that uses ELF namespaces to request this
> behavior, in case it's trying to support old libraries that don't have
> the annotation even though they ought to.

there is a known conflict between RTLD_LOCAL and the c++ odr requirement
for 'vague linkage' objects. the gnu toolchain solution using the
STB_GNU_UNIQUE binding may have issues (and it can be turned
off in gcc/gold, so we would need to know which toolchain was used
for those python modules)

it's also possible that some modules were built with -static-libstdc++;
you will have to dig further.


> 
> I have very limited time to work on this myself and I'm not even fully
> confident I understand the problem.  I'm writing this message as a
> call for volunteers from the toolchain side who have the time and
> understanding to tackle the problem; I can put you in touch with the
> appropriate people from the Python side.
> 
> Thanks,
> zw
> 



* Re: CPython vs libstdc++
  2019-07-11 16:13 CPython vs libstdc++ Zack Weinberg
  2019-07-11 16:50 ` Szabolcs Nagy
@ 2019-07-12  2:05 ` Carlos O'Donell
  1 sibling, 0 replies; 4+ messages in thread
From: Carlos O'Donell @ 2019-07-12  2:05 UTC (permalink / raw)
  To: Zack Weinberg, GNU C Library, libstdc++; +Cc: Sumana Harihareswara

On 7/11/19 12:13 PM, Zack Weinberg wrote:
> I have been investigating a mysterious problem with Python extension
> modules that use C++ internally.  If I'm right about the cause, I
> suspect that it can't be fixed without changes to the dynamic linker,
> and I think we may need to have a dialogue between Python core
> maintainers and GNU toolchain maintainers to figure out what Python
> wants to be possible and how much of that is feasible for GCC and
> glibc to support.

I agree completely.

I try to stay integrated into Python (manylinux issues), Ruby (malloc
issues), llvm lld (CET markup), and other toolchain issues and
language issues.

Do you have a bug # or something where we can discuss the issue
upstream with the Python developers in question?

-- 
Cheers,
Carlos.

P.S.
Makes me think of this:
https://www.slideshare.net/StefanusDuToit/cpp-con-2014-hourglass-interfaces-for-c-apis


* Re: CPython vs libstdc++
  2019-07-11 16:50 ` Szabolcs Nagy
@ 2019-07-12  2:32   ` Carlos O'Donell
  0 siblings, 0 replies; 4+ messages in thread
From: Carlos O'Donell @ 2019-07-12  2:32 UTC (permalink / raw)
  To: Szabolcs Nagy, Zack Weinberg, GNU C Library,
	libstdc++@gcc.gnu.org
  Cc: nd, Sumana Harihareswara

On 7/11/19 12:50 PM, Szabolcs Nagy wrote:
> they don't use dlmopen, but dlopen with RTLD_LOCAL
> (well they only pass RTLD_NOW by default but that means local,
> you can check/change this by sys.getdlopenflags() and sys.set...)

Further dlopen calls with RTLD_GLOBAL can promote the object to
global scope, so this is a mess, and I think I've already
mentioned this to upstream Python developers at least once.
This issue rings a bell. We've discussed Python plugins
and C++ in the past.

The current libstdc++ is not designed to be loaded into the
process image with RTLD_LOCAL. libstdc++ uses dl_iterate_phdr()
to look for loaded objects and uses them during unwinding, and this
is going to use the binary's object scope, not the local scope
created by RTLD_LOCAL.
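
To see what dl_iterate_phdr() reports in a given interpreter, something like the following sketch may help. It assumes Linux with glibc (or another libc that exports dl_iterate_phdr) and calls it through ctypes; `DlPhdrInfo` declares only the leading fields of struct dl_phdr_info, and `loaded_objects` is a hypothetical helper name. Running it after importing extension modules should list them alongside everything else in the process, even though they were opened with local visibility.

```python
import ctypes
import os


class DlPhdrInfo(ctypes.Structure):
    """Leading fields of struct dl_phdr_info from <link.h>."""

    _fields_ = [
        ("dlpi_addr", ctypes.c_size_t),    # load base address
        ("dlpi_name", ctypes.c_char_p),    # object path, "" for the main program
        ("dlpi_phdr", ctypes.c_void_p),    # program header array (unused here)
        ("dlpi_phnum", ctypes.c_uint16),   # number of program headers
    ]


# int callback(struct dl_phdr_info *info, size_t size, void *data)
CALLBACK = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(DlPhdrInfo), ctypes.c_size_t, ctypes.c_void_p
)


def loaded_objects():
    """Return the names of all objects dl_iterate_phdr() reports."""
    libc = ctypes.CDLL(None)  # handle for the already-loaded process image
    libc.dl_iterate_phdr.argtypes = [CALLBACK, ctypes.c_void_p]
    libc.dl_iterate_phdr.restype = ctypes.c_int

    names = []

    def visit(info, size, data):
        raw = info.contents.dlpi_name or b""
        names.append(os.fsdecode(raw) or "<main program>")
        return 0  # zero means "keep iterating"

    callback = CALLBACK(visit)  # keep a reference alive for the whole call
    libc.dl_iterate_phdr(callback, None)
    return names


if __name__ == "__main__":
    for name in loaded_objects():
        print(name)
```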

My opinion is that if you want to use C++ with python plugins, then
the python interpreter needs to be linked against one libstdc++ and
then all plugins can use C/C++ for bindings, but I don't know if that
would fly.

>> me, because I do know that both g++-compiled C++ in general, and
>> critical bits of libstdc++ in particular (e.g. the exception unwinder)
>> rely on certain data objects being unique within the entire address
>> space (process).
>>
>> On the hypothesis that the problem is caused by two copies of
>> libstdc++.so and/or libgcc_s.so being loaded into a single address
>> space, which cannot reasonably be made to work, even if they're the
>> exact same version: we need some way of loading a shared object such
>> that only one copy will be loaded, and reused for each ELF namespace
>> that needs it.  As far as I can tell, this is currently not possible.
>> Ideally the trigger for this behavior would be an annotation on each
>> shared object that needs it, rather than requiring all programs that
>> use ELF namespaces to be aware of the issue; however, we might _also_
>> want a way for a program that uses ELF namespaces to request this
>> behavior, in case it's trying to support old libraries that don't have
>> the annotation even though they ought to.
> 
> there is known conflict between RTLD_LOCAL and c++ odr requirement
> for 'vague linkage' objects, the gnu toolchain solution using
> STB_GNU_UNIQUE binding may have issues (and that can be turned
> off in gcc/gold so we would need to know what toolchain were used
> for those python modules)

Correct, RTLD_LOCAL doesn't make sense for C++, and so
STB_GNU_UNIQUE is used to promote such symbols to global
scope.

You have to understand that for C++ some things will violate your
expectations of isolation.

You really need to use dlmopen here, and that has lots of bugs too
because we only implement what is barely required for LD_AUDIT.
Though we have RFC patches to make it better.

> it's also possible that some modules were built with -static-libstdc++
> you will have to dig further.

That would be a possibility, and I'm not sure what we support there.
If the two plugins didn't pass relevant objects, then it might work, but
again I don't know how the unwinder works with -static-libstdc++ in
effect.

-- 
Cheers,
Carlos.
