* -mno-tls-direct-seg-refs support in glibc for i386 PV Xen @ 2020-05-27 13:03 Florian Weimer via Libc-alpha 2020-05-27 13:39 ` Andrew Cooper via Libc-alpha 0 siblings, 1 reply; 8+ messages in thread From: Florian Weimer via Libc-alpha @ 2020-05-27 13:03 UTC (permalink / raw) To: xen-devel; +Cc: libc-alpha I'm about to remove nosegneg support from upstream glibc, special builds that use -mno-tls-direct-seg-refs, and the ability to load different libraries built in this mode automatically, when the Linux kernel tells us to do that. I think the intended effect is that these special builds do not use operands of the form %gs:(%eax) when %eax has the MSB set because that had a performance hit with paravirtualization on 32-bit x86. Instead, the thread pointer is first loaded from %gs:0, and the actual access does not use a segment prefix. Before doing that, I'd like to ask if anybody is still using this feature? I know that we've been carrying nosegneg libraries for many years, in some cases even after we stopped shipping 32-bit kernels. 8-/ The feature has always been rather poorly documented, and the way the dynamic loader selects those nosegneg library variants is still very bizarre. Thanks, Florian ^ permalink raw reply [flat|nested] 8+ messages in thread
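[Editorial illustration, not part of the original message.] For readers unfamiliar with the flag, here is a minimal C sketch of the kind of access being discussed. The variable and function names are hypothetical, `__thread` is the GCC/Clang extension, and the assembly shown in the comments is only typical i386 output; the exact instructions depend on the compiler version, options and TLS model.

```c
#include <stdint.h>

/* Hypothetical thread-local variable. On i386 Linux, user TLS data
 * such as this sits at a negative offset from the thread pointer
 * that %gs:0 refers to. */
static __thread int tls_counter;

/* With default code generation (-mtls-direct-seg-refs) a compiler may
 * address the variable through the segment directly, for example:
 *
 *     movl %gs:tls_counter@ntpoff, %eax
 *
 * With -mno-tls-direct-seg-refs it first loads the thread pointer and
 * then uses an ordinary flat address without a segment prefix, roughly:
 *
 *     movl %gs:0, %eax
 *     movl tls_counter@ntpoff(%eax), %eax
 *
 * (Illustrative only; not taken from the thread.) */
int bump_tls_counter(void)
{
    return ++tls_counter;
}
```

Compiling such a file with `gcc -m32 -O2 -S`, with and without `-mno-tls-direct-seg-refs`, is one way to compare the two forms on a given toolchain.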
* Re: -mno-tls-direct-seg-refs support in glibc for i386 PV Xen 2020-05-27 13:03 -mno-tls-direct-seg-refs support in glibc for i386 PV Xen Florian Weimer via Libc-alpha @ 2020-05-27 13:39 ` Andrew Cooper via Libc-alpha 2020-05-27 13:44 ` Samuel Thibault 2020-05-27 14:00 ` Jan Beulich 0 siblings, 2 replies; 8+ messages in thread From: Andrew Cooper via Libc-alpha @ 2020-05-27 13:39 UTC (permalink / raw) To: Florian Weimer, xen-devel; +Cc: libc-alpha On 27/05/2020 14:03, Florian Weimer wrote: > I'm about to remove nosegneg support from upstream glibc, special builds > that use -mno-tls-direct-seg-refs, and the ability load different > libraries built in this mode automatically, when the Linux kernel tells > us to do that. I think the intended effect is that these special builds > do not use operands of the form %gs:(%eax) when %eax has the MSB set > because that had a performance hit with paravirtualization on 32-bit > x86. Instead, the thread pointer is first loaded from %gs:0, and the > actual access does not use a segment prefix. > > Before doing that, I'd like to ask if anybody is still using this > feature? > > I know that we've been carrying nosegneg libraries for many years, in > some cases even after we stopped shipping 32-bit kernels. 8-/ The > feature has always been rather poorly documented, and the way the > dynamic loader selects those nosegneg library variants is still very > bizarre. I wasn't even aware of this feature, or that there was a problem wanting fixing. That said, I have found: # 32-bit x86 does not perform well with -ve segment accesses on Xen. CFLAGS-$(CONFIG_X86_32) += $(call cc-option,$(CC),-mno-tls-direct-seg-refs) in one of our makefiles. Why does the MSB make any difference? %gs still needs to remain intact so the thread pointer can be pulled out, so there is nothing that Xen or Linux can do in the way of lazy loading. Beyond that, it's straight up segment base semantics in x86. 
There will be a 1-cycle AGU delay from a non-zero base, but that has nothing to do with Xen and applies to all segment based TLS accesses on x86, and you'll win that back easily through reduced register pressure. Are there any further details on the perf problem claim? I find it suspicious. Either way, 32bit PV is on its last legs (not too bad, for something which was essentially killed by the AMD64 spec). Ring 1 counting as supervisor mode as far as pagetables go has already caused guests to suffer a major performance hit on hardware with SMAP/SMEP (IvyBridge and later), as well as various speculative mitigations (we can't rely on SMEP preventing the CPU from speculating back into Ring 1, etc), and the forthcoming CET Shadow Stack feature totally kills Ring1/2 as usable concepts in the architecture. Linux is threatening to drop PV32 support, and I've recently added an option to Xen to compile out and/or disable PV32 (both for attack surface reduction purposes, and as a necessary consequence of using Shadow Stacks). With both my XenServer and upstream x86 maintainer hats on, PV32 is solely for legacy workloads now. People currently using PV32 obviously don't care about performance, or haven't been taking security updates. I severely doubt they'll notice any change from this. ~Andrew ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: -mno-tls-direct-seg-refs support in glibc for i386 PV Xen 2020-05-27 13:39 ` Andrew Cooper via Libc-alpha @ 2020-05-27 13:44 ` Samuel Thibault 2020-05-27 14:15 ` Andrew Cooper via Libc-alpha 2020-05-27 14:00 ` Jan Beulich 1 sibling, 1 reply; 8+ messages in thread From: Samuel Thibault @ 2020-05-27 13:44 UTC (permalink / raw) To: Andrew Cooper; +Cc: Florian Weimer, xen-devel, libc-alpha Hello, Andrew Cooper via Libc-alpha, le mer. 27 mai 2020 14:39:00 +0100, a ecrit: > Why does the MSB make any difference? %gs still needs to remain intact > so the thread pointer can be pulled out, so there is nothing that Xen or > Linux can do in the way of lazy loading. > > Beyond that, its straight up segment base semantics in x86. There will > be a 1-cycle AGU delay from a non-zero base, but that nothing to do with > Xen and applies to all segment based TLS accesses on x86, and you'll win > that back easily through reduced register pressure. > > Are there any further details on the perf problem claim? I find it > suspicious. The concern is not about the indirection. The concern is that, to keep itself safe from the guest, the hypervisor has to restrict the size of the segment, and thus negative offsets, used in the i386 TLS model, are rejected by the processor, and the hypervisor has to emulate these accesses, hence the high cost. Samuel ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: -mno-tls-direct-seg-refs support in glibc for i386 PV Xen 2020-05-27 13:44 ` Samuel Thibault @ 2020-05-27 14:15 ` Andrew Cooper via Libc-alpha 2020-05-27 14:20 ` Florian Weimer via Libc-alpha 0 siblings, 1 reply; 8+ messages in thread From: Andrew Cooper via Libc-alpha @ 2020-05-27 14:15 UTC (permalink / raw) To: Samuel Thibault, Florian Weimer, xen-devel, libc-alpha On 27/05/2020 14:44, Samuel Thibault wrote: > Hello, > > Andrew Cooper via Libc-alpha, le mer. 27 mai 2020 14:39:00 +0100, a ecrit: >> Why does the MSB make any difference? %gs still needs to remain intact >> so the thread pointer can be pulled out, so there is nothing that Xen or >> Linux can do in the way of lazy loading. >> >> Beyond that, its straight up segment base semantics in x86. There will >> be a 1-cycle AGU delay from a non-zero base, but that nothing to do with >> Xen and applies to all segment based TLS accesses on x86, and you'll win >> that back easily through reduced register pressure. >> >> Are there any further details on the perf problem claim? I find it >> suspicious. > The concern is not about the indirection. > > The concern is that to keep safe from the guest, the hypervisor has to > restrict the size of the segment, and thus negative offsets, used in the > i386 TLS model, are rejected by the processor, and the hypervisor has to > emulate these access, thus a high cost. Oh, so the i386 TLS model relies on the calculation wrapping (modulo 4G) when the segment limit is 4G, instead of taking a fault? Intel states this behaviour is implementation specific (SDM Vol3 5.3.1) and may fault, while AMD doesn't discuss it at all as far as I can tell (APM Vol2 4.12 is the right section, but I can't see this discussed). While I can believe it probably works on every processor these days, it does seem like dodgy ground to base an ABI on. It also means that Xen isn't necessarily the only affected party. I'm pretty sure GRSecurity use reduced segment limits as well. 
I also bet it doesn't work reliably under emulation. ~Andrew ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: -mno-tls-direct-seg-refs support in glibc for i386 PV Xen 2020-05-27 14:15 ` Andrew Cooper via Libc-alpha @ 2020-05-27 14:20 ` Florian Weimer via Libc-alpha 0 siblings, 0 replies; 8+ messages in thread From: Florian Weimer via Libc-alpha @ 2020-05-27 14:20 UTC (permalink / raw) To: Andrew Cooper; +Cc: libc-alpha, xen-devel * Andrew Cooper: > Oh, so the i386 TLS model relies on the calculation wrapping (modulo 4G) > when the segment limit is 4G, instead of taking a fault? That's about it. > Intel states this is behaviour is implementation specific (SDM Vol3 > 5.3.1) and may fault, while AMD doesn't discuss it at all as far as I > can tell (APM Vol2 4.12 is the right section, but I can't see this > discussed). > > While I can believe it probably works on every processor these days, it > does seem like dodgy ground to base an ABI on. Sure, but it has been this way since the beginnings of NPTL, for close to twenty years now. The TCB is at positive offsets, and the user TLS data at negative offsets. > It also means that Xen isn't necessarily the only affected party. I'm > pretty sure GRSecurity use reduced segment limits as well. Mostly for CS and DS, I believe, for the fake NX handling. I think that was never upstream, but some vendor kernels had variants of it. > I also bet it doesn't work reliably under emulation. It has to, given that it's so pervasively used under Linux. 8-/ Thanks, Florian ^ permalink raw reply [flat|nested] 8+ messages in thread
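[Editorial illustration, not part of the original message.] The wrapping behaviour discussed above can be sketched with a toy model in C. It assumes the common behaviour in which base plus offset wraps modulo 4G (which, as noted in the thread, Intel documents as implementation specific), and it models only an expand-up segment with an inclusive limit. The struct and function names are hypothetical and do not reflect real descriptor layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of an expand-up data segment; not a real descriptor. */
struct toy_segment {
    uint32_t base;   /* linear address of offset 0 (the thread pointer) */
    uint32_t limit;  /* highest valid offset, inclusive */
};

/* Returns true and stores the linear address if the offset passes the
 * limit check; returns false where this model says the CPU would
 * raise #GP. */
bool toy_translate(const struct toy_segment *seg, uint32_t offset,
                   uint32_t *linear)
{
    if (offset > seg->limit)
        return false;
    /* Unsigned addition wraps modulo 2^32, so a "negative" offset such
     * as 0xfffffffc lands just below the base. */
    *linear = seg->base + offset;
    return true;
}
```

In this model, with a limit of 0xffffffff, the offset -4 (0xfffffffc) against a base of 0xb7400000 translates to 0xb73ffffc, which is how user TLS data below the thread pointer is reached. With any smaller limit the same offset fails the check, which is the case that had to be handled in the hypervisor.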
* Re: -mno-tls-direct-seg-refs support in glibc for i386 PV Xen 2020-05-27 13:39 ` Andrew Cooper via Libc-alpha 2020-05-27 13:44 ` Samuel Thibault @ 2020-05-27 14:00 ` Jan Beulich 2020-05-27 14:40 ` Andrew Cooper via Libc-alpha 1 sibling, 1 reply; 8+ messages in thread From: Jan Beulich @ 2020-05-27 14:00 UTC (permalink / raw) To: Andrew Cooper; +Cc: Florian Weimer, xen-devel, libc-alpha On 27.05.2020 15:39, Andrew Cooper wrote: > On 27/05/2020 14:03, Florian Weimer wrote: >> I'm about to remove nosegneg support from upstream glibc, special builds >> that use -mno-tls-direct-seg-refs, and the ability load different >> libraries built in this mode automatically, when the Linux kernel tells >> us to do that. I think the intended effect is that these special builds >> do not use operands of the form %gs:(%eax) when %eax has the MSB set >> because that had a performance hit with paravirtualization on 32-bit >> x86. Instead, the thread pointer is first loaded from %gs:0, and the >> actual access does not use a segment prefix. >> >> Before doing that, I'd like to ask if anybody is still using this >> feature? >> >> I know that we've been carrying nosegneg libraries for many years, in >> some cases even after we stopped shipping 32-bit kernels. 8-/ The >> feature has always been rather poorly documented, and the way the >> dynamic loader selects those nosegneg library variants is still very >> bizarre. > > I wasn't even aware of this feature, or that there was a problem wanting > fixing. > > That said, I have found: > > # 32-bit x86 does not perform well with -ve segment accesses on Xen. > CFLAGS-$(CONFIG_X86_32) += $(call cc-option,$(CC),-mno-tls-direct-seg-refs) > > in one of our makefiles. > > Why does the MSB make any difference? %gs still needs to remain intact > so the thread pointer can be pulled out, so there is nothing that Xen or > Linux can do in the way of lazy loading. > > Beyond that, its straight up segment base semantics in x86. 
There will > be a 1-cycle AGU delay from a non-zero base, but that nothing to do with > Xen and applies to all segment based TLS accesses on x86, and you'll win > that back easily through reduced register pressure. > > Are there any further details on the perf problem claim? I find it > suspicious. To guard the hypervisor area, 32-bit Xen reduced the limits of guest usable segment descriptors. While this works fine for flat ones (you just chop off some space at the top), there's no way to represent a full segment with a non-zero base. You can have the descriptor map only the [base,XenBase] part or the [0,base) one. Hence Xen, from its #GP handler, flipped the descriptor between the two options depending on whether the current access was to the positive or negative part of the TLS seg. (An in-practice use of expand down segments, as you'll surely notice.) Jan ^ permalink raw reply [flat|nested] 8+ messages in thread
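[Editorial illustration, not part of the original message.] The descriptor flipping described above can be sketched as a toy simulation in C. This is not Xen's code: every name is hypothetical, the positive range is treated as half-open (base up to but not including XenBase), and the model assumes 0 < base < xen_base. It only shows why alternating between positive and negative offsets costs one #GP per change of direction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: names and fields are hypothetical, not Xen's. */
struct toy_tls_seg {
    uint32_t base;       /* thread pointer (segment base) */
    uint32_t xen_base;   /* first linear address reserved for the hypervisor */
    bool     negative;   /* true: descriptor currently maps [0, base) */
    unsigned gp_faults;  /* how many times the handler had to flip */
};

/* Does the currently installed descriptor allow this offset? */
static bool toy_offset_ok(const struct toy_tls_seg *s, uint32_t off)
{
    if (s->negative)
        /* Expand-down style: only offsets that wrap to below base. */
        return s->base != 0 && off >= (uint32_t)(0u - s->base);
    /* Expand-up style: offsets from base up to, not including, xen_base. */
    return off < s->xen_base - s->base;
}

/* Access through the segment. On a limit violation, model the #GP
 * handler flipping the descriptor and retrying the access once. */
bool toy_access(struct toy_tls_seg *s, uint32_t off, uint32_t *linear)
{
    if (!toy_offset_ok(s, off)) {
        s->negative = !s->negative;
        s->gp_faults++;
        if (!toy_offset_ok(s, off))
            return false;  /* would reach hypervisor space: genuine fault */
    }
    *linear = s->base + off;  /* wraps modulo 2^32 for negative offsets */
    return true;
}
```

Starting from the positive mapping, an access at offset 0 costs nothing, the first access at offset -8 takes one simulated fault, further negative accesses are free, and the next positive access takes another fault. Code built with -mno-tls-direct-seg-refs avoids this pattern because only %gs:0 is ever accessed through the segment.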
* Re: -mno-tls-direct-seg-refs support in glibc for i386 PV Xen 2020-05-27 14:00 ` Jan Beulich @ 2020-05-27 14:40 ` Andrew Cooper via Libc-alpha 2020-05-27 15:25 ` Jan Beulich 0 siblings, 1 reply; 8+ messages in thread From: Andrew Cooper via Libc-alpha @ 2020-05-27 14:40 UTC (permalink / raw) To: Jan Beulich; +Cc: Florian Weimer, xen-devel, libc-alpha On 27/05/2020 15:00, Jan Beulich wrote: > On 27.05.2020 15:39, Andrew Cooper wrote: >> On 27/05/2020 14:03, Florian Weimer wrote: >>> I'm about to remove nosegneg support from upstream glibc, special builds >>> that use -mno-tls-direct-seg-refs, and the ability load different >>> libraries built in this mode automatically, when the Linux kernel tells >>> us to do that. I think the intended effect is that these special builds >>> do not use operands of the form %gs:(%eax) when %eax has the MSB set >>> because that had a performance hit with paravirtualization on 32-bit >>> x86. Instead, the thread pointer is first loaded from %gs:0, and the >>> actual access does not use a segment prefix. >>> >>> Before doing that, I'd like to ask if anybody is still using this >>> feature? >>> >>> I know that we've been carrying nosegneg libraries for many years, in >>> some cases even after we stopped shipping 32-bit kernels. 8-/ The >>> feature has always been rather poorly documented, and the way the >>> dynamic loader selects those nosegneg library variants is still very >>> bizarre. >> I wasn't even aware of this feature, or that there was a problem wanting >> fixing. >> >> That said, I have found: >> >> # 32-bit x86 does not perform well with -ve segment accesses on Xen. >> CFLAGS-$(CONFIG_X86_32) += $(call cc-option,$(CC),-mno-tls-direct-seg-refs) >> >> in one of our makefiles. >> >> Why does the MSB make any difference? %gs still needs to remain intact >> so the thread pointer can be pulled out, so there is nothing that Xen or >> Linux can do in the way of lazy loading. 
>> >> Beyond that, its straight up segment base semantics in x86. There will >> be a 1-cycle AGU delay from a non-zero base, but that nothing to do with >> Xen and applies to all segment based TLS accesses on x86, and you'll win >> that back easily through reduced register pressure. >> >> Are there any further details on the perf problem claim? I find it >> suspicious. > To guard the hypervisor area, 32-bit Xen reduced the limits of guest > usable segment descriptors. Right. Segment limits are what keep the guest kernel (ring 1, supervisor) out of Xen (ring 0, also supervisor). > While this works fine for flat ones (you > just chop off some space at the top), there's no way to represent a > full segment with a non-zero base. (From the other thread,) The problem isn't related to the base, per se. It is that a segment with a non-4G limit now faults rather than truncating usefully for the 32bit TLS model. > You can have the descriptor map > only the [base,XenBase] part or the [0,base) one. Hence Xen, from its > #GP handler, flipped the descriptor between the two options depending > on whether the current access was to the positive of negative part of > the TLS seg. (An in-practice use of expand down segments, as you'll > surely notice.) I've found gpf_emulate_4gb() in source history. It was specific to 32bit builds of Xen (now long gone). What I can't figure out is why this is unnecessary in 64bit builds of Xen. We still enforce reduced segment limits on the guests' descriptors. I have a worrying suspicion that Xen's ABI for PV32 (on top of a 64bit Xen) now depends on -mno-tls-direct-seg-refs ~Andrew ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: -mno-tls-direct-seg-refs support in glibc for i386 PV Xen 2020-05-27 14:40 ` Andrew Cooper via Libc-alpha @ 2020-05-27 15:25 ` Jan Beulich 0 siblings, 0 replies; 8+ messages in thread From: Jan Beulich @ 2020-05-27 15:25 UTC (permalink / raw) To: Andrew Cooper; +Cc: Florian Weimer, xen-devel, libc-alpha On 27.05.2020 16:40, Andrew Cooper wrote: > On 27/05/2020 15:00, Jan Beulich wrote: >> You can have the descriptor map >> only the [base,XenBase] part or the [0,base) one. Hence Xen, from its >> #GP handler, flipped the descriptor between the two options depending >> on whether the current access was to the positive of negative part of >> the TLS seg. (An in-practice use of expand down segments, as you'll >> surely notice.) > > I've found gpf_emulate_4gb() in source history. It was specific to > 32bit builds of Xen (now long gone). > > What I can't figure out is why this is unnecessary in 64bit builds of > Xen. We still enforce reduced segment limits on the guests descriptors. Do we? I can't find such - neither boot_compat_gdt[] has any signs of it, nor check_descriptor(). And we don't have a need to: The entire range is used for the r/o M2P, i.e. protection is enforced at the paging layer. 32-bit Xen necessarily had r/w as well as executable sub-ranges there. Jan ^ permalink raw reply [flat|nested] 8+ messages in thread