unofficial mirror of libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [RFC PATCH glibc 0/8] Restartable Sequences enablement
@ 2020-01-08 14:58 Mathieu Desnoyers
  2020-01-08 14:58 ` [RFC PATCH glibc 1/8] Introduce <elf_machine_sym_no_match.h> Mathieu Desnoyers
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Mathieu Desnoyers @ 2020-01-08 14:58 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Florian Weimer, Joseph Myers, Szabolcs Nagy, libc-alpha,
	Mathieu Desnoyers

Hi,

Please find the rseq-enablement patchset for comments in this series.

This patch series is based on glibc master branch at commit
d006e84d5d Fix formatting of ChangeLog ouput

Since the last post, I updated the manual following the advices I
received, removed the syscall table patches from the series, as they are
now merged into glibc master. I also updated the abilist following le/be
split for arm, microblaze, and sh.

Thanks for the feedback!

Mathieu

Florian Weimer (3):
  Introduce <elf_machine_sym_no_match.h>
  Implement __libc_early_init
  nptl: Start new threads with all signals blocked [BZ #25098]

Mathieu Desnoyers (5):
  glibc: Perform rseq(2) registration at C startup and thread creation
    (v14)
  glibc: sched_getcpu(): use rseq cpu_id TLS on Linux (v5)
  support record failure: allow use from constructor
  support: implement xpthread key create/delete (v3)
  rseq registration tests (v7)

 NEWS                                          |  10 +
 csu/init-first.c                              |   4 -
 csu/libc-start.c                              |   5 +
 elf/Makefile                                  |   5 +-
 elf/Versions                                  |   1 +
 elf/dl-call-libc-early-init.c                 |  41 ++
 elf/dl-load.c                                 |   9 +
 elf/dl-lookup-direct.c                        | 116 ++++++
 elf/dl-lookup.c                               |  10 +-
 elf/dl-open.c                                 |  24 ++
 elf/elf_machine_sym_no_match.h                |  34 ++
 elf/libc-early-init.h                         |  35 ++
 elf/libc_early_init.c                         |  30 ++
 elf/rtld.c                                    |   4 +
 manual/threads.texi                           |  30 +-
 misc/rseq-internal.h                          |  33 ++
 nptl/descr.h                                  |  10 +-
 nptl/pthread_create.c                         |  57 +--
 support/Makefile                              |   2 +
 support/check.h                               |   4 +
 support/support_record_failure.c              |  18 +-
 support/xpthread_key_create.c                 |  25 ++
 support/xpthread_key_delete.c                 |  24 ++
 support/xthread.h                             |   2 +
 sysdeps/generic/ldsodefs.h                    |  17 +
 sysdeps/mach/hurd/i386/init-first.c           |   4 -
 sysdeps/mips/dl-machine.h                     |  15 -
 sysdeps/mips/elf_machine_sym_no_match.h       |  43 +++
 sysdeps/unix/sysv/linux/Makefile              |  11 +-
 sysdeps/unix/sysv/linux/Versions              |   3 +
 sysdeps/unix/sysv/linux/aarch64/bits/rseq.h   |  43 +++
 sysdeps/unix/sysv/linux/aarch64/libc.abilist  |   1 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist    |   1 +
 sysdeps/unix/sysv/linux/arm/be/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/arm/bits/rseq.h       |  83 ++++
 sysdeps/unix/sysv/linux/arm/le/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/bits/rseq.h           |  29 ++
 sysdeps/unix/sysv/linux/csky/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/i386/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist     |   1 +
 .../sysv/linux/m68k/coldfire/libc.abilist     |   1 +
 .../unix/sysv/linux/m68k/m680x0/libc.abilist  |   1 +
 .../sysv/linux/microblaze/be/libc.abilist     |   1 +
 .../sysv/linux/microblaze/le/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/mips/bits/rseq.h      |  62 +++
 .../sysv/linux/mips/mips32/fpu/libc.abilist   |   1 +
 .../sysv/linux/mips/mips32/nofpu/libc.abilist |   1 +
 .../sysv/linux/mips/mips64/n32/libc.abilist   |   1 +
 .../sysv/linux/mips/mips64/n64/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist    |   1 +
 sysdeps/unix/sysv/linux/powerpc/bits/rseq.h   |  37 ++
 .../linux/powerpc/powerpc32/fpu/libc.abilist  |   1 +
 .../powerpc/powerpc32/nofpu/libc.abilist      |   1 +
 .../linux/powerpc/powerpc64/be/libc.abilist   |   1 +
 .../linux/powerpc/powerpc64/le/libc.abilist   |   1 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/rseq-internal.h       |  77 ++++
 sysdeps/unix/sysv/linux/rseq-sym.c            |  43 +++
 sysdeps/unix/sysv/linux/s390/bits/rseq.h      |  37 ++
 .../unix/sysv/linux/s390/s390-32/libc.abilist |   1 +
 .../unix/sysv/linux/s390/s390-64/libc.abilist |   1 +
 sysdeps/unix/sysv/linux/sched_getcpu.c        |  29 +-
 sysdeps/unix/sysv/linux/sh/be/libc.abilist    |   1 +
 sysdeps/unix/sysv/linux/sh/le/libc.abilist    |   1 +
 .../sysv/linux/sparc/sparc32/libc.abilist     |   1 +
 .../sysv/linux/sparc/sparc64/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/sys/rseq.h            |  30 ++
 sysdeps/unix/sysv/linux/tst-rseq-nptl.c       | 353 ++++++++++++++++++
 sysdeps/unix/sysv/linux/tst-rseq.c            | 114 ++++++
 sysdeps/unix/sysv/linux/x86/bits/rseq.h       |  30 ++
 .../unix/sysv/linux/x86_64/64/libc.abilist    |   1 +
 .../unix/sysv/linux/x86_64/x32/libc.abilist   |   1 +
 73 files changed, 1553 insertions(+), 70 deletions(-)
 create mode 100644 elf/dl-call-libc-early-init.c
 create mode 100644 elf/dl-lookup-direct.c
 create mode 100644 elf/elf_machine_sym_no_match.h
 create mode 100644 elf/libc-early-init.h
 create mode 100644 elf/libc_early_init.c
 create mode 100644 misc/rseq-internal.h
 create mode 100644 support/xpthread_key_create.c
 create mode 100644 support/xpthread_key_delete.c
 create mode 100644 sysdeps/mips/elf_machine_sym_no_match.h
 create mode 100644 sysdeps/unix/sysv/linux/aarch64/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/arm/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/mips/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/powerpc/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/rseq-internal.h
 create mode 100644 sysdeps/unix/sysv/linux/rseq-sym.c
 create mode 100644 sysdeps/unix/sysv/linux/s390/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/sys/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/tst-rseq-nptl.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-rseq.c
 create mode 100644 sysdeps/unix/sysv/linux/x86/bits/rseq.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH glibc 1/8] Introduce <elf_machine_sym_no_match.h>
  2020-01-08 14:58 [RFC PATCH glibc 0/8] Restartable Sequences enablement Mathieu Desnoyers
@ 2020-01-08 14:58 ` Mathieu Desnoyers
  2020-01-08 14:58 ` [RFC PATCH glibc 2/8] Implement __libc_early_init Mathieu Desnoyers
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Mathieu Desnoyers @ 2020-01-08 14:58 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Florian Weimer, Joseph Myers, Szabolcs Nagy, libc-alpha

From: Florian Weimer <fweimer@redhat.com>

MIPS needs to ignore certain existing symbols during symbol lookup.
The old scheme uses the ELF_MACHINE_SYM_NO_MATCH macro, with an
inline function, within its own header, with a sysdeps override for
MIPS.  This allows re-use of the function from another file (without
having to include <dl-machine.h> or providing the default definition
for ELF_MACHINE_SYM_NO_MATCH).

Built with build-many-glibcs.py, with manual verification that
sysdeps/mips/elf_machine_sym_no_match.h is picked up on MIPS.  Tested
on aarch64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
s390x-linux-gnu, x86_64-linux-gnu.

	* elf/elf_machine_sym_no_match.h: New file.
	* elf/dl-lookip.c (ELF_MACHINE_SYM_NO_MATCH): Do not define.
	* elf/dl-lookup.c (check_match) Call elf_machine_sym_no_match
	instead of ELF_MACHINE_SYM_NO_MATCH.
	* sysdeps/mips/dl-machine.h (ELF_MACHINE_SYM_NO_MATCH): Remove
	definition.
	* sysdeps/mips/elf_machine_sym_no_match.h: New file.  Extracted
	from sysdeps/mips/dl-machine.h.
---
 elf/dl-lookup.c                         | 10 ++----
 elf/elf_machine_sym_no_match.h          | 34 +++++++++++++++++++
 sysdeps/mips/dl-machine.h               | 15 ---------
 sysdeps/mips/elf_machine_sym_no_match.h | 43 +++++++++++++++++++++++++
 4 files changed, 79 insertions(+), 23 deletions(-)
 create mode 100644 elf/elf_machine_sym_no_match.h
 create mode 100644 sysdeps/mips/elf_machine_sym_no_match.h

diff --git a/elf/dl-lookup.c b/elf/dl-lookup.c
index 378f28fa7d..db9785ab75 100644
--- a/elf/dl-lookup.c
+++ b/elf/dl-lookup.c
@@ -28,18 +28,12 @@
 #include <libc-lock.h>
 #include <tls.h>
 #include <atomic.h>
+#include <elf_machine_sym_no_match.h>
 
 #include <assert.h>
 
-/* Return nonzero if check_match should consider SYM to fail to match a
-   symbol reference for some machine-specific reason.  */
-#ifndef ELF_MACHINE_SYM_NO_MATCH
-# define ELF_MACHINE_SYM_NO_MATCH(sym) 0
-#endif
-
 #define VERSTAG(tag)	(DT_NUM + DT_THISPROCNUM + DT_VERSIONTAGIDX (tag))
 
-
 struct sym_val
   {
     const ElfW(Sym) *s;
@@ -78,7 +72,7 @@ check_match (const char *const undef_name,
   if (__glibc_unlikely ((sym->st_value == 0 /* No value.  */
 			 && sym->st_shndx != SHN_ABS
 			 && stt != STT_TLS)
-			|| ELF_MACHINE_SYM_NO_MATCH (sym)
+			|| elf_machine_sym_no_match (sym)
 			|| (type_class & (sym->st_shndx == SHN_UNDEF))))
     return NULL;
 
diff --git a/elf/elf_machine_sym_no_match.h b/elf/elf_machine_sym_no_match.h
new file mode 100644
index 0000000000..6e299e5ee8
--- /dev/null
+++ b/elf/elf_machine_sym_no_match.h
@@ -0,0 +1,34 @@
+/* Function to ignore certain symbol matches for machine-specific reasons.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _ELF_MACHINE_SYM_NO_MATCH_H
+#define _ELF_MACHINE_SYM_NO_MATCH_H
+
+#include <link.h>
+#include <stdbool.h>
+
+/* This can be customized to ignore certain symbols during lookup in
+   case there are machine-specific rules to disregard some
+   symbols.  */
+static inline bool
+elf_machine_sym_no_match (const ElfW(Sym) *sym)
+{
+  return false;
+}
+
+#endif /* _ELF_MACHINE_SYM_NO_MATCH_H */
diff --git a/sysdeps/mips/dl-machine.h b/sysdeps/mips/dl-machine.h
index 06021ea9ab..e3b11a4f21 100644
--- a/sysdeps/mips/dl-machine.h
+++ b/sysdeps/mips/dl-machine.h
@@ -467,21 +467,6 @@ elf_machine_plt_value (struct link_map *map, const ElfW(Rel) *reloc,
   return value;
 }
 
-/* The semantics of zero/non-zero values of undefined symbols differs
-   depending on whether the non-PIC ABI is in use.  Under the non-PIC
-   ABI, a non-zero value indicates that there is an address reference
-   to the symbol and thus it must always be resolved (except when
-   resolving a jump slot relocation) to the PLT entry whose address is
-   provided as the symbol's value; a zero value indicates that this
-   canonical-address behaviour is not required.  Yet under the classic
-   MIPS psABI, a zero value indicates that there is an address
-   reference to the function and the dynamic linker must resolve the
-   symbol immediately upon loading.  To avoid conflict, symbols for
-   which the dynamic linker must assume the non-PIC ABI semantics are
-   marked with the STO_MIPS_PLT flag.  */
-#define ELF_MACHINE_SYM_NO_MATCH(sym) \
-  ((sym)->st_shndx == SHN_UNDEF && !((sym)->st_other & STO_MIPS_PLT))
-
 #endif /* !dl_machine_h */
 
 #ifdef RESOLVE_MAP
diff --git a/sysdeps/mips/elf_machine_sym_no_match.h b/sysdeps/mips/elf_machine_sym_no_match.h
new file mode 100644
index 0000000000..f2be74caaf
--- /dev/null
+++ b/sysdeps/mips/elf_machine_sym_no_match.h
@@ -0,0 +1,43 @@
+/* MIPS-specific handling of undefined symbols.
+   Copyright (C) 2008-2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _ELF_MACHINE_SYM_NO_MATCH_H
+#define _ELF_MACHINE_SYM_NO_MATCH_H
+
+#include <link.h>
+#include <stdbool.h>
+
+/* The semantics of zero/non-zero values of undefined symbols differs
+   depending on whether the non-PIC ABI is in use.  Under the non-PIC
+   ABI, a non-zero value indicates that there is an address reference
+   to the symbol and thus it must always be resolved (except when
+   resolving a jump slot relocation) to the PLT entry whose address is
+   provided as the symbol's value; a zero value indicates that this
+   canonical-address behaviour is not required.  Yet under the classic
+   MIPS psABI, a zero value indicates that there is an address
+   reference to the function and the dynamic linker must resolve the
+   symbol immediately upon loading.  To avoid conflict, symbols for
+   which the dynamic linker must assume the non-PIC ABI semantics are
+   marked with the STO_MIPS_PLT flag.  */
+static inline bool
+elf_machine_sym_no_match (const ElfW(Sym) *sym)
+{
+  return sym->st_shndx == SHN_UNDEF && !(sym->st_other & STO_MIPS_PLT);
+}
+
+#endif /* _ELF_MACHINE_SYM_NO_MATCH_H */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH glibc 2/8] Implement __libc_early_init
  2020-01-08 14:58 [RFC PATCH glibc 0/8] Restartable Sequences enablement Mathieu Desnoyers
  2020-01-08 14:58 ` [RFC PATCH glibc 1/8] Introduce <elf_machine_sym_no_match.h> Mathieu Desnoyers
@ 2020-01-08 14:58 ` Mathieu Desnoyers
  2020-01-08 14:58 ` [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098] Mathieu Desnoyers
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Mathieu Desnoyers @ 2020-01-08 14:58 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Florian Weimer, Joseph Myers, Szabolcs Nagy, libc-alpha

From: Florian Weimer <fweimer@redhat.com>

This function is defined in libc.so, and the dynamic loader calls
right after relocation has been finished, before any ELF constructors
or the preinit function is invoked.  It is also used in the static
build for initializing parts of the static libc.

To locate __libc_early_init, a direct symbol lookup function is used,
_dl_lookup_direct.  It does not search the entire symbol scope and
consults merely a single link map.  This function could also be used
to implement lookups in the vDSO (as an optimization).

A per-namespace variable (libc_map) is added for locating libc.so,
to avoid repeated traversals of the search scope.  It is similar to
GL(dl_initfirst).  An alternative would have been to thread a context
argument from _dl_open down to _dl_map_object_from_fd (where libc.so
is identified).  This could have avoided the global variable, but
the change would be larger as a result.  It would not have been
possible to use this to replace GL(dl_initfirst) because that global
variable is used to pass the function pointer past the stack switch
from dl_main to the main program.  Replacing that requires adding
a new argument to _dl_init, which in turn needs changes to the
architecture-specific libc.so startup code written in assembler.

__libc_early_init should not be used to replace _dl_var_init (as
it exists today on some architectures).  Instead, _dl_lookup_direct
should be used to look up a new variable symbol in libc.so, and
that should then be initialized from the dynamic loader, immediately
after the object has been loaded in _dl_map_object_from_fd (before
relocation is run).  This way, more IFUNC resolvers which depend on
these variables will work.

This version was tested on x86_64-linux-gnu.

	* csu/init-first.c (_init): Remove call to __ctype_init.  Moved to
	__libc_early_init.
	* csu/libc-start.c [!SHARED] (LIBC_START_MAIN): Call
	__libc_early_init.
	* elf/Makefile (routines): Add libc_early_init.
	(dl-routines): Add dl-call-libc-early-init.
	* elf/Versions (libc): Export __libc_early_init under
	GLIBC_PRIVATE.
	* elf/dl-call-libc-early-init.c: New file.
	* elf/dl-load.c (_dl_map_object_from_fd): Record in the namespace
	description if libc.so has been loaded.
	* elf/dl-lookup-direct.c: New file.  Extracted from
	elf/dl-lookup.c.
	* elf/dl-open.c (struct dl_open_args): Add libc_already_loaded.
	(dl_open_worker): Set libc_already_loaded.  Call
	_dl_call_libc_early_init if libc.so has been loaded.
	(_dl_open): Initialize libc_already_loaded.  Reset libc_map on
	failure if necessary.
	* elf/libc-early-init.h: New file.
	* elf/libc_early_init.c: Likewise.
	* elf/rtld.c (dl_main): Call _dl_call_libc_early_init.
	* sysdeps/generic/ldsodefs.h (struct rtld_global): Add libc_map
	member to the namespace description.
	(_dl_lookup_direct): Declare.
	* sysdeps/mach/hurd/i386/init-first.c (posixland_init): Do not
	call __ctype_init.
---
 csu/init-first.c                    |   4 -
 csu/libc-start.c                    |   5 ++
 elf/Makefile                        |   5 +-
 elf/Versions                        |   1 +
 elf/dl-call-libc-early-init.c       |  41 ++++++++++
 elf/dl-load.c                       |   9 +++
 elf/dl-lookup-direct.c              | 116 ++++++++++++++++++++++++++++
 elf/dl-open.c                       |  24 ++++++
 elf/libc-early-init.h               |  35 +++++++++
 elf/libc_early_init.c               |  27 +++++++
 elf/rtld.c                          |   4 +
 sysdeps/generic/ldsodefs.h          |  17 ++++
 sysdeps/mach/hurd/i386/init-first.c |   4 -
 13 files changed, 282 insertions(+), 10 deletions(-)
 create mode 100644 elf/dl-call-libc-early-init.c
 create mode 100644 elf/dl-lookup-direct.c
 create mode 100644 elf/libc-early-init.h
 create mode 100644 elf/libc_early_init.c

diff --git a/csu/init-first.c b/csu/init-first.c
index 1cd8a75098..8226ee3945 100644
--- a/csu/init-first.c
+++ b/csu/init-first.c
@@ -16,7 +16,6 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <ctype.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <fcntl.h>
@@ -76,9 +75,6 @@ _init (int argc, char **argv, char **envp)
 
   __init_misc (argc, argv, envp);
 
-  /* Initialize ctype data.  */
-  __ctype_init ();
-
 #if defined SHARED && !defined NO_CTORS_DTORS_SECTIONS
   __libc_global_ctors ();
 #endif
diff --git a/csu/libc-start.c b/csu/libc-start.c
index 12468c5a89..ccc743c9d1 100644
--- a/csu/libc-start.c
+++ b/csu/libc-start.c
@@ -22,6 +22,7 @@
 #include <ldsodefs.h>
 #include <exit-thread.h>
 #include <libc-internal.h>
+#include <elf/libc-early-init.h>
 
 #include <elf/dl-tunables.h>
 
@@ -238,6 +239,10 @@ LIBC_START_MAIN (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
     __cxa_atexit ((void (*) (void *)) rtld_fini, NULL, NULL);
 
 #ifndef SHARED
+  /* Perform early initialization.  In the shared case, this function
+     is called from the dynamic loader as early as possible.  */
+  __libc_early_init ();
+
   /* Call the initializer of the libc.  This is only needed here if we
      are compiling for the static library in which case we haven't
      run the constructors in `_dl_start_user'.  */
diff --git a/elf/Makefile b/elf/Makefile
index b293f6ec85..9c74527f44 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -25,7 +25,7 @@ headers		= elf.h bits/elfclass.h link.h bits/link.h
 routines	= $(all-dl-routines) dl-support dl-iteratephdr \
 		  dl-addr dl-addr-obj enbl-secure dl-profstub \
 		  dl-origin dl-libc dl-sym dl-sysdep dl-error \
-		  dl-reloc-static-pie
+		  dl-reloc-static-pie libc_early_init
 
 # The core dynamic linking functions are in libc for the static and
 # profiled libraries.
@@ -33,7 +33,8 @@ dl-routines	= $(addprefix dl-,load lookup object reloc deps hwcaps \
 				  runtime init fini debug misc \
 				  version profile tls origin scope \
 				  execstack open close trampoline \
-				  exception sort-maps)
+				  exception sort-maps lookup-direct \
+				  call-libc-early-init)
 ifeq (yes,$(use-ldconfig))
 dl-routines += dl-cache
 endif
diff --git a/elf/Versions b/elf/Versions
index 3b09901f6c..f26d2817c3 100644
--- a/elf/Versions
+++ b/elf/Versions
@@ -26,6 +26,7 @@ libc {
     _dl_open_hook; _dl_open_hook2;
     _dl_sym; _dl_vsym;
     __libc_dlclose; __libc_dlopen_mode; __libc_dlsym; __libc_dlvsym;
+    __libc_early_init;
 
     # Internal error handling support.  Interposes the functions in ld.so.
     _dl_signal_exception; _dl_catch_exception;
diff --git a/elf/dl-call-libc-early-init.c b/elf/dl-call-libc-early-init.c
new file mode 100644
index 0000000000..6c3ac5bfe7
--- /dev/null
+++ b/elf/dl-call-libc-early-init.c
@@ -0,0 +1,41 @@
+/* Invoke the early initialization function in libc.so.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <ldsodefs.h>
+#include <libc-early-init.h>
+#include <link.h>
+#include <stddef.h>
+
+void
+_dl_call_libc_early_init (struct link_map *libc_map)
+{
+  /* There is nothing to do if we did not actually load libc.so.  */
+  if (libc_map == NULL)
+    return;
+
+  const ElfW(Sym) *sym
+    = _dl_lookup_direct (libc_map, "__libc_early_init",
+                         0x69682ac, /* dl_new_hash output.  */
+                         "GLIBC_PRIVATE",
+                         0x0963cf85); /* _dl_elf_hash output.  */
+  assert (sym != NULL);
+  __typeof (__libc_early_init) *early_init
+    = DL_SYMBOL_ADDRESS (libc_map, sym);
+  early_init ();
+}
diff --git a/elf/dl-load.c b/elf/dl-load.c
index a6b80f9395..06f2ba7264 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -30,6 +30,7 @@
 #include <sys/param.h>
 #include <sys/stat.h>
 #include <sys/types.h>
+#include <gnu/lib-names.h>
 
 /* Type for the buffer we put the ELF header and hopefully the program
    header.  This buffer does not really have to be too large.  In most
@@ -1374,6 +1375,14 @@ cannot enable executable stack as shared object requires");
     add_name_to_object (l, ((const char *) D_PTR (l, l_info[DT_STRTAB])
 			    + l->l_info[DT_SONAME]->d_un.d_val));
 
+  /* If we have newly loaded libc.so, update the namespace
+     description.  */
+  if (GL(dl_ns)[nsid].libc_map == NULL
+      && l->l_info[DT_SONAME] != NULL
+      && strcmp (((const char *) D_PTR (l, l_info[DT_STRTAB])
+		  + l->l_info[DT_SONAME]->d_un.d_val), LIBC_SO) == 0)
+    GL(dl_ns)[nsid].libc_map = l;
+
   /* _dl_close can only eventually undo the module ID assignment (via
      remove_slotinfo) if this function returns a pointer to a link
      map.  Therefore, delay this step until all possibilities for
diff --git a/elf/dl-lookup-direct.c b/elf/dl-lookup-direct.c
new file mode 100644
index 0000000000..190b826e1e
--- /dev/null
+++ b/elf/dl-lookup-direct.c
@@ -0,0 +1,116 @@
+/* Look up a symbol in a single specified object.
+   Copyright (C) 1995-2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <ldsodefs.h>
+#include <string.h>
+#include <elf_machine_sym_no_match.h>
+#include <dl-hash.h>
+
+/* This function corresponds to do_lookup_x in elf/dl-lookup.c.  The
+   variant here is simplified because it requires symbol
+   versioning. */
+static const ElfW(Sym) *
+check_match (const struct link_map *const map, const char *const undef_name,
+             const char *version, uint32_t version_hash,
+             const Elf_Symndx symidx)
+{
+  const ElfW(Sym) *symtab = (const void *) D_PTR (map, l_info[DT_SYMTAB]);
+  const ElfW(Sym) *sym = &symtab[symidx];
+
+  unsigned int stt = ELFW(ST_TYPE) (sym->st_info);
+  if (__glibc_unlikely ((sym->st_value == 0 /* No value.  */
+                         && sym->st_shndx != SHN_ABS
+                         && stt != STT_TLS)
+                        || elf_machine_sym_no_match (sym)))
+    return NULL;
+
+  /* Ignore all but STT_NOTYPE, STT_OBJECT, STT_FUNC,
+     STT_COMMON, STT_TLS, and STT_GNU_IFUNC since these are no
+     code/data definitions.  */
+#define ALLOWED_STT \
+  ((1 << STT_NOTYPE) | (1 << STT_OBJECT) | (1 << STT_FUNC) \
+   | (1 << STT_COMMON) | (1 << STT_TLS) | (1 << STT_GNU_IFUNC))
+  if (__glibc_unlikely (((1 << stt) & ALLOWED_STT) == 0))
+    return NULL;
+
+  const char *strtab = (const void *) D_PTR (map, l_info[DT_STRTAB]);
+
+  if (strcmp (strtab + sym->st_name, undef_name) != 0)
+    /* Not the symbol we are looking for.  */
+    return NULL;
+
+  ElfW(Half) ndx = map->l_versyms[symidx] & 0x7fff;
+  if (map->l_versions[ndx].hash != version_hash
+      || strcmp (map->l_versions[ndx].name, version) != 0)
+    /* It's not the version we want.  */
+    return NULL;
+
+  return sym;
+}
+
+
+/* This function corresponds to do_lookup_x in elf/dl-lookup.c.  The
+   variant here is simplified because it does not search object
+   dependencies.  It is optimized for a successful lookup.  */
+const ElfW(Sym) *
+_dl_lookup_direct (struct link_map *map,
+                   const char *undef_name, uint32_t new_hash,
+                   const char *version, uint32_t version_hash)
+{
+  const ElfW(Addr) *bitmask = map->l_gnu_bitmask;
+  if (__glibc_likely (bitmask != NULL))
+    {
+      Elf32_Word bucket = map->l_gnu_buckets[new_hash % map->l_nbuckets];
+      if (bucket != 0)
+        {
+          const Elf32_Word *hasharr = &map->l_gnu_chain_zero[bucket];
+
+          do
+            if (((*hasharr ^ new_hash) >> 1) == 0)
+              {
+                Elf_Symndx symidx = ELF_MACHINE_HASH_SYMIDX (map, hasharr);
+                const ElfW(Sym) *sym = check_match (map, undef_name,
+                                                    version, version_hash,
+                                                    symidx);
+                if (sym != NULL)
+                  return sym;
+              }
+          while ((*hasharr++ & 1u) == 0);
+        }
+    }
+  else
+    {
+      /* Fallback code for lack of GNU_HASH support.  */
+      uint32_t old_hash = _dl_elf_hash (undef_name);
+
+      /* Use the old SysV-style hash table.  Search the appropriate
+         hash bucket in this object's symbol table for a definition
+         for the same symbol name.  */
+      for (Elf_Symndx symidx = map->l_buckets[old_hash % map->l_nbuckets];
+           symidx != STN_UNDEF;
+           symidx = map->l_chain[symidx])
+        {
+          const ElfW(Sym) *sym = check_match (map, undef_name,
+                                              version, version_hash, symidx);
+          if (sym != NULL)
+            return sym;
+        }
+    }
+
+  return NULL;
+}
diff --git a/elf/dl-open.c b/elf/dl-open.c
index 623c9754eb..a5b9f226b4 100644
--- a/elf/dl-open.c
+++ b/elf/dl-open.c
@@ -34,6 +34,7 @@
 #include <atomic.h>
 #include <libc-internal.h>
 #include <array_length.h>
+#include <libc-early-init.h>
 
 #include <dl-dst.h>
 #include <dl-prop.h>
@@ -57,6 +58,13 @@ struct dl_open_args
      (non-negative).  */
   unsigned int original_global_scope_pending_adds;
 
+  /* Set to true if libc.so was already loaded into the namespace at
+     the time dl_open_worker was called.  This is used to determine
+     whether libc.so early initialization needs to before, and whether
+     to roll back the cached libc_map value in the namespace in case
+     of a dlopen failure.  */
+  bool libc_already_loaded;
+
   /* Original parameters to the program and the current environment.  */
   int argc;
   char **argv;
@@ -500,6 +508,11 @@ dl_open_worker (void *a)
 	args->nsid = call_map->l_ns;
     }
 
+  /* The namespace ID is now known.  Keep track of whether libc.so was
+     already loaded, to determine whether it is necessary to call the
+     early initialization routine (or clear libc_map on error).  */
+  args->libc_already_loaded = GL(dl_ns)[args->nsid].libc_map != NULL;
+
   /* Retain the old value, so that it can be restored.  */
   args->original_global_scope_pending_adds
     = GL (dl_ns)[args->nsid]._ns_global_scope_pending_adds;
@@ -735,6 +748,11 @@ dl_open_worker (void *a)
   if (relocation_in_progress)
     LIBC_PROBE (reloc_complete, 3, args->nsid, r, new);
 
+  /* If libc.so was not there before, attempt to call its early
+     initialization routine.  */
+  if (!args->libc_already_loaded)
+    _dl_call_libc_early_init (GL(dl_ns)[args->nsid].libc_map);
+
 #ifndef SHARED
   DL_STATIC_INIT (new);
 #endif
@@ -829,6 +847,7 @@ no more namespaces available for dlmopen()"));
   args.caller_dlopen = caller_dlopen;
   args.map = NULL;
   args.nsid = nsid;
+  args.libc_already_loaded = true; /* No reset below with early failure.  */
   args.argc = argc;
   args.argv = argv;
   args.env = env;
@@ -857,6 +876,11 @@ no more namespaces available for dlmopen()"));
   /* See if an error occurred during loading.  */
   if (__glibc_unlikely (exception.errstring != NULL))
     {
+      /* Avoid keeping around a dangling reference to the libc.so link
+	 map in case it has been cached in libc_map.  */
+      if (!args.libc_already_loaded)
+	GL(dl_ns)[nsid].libc_map = NULL;
+
       /* Remove the object from memory.  It may be in an inconsistent
 	 state if relocation failed, for example.  */
       if (args.map)
diff --git a/elf/libc-early-init.h b/elf/libc-early-init.h
new file mode 100644
index 0000000000..02b855754e
--- /dev/null
+++ b/elf/libc-early-init.h
@@ -0,0 +1,35 @@
+/* Early initialization of libc.so.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _LIBC_EARLY_INIT_H
+#define _LIBC_EARLY_INIT_H
+
+struct link_map;
+
+/* If LIBC_MAP is not NULL, look up the __libc_early_init symbol in it
+   and call this function.  */
+void _dl_call_libc_early_init (struct link_map *libc_map) attribute_hidden;
+
+/* In the shared case, this function is defined in libc.so and invoked
+   from ld.so (or on the fist static dlopen) after complete relocation
+   of a new loaded libc.so, but before user-defined ELF constructors
+   run.  In the static case, this function is called directly from the
+   startup code.  */
+void __libc_early_init (void);
+
+#endif /* _LIBC_EARLY_INIT_H */
diff --git a/elf/libc_early_init.c b/elf/libc_early_init.c
new file mode 100644
index 0000000000..1ac66d895d
--- /dev/null
+++ b/elf/libc_early_init.c
@@ -0,0 +1,27 @@
+/* Early initialization of libc.so, libc.so side.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <ctype.h>
+#include <libc-early-init.h>
+
+void
+__libc_early_init (void)
+{
+  /* Initialize ctype data.  */
+  __ctype_init ();
+}
diff --git a/elf/rtld.c b/elf/rtld.c
index 553cfbd1b7..a96cd54514 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -45,6 +45,7 @@
 #include <stap-probe.h>
 #include <stackinfo.h>
 #include <not-cancel.h>
+#include <libc-early-init.h>
 
 #include <assert.h>
 
@@ -2320,6 +2321,9 @@ ERROR: '%s': cannot process note segment.\n", _dl_argv[0]);
       rtld_timer_accum (&relocate_time, start);
     }
 
+  /* Relocation is complete.  Perform early libc initialization.  */
+  _dl_call_libc_early_init (GL(dl_ns)[LM_ID_BASE].libc_map);
+
   /* Do any necessary cleanups for the startup OS interface code.
      We do these now so that no calls are made after rtld re-relocation
      which might be resolved to different functions than we expect.
diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index 497938ffa2..5ff4a2831b 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -336,6 +336,10 @@ struct rtld_global
        recursive dlopen calls from ELF constructors.  */
     unsigned int _ns_global_scope_pending_adds;
 
+    /* Once libc.so has been loaded into the namespace, this points to
+       its link map.  */
+    struct link_map *libc_map;
+
     /* Search table for unique objects.  */
     struct unique_sym_table
     {
@@ -946,6 +950,19 @@ extern lookup_t _dl_lookup_symbol_x (const char *undef,
      attribute_hidden;
 
 
+/* Restricted version of _dl_lookup_symbol_x.  Searches MAP (and only
+   MAP) for the symbol UNDEF_NAME, with GNU hash NEW_HASH (computed
+   with dl_new_hash), symbol version VERSION, and symbol version hash
+   VERSION_HASH (computed with _dl_elf_hash).  Returns a pointer to
+   the symbol table entry in MAP on success, or NULL on failure.  MAP
+   must have symbol versioning information, or otherwise the result is
+   undefined.  */
+const ElfW(Sym) *_dl_lookup_direct (struct link_map *map,
+				    const char *undef_name,
+				    uint32_t new_hash,
+				    const char *version,
+				    uint32_t version_hash) attribute_hidden;
+
 /* Add the new link_map NEW to the end of the namespace list.  */
 extern void _dl_add_to_namespace_list (struct link_map *new, Lmid_t nsid)
      attribute_hidden;
diff --git a/sysdeps/mach/hurd/i386/init-first.c b/sysdeps/mach/hurd/i386/init-first.c
index 92bf45223f..a7dd4895d9 100644
--- a/sysdeps/mach/hurd/i386/init-first.c
+++ b/sysdeps/mach/hurd/i386/init-first.c
@@ -17,7 +17,6 @@
    <https://www.gnu.org/licenses/>.  */
 
 #include <assert.h>
-#include <ctype.h>
 #include <hurd.h>
 #include <stdio.h>
 #include <unistd.h>
@@ -85,9 +84,6 @@ posixland_init (int argc, char **argv, char **envp)
 #endif
   __init_misc (argc, argv, envp);
 
-  /* Initialize ctype data.  */
-  __ctype_init ();
-
 #if defined SHARED && !defined NO_CTORS_DTORS_SECTIONS
   __libc_global_ctors ();
 #endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098]
  2020-01-08 14:58 [RFC PATCH glibc 0/8] Restartable Sequences enablement Mathieu Desnoyers
  2020-01-08 14:58 ` [RFC PATCH glibc 1/8] Introduce <elf_machine_sym_no_match.h> Mathieu Desnoyers
  2020-01-08 14:58 ` [RFC PATCH glibc 2/8] Implement __libc_early_init Mathieu Desnoyers
@ 2020-01-08 14:58 ` Mathieu Desnoyers
  2020-01-08 14:58 ` [RFC PATCH glibc 4/8] glibc: Perform rseq(2) registration at C startup and thread creation (v14) Mathieu Desnoyers
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Mathieu Desnoyers @ 2020-01-08 14:58 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Florian Weimer, Joseph Myers, Szabolcs Nagy, libc-alpha

From: Florian Weimer <fweimer@redhat.com>

New threads inherit the signal mask from the current thread.  This
means that signal handlers can run on the newly created thread
immediately after the kernel has created the userspace thread, even
before glibc has initialized the TCB.  Consequently, new threads can
observe uninitialized ctype data, among other things.

To address this, block all signals before starting the thread, and
pass the original signal mask to the start routine wrapper.  On the
new thread, first perform all thread initialization, and then unblock
signals.

The cost of doing this is two rt_sigprocmask system calls on the old
thread, and one rt_sigprocmask system call on the new thread.  (If
there was a way to clone a new thread with a signals disabled, this
could be brought down to one system call each.)  The thread descriptor
increases in size, too, and sigset_t is fairly large.  This increase
could be brought down by reusing space the in the descriptor which is
not needed before running user code, or by switching to an internal
sigset_t definition which only covers the signals supported by the
kernel definition.  (Part of the thread descriptor size increase is
already offset by reduced stack usage in the thread start wrapper
routine after this commit.)

-----
 nptl/descr.h          | 10 +++++++---
 nptl/pthread_create.c | 47 +++++++++++++++++++++++++----------------------
 2 files changed, 32 insertions(+), 25 deletions(-)
---
 nptl/descr.h          | 10 ++++++---
 nptl/pthread_create.c | 47 +++++++++++++++++++++++--------------------
 2 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/nptl/descr.h b/nptl/descr.h
index 9dcf480bdf..e1c7db5473 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -332,9 +332,8 @@ struct pthread
   /* True if thread must stop at startup time.  */
   bool stopped_start;
 
-  /* The parent's cancel handling at the time of the pthread_create
-     call.  This might be needed to undo the effects of a cancellation.  */
-  int parent_cancelhandling;
+  /* Formerly used for dealing with cancellation.  */
+  int parent_cancelhandling_unsed;
 
   /* Lock to synchronize access to the descriptor.  */
   int lock;
@@ -391,6 +390,11 @@ struct pthread
   /* Resolver state.  */
   struct __res_state res;
 
+  /* Signal mask for the new thread.  Used during thread startup to
+     restore the signal mask.  (Threads are launched with all signals
+     masked.)  */
+  sigset_t sigmask;
+
   /* Indicates whether is a C11 thread created by thrd_creat.  */
   bool c11;
 
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index d3fd58730c..5b3d4382b3 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -369,7 +369,6 @@ __free_tcb (struct pthread *pd)
     }
 }
 
-
 /* Local function to start thread and handle cleanup.
    createthread.c defines the macro START_THREAD_DEFN to the
    declaration that its create_thread function will refer to, and
@@ -385,10 +384,6 @@ START_THREAD_DEFN
   /* Initialize pointers to locale data.  */
   __ctype_init ();
 
-  /* Allow setxid from now onwards.  */
-  if (__glibc_unlikely (atomic_exchange_acq (&pd->setxid_futex, 0) == -2))
-    futex_wake (&pd->setxid_futex, 1, FUTEX_PRIVATE);
-
 #ifdef __NR_set_robust_list
 # ifndef __ASSUME_SET_ROBUST_LIST
   if (__set_robust_list_avail >= 0)
@@ -402,19 +397,6 @@ START_THREAD_DEFN
     }
 #endif
 
-  /* If the parent was running cancellation handlers while creating
-     the thread the new thread inherited the signal mask.  Reset the
-     cancellation signal mask.  */
-  if (__glibc_unlikely (pd->parent_cancelhandling & CANCELING_BITMASK))
-    {
-      INTERNAL_SYSCALL_DECL (err);
-      sigset_t mask;
-      __sigemptyset (&mask);
-      __sigaddset (&mask, SIGCANCEL);
-      (void) INTERNAL_SYSCALL (rt_sigprocmask, err, 4, SIG_UNBLOCK, &mask,
-			       NULL, _NSIG / 8);
-    }
-
   /* This is where the try/finally block should be created.  For
      compilers without that support we do use setjmp.  */
   struct pthread_unwind_buf unwind_buf;
@@ -436,6 +418,12 @@ START_THREAD_DEFN
   unwind_buf.priv.data.prev = NULL;
   unwind_buf.priv.data.cleanup = NULL;
 
+  __libc_signal_restore_set (&pd->sigmask);
+
+  /* Allow setxid from now onwards.  */
+  if (__glibc_unlikely (atomic_exchange_acq (&pd->setxid_futex, 0) == -2))
+    futex_wake (&pd->setxid_futex, 1, FUTEX_PRIVATE);
+
   if (__glibc_likely (! not_first_call))
     {
       /* Store the new cleanup handler info.  */
@@ -726,10 +714,6 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
   CHECK_THREAD_SYSINFO (pd);
 #endif
 
-  /* Inform start_thread (above) about cancellation state that might
-     translate into inherited signal state.  */
-  pd->parent_cancelhandling = THREAD_GETMEM (THREAD_SELF, cancelhandling);
-
   /* Determine scheduling parameters for the thread.  */
   if (__builtin_expect ((iattr->flags & ATTR_FLAG_NOTINHERITSCHED) != 0, 0)
       && (iattr->flags & (ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET)) != 0)
@@ -775,6 +759,21 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
      ownership of PD (see CONCURRENCY NOTES above).  */
   bool stopped_start = false; bool thread_ran = false;
 
+  /* Block all signals, so that the new thread starts out with
+     signals disabled.  This avoids race conditions in the thread
+     startup.  */
+  sigset_t original_sigmask;
+  __libc_signal_block_all (&original_sigmask);
+
+  /* Conceptually, the new thread needs to inherit the signal mask of
+     this thread.  Therefore, it needs to restore the saved signal
+     mask of this thread, so save it in the startup information.  */
+  pd->sigmask = original_sigmask;
+
+  /* Reset the cancellation signal mask in case this thread is running
+     cancellation.  */
+  __sigdelset (&pd->sigmask, SIGCANCEL);
+
   /* Start the thread.  */
   if (__glibc_unlikely (report_thread_creation (pd)))
     {
@@ -817,6 +816,10 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
     retval = create_thread (pd, iattr, &stopped_start,
 			    STACK_VARIABLES_ARGS, &thread_ran);
 
+  /* Return to the previous signal mask, after creating the new
+     thread.  */
+  __libc_signal_restore_set (&original_sigmask);
+
   if (__glibc_unlikely (retval != 0))
     {
       if (thread_ran)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH glibc 4/8] glibc: Perform rseq(2) registration at C startup and thread creation (v14)
  2020-01-08 14:58 [RFC PATCH glibc 0/8] Restartable Sequences enablement Mathieu Desnoyers
                   ` (2 preceding siblings ...)
  2020-01-08 14:58 ` [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098] Mathieu Desnoyers
@ 2020-01-08 14:58 ` Mathieu Desnoyers
  2020-01-08 14:58 ` [RFC PATCH glibc 5/8] glibc: sched_getcpu(): use rseq cpu_id TLS on Linux (v5) Mathieu Desnoyers
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Mathieu Desnoyers @ 2020-01-08 14:58 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Florian Weimer, Joseph Myers, Szabolcs Nagy, libc-alpha,
	Mathieu Desnoyers, Thomas Gleixner, Ben Maurer, Peter Zijlstra,
	Paul E. McKenney, Boqun Feng, Will Deacon, Dave Watson,
	Paul Turner, Rich Felker, linux-kernel, linux-api

Register rseq(2) TLS for each thread (including main), and unregister
for each thread (excluding main). "rseq" stands for Restartable
Sequences.

See the rseq(2) man page proposed here:
  https://lkml.org/lkml/2018/9/19/647

This patch is based on glibc-2.30. The rseq(2) system call was merged
into Linux 4.18.

	* NEWS: Add Restartable Sequences feature description.
	* elf/libc_early_init.c: Perform rseq(2) registration at C startup.
	startup for shared libc.
	* nptl/pthread_create.c: Perform rseq(2) registration at thread
	creation.
	* manual/threads.texi: Document __rseq_abi, RSEQ_SIG, sys/rseq.h.
	* sysdeps/unix/sysv/linux/Makefile: Add rseq-sym, sys/rseq.h,
	bits/rseq.h.
	* sysdeps/unix/sysv/linux/Versions: Export __rseq_abi from libc.
	* sysdeps/unix/sysv/linux/aarch64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/alpha/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/arm/be/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/arm/le/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/csky/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/hppa/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/i386/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/ia64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/microblaze/be/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/microblaze/le/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/nios2/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sh/be/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sh/le/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise.
	* sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist: Likewise.
	* misc/rseq-internal.h: New file.
	* sysdeps/unix/sysv/linux/rseq-internal.h: Likewise.
	* sysdeps/unix/sysv/linux/rseq-sym.c: Likewise.
	* sysdeps/unix/sysv/linux/sys/rseq.h: Likewise.
	* sysdeps/unix/sysv/linux/bits/rseq.h: Likewise.
	* sysdeps/unix/sysv/linux/aarch64/bits/rseq.h: Likewise.
	* sysdeps/unix/sysv/linux/arm/bits/rseq.h: Likewise.
	* sysdeps/unix/sysv/linux/mips/bits/rseq.h: Likewise.
	* sysdeps/unix/sysv/linux/powerpc/bits/rseq.h: Likewise.
	* sysdeps/unix/sysv/linux/s390/bits/rseq.h: Likewise.
	* sysdeps/unix/sysv/linux/x86/bits/rseq.h: Likewise.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Carlos O'Donell <carlos@redhat.com>
CC: Florian Weimer <fweimer@redhat.com>
CC: Joseph Myers <joseph@codesourcery.com>
CC: Szabolcs Nagy <szabolcs.nagy@arm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ben Maurer <bmaurer@fb.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Paul Turner <pjt@google.com>
CC: Rich Felker <dalias@libc.org>
CC: libc-alpha@sourceware.org
CC: linux-kernel@vger.kernel.org
CC: linux-api@vger.kernel.org
---
Changes since v1:
- Move __rseq_refcount to an extra field at the end of __rseq_abi to
  eliminate one symbol.

  All libraries/programs which try to register rseq (glibc,
  early-adopter applications, early-adopter libraries) should use the
  rseq refcount. It becomes part of the ABI within a user-space
  process, but it's not part of the ABI shared with the kernel per se.

- Restructure how this code is organized so glibc keeps building on
  non-Linux targets.

- Use non-weak symbol for __rseq_abi.

- Move rseq registration/unregistration implementation into its own
  nptl/rseq.c compile unit.

- Move __rseq_abi symbol under GLIBC_2.29.

Changes since v2:
- Move __rseq_refcount to its own symbol, which is less ugly than
  trying to play tricks with the rseq uapi.
- Move __rseq_abi from nptl to csu (C start up), so it can be used
  across glibc, including memory allocator and sched_getcpu(). The
  __rseq_refcount symbol is kept in nptl, because there is no reason
  to use it elsewhere in glibc.

Changes since v3:
- Set __rseq_refcount TLS to 1 on register/set to 0 on unregister
  because glibc is the first/last user.
- Unconditionally register/unregister rseq at thread start/exit, because
  glibc is the first/last user.
- Add missing abilist items.
- Rebase on glibc master commit a502c5294.
- Add NEWS entry.

Changes since v4:
- Do not use "weak" symbols for __rseq_abi and __rseq_refcount. Based on
  "System V Application Binary Interface", weak only affects the link
  editor, not the dynamic linker.
- Install a new sys/rseq.h system header on Linux, which contains the
  RSEQ_SIG definition, __rseq_abi declaration and __rseq_refcount
  declaration. Move those definition/declarations from rseq-internal.h
  to the installed sys/rseq.h header.
- Considering that rseq is only available on Linux, move csu/rseq.c to
  sysdeps/unix/sysv/linux/rseq-sym.c.
- Move __rseq_refcount from nptl/rseq.c to
  sysdeps/unix/sysv/linux/rseq-sym.c, so it is only defined on Linux.
- Move both ABI definitions for __rseq_abi and __rseq_refcount to
  sysdeps/unix/sysv/linux/Versions, so they only appear on Linux.
- Document __rseq_abi and __rseq_refcount volatile.
- Document the RSEQ_SIG signature define.
- Move registration functions from rseq.c to rseq-internal.h static
  inline functions. Introduce empty stubs in misc/rseq-internal.h,
  which can be overridden by architecture code in
  sysdeps/unix/sysv/linux/rseq-internal.h.
- Rename __rseq_register_current_thread and __rseq_unregister_current_thread
  to rseq_register_current_thread and rseq_unregister_current_thread,
  now that those are only visible as internal static inline functions.
- Invoke rseq_register_current_thread() from libc-start.c LIBC_START_MAIN
  rather than nptl init, so applications not linked against
  libpthread.so have rseq registered for their main() thread. Note that
  it is invoked separately for SHARED and !SHARED builds.

Changes since v5:
- Replace __rseq_refcount by __rseq_lib_abi, which contains two
  uint32_t: register_state and refcount. The "register_state" field
  allows inhibiting rseq registration from signal handlers nested on top
  of glibc registration and occuring after rseq unregistration by glibc.
- Introduce enum rseq_register_state, which contains the states allowed
  for the struct rseq_lib_abi register_state field.

Changes since v6:
- Introduce bits/rseq.h to define RSEQ_SIG for each architecture.
  The generic bits/rseq.h does not define RSEQ_SIG, meaning that each
  architecture implementing rseq needs to implement bits/rseq.h.
- Rename enum item RSEQ_REGISTER_NESTED to RSEQ_REGISTER_ONGOING.
- Port to glibc-2.29.

Changes since v7:
- Remove __rseq_lib_abi symbol, including refcount and register_state
  fields.
- Remove reference counting and nested signals handling from
  registration/unregistration functions.
- Introduce new __rseq_handled exported symbol, which is set to 1
  by glibc on C startup when it handles restartable sequences.
  This allows glibc to coexist with early adopter libraries and
  applications wishing to register restartable sequences when it
  is not handled by glibc.
- Introduce rseq_init (), which sets __rseq_handled to 1 from
  C startup.
- Update NEWS entry.
- Update comments at the beginning of new files.
- Registration depends on both __NR_rseq and RSEQ_SIG.
- Remove ARM, powerpc, MIPS RSEQ_SIG until we agree with maintainers
  on the signature choice.
- Update x86, s390 RSEQ_SIG based on discussion with arch maintainers.
- Remove rseq-internal.h from headers list of misc/Makefile, so it
  it not installed by make install.

Changes since v8:
- Introduce RSEQ_SIG_CODE and RSEQ_SIG_DATA on aarch64 to handle
  compiling with -mbig-endian.

Changes since v9:
- Update Changelog.
- Remove unneeded new file comment header newlines.

Changes since v10:
- Remove volatile from __rseq_abi declaration.
- Document that __rseq_handled is about library managing rseq
  registration, independently of whether rseq is available or not.
- Move __rseq_handled symbol to ld.so, initialize this symbol within
  the dynamic linker initialization for both shared (rtld.c) and static
  (dl-support.c) builds.
- Only register the rseq TLS on initialization once in multiple-libc
  scenarios. Use rtld_active () for this purpose.
- In the static libc case, register the rseq TLS after LD_PRELOAD
  constructors are run, so it matches the order of this initialization
  vs LD_PRELOAD contructors execution for the shared libc.
- Agreed on signature choice with powerpc and MIPS maintainers,
  re-adding those signatures,
- The main architecture still left out signature-wise is ARM32.

Changes since v11:
- Rebase on glibc 2.30.
- Re-introduce ARM RSEQ_SIG following feedback from Will Deacon.

Changes since v12:
- Remove __rseq_handled,
- Rely on OS implicit rseq unregistration on thread teardown,
- Register main thread in __libc_early_init ().
- Add Restartable Sequences entry to threads manual.

Changes since v13:
- Update following be/le abilist split for arm, microblaze, and sh.
- Update manual to add the __rseq_abi variable and RSEQ_SIG macro to
  generate manual index entries, and add missing "Restartable Sequences"
  menu entry to the threads chapter.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Carlos O'Donell <carlos@redhat.com>
CC: Florian Weimer <fweimer@redhat.com>
CC: Joseph Myers <joseph@codesourcery.com>
CC: Szabolcs Nagy <szabolcs.nagy@arm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ben Maurer <bmaurer@fb.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Paul Turner <pjt@google.com>
CC: Rich Felker <dalias@libc.org>
CC: libc-alpha@sourceware.org
CC: linux-kernel@vger.kernel.org
CC: linux-api@vger.kernel.org
---
 NEWS                                          | 10 +++
 elf/libc_early_init.c                         |  3 +
 manual/threads.texi                           | 30 ++++++-
 misc/rseq-internal.h                          | 33 ++++++++
 nptl/pthread_create.c                         | 12 +++
 sysdeps/unix/sysv/linux/Makefile              |  5 +-
 sysdeps/unix/sysv/linux/Versions              |  3 +
 sysdeps/unix/sysv/linux/aarch64/bits/rseq.h   | 43 ++++++++++
 sysdeps/unix/sysv/linux/aarch64/libc.abilist  |  1 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist    |  1 +
 sysdeps/unix/sysv/linux/arm/be/libc.abilist   |  1 +
 sysdeps/unix/sysv/linux/arm/bits/rseq.h       | 83 +++++++++++++++++++
 sysdeps/unix/sysv/linux/arm/le/libc.abilist   |  1 +
 sysdeps/unix/sysv/linux/bits/rseq.h           | 29 +++++++
 sysdeps/unix/sysv/linux/csky/libc.abilist     |  1 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist     |  1 +
 sysdeps/unix/sysv/linux/i386/libc.abilist     |  1 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist     |  1 +
 .../sysv/linux/m68k/coldfire/libc.abilist     |  1 +
 .../unix/sysv/linux/m68k/m680x0/libc.abilist  |  1 +
 .../sysv/linux/microblaze/be/libc.abilist     |  1 +
 .../sysv/linux/microblaze/le/libc.abilist     |  1 +
 sysdeps/unix/sysv/linux/mips/bits/rseq.h      | 62 ++++++++++++++
 .../sysv/linux/mips/mips32/fpu/libc.abilist   |  1 +
 .../sysv/linux/mips/mips32/nofpu/libc.abilist |  1 +
 .../sysv/linux/mips/mips64/n32/libc.abilist   |  1 +
 .../sysv/linux/mips/mips64/n64/libc.abilist   |  1 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist    |  1 +
 sysdeps/unix/sysv/linux/powerpc/bits/rseq.h   | 37 +++++++++
 .../linux/powerpc/powerpc32/fpu/libc.abilist  |  1 +
 .../powerpc/powerpc32/nofpu/libc.abilist      |  1 +
 .../linux/powerpc/powerpc64/be/libc.abilist   |  1 +
 .../linux/powerpc/powerpc64/le/libc.abilist   |  1 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |  1 +
 sysdeps/unix/sysv/linux/rseq-internal.h       | 77 +++++++++++++++++
 sysdeps/unix/sysv/linux/rseq-sym.c            | 43 ++++++++++
 sysdeps/unix/sysv/linux/s390/bits/rseq.h      | 37 +++++++++
 .../unix/sysv/linux/s390/s390-32/libc.abilist |  1 +
 .../unix/sysv/linux/s390/s390-64/libc.abilist |  1 +
 sysdeps/unix/sysv/linux/sh/be/libc.abilist    |  1 +
 sysdeps/unix/sysv/linux/sh/le/libc.abilist    |  1 +
 .../sysv/linux/sparc/sparc32/libc.abilist     |  1 +
 .../sysv/linux/sparc/sparc64/libc.abilist     |  1 +
 sysdeps/unix/sysv/linux/sys/rseq.h            | 30 +++++++
 sysdeps/unix/sysv/linux/x86/bits/rseq.h       | 30 +++++++
 .../unix/sysv/linux/x86_64/64/libc.abilist    |  1 +
 .../unix/sysv/linux/x86_64/x32/libc.abilist   |  1 +
 47 files changed, 593 insertions(+), 4 deletions(-)
 create mode 100644 misc/rseq-internal.h
 create mode 100644 sysdeps/unix/sysv/linux/aarch64/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/arm/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/mips/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/powerpc/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/rseq-internal.h
 create mode 100644 sysdeps/unix/sysv/linux/rseq-sym.c
 create mode 100644 sysdeps/unix/sysv/linux/s390/bits/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/sys/rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/x86/bits/rseq.h

diff --git a/NEWS b/NEWS
index b85989ec3d..42f9b1c134 100644
--- a/NEWS
+++ b/NEWS
@@ -49,6 +49,16 @@ Major new features:
   responses, indicating a lack of DNSSEC validation.  (Therefore, the name
   servers and the network path to them are treated as untrusted.)
 
+* Support for automatically registering threads with the Linux rseq(2)
+  system call has been added.  This system call is implemented starting
+  from Linux 4.18.  The Restartable Sequences ABI accelerates user-space
+  operations on per-cpu data.  It allows user-space to perform updates
+  on per-cpu data without requiring heavy-weight atomic operations.
+  Automatically registering threads allows all libraries, including libc,
+  to make immediate use of the rseq(2) support by using the documented ABI.
+  See 'man 2 rseq' for the details of the ABI shared between libc and the
+  kernel.
+
 Deprecated and removed features, and other changes affecting compatibility:
 
 * The totalorder and totalordermag functions, and the corresponding
diff --git a/elf/libc_early_init.c b/elf/libc_early_init.c
index 1ac66d895d..30466afea0 100644
--- a/elf/libc_early_init.c
+++ b/elf/libc_early_init.c
@@ -18,10 +18,13 @@
 
 #include <ctype.h>
 #include <libc-early-init.h>
+#include <rseq-internal.h>
 
 void
 __libc_early_init (void)
 {
   /* Initialize ctype data.  */
   __ctype_init ();
+  /* Register rseq ABI to the kernel.   */
+  (void) rseq_register_current_thread ();
 }
diff --git a/manual/threads.texi b/manual/threads.texi
index 0858ef8f92..59f634e432 100644
--- a/manual/threads.texi
+++ b/manual/threads.texi
@@ -9,8 +9,10 @@ This chapter describes functions used for managing threads.
 POSIX threads.
 
 @menu
-* ISO C Threads::	Threads based on the ISO C specification.
-* POSIX Threads::	Threads based on the POSIX specification.
+* ISO C Threads::		Threads based on the ISO C specification.
+* POSIX Threads::		Threads based on the POSIX specification.
+* Restartable Sequences::	Linux-specific Restartable Sequences
+				integration.
 @end menu
 
 
@@ -881,3 +883,27 @@ Behaves like @code{pthread_timedjoin_np} except that the absolute time in
 @c pthread_spin_unlock
 @c pthread_testcancel
 @c pthread_yield
+
+@node Restartable Sequences
+@section Restartable Sequences
+@cindex rseq
+
+This section describes @theglibc{} Restartable Sequences integration.
+
+@deftypevar {struct rseq} __rseq_abi
+@standards{GNU, sys/rseq.h}
+@Theglibc{} implements a @code{__rseq_abi} TLS symbol to interact with the
+Restartable Sequences system call (Linux-specific).  The layout of this
+structure is defined by the Linux kernel @file{linux/rseq.h} UAPI.
+Registration of each thread's @code{__rseq_abi} is performed by
+@theglibc{} at libc initialization and pthread creation.
+@end deftypevar
+
+@deftypevr Macro int RSEQ_SIG
+@standards{GNU, sys/rseq.h}
+Each supported architecture provide a @code{RSEQ_SIG} macro in
+@file{sys/rseq.h} which contains a signature.  That signature is expected to be
+present in the code before each Restartable Sequences abort handler.  Failure
+to provide the expected signature may terminate the process with a Segmentation
+fault.
+@end deftypevr
diff --git a/misc/rseq-internal.h b/misc/rseq-internal.h
new file mode 100644
index 0000000000..df8fc6c006
--- /dev/null
+++ b/misc/rseq-internal.h
@@ -0,0 +1,33 @@
+/* Restartable Sequences internal API. Stub version.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef RSEQ_INTERNAL_H
+#define RSEQ_INTERNAL_H
+
+static inline int
+rseq_register_current_thread (void)
+{
+  return -1;
+}
+
+static inline int
+rseq_unregister_current_thread (void)
+{
+  return -1;
+}
+
+#endif /* rseq-internal.h */
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 5b3d4382b3..94dd30b173 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -33,6 +33,7 @@
 #include <default-sched.h>
 #include <futex-internal.h>
 #include <tls-setup.h>
+#include <rseq-internal.h>
 #include "libioP.h"
 
 #include <shlib-compat.h>
@@ -384,6 +385,9 @@ START_THREAD_DEFN
   /* Initialize pointers to locale data.  */
   __ctype_init ();
 
+  /* Register rseq TLS to the kernel. */
+  (void) rseq_register_current_thread ();
+
 #ifdef __NR_set_robust_list
 # ifndef __ASSUME_SET_ROBUST_LIST
   if (__set_robust_list_avail >= 0)
@@ -581,6 +585,14 @@ START_THREAD_DEFN
      process is really dead since 'clone' got passed the CLONE_CHILD_CLEARTID
      flag.  The 'tid' field in the TCB will be set to zero.
 
+     rseq TLS is still registered at this point. Rely on implicit unregistration
+     performed by the kernel on thread teardown. This is not a problem because the
+     rseq TLS lives on the stack, and the stack outlives the thread. If TCB
+     allocation is ever changed, additional steps may be required, such as
+     performing explicit rseq unregistration before reclaiming the rseq TLS area
+     memory. It is NOT sufficient to block signals because the kernel may write
+     to the rseq area even without signals.
+
      The exit code is zero since in case all threads exit by calling
      'pthread_exit' the exit status must be 0 (zero).  */
   __exit_thread ();
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index f12b7b1a2d..36a0fd231b 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -41,7 +41,7 @@ update-syscall-lists: arch-syscall.h
 endif
 
 ifeq ($(subdir),csu)
-sysdep_routines += errno-loc
+sysdep_routines += errno-loc rseq-sym
 endif
 
 ifeq ($(subdir),assert)
@@ -91,7 +91,8 @@ sysdep_headers += sys/mount.h sys/acct.h sys/sysctl.h \
 		  bits/termios-baud.h bits/termios-c_cflag.h \
 		  bits/termios-c_lflag.h bits/termios-tcflow.h \
 		  bits/termios-misc.h \
-		  bits/ipc-perm.h
+		  bits/ipc-perm.h \
+		  sys/rseq.h bits/rseq.h
 
 tests += tst-clone tst-clone2 tst-clone3 tst-fanotify tst-personality \
 	 tst-quota tst-sync_file_range tst-sysconf-iov_max tst-ttyname \
diff --git a/sysdeps/unix/sysv/linux/Versions b/sysdeps/unix/sysv/linux/Versions
index d385085c61..7f0da50580 100644
--- a/sysdeps/unix/sysv/linux/Versions
+++ b/sysdeps/unix/sysv/linux/Versions
@@ -177,6 +177,9 @@ libc {
   GLIBC_2.30 {
     getdents64; gettid; tgkill;
   }
+  GLIBC_2.31 {
+    __rseq_abi;
+  }
   GLIBC_PRIVATE {
     # functions used in other libraries
     __syscall_rt_sigqueueinfo;
diff --git a/sysdeps/unix/sysv/linux/aarch64/bits/rseq.h b/sysdeps/unix/sysv/linux/aarch64/bits/rseq.h
new file mode 100644
index 0000000000..35fcc41f1e
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/aarch64/bits/rseq.h
@@ -0,0 +1,43 @@
+/* Restartable Sequences Linux aarch64 architecture header.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries. It needs to be defined for each
+   architecture. When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.
+
+   aarch64 -mbig-endian generates mixed endianness code vs data:
+   little-endian code and big-endian data. Ensure the RSEQ_SIG signature
+   matches code endianness.  */
+
+#define RSEQ_SIG_CODE	0xd428bc00	/* BRK #0x45E0.  */
+
+#ifdef __AARCH64EB__
+#define RSEQ_SIG_DATA	0x00bc28d4	/* BRK #0x45E0.  */
+#else
+#define RSEQ_SIG_DATA	RSEQ_SIG_CODE
+#endif
+
+#define RSEQ_SIG	RSEQ_SIG_DATA
diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
index a4c31932cb..6784f13c09 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
@@ -2145,3 +2145,4 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist
index e7f2174ac2..71db8422a2 100644
--- a/sysdeps/unix/sysv/linux/alpha/libc.abilist
+++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist
@@ -2225,6 +2225,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/arm/be/libc.abilist b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
index b152c0e24a..eecebc908f 100644
--- a/sysdeps/unix/sysv/linux/arm/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
@@ -133,6 +133,7 @@ GLIBC_2.30 twalk_r F
 GLIBC_2.31 msgctl F
 GLIBC_2.31 semctl F
 GLIBC_2.31 shmctl F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 _Exit F
 GLIBC_2.4 _IO_2_1_stderr_ D 0xa0
 GLIBC_2.4 _IO_2_1_stdin_ D 0xa0
diff --git a/sysdeps/unix/sysv/linux/arm/bits/rseq.h b/sysdeps/unix/sysv/linux/arm/bits/rseq.h
new file mode 100644
index 0000000000..cd00513bfb
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/arm/bits/rseq.h
@@ -0,0 +1,83 @@
+/* Restartable Sequences Linux arm architecture header.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/*
+   RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries. It needs to be defined for each
+   architecture. When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.
+
+   - ARM little endian
+
+   RSEQ_SIG uses the udf A32 instruction with an uncommon immediate operand
+   value 0x5de3. This traps if user-space reaches this instruction by mistake,
+   and the uncommon operand ensures the kernel does not move the instruction
+   pointer to attacker-controlled code on rseq abort.
+
+   The instruction pattern in the A32 instruction set is:
+
+   e7f5def3    udf    #24035    ; 0x5de3
+
+   This translates to the following instruction pattern in the T16 instruction
+   set:
+
+   little endian:
+   def3        udf    #243      ; 0xf3
+   e7f5        b.n    <7f5>
+
+   - ARMv6+ big endian (BE8):
+
+   ARMv6+ -mbig-endian generates mixed endianness code vs data: little-endian
+   code and big-endian data. The data value of the signature needs to have its
+   byte order reversed to generate the trap instruction:
+
+   Data: 0xf3def5e7
+
+   Translates to this A32 instruction pattern:
+
+   e7f5def3    udf    #24035    ; 0x5de3
+
+   Translates to this T16 instruction pattern:
+
+   def3        udf    #243      ; 0xf3
+   e7f5        b.n    <7f5>
+
+   - Prior to ARMv6 big endian (BE32):
+
+   Prior to ARMv6, -mbig-endian generates big-endian code and data
+   (which match), so the endianness of the data representation of the
+   signature should not be reversed. However, the choice between BE32
+   and BE8 is done by the linker, so we cannot know whether code and
+   data endianness will be mixed before the linker is invoked. So rather
+   than try to play tricks with the linker, the rseq signature is simply
+   data (not a trap instruction) prior to ARMv6 on big endian. This is
+   why the signature is expressed as data (.word) rather than as
+   instruction (.inst) in assembler.  */
+
+#ifdef __ARMEB__
+#define RSEQ_SIG    0xf3def5e7      /* udf    #24035    ; 0x5de3 (ARMv6+) */
+#else
+#define RSEQ_SIG    0xe7f5def3      /* udf    #24035    ; 0x5de3 */
+#endif
diff --git a/sysdeps/unix/sysv/linux/arm/le/libc.abilist b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
index 9371927927..3e7434d3d5 100644
--- a/sysdeps/unix/sysv/linux/arm/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
@@ -130,6 +130,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 _Exit F
 GLIBC_2.4 _IO_2_1_stderr_ D 0xa0
 GLIBC_2.4 _IO_2_1_stdin_ D 0xa0
diff --git a/sysdeps/unix/sysv/linux/bits/rseq.h b/sysdeps/unix/sysv/linux/bits/rseq.h
new file mode 100644
index 0000000000..a3c023f5c7
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/bits/rseq.h
@@ -0,0 +1,29 @@
+/* Restartable Sequences architecture header. Stub version.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries. It needs to be defined for each
+   architecture. When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.  */
diff --git a/sysdeps/unix/sysv/linux/csky/libc.abilist b/sysdeps/unix/sysv/linux/csky/libc.abilist
index 9b3cee65bb..b7ed346b1c 100644
--- a/sysdeps/unix/sysv/linux/csky/libc.abilist
+++ b/sysdeps/unix/sysv/linux/csky/libc.abilist
@@ -2089,3 +2089,4 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist
index df6d96fbae..d55b153c48 100644
--- a/sysdeps/unix/sysv/linux/hppa/libc.abilist
+++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist
@@ -2046,6 +2046,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist
index fcb625b6bf..c9061600f6 100644
--- a/sysdeps/unix/sysv/linux/i386/libc.abilist
+++ b/sysdeps/unix/sysv/linux/i386/libc.abilist
@@ -2212,6 +2212,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/ia64/libc.abilist b/sysdeps/unix/sysv/linux/ia64/libc.abilist
index cb556c5998..f794303f0e 100644
--- a/sysdeps/unix/sysv/linux/ia64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/ia64/libc.abilist
@@ -2078,6 +2078,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
index 5e3cdea246..e5e545f3af 100644
--- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
@@ -134,6 +134,7 @@ GLIBC_2.30 twalk_r F
 GLIBC_2.31 msgctl F
 GLIBC_2.31 semctl F
 GLIBC_2.31 shmctl F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 _Exit F
 GLIBC_2.4 _IO_2_1_stderr_ D 0x98
 GLIBC_2.4 _IO_2_1_stdin_ D 0x98
diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
index ea5e7a41af..3fc1223e2c 100644
--- a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
@@ -2158,6 +2158,7 @@ GLIBC_2.30 twalk_r F
 GLIBC_2.31 msgctl F
 GLIBC_2.31 semctl F
 GLIBC_2.31 shmctl F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
index ac55b0acd7..81db30b543 100644
--- a/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
@@ -2140,3 +2140,4 @@ GLIBC_2.30 twalk_r F
 GLIBC_2.31 msgctl F
 GLIBC_2.31 semctl F
 GLIBC_2.31 shmctl F
+GLIBC_2.31 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
index f7ced487f7..a2ce147dde 100644
--- a/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
@@ -2137,3 +2137,4 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/mips/bits/rseq.h b/sysdeps/unix/sysv/linux/mips/bits/rseq.h
new file mode 100644
index 0000000000..8c75f107e7
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/mips/bits/rseq.h
@@ -0,0 +1,62 @@
+/* Restartable Sequences Linux mips architecture header.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries. It needs to be defined for each
+   architecture. When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.
+
+   RSEQ_SIG uses the break instruction. The instruction pattern is:
+
+   On MIPS:
+        0350000d        break     0x350
+
+   On nanoMIPS:
+        00100350        break     0x350
+
+   On microMIPS:
+        0000d407        break     0x350
+
+   For nanoMIPS32 and microMIPS, the instruction stream is encoded as
+   16-bit halfwords, so the signature halfwords need to be swapped
+   accordingly for little-endian.  */
+
+#if defined(__nanomips__)
+# ifdef __MIPSEL__
+#  define RSEQ_SIG	0x03500010
+# else
+#  define RSEQ_SIG	0x00100350
+# endif
+#elif defined(__mips_micromips)
+# ifdef __MIPSEL__
+#  define RSEQ_SIG	0xd4070000
+# else
+#  define RSEQ_SIG	0x0000d407
+# endif
+#elif defined(__mips__)
+# define RSEQ_SIG	0x0350000d
+#else
+/* Unknown MIPS architecture. */
+#endif
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
index 06c2e64edd..c0040ddd4e 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
@@ -2129,6 +2129,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
index bdfd073b86..61f19076eb 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
@@ -2127,6 +2127,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
index 3d61d4974a..df4f3a3c04 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
@@ -2135,6 +2135,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
index 675acca5db..a96de2e467 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
@@ -2129,6 +2129,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/nios2/libc.abilist b/sysdeps/unix/sysv/linux/nios2/libc.abilist
index 7fec0c9670..7b2ccbe953 100644
--- a/sysdeps/unix/sysv/linux/nios2/libc.abilist
+++ b/sysdeps/unix/sysv/linux/nios2/libc.abilist
@@ -2178,3 +2178,4 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/powerpc/bits/rseq.h b/sysdeps/unix/sysv/linux/powerpc/bits/rseq.h
new file mode 100644
index 0000000000..bae8f4aaa1
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/bits/rseq.h
@@ -0,0 +1,37 @@
+/* Restartable Sequences Linux powerpc architecture header.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries. It needs to be defined for each
+   architecture. When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.
+
+   RSEQ_SIG uses the following trap instruction:
+
+   powerpc-be:    0f e5 00 0b           twui   r5,11
+   powerpc64-le:  0b 00 e5 0f           twui   r5,11
+   powerpc64-be:  0f e5 00 0b           twui   r5,11  */
+
+#define RSEQ_SIG	0x0fe5000b
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
index 1e8ff6f83e..6f4c6515dc 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
@@ -2185,6 +2185,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
index b5a0751d90..f9875b4e22 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
@@ -2218,6 +2218,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
index 0c86217fc6..db06080db8 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
@@ -2048,6 +2048,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
index 2229a1dcc0..608ad49593 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
@@ -2247,3 +2247,4 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
index 31010e6cf7..c7657ce7f6 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
@@ -2107,3 +2107,4 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
diff --git a/sysdeps/unix/sysv/linux/rseq-internal.h b/sysdeps/unix/sysv/linux/rseq-internal.h
new file mode 100644
index 0000000000..1dd3b9a968
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/rseq-internal.h
@@ -0,0 +1,77 @@
+/* Restartable Sequences internal API. Linux implementation.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef RSEQ_INTERNAL_H
+#define RSEQ_INTERNAL_H
+
+#include <sysdep.h>
+#include <errno.h>
+
+#ifdef __NR_rseq
+#include <sys/rseq.h>
+#endif
+
+#if defined __NR_rseq && defined RSEQ_SIG
+
+static inline int
+rseq_register_current_thread (void)
+{
+  int rc, ret = 0;
+  INTERNAL_SYSCALL_DECL (err);
+
+  if (__rseq_abi.cpu_id == RSEQ_CPU_ID_REGISTRATION_FAILED)
+    return -1;
+  rc = INTERNAL_SYSCALL_CALL (rseq, err, &__rseq_abi, sizeof (struct rseq),
+                              0, RSEQ_SIG);
+  if (!rc)
+    goto end;
+  if (INTERNAL_SYSCALL_ERRNO (rc, err) != EBUSY)
+    __rseq_abi.cpu_id = RSEQ_CPU_ID_REGISTRATION_FAILED;
+  ret = -1;
+end:
+  return ret;
+}
+
+static inline int
+rseq_unregister_current_thread (void)
+{
+  int rc, ret = 0;
+  INTERNAL_SYSCALL_DECL (err);
+
+  rc = INTERNAL_SYSCALL_CALL (rseq, err, &__rseq_abi, sizeof (struct rseq),
+                              RSEQ_FLAG_UNREGISTER, RSEQ_SIG);
+  if (!rc)
+    goto end;
+  ret = -1;
+end:
+  return ret;
+}
+#else
+static inline int
+rseq_register_current_thread (void)
+{
+  return -1;
+}
+
+static inline int
+rseq_unregister_current_thread (void)
+{
+  return -1;
+}
+#endif
+
+#endif /* rseq-internal.h */
diff --git a/sysdeps/unix/sysv/linux/rseq-sym.c b/sysdeps/unix/sysv/linux/rseq-sym.c
new file mode 100644
index 0000000000..f86869a380
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/rseq-sym.c
@@ -0,0 +1,43 @@
+/* Restartable Sequences exported symbols. Linux Implementation.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sys/syscall.h>
+#include <stdint.h>
+
+#ifdef __NR_rseq
+#include <sys/rseq.h>
+#else
+
+enum rseq_cpu_id_state {
+  RSEQ_CPU_ID_UNINITIALIZED = -1,
+  RSEQ_CPU_ID_REGISTRATION_FAILED = -2,
+};
+
+/* linux/rseq.h defines struct rseq as aligned on 32 bytes. The kernel ABI
+   size is 20 bytes.  */
+struct rseq {
+  uint32_t cpu_id_start;
+  uint32_t cpu_id;
+  uint64_t rseq_cs;
+  uint32_t flags;
+} __attribute__ ((aligned(4 * sizeof(uint64_t))));
+
+#endif
+
+__thread struct rseq __rseq_abi = {
+  .cpu_id = RSEQ_CPU_ID_UNINITIALIZED,
+};
diff --git a/sysdeps/unix/sysv/linux/s390/bits/rseq.h b/sysdeps/unix/sysv/linux/s390/bits/rseq.h
new file mode 100644
index 0000000000..453250d761
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/s390/bits/rseq.h
@@ -0,0 +1,37 @@
+/* Restartable Sequences Linux s390 architecture header.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   It is a 32-bit value that maps to actual architecture code compiled
+   into applications and libraries. It needs to be defined for each
+   architecture. When choosing this value, it needs to be taken into
+   account that generating invalid instructions may have ill effects on
+   tools like objdump, and may also have impact on the CPU speculative
+   execution efficiency in some cases.
+
+   RSEQ_SIG uses the trap4 instruction. As Linux does not make use of the
+   access-register mode nor the linkage stack this instruction will always
+   cause a special-operation exception (the trap-enabled bit in the DUCT
+   is and will stay 0). The instruction pattern is
+       b2 ff 0f ff        trap4   4095(%r0)  */
+
+#define RSEQ_SIG	0xB2FF0FFF
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
index 4feca641b0..9c29ec0d2d 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
@@ -2183,6 +2183,7 @@ GLIBC_2.30 twalk_r F
 GLIBC_2.31 msgctl F
 GLIBC_2.31 semctl F
 GLIBC_2.31 shmctl F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
index efe588a072..99424ceac9 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
@@ -2084,6 +2084,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/sh/be/libc.abilist b/sysdeps/unix/sysv/linux/sh/be/libc.abilist
index 6bfc2b7439..80198de5d6 100644
--- a/sysdeps/unix/sysv/linux/sh/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/be/libc.abilist
@@ -2053,6 +2053,7 @@ GLIBC_2.30 twalk_r F
 GLIBC_2.31 msgctl F
 GLIBC_2.31 semctl F
 GLIBC_2.31 shmctl F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/sh/le/libc.abilist b/sysdeps/unix/sysv/linux/sh/le/libc.abilist
index 4b057bf4a2..916aa0b7f0 100644
--- a/sysdeps/unix/sysv/linux/sh/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/le/libc.abilist
@@ -2050,6 +2050,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
index 49cd597fd6..9a27df8e43 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
@@ -2174,6 +2174,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 _IO_fprintf F
 GLIBC_2.4 _IO_printf F
 GLIBC_2.4 _IO_sprintf F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
index 95e68e0ba1..32908666c4 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
@@ -2101,6 +2101,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/sys/rseq.h b/sysdeps/unix/sysv/linux/sys/rseq.h
new file mode 100644
index 0000000000..e675219ace
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/sys/rseq.h
@@ -0,0 +1,30 @@
+/* Restartable Sequences exported symbols. Linux header.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+#define _SYS_RSEQ_H	1
+
+/* We use the structures declarations from the kernel headers.  */
+#include <linux/rseq.h>
+/* Architecture-specific rseq signature.  */
+#include <bits/rseq.h>
+#include <stdint.h>
+
+extern __thread struct rseq __rseq_abi
+__attribute__ ((tls_model ("initial-exec")));
+
+#endif /* sys/rseq.h */
diff --git a/sysdeps/unix/sysv/linux/x86/bits/rseq.h b/sysdeps/unix/sysv/linux/x86/bits/rseq.h
new file mode 100644
index 0000000000..a2918c4617
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/x86/bits/rseq.h
@@ -0,0 +1,30 @@
+/* Restartable Sequences Linux x86 architecture header.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _SYS_RSEQ_H
+# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
+#endif
+
+/* RSEQ_SIG is a signature required before each abort handler code.
+
+   RSEQ_SIG is used with the following reserved undefined instructions, which
+   trap in user-space:
+
+   x86-32:    0f b9 3d 53 30 05 53      ud1    0x53053053,%edi
+   x86-64:    0f b9 3d 53 30 05 53      ud1    0x53053053(%rip),%edi  */
+
+#define RSEQ_SIG	0x53053053
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
index 1f2dbd1451..7366565608 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
@@ -2059,6 +2059,7 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
 GLIBC_2.4 __confstr_chk F
 GLIBC_2.4 __fgets_chk F
 GLIBC_2.4 __fgets_unlocked_chk F
diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
index 59da85a5d8..c1aa86f06e 100644
--- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
@@ -2158,3 +2158,4 @@ GLIBC_2.30 getdents64 F
 GLIBC_2.30 gettid F
 GLIBC_2.30 tgkill F
 GLIBC_2.30 twalk_r F
+GLIBC_2.31 __rseq_abi T 0x20
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH glibc 5/8] glibc: sched_getcpu(): use rseq cpu_id TLS on Linux (v5)
  2020-01-08 14:58 [RFC PATCH glibc 0/8] Restartable Sequences enablement Mathieu Desnoyers
                   ` (3 preceding siblings ...)
  2020-01-08 14:58 ` [RFC PATCH glibc 4/8] glibc: Perform rseq(2) registration at C startup and thread creation (v14) Mathieu Desnoyers
@ 2020-01-08 14:58 ` Mathieu Desnoyers
  2020-01-08 14:58 ` [RFC PATCH glibc 6/8] support record failure: allow use from constructor Mathieu Desnoyers
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Mathieu Desnoyers @ 2020-01-08 14:58 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Florian Weimer, Joseph Myers, Szabolcs Nagy, libc-alpha,
	Mathieu Desnoyers, Thomas Gleixner, Ben Maurer, Peter Zijlstra,
	Paul E. McKenney, Boqun Feng, Will Deacon, Dave Watson,
	Paul Turner, linux-kernel, linux-api

When available, use the cpu_id field from __rseq_abi on Linux to
implement sched_getcpu(). Fall-back on the vgetcpu vDSO if unavailable.

Benchmarks:

x86-64: Intel E5-2630 v3@2.40GHz, 16-core, hyperthreading

glibc sched_getcpu():                     13.7 ns (baseline)
glibc sched_getcpu() using rseq:           2.5 ns (speedup:  5.5x)
inline load cpuid from __rseq_abi TLS:     0.8 ns (speedup: 17.1x)

	* sysdeps/unix/sysv/linux/sched_getcpu.c: use rseq cpu_id TLS on
	Linux.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Carlos O'Donell <carlos@redhat.com>
CC: Florian Weimer <fweimer@redhat.com>
CC: Joseph Myers <joseph@codesourcery.com>
CC: Szabolcs Nagy <szabolcs.nagy@arm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ben Maurer <bmaurer@fb.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Paul Turner <pjt@google.com>
CC: libc-alpha@sourceware.org
CC: linux-kernel@vger.kernel.org
CC: linux-api@vger.kernel.org
---
Changes since v1:
- rseq is only used if both __NR_rseq and RSEQ_SIG are defined.

Changes since v2:
- remove duplicated __rseq_abi extern declaration.

Changes since v3:
- update ChangeLog.

Changes since v4:
- Use atomic_load_relaxed to load the __rseq_abi.cpu_id field, a
  consequence of the fact that __rseq_abi is not volatile anymore.
- Include atomic.h which provides atomic_load_relaxed.
---
 sysdeps/unix/sysv/linux/sched_getcpu.c | 29 ++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/sysdeps/unix/sysv/linux/sched_getcpu.c b/sysdeps/unix/sysv/linux/sched_getcpu.c
index c019cfb3cf..6d3541333d 100644
--- a/sysdeps/unix/sysv/linux/sched_getcpu.c
+++ b/sysdeps/unix/sysv/linux/sched_getcpu.c
@@ -18,10 +18,15 @@
 #include <errno.h>
 #include <sched.h>
 #include <sysdep.h>
+#include <atomic.h>
 #include <sysdep-vdso.h>
 
-int
-sched_getcpu (void)
+#ifdef HAVE_GETCPU_VSYSCALL
+# define HAVE_VSYSCALL
+#endif
+
+static int
+vsyscall_sched_getcpu (void)
 {
   unsigned int cpu;
   int r = -1;
@@ -32,3 +37,23 @@ sched_getcpu (void)
 #endif
   return r == -1 ? r : cpu;
 }
+
+#ifdef __NR_rseq
+#include <sys/rseq.h>
+#endif
+
+#if defined __NR_rseq && defined RSEQ_SIG
+int
+sched_getcpu (void)
+{
+  int cpu_id = atomic_load_relaxed (&__rseq_abi.cpu_id);
+
+  return cpu_id >= 0 ? cpu_id : vsyscall_sched_getcpu ();
+}
+#else
+int
+sched_getcpu (void)
+{
+  return vsyscall_sched_getcpu ();
+}
+#endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH glibc 6/8] support record failure: allow use from constructor
  2020-01-08 14:58 [RFC PATCH glibc 0/8] Restartable Sequences enablement Mathieu Desnoyers
                   ` (4 preceding siblings ...)
  2020-01-08 14:58 ` [RFC PATCH glibc 5/8] glibc: sched_getcpu(): use rseq cpu_id TLS on Linux (v5) Mathieu Desnoyers
@ 2020-01-08 14:58 ` Mathieu Desnoyers
  2020-01-08 14:58 ` [RFC PATCH glibc 7/8] support: implement xpthread key create/delete (v3) Mathieu Desnoyers
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Mathieu Desnoyers @ 2020-01-08 14:58 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Florian Weimer, Joseph Myers, Szabolcs Nagy, libc-alpha,
	Mathieu Desnoyers

Expose support_record_failure_init () so constructors can explicitly
initialize the record failure API.

This is preferred to lazy initialization at first use, because
lazy initialization does not cover use in constructors within
forked children processes (forked from parent constructor).

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
CC: Carlos O'Donell <carlos@redhat.com>
CC: Florian Weimer <fweimer@redhat.com>
CC: Joseph Myers <joseph@codesourcery.com>
CC: Szabolcs Nagy <szabolcs.nagy@arm.com>
CC: libc-alpha@sourceware.org
---
 support/check.h                  |  4 ++++
 support/support_record_failure.c | 18 +++++++++++++-----
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/support/check.h b/support/check.h
index 77d1d1e14d..902cea6878 100644
--- a/support/check.h
+++ b/support/check.h
@@ -88,6 +88,10 @@ void support_test_verify_exit_impl (int status, const char *file, int line,
    does not support reporting failures from a DSO.  */
 void support_record_failure (void);
 
+/* Initialize record failure.  Calling this is only needed when
+   recording failures from constructors.  */
+void support_record_failure_init (void);
+
 /* Static assertion, under a common name for both C++ and C11.  */
 #ifdef __cplusplus
 # define support_static_assert static_assert
diff --git a/support/support_record_failure.c b/support/support_record_failure.c
index f766c06236..957572f487 100644
--- a/support/support_record_failure.c
+++ b/support/support_record_failure.c
@@ -32,8 +32,12 @@
    zero, the failure of a test can be detected.
 
    The init constructor function below puts *state on a shared
-   annonymous mapping, so that failure reports from subprocesses
-   propagate to the parent process.  */
+   anonymous mapping, so that failure reports from subprocesses
+   propagate to the parent process.
+
+   support_record_failure_init is exposed so it can be called explicitly
+   in case this API needs to be used from a constructor.  */
+
 struct test_failures
 {
   unsigned int counter;
@@ -41,10 +45,14 @@ struct test_failures
 };
 static struct test_failures *state;
 
-static __attribute__ ((constructor)) void
-init (void)
+__attribute__ ((constructor)) void
+support_record_failure_init (void)
 {
-  void *ptr = mmap (NULL, sizeof (*state), PROT_READ | PROT_WRITE,
+  void *ptr;
+
+  if (state != NULL)
+    return;
+  ptr = mmap (NULL, sizeof (*state), PROT_READ | PROT_WRITE,
                     MAP_ANONYMOUS | MAP_SHARED, -1, 0);
   if (ptr == MAP_FAILED)
     {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH glibc 7/8] support: implement xpthread key create/delete (v3)
  2020-01-08 14:58 [RFC PATCH glibc 0/8] Restartable Sequences enablement Mathieu Desnoyers
                   ` (5 preceding siblings ...)
  2020-01-08 14:58 ` [RFC PATCH glibc 6/8] support record failure: allow use from constructor Mathieu Desnoyers
@ 2020-01-08 14:58 ` Mathieu Desnoyers
  2020-01-08 14:58 ` [RFC PATCH glibc 8/8] rseq registration tests (v7) Mathieu Desnoyers
  2020-01-08 15:31 ` [RFC PATCH glibc 0/8] Restartable Sequences enablement Joseph Myers
  8 siblings, 0 replies; 17+ messages in thread
From: Mathieu Desnoyers @ 2020-01-08 14:58 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Florian Weimer, Joseph Myers, Szabolcs Nagy, libc-alpha,
	Mathieu Desnoyers

Expose xpthread_key_create () and xpthread_key_delete () wrappers
for tests.

	* support/Makefile: Add xpthread_key_create and xpthread_key_delete.
	* support/xthread.h: Add prototype for xpthread_key_create and
	xpthread_key_delete.
	* support/xpthread_key_create.c: New file.
	* support/xpthread_key_delete.c: New file.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Carlos O'Donell <carlos@redhat.com>
CC: Florian Weimer <fweimer@redhat.com>
CC: Joseph Myers <joseph@codesourcery.com>
CC: Szabolcs Nagy <szabolcs.nagy@arm.com>
CC: libc-alpha@sourceware.org
---
Changes since v1:
- Update ChangeLog.
- Wrap long line in xpthread_key_create.

Changes since v2:
- Rebase on glibc 2.30.
---
 support/Makefile              |  2 ++
 support/xpthread_key_create.c | 25 +++++++++++++++++++++++++
 support/xpthread_key_delete.c | 24 ++++++++++++++++++++++++
 support/xthread.h             |  2 ++
 4 files changed, 53 insertions(+)
 create mode 100644 support/xpthread_key_create.c
 create mode 100644 support/xpthread_key_delete.c

diff --git a/support/Makefile b/support/Makefile
index 3325feb790..ab7bb3880f 100644
--- a/support/Makefile
+++ b/support/Makefile
@@ -127,6 +127,8 @@ libsupport-routines = \
   xpthread_create \
   xpthread_detach \
   xpthread_join \
+  xpthread_key_create \
+  xpthread_key_delete \
   xpthread_mutex_consistent \
   xpthread_mutex_destroy \
   xpthread_mutex_init \
diff --git a/support/xpthread_key_create.c b/support/xpthread_key_create.c
new file mode 100644
index 0000000000..fb5a89ab3a
--- /dev/null
+++ b/support/xpthread_key_create.c
@@ -0,0 +1,25 @@
+/* pthread_key_create with error checking.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <support/xthread.h>
+
+void
+xpthread_key_create (pthread_key_t *key, void (*destr_function) (void *))
+{
+  xpthread_check_return ("pthread_key_create",
+                         pthread_key_create (key, destr_function));
+}
diff --git a/support/xpthread_key_delete.c b/support/xpthread_key_delete.c
new file mode 100644
index 0000000000..423ff4584d
--- /dev/null
+++ b/support/xpthread_key_delete.c
@@ -0,0 +1,24 @@
+/* pthread_key_delete with error checking.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <support/xthread.h>
+
+void
+xpthread_key_delete (pthread_key_t key)
+{
+  xpthread_check_return ("pthread_key_delete", pthread_key_delete (key));
+}
diff --git a/support/xthread.h b/support/xthread.h
index d350d1506d..2a519874bf 100644
--- a/support/xthread.h
+++ b/support/xthread.h
@@ -95,6 +95,8 @@ void xpthread_rwlock_wrlock (pthread_rwlock_t *rwlock);
 void xpthread_rwlock_rdlock (pthread_rwlock_t *rwlock);
 void xpthread_rwlock_unlock (pthread_rwlock_t *rwlock);
 void xpthread_rwlock_destroy (pthread_rwlock_t *rwlock);
+void xpthread_key_create (pthread_key_t *key, void (*destr_function) (void *));
+void xpthread_key_delete (pthread_key_t key);
 
 __END_DECLS
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH glibc 8/8] rseq registration tests (v7)
  2020-01-08 14:58 [RFC PATCH glibc 0/8] Restartable Sequences enablement Mathieu Desnoyers
                   ` (6 preceding siblings ...)
  2020-01-08 14:58 ` [RFC PATCH glibc 7/8] support: implement xpthread key create/delete (v3) Mathieu Desnoyers
@ 2020-01-08 14:58 ` Mathieu Desnoyers
  2020-01-08 15:31 ` [RFC PATCH glibc 0/8] Restartable Sequences enablement Joseph Myers
  8 siblings, 0 replies; 17+ messages in thread
From: Mathieu Desnoyers @ 2020-01-08 14:58 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Florian Weimer, Joseph Myers, Szabolcs Nagy, libc-alpha,
	Mathieu Desnoyers, Thomas Gleixner, Ben Maurer, Peter Zijlstra,
	Paul E. McKenney, Boqun Feng, Will Deacon, Dave Watson,
	Paul Turner

These tests validate that rseq is registered from various execution
contexts (main thread, constructor, destructor, other threads, other
threads created from constructor and destructor, forked process
(without exec), pthread_atfork handlers, pthread setspecific
destructors, C++ thread and process destructors, signal handlers,
atexit handlers).

tst-rseq.c only links against libc.so, testing registration of rseq in
a non-multithreaded environment.

tst-rseq-nptl.c also links against libpthread.so, testing registration
of rseq in a multithreaded environment.

See the Linux kernel selftests for extensive rseq stress-tests.

	* sysdeps/unix/sysv/linux/Makefile: Add tst-rseq and tst-rseq-nptl.
	* sysdeps/unix/sysv/linux/tst-rseq-nptl.c: New file.
	* sysdeps/unix/sysv/linux/tst-rseq.c: Likewise.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Carlos O'Donell <carlos@redhat.com>
CC: Florian Weimer <fweimer@redhat.com>
CC: Joseph Myers <joseph@codesourcery.com>
CC: Szabolcs Nagy <szabolcs.nagy@arm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ben Maurer <bmaurer@fb.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Paul Turner <pjt@google.com>
CC: libc-alpha@sourceware.org
---
Changes since v1:
- Rename tst-rseq.c to tst-rseq-nptl.c.
- Introduce tst-rseq.c testing rseq registration in a non-multithreaded
  environment.

Chances since v2:
- Update file headers.
- use xpthread key create/delete.
- remove set stacksize.
- Tests depend on both __NR_rseq and RSEQ_SIG being defined.

Changes since v3:
- Update ChangeLog.

Changes since v4:
- Remove volatile from sys_rseq() rseq_abi parameter.
- Use atomic_load_relaxed to load __rseq_abi.cpu_id, consequence of the
  fact that __rseq_abi is not volatile anymore.
- Include atomic.h from tst-rseq.c for use of atomic_load_relaxed.
  Move tst-rseq.c to internal tests within Makefile due to its use of
  atomic.h.
- Test __rseq_handled initialization by glibc.

Changes since v5:
- Rebase on glibc 2.30.

Changes since v6:
- Remove __rseq_handled.
---
 sysdeps/unix/sysv/linux/Makefile        |   6 +-
 sysdeps/unix/sysv/linux/tst-rseq-nptl.c | 353 ++++++++++++++++++++++++
 sysdeps/unix/sysv/linux/tst-rseq.c      | 114 ++++++++
 3 files changed, 471 insertions(+), 2 deletions(-)
 create mode 100644 sysdeps/unix/sysv/linux/tst-rseq-nptl.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-rseq.c

diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index 36a0fd231b..a917147dff 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -99,7 +99,9 @@ tests += tst-clone tst-clone2 tst-clone3 tst-fanotify tst-personality \
 	 test-errno-linux tst-memfd_create tst-mlock2 tst-pkey \
 	 tst-rlimit-infinity tst-ofdlocks tst-gettid tst-gettid-kill \
 	 tst-tgkill
-tests-internal += tst-ofdlocks-compat tst-sigcontext-get_pc
+
+tests-internal += tst-ofdlocks-compat tst-sigcontext-get_pc \
+		  tst-ofdlocks-compat tst-rseq
 
 CFLAGS-tst-sigcontext-get_pc.c = -fasynchronous-unwind-tables
 
@@ -302,5 +304,5 @@ ifeq ($(subdir),nptl)
 tests += tst-align-clone tst-getpid1 \
 	tst-thread-affinity-pthread tst-thread-affinity-pthread2 \
 	tst-thread-affinity-sched
-tests-internal += tst-setgetname
+tests-internal += tst-setgetname tst-rseq-nptl
 endif
diff --git a/sysdeps/unix/sysv/linux/tst-rseq-nptl.c b/sysdeps/unix/sysv/linux/tst-rseq-nptl.c
new file mode 100644
index 0000000000..bfae56aec6
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-rseq-nptl.c
@@ -0,0 +1,353 @@
+/* Restartable Sequences NPTL test.
+
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* These tests validate that rseq is registered from various execution
+   contexts (main thread, constructor, destructor, other threads, other
+   threads created from constructor and destructor, forked process
+   (without exec), pthread_atfork handlers, pthread setspecific
+   destructors, C++ thread and process destructors, signal handlers,
+   atexit handlers).
+
+   See the Linux kernel selftests for extensive rseq stress-tests.  */
+
+#include <sys/syscall.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <support/check.h>
+#include <support/xthread.h>
+
+#ifdef __NR_rseq
+#include <sys/rseq.h>
+#endif
+
+#if defined __NR_rseq && defined RSEQ_SIG
+#include <pthread.h>
+#include <syscall.h>
+#include <stdlib.h>
+#include <error.h>
+#include <errno.h>
+#include <string.h>
+#include <stdint.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <signal.h>
+#include <atomic.h>
+
+static pthread_key_t rseq_test_key;
+
+static int
+rseq_thread_registered (void)
+{
+  return (int32_t) atomic_load_relaxed (&__rseq_abi.cpu_id) >= 0;
+}
+
+static int
+do_rseq_main_test (void)
+{
+  if (raise (SIGUSR1))
+    FAIL_EXIT1 ("error raising signal");
+  if (pthread_setspecific (rseq_test_key, (void *) 1l))
+    FAIL_EXIT1 ("error in pthread_setspecific");
+  if (!rseq_thread_registered ())
+    {
+      FAIL_RET ("rseq not registered in main thread");
+    }
+  return 0;
+}
+
+static void
+cancel_routine (void *arg)
+{
+  if (!rseq_thread_registered ())
+    {
+      printf ("rseq not registered in cancel routine\n");
+      support_record_failure ();
+    }
+}
+
+static int cancel_thread_ready;
+
+static void
+test_cancel_thread (void)
+{
+  pthread_cleanup_push (cancel_routine, NULL);
+  atomic_store_release (&cancel_thread_ready, 1);
+  for (;;)
+    usleep (100);
+  pthread_cleanup_pop (0);
+}
+
+static void *
+thread_function (void * arg)
+{
+  int i = (int) (intptr_t) arg;
+
+  if (raise (SIGUSR1))
+    FAIL_EXIT1 ("error raising signal");
+  if (i == 0)
+    test_cancel_thread ();
+  if (pthread_setspecific (rseq_test_key, (void *) 1l))
+    FAIL_EXIT1 ("error in pthread_setspecific");
+  return rseq_thread_registered () ? NULL : (void *) 1l;
+}
+
+static void
+sighandler (int sig)
+{
+  if (!rseq_thread_registered ())
+    {
+      printf ("rseq not registered in signal handler\n");
+      support_record_failure ();
+    }
+}
+
+static void
+setup_signals (void)
+{
+  struct sigaction sa;
+
+  sigemptyset (&sa.sa_mask);
+  sigaddset (&sa.sa_mask, SIGUSR1);
+  sa.sa_flags = 0;
+  sa.sa_handler = sighandler;
+  if (sigaction (SIGUSR1, &sa, NULL) != 0)
+    {
+      FAIL_EXIT1 ("sigaction failure: %s", strerror (errno));
+    }
+}
+
+#define N 7
+static const int t[N] = { 1, 2, 6, 5, 4, 3, 50 };
+
+static int
+do_rseq_threads_test (int nr_threads)
+{
+  pthread_t th[nr_threads];
+  int i;
+  int result = 0;
+
+  cancel_thread_ready = 0;
+  for (i = 0; i < nr_threads; ++i)
+    if (pthread_create (&th[i], NULL, thread_function,
+                        (void *) (intptr_t) i) != 0)
+      {
+        FAIL_EXIT1 ("creation of thread %d failed", i);
+      }
+
+  while (!atomic_load_acquire (&cancel_thread_ready))
+    usleep (100);
+
+  if (pthread_cancel (th[0]))
+    FAIL_EXIT1 ("error in pthread_cancel");
+
+  for (i = 0; i < nr_threads; ++i)
+    {
+      void *v;
+      if (pthread_join (th[i], &v) != 0)
+        {
+          printf ("join of thread %d failed\n", i);
+          result = 1;
+        }
+      else if (i != 0 && v != NULL)
+        {
+          printf ("join %d successful, but child failed\n", i);
+          result = 1;
+        }
+      else if (i == 0 && v == NULL)
+        {
+          printf ("join %d successful, child did not fail as expected\n", i);
+          result = 1;
+        }
+    }
+  return result;
+}
+
+static int
+sys_rseq (struct rseq *rseq_abi, uint32_t rseq_len, int flags, uint32_t sig)
+{
+  return syscall (__NR_rseq, rseq_abi, rseq_len, flags, sig);
+}
+
+static int
+rseq_available (void)
+{
+  int rc;
+
+  rc = sys_rseq (NULL, 0, 0, 0);
+  if (rc != -1)
+    FAIL_EXIT1 ("Unexpected rseq return value %d", rc);
+  switch (errno)
+    {
+    case ENOSYS:
+      return 0;
+    case EINVAL:
+      return 1;
+    default:
+      FAIL_EXIT1 ("Unexpected rseq error %s", strerror (errno));
+    }
+}
+
+static int
+do_rseq_fork_test (void)
+{
+  int status;
+  pid_t pid, retpid;
+
+  pid = fork ();
+  switch (pid)
+    {
+      case 0:
+        exit (do_rseq_main_test ());
+      case -1:
+        FAIL_EXIT1 ("Unexpected fork error %s", strerror (errno));
+    }
+  retpid = TEMP_FAILURE_RETRY (waitpid (pid, &status, 0));
+  if (retpid != pid)
+    {
+      FAIL_EXIT1 ("waitpid returned %ld, expected %ld",
+                  (long int) retpid, (long int) pid);
+    }
+  if (WEXITSTATUS (status))
+    {
+      printf ("rseq not registered in child\n");
+      return 1;
+    }
+  return 0;
+}
+
+static int
+do_rseq_test (void)
+{
+  int i, result = 0;
+
+  if (!rseq_available ())
+    {
+      FAIL_UNSUPPORTED ("kernel does not support rseq, skipping test");
+    }
+  setup_signals ();
+  if (raise (SIGUSR1))
+    FAIL_EXIT1 ("error raising signal");
+  if (do_rseq_main_test ())
+    result = 1;
+  for (i = 0; i < N; i++)
+    {
+      if (do_rseq_threads_test (t[i]))
+        result = 1;
+    }
+  if (do_rseq_fork_test ())
+    result = 1;
+  return result;
+}
+
+static void
+atfork_prepare (void)
+{
+  if (!rseq_thread_registered ())
+    {
+      printf ("rseq not registered in pthread atfork prepare\n");
+      support_record_failure ();
+    }
+}
+
+static void
+atfork_parent (void)
+{
+  if (!rseq_thread_registered ())
+    {
+      printf ("rseq not registered in pthread atfork parent\n");
+      support_record_failure ();
+    }
+}
+
+static void
+atfork_child (void)
+{
+  if (!rseq_thread_registered ())
+    {
+      printf ("rseq not registered in pthread atfork child\n");
+      support_record_failure ();
+    }
+}
+
+static void
+rseq_key_destructor (void *arg)
+{
+  /* Cannot use deferred failure reporting after main () returns.  */
+  if (!rseq_thread_registered ())
+    FAIL_EXIT1 ("rseq not registered in pthread key destructor");
+}
+
+static void
+atexit_handler (void)
+{
+  /* Cannot use deferred failure reporting after main () returns.  */
+  if (!rseq_thread_registered ())
+    FAIL_EXIT1 ("rseq not registered in atexit handler");
+}
+
+static void __attribute__ ((constructor))
+do_rseq_constructor_test (void)
+{
+  support_record_failure_init ();
+  if (atexit (atexit_handler))
+    FAIL_EXIT1 ("error calling atexit");
+  xpthread_key_create (&rseq_test_key, rseq_key_destructor);
+  if (pthread_atfork (atfork_prepare, atfork_parent, atfork_child))
+    FAIL_EXIT1 ("error calling pthread_atfork");
+  if (do_rseq_test ())
+    FAIL_EXIT1 ("rseq not registered within constructor");
+}
+
+static void __attribute__ ((destructor))
+do_rseq_destructor_test (void)
+{
+  /* Cannot use deferred failure reporting after main () returns.  */
+  if (do_rseq_test ())
+    FAIL_EXIT1 ("rseq not registered within destructor");
+  xpthread_key_delete (rseq_test_key);
+}
+
+/* Test C++ destructor called at thread and process exit.  */
+void
+__call_tls_dtors (void)
+{
+  /* Cannot use deferred failure reporting after main () returns.  */
+  if (!rseq_thread_registered ())
+    FAIL_EXIT1 ("rseq not registered in C++ thread/process exit destructor");
+}
+#else
+static int
+do_rseq_test (void)
+{
+#ifndef __NR_rseq
+  FAIL_UNSUPPORTED ("kernel headers do not support rseq, skipping test");
+#endif
+#ifndef RSEQ_SIG
+  FAIL_UNSUPPORTED ("glibc does not define RSEQ_SIG, skipping test");
+#endif
+  return 0;
+}
+#endif
+
+static int
+do_test (void)
+{
+  return do_rseq_test ();
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/unix/sysv/linux/tst-rseq.c b/sysdeps/unix/sysv/linux/tst-rseq.c
new file mode 100644
index 0000000000..3322739854
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-rseq.c
@@ -0,0 +1,114 @@
+/* Restartable Sequences single-threaded tests.
+
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* These tests validate that rseq is registered from main in an executable
+   not linked against libpthread.  */
+
+#include <sys/syscall.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <support/check.h>
+
+#ifdef __NR_rseq
+#include <sys/rseq.h>
+#endif
+
+#if defined __NR_rseq && defined RSEQ_SIG
+#include <syscall.h>
+#include <stdlib.h>
+#include <error.h>
+#include <errno.h>
+#include <stdint.h>
+#include <string.h>
+#include <atomic.h>
+
+static int
+rseq_thread_registered (void)
+{
+  return (int32_t) atomic_load_relaxed (&__rseq_abi.cpu_id) >= 0;
+}
+
+static int
+do_rseq_main_test (void)
+{
+  if (!rseq_thread_registered ())
+    {
+      FAIL_RET ("rseq not registered in main thread");
+    }
+  return 0;
+}
+
+static int
+sys_rseq (struct rseq *rseq_abi, uint32_t rseq_len, int flags, uint32_t sig)
+{
+  return syscall (__NR_rseq, rseq_abi, rseq_len, flags, sig);
+}
+
+static int
+rseq_available (void)
+{
+  int rc;
+
+  rc = sys_rseq (NULL, 0, 0, 0);
+  if (rc != -1)
+    FAIL_EXIT1 ("Unexpected rseq return value %d", rc);
+  switch (errno)
+    {
+    case ENOSYS:
+      return 0;
+    case EINVAL:
+      return 1;
+    default:
+      FAIL_EXIT1 ("Unexpected rseq error %s", strerror (errno));
+    }
+}
+
+static int
+do_rseq_test (void)
+{
+  int result = 0;
+
+  if (!rseq_available ())
+    {
+      FAIL_UNSUPPORTED ("kernel does not support rseq, skipping test");
+    }
+  if (do_rseq_main_test ())
+    result = 1;
+  return result;
+}
+#else
+static int
+do_rseq_test (void)
+{
+#ifndef __NR_rseq
+  FAIL_UNSUPPORTED ("kernel headers do not support rseq, skipping test");
+#endif
+#ifndef RSEQ_SIG
+  FAIL_UNSUPPORTED ("glibc does not define RSEQ_SIG, skipping test");
+#endif
+  return 0;
+}
+#endif
+
+static int
+do_test (void)
+{
+  return do_rseq_test ();
+}
+
+#include <support/test-driver.c>
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH glibc 0/8] Restartable Sequences enablement
  2020-01-08 14:58 [RFC PATCH glibc 0/8] Restartable Sequences enablement Mathieu Desnoyers
                   ` (7 preceding siblings ...)
  2020-01-08 14:58 ` [RFC PATCH glibc 8/8] rseq registration tests (v7) Mathieu Desnoyers
@ 2020-01-08 15:31 ` Joseph Myers
  2020-01-08 15:48   ` Mathieu Desnoyers
  8 siblings, 1 reply; 17+ messages in thread
From: Joseph Myers @ 2020-01-08 15:31 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Carlos O'Donell, Florian Weimer, Szabolcs Nagy, libc-alpha

On Wed, 8 Jan 2020, Mathieu Desnoyers wrote:

> Since the last post, I updated the manual following the advices I
> received, removed the syscall table patches from the series, as they are
> now merged into glibc master. I also updated the abilist following le/be
> split for arm, microblaze, and sh.

You also need to update copyright year ranges for new files to include 
2020.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH glibc 0/8] Restartable Sequences enablement
  2020-01-08 15:31 ` [RFC PATCH glibc 0/8] Restartable Sequences enablement Joseph Myers
@ 2020-01-08 15:48   ` Mathieu Desnoyers
  0 siblings, 0 replies; 17+ messages in thread
From: Mathieu Desnoyers @ 2020-01-08 15:48 UTC (permalink / raw)
  To: Joseph Myers; +Cc: carlos, Florian Weimer, Szabolcs Nagy, libc-alpha

----- On Jan 8, 2020, at 10:31 AM, Joseph Myers joseph@codesourcery.com wrote:

> On Wed, 8 Jan 2020, Mathieu Desnoyers wrote:
> 
>> Since the last post, I updated the manual following the advices I
>> received, removed the syscall table patches from the series, as they are
>> now merged into glibc master. I also updated the abilist following le/be
>> split for arm, microblaze, and sh.
> 
> You also need to update copyright year ranges for new files to include
> 2020.

Done, it will be in the next round, thanks for the review!

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098]
  2020-03-19 14:41 Mathieu Desnoyers via Libc-alpha
@ 2020-03-19 14:41 ` Mathieu Desnoyers via Libc-alpha
  0 siblings, 0 replies; 17+ messages in thread
From: Mathieu Desnoyers via Libc-alpha @ 2020-03-19 14:41 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: libc-alpha, Joseph Myers

From: Florian Weimer <fweimer@redhat.com>

New threads inherit the signal mask from the current thread.  This
means that signal handlers can run on the newly created thread
immediately after the kernel has created the userspace thread, even
before glibc has initialized the TCB.  Consequently, new threads can
observe uninitialized ctype data, among other things.

To address this, block all signals before starting the thread, and
pass the original signal mask to the start routine wrapper.  On the
new thread, first perform all thread initialization, and then unblock
signals.

The cost of doing this is two rt_sigprocmask system calls on the old
thread, and one rt_sigprocmask system call on the new thread.  (If
there was a way to clone a new thread with a signals disabled, this
could be brought down to one system call each.)  The thread descriptor
increases in size, too, and sigset_t is fairly large.  This increase
could be brought down by reusing space the in the descriptor which is
not needed before running user code, or by switching to an internal
sigset_t definition which only covers the signals supported by the
kernel definition.  (Part of the thread descriptor size increase is
already offset by reduced stack usage in the thread start wrapper
routine after this commit.)

-----
 nptl/descr.h          | 10 +++++++---
 nptl/pthread_create.c | 47 +++++++++++++++++++++++++----------------------
 2 files changed, 32 insertions(+), 25 deletions(-)
---
 nptl/descr.h          | 10 +++++++---
 nptl/pthread_create.c | 46 +++++++++++++++++++++++--------------------
 2 files changed, 32 insertions(+), 24 deletions(-)

diff --git a/nptl/descr.h b/nptl/descr.h
index 9dcf480bdf..e1c7db5473 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -332,9 +332,8 @@ struct pthread
   /* True if thread must stop at startup time.  */
   bool stopped_start;
 
-  /* The parent's cancel handling at the time of the pthread_create
-     call.  This might be needed to undo the effects of a cancellation.  */
-  int parent_cancelhandling;
+  /* Formerly used for dealing with cancellation.  */
+  int parent_cancelhandling_unsed;
 
   /* Lock to synchronize access to the descriptor.  */
   int lock;
@@ -391,6 +390,11 @@ struct pthread
   /* Resolver state.  */
   struct __res_state res;
 
+  /* Signal mask for the new thread.  Used during thread startup to
+     restore the signal mask.  (Threads are launched with all signals
+     masked.)  */
+  sigset_t sigmask;
+
   /* Indicates whether is a C11 thread created by thrd_creat.  */
   bool c11;
 
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 7c752d0f99..afd379e89a 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -369,7 +369,6 @@ __free_tcb (struct pthread *pd)
     }
 }
 
-
 /* Local function to start thread and handle cleanup.
    createthread.c defines the macro START_THREAD_DEFN to the
    declaration that its create_thread function will refer to, and
@@ -385,10 +384,6 @@ START_THREAD_DEFN
   /* Initialize pointers to locale data.  */
   __ctype_init ();
 
-  /* Allow setxid from now onwards.  */
-  if (__glibc_unlikely (atomic_exchange_acq (&pd->setxid_futex, 0) == -2))
-    futex_wake (&pd->setxid_futex, 1, FUTEX_PRIVATE);
-
 #ifndef __ASSUME_SET_ROBUST_LIST
   if (__set_robust_list_avail >= 0)
 #endif
@@ -399,18 +394,6 @@ START_THREAD_DEFN
 			     sizeof (struct robust_list_head));
     }
 
-  /* If the parent was running cancellation handlers while creating
-     the thread the new thread inherited the signal mask.  Reset the
-     cancellation signal mask.  */
-  if (__glibc_unlikely (pd->parent_cancelhandling & CANCELING_BITMASK))
-    {
-      sigset_t mask;
-      __sigemptyset (&mask);
-      __sigaddset (&mask, SIGCANCEL);
-      INTERNAL_SYSCALL_CALL (rt_sigprocmask, SIG_UNBLOCK, &mask,
-			     NULL, _NSIG / 8);
-    }
-
   /* This is where the try/finally block should be created.  For
      compilers without that support we do use setjmp.  */
   struct pthread_unwind_buf unwind_buf;
@@ -432,6 +415,12 @@ START_THREAD_DEFN
   unwind_buf.priv.data.prev = NULL;
   unwind_buf.priv.data.cleanup = NULL;
 
+  __libc_signal_restore_set (&pd->sigmask);
+
+  /* Allow setxid from now onwards.  */
+  if (__glibc_unlikely (atomic_exchange_acq (&pd->setxid_futex, 0) == -2))
+    futex_wake (&pd->setxid_futex, 1, FUTEX_PRIVATE);
+
   if (__glibc_likely (! not_first_call))
     {
       /* Store the new cleanup handler info.  */
@@ -722,10 +711,6 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
   CHECK_THREAD_SYSINFO (pd);
 #endif
 
-  /* Inform start_thread (above) about cancellation state that might
-     translate into inherited signal state.  */
-  pd->parent_cancelhandling = THREAD_GETMEM (THREAD_SELF, cancelhandling);
-
   /* Determine scheduling parameters for the thread.  */
   if (__builtin_expect ((iattr->flags & ATTR_FLAG_NOTINHERITSCHED) != 0, 0)
       && (iattr->flags & (ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET)) != 0)
@@ -771,6 +756,21 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
      ownership of PD (see CONCURRENCY NOTES above).  */
   bool stopped_start = false; bool thread_ran = false;
 
+  /* Block all signals, so that the new thread starts out with
+     signals disabled.  This avoids race conditions in the thread
+     startup.  */
+  sigset_t original_sigmask;
+  __libc_signal_block_all (&original_sigmask);
+
+  /* Conceptually, the new thread needs to inherit the signal mask of
+     this thread.  Therefore, it needs to restore the saved signal
+     mask of this thread, so save it in the startup information.  */
+  pd->sigmask = original_sigmask;
+
+  /* Reset the cancellation signal mask in case this thread is running
+     cancellation.  */
+  __sigdelset (&pd->sigmask, SIGCANCEL);
+
   /* Start the thread.  */
   if (__glibc_unlikely (report_thread_creation (pd)))
     {
@@ -813,6 +813,10 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
     retval = create_thread (pd, iattr, &stopped_start,
 			    STACK_VARIABLES_ARGS, &thread_ran);
 
+  /* Return to the previous signal mask, after creating the new
+     thread.  */
+  __libc_signal_restore_set (&original_sigmask);
+
   if (__glibc_unlikely (retval != 0))
     {
       if (thread_ran)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098]
  2020-03-23 13:15 [RFC PATCH glibc 0/8] Restartable Sequences enablement Mathieu Desnoyers via Libc-alpha
@ 2020-03-23 13:16 ` Mathieu Desnoyers via Libc-alpha
  2020-03-23 15:31   ` Christian Brauner
  0 siblings, 1 reply; 17+ messages in thread
From: Mathieu Desnoyers via Libc-alpha @ 2020-03-23 13:16 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: libc-alpha, Joseph Myers

From: Florian Weimer <fweimer@redhat.com>

New threads inherit the signal mask from the current thread.  This
means that signal handlers can run on the newly created thread
immediately after the kernel has created the userspace thread, even
before glibc has initialized the TCB.  Consequently, new threads can
observe uninitialized ctype data, among other things.

To address this, block all signals before starting the thread, and
pass the original signal mask to the start routine wrapper.  On the
new thread, first perform all thread initialization, and then unblock
signals.

The cost of doing this is two rt_sigprocmask system calls on the old
thread, and one rt_sigprocmask system call on the new thread.  (If
there was a way to clone a new thread with a signals disabled, this
could be brought down to one system call each.)  The thread descriptor
increases in size, too, and sigset_t is fairly large.  This increase
could be brought down by reusing space the in the descriptor which is
not needed before running user code, or by switching to an internal
sigset_t definition which only covers the signals supported by the
kernel definition.  (Part of the thread descriptor size increase is
already offset by reduced stack usage in the thread start wrapper
routine after this commit.)

-----
 nptl/descr.h          | 10 +++++++---
 nptl/pthread_create.c | 47 +++++++++++++++++++++++++----------------------
 2 files changed, 32 insertions(+), 25 deletions(-)
---
 nptl/descr.h          | 10 +++++++---
 nptl/pthread_create.c | 46 +++++++++++++++++++++++--------------------
 2 files changed, 32 insertions(+), 24 deletions(-)

diff --git a/nptl/descr.h b/nptl/descr.h
index 9dcf480bdf..e1c7db5473 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -332,9 +332,8 @@ struct pthread
   /* True if thread must stop at startup time.  */
   bool stopped_start;
 
-  /* The parent's cancel handling at the time of the pthread_create
-     call.  This might be needed to undo the effects of a cancellation.  */
-  int parent_cancelhandling;
+  /* Formerly used for dealing with cancellation.  */
+  int parent_cancelhandling_unsed;
 
   /* Lock to synchronize access to the descriptor.  */
   int lock;
@@ -391,6 +390,11 @@ struct pthread
   /* Resolver state.  */
   struct __res_state res;
 
+  /* Signal mask for the new thread.  Used during thread startup to
+     restore the signal mask.  (Threads are launched with all signals
+     masked.)  */
+  sigset_t sigmask;
+
   /* Indicates whether is a C11 thread created by thrd_creat.  */
   bool c11;
 
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 7c752d0f99..afd379e89a 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -369,7 +369,6 @@ __free_tcb (struct pthread *pd)
     }
 }
 
-
 /* Local function to start thread and handle cleanup.
    createthread.c defines the macro START_THREAD_DEFN to the
    declaration that its create_thread function will refer to, and
@@ -385,10 +384,6 @@ START_THREAD_DEFN
   /* Initialize pointers to locale data.  */
   __ctype_init ();
 
-  /* Allow setxid from now onwards.  */
-  if (__glibc_unlikely (atomic_exchange_acq (&pd->setxid_futex, 0) == -2))
-    futex_wake (&pd->setxid_futex, 1, FUTEX_PRIVATE);
-
 #ifndef __ASSUME_SET_ROBUST_LIST
   if (__set_robust_list_avail >= 0)
 #endif
@@ -399,18 +394,6 @@ START_THREAD_DEFN
 			     sizeof (struct robust_list_head));
     }
 
-  /* If the parent was running cancellation handlers while creating
-     the thread the new thread inherited the signal mask.  Reset the
-     cancellation signal mask.  */
-  if (__glibc_unlikely (pd->parent_cancelhandling & CANCELING_BITMASK))
-    {
-      sigset_t mask;
-      __sigemptyset (&mask);
-      __sigaddset (&mask, SIGCANCEL);
-      INTERNAL_SYSCALL_CALL (rt_sigprocmask, SIG_UNBLOCK, &mask,
-			     NULL, _NSIG / 8);
-    }
-
   /* This is where the try/finally block should be created.  For
      compilers without that support we do use setjmp.  */
   struct pthread_unwind_buf unwind_buf;
@@ -432,6 +415,12 @@ START_THREAD_DEFN
   unwind_buf.priv.data.prev = NULL;
   unwind_buf.priv.data.cleanup = NULL;
 
+  __libc_signal_restore_set (&pd->sigmask);
+
+  /* Allow setxid from now onwards.  */
+  if (__glibc_unlikely (atomic_exchange_acq (&pd->setxid_futex, 0) == -2))
+    futex_wake (&pd->setxid_futex, 1, FUTEX_PRIVATE);
+
   if (__glibc_likely (! not_first_call))
     {
       /* Store the new cleanup handler info.  */
@@ -722,10 +711,6 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
   CHECK_THREAD_SYSINFO (pd);
 #endif
 
-  /* Inform start_thread (above) about cancellation state that might
-     translate into inherited signal state.  */
-  pd->parent_cancelhandling = THREAD_GETMEM (THREAD_SELF, cancelhandling);
-
   /* Determine scheduling parameters for the thread.  */
   if (__builtin_expect ((iattr->flags & ATTR_FLAG_NOTINHERITSCHED) != 0, 0)
       && (iattr->flags & (ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET)) != 0)
@@ -771,6 +756,21 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
      ownership of PD (see CONCURRENCY NOTES above).  */
   bool stopped_start = false; bool thread_ran = false;
 
+  /* Block all signals, so that the new thread starts out with
+     signals disabled.  This avoids race conditions in the thread
+     startup.  */
+  sigset_t original_sigmask;
+  __libc_signal_block_all (&original_sigmask);
+
+  /* Conceptually, the new thread needs to inherit the signal mask of
+     this thread.  Therefore, it needs to restore the saved signal
+     mask of this thread, so save it in the startup information.  */
+  pd->sigmask = original_sigmask;
+
+  /* Reset the cancellation signal mask in case this thread is running
+     cancellation.  */
+  __sigdelset (&pd->sigmask, SIGCANCEL);
+
   /* Start the thread.  */
   if (__glibc_unlikely (report_thread_creation (pd)))
     {
@@ -813,6 +813,10 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
     retval = create_thread (pd, iattr, &stopped_start,
 			    STACK_VARIABLES_ARGS, &thread_ran);
 
+  /* Return to the previous signal mask, after creating the new
+     thread.  */
+  __libc_signal_restore_set (&original_sigmask);
+
   if (__glibc_unlikely (retval != 0))
     {
       if (thread_ran)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098]
  2020-03-23 13:16 ` [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098] Mathieu Desnoyers via Libc-alpha
@ 2020-03-23 15:31   ` Christian Brauner
  2020-03-23 17:02     ` Mathieu Desnoyers via Libc-alpha
  0 siblings, 1 reply; 17+ messages in thread
From: Christian Brauner @ 2020-03-23 15:31 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: libc-alpha, Joseph Myers

On Mon, Mar 23, 2020 at 09:16:02AM -0400, Mathieu Desnoyers via Libc-alpha wrote:
> From: Florian Weimer <fweimer@redhat.com>
> 
> New threads inherit the signal mask from the current thread.  This
> means that signal handlers can run on the newly created thread
> immediately after the kernel has created the userspace thread, even
> before glibc has initialized the TCB.  Consequently, new threads can
> observe uninitialized ctype data, among other things.
> 
> To address this, block all signals before starting the thread, and
> pass the original signal mask to the start routine wrapper.  On the
> new thread, first perform all thread initialization, and then unblock
> signals.
> 
> The cost of doing this is two rt_sigprocmask system calls on the old
> thread, and one rt_sigprocmask system call on the new thread.  (If
> there was a way to clone a new thread with a signals disabled, this

This could be a new clone3() flag. If someone wants to send a patch I'd
take it.

Christian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098]
  2020-03-23 15:31   ` Christian Brauner
@ 2020-03-23 17:02     ` Mathieu Desnoyers via Libc-alpha
  2020-03-23 17:05       ` Christian Brauner
  0 siblings, 1 reply; 17+ messages in thread
From: Mathieu Desnoyers via Libc-alpha @ 2020-03-23 17:02 UTC (permalink / raw)
  To: Christian Brauner; +Cc: libc-alpha, Joseph Myers

----- On Mar 23, 2020, at 11:31 AM, Christian Brauner christian.brauner@ubuntu.com wrote:

> On Mon, Mar 23, 2020 at 09:16:02AM -0400, Mathieu Desnoyers via Libc-alpha
> wrote:
>> From: Florian Weimer <fweimer@redhat.com>
>> 
>> New threads inherit the signal mask from the current thread.  This
>> means that signal handlers can run on the newly created thread
>> immediately after the kernel has created the userspace thread, even
>> before glibc has initialized the TCB.  Consequently, new threads can
>> observe uninitialized ctype data, among other things.
>> 
>> To address this, block all signals before starting the thread, and
>> pass the original signal mask to the start routine wrapper.  On the
>> new thread, first perform all thread initialization, and then unblock
>> signals.
>> 
>> The cost of doing this is two rt_sigprocmask system calls on the old
>> thread, and one rt_sigprocmask system call on the new thread.  (If
>> there was a way to clone a new thread with a signals disabled, this
> 
> This could be a new clone3() flag. If someone wants to send a patch I'd
> take it.

I agree that it would be a nice improvement to alleviate the overhead of
tweaking the signal masks on thread creation. I suspect the code proposed in
this patch is still needed, because glibc would have to support the currently
existing kernels. The improvement you envision involves adding this new flag
into the Linux kernel clone3 system call and then wiring up glibc support.

I don't expect this should delay rseq integration into glibc ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098]
  2020-03-23 17:02     ` Mathieu Desnoyers via Libc-alpha
@ 2020-03-23 17:05       ` Christian Brauner
  2020-04-22 16:38         ` Carlos O'Donell via Libc-alpha
  0 siblings, 1 reply; 17+ messages in thread
From: Christian Brauner @ 2020-03-23 17:05 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: libc-alpha, Joseph Myers

On Mon, Mar 23, 2020 at 01:02:12PM -0400, Mathieu Desnoyers via Libc-alpha wrote:
> ----- On Mar 23, 2020, at 11:31 AM, Christian Brauner christian.brauner@ubuntu.com wrote:
> 
> > On Mon, Mar 23, 2020 at 09:16:02AM -0400, Mathieu Desnoyers via Libc-alpha
> > wrote:
> >> From: Florian Weimer <fweimer@redhat.com>
> >> 
> >> New threads inherit the signal mask from the current thread.  This
> >> means that signal handlers can run on the newly created thread
> >> immediately after the kernel has created the userspace thread, even
> >> before glibc has initialized the TCB.  Consequently, new threads can
> >> observe uninitialized ctype data, among other things.
> >> 
> >> To address this, block all signals before starting the thread, and
> >> pass the original signal mask to the start routine wrapper.  On the
> >> new thread, first perform all thread initialization, and then unblock
> >> signals.
> >> 
> >> The cost of doing this is two rt_sigprocmask system calls on the old
> >> thread, and one rt_sigprocmask system call on the new thread.  (If
> >> there was a way to clone a new thread with a signals disabled, this
> > 
> > This could be a new clone3() flag. If someone wants to send a patch I'd
> > take it.
> 
> I agree that it would be a nice improvement to alleviate the overhead of
> tweaking the signal masks on thread creation. I suspect the code proposed in
> this patch is still needed, because glibc would have to support the currently
> existing kernels. The improvement you envision involves adding this new flag
> into the Linux kernel clone3 system call and then wiring up glibc support.
> 
> I don't expect this should delay rseq integration into glibc ?

Oh no, I wasn't trying to say that rseq() in glibc should be blocked on
this, not at all. I was just saying we should probably do this since
it's a valuable improvement.

Christian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098]
  2020-03-23 17:05       ` Christian Brauner
@ 2020-04-22 16:38         ` Carlos O'Donell via Libc-alpha
  0 siblings, 0 replies; 17+ messages in thread
From: Carlos O'Donell via Libc-alpha @ 2020-04-22 16:38 UTC (permalink / raw)
  To: Christian Brauner, Mathieu Desnoyers; +Cc: libc-alpha, Joseph Myers

On 3/23/20 1:05 PM, Christian Brauner wrote:
> On Mon, Mar 23, 2020 at 01:02:12PM -0400, Mathieu Desnoyers via Libc-alpha wrote:
>> ----- On Mar 23, 2020, at 11:31 AM, Christian Brauner christian.brauner@ubuntu.com wrote:
>>
>>> On Mon, Mar 23, 2020 at 09:16:02AM -0400, Mathieu Desnoyers via Libc-alpha
>>> wrote:
>>>> From: Florian Weimer <fweimer@redhat.com>
>>>>
>>>> New threads inherit the signal mask from the current thread.  This
>>>> means that signal handlers can run on the newly created thread
>>>> immediately after the kernel has created the userspace thread, even
>>>> before glibc has initialized the TCB.  Consequently, new threads can
>>>> observe uninitialized ctype data, among other things.
>>>>
>>>> To address this, block all signals before starting the thread, and
>>>> pass the original signal mask to the start routine wrapper.  On the
>>>> new thread, first perform all thread initialization, and then unblock
>>>> signals.
>>>>
>>>> The cost of doing this is two rt_sigprocmask system calls on the old
>>>> thread, and one rt_sigprocmask system call on the new thread.  (If
>>>> there was a way to clone a new thread with a signals disabled, this
>>>
>>> This could be a new clone3() flag. If someone wants to send a patch I'd
>>> take it.
>>
>> I agree that it would be a nice improvement to alleviate the overhead of
>> tweaking the signal masks on thread creation. I suspect the code proposed in
>> this patch is still needed, because glibc would have to support the currently
>> existing kernels. The improvement you envision involves adding this new flag
>> into the Linux kernel clone3 system call and then wiring up glibc support.
>>
>> I don't expect this should delay rseq integration into glibc ?
> 
> Oh no, I wasn't trying to say that rseq() in glibc should be blocked on
> this, not at all. I was just saying we should probably do this since
> it's a valuable improvement.

That's up to you to decide, and it depends on the workload. I'd be hesitant
to spend a flag bit on this. To cleanup the glibc code would require you to
(a) implement this in the kernel
(b) have glibc move the minimum kernel line above the new kernel that has this
We're talking a minor cleanup that would take 10-15 years to arrive... if ever.
Moving the minimum kernel baseline has container workload impacts that are
going to become real problems soon.

Mathieu, Florian, and I talked about this very issue at GNU Tools
Cauldron 2019 (Montreal). We agreed that even if we could fix it immediately
we would still need the above implementation.

IMO you are doing something wrong if this shows up in your profiling though.
You should not be starting up and killing threads that fast. You should be
using thread pools etc. The kernel reaping of tasks is quite slow and even
in glibc we often hit EAGAIN limits on heavily loaded test boxes.

The above code is required for correctness so certain state is not observable,
but it need not be that fast. Someone correct me if I'm wrong though and I've
missed a use case.

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-04-22 16:38 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-08 14:58 [RFC PATCH glibc 0/8] Restartable Sequences enablement Mathieu Desnoyers
2020-01-08 14:58 ` [RFC PATCH glibc 1/8] Introduce <elf_machine_sym_no_match.h> Mathieu Desnoyers
2020-01-08 14:58 ` [RFC PATCH glibc 2/8] Implement __libc_early_init Mathieu Desnoyers
2020-01-08 14:58 ` [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098] Mathieu Desnoyers
2020-01-08 14:58 ` [RFC PATCH glibc 4/8] glibc: Perform rseq(2) registration at C startup and thread creation (v14) Mathieu Desnoyers
2020-01-08 14:58 ` [RFC PATCH glibc 5/8] glibc: sched_getcpu(): use rseq cpu_id TLS on Linux (v5) Mathieu Desnoyers
2020-01-08 14:58 ` [RFC PATCH glibc 6/8] support record failure: allow use from constructor Mathieu Desnoyers
2020-01-08 14:58 ` [RFC PATCH glibc 7/8] support: implement xpthread key create/delete (v3) Mathieu Desnoyers
2020-01-08 14:58 ` [RFC PATCH glibc 8/8] rseq registration tests (v7) Mathieu Desnoyers
2020-01-08 15:31 ` [RFC PATCH glibc 0/8] Restartable Sequences enablement Joseph Myers
2020-01-08 15:48   ` Mathieu Desnoyers
  -- strict thread matches above, loose matches on Subject: below --
2020-03-19 14:41 Mathieu Desnoyers via Libc-alpha
2020-03-19 14:41 ` [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098] Mathieu Desnoyers via Libc-alpha
2020-03-23 13:15 [RFC PATCH glibc 0/8] Restartable Sequences enablement Mathieu Desnoyers via Libc-alpha
2020-03-23 13:16 ` [RFC PATCH glibc 3/8] nptl: Start new threads with all signals blocked [BZ #25098] Mathieu Desnoyers via Libc-alpha
2020-03-23 15:31   ` Christian Brauner
2020-03-23 17:02     ` Mathieu Desnoyers via Libc-alpha
2020-03-23 17:05       ` Christian Brauner
2020-04-22 16:38         ` Carlos O'Donell via Libc-alpha

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).