unofficial mirror of libc-alpha@sourceware.org
* [RFC patch 0/5] RISC-V: Add vector ISA support
@ 2021-09-13  1:41 Vincent Chen
  2021-09-13  1:41 ` [RFC patch 1/5] RISC-V: Remove riscv-specific sigcontext.h Vincent Chen
                   ` (6 more replies)
  0 siblings, 7 replies; 79+ messages in thread
From: Vincent Chen @ 2021-09-13  1:41 UTC (permalink / raw)
  To: libc-alpha, palmer; +Cc: andrew, Vincent Chen

This patchset adds the changes required to support the RISC-V Vector (RVV)
extension.

Since the length of a vector register in RVV is variable (the theoretical
maximum is 2^XLEN-1 bits), a large and flexible space is needed to back
up all vector registers in the signal context. This patchset expands the
default SIGSTKSZ, MINSIGSTKSZ, and PTHREAD_STACK_MIN to ensure the stack
is large enough for the normal case (VLENB <= 128 bytes). The Linux kernel
also places the exact minimum signal stack size in the AT_MINSIGSTKSZ entry
of the auxiliary vector, so users can still obtain a suitable minimum
signal stack size through sysconf (_SC_MINSIGSTKSZ) if VLENB is greater
than 128 bytes.

In addition, according to the specification, the VCSR that combines VXRM and
VXSAT has thread storage duration, so this patchset adds the required user
context operation for it.
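
As a small sketch of what "combines VXRM and VXSAT" means, assuming the
field layout of the v1.0 draft (vxsat in bit 0, vxrm in bits 2:1), the
register can be packed and unpacked as below. The macro and function names
are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed vcsr layout: bit 0 is vxsat, bits 2:1 are vxrm.  */
#define DEMO_VCSR_VXSAT_MASK 0x1u
#define DEMO_VCSR_VXRM_SHIFT 1
#define DEMO_VCSR_VXRM_MASK  0x3u

/* Build a vcsr value from its two fields.  */
static inline uint32_t
demo_vcsr_pack (unsigned int vxrm, unsigned int vxsat)
{
  return ((vxrm & DEMO_VCSR_VXRM_MASK) << DEMO_VCSR_VXRM_SHIFT)
	 | (vxsat & DEMO_VCSR_VXSAT_MASK);
}

/* Extract the fixed-point rounding mode.  */
static inline unsigned int
demo_vcsr_vxrm (uint32_t vcsr)
{
  return (vcsr >> DEMO_VCSR_VXRM_SHIFT) & DEMO_VCSR_VXRM_MASK;
}

/* Extract the fixed-point saturation flag.  */
static inline unsigned int
demo_vcsr_vxsat (uint32_t vcsr)
{
  return vcsr & DEMO_VCSR_VXSAT_MASK;
}
```

Saving the single vcsr word therefore captures both fields at once, which
is why the context routines only need one csrr/csrw pair.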

Finally, this patchset removes the RISC-V-specific sigcontext.h from glibc
to reduce the synchronization work when new extension support is
introduced to the Linux environment. However, it may cause some
backward-incompatibility issues. Therefore, I sent an RFC patch
(https://sourceware.org/pipermail/libc-alpha/2020-June/115549.html)
to discuss this modification before this patchset. As I mentioned in the
RFC patch thread, I used OpenEmbedded to evaluate the impact. During the
tests, I didn't get any compiler errors. Therefore, I infer that this
modification may not cause severe backward-incompatibility issues at this
moment.

1. The RISC-V V-extension draft v1.0 can be found in
https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc
2. The associated kernel implementation can be found in
http://lists.infradead.org/pipermail/linux-riscv/2021-September/008249.html
3. QEMU with RISC-V V-extension support can be found in
https://github.com/sifive/qemu/tree/rvv-1.0

Vincent Chen (5):
  RISC-V: Remove riscv-specific sigcontext.h
  RISC-V: Reserve about 5K space in mcontext_t to support future ISA
    expansion.
  RISC-V: Save and restore VCSR when doing user context switch
  RISC-V: Extend MINSIGSTKSZ and SIGSTKSZ to backup RVV registers
  RISC-V: Expand PTHREAD_STACK_MIN to support RVV environment

 sysdeps/riscv/Makefile                             |  5 +++
 sysdeps/riscv/rtld-global-offsets.sym              |  7 ++++
 sysdeps/unix/sysv/linux/riscv/bits/hwcap.h         | 31 ++++++++++++++++
 .../unix/sysv/linux/riscv/bits/pthread_stack_min.h | 21 +++++++++++
 sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h    | 31 ----------------
 sysdeps/unix/sysv/linux/riscv/bits/sigstack.h      | 32 +++++++++++++++++
 sysdeps/unix/sysv/linux/riscv/getcontext.S         | 22 +++++++++++-
 sysdeps/unix/sysv/linux/riscv/setcontext.S         | 22 ++++++++++++
 sysdeps/unix/sysv/linux/riscv/swapcontext.S        | 41 ++++++++++++++++++++++
 sysdeps/unix/sysv/linux/riscv/sys/ucontext.h       |  2 ++
 .../sysv/linux/riscv/sysconf-pthread_stack_min.h   | 39 ++++++++++++++++++++
 sysdeps/unix/sysv/linux/riscv/sysdep.h             |  1 +
 sysdeps/unix/sysv/linux/riscv/ucontext_i.sym       |  6 ++++
 13 files changed, 228 insertions(+), 32 deletions(-)
 create mode 100644 sysdeps/riscv/rtld-global-offsets.sym
 create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/hwcap.h
 create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/pthread_stack_min.h
 delete mode 100644 sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
 create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/sigstack.h
 create mode 100644 sysdeps/unix/sysv/linux/riscv/sysconf-pthread_stack_min.h

-- 
2.7.4



* [RFC patch 1/5] RISC-V: Remove riscv-specific sigcontext.h
  2021-09-13  1:41 [RFC patch 0/5] RISC-V: Add vector ISA support Vincent Chen
@ 2021-09-13  1:41 ` Vincent Chen
  2021-09-13  1:41 ` [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion Vincent Chen
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 79+ messages in thread
From: Vincent Chen @ 2021-09-13  1:41 UTC (permalink / raw)
  To: libc-alpha, palmer; +Cc: andrew, Vincent Chen

Remove riscv-specific sigcontext.h so that Glibc can directly use
sigcontext.h provided by the kernel to reduce synchronization work
when new extension support is introduced.
---
 sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h | 31 -------------------------
 1 file changed, 31 deletions(-)
 delete mode 100644 sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h

diff --git a/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h b/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
deleted file mode 100644
index 14e4e06..0000000
--- a/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
+++ /dev/null
@@ -1,31 +0,0 @@
-/* Machine-dependent signal context structure for Linux.  RISC-V version.
-   Copyright (C) 1996-2021 Free Software Foundation, Inc.  This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library.  If not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#ifndef _BITS_SIGCONTEXT_H
-#define _BITS_SIGCONTEXT_H 1
-
-#if !defined _SIGNAL_H && !defined _SYS_UCONTEXT_H
-# error "Never use <bits/sigcontext.h> directly; include <signal.h> instead."
-#endif
-
-struct sigcontext {
-  /* gregs[0] holds the program counter.  */
-  unsigned long int gregs[32];
-  unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
-};
-
-#endif
-- 
2.7.4



* [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-13  1:41 [RFC patch 0/5] RISC-V: Add vector ISA support Vincent Chen
  2021-09-13  1:41 ` [RFC patch 1/5] RISC-V: Remove riscv-specific sigcontext.h Vincent Chen
@ 2021-09-13  1:41 ` Vincent Chen
  2021-09-13 13:44   ` Florian Weimer via Libc-alpha
  2021-09-13  1:41 ` [RFC patch 3/5] RISC-V: Save and restore VCSR when doing user context switch Vincent Chen
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 79+ messages in thread
From: Vincent Chen @ 2021-09-13  1:41 UTC (permalink / raw)
  To: libc-alpha, palmer; +Cc: andrew, Vincent Chen

Follow the change to struct sigcontext in Linux and reserve about 5K of
space to support future ISA expansion.
---
 sysdeps/unix/sysv/linux/riscv/sys/ucontext.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
index cfafa44..80caf07 100644
--- a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
+++ b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
@@ -82,6 +82,8 @@ typedef struct mcontext_t
   {
     __riscv_mc_gp_state __gregs;
     union  __riscv_mc_fp_state __fpregs;
+    /* 5K + 256 reserved for vector state and future expansion.  */
+    unsigned char __reserved[5376] __attribute__ ((__aligned__ (16)));
   } mcontext_t;
 
 /* Userlevel context.  */
-- 
2.7.4



* [RFC patch 3/5] RISC-V: Save and restore VCSR when doing user context switch
  2021-09-13  1:41 [RFC patch 0/5] RISC-V: Add vector ISA support Vincent Chen
  2021-09-13  1:41 ` [RFC patch 1/5] RISC-V: Remove riscv-specific sigcontext.h Vincent Chen
  2021-09-13  1:41 ` [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion Vincent Chen
@ 2021-09-13  1:41 ` Vincent Chen
  2021-09-14 23:48   ` Joseph Myers
  2021-10-01 13:04   ` Adhemerval Zanella via Libc-alpha
  2021-09-13  1:41 ` [RFC patch 4/5] RISC-V: Extend MINSIGSTKSZ and SIGSTKSZ to backup RVV registers Vincent Chen
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 79+ messages in thread
From: Vincent Chen @ 2021-09-13  1:41 UTC (permalink / raw)
  To: libc-alpha, palmer; +Cc: andrew, Vincent Chen

According to the RISC-V V extension specification, all vector registers
except VCSR are caller-saved registers. The VCSR (vxrm + vxsat) has thread
storage duration. Therefore, only VCSR needs to be added to the user
context operation.
---
 sysdeps/riscv/Makefile                       |  5 ++++
 sysdeps/riscv/rtld-global-offsets.sym        |  7 +++++
 sysdeps/unix/sysv/linux/riscv/bits/hwcap.h   | 31 +++++++++++++++++++++
 sysdeps/unix/sysv/linux/riscv/getcontext.S   | 22 ++++++++++++++-
 sysdeps/unix/sysv/linux/riscv/setcontext.S   | 22 +++++++++++++++
 sysdeps/unix/sysv/linux/riscv/swapcontext.S  | 41 ++++++++++++++++++++++++++++
 sysdeps/unix/sysv/linux/riscv/sysdep.h       |  1 +
 sysdeps/unix/sysv/linux/riscv/ucontext_i.sym |  6 ++++
 8 files changed, 134 insertions(+), 1 deletion(-)
 create mode 100644 sysdeps/riscv/rtld-global-offsets.sym
 create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/hwcap.h

diff --git a/sysdeps/riscv/Makefile b/sysdeps/riscv/Makefile
index 20a9968..cda3ded 100644
--- a/sysdeps/riscv/Makefile
+++ b/sysdeps/riscv/Makefile
@@ -2,6 +2,11 @@ ifeq ($(subdir),misc)
 sysdep_headers += sys/asm.h
 endif
 
+ifeq ($(subdir),csu)
+# get offset to rtld_global._dl_hwcap and rtld_global._dl_hwcap2.
+gen-as-const-headers += rtld-global-offsets.sym
+endif
+
 # RISC-V's assembler also needs to know about PIC as it changes the definition
 # of some assembler macros.
 ASFLAGS-.os += $(pic-ccflag)
diff --git a/sysdeps/riscv/rtld-global-offsets.sym b/sysdeps/riscv/rtld-global-offsets.sym
new file mode 100644
index 0000000..ff4e97f
--- /dev/null
+++ b/sysdeps/riscv/rtld-global-offsets.sym
@@ -0,0 +1,7 @@
+#define SHARED 1
+
+#include <ldsodefs.h>
+
+#define rtld_global_ro_offsetof(mem) offsetof (struct rtld_global_ro, mem)
+
+RTLD_GLOBAL_RO_DL_HWCAP_OFFSET	rtld_global_ro_offsetof (_dl_hwcap)
diff --git a/sysdeps/unix/sysv/linux/riscv/bits/hwcap.h b/sysdeps/unix/sysv/linux/riscv/bits/hwcap.h
new file mode 100644
index 0000000..e6c5ef5
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/bits/hwcap.h
@@ -0,0 +1,31 @@
+/* Defines for bits in AT_HWCAP.  RISC-V Linux version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#if !defined (_SYS_AUXV_H) && !defined (_LINUX_RISCV_SYSDEP_H)
+# error "Never include <bits/hwcap.h> directly; use <sys/auxv.h> instead."
+#endif
+
+/* The following must match the kernel's <asm/hwcap.h>.  */
+#define HWCAP_ISA_I      0x100		//(1 << ('I' - 'A'))
+#define HWCAP_ISA_M      0x1000 	//(1 << ('M' - 'A'))
+#define HWCAP_ISA_A      0x1		//(1 << ('A' - 'A'))
+#define HWCAP_ISA_F      0x20		//(1 << ('F' - 'A'))
+#define HWCAP_ISA_D      0x8		//(1 << ('D' - 'A'))
+#define HWCAP_ISA_C      0x4		//(1 << ('C' - 'A'))
+#define HWCAP_ISA_V      0x200000	//(1 << ('V' - 'A'))
+
diff --git a/sysdeps/unix/sysv/linux/riscv/getcontext.S b/sysdeps/unix/sysv/linux/riscv/getcontext.S
index d6a9bbc..840d8fe 100644
--- a/sysdeps/unix/sysv/linux/riscv/getcontext.S
+++ b/sysdeps/unix/sysv/linux/riscv/getcontext.S
@@ -16,6 +16,8 @@
    License along with the GNU C Library.  If not, see
    <https://www.gnu.org/licenses/>.  */
 
+#include <sysdep.h>
+#include <rtld-global-offsets.h>
 #include "ucontext-macros.h"
 
 /* int getcontext (ucontext_t *ucp) */
@@ -39,6 +41,25 @@ LEAF (__getcontext)
 	SAVE_INT_REG (s10, 26, a0)
 	SAVE_INT_REG (s11, 27, a0)
 
+#ifdef __riscv_vector
+# ifdef SHARED
+	la	t1, _rtld_global_ro
+	REG_L   t1, RTLD_GLOBAL_RO_DL_HWCAP_OFFSET(t1)
+# else
+	la	t1, _dl_hwcap
+	REG_L	t1, (t1)
+# endif
+	li	t2, HWCAP_ISA_V
+	and	t2, t1, t2
+	beqz	t2, 1f
+	addi	t2, a0,	MCONTEXT_EXTENSION
+	li	t1, RVV_MAGIC
+	sw	t1, (t2)
+	csrr	t1, vcsr
+	REG_S	t1, VCSR_OFFSET(t2)
+1:
+#endif
+
 #ifndef __riscv_float_abi_soft
 	frsr	a1
 
@@ -73,5 +94,4 @@ LEAF (__getcontext)
 99:	j	__syscall_error
 
 PSEUDO_END (__getcontext)
-
 weak_alias (__getcontext, getcontext)
diff --git a/sysdeps/unix/sysv/linux/riscv/setcontext.S b/sysdeps/unix/sysv/linux/riscv/setcontext.S
index 9510518..d2404fb 100644
--- a/sysdeps/unix/sysv/linux/riscv/setcontext.S
+++ b/sysdeps/unix/sysv/linux/riscv/setcontext.S
@@ -16,6 +16,8 @@
    License along with the GNU C Library.  If not, see
    <https://www.gnu.org/licenses/>.  */
 
+#include <sysdep.h>
+#include <rtld-global-offsets.h>
 #include "ucontext-macros.h"
 
 /*  int __setcontext (const ucontext_t *ucp)
@@ -64,6 +66,26 @@ LEAF (__setcontext)
 	fssr	t1
 #endif /* __riscv_float_abi_soft */
 
+#ifdef __riscv_vector
+#ifdef SHARED
+	la	t1, _rtld_global_ro
+	REG_L   t1, RTLD_GLOBAL_RO_DL_HWCAP_OFFSET(t1)
+#else
+	la	t1, _dl_hwcap
+	REG_L	t1, (t1)
+#endif
+	li	t2, HWCAP_ISA_V
+	and	t2, t1, t2
+	beqz	t2, 1f
+	li      t1, RVV_MAGIC
+	addi	t2, t0,	MCONTEXT_EXTENSION
+	lw	a1, (t2)
+	bne	a1, t1, 1f
+	REG_L   t1, VCSR_OFFSET(t2)
+	csrw	vcsr, t1
+1:
+#endif
+
 	/* Note the contents of argument registers will be random
 	   unless makecontext() has been called.  */
 	RESTORE_INT_REG     (t1,   0, t0)
diff --git a/sysdeps/unix/sysv/linux/riscv/swapcontext.S b/sysdeps/unix/sysv/linux/riscv/swapcontext.S
index df0f699..94ae8e4 100644
--- a/sysdeps/unix/sysv/linux/riscv/swapcontext.S
+++ b/sysdeps/unix/sysv/linux/riscv/swapcontext.S
@@ -16,6 +16,8 @@
    License along with the GNU C Library.  If not, see
    <https://www.gnu.org/licenses/>.  */
 
+#include <sysdep.h>
+#include <rtld-global-offsets.h>
 #include "ucontext-macros.h"
 
 /* int swapcontext (ucontext_t *oucp, const ucontext_t *ucp) */
@@ -40,6 +42,25 @@ LEAF (__swapcontext)
 	SAVE_INT_REG (s10, 26, a0)
 	SAVE_INT_REG (s11, 27, a0)
 
+#ifdef __riscv_vector
+#ifdef SHARED
+	la      t1, _rtld_global_ro
+	REG_L   t1, RTLD_GLOBAL_RO_DL_HWCAP_OFFSET(t1)
+#else
+	la	t1, _dl_hwcap
+	REG_L   t1, (t1)
+#endif
+	li	t2, HWCAP_ISA_V
+	and	t2, t1, t2
+	beqz	t2, 1f
+	addi	t2, a0,	MCONTEXT_EXTENSION
+	li	t1, RVV_MAGIC
+	sw	t1, (t2)
+	csrr	t1, vcsr
+	REG_S	t1, VCSR_OFFSET(t2)
+1:
+#endif
+
 #ifndef __riscv_float_abi_soft
 	frsr a1
 
@@ -89,6 +110,26 @@ LEAF (__swapcontext)
 	fssr	t1
 #endif /* __riscv_float_abi_soft */
 
+#ifdef __riscv_vector
+#ifdef SHARED
+	la      t1, _rtld_global_ro
+	REG_L   t1, RTLD_GLOBAL_RO_DL_HWCAP_OFFSET(t1)
+#else
+	la	t1, _dl_hwcap
+	REG_L   t1, (t1)
+#endif
+	li	t2, HWCAP_ISA_V
+	and	t2, t1, t2
+	beqz	t2, 1f
+	li      t1, RVV_MAGIC
+	addi	t2, t0,	MCONTEXT_EXTENSION
+	lw	a1, (t2)
+	bne	a1, t1, 1f
+	REG_L   t1, VCSR_OFFSET(t2)
+	csrw	vcsr, t1
+1:
+#endif
+
 	/* Note the contents of argument registers will be random
 	   unless makecontext() has been called.  */
 	RESTORE_INT_REG (t1,   0, t0)
diff --git a/sysdeps/unix/sysv/linux/riscv/sysdep.h b/sysdeps/unix/sysv/linux/riscv/sysdep.h
index 37ff07a..c9f8fd8 100644
--- a/sysdeps/unix/sysv/linux/riscv/sysdep.h
+++ b/sysdeps/unix/sysv/linux/riscv/sysdep.h
@@ -50,6 +50,7 @@
 
 #ifdef __ASSEMBLER__
 
+# include <bits/hwcap.h>
 # include <sys/asm.h>
 
 # define ENTRY(name) LEAF(name)
diff --git a/sysdeps/unix/sysv/linux/riscv/ucontext_i.sym b/sysdeps/unix/sysv/linux/riscv/ucontext_i.sym
index be55b26..4037473 100644
--- a/sysdeps/unix/sysv/linux/riscv/ucontext_i.sym
+++ b/sysdeps/unix/sysv/linux/riscv/ucontext_i.sym
@@ -2,6 +2,7 @@
 #include <signal.h>
 #include <stddef.h>
 #include <sys/ucontext.h>
+#include <asm/sigcontext.h>
 
 -- Constants used by the rt_sigprocmask call.
 
@@ -27,5 +28,10 @@ STACK_FLAGS			stack (ss_flags)
 
 MCONTEXT_GREGS			mcontext (__gregs)
 MCONTEXT_FPREGS			mcontext (__fpregs)
+MCONTEXT_EXTENSION 		mcontext (__reserved)
 
 UCONTEXT_SIZE			sizeof (ucontext_t)
+
+VCSR_OFFSET			offsetof (struct __riscv_v_state, vcsr)
+
+RVV_MAGIC
-- 
2.7.4
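
For readers who prefer C to assembly, the save and restore sequences in the
patch above can be modelled roughly as follows. This is only a sketch: the
struct layout, the function names, and the magic value are placeholders
(the real RVV_MAGIC and VCSR_OFFSET come from the kernel's
<asm/sigcontext.h>), and the live vcsr is passed in as a plain argument
instead of being read with csrr.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as HWCAP_ISA_V in the patch: 1 << ('V' - 'A').  */
#define DEMO_HWCAP_ISA_V (1UL << ('V' - 'A'))
/* Placeholder only; not the kernel's RVV_MAGIC.  */
#define DEMO_RVV_MAGIC 0x12345678u

/* Hypothetical view of the start of the reserved area in mcontext_t.  */
struct demo_ext_area
{
  uint32_t magic;
  unsigned long vcsr;
};

/* getcontext side: store the magic and vcsr only when AT_HWCAP reports V.  */
static void
demo_save (struct demo_ext_area *ext, unsigned long hwcap,
	   unsigned long live_vcsr)
{
  if ((hwcap & DEMO_HWCAP_ISA_V) == 0)
    return;			/* beqz t2, 1f  */
  ext->magic = DEMO_RVV_MAGIC;	/* sw t1, (t2)  */
  ext->vcsr = live_vcsr;	/* REG_S t1, VCSR_OFFSET(t2)  */
}

/* setcontext side: return the vcsr value that is live afterwards.  The
   saved value is used only if V is present and the magic matches.  */
static unsigned long
demo_restore (const struct demo_ext_area *ext, unsigned long hwcap,
	      unsigned long live_vcsr)
{
  if ((hwcap & DEMO_HWCAP_ISA_V) == 0)
    return live_vcsr;
  if (ext->magic != DEMO_RVV_MAGIC)
    return live_vcsr;		/* bne a1, t1, 1f  */
  return ext->vcsr;		/* csrw vcsr, t1  */
}
```

The magic check on the restore path is what keeps a context that was
filled in without the V extension from clobbering the current vcsr.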



* [RFC patch 4/5] RISC-V: Extend MINSIGSTKSZ and SIGSTKSZ to backup RVV registers
  2021-09-13  1:41 [RFC patch 0/5] RISC-V: Add vector ISA support Vincent Chen
                   ` (2 preceding siblings ...)
  2021-09-13  1:41 ` [RFC patch 3/5] RISC-V: Save and restore VCSR when doing user context switch Vincent Chen
@ 2021-09-13  1:41 ` Vincent Chen
  2021-09-13 13:51   ` Rich Felker
  2021-09-13  1:41 ` [RFC 5/5] RISC-V: Expand PTHREAD_STACK_MIN to support RVV environment Vincent Chen
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 79+ messages in thread
From: Vincent Chen @ 2021-09-13  1:41 UTC (permalink / raw)
  To: libc-alpha, palmer; +Cc: andrew, Vincent Chen

When the RVV extension is in use, the original MINSIGSTKSZ is not large
enough to back up all RVV registers in the normal case. Therefore,
MINSIGSTKSZ is expanded to about 5K and SIGSTKSZ is expanded to 16K. This
space is enough when the VLENB of a vector register is 128 bytes. When
VLENB > 128 bytes, users can call sysconf (_SC_MINSIGSTKSZ) and
sysconf (_SC_SIGSTKSZ) to get the appropriate signal stack size.
---
 sysdeps/unix/sysv/linux/riscv/bits/sigstack.h | 32 +++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/sigstack.h

diff --git a/sysdeps/unix/sysv/linux/riscv/bits/sigstack.h b/sysdeps/unix/sysv/linux/riscv/bits/sigstack.h
new file mode 100644
index 0000000..c18512f
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/bits/sigstack.h
@@ -0,0 +1,32 @@
+/* sigstack, sigaltstack definitions.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _BITS_SIGSTACK_H
+#define _BITS_SIGSTACK_H 1
+
+#if !defined _SIGNAL_H && !defined _SYS_UCONTEXT_H
+# error "Never include this file directly.  Use <signal.h> instead"
+#endif
+
+/* Minimum stack size (5k+256 bytes) for a signal handler.  */
+#define MINSIGSTKSZ	5376
+
+/* System default stack size.  */
+#define SIGSTKSZ	16384
+
+#endif /* bits/sigstack.h */
-- 
2.7.4



* [RFC 5/5] RISC-V: Expand PTHREAD_STACK_MIN to support RVV environment
  2021-09-13  1:41 [RFC patch 0/5] RISC-V: Add vector ISA support Vincent Chen
                   ` (3 preceding siblings ...)
  2021-09-13  1:41 ` [RFC patch 4/5] RISC-V: Extend MINSIGSTKSZ and SIGSTKSZ to backup RVV registers Vincent Chen
@ 2021-09-13  1:41 ` Vincent Chen
  2021-09-14 23:43   ` Joseph Myers
  2021-09-13 19:11 ` [RFC patch 0/5] RISC-V: Add vector ISA support Vineet Gupta via Libc-alpha
  2021-11-09 19:21 ` Darius Rad
  6 siblings, 1 reply; 79+ messages in thread
From: Vincent Chen @ 2021-09-13  1:41 UTC (permalink / raw)
  To: libc-alpha, palmer; +Cc: andrew, Vincent Chen

To support all pthread operations in the RVV environment, the value
returned by sysconf (_SC_THREAD_STACK_MIN) is set to 4 times
GLRO(dl_minsigstacksize), but never less than PTHREAD_STACK_MIN, and the
default PTHREAD_STACK_MIN is expanded to 20K bytes.
---
 .../unix/sysv/linux/riscv/bits/pthread_stack_min.h | 21 ++++++++++++
 .../sysv/linux/riscv/sysconf-pthread_stack_min.h   | 39 ++++++++++++++++++++++
 2 files changed, 60 insertions(+)
 create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/pthread_stack_min.h
 create mode 100644 sysdeps/unix/sysv/linux/riscv/sysconf-pthread_stack_min.h

diff --git a/sysdeps/unix/sysv/linux/riscv/bits/pthread_stack_min.h b/sysdeps/unix/sysv/linux/riscv/bits/pthread_stack_min.h
new file mode 100644
index 0000000..83585b3
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/bits/pthread_stack_min.h
@@ -0,0 +1,21 @@
+/* Definition of PTHREAD_STACK_MIN.  Linux/riscv version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public License as
+   published by the Free Software Foundation; either version 2.1 of the
+   License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* Minimum size for a thread.  We are free to choose a reasonable value.  */
+#define PTHREAD_STACK_MIN	20480
diff --git a/sysdeps/unix/sysv/linux/riscv/sysconf-pthread_stack_min.h b/sysdeps/unix/sysv/linux/riscv/sysconf-pthread_stack_min.h
new file mode 100644
index 0000000..53ba6a1
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/sysconf-pthread_stack_min.h
@@ -0,0 +1,39 @@
+/* __get_pthread_stack_min ().  Linux version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* Return sysconf (_SC_THREAD_STACK_MIN).  */
+
+static inline long int
+__get_pthread_stack_min (void)
+{
+  /* sysconf (_SC_THREAD_STACK_MIN) >= sysconf (_SC_MINSIGSTKSZ).  */
+  long int pthread_stack_min = GLRO(dl_minsigstacksize) * 4;
+  assert (pthread_stack_min != 0);
+  _Static_assert (__builtin_constant_p (PTHREAD_STACK_MIN),
+		  "PTHREAD_STACK_MIN is constant");
+  /* Return MAX (PTHREAD_STACK_MIN, pthread_stack_min).  */
+  if (pthread_stack_min < PTHREAD_STACK_MIN)
+    pthread_stack_min = PTHREAD_STACK_MIN;
+  /* We have a private interface, __pthread_get_minstack@GLIBC_PRIVATE
+     which returns a larger size that includes the required TLS variable
+     space which has been determined at startup.  For sysconf here we are
+     conservative and don't include the space required for TLS access.
+     Eventually the TLS variable space will not be part of the stack
+     (Bug 11787).  */
+  return pthread_stack_min;
+}
-- 
2.7.4
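
The selection rule in __get_pthread_stack_min above can be restated as a
pure function, which makes the numbers easy to check. This is a sketch
with hypothetical names; the second helper shows how an application might
ask for the value at run time, using the patch's 20480 only as a fallback.

```c
#include <assert.h>
#include <unistd.h>

/* MAX (static_min, 4 * minsigstacksize), as in the patch.  */
static long
demo_stack_min (long minsigstacksize, long static_min)
{
  long dynamic_min = minsigstacksize * 4;
  return dynamic_min < static_min ? static_min : dynamic_min;
}

/* Application side: prefer the run-time value over the macro.  */
static long
demo_runtime_stack_min (void)
{
  long min = sysconf (_SC_THREAD_STACK_MIN);
  return min > 0 ? min : 20480;
}
```

With a kernel-reported minimum of 5376 bytes the dynamic value is 21504,
which already exceeds the 20480-byte static default, so callers that size
thread stacks from the PTHREAD_STACK_MIN macro alone may come up short.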



* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-13  1:41 ` [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion Vincent Chen
@ 2021-09-13 13:44   ` Florian Weimer via Libc-alpha
  2021-09-13 13:52     ` Rich Felker
  0 siblings, 1 reply; 79+ messages in thread
From: Florian Weimer via Libc-alpha @ 2021-09-13 13:44 UTC (permalink / raw)
  To: Vincent Chen; +Cc: libc-alpha, andrew

* Vincent Chen:

> Follow the change to struct sigcontext in Linux and reserve about 5K of
> space to support future ISA expansion.
> ---
>  sysdeps/unix/sysv/linux/riscv/sys/ucontext.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> index cfafa44..80caf07 100644
> --- a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> +++ b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> @@ -82,6 +82,8 @@ typedef struct mcontext_t
>    {
>      __riscv_mc_gp_state __gregs;
>      union  __riscv_mc_fp_state __fpregs;
> +    /* 5K + 256 reserved for vector state and future expansion.  */
> +    unsigned char __reserved[5376] __attribute__ ((__aligned__ (16)));
>    } mcontext_t;

This changes the size of struct ucontext_t, which is an ABI break
(getcontext callers are supposed to provide their own object).

This shouldn't be necessary if the additional vector registers are
caller-saved.
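
To make the size change concrete, here is a rough sketch using stand-in
structs. The layouts are assumptions modelled on the removed
bits/sigcontext.h (32 integer slots plus a 16-byte-aligned 66-entry FP
area), not the real mcontext_t definitions, and the demo_ names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for mcontext_t before the patch.  */
struct demo_mcontext_old
{
  unsigned long gregs[32];
  unsigned long long fpregs[66] __attribute__ ((__aligned__ (16)));
};

/* Stand-in for mcontext_t after the patch.  */
struct demo_mcontext_new
{
  unsigned long gregs[32];
  unsigned long long fpregs[66] __attribute__ ((__aligned__ (16)));
  unsigned char reserved[5376] __attribute__ ((__aligned__ (16)));
};

/* Number of bytes by which the structure grows.  */
static size_t
demo_growth (void)
{
  return sizeof (struct demo_mcontext_new)
	 - sizeof (struct demo_mcontext_old);
}

/* Nonzero if the reserved area starts at or past the end of an object
   that was sized for the old layout.  */
static int
demo_reserved_is_past_old_end (void)
{
  return offsetof (struct demo_mcontext_new, reserved)
	 >= sizeof (struct demo_mcontext_old);
}
```

A binary built against the old header passes getcontext an object of the
old size, so any store into the reserved area by a new libc would land
outside the caller's object.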

Thanks,
Florian



* Re: [RFC patch 4/5] RISC-V: Extend MINSIGSTKSZ and SIGSTKSZ to backup RVV registers
  2021-09-13  1:41 ` [RFC patch 4/5] RISC-V: Extend MINSIGSTKSZ and SIGSTKSZ to backup RVV registers Vincent Chen
@ 2021-09-13 13:51   ` Rich Felker
  2021-09-16  9:25     ` Vincent Chen
  0 siblings, 1 reply; 79+ messages in thread
From: Rich Felker @ 2021-09-13 13:51 UTC (permalink / raw)
  To: Vincent Chen; +Cc: libc-alpha, andrew

On Mon, Sep 13, 2021 at 09:41:17AM +0800, Vincent Chen wrote:
> When the RVV extension is in use, the original MINSIGSTKSZ is not large
> enough to back up all RVV registers in the normal case. Therefore,
> MINSIGSTKSZ is expanded to about 5K and SIGSTKSZ is expanded to 16K. This
> space is enough when the VLENB of a vector register is 128 bytes. When
> VLENB > 128 bytes, users can call sysconf (_SC_MINSIGSTKSZ) and
> sysconf (_SC_SIGSTKSZ) to get the appropriate signal stack size.
> ---
>  sysdeps/unix/sysv/linux/riscv/bits/sigstack.h | 32 +++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/sigstack.h
> 
> diff --git a/sysdeps/unix/sysv/linux/riscv/bits/sigstack.h b/sysdeps/unix/sysv/linux/riscv/bits/sigstack.h
> new file mode 100644
> index 0000000..c18512f
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/riscv/bits/sigstack.h
> @@ -0,0 +1,32 @@
> +/* sigstack, sigaltstack definitions.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _BITS_SIGSTACK_H
> +#define _BITS_SIGSTACK_H 1
> +
> +#if !defined _SIGNAL_H && !defined _SYS_UCONTEXT_H
> +# error "Never include this file directly.  Use <signal.h> instead"
> +#endif
> +
> +/* Minimum stack size (5k+256 bytes) for a signal handler.  */
> +#define MINSIGSTKSZ	5376
> +
> +/* System default stack size.  */
> +#define SIGSTKSZ	16384
> +
> +#endif /* bits/sigstack.h */
> -- 
> 2.7.4

Strictly speaking this is also an ABI change (and what the kernel is
doing is too). If possible I think there should be an effort to get
the riscv folks to rethink this. Aside from being breakage, large
state that has to be saved/restored at context switch time is an
anti-feature. Any reasonable amount of vector state fits in the
existing size.

If it is to be changed, I suspect 5376 is too small. IIRC other archs
that have large (4k or so?) register files used something like 6k as
the min (1-2k margin for actual execution space).
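
The margin in question can be worked out with a few lines of arithmetic.
This sketch assumes only that RVV has 32 vector registers of VLENB bytes
each; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* RVV defines 32 vector registers, v0 through v31.  */
#define DEMO_NUM_VREGS 32

/* Bytes needed to hold the whole vector register file.  */
static size_t
demo_vreg_bytes (size_t vlenb)
{
  return DEMO_NUM_VREGS * vlenb;
}

/* Bytes left in a stack of MINSIGSTKSZ bytes once the vector registers
   are stored, or 0 if they do not fit at all.  */
static size_t
demo_margin (size_t minsigstksz, size_t vlenb)
{
  size_t need = demo_vreg_bytes (vlenb);
  return minsigstksz > need ? minsigstksz - need : 0;
}
```

For VLENB = 128 the register file takes 4096 bytes, leaving 1280 of the
proposed 5376 for the integer and FP state, the rest of the signal frame,
and the handler itself.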

Rich


* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-13 13:44   ` Florian Weimer via Libc-alpha
@ 2021-09-13 13:52     ` Rich Felker
  2021-09-16  8:02       ` Vincent Chen
  0 siblings, 1 reply; 79+ messages in thread
From: Rich Felker @ 2021-09-13 13:52 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Vincent Chen, libc-alpha, andrew

On Mon, Sep 13, 2021 at 03:44:09PM +0200, Florian Weimer via Libc-alpha wrote:
> * Vincent Chen:
> 
> > Follow the change to struct sigcontext in Linux and reserve about 5K of
> > space to support future ISA expansion.
> > ---
> >  sysdeps/unix/sysv/linux/riscv/sys/ucontext.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> > index cfafa44..80caf07 100644
> > --- a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> > +++ b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> > @@ -82,6 +82,8 @@ typedef struct mcontext_t
> >    {
> >      __riscv_mc_gp_state __gregs;
> >      union  __riscv_mc_fp_state __fpregs;
> > +    /* 5K + 256 reserved for vector state and future expansion.  */
> > +    unsigned char __reserved[5376] __attribute__ ((__aligned__ (16)));
> >    } mcontext_t;
> 
> This changes the size of struct ucontext_t, which is an ABI break
> (getcontext callers are supposed to provide their own object).
> 
> This shouldn't be necessary if the additional vector registers are
> caller-saved.

Indeed, that was my first thought when I saw this too. Any late
additions to the register file must be call-clobbered or else they are
a new ABI. And mcontext_t does not need to represent any
call-clobbered state.

Rich


* Re: [RFC patch 0/5] RISC-V: Add vector ISA support
  2021-09-13  1:41 [RFC patch 0/5] RISC-V: Add vector ISA support Vincent Chen
                   ` (4 preceding siblings ...)
  2021-09-13  1:41 ` [RFC 5/5] RISC-V: Expand PTHREAD_STACK_MIN to support RVV environment Vincent Chen
@ 2021-09-13 19:11 ` Vineet Gupta via Libc-alpha
  2021-09-15 19:37   ` Jim Wilson
  2021-11-09 19:21 ` Darius Rad
  6 siblings, 1 reply; 79+ messages in thread
From: Vineet Gupta via Libc-alpha @ 2021-09-13 19:11 UTC (permalink / raw)
  To: Vincent Chen, libc-alpha, palmer; +Cc: andrew

On 9/12/21 6:41 PM, Vincent Chen wrote:
> This patchset adds required ports to support RISC-V Vector (RVV) extension.
> 
> Since the length of the vector register in RVV (the theoretical maximum
> is 2^XLEN-1 bits) is variable, a huge and flexible space is needed to back
> up all vector registers in the signal context. This patchset expands the
> default SIGSTKSZ, MINSIGSTKSZ, and PTHREAD_STACK_MIN to ensure the stack
> size is enough for the normal case (VLENB <= 128 bytes). Linux kernel also
> places the exact minimum signal stack size in AT_MINSIGSTKSZ entry of the
> auxiliary vector to inform user, so user still can know the suitable minimum
> signal stack size by sysconf (_SC_MINSIGSTKSZ) if the VLENB is greater
> than 128 bytes.
> 
> In addition, according to the specification, the VCSR that combines VXRM and
> VXSAT has thread storage duration, so this patchset adds the required user
> context operation for it.
> 
> Finally, the RISC-V glibc customized sigcontext.h has been removed in this
> patchset to reduce the synchronization work when new extension support is
> introduced to the Linux environment. However, it may bring some backward
> incompatibility issues. Therefore, I sent an RFC patch
> (https://sourceware.org/pipermail/libc-alpha/2020-June/115549.html)
> to discuss this modification before this patchset. As I mentioned in the
> RFC patch thread, I used OpenEmbedded to evaluate the impact. During the
> tests, I didn't get any compiler errors. Therefore, I infer that this
> modification may not cause severe backward incompatibility issues at this
> moment.
> 
> 1. The RISC-V V-extension draft v1.0 can be found in
> https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc
> 2. The associated kernel implementation can be found in
> http://lists.infradead.org/pipermail/linux-riscv/2021-September/008249.html
> 3. QEMU with RISC-V V-extension support can be found in
> https://github.com/sifive/qemu/tree/rvv-1.0

What about gcc/binutils: sifive forks for those have quite a few 
branches with rvv suffix, but it is not obvious which one pertains to 
the specific version implemented in qemu above.

Thx,
-Vineet

> 
> Vincent Chen (5):
>    RISC-V: Remove riscv-specific sigcontext.h
>    RISC-V: Reserve about 5K space in mcontext_t to support future ISA
>      expansion.
>    RISC-V: Save and restore VCSR when doing user context switch
>    RISC-V: Extend MINSIGSTKSZ and SIGSTKSZ to backup RVV registers
>    RISC-V: Expand PTHREAD_STACK_MIN to support RVV environment
> 
>   sysdeps/riscv/Makefile                             |  5 +++
>   sysdeps/riscv/rtld-global-offsets.sym              |  7 ++++
>   sysdeps/unix/sysv/linux/riscv/bits/hwcap.h         | 31 ++++++++++++++++
>   .../unix/sysv/linux/riscv/bits/pthread_stack_min.h | 21 +++++++++++
>   sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h    | 31 ----------------
>   sysdeps/unix/sysv/linux/riscv/bits/sigstack.h      | 32 +++++++++++++++++
>   sysdeps/unix/sysv/linux/riscv/getcontext.S         | 22 +++++++++++-
>   sysdeps/unix/sysv/linux/riscv/setcontext.S         | 22 ++++++++++++
>   sysdeps/unix/sysv/linux/riscv/swapcontext.S        | 41 ++++++++++++++++++++++
>   sysdeps/unix/sysv/linux/riscv/sys/ucontext.h       |  2 ++
>   .../sysv/linux/riscv/sysconf-pthread_stack_min.h   | 39 ++++++++++++++++++++
>   sysdeps/unix/sysv/linux/riscv/sysdep.h             |  1 +
>   sysdeps/unix/sysv/linux/riscv/ucontext_i.sym       |  6 ++++
>   13 files changed, 228 insertions(+), 32 deletions(-)
>   create mode 100644 sysdeps/riscv/rtld-global-offsets.sym
>   create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/hwcap.h
>   create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/pthread_stack_min.h
>   delete mode 100644 sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
>   create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/sigstack.h
>   create mode 100644 sysdeps/unix/sysv/linux/riscv/sysconf-pthread_stack_min.h
> 



* Re: [RFC 5/5] RISC-V: Expand PTHREAD_STACK_MIN to support RVV environment
  2021-09-13  1:41 ` [RFC 5/5] RISC-V: Expand PTHREAD_STACK_MIN to support RVV environment Vincent Chen
@ 2021-09-14 23:43   ` Joseph Myers
  2021-09-15 10:42     ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 79+ messages in thread
From: Joseph Myers @ 2021-09-14 23:43 UTC (permalink / raw)
  To: Vincent Chen; +Cc: libc-alpha, andrew

On Mon, 13 Sep 2021, Vincent Chen wrote:

> In order to support all pthread operations in the RVV environment, here
> PTHREAD_STACK_MIN is set to 4 times GLRO(dl_minsigstacksize), and the
> default PTHREAD_STACK_MIN is expanded to 20K bytes.

A change to PTHREAD_STACK_MIN has been considered an ABI change in the 
past, requiring new symbol versions for pthread_attr_setstack and 
pthread_attr_setstacksize to ensure that binaries built with the old 
PTHREAD_STACK_MIN definition continue to work rather than failing because 
the old size is too small.  You may need symbol versioning updates for 
those functions in RISC-V if you make such a change.  (All the existing 
versioning support for this in architecture-independent files assumes the 
change in value was done before libpthread was merged into libc, so there 
will be some extra work involved in being the first architecture to 
increase PTHREAD_STACK_MIN after that merge.)
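
The failure mode described above can be sketched as follows. This is a
minimal illustration, assuming a POSIX threads implementation; the
helper name try_stack_size is hypothetical. A binary that baked in the
old constant passes that number to pthread_attr_setstacksize, and a
library whose minimum has grown rejects it with EINVAL.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Returns 0 if the running libc accepts SIZE as a thread stack size,
   otherwise the error code (EINVAL when SIZE is below the minimum the
   library enforces).  */
int
try_stack_size (size_t size)
{
  pthread_attr_t attr;
  int ret = pthread_attr_init (&attr);
  if (ret != 0)
    return ret;
  ret = pthread_attr_setstacksize (&attr, size);
  pthread_attr_destroy (&attr);
  return ret;
}
```

Calling try_stack_size with the PTHREAD_STACK_MIN value an old binary
was compiled against shows whether that binary would keep working; a
versioned symbol is what would let the library go on accepting the old
value for such binaries.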

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [RFC patch 3/5] RISC-V: Save and restore VCSR when doing user context switch
  2021-09-13  1:41 ` [RFC patch 3/5] RISC-V: Save and restore VCSR when doing user context switch Vincent Chen
@ 2021-09-14 23:48   ` Joseph Myers
  2021-09-15  0:13     ` Andrew Waterman
  2021-10-01 13:04   ` Adhemerval Zanella via Libc-alpha
  1 sibling, 1 reply; 79+ messages in thread
From: Joseph Myers @ 2021-09-14 23:48 UTC (permalink / raw)
  To: Vincent Chen; +Cc: libc-alpha, andrew

On Mon, 13 Sep 2021, Vincent Chen wrote:

> According to the RISC-V V extension specification, all vector registers
> except VCSR are caller-saved registers. The VCSR (vxrm + vxsat) has thread
> storage duration. Therefore, only VCSR needs to be added to the user
> context operation.

What is the intended programming model for using vxrm and vxsat?

The expectation for the floating-point rounding modes and flags is that 
they work just like user-defined _Thread_local variables - that is, they 
are *not* saved or restored by setjmp/longjmp or *context functions.  It 
would be natural to expect fixed-point rounding modes and flags to work 
similarly.
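
The _Thread_local analogy can be illustrated with a minimal sketch. The
variable mode is a hypothetical stand-in for a per-thread mode register
such as vxrm (an assumption for illustration; no real CSR is touched).
longjmp returns to the saved call frame but leaves thread-local state as
it was at the time of the jump.

```c
#include <setjmp.h>

/* Hypothetical stand-in for a per-thread mode register.  */
static _Thread_local int mode;
static jmp_buf env;

/* Changes the mode after setjmp, jumps back, and returns the mode that
   is visible afterwards.  */
int
mode_after_longjmp (void)
{
  mode = 0;
  if (setjmp (env) == 0)
    {
      mode = 3;
      longjmp (env, 1);
    }
  /* Reached via longjmp: the later assignment is still in effect.  */
  return mode;
}
```

Under this model, code that changes a rounding mode and then leaves via
longjmp or setcontext is responsible for restoring the mode itself.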

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [RFC patch 3/5] RISC-V: Save and restore VCSR when doing user context switch
  2021-09-14 23:48   ` Joseph Myers
@ 2021-09-15  0:13     ` Andrew Waterman
  2021-09-16  9:20       ` Vincent Chen
  0 siblings, 1 reply; 79+ messages in thread
From: Andrew Waterman @ 2021-09-15  0:13 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Vincent Chen, libc-alpha

On Tue, Sep 14, 2021 at 4:48 PM Joseph Myers <joseph@codesourcery.com> wrote:
>
> On Mon, 13 Sep 2021, Vincent Chen wrote:
>
> > According to the RISC-V V extension specification, all vector registers
> > except VCSR are caller-saved registers. The VCSR (vxrm + vxsat) has thread
> > storage duration. Therefore, only VCSR needs to be added to the user
> > context operation.
>
> What is the intended programming model for using vxrm and vxsat?
>
> The expectation for the floating-point rounding modes and flags is that
> they work just like user-defined _Thread_local variables - that is, they
> are *not* saved or restored by setjmp/longjmp or *context functions.  It
> would be natural to expect fixed-point rounding modes and flags to work
> similarly.

Indeed, Joseph, vxsat and vxrm should be treated analogously to the FP
flags and rounding mode, respectively.

>
> --
> Joseph S. Myers
> joseph@codesourcery.com


* Re: [RFC 5/5] RISC-V: Expand PTHREAD_STACK_MIN to support RVV environment
  2021-09-14 23:43   ` Joseph Myers
@ 2021-09-15 10:42     ` Florian Weimer via Libc-alpha
  2021-09-15 14:31       ` H.J. Lu via Libc-alpha
  0 siblings, 1 reply; 79+ messages in thread
From: Florian Weimer via Libc-alpha @ 2021-09-15 10:42 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Vincent Chen, libc-alpha, andrew

* Joseph Myers:

> On Mon, 13 Sep 2021, Vincent Chen wrote:
>
>> In order to support all pthread operations in the RVV environment, here
>> PTHREAD_STACK_MIN is set to 4 times GLRO(dl_minsigstacksize), and the
>> default PTHREAD_STACK_MIN is expanded to 20K bytes.
>
> A change to PTHREAD_STACK_MIN has been considered an ABI change in the 
> past, requiring new symbol versions for pthread_attr_setstack and 
> pthread_attr_setstacksize to ensure that binaries built with the old 
> PTHREAD_STACK_MIN definition continue to work rather than failing because 
> the old size is too small.  You may need symbol versioning updates for 
> those functions in RISC-V if you make such a change.  (All the existing 
> versioning support for this in architecture-independent files assumes the 
> change in value was done before libpthread was merged into libc, so there 
> will be some extra work involved in being the first architecture to 
> increase PTHREAD_STACK_MIN after that merge.)

Instead it may make sense to leave PTHREAD_STACK_MIN as is and switch to
the dynamic version (same for the SIGSTKSZ constants).

Thanks,
Florian



* Re: [RFC 5/5] RISC-V: Expand PTHREAD_STACK_MIN to support RVV environment
  2021-09-15 10:42     ` Florian Weimer via Libc-alpha
@ 2021-09-15 14:31       ` H.J. Lu via Libc-alpha
  2021-09-16 10:21         ` Vincent Chen
  0 siblings, 1 reply; 79+ messages in thread
From: H.J. Lu via Libc-alpha @ 2021-09-15 14:31 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Vincent Chen, GNU C Library, Andrew Waterman, Joseph Myers

On Wed, Sep 15, 2021 at 3:42 AM Florian Weimer via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> * Joseph Myers:
>
> > On Mon, 13 Sep 2021, Vincent Chen wrote:
> >
> >> In order to support all pthread operations in the RVV environment, here
> >> PTHREAD_STACK_MIN is set to 4 times GLRO(dl_minsigstacksize), and the
> >> default PTHREAD_STACK_MIN is expanded to 20K bytes.
> >
> > A change to PTHREAD_STACK_MIN has been considered an ABI change in the
> > past, requiring new symbol versions for pthread_attr_setstack and
> > pthread_attr_setstacksize to ensure that binaries built with the old
> > PTHREAD_STACK_MIN definition continue to work rather than failing because
> > the old size is too small.  You may need symbol versioning updates for
> > those functions in RISC-V if you make such a change.  (All the existing
> > versioning support for this in architecture-independent files assumes the
> > change in value was done before libpthread was merged into libc, so there
> > will be some extra work involved in being the first architecture to
> > increase PTHREAD_STACK_MIN after that merge.)
>
> Instead it may make sense to leave PTHREAD_STACK_MIN as is and switch to
> the dynamic version (same for the SIGSTKSZ constants).
>

I don't know what problem RISC-V ran into.   It should be fixed with:

commit 5d98a7dae955bafa6740c26eaba9c86060ae0344
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Jun 21 12:42:56 2021 -0700

    Define PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN)

    The constant PTHREAD_STACK_MIN may be too small for some processors.
    Rename _SC_SIGSTKSZ_SOURCE to _DYNAMIC_STACK_SIZE_SOURCE.  When
    _DYNAMIC_STACK_SIZE_SOURCE or _GNU_SOURCE are defined, define
    PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN) which is changed
    to MIN (PTHREAD_STACK_MIN, sysconf(_SC_MINSIGSTKSZ)).

    Consolidate <bits/local_lim.h> with <bits/pthread_stack_min.h> to
    provide a constant target specific PTHREAD_STACK_MIN value.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>
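
A minimal sketch of the run-time query that the commit message above
refers to, assuming a libc that answers sysconf (_SC_THREAD_STACK_MIN);
the helper name init_attr_with_min_stack is hypothetical. It sizes the
thread stack from the value reported at run time instead of relying on
the compile-time constant.

```c
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Initializes ATTR with a stack of at least REQUESTED bytes, raised to
   the run-time minimum when that is larger.  Returns 0 or an error
   code; ATTR is left destroyed on failure.  */
int
init_attr_with_min_stack (pthread_attr_t *attr, size_t requested)
{
  long min = sysconf (_SC_THREAD_STACK_MIN);
  size_t size = requested;
  if (min > 0 && (size_t) min > size)
    size = (size_t) min;

  int ret = pthread_attr_init (attr);
  if (ret != 0)
    return ret;
  ret = pthread_attr_setstacksize (attr, size);
  if (ret != 0)
    pthread_attr_destroy (attr);
  return ret;
}
```

A program written this way keeps working when the hardware needs a
larger minimum, without the header constant having to change.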

--
H.J.


* Re: [RFC patch 0/5] RISC-V: Add vector ISA support
  2021-09-13 19:11 ` [RFC patch 0/5] RISC-V: Add vector ISA support Vineet Gupta via Libc-alpha
@ 2021-09-15 19:37   ` Jim Wilson
  0 siblings, 0 replies; 79+ messages in thread
From: Jim Wilson @ 2021-09-15 19:37 UTC (permalink / raw)
  To: Vineet Gupta; +Cc: Vincent Chen, GNU C Library, Andrew Waterman

On Mon, Sep 13, 2021 at 12:11 PM Vineet Gupta via Libc-alpha <
libc-alpha@sourceware.org> wrote:

> What about gcc/binutils: sifive forks for those have quite a few
> branches with rvv suffix, but it is not obvious which one pertains to
> the specific version implemented in qemu above.
>

Binutils RVV support can be found on the RISC-V integration branch in the
FSF Binutils tree: users/riscv/binutils-integration-branch.

There is no usable RVV GCC port.  One was started and then mostly
abandoned.  It still gets the occasional patch, but it should not be
trusted.  RVV work nowadays is happening only in clang and is available on
mainline in the llvm.org git tree.  Special options are required to enable
it though.  If you really want to try an RVV GCC port, you can find one at
github.com/riscv/riscv-gnu-toolchain on the rvv-intrinsics branch.  Note
that the submodules are not being kept up to date, but the branch names are
in the .gitmodules file, so you can use "git submodule update --remote" to
get the most recent commit for them.  I would however advise against trying
to use this compiler for any serious work.  It is buggy and out of date.
Use clang instead.  But since we aren't trying to put vector intrinsic
calls into glibc or the kernel, all you should really need is the assembler.

Jim


* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-13 13:52     ` Rich Felker
@ 2021-09-16  8:02       ` Vincent Chen
  2021-09-16  8:14         ` Florian Weimer via Libc-alpha
                           ` (2 more replies)
  0 siblings, 3 replies; 79+ messages in thread
From: Vincent Chen @ 2021-09-16  8:02 UTC (permalink / raw)
  To: Rich Felker; +Cc: Florian Weimer, GNU C Library, Andrew Waterman

On Mon, Sep 13, 2021 at 9:52 PM Rich Felker <dalias@libc.org> wrote:
>
> On Mon, Sep 13, 2021 at 03:44:09PM +0200, Florian Weimer via Libc-alpha wrote:
> > * Vincent Chen:
> >
> > > Following the changes of struct sigcontext in Linux to reserve about 5K space
> > > to support future ISA expansion.
> > > ---
> > >  sysdeps/unix/sysv/linux/riscv/sys/ucontext.h | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> > > index cfafa44..80caf07 100644
> > > --- a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> > > +++ b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> > > @@ -82,6 +82,8 @@ typedef struct mcontext_t
> > >    {
> > >      __riscv_mc_gp_state __gregs;
> > >      union  __riscv_mc_fp_state __fpregs;
> > > +    /* 5K + 256 reserved for vector state and future expansion.  */
> > > +    unsigned char __reserved[5376] __attribute__ ((__aligned__ (16)));
> > >    } mcontext_t;
> >
Hi Florian and Rich,
Sorry for the late reply, and thank you for pointing out that this
modification causes an ABI break.

> > This changes the size of struct ucontext_t, which is an ABI break
> > (getcontext callers are supposed to provide their own object).
> >

The RISC-V vector registers are all caller-saved except for VCSR, so
struct mcontext_t needs to reserve space for it. In addition, the
RISC-V ISA is still growing, so I would also like struct mcontext_t to
have room for future expansion. For these reasons, I reserved 5K of
space here.

> > This shouldn't be necessary if the additional vector registers are
> > caller-saved.

I am a little confused about the usage of struct mcontext_t here. As
far as I know, struct mcontext_t is used to save machine-specific
information in user context operations. In that case, struct
mcontext_t only needs to reserve space for saving caller-saved
registers. However, in a signal handler, the user can also use
uc_mcontext, whose type is struct mcontext_t, to access the contents of
the signal context. In that case, struct mcontext_t may need to match
the struct sigcontext defined by the kernel. That conflicts with your
suggestion, because struct sigcontext cannot reserve space only for
caller-saved registers. Could you point out where my understanding is
wrong? Thank you.

> Indeed, that was my first thought when I saw this too. Any late
> additions to the register file must be call-clobbered or else they are
> a new ABI. And mcontext_t does not need to represent any
> call-clobbered state.
>
> Rich


* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-16  8:02       ` Vincent Chen
@ 2021-09-16  8:14         ` Florian Weimer via Libc-alpha
  2021-09-18  3:04           ` Vincent Chen
  2021-09-16 23:56         ` [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion Ben Woodard via Libc-alpha
  2021-09-17 17:03         ` Rich Felker
  2 siblings, 1 reply; 79+ messages in thread
From: Florian Weimer via Libc-alpha @ 2021-09-16  8:14 UTC (permalink / raw)
  To: Vincent Chen; +Cc: Rich Felker, GNU C Library, Andrew Waterman

* Vincent Chen:

>> > This changes the size of struct ucontext_t, which is an ABI break
>> > (getcontext callers are supposed to provide their own object).
>> >
>
> The riscv vector registers are all caller-saved registers except for
> VCSR. Therefore, the struct mcontext_t needs to reserve a space for
> it. In addition, RISCV ISA is growing, so I also hope the struct
> mcontext_t has a space for future expansion. Based on the above ideas,
> I reserved a 5K space here.

You have reserved space in ucontext_t that you could use for this.

>> > This shouldn't be necessary if the additional vector registers are
>> > caller-saved.
>
> Here I am a little confused about the usage of struct mcontext_t. As
> far as I know, the struct mcontext_t is used to save the
> machine-specific information in user context operation. Therefore, in
> this case, the struct mcontext_t is allowed to reserve the space only
> for saving caller-saved registers. However, in the signal handler, the
> user seems to be allowed to use uc_mcontext whose data type is struct
> mcontext_t to access the content of the signal context. In this case,
> the struct mcontext_t may need to be the same as the struct sigcontext
> defined at kernel. However, it will have a conflict with your
> suggestion because the struct sigcontext cannot just reserve a space
> for saving caller-saved registers. Could you help me point out my
> misunderstanding? Thank you.

struct sigcontext is allocated by the kernel, so you can have pointers
in reserved fields to out-of-line state, or place the extra state after
struct sigcontext.

I don't know how the kernel implements this, but there is considerable
flexibility and extensibility.  The main issue comes from small stacks,
which are incompatible with large register files.
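
The distinction can be observed from user space with a minimal sketch:
the ucontext_t handed to an SA_SIGINFO handler is storage set up by the
kernel at signal delivery, not an object the program allocated, which is
why it can be extended in ways a caller-provided getcontext object
cannot. The names handler, seen_mcontext and capture_mcontext_address
are hypothetical, chosen for illustration.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <ucontext.h>

/* Address of the machine context seen by the last handler call.  */
static void *volatile seen_mcontext;

/* SA_SIGINFO handler: the third argument points to the ucontext_t that
   was created for this signal delivery.  */
static void
handler (int sig, siginfo_t *info, void *ucv)
{
  ucontext_t *uc = ucv;
  (void) sig;
  (void) info;
  seen_mcontext = &uc->uc_mcontext;
}

/* Installs the handler, raises SIGUSR1, and returns the recorded
   address (NULL if the handler did not run).  */
void *
capture_mcontext_address (void)
{
  struct sigaction sa;
  memset (&sa, 0, sizeof sa);
  sa.sa_sigaction = handler;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset (&sa.sa_mask);
  seen_mcontext = NULL;
  if (sigaction (SIGUSR1, &sa, NULL) != 0)
    return NULL;
  raise (SIGUSR1);
  return seen_mcontext;
}
```

The returned pointer is only valid while the handler runs; the sketch
records it purely to show that the program never allocated that object.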

Thanks,
Florian



* Re: [RFC patch 3/5] RISC-V: Save and restore VCSR when doing user context switch
  2021-09-15  0:13     ` Andrew Waterman
@ 2021-09-16  9:20       ` Vincent Chen
  0 siblings, 0 replies; 79+ messages in thread
From: Vincent Chen @ 2021-09-16  9:20 UTC (permalink / raw)
  To: Andrew Waterman; +Cc: GNU C Library, Joseph Myers

On Wed, Sep 15, 2021 at 8:13 AM Andrew Waterman <andrew@sifive.com> wrote:
>
> On Tue, Sep 14, 2021 at 4:48 PM Joseph Myers <joseph@codesourcery.com> wrote:
> >
> > On Mon, 13 Sep 2021, Vincent Chen wrote:
> >
> > > According to the RISC-V V extension specification, all vector registers
> > > except VCSR are caller-saved registers. The VCSR (vxrm + vxsat) has thread
> > > storage duration. Therefore, only VCSR needs to be added to the user
> > > context operation.
> >
> > What is the intended programming model for using vxrm and vxsat?
> >
> > The expectation for the floating-point rounding modes and flags is that
> > they work just like user-defined _Thread_local variables - that is, they
> > are *not* saved or restored by setjmp/longjmp or *context functions.  It
> > would be natural to expect fixed-point rounding modes and flags to work
> > similarly.
>
> Indeed, Joseph, vxsat and vxrm should be treated analogously to the FP
> flags and rounding mode, respectively.
>

OK, I understand. As Andrew mentioned, vxsat and vxrm should be
treated analogously to the FP flags and rounding mode, so VCSR should
not be saved and restored in the *context functions. I think this patch
can be dropped from the next version of the series. Thanks to Joseph
and Andrew for the kind replies.

Thanks,
Vincent
> >
> > --
> > Joseph S. Myers
> > joseph@codesourcery.com


* Re: [RFC patch 4/5] RISC-V: Extend MINSIGSTKSZ and SIGSTKSZ to backup RVV registers
  2021-09-13 13:51   ` Rich Felker
@ 2021-09-16  9:25     ` Vincent Chen
  0 siblings, 0 replies; 79+ messages in thread
From: Vincent Chen @ 2021-09-16  9:25 UTC (permalink / raw)
  To: Rich Felker; +Cc: GNU C Library, Andrew Waterman

On Mon, Sep 13, 2021 at 9:51 PM Rich Felker <dalias@libc.org> wrote:
>
> On Mon, Sep 13, 2021 at 09:41:17AM +0800, Vincent Chen wrote:
> > As using RVV extension, the original MINSIGSTKSZ is not enough to
> > back up all RVV registers for the normal case. Therefore, the MINSIGSTKSZ
> > is expanded to about 5K and the SIGSTKSZ is expanded to about 16K. This
> > space is enough for the case that the VLENB of a vector register is 128
> > bytes. For the case that VLENB > 128 bytes, users can use
> > sysconf (_SC_MINSIGSTKSZ) and sysconf (_SC_SIGSTKSZ) to get the
> > appropriate signal stack size.
> > ---
> >  sysdeps/unix/sysv/linux/riscv/bits/sigstack.h | 32 +++++++++++++++++++++++++++
> >  1 file changed, 32 insertions(+)
> >  create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/sigstack.h
> >
> > diff --git a/sysdeps/unix/sysv/linux/riscv/bits/sigstack.h b/sysdeps/unix/sysv/linux/riscv/bits/sigstack.h
> > new file mode 100644
> > index 0000000..c18512f
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/riscv/bits/sigstack.h
> > @@ -0,0 +1,32 @@
> > +/* sigstack, sigaltstack definitions.
> > +   Copyright (C) 2021 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#ifndef _BITS_SIGSTACK_H
> > +#define _BITS_SIGSTACK_H 1
> > +
> > +#if !defined _SIGNAL_H && !defined _SYS_UCONTEXT_H
> > +# error "Never include this file directly.  Use <signal.h> instead"
> > +#endif
> > +
> > +/* Minimum stack size (5k+256 bytes) for a signal handler.  */
> > +#define MINSIGSTKSZ  5376
> > +
> > +/* System default stack size.  */
> > +#define SIGSTKSZ     16384
> > +
> > +#endif /* bits/sigstack.h */
> > --
> > 2.7.4
>
> Strictly speaking this is also an ABI change (and what the kernel is
> doing is too). If possible I think there should be an effort to get
> the riscv folks to rethink this. Aside from being breakage, large
> state that has to be saved/restored at context switch time is an
> anti-feature. Any reasonable amount of vector state fits in the
> existing size.
>
> If it is to be changed, I suspect 5376 is too small. IIRC other archs
> that have large (4k or so?) register files used something like 6k as
> the min (1-2k margin for actual execution space).
>
> Rich

Hi Rich,
I will raise this modification on the RISC-V SW mailing list for
discussion. Thank you very much for your suggestions.
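
For reference, the run-time sizing mentioned in the patch description
can be sketched as below. This is a minimal illustration, assuming a
libc that may or may not expose _SC_SIGSTKSZ; the helper names
runtime_sigstksz and install_alt_stack are hypothetical.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns the signal stack size to allocate: the run-time value when
   the libc exposes _SC_SIGSTKSZ, otherwise SIGSTKSZ.  */
size_t
runtime_sigstksz (void)
{
  long size = -1;
#ifdef _SC_SIGSTKSZ
  size = sysconf (_SC_SIGSTKSZ);
#endif
  if (size <= 0)
    size = SIGSTKSZ;
  return (size_t) size;
}

/* Allocates and installs an alternate signal stack of that size.
   Returns 0 on success, -1 on failure.  */
int
install_alt_stack (void)
{
  stack_t ss;
  ss.ss_size = runtime_sigstksz ();
  ss.ss_flags = 0;
  ss.ss_sp = malloc (ss.ss_size);
  if (ss.ss_sp == NULL)
    return -1;
  if (sigaltstack (&ss, NULL) != 0)
    {
      free (ss.ss_sp);
      return -1;
    }
  return 0;
}
```

Programs that size the alternate stack this way do not depend on the
header constants, so they are unaffected by whichever values the
constants end up with.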

Vincent


* Re: [RFC 5/5] RISC-V: Expand PTHREAD_STACK_MIN to support RVV environment
  2021-09-15 14:31       ` H.J. Lu via Libc-alpha
@ 2021-09-16 10:21         ` Vincent Chen
  0 siblings, 0 replies; 79+ messages in thread
From: Vincent Chen @ 2021-09-16 10:21 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Florian Weimer, GNU C Library, Andrew Waterman, Joseph Myers

On Wed, Sep 15, 2021 at 10:31 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Wed, Sep 15, 2021 at 3:42 AM Florian Weimer via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
> >
> > * Joseph Myers:
> >
> > > On Mon, 13 Sep 2021, Vincent Chen wrote:
> > >
> > >> In order to support all pthread operations in the RVV environment, here
> > >> PTHREAD_STACK_MIN is set to 4 times GLRO(dl_minsigstacksize), and the
> > >> default PTHREAD_STACK_MIN is expanded to 20K bytes.
> > >
> > > A change to PTHREAD_STACK_MIN has been considered an ABI change in the
> > > past, requiring new symbol versions for pthread_attr_setstack and
> > > pthread_attr_setstacksize to ensure that binaries built with the old
> > > PTHREAD_STACK_MIN definition continue to work rather than failing because
> > > the old size is too small.  You may need symbol versioning updates for
> > > those functions in RISC-V if you make such a change.  (All the existing
> > > versioning support for this in architecture-independent files assumes the
> > > change in value was done before libpthread was merged into libc, so there
> > > will be some extra work involved in being the first architecture to
> > > increase PTHREAD_STACK_MIN after that merge.)
> >
Hi Joseph,
I understand. I will add symbol versioning to the pthread_attr_setstack
and pthread_attr_setstacksize functions if I decide to change the
PTHREAD_STACK_MIN definition. Thank you.

> > Instead it may make sense to leave PTHREAD_STACK_MIN as is and switch to
> > the dynamic version (same for the SIGSTKSZ constants).
> >
>
> I don't know what problem RISC-V ran into.   It should be fixed with:
>

Hi Florian and H.J. Lu,
The dynamic version works for RISC-V. However, I am afraid that some
existing programs, such as nptl/tst-minstack-cancel, do not define
_DYNAMIC_STACK_SIZE_SOURCE or _GNU_SOURCE. In that case, the original
PTHREAD_STACK_MIN definition is too small to support the V extension.
Therefore, I decided to extend PTHREAD_STACK_MIN to 20K for normal use
cases.

> commit 5d98a7dae955bafa6740c26eaba9c86060ae0344
> Author: H.J. Lu <hjl.tools@gmail.com>
> Date:   Mon Jun 21 12:42:56 2021 -0700
>
>     Define PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN)
>
>     The constant PTHREAD_STACK_MIN may be too small for some processors.
>     Rename _SC_SIGSTKSZ_SOURCE to _DYNAMIC_STACK_SIZE_SOURCE.  When
>     _DYNAMIC_STACK_SIZE_SOURCE or _GNU_SOURCE are defined, define
>     PTHREAD_STACK_MIN to sysconf(_SC_THREAD_STACK_MIN) which is changed
>     to MIN (PTHREAD_STACK_MIN, sysconf(_SC_MINSIGSTKSZ)).
>
>     Consolidate <bits/local_lim.h> with <bits/pthread_stack_min.h> to
>     provide a constant target specific PTHREAD_STACK_MIN value.
>
>     Reviewed-by: Carlos O'Donell <carlos@redhat.com>
>
I really appreciate H.J. Lu providing this feature in glibc; it is
very helpful. I do have one small question about this patch, though,
and would be grateful if H.J. Lu could help me clarify it.

In __get_pthread_stack_min(), when GLRO(dl_minsigstacksize) != 0 and
GLRO(dl_minsigstacksize) > PTHREAD_STACK_MIN, pthread_stack_min is set
to GLRO(dl_minsigstacksize). However, if my understanding is correct,
GLRO(dl_minsigstacksize) is approximately the size of the signal
context. In that case, the remaining free stack seems insufficient for
the GCC unwinder to run if the thread is terminated by pthread_cancel,
as in tst-minstack-cancel.c. So my question is: does PTHREAD_STACK_MIN
not need to reserve space for the unwinder to run? Thank you in
advance.
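
The concern can be put in numbers with a minimal sketch. This
paraphrases the selection as described in this message and is not the
glibc source; constant_min stands in for the compile-time
PTHREAD_STACK_MIN, minsigstacksize for GLRO(dl_minsigstacksize), and
both function names are hypothetical.

```c
#include <stddef.h>

/* Selection as described above: use the run-time signal stack minimum
   when it is known and exceeds the compile-time constant.  */
size_t
stack_min_sketch (size_t constant_min, size_t minsigstacksize)
{
  if (minsigstacksize != 0 && minsigstacksize > constant_min)
    return minsigstacksize;
  return constant_min;
}

/* Bytes left for the thread's own frames, including any unwinder, once
   a signal frame of FRAME_SIZE bytes sits on a stack of STACK_SIZE
   bytes.  */
size_t
headroom_after_signal_frame (size_t stack_size, size_t frame_size)
{
  return stack_size > frame_size ? stack_size - frame_size : 0;
}
```

If the signal frame really is about as large as the run-time minimum
(the assumption made in this message), the headroom drops to roughly
zero exactly in the case where the run-time value is chosen.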

Regards,
Vincent


* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-16  8:02       ` Vincent Chen
  2021-09-16  8:14         ` Florian Weimer via Libc-alpha
@ 2021-09-16 23:56         ` Ben Woodard via Libc-alpha
  2021-09-18  3:15           ` Vincent Chen
  2021-09-17 17:03         ` Rich Felker
  2 siblings, 1 reply; 79+ messages in thread
From: Ben Woodard via Libc-alpha @ 2021-09-16 23:56 UTC (permalink / raw)
  To: Vincent Chen; +Cc: Florian Weimer, Rich Felker, GNU C Library, Andrew Waterman

I know this patch set mostly deals with signal handling but don’t forget LD_AUDIT. It has a similar issue for the plt_enter and exit functions.

-ben

> On Sep 16, 2021, at 1:03 AM, Vincent Chen <vincent.chen@sifive.com> wrote:
> 
> On Mon, Sep 13, 2021 at 9:52 PM Rich Felker <dalias@libc.org> wrote:
>> 
>>> On Mon, Sep 13, 2021 at 03:44:09PM +0200, Florian Weimer via Libc-alpha wrote:
>>> * Vincent Chen:
>>> 
>>>> Following the changes of struct sigcontext in Linux to reserve about 5K space
>>>> to support future ISA expansion.
>>>> ---
>>>> sysdeps/unix/sysv/linux/riscv/sys/ucontext.h | 2 ++
>>>> 1 file changed, 2 insertions(+)
>>>> 
>>>> diff --git a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
>>>> index cfafa44..80caf07 100644
>>>> --- a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
>>>> +++ b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
>>>> @@ -82,6 +82,8 @@ typedef struct mcontext_t
>>>>   {
>>>>     __riscv_mc_gp_state __gregs;
>>>>     union  __riscv_mc_fp_state __fpregs;
>>>> +    /* 5K + 256 reserved for vector state and future expansion.  */
>>>> +    unsigned char __reserved[5376] __attribute__ ((__aligned__ (16)));
>>>>   } mcontext_t;
>>> 
> Hi Florian and Rich,
> Sorry for the late reply and thank you for reminding me the
> modification will cause ABI break.
> 
>>> This changes the size of struct ucontext_t, which is an ABI break
>>> (getcontext callers are supposed to provide their own object).
>>> 
> 
> The riscv vector registers are all caller-saved registers except for
> VCSR. Therefore, the struct mcontext_t needs to reserve a space for
> it. In addition, RISCV ISA is growing, so I also hope the struct
> mcontext_t has a space for future expansion. Based on the above ideas,
> I reserved a 5K space here.
> 
>>> This shouldn't be necessary if the additional vector registers are
>>> caller-saved.
> 
> Here I am a little confused about the usage of struct mcontext_t. As
> far as I know, the struct mcontext_t is used to save the
> machine-specific information in user context operation. Therefore, in
> this case, the struct mcontext_t is allowed to reserve the space only
> for saving caller-saved registers. However, in the signal handler, the
> user seems to be allowed to use uc_mcontext whose data type is struct
> mcontext_t to access the content of the signal context. In this case,
> the struct mcontext_t may need to be the same as the struct sigcontext
> defined at kernel. However, it will have a conflict with your
> suggestion because the struct sigcontext cannot just reserve a space
> for saving caller-saved registers. Could you help me point out my
> misunderstanding? Thank you.
> 
>> Indeed, that was my first thought when I saw this too. Any late
>> additions to the register file must be call-clobbered or else they are
>> a new ABI. And mcontext_t does not need to represent any
>> call-clobbered state.
>> 
>> Rich
> 



* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-16  8:02       ` Vincent Chen
  2021-09-16  8:14         ` Florian Weimer via Libc-alpha
  2021-09-16 23:56         ` [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion Ben Woodard via Libc-alpha
@ 2021-09-17 17:03         ` Rich Felker
  2021-09-18  3:19           ` Vincent Chen
  2 siblings, 1 reply; 79+ messages in thread
From: Rich Felker @ 2021-09-17 17:03 UTC (permalink / raw)
  To: Vincent Chen; +Cc: Florian Weimer, GNU C Library, Andrew Waterman

On Thu, Sep 16, 2021 at 04:02:50PM +0800, Vincent Chen wrote:
> On Mon, Sep 13, 2021 at 9:52 PM Rich Felker <dalias@libc.org> wrote:
> >
> > On Mon, Sep 13, 2021 at 03:44:09PM +0200, Florian Weimer via Libc-alpha wrote:
> > > * Vincent Chen:
> > >
> > > > Following the changes of struct sigcontext in Linux to reserve about 5K space
> > > > to support future ISA expansion.
> > > > ---
> > > >  sysdeps/unix/sysv/linux/riscv/sys/ucontext.h | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> > > > index cfafa44..80caf07 100644
> > > > --- a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> > > > +++ b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> > > > @@ -82,6 +82,8 @@ typedef struct mcontext_t
> > > >    {
> > > >      __riscv_mc_gp_state __gregs;
> > > >      union  __riscv_mc_fp_state __fpregs;
> > > > +    /* 5K + 256 reserved for vector state and future expansion.  */
> > > > +    unsigned char __reserved[5376] __attribute__ ((__aligned__ (16)));
> > > >    } mcontext_t;
> > >
> Hi Florian and Rich,
> Sorry for the late reply and thank you for reminding me the
> modification will cause ABI break.
> 
> > > This changes the size of struct ucontext_t, which is an ABI break
> > > (getcontext callers are supposed to provide their own object).
> > >
> 
> The riscv vector registers are all caller-saved registers except for
> VCSR. Therefore, the struct mcontext_t needs to reserve a space for
> it. In addition, RISCV ISA is growing, so I also hope the struct
> mcontext_t has a space for future expansion. Based on the above ideas,
> I reserved a 5K space here.

VCSR is not call-saved (aka 'callee-saved' in alternate notation)
either. It's thread-local state that may be changed or left alone by
calls, and that sj/lj/ucontext functions can't touch, just like fenv.
Saving and restoring it here would be wrong.

Rich


* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-16  8:14         ` Florian Weimer via Libc-alpha
@ 2021-09-18  3:04           ` Vincent Chen
  2022-12-09  3:39             ` RISCV kernel struct sigcontext expansion for V regs and potential glibc ABI break (was Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.) Vineet Gupta
  0 siblings, 1 reply; 79+ messages in thread
From: Vincent Chen @ 2021-09-18  3:04 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Rich Felker, GNU C Library, Andrew Waterman

On Thu, Sep 16, 2021 at 4:14 PM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Vincent Chen:
>
> >> > This changes the size of struct ucontext_t, which is an ABI break
> >> > (getcontext callers are supposed to provide their own object).
> >> >
> >
> > The riscv vector registers are all caller-saved registers except for
> > VCSR. Therefore, the struct mcontext_t needs to reserve a space for
> > it. In addition, RISCV ISA is growing, so I also hope the struct
> > mcontext_t has a space for future expansion. Based on the above ideas,
> > I reserved a 5K space here.
>
> You have reserved space in ucontext_t that you could use for this.
>
Sorry, I don't quite understand what you mean. The following is the
definition of ucontext_t:
typedef struct ucontext_t
  {
    unsigned long int  __uc_flags;
    struct ucontext_t *uc_link;
    stack_t            uc_stack;
    sigset_t           uc_sigmask;
    /* There's some padding here to allow sigset_t to be expanded in the
       future.  Though this is unlikely, other architectures put uc_sigmask
       at the end of this structure and explicitly state it can be
       expanded, so we didn't want to box ourselves in here.  */
    char               __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
    /* We can't put uc_sigmask at the end of this structure because we need
       to be able to expand sigcontext in the future.  For example, the
       vector ISA extension will almost certainly add ISA state.  We want
       to ensure all user-visible ISA state can be saved and restored via a
       ucontext, so we're putting this at the end in order to allow for
       infinite extensibility.  Since we know this will be extended and we
       assume sigset_t won't be extended an extreme amount, we're
       prioritizing this.  */
    mcontext_t uc_mcontext;
  } ucontext_t;

Currently, we only reserve one space, __glibc_reserved[], for the future
expansion of sigset_t.
Do you mean I could use __glibc_reserved[] for future expansion of the
ISA as well?
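
To make the size concern raised earlier in the thread concrete, here is a
minimal sketch of how appending the reserved array changes the size of the
object that getcontext callers allocate themselves. The type names
(demo_gp_state, demo_fp_state, old_mcontext, new_mcontext) and their field
layouts are hypothetical stand-ins, not the real glibc definitions; only the
5376-byte, 16-byte-aligned array is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for __riscv_mc_gp_state and __riscv_mc_fp_state.  */
typedef unsigned long demo_gp_state[32];

union demo_fp_state
{
  struct { unsigned int f[32]; unsigned int fcsr; } f;
  struct { unsigned long long f[32]; unsigned int fcsr; } d;
  struct
  {
    unsigned long long f[64] __attribute__ ((__aligned__ (16)));
    unsigned int fcsr;
    unsigned int reserved[3];
  } q;
};

/* Layout as seen by a binary compiled against the old header.  */
struct old_mcontext
{
  demo_gp_state gregs;
  union demo_fp_state fpregs;
};

/* Layout after the patch: same prefix plus the reserved area.  */
struct new_mcontext
{
  demo_gp_state gregs;
  union demo_fp_state fpregs;
  unsigned char reserved[5376] __attribute__ ((__aligned__ (16)));
};

/* Number of extra bytes a new library would expect beyond the end of an
   object that an old caller allocated.  */
size_t
mcontext_growth (void)
{
  /* Guard the unsigned subtraction below against wrap-around.  */
  assert (sizeof (struct new_mcontext) >= sizeof (struct old_mcontext));
  return sizeof (struct new_mcontext) - sizeof (struct old_mcontext);
}
```

An old binary passes a pointer to the smaller object, so any library code
that writes into the reserved tail would run past the caller's allocation.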

> >> > This shouldn't be necessary if the additional vector registers are
> >> > caller-saved.
> >
> > Here I am a little confused about the usage of struct mcontext_t. As
> > far as I know, the struct mcontext_t is used to save the
> > machine-specific information in user context operation. Therefore, in
> > this case, the struct mcontext_t is allowed to reserve the space only
> > for saving caller-saved registers. However, in the signal handler, the
> > user seems to be allowed to use uc_mcontext whose data type is struct
> > mcontext_t to access the content of the signal context. In this case,
> > the struct mcontext_t may need to be the same as the struct sigcontext
> > defined at kernel. However, it will have a conflict with your
> > suggestion because the struct sigcontext cannot just reserve a space
> > for saving caller-saved registers. Could you help me point out my
> > misunderstanding? Thank you.
>
> struct sigcontext is allocated by the kernel, so you can have pointers
> in reserved fields to out-of-line start, or after struct sigcontext.
>
> I don't know how the kernel implements this, but there is considerable
> flexibility and extensibility.  The main issues comes from small stacks
> which are incompatible with large register files.
>

I share your concern about reserving a huge space in mcontext_t. If
the content of mcontext_t is allowed to differ from the content of
struct sigcontext, and it has been confirmed that VCSR should not be
saved or restored by the *context functions, then there seems to be no
need to reserve space in mcontext_t to support the V extension. I will
review it again. Thank you!
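
On the small-stack concern, a program can size its alternate signal stack
at run time instead of relying on a compile-time constant. The following is
a minimal sketch, assuming Linux with glibc; the helper names
min_signal_stack_size and install_alt_stack, and the 16 KiB of headroom, are
illustrative choices and not part of the patchset.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/auxv.h>

#ifndef AT_MINSIGSTKSZ
# define AT_MINSIGSTKSZ 51	/* Assumed fallback if <elf.h> lacks it.  */
#endif

/* Larger of the kernel-reported minimum (0 if the kernel does not provide
   the auxv entry) and the libc value of MINSIGSTKSZ.  */
size_t
min_signal_stack_size (void)
{
  size_t from_kernel = (size_t) getauxval (AT_MINSIGSTKSZ);
  size_t from_libc = (size_t) MINSIGSTKSZ;
  size_t size = from_kernel > from_libc ? from_kernel : from_libc;
  assert (size >= from_libc);
  return size;
}

/* Allocate and register an alternate signal stack with some headroom for
   the handler itself.  Returns 0 on success, -1 on failure.  */
int
install_alt_stack (void)
{
  stack_t ss;
  ss.ss_size = min_signal_stack_size () + 16 * 1024;
  ss.ss_sp = malloc (ss.ss_size);
  if (ss.ss_sp == NULL)
    return -1;
  ss.ss_flags = 0;
  return sigaltstack (&ss, NULL);
}
```

A handler registered with SA_ONSTACK would then run on this stack, whose
size follows whatever frame size the kernel reports for the current
hardware.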


> Thanks,
> Florian
>


* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-16 23:56         ` [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion Ben Woodard via Libc-alpha
@ 2021-09-18  3:15           ` Vincent Chen
  2021-09-20 16:41             ` DJ Delorie via Libc-alpha
  0 siblings, 1 reply; 79+ messages in thread
From: Vincent Chen @ 2021-09-18  3:15 UTC (permalink / raw)
  To: Ben Woodard; +Cc: Florian Weimer, Rich Felker, GNU C Library, Andrew Waterman

On Fri, Sep 17, 2021 at 7:56 AM Ben Woodard <woodard@redhat.com> wrote:
>
> I know this patch set mostly deals with signal handling but don’t forget LD_AUDIT. It has a similar issue for the plt_enter and exit functions.
>
> -ben

Hi Ben,
I am not familiar with the LD_AUDIT mechanism, so I do not know
whether this modification affects it. If possible, could you briefly
describe the issues for me? Thank you very much.

Regards,
Vincent
>
> > On Sep 16, 2021, at 1:03 AM, Vincent Chen <vincent.chen@sifive.com> wrote:
> >
> > On Mon, Sep 13, 2021 at 9:52 PM Rich Felker <dalias@libc.org> wrote:
> >>
> >>> On Mon, Sep 13, 2021 at 03:44:09PM +0200, Florian Weimer via Libc-alpha wrote:
> >>> * Vincent Chen:
> >>>
> >>>> Following the changes of struct sigcontext in Linux to reserve about 5K space
> >>>> to support future ISA expansion.
> >>>> ---
> >>>> sysdeps/unix/sysv/linux/riscv/sys/ucontext.h | 2 ++
> >>>> 1 file changed, 2 insertions(+)
> >>>>
> >>>> diff --git a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> >>>> index cfafa44..80caf07 100644
> >>>> --- a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> >>>> +++ b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> >>>> @@ -82,6 +82,8 @@ typedef struct mcontext_t
> >>>>   {
> >>>>     __riscv_mc_gp_state __gregs;
> >>>>     union  __riscv_mc_fp_state __fpregs;
> >>>> +    /* 5K + 256 reserved for vector state and future expansion.  */
> >>>> +    unsigned char __reserved[5376] __attribute__ ((__aligned__ (16)));
> >>>>   } mcontext_t;
> >>>
> > Hi Florian and Rich,
> > Sorry for the late reply and thank you for reminding me the
> > modification will cause ABI break.
> >
> >>> This changes the size of struct ucontext_t, which is an ABI break
> >>> (getcontext callers are supposed to provide their own object).
> >>>
> >
> > The riscv vector registers are all caller-saved registers except for
> > VCSR. Therefore, the struct mcontext_t needs to reserve a space for
> > it. In addition, RISCV ISA is growing, so I also hope the struct
> > mcontext_t has a space for future expansion. Based on the above ideas,
> > I reserved a 5K space here.
> >
> >>> This shouldn't be necessary if the additional vector registers are
> >>> caller-saved.
> >
> > Here I am a little confused about the usage of struct mcontext_t. As
> > far as I know, the struct mcontext_t is used to save the
> > machine-specific information in user context operation. Therefore, in
> > this case, the struct mcontext_t is allowed to reserve the space only
> > for saving caller-saved registers. However, in the signal handler, the
> > user seems to be allowed to use uc_mcontext whose data type is struct
> > mcontext_t to access the content of the signal context. In this case,
> > the struct mcontext_t may need to be the same as the struct sigcontext
> > defined at kernel. However, it will have a conflict with your
> > suggestion because the struct sigcontext cannot just reserve a space
> > for saving caller-saved registers. Could you help me point out my
> > misunderstanding? Thank you.
> >
> >> Indeed, that was my first thought when I saw this too. Any late
> >> additions to the register file must be call-clobbered or else they are
> >> a new ABI. And mcontext_t does not need to represent any
> >> call-clobbered state.
> >>
> >> Rich
> >
>


* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-17 17:03         ` Rich Felker
@ 2021-09-18  3:19           ` Vincent Chen
  0 siblings, 0 replies; 79+ messages in thread
From: Vincent Chen @ 2021-09-18  3:19 UTC (permalink / raw)
  To: Rich Felker; +Cc: Florian Weimer, GNU C Library, Andrew Waterman

On Sat, Sep 18, 2021 at 1:03 AM Rich Felker <dalias@libc.org> wrote:
>
> On Thu, Sep 16, 2021 at 04:02:50PM +0800, Vincent Chen wrote:
> > On Mon, Sep 13, 2021 at 9:52 PM Rich Felker <dalias@libc.org> wrote:
> > >
> > > On Mon, Sep 13, 2021 at 03:44:09PM +0200, Florian Weimer via Libc-alpha wrote:
> > > > * Vincent Chen:
> > > >
> > > > > Following the changes of struct sigcontext in Linux to reserve about 5K space
> > > > > to support future ISA expansion.
> > > > > ---
> > > > >  sysdeps/unix/sysv/linux/riscv/sys/ucontext.h | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > >
> > > > > diff --git a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> > > > > index cfafa44..80caf07 100644
> > > > > --- a/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> > > > > +++ b/sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> > > > > @@ -82,6 +82,8 @@ typedef struct mcontext_t
> > > > >    {
> > > > >      __riscv_mc_gp_state __gregs;
> > > > >      union  __riscv_mc_fp_state __fpregs;
> > > > > +    /* 5K + 256 reserved for vector state and future expansion.  */
> > > > > +    unsigned char __reserved[5376] __attribute__ ((__aligned__ (16)));
> > > > >    } mcontext_t;
> > > >
> > Hi Florian and Rich,
> > Sorry for the late reply and thank you for reminding me the
> > modification will cause ABI break.
> >
> > > > This changes the size of struct ucontext_t, which is an ABI break
> > > > (getcontext callers are supposed to provide their own object).
> > > >
> >
> > The riscv vector registers are all caller-saved registers except for
> > VCSR. Therefore, the struct mcontext_t needs to reserve a space for
> > it. In addition, RISCV ISA is growing, so I also hope the struct
> > mcontext_t has a space for future expansion. Based on the above ideas,
> > I reserved a 5K space here.
>
> VCSR is not call-saved (aka 'callee-saved' in alternate notation)
> either. It's thread-local state that may be changed or left alone by
> calls, and that sj/lj/ucontext functions can't touch, just like fenv.
> Saving and restoring it here would be wrong.
>

You are right. Joseph pointed this out in his review of my 3rd patch.
I will remove it in the next version of the patchset. Thank you.


* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-18  3:15           ` Vincent Chen
@ 2021-09-20 16:41             ` DJ Delorie via Libc-alpha
  2021-09-20 17:10               ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 79+ messages in thread
From: DJ Delorie via Libc-alpha @ 2021-09-20 16:41 UTC (permalink / raw)
  To: Vincent Chen; +Cc: libc-alpha

Vincent Chen <vincent.chen@sifive.com> writes:
> I am not familiar with the mechanism of LD_AUDIT, so I actually do not
> know if this modification may have any effect on LD_AUDIT. If
> possible, could you briefly introduce the issues for me? Thank you
> very much.

In general, when function foo() calls DSO function bar(), and bar() is
in an object that needs to be loaded from disk, the loader needs to save
foo()'s context, do a bunch of work, restore the context, and call
bar().

The LD_AUDIT feature adds a lot more "do a bunch of work" both on the
foo->bar call, and on the bar->foo return, typically calling some third
party functions to process the audit messages.

However, if the "do a bunch of work" changes registers that aren't saved
in the context, and aren't agreed on as "call clobbered" and thus
changeable, problems happen.  If foo() expects a register to be
preserved across the call to bar(), and the loader and audit functions
don't know that and clobber it, foo() breaks.

If everything is built with the same ISA, this normally isn't a problem,
but when you mix ISAs, or use optional/experimental ISAs that glibc (or
the auditing code) doesn't know about, it may be.
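
For readers who have not seen one, here is a minimal sketch of the kind of
third-party audit module that runs inside the "do a bunch of work" step. It
assumes glibc's <link.h> on a 64-bit target; the counter bound_symbols and
the accessor audit_bound_count are illustrative names. The PLT enter/exit
hooks are left out because their names and register structures are
architecture-specific.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <assert.h>
#include <link.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative counter of symbol bindings seen by this module.  */
static unsigned long bound_symbols;

/* Handshake with the dynamic loader: report the interface version this
   module was built against.  */
unsigned int
la_version (unsigned int version)
{
  (void) version;
  return LAV_CURRENT;
}

/* Ask to be notified about bindings to and from every loaded object.  */
unsigned int
la_objopen (struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
  (void) map;
  (void) lmid;
  (void) cookie;
  return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}

/* Called when a symbol is bound.  Everything done here runs between the
   caller's call instruction and the callee's first instruction.  */
uintptr_t
la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t *refcook,
	      uintptr_t *defcook, unsigned int *flags, const char *symname)
{
  (void) ndx;
  (void) refcook;
  (void) defcook;
  (void) flags;
  assert (sym != NULL && symname != NULL);
  bound_symbols++;
  fprintf (stderr, "audit: bound %s\n", symname);
  return sym->st_value;		/* Keep the original target.  */
}

unsigned long
audit_bound_count (void)
{
  return bound_symbols;
}
```

Built as a shared object and named in the LD_AUDIT environment variable,
such a module is loaded by the dynamic loader ahead of the program's own
dependencies. Any register it uses that the loader's trampoline did not
save is exposed to the clobbering problem described above.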



* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-20 16:41             ` DJ Delorie via Libc-alpha
@ 2021-09-20 17:10               ` Florian Weimer via Libc-alpha
  2021-10-01  1:43                 ` Vincent Chen
  0 siblings, 1 reply; 79+ messages in thread
From: Florian Weimer via Libc-alpha @ 2021-09-20 17:10 UTC (permalink / raw)
  To: DJ Delorie via Libc-alpha; +Cc: Vincent Chen

* DJ Delorie via Libc-alpha:

> Vincent Chen <vincent.chen@sifive.com> writes:
>> I am not familiar with the mechanism of LD_AUDIT, so I actually do not
>> know if this modification may have any effect on LD_AUDIT. If
>> possible, could you briefly introduce the issues for me? Thank you
>> very much.
>
> In general, when function foo() calls DSO function bar(), and bar() is
> in an object that needs to be loaded from disk, the loader needs to save
> foo()'s context, do a bunch of work, restore the context, and call
> bar().
>
> The LD_AUDIT feature adds a lot more "do a bunch of work" both on the
> foo->bar call, and on the bar->foo return, typically calling some third
> party functions to process the audit messages.
>
> However, if the "do a bunch of work" changes registers that aren't saved
> in the context, and aren't agreed on as "call clobbered" and thus
> changeable, problems happen.  If foo() expects a register to be
> preserved across the call to bar(), and the loader and audit functions
> don't know that and clobber it, foo() breaks.

One point of clarification:

The issue is with register usage for passing arguments and return values.
It's more or less unrelated to whether registers are callee-saved or
caller-saved.  So you need special LD_AUDIT support as soon as it's
possible to pass vector arguments and return values in registers (as
opposed to memory).

Thanks,
Florian



* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-09-20 17:10               ` Florian Weimer via Libc-alpha
@ 2021-10-01  1:43                 ` Vincent Chen
  2021-10-01 12:08                   ` Adhemerval Zanella via Libc-alpha
  0 siblings, 1 reply; 79+ messages in thread
From: Vincent Chen @ 2021-10-01  1:43 UTC (permalink / raw)
  To: Florian Weimer; +Cc: DJ Delorie via Libc-alpha

On Tue, Sep 21, 2021 at 1:10 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * DJ Delorie via Libc-alpha:
>
> > Vincent Chen <vincent.chen@sifive.com> writes:
> >> I am not familiar with the mechanism of LD_AUDIT, so I actually do not
> >> know if this modification may have any effect on LD_AUDIT. If
> >> possible, could you briefly introduce the issues for me? Thank you
> >> very much.
> >
> > In general, when function foo() calls DSO function bar(), and bar() is
> > in an object that needs to be loaded from disk, the loader needs to save
> > foo()'s context, do a bunch of work, restore the context, and call
> > bar().
> >
> > The LD_AUDIT feature adds a lot more "do a bunch of work" both on the
> > foo->bar call, and on the bar->foo return, typically calling some third
> > party functions to process the audit messages.
> >
> > However, if the "do a bunch of work" changes registers that aren't saved
> > in the context, and aren't agreed on as "call clobbered" and thus
> > changeable, problems happen.  If foo() expects a register to be
> > preserved across the call to bar(), and the loader and audit functions
> > don't know that and clobber it, foo() breaks.
>
> One point of clarification:
>
> The issue is with register usage for passing argument and return values.
> It's more or less unrelated to whether registers are callee-saved or
> caller-saved.  So you need special LD_AUDIT support as soon it's
> possible to pass vector arguments and return values in registers (as
> opposed to memory).
>
> Thanks,
> Florian
>
Many thanks to DJ Delorie and Florian for the detailed explanation
and clarification. It really helped me understand a problem I had not
noticed. I now have some findings. If my understanding is wrong,
please correct me. Thank you.

The ABI for using vector registers to pass arguments and return values
is under discussion and close to ratification. As far as I know, the
RISC-V glibc resolver will have an issue similar to this LD_AUDIT
problem once this new ABI is used. Hsiangkai Wang has sent a patch,
https://sourceware.org/pipermail/libc-alpha/2021-August/129931.html,
to deal with this issue in the glibc resolver. This patch adds a new
flag, STO_RISCV_VARIANT_CC, to indicate whether a function uses this
new ABI or not. During the relocation process, if STO_RISCV_VARIANT_CC
is set in the st_other field of the symbol being processed, the delayed
binding mechanism is disabled for it. This avoids having to save the
vector registers before entering the resolver function.

The LD_AUDIT problem is similar because we need to prevent the
auditing functions from clobbering the vector registers that hold the
arguments passed to the audited symbol. Therefore, I think this patch
can resolve the LD_AUDIT issue as well.


* Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.
  2021-10-01  1:43                 ` Vincent Chen
@ 2021-10-01 12:08                   ` Adhemerval Zanella via Libc-alpha
  0 siblings, 0 replies; 79+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2021-10-01 12:08 UTC (permalink / raw)
  To: Vincent Chen, Florian Weimer; +Cc: DJ Delorie via Libc-alpha



On 30/09/2021 22:43, Vincent Chen wrote:
> On Tue, Sep 21, 2021 at 1:10 AM Florian Weimer <fweimer@redhat.com> wrote:
>>
>> * DJ Delorie via Libc-alpha:
>>
>>> Vincent Chen <vincent.chen@sifive.com> writes:
>>>> I am not familiar with the mechanism of LD_AUDIT, so I actually do not
>>>> know if this modification may have any effect on LD_AUDIT. If
>>>> possible, could you briefly introduce the issues for me? Thank you
>>>> very much.
>>>
>>> In general, when function foo() calls DSO function bar(), and bar() is
>>> in an object that needs to be loaded from disk, the loader needs to save
>>> foo()'s context, do a bunch of work, restore the context, and call
>>> bar().
>>>
>>> The LD_AUDIT feature adds a lot more "do a bunch of work" both on the
>>> foo->bar call, and on the bar->foo return, typically calling some third
>>> party functions to process the audit messages.
>>>
>>> However, if the "do a bunch of work" changes registers that aren't saved
>>> in the context, and aren't agreed on as "call clobbered" and thus
>>> changeable, problems happen.  If foo() expects a register to be
>>> preserved across the call to bar(), and the loader and audit functions
>>> don't know that and clobber it, foo() breaks.
>>
>> One point of clarification:
>>
>> The issue is with register usage for passing argument and return values.
>> It's more or less unrelated to whether registers are callee-saved or
>> caller-saved.  So you need special LD_AUDIT support as soon it's
>> possible to pass vector arguments and return values in registers (as
>> opposed to memory).
>>
>> Thanks,
>> Florian
>>
> Thank DJ Delorie and Florian very much for the detailed explanation
> and clarification. It is really helpful for me to understand this
> problem I have not noticed. Currently, I have some findings. If my
> understanding is wrong, please correct me. Thank you.
> 
> The ABI for using vector registers to pass arguments and return value
> is under discussion and close to ratification. As far as I know, the
> riscv Glibc resolver will have a similar issue to this LD_AUDIT
> problem after this new ABI is used. Hsiangkai Wang has sent a patch,
> https://sourceware.org/pipermail/libc-alpha/2021-August/129931.html,
> to deal with this issue in GLIBC resolver. This patch adds a new tag,
> STO_RISCV_VARIANT_CC, to indicate whether the function uses this new
> ABI or not. During the relocation process, if STO_RISCV_VARIANT_CC is
> set in the st_other field of the symbol being processed, the delayed
> binding mechanism will be disabled. It can avoid saving vector
> registers before entering the resolver function.
> 
> The LD_AUDIT problem is similar to this because we need to prevent the
> auditing function from clobbering the vector registers that store the
> argument to pass to the audited symbol. Therefore, I think this patch
> can resolve the LD_AUDIT issue as well.
> 

It solves the LD_AUDIT issue by not enabling it, so it won't work with
STO_RISCV_VARIANT_CC. It is essentially the same issue we have for AArch64
SVE. 

It should be a fair approach to just not support it, although it seems
that HPC tools do use it extensively for some specific use cases.  For
the SVE case I tried to fix it by saving/restoring the SVE registers
that the ABI uses for argument passing [1], although the ABI does not
really define this (Szabolcs thinks we should save *all* registers).

In any case, LD_AUDIT is a corner case and I think we can work around
it within glibc (since we can use a different runtime trampoline to
save/restore the required state).

[1] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=c8315ccd30fcecc1b93a9bc3f073010190a86e05;hp=171fdd4bd4f337001db053721477add60d205ed8


* Re: [RFC patch 3/5] RISC-V: Save and restore VCSR when doing user context switch
  2021-09-13  1:41 ` [RFC patch 3/5] RISC-V: Save and restore VCSR when doing user context switch Vincent Chen
  2021-09-14 23:48   ` Joseph Myers
@ 2021-10-01 13:04   ` Adhemerval Zanella via Libc-alpha
  1 sibling, 0 replies; 79+ messages in thread
From: Adhemerval Zanella via Libc-alpha @ 2021-10-01 13:04 UTC (permalink / raw)
  To: libc-alpha, Vincent Chen, palmer; +Cc: andrew



On 12/09/2021 22:41, Vincent Chen wrote:
> According to the RISC-V V extension specification, all vector registers
> except VCSR are caller-saved registers. The VCSR (vxrm + vxsat) has thread
> storage duration. Therefore, only VCSR needs to be added to the user
> context operation.
> ---
>  sysdeps/riscv/Makefile                       |  5 ++++
>  sysdeps/riscv/rtld-global-offsets.sym        |  7 +++++
>  sysdeps/unix/sysv/linux/riscv/bits/hwcap.h   | 31 +++++++++++++++++++++
>  sysdeps/unix/sysv/linux/riscv/getcontext.S   | 22 ++++++++++++++-
>  sysdeps/unix/sysv/linux/riscv/setcontext.S   | 22 +++++++++++++++
>  sysdeps/unix/sysv/linux/riscv/swapcontext.S  | 41 ++++++++++++++++++++++++++++
>  sysdeps/unix/sysv/linux/riscv/sysdep.h       |  1 +
>  sysdeps/unix/sysv/linux/riscv/ucontext_i.sym |  6 ++++
>  8 files changed, 134 insertions(+), 1 deletion(-)
>  create mode 100644 sysdeps/riscv/rtld-global-offsets.sym
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/hwcap.h
> 
> diff --git a/sysdeps/riscv/Makefile b/sysdeps/riscv/Makefile
> index 20a9968..cda3ded 100644
> --- a/sysdeps/riscv/Makefile
> +++ b/sysdeps/riscv/Makefile
> @@ -2,6 +2,11 @@ ifeq ($(subdir),misc)
>  sysdep_headers += sys/asm.h
>  endif
>  
> +ifeq ($(subdir),csu)
> +# get offset to rtld_global._dl_hwcap and rtld_global._dl_hwcap2.
> +gen-as-const-headers += rtld-global-offsets.sym
> +endif
> +
>  # RISC-V's assembler also needs to know about PIC as it changes the definition
>  # of some assembler macros.
>  ASFLAGS-.os += $(pic-ccflag)
> diff --git a/sysdeps/riscv/rtld-global-offsets.sym b/sysdeps/riscv/rtld-global-offsets.sym
> new file mode 100644
> index 0000000..ff4e97f
> --- /dev/null
> +++ b/sysdeps/riscv/rtld-global-offsets.sym
> @@ -0,0 +1,7 @@
> +#define SHARED 1
> +
> +#include <ldsodefs.h>
> +
> +#define rtld_global_ro_offsetof(mem) offsetof (struct rtld_global_ro, mem)
> +
> +RTLD_GLOBAL_RO_DL_HWCAP_OFFSET	rtld_global_ro_offsetof (_dl_hwcap)
> diff --git a/sysdeps/unix/sysv/linux/riscv/bits/hwcap.h b/sysdeps/unix/sysv/linux/riscv/bits/hwcap.h
> new file mode 100644
> index 0000000..e6c5ef5
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/riscv/bits/hwcap.h
> @@ -0,0 +1,31 @@
> +/* Defines for bits in AT_HWCAP.  RISC-V Linux version.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#if !defined (_SYS_AUXV_H) && !defined (_LINUX_RISCV_SYSDEP_H)

_LINUX_RISCV_SYSDEP_H is defined by an internal header only, so I
think it should not be referenced by an exported one.

> +# error "Never include <bits/hwcap.h> directly; use <sys/auxv.h> instead."
> +#endif
> +
> +/* The following must match the kernel's <asm/hwcap.h>.  */
> +#define HWCAP_ISA_I      0x100		//(1 << ('I' - 'A'))
> +#define HWCAP_ISA_M      0x1000 	//(1 << ('M' - 'A'))
> +#define HWCAP_ISA_A      0x1		//(1 << ('A' - 'A'))
> +#define HWCAP_ISA_F      0x20		//(1 << ('F' - 'A'))
> +#define HWCAP_ISA_D      0x8		//(1 << ('D' - 'A'))
> +#define HWCAP_ISA_C      0x4		//(1 << ('C' - 'A'))
> +#define HWCAP_ISA_V      0x200000	//(1 << ('V' - 'A'))
> +
> diff --git a/sysdeps/unix/sysv/linux/riscv/getcontext.S b/sysdeps/unix/sysv/linux/riscv/getcontext.S
> index d6a9bbc..840d8fe 100644
> --- a/sysdeps/unix/sysv/linux/riscv/getcontext.S
> +++ b/sysdeps/unix/sysv/linux/riscv/getcontext.S
> @@ -16,6 +16,8 @@
>     License along with the GNU C Library.  If not, see
>     <https://www.gnu.org/licenses/>.  */
>  
> +#include <sysdep.h>
> +#include <rtld-global-offsets.h>
>  #include "ucontext-macros.h"
>  
>  /* int getcontext (ucontext_t *ucp) */
> @@ -39,6 +41,25 @@ LEAF (__getcontext)
>  	SAVE_INT_REG (s10, 26, a0)
>  	SAVE_INT_REG (s11, 27, a0)
>  
> +#ifdef __riscv_vector

I take it '__riscv_vector' would be defined by the compiler (although there
is no gcc support yet).  Why do you need to build this only if the vector
extension is in use, given that you are checking the hwcap anyway?

For __riscv_float_abi_soft it does make sense, since 'frsr' would be issued
regardless.
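
To illustrate the distinction in C rather than assembly, here is a minimal
sketch of the two checks: one fixed when the library is compiled, one made
on the running machine. The helper names isa_bit, built_with_vector and
hw_reports_vector are illustrative. The bit formula mirrors the comments
next to the HWCAP_ISA_* defines in the patch, and the result of the runtime
check is only meaningful on a RISC-V Linux kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/auxv.h>

/* AT_HWCAP bit for a single-letter ISA extension: 1 << (letter - 'A').  */
unsigned long
isa_bit (char letter)
{
  assert (letter >= 'A' && letter <= 'Z');
  return 1UL << (letter - 'A');
}

/* Compile-time view: was this code built with the vector extension
   enabled in the compiler?  */
bool
built_with_vector (void)
{
#ifdef __riscv_vector
  return true;
#else
  return false;
#endif
}

/* Run-time view: does the kernel report the V extension?  On other
   architectures the same AT_HWCAP bit has an unrelated meaning.  */
bool
hw_reports_vector (void)
{
  return (getauxval (AT_HWCAP) & isa_bit ('V')) != 0;
}
```

With only the runtime check, a library built without vector support in the
compiler could still take the vector path on hardware that reports it; with
both checks, that case is excluded at build time.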

> +# ifdef SHARED
> +	la	t1, _rtld_global_ro
> +	REG_L   t1, RTLD_GLOBAL_RO_DL_HWCAP_OFFSET(t1)
> +# else
> +	la	t1, _dl_hwcap
> +	REG_L	t1, (t1)
> +# endif
> +	li	t2, HWCAP_ISA_V
> +	and	t2, t1, t2
> +	beqz	t2, 1f
> +	addi	t2, a0,	MCONTEXT_EXTENSION
> +	li	t1, RVV_MAGIC
> +	sw	t1, (t2)
> +	csrr	t1, vcsr
> +	REG_S	t1, VCSR_OFFSET(t2)
> +1:
> +#endif
> +
>  #ifndef __riscv_float_abi_soft
>  	frsr	a1
>  
> @@ -73,5 +94,4 @@ LEAF (__getcontext)
>  99:	j	__syscall_error
>  
>  PSEUDO_END (__getcontext)
> -
>  weak_alias (__getcontext, getcontext)
> diff --git a/sysdeps/unix/sysv/linux/riscv/setcontext.S b/sysdeps/unix/sysv/linux/riscv/setcontext.S
> index 9510518..d2404fb 100644
> --- a/sysdeps/unix/sysv/linux/riscv/setcontext.S
> +++ b/sysdeps/unix/sysv/linux/riscv/setcontext.S
> @@ -16,6 +16,8 @@
>     License along with the GNU C Library.  If not, see
>     <https://www.gnu.org/licenses/>.  */
>  
> +#include <sysdep.h>
> +#include <rtld-global-offsets.h>
>  #include "ucontext-macros.h"
>  
>  /*  int __setcontext (const ucontext_t *ucp)
> @@ -64,6 +66,26 @@ LEAF (__setcontext)
>  	fssr	t1
>  #endif /* __riscv_float_abi_soft */
>  
> +#ifdef __riscv_vector
> +#ifdef SHARED
> +	la	t1, _rtld_global_ro
> +	REG_L   t1, RTLD_GLOBAL_RO_DL_HWCAP_OFFSET(t1)
> +#else
> +	la	t1, _dl_hwcap
> +	REG_L	t1, (t1)
> +#endif
> +	li	t2, HWCAP_ISA_V
> +	and	t2, t1, t2
> +	beqz	t2, 1f
> +	li      t1, RVV_MAGIC
> +	addi	t2, t0,	MCONTEXT_EXTENSION
> +	lw	a1, (t2)
> +	bne	a1, t1, 1f
> +	REG_L   t1, VCSR_OFFSET(t2)
> +	csrw	vcsr, t1
> +1:
> +#endif
> +
>  	/* Note the contents of argument registers will be random
>  	   unless makecontext() has been called.  */
>  	RESTORE_INT_REG     (t1,   0, t0)
> diff --git a/sysdeps/unix/sysv/linux/riscv/swapcontext.S b/sysdeps/unix/sysv/linux/riscv/swapcontext.S
> index df0f699..94ae8e4 100644
> --- a/sysdeps/unix/sysv/linux/riscv/swapcontext.S
> +++ b/sysdeps/unix/sysv/linux/riscv/swapcontext.S
> @@ -16,6 +16,8 @@
>     License along with the GNU C Library.  If not, see
>     <https://www.gnu.org/licenses/>.  */
>  
> +#include <sysdep.h>
> +#include <rtld-global-offsets.h>
>  #include "ucontext-macros.h"
>  
>  /* int swapcontext (ucontext_t *oucp, const ucontext_t *ucp) */
> @@ -40,6 +42,25 @@ LEAF (__swapcontext)
>  	SAVE_INT_REG (s10, 26, a0)
>  	SAVE_INT_REG (s11, 27, a0)
>  
> +#ifdef __riscv_vector
> +#ifdef SHARED
> +	la      t1, _rtld_global_ro
> +	REG_L   t1, RTLD_GLOBAL_RO_DL_HWCAP_OFFSET(t1)
> +#else
> +	la	t1, _dl_hwcap
> +	REG_L   t1, (t1)
> +#endif
> +	li	t2, HWCAP_ISA_V
> +	and	t2, t1, t2
> +	beqz	t2, 1f
> +	addi	t2, a0,	MCONTEXT_EXTENSION
> +	li	t1, RVV_MAGIC
> +	sw	t1, (t2)
> +	csrr	t1, vcsr
> +	REG_S	t1, VCSR_OFFSET(t2)
> +1:
> +#endif
> +
>  #ifndef __riscv_float_abi_soft
>  	frsr a1
>  
> @@ -89,6 +110,26 @@ LEAF (__swapcontext)
>  	fssr	t1
>  #endif /* __riscv_float_abi_soft */
>  
> +#ifdef __riscv_vector
> +#ifdef SHARED
> +	la      t1, _rtld_global_ro
> +	REG_L   t1, RTLD_GLOBAL_RO_DL_HWCAP_OFFSET(t1)
> +#else
> +	la	t1, _dl_hwcap
> +	REG_L   t1, (t1)
> +#endif
> +	li	t2, HWCAP_ISA_V
> +	and	t2, t1, t2
> +	beqz	t2, 1f
> +	li      t1, RVV_MAGIC
> +	addi	t2, t0,	MCONTEXT_EXTENSION
> +	lw	a1, (t2)
> +	bne	a1, t1, 1f
> +	REG_L   t1, VCSR_OFFSET(t2)
> +	csrw	vcsr, t1
> +1:
> +#endif
> +
>  	/* Note the contents of argument registers will be random
>  	   unless makecontext() has been called.  */
>  	RESTORE_INT_REG (t1,   0, t0)
> diff --git a/sysdeps/unix/sysv/linux/riscv/sysdep.h b/sysdeps/unix/sysv/linux/riscv/sysdep.h
> index 37ff07a..c9f8fd8 100644
> --- a/sysdeps/unix/sysv/linux/riscv/sysdep.h
> +++ b/sysdeps/unix/sysv/linux/riscv/sysdep.h
> @@ -50,6 +50,7 @@
>  
>  #ifdef __ASSEMBLER__
>  
> +# include <bits/hwcap.h>
>  # include <sys/asm.h>
>  
>  # define ENTRY(name) LEAF(name)
> diff --git a/sysdeps/unix/sysv/linux/riscv/ucontext_i.sym b/sysdeps/unix/sysv/linux/riscv/ucontext_i.sym
> index be55b26..4037473 100644
> --- a/sysdeps/unix/sysv/linux/riscv/ucontext_i.sym
> +++ b/sysdeps/unix/sysv/linux/riscv/ucontext_i.sym
> @@ -2,6 +2,7 @@
>  #include <signal.h>
>  #include <stddef.h>
>  #include <sys/ucontext.h>
> +#include <asm/sigcontext.h>
>  
>  -- Constants used by the rt_sigprocmask call.
>  
> @@ -27,5 +28,10 @@ STACK_FLAGS			stack (ss_flags)
>  
>  MCONTEXT_GREGS			mcontext (__gregs)
>  MCONTEXT_FPREGS			mcontext (__fpregs)
> +MCONTEXT_EXTENSION 		mcontext (__reserved)
>  
>  UCONTEXT_SIZE			sizeof (ucontext_t)
> +
> +VCSR_OFFSET			offsetof (struct __riscv_v_state, vcsr)
> +
> +RVV_MAGIC
> 
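
For readers following the assembly above: each of the four hunks first tests the vector bit in the hardware capability word and skips the VCSR save/restore when it is clear. A minimal C sketch of that test follows. It assumes HWCAP_ISA_V follows the usual one-bit-per-letter layout of the RISC-V hwcap word; the macro and helper names below are illustrative and are not part of the patch.

```c
#include <sys/auxv.h>

/* Assumption: single-letter ISA extensions map to bit (letter - 'A')
   of the hwcap word, so 'V' is bit 21.  Illustrative name only.  */
#define SKETCH_HWCAP_ISA_V (1UL << ('V' - 'A'))

/* Pure helper: nonzero when the vector bit is set in HWCAP.  */
static int
hwcap_has_vector (unsigned long hwcap)
{
  return (hwcap & SKETCH_HWCAP_ISA_V) != 0;
}

/* Same test against the running process.  The bit only has this
   meaning on RISC-V; other architectures assign it differently.  */
static int
process_has_vector (void)
{
  return hwcap_has_vector (getauxval (AT_HWCAP));
}
```

The assembly reads the same word from _rtld_global_ro (shared) or _dl_hwcap (static) instead of calling getauxval.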

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC patch 0/5] RISC-V: Add vector ISA support
  2021-09-13  1:41 [RFC patch 0/5] RISC-V: Add vector ISA support Vincent Chen
                   ` (5 preceding siblings ...)
  2021-09-13 19:11 ` [RFC patch 0/5] RISC-V: Add vector ISA support Vineet Gupta via Libc-alpha
@ 2021-11-09 19:21 ` Darius Rad
  2021-11-09 19:30   ` Andrew Waterman
  6 siblings, 1 reply; 79+ messages in thread
From: Darius Rad @ 2021-11-09 19:21 UTC (permalink / raw)
  To: Vincent Chen; +Cc: libc-alpha, andrew

On Mon, Sep 13, 2021 at 09:41:13AM +0800, Vincent Chen wrote:
> This patchset adds required ports to support RISC-V Vector (RVV) extension.
> 
> Since the length of the vector register in RVV (the theoretical maximum
> is 2^XLEN-1 bits) is variable, a huge and flexible space is needed to back
> up all vector registers in the signal context. This patchset expands the
> default SIGSTKSZ, MINSIGSTKSZ, and PTHREAD_STACK_MIN to ensure the stack
> size is enough for the normal case (VLENB <= 128 bytes). Linux kernel also
> places the exact minimum signal stack size in AT_MINSIGSTKSZ entry of the
> auxiliary vector to inform user, so user still can know the sutible minimum
> signal stack size by sysconf (_SC_MINSIGSTKSZ) if the VLENB is greater
> than 128 bytes. 
> 
> In addition, according to the specification, the VCSR that combines VXRM and
> VXSAT has thread storage duration, so this patchset adds the required user
> context operation for it.
> 
> Finally, the RISC-V glibc customized sigcontext.h has been removed in this
> patchset. to reduce the synchronization work when new extension support is
> introduced to the Linux environment. However, it may bring some backward
> incompatible issues. Therefore, I sent an RFC patch
> (https://sourceware.org/pipermail/libc-alpha/2020-June/115549.html)
> to discuss this modification before this patchset. As I mentioned in the
> RFC patch thread, I used OpenEmbeded to evaluate the impact. During the
> tests, I didn't get any compiler errors. Therefore, I infer that this
> modification may not cause server backward incompatible issues at this
> moment.
> 
> 1. The RISC-V V-extension draft v1.0 can be found in
> https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc
> 2. The associated kernel implementation can be found in
> http://lists.infradead.org/pipermail/linux-riscv/2021-September/008249.html
> 3. QEMU with RISC-V V-extension support can be found in
> https://github.com/sifive/qemu/tree/rvv-1.0
> 

For the record on libc-alpha, I object to these changes.  In particular, I
object to the lack of a user space API for the corresponding Linux support.
More discussion on linux-riscv:

https://lists.infradead.org/pipermail/linux-riscv/2021-September/thread.html#8361

> Vincent Chen (5):
>   RISC-V: Remove riscv-specific sigcontext.h
>   RISC-V: Reserve about 5K space in mcontext_t to support future ISA
>     expansion.
>   RISC-V: Save and restore VCSR when doing user context switch
>   RISC-V: Extend MINSIGSTKSZ and SIGSTKSZ to backup RVV registers
>   RISC-V: Expand PTHREAD_STACK_MIN to support RVV environment
> 
>  sysdeps/riscv/Makefile                             |  5 +++
>  sysdeps/riscv/rtld-global-offsets.sym              |  7 ++++
>  sysdeps/unix/sysv/linux/riscv/bits/hwcap.h         | 31 ++++++++++++++++
>  .../unix/sysv/linux/riscv/bits/pthread_stack_min.h | 21 +++++++++++
>  sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h    | 31 ----------------
>  sysdeps/unix/sysv/linux/riscv/bits/sigstack.h      | 32 +++++++++++++++++
>  sysdeps/unix/sysv/linux/riscv/getcontext.S         | 22 +++++++++++-
>  sysdeps/unix/sysv/linux/riscv/setcontext.S         | 22 ++++++++++++
>  sysdeps/unix/sysv/linux/riscv/swapcontext.S        | 41 ++++++++++++++++++++++
>  sysdeps/unix/sysv/linux/riscv/sys/ucontext.h       |  2 ++
>  .../sysv/linux/riscv/sysconf-pthread_stack_min.h   | 39 ++++++++++++++++++++
>  sysdeps/unix/sysv/linux/riscv/sysdep.h             |  1 +
>  sysdeps/unix/sysv/linux/riscv/ucontext_i.sym       |  6 ++++
>  13 files changed, 228 insertions(+), 32 deletions(-)
>  create mode 100644 sysdeps/riscv/rtld-global-offsets.sym
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/hwcap.h
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/pthread_stack_min.h
>  delete mode 100644 sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/bits/sigstack.h
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/sysconf-pthread_stack_min.h
> 
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC patch 0/5] RISC-V: Add vector ISA support
  2021-11-09 19:21 ` Darius Rad
@ 2021-11-09 19:30   ` Andrew Waterman
  2021-11-09 22:03     ` Darius Rad
  0 siblings, 1 reply; 79+ messages in thread
From: Andrew Waterman @ 2021-11-09 19:30 UTC (permalink / raw)
  To: Vincent Chen, libc-alpha, Palmer Dabbelt, DJ Delorie,
	Andrew Waterman

On Tue, Nov 9, 2021 at 11:21 AM Darius Rad <darius@bluespec.com> wrote:
>
> On Mon, Sep 13, 2021 at 09:41:13AM +0800, Vincent Chen wrote:
> > This patchset adds required ports to support RISC-V Vector (RVV) extension.
> >
> > Since the length of the vector register in RVV (the theoretical maximum
> > is 2^XLEN-1 bits) is variable, a huge and flexible space is needed to back
> > up all vector registers in the signal context. This patchset expands the
> > default SIGSTKSZ, MINSIGSTKSZ, and PTHREAD_STACK_MIN to ensure the stack
> > size is enough for the normal case (VLENB <= 128 bytes). Linux kernel also
> > places the exact minimum signal stack size in AT_MINSIGSTKSZ entry of the
> > auxiliary vector to inform user, so user still can know the sutible minimum
> > signal stack size by sysconf (_SC_MINSIGSTKSZ) if the VLENB is greater
> > than 128 bytes.
> >
> > In addition, according to the specification, the VCSR that combines VXRM and
> > VXSAT has thread storage duration, so this patchset adds the required user
> > context operation for it.
> >
> > Finally, the RISC-V glibc customized sigcontext.h has been removed in this
> > patchset. to reduce the synchronization work when new extension support is
> > introduced to the Linux environment. However, it may bring some backward
> > incompatible issues. Therefore, I sent an RFC patch
> > (https://sourceware.org/pipermail/libc-alpha/2020-June/115549.html)
> > to discuss this modification before this patchset. As I mentioned in the
> > RFC patch thread, I used OpenEmbeded to evaluate the impact. During the
> > tests, I didn't get any compiler errors. Therefore, I infer that this
> > modification may not cause server backward incompatible issues at this
> > moment.
> >
> > 1. The RISC-V V-extension draft v1.0 can be found in
> > https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc
> > 2. The associated kernel implementation can be found in
> > http://lists.infradead.org/pipermail/linux-riscv/2021-September/008249.html
> > 3. QEMU with RISC-V V-extension support can be found in
> > https://github.com/sifive/qemu/tree/rvv-1.0
> >
>
> For the record on libc-alpha, I object to these changes.  In particular,
> the lack of a user space API for the corresponding Linux support.  More
> discussion on linux-riscv:
>
> https://lists.infradead.org/pipermail/linux-riscv/2021-September/thread.html#8361

I do not agree with that analysis.  The vector extension scales down
to having potentially very little state (512 bytes on RV64) and we
expect typical applications-processor implementations to land in the
512 - 2048-byte range.  This matches AVX, not AMX.  Furthermore, we
want all implementations to take advantage of vectorized C
string/memory functions without having to explicitly opt in.  Not
doing this would put RISC-V at a significant competitive disadvantage
vs. other architectures with SIMD units.


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC patch 0/5] RISC-V: Add vector ISA support
  2021-11-09 19:30   ` Andrew Waterman
@ 2021-11-09 22:03     ` Darius Rad
  2021-11-09 22:18       ` Andrew Waterman
  0 siblings, 1 reply; 79+ messages in thread
From: Darius Rad @ 2021-11-09 22:03 UTC (permalink / raw)
  To: Andrew Waterman; +Cc: Vincent Chen, libc-alpha

On Tue, Nov 09, 2021 at 11:30:49AM -0800, Andrew Waterman wrote:
> On Tue, Nov 9, 2021 at 11:21 AM Darius Rad <darius@bluespec.com> wrote:
> >
> > On Mon, Sep 13, 2021 at 09:41:13AM +0800, Vincent Chen wrote:
> > > This patchset adds required ports to support RISC-V Vector (RVV) extension.
> > >
> > > Since the length of the vector register in RVV (the theoretical maximum
> > > is 2^XLEN-1 bits) is variable, a huge and flexible space is needed to back
> > > up all vector registers in the signal context. This patchset expands the
> > > default SIGSTKSZ, MINSIGSTKSZ, and PTHREAD_STACK_MIN to ensure the stack
> > > size is enough for the normal case (VLENB <= 128 bytes). Linux kernel also
> > > places the exact minimum signal stack size in AT_MINSIGSTKSZ entry of the
> > > auxiliary vector to inform user, so user still can know the sutible minimum
> > > signal stack size by sysconf (_SC_MINSIGSTKSZ) if the VLENB is greater
> > > than 128 bytes.
> > >
> > > In addition, according to the specification, the VCSR that combines VXRM and
> > > VXSAT has thread storage duration, so this patchset adds the required user
> > > context operation for it.
> > >
> > > Finally, the RISC-V glibc customized sigcontext.h has been removed in this
> > > patchset. to reduce the synchronization work when new extension support is
> > > introduced to the Linux environment. However, it may bring some backward
> > > incompatible issues. Therefore, I sent an RFC patch
> > > (https://sourceware.org/pipermail/libc-alpha/2020-June/115549.html)
> > > to discuss this modification before this patchset. As I mentioned in the
> > > RFC patch thread, I used OpenEmbeded to evaluate the impact. During the
> > > tests, I didn't get any compiler errors. Therefore, I infer that this
> > > modification may not cause server backward incompatible issues at this
> > > moment.
> > >
> > > 1. The RISC-V V-extension draft v1.0 can be found in
> > > https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc
> > > 2. The associated kernel implementation can be found in
> > > http://lists.infradead.org/pipermail/linux-riscv/2021-September/008249.html
> > > 3. QEMU with RISC-V V-extension support can be found in
> > > https://github.com/sifive/qemu/tree/rvv-1.0
> > >
> >
> > For the record on libc-alpha, I object to these changes.  In particular,
> > the lack of a user space API for the corresponding Linux support.  More
> > discussion on linux-riscv:
> >
> > https://lists.infradead.org/pipermail/linux-riscv/2021-September/thread.html#8361
> 
> I do not agree with that analysis.  The vector extension scales down
> to having potentially very little state (512 bytes on RV64) and we
> expect typical applications-processor implementations to land in the
> 512 - 2048-byte range.  This matches AVX, not AMX.  Furthermore, we
> want all implementations to take advantage of vectorized C
> string/memory functions without having to explicitly opt in.  Not
> doing this would put RISC-V at a significant competitive disadvantage
> vs. other architectures with SIMD units.
> 

The vector extension also scales up to 256 KiB, which, for comparison's sake,
is considerably more than AMX.
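
The figures quoted in this exchange (512 bytes, 2048 bytes, 256 KiB) follow from 32 vector registers times the per-register length. A small sketch of the arithmetic, assuming VLEN is given in bits, with 128 bits as the small end and 65536 bits as the architectural maximum; the names are illustrative only:

```c
#include <stddef.h>

/* The V extension defines 32 vector registers.  */
enum { SKETCH_RVV_NUM_VREGS = 32 };

/* Bytes of vector register state for a given VLEN in bits
   (VLENB = VLEN / 8 bytes per register).  */
static size_t
rvv_regfile_bytes (size_t vlen_bits)
{
  return SKETCH_RVV_NUM_VREGS * (vlen_bits / 8);
}
```

With VLEN = 128 this gives 512 bytes, with VLEN = 512 it gives 2048 bytes, and with VLEN = 65536 it gives 262144 bytes, i.e. 256 KiB.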

There are those that believe AVX should have had some sort of user space
control [1], as well.

[1] https://lore.kernel.org/lkml/87k0ntazyn.ffs@nanos.tec.linutronix.de/

I don't see how having user space control either prevents glibc from using
vector by default when it is available or how it puts RISC-V at a
significant competitive disadvantage.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC patch 0/5] RISC-V: Add vector ISA support
  2021-11-09 22:03     ` Darius Rad
@ 2021-11-09 22:18       ` Andrew Waterman
  2021-11-10 11:39         ` Darius Rad
  0 siblings, 1 reply; 79+ messages in thread
From: Andrew Waterman @ 2021-11-09 22:18 UTC (permalink / raw)
  To: Andrew Waterman, Vincent Chen, libc-alpha, Palmer Dabbelt,
	DJ Delorie

On Tue, Nov 9, 2021 at 2:04 PM Darius Rad <darius@bluespec.com> wrote:
>
> On Tue, Nov 09, 2021 at 11:30:49AM -0800, Andrew Waterman wrote:
> > On Tue, Nov 9, 2021 at 11:21 AM Darius Rad <darius@bluespec.com> wrote:
> > >
> > > On Mon, Sep 13, 2021 at 09:41:13AM +0800, Vincent Chen wrote:
> > > > This patchset adds required ports to support RISC-V Vector (RVV) extension.
> > > >
> > > > Since the length of the vector register in RVV (the theoretical maximum
> > > > is 2^XLEN-1 bits) is variable, a huge and flexible space is needed to back
> > > > up all vector registers in the signal context. This patchset expands the
> > > > default SIGSTKSZ, MINSIGSTKSZ, and PTHREAD_STACK_MIN to ensure the stack
> > > > size is enough for the normal case (VLENB <= 128 bytes). Linux kernel also
> > > > places the exact minimum signal stack size in AT_MINSIGSTKSZ entry of the
> > > > auxiliary vector to inform user, so user still can know the sutible minimum
> > > > signal stack size by sysconf (_SC_MINSIGSTKSZ) if the VLENB is greater
> > > > than 128 bytes.
> > > >
> > > > In addition, according to the specification, the VCSR that combines VXRM and
> > > > VXSAT has thread storage duration, so this patchset adds the required user
> > > > context operation for it.
> > > >
> > > > Finally, the RISC-V glibc customized sigcontext.h has been removed in this
> > > > patchset. to reduce the synchronization work when new extension support is
> > > > introduced to the Linux environment. However, it may bring some backward
> > > > incompatible issues. Therefore, I sent an RFC patch
> > > > (https://sourceware.org/pipermail/libc-alpha/2020-June/115549.html)
> > > > to discuss this modification before this patchset. As I mentioned in the
> > > > RFC patch thread, I used OpenEmbeded to evaluate the impact. During the
> > > > tests, I didn't get any compiler errors. Therefore, I infer that this
> > > > modification may not cause server backward incompatible issues at this
> > > > moment.
> > > >
> > > > 1. The RISC-V V-extension draft v1.0 can be found in
> > > > https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc
> > > > 2. The associated kernel implementation can be found in
> > > > http://lists.infradead.org/pipermail/linux-riscv/2021-September/008249.html
> > > > 3. QEMU with RISC-V V-extension support can be found in
> > > > https://github.com/sifive/qemu/tree/rvv-1.0
> > > >
> > >
> > > For the record on libc-alpha, I object to these changes.  In particular,
> > > the lack of a user space API for the corresponding Linux support.  More
> > > discussion on linux-riscv:
> > >
> > > https://lists.infradead.org/pipermail/linux-riscv/2021-September/thread.html#8361
> >
> > I do not agree with that analysis.  The vector extension scales down
> > to having potentially very little state (512 bytes on RV64) and we
> > expect typical applications-processor implementations to land in the
> > 512 - 2048-byte range.  This matches AVX, not AMX.  Furthermore, we
> > want all implementations to take advantage of vectorized C
> > string/memory functions without having to explicitly opt in.  Not
> > doing this would put RISC-V at a significant competitive disadvantage
> > vs. other architectures with SIMD units.
> >
>
> The vector extension also scales up to 256 kiB, which, for comparison sake,
> is considerably more than AMX.

We have good reason to believe that apps/server processors will not
get anywhere near an order of magnitude of that limit, and that huge
vector regfiles will be the province of HPC.

>
> There are those that believe AVX should have had some sort of user space
> control [1], as well.
>
> [1] https://lore.kernel.org/lkml/87k0ntazyn.ffs@nanos.tec.linutronix.de/
>
> I don't see how having user space control either prevents glibc from using
> vector by default when it is available or how it puts RISC-V at a
> significant competitive disadvantage.

My "competitive disadvantage" comment was about the need to do the
following (quoting from your other message):

"A process (or thread) must specifically request the desire to use
vector extensions (perhaps with some new arch_prctl() API)"

I had assumed you meant that programmers must do this
explicitly--which would clearly put RISC-V at a competitive
disadvantage.  If glibc initialization code makes this API call, then
I withdraw my comment. (But then I question the value of the API call
vs. the kernel automatically enabling the vector unit, since
essentially all processes will invoke the API call anyway.)

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [RFC patch 0/5] RISC-V: Add vector ISA support
  2021-11-09 22:18       ` Andrew Waterman
@ 2021-11-10 11:39         ` Darius Rad
  0 siblings, 0 replies; 79+ messages in thread
From: Darius Rad @ 2021-11-10 11:39 UTC (permalink / raw)
  To: Andrew Waterman; +Cc: Vincent Chen, libc-alpha

On Tue, Nov 09, 2021 at 02:18:17PM -0800, Andrew Waterman wrote:
> On Tue, Nov 9, 2021 at 2:04 PM Darius Rad <darius@bluespec.com> wrote:
> >
> > On Tue, Nov 09, 2021 at 11:30:49AM -0800, Andrew Waterman wrote:
> > > On Tue, Nov 9, 2021 at 11:21 AM Darius Rad <darius@bluespec.com> wrote:
> > > >
> > > > On Mon, Sep 13, 2021 at 09:41:13AM +0800, Vincent Chen wrote:
> > > > > This patchset adds required ports to support RISC-V Vector (RVV) extension.
> > > > >
> > > > > Since the length of the vector register in RVV (the theoretical maximum
> > > > > is 2^XLEN-1 bits) is variable, a huge and flexible space is needed to back
> > > > > up all vector registers in the signal context. This patchset expands the
> > > > > default SIGSTKSZ, MINSIGSTKSZ, and PTHREAD_STACK_MIN to ensure the stack
> > > > > size is enough for the normal case (VLENB <= 128 bytes). Linux kernel also
> > > > > places the exact minimum signal stack size in AT_MINSIGSTKSZ entry of the
> > > > > auxiliary vector to inform user, so user still can know the sutible minimum
> > > > > signal stack size by sysconf (_SC_MINSIGSTKSZ) if the VLENB is greater
> > > > > than 128 bytes.
> > > > >
> > > > > In addition, according to the specification, the VCSR that combines VXRM and
> > > > > VXSAT has thread storage duration, so this patchset adds the required user
> > > > > context operation for it.
> > > > >
> > > > > Finally, the RISC-V glibc customized sigcontext.h has been removed in this
> > > > > patchset. to reduce the synchronization work when new extension support is
> > > > > introduced to the Linux environment. However, it may bring some backward
> > > > > incompatible issues. Therefore, I sent an RFC patch
> > > > > (https://sourceware.org/pipermail/libc-alpha/2020-June/115549.html)
> > > > > to discuss this modification before this patchset. As I mentioned in the
> > > > > RFC patch thread, I used OpenEmbeded to evaluate the impact. During the
> > > > > tests, I didn't get any compiler errors. Therefore, I infer that this
> > > > > modification may not cause server backward incompatible issues at this
> > > > > moment.
> > > > >
> > > > > 1. The RISC-V V-extension draft v1.0 can be found in
> > > > > https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc
> > > > > 2. The associated kernel implementation can be found in
> > > > > http://lists.infradead.org/pipermail/linux-riscv/2021-September/008249.html
> > > > > 3. QEMU with RISC-V V-extension support can be found in
> > > > > https://github.com/sifive/qemu/tree/rvv-1.0
> > > > >
> > > >
> > > > For the record on libc-alpha, I object to these changes.  In particular,
> > > > the lack of a user space API for the corresponding Linux support.  More
> > > > discussion on linux-riscv:
> > > >
> > > > https://lists.infradead.org/pipermail/linux-riscv/2021-September/thread.html#8361
> > >
> > > I do not agree with that analysis.  The vector extension scales down
> > > to having potentially very little state (512 bytes on RV64) and we
> > > expect typical applications-processor implementations to land in the
> > > 512 - 2048-byte range.  This matches AVX, not AMX.  Furthermore, we
> > > want all implementations to take advantage of vectorized C
> > > string/memory functions without having to explicitly opt in.  Not
> > > doing this would put RISC-V at a significant competitive disadvantage
> > > vs. other architectures with SIMD units.
> > >
> >
> > The vector extension also scales up to 256 kiB, which, for comparison sake,
> > is considerably more than AMX.
> 
> We have good reason to believe that apps/server processors will not
> get anywhere near an order of magnitude of that limit, and that huge
> vector regfiles will be the province of HPC.
> 
> >
> > There are those that believe AVX should have had some sort of user space
> > control [1], as well.
> >
> > [1] https://lore.kernel.org/lkml/87k0ntazyn.ffs@nanos.tec.linutronix.de/
> >
> > I don't see how having user space control either prevents glibc from using
> > vector by default when it is available or how it puts RISC-V at a
> > significant competitive disadvantage.
> 
> My "competitive disadvantage" comment was about the need to do the
> following (quoting from your other message):
> 
> "A process (or thread) must specifically request the desire to use
> vector extensions (perhaps with some new arch_prctl() API)"
> 
> I had assumed you meant that programmers must do this
> explicitly--which would clearly put RISC-V at a competitive
> disadvantage.  If glibc initialization code makes this API call, then
> I withdraw my comment. (But then I question the value of the API call
> vs. the kernel automatically enabling the vector unit, since
> essentially all processes will invoke the API call anyway.)

The benefit to having the API call is that it allows the kernel to report a
failure.  This is necessary now, because memory allocation for context
state can fail, and in some circumstances there is no other way to report
that error.  It is also useful for the future, as a way for the kernel to
provide a policy enforcement mechanism whereby the system administrator can
control which tasks are permitted to use vector extensions.
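
To make the point about failure reporting concrete, here is a purely hypothetical sketch. No opt-in interface is defined anywhere in this thread, so the request function is passed in as a pointer and the fake_* functions merely stand in for whatever the kernel might return; every name below is an assumption made for illustration.

```c
#include <errno.h>

/* Hypothetical stand-in for an opt-in call: returns 0 on success,
   -1 with errno set on failure.  */
typedef int (*vector_request_fn) (void);

enum sketch_impl { SKETCH_IMPL_SCALAR, SKETCH_IMPL_VECTOR };

/* Ask for vector state once at startup.  On failure, record why and
   fall back to scalar code instead of faulting later.  */
static enum sketch_impl
select_impl (vector_request_fn request, int *reason)
{
  errno = 0;
  if (request () == 0)
    {
      *reason = 0;
      return SKETCH_IMPL_VECTOR;
    }
  *reason = errno;
  return SKETCH_IMPL_SCALAR;
}

/* Fakes modelling the outcomes discussed above: success, allocation
   failure for context state, and denial by administrator policy.  */
static int fake_request_ok (void) { return 0; }
static int fake_request_enomem (void) { errno = ENOMEM; return -1; }
static int fake_request_eperm (void) { errno = EPERM; return -1; }
```

The design point is only that an explicit call gives the kernel a place to return an error, whereas enabling the vector unit implicitly leaves no return path for one.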

^ permalink raw reply	[flat|nested] 79+ messages in thread

* RISCV kernel struct sigcontext expansion for V regs and potential glibc ABI break (was Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.)
  2021-09-18  3:04           ` Vincent Chen
@ 2022-12-09  3:39             ` Vineet Gupta
  2022-12-09  4:03               ` Vineet Gupta
  2022-12-20 20:05               ` Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break Vineet Gupta
  0 siblings, 2 replies; 79+ messages in thread
From: Vineet Gupta @ 2022-12-09  3:39 UTC (permalink / raw)
  To: Vincent Chen, Florian Weimer
  Cc: Rich Felker, GNU C Library, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel

Hi Florian,

P.S. Since I'm revisiting a year-old thread with some new CC recipients, 
here's the link to the original patch/thread [1].

On 9/17/21 20:04, Vincent Chen wrote:
> On Thu, Sep 16, 2021 at 4:14 PM Florian Weimer <fweimer@redhat.com> wrote:
>>>>> This changes the size of struct ucontext_t, which is an ABI break
>>>>> (getcontext callers are supposed to provide their own object).
>>>>>
>>> The riscv vector registers are all caller-saved registers except for
>>> VCSR. Therefore, the struct mcontext_t needs to reserve a space for
>>> it. In addition, RISCV ISA is growing, so I also hope the struct
>>> mcontext_t has a space for future expansion. Based on the above ideas,
>>> I reserved a 5K space here.
>> You have reserved space in ucontext_t that you could use for this.
>>
> Sorry, I cannot really understand what you mean. The following is the
> contents of ucontext_t
> typedef struct ucontext_t
>    {
>      unsigned long int  __uc_flags;
>      struct ucontext_t *uc_link;
>      stack_t            uc_stack;
>      sigset_t           uc_sigmask;
>      /* There's some padding here to allow sigset_t to be expanded in the
>         future.  Though this is unlikely, other architectures put uc_sigmask
>         at the end of this structure and explicitly state it can be
>         expanded, so we didn't want to box ourselves in here.  */
>      char               __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
>      /* We can't put uc_sigmask at the end of this structure because we need
>         to be able to expand sigcontext in the future.  For example, the
>         vector ISA extension will almost certainly add ISA state.  We want
>         to ensure all user-visible ISA state can be saved and restored via a
>         ucontext, so we're putting this at the end in order to allow for
>         infinite extensibility.  Since we know this will be extended and we
>         assume sigset_t won't be extended an extreme amount, we're
>         prioritizing this.  */
>      mcontext_t uc_mcontext;
>    } ucontext_t;
>
> Currently, we only reserve a space, __glibc_reserved[], for the future
> expansion of sigset_t.
> Do you mean I could use __glibc_reserved[] to for future expansion of
> ISA as well?

Given that sigset expansion is unlikely, we could in theory use some of those 
reserved fields to store pointers (offsets) to the actual V state, but not 
the V state itself, which is far too large on non-embedded machines with 
typical 128 or even wider V regs.


>
>>>>> This shouldn't be necessary if the additional vector registers are
>>>>> caller-saved.
>>> Here I am a little confused about the usage of struct mcontext_t. As
>>> far as I know, the struct mcontext_t is used to save the
>>> machine-specific information in user context operation. Therefore, in
>>> this case, the struct mcontext_t is allowed to reserve the space only
>>> for saving caller-saved registers. However, in the signal handler, the
>>> user seems to be allowed to use uc_mcontext whose data type is struct
>>> mcontext_t to access the content of the signal context. In this case,
>>> the struct mcontext_t may need to be the same as the struct sigcontext
>>> defined at kernel. However, it will have a conflict with your
>>> suggestion because the struct sigcontext cannot just reserve a space
>>> for saving caller-saved registers. Could you help me point out my
>>> misunderstanding? Thank you.

I think the confusion comes from apparent equivalence of kernel struct 
sigcontext and glibc mcontext_t as they appear in respective struct 
ucontext definitions.
I've enumerated the actual RV structs below to keep them handy in one 
place for discussion.

>> struct sigcontext is allocated by the kernel, so you can have pointers
>> in reserved fields to out-of-line start, or after struct sigcontext.

In this scheme, would the actual V regfile contents (at the out-of-line 
location w.r.t. the kernel sigcontext) be anonymous to glibc, i.e. would we 
not need to expose them in the glibc userspace ABI?


>> I don't know how the kernel implements this, but there is considerable
>> flexibility and extensibility.  The main issues comes from small stacks
>> which are incompatible with large register files.

Simplistically, the Linux kernel needs to preserve the V regfile across task 
switch. The necessary evil that follows is preserving V across 
signal handling (sigaction/sigreturn).

In the RV kernel we have the following:

struct rt_sigframe {
   struct siginfo info;
   struct ucontext uc;
};

struct ucontext {
    unsigned long uc_flags;
    struct ucontext *uc_link;
    stack_t uc_stack;
    sigset_t uc_sigmask;
    __u8 __unused[1024 / 8 - sizeof(sigset_t)];     // this is for sigset_t expansion
    struct sigcontext uc_mcontext;
};

struct sigcontext {
    struct user_regs_struct sc_regs;
    union __riscv_fp_state sc_fpregs;
+  __u8 sc_extn[4096+128] __attribute__((__aligned__(16)));   // handle 128B V regs
};

The sc_extn[] would have V state (regfile + control state) in kernel 
defined format.
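
The 4096+128 figure follows from simple arithmetic: RVV has 32 vector registers, so at VLENB = 128 bytes the register file alone takes 32 * 128 = 4096 bytes. A small sketch of that calculation follows. Reading the extra 128 bytes as room for control state is an interpretation of the "+128" in the struct above, and the function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

#define RVV_NUM_VREGS 32u   /* v0..v31, fixed by the V extension */

/* Bytes needed for the raw V register file at a given VLENB
 * (VLENB = bytes per vector register). */
size_t v_regfile_bytes(size_t vlenb)
{
    assert(vlenb > 0);
    return RVV_NUM_VREGS * vlenb;
}

/* Assumption: 128 bytes of slack for control state, mirroring "+128". */
size_t sc_extn_bytes(size_t vlenb)
{
    return v_regfile_bytes(vlenb) + 128;
}
```

With VLENB = 256 the register file alone already needs 8192 bytes, which is why a fixed-size array cannot cover every implementation.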

As I understand it, you are suggesting that, to prevent an ABI break, we
should not add anything to kernel struct sigcontext, i.e. do something
like this:

struct rt_sigframe {
   struct siginfo info;
   struct ucontext uc;
+__u8 sc_extn[4096+128] __attribute__((__aligned__(16)));
};

So kernel signal handling can continue to save/restore the V regfile on
the user stack, without making it part of the actual struct sigcontext.
The V regs would then not be explicitly visible to userspace at all. Is
that feasible? I know that SA_SIGINFO handlers can access the scalar/fp
regs; they would not be able to do the same for V.
Is there a POSIX requirement for SA_SIGINFO handlers to be able to access
all machine regs saved by signal handling?
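
For reference, this is roughly how an SA_SIGINFO handler reaches the saved scalar state today. A minimal sketch: the sigaction/raise calls are standard, the __gregs[REG_PC] access is how I understand glibc's RISC-V mcontext_t to be laid out, and it is compiled only on RISC-V. All function and variable names are illustrative.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <ucontext.h>

static volatile sig_atomic_t handler_saw_context;
static volatile unsigned long interrupted_pc;

static void on_usr1(int sig, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = (ucontext_t *)ctx;

    (void)sig;
    (void)info;
    handler_saw_context = (uc != NULL);
#if defined(__riscv)
    /* Scalar registers are exposed through uc_mcontext on RISC-V glibc. */
    interrupted_pc = uc->uc_mcontext.__gregs[REG_PC];
#else
    interrupted_pc = 0;   /* register layout is arch-specific; not read here */
#endif
}

/* Install the handler, deliver SIGUSR1 to ourselves, and report whether
 * the handler received a context pointer (1) or not (0); -1 on error. */
int probe_signal_context(void)
{
    struct sigaction sa;
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_usr1;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    handler_saw_context = 0;
    rc = raise(SIGUSR1);   /* handler runs before raise() returns */
    assert(rc == 0);
    return handler_saw_context;
}
```

Nothing comparable exists for V state under the rt_sigframe scheme, since the handler has no typed field pointing at it.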

An alternate approach is what Vincent did originally: add sc_extn to
struct sigcontext. Here, to prevent ABI breakage, we can choose not to
reflect this in the glibc sigcontext. But the question remains: is that OK?

The other topic is changing glibc mcontext_t to add V-regs. It would
seem one has to, as mcontext_t is "visually equivalent" to struct
sigcontext in the respective ucontext structs. But the userspace
*context routine semantics only require callee-saved regs to be saved,
which V regs are not per the psABI [2]. So it looks like this can be
avoided, which is what Vincent did in the v2 series [3].
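
Why the size of ucontext_t is ABI for the *context routines can be seen from how they are called: the caller owns the object, so sizeof(ucontext_t) is fixed when the application is compiled. A minimal sketch using the standard getcontext/makecontext/swapcontext calls; the function and variable names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

/* Caller-owned objects: their size is baked into this binary. */
static ucontext_t main_ctx, co_ctx;
static char co_stack[64 * 1024];
static volatile int visited;

static void co_entry(void)
{
    visited = 1;   /* returning from here resumes co_ctx.uc_link */
}

/* Switch to a second context once and come back. Returns 1 if the
 * second context actually ran. */
int run_coroutine_once(void)
{
    int rc;

    visited = 0;
    rc = getcontext(&co_ctx);          /* libc writes into our object */
    assert(rc == 0);
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &main_ctx;
    makecontext(&co_ctx, co_entry, 0);

    rc = swapcontext(&main_ctx, &co_ctx);
    assert(rc == 0);
    return visited;
}
```

Because the switch happens at a function-call boundary, only callee-saved state has to survive it, which is the argument for leaving V regs out of mcontext_t.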


[1] https://sourceware.org/pipermail/libc-alpha/2021-September/130899.html
[2] 
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
[3] https://sourceware.org/pipermail/libc-alpha/2022-January/135416.html

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: RISCV kernel struct sigcontext expansion for V regs and potential glibc ABI break (was Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.)
  2022-12-09  3:39             ` RISCV kernel struct sigcontext expansion for V regs and potential glibc ABI break (was Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.) Vineet Gupta
@ 2022-12-09  4:03               ` Vineet Gupta
  2022-12-20 20:05               ` Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break Vineet Gupta
  1 sibling, 0 replies; 79+ messages in thread
From: Vineet Gupta @ 2022-12-09  4:03 UTC (permalink / raw)
  To: Vincent Chen, Florian Weimer
  Cc: Rich Felker, GNU C Library, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Greentime Hu, Andy Chiu

+CC Greentime and Andy

On 12/8/22 19:39, Vineet Gupta wrote:
> Hi Florian,
>
> P.S. Since I'm revisiting a year old thread with some new CC 
> recipients, here's the link to original patch/thread [1]
>
> On 9/17/21 20:04, Vincent Chen wrote:
>> On Thu, Sep 16, 2021 at 4:14 PM Florian Weimer <fweimer@redhat.com> 
>> wrote:
>>>>>> This changes the size of struct ucontext_t, which is an ABI break
>>>>>> (getcontext callers are supposed to provide their own object).
>>>>>>
>>>> The riscv vector registers are all caller-saved registers except for
>>>> VCSR. Therefore, the struct mcontext_t needs to reserve a space for
>>>> it. In addition, RISCV ISA is growing, so I also hope the struct
>>>> mcontext_t has a space for future expansion. Based on the above ideas,
>>>> I reserved a 5K space here.
>>> You have reserved space in ucontext_t that you could use for this.
>>>
>> Sorry, I cannot really understand what you mean. The following is the
>> contents of ucontext_t
>> typedef struct ucontext_t
>>    {
>>      unsigned long int  __uc_flags;
>>      struct ucontext_t *uc_link;
>>      stack_t            uc_stack;
>>      sigset_t           uc_sigmask;
>>      /* There's some padding here to allow sigset_t to be expanded in the
>>         future.  Though this is unlikely, other architectures put uc_sigmask
>>         at the end of this structure and explicitly state it can be
>>         expanded, so we didn't want to box ourselves in here. */
>>      char               __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
>>      /* We can't put uc_sigmask at the end of this structure because we need
>>         to be able to expand sigcontext in the future.  For example, the
>>         vector ISA extension will almost certainly add ISA state.  We want
>>         to ensure all user-visible ISA state can be saved and restored via a
>>         ucontext, so we're putting this at the end in order to allow for
>>         infinite extensibility.  Since we know this will be extended and we
>>         assume sigset_t won't be extended an extreme amount, we're
>>         prioritizing this.  */
>>      mcontext_t uc_mcontext;
>>    } ucontext_t;
>>
>> Currently, we only reserve a space, __glibc_reserved[], for the future
>> expansion of sigset_t.
>> Do you mean I could use __glibc_reserved[] for future expansion of
>> ISA as well?
>
> Given unlikely sigset expansion, we could in theory use some of those 
> reserved fields to store pointers (offsets) to actual V state, but not 
> for actual V state which is way too large for non-embedded machines 
> with typical 128 or even wider V regs.
>
>
>>
>>>>>> This shouldn't be necessary if the additional vector registers are
>>>>>> caller-saved.
>>>> Here I am a little confused about the usage of struct mcontext_t. As
>>>> far as I know, the struct mcontext_t is used to save the
>>>> machine-specific information in user context operation. Therefore, in
>>>> this case, the struct mcontext_t is allowed to reserve the space only
>>>> for saving caller-saved registers. However, in the signal handler, the
>>>> user seems to be allowed to use uc_mcontext whose data type is struct
>>>> mcontext_t to access the content of the signal context. In this case,
>>>> the struct mcontext_t may need to be the same as the struct sigcontext
>>>> defined at kernel. However, it will have a conflict with your
>>>> suggestion because the struct sigcontext cannot just reserve a space
>>>> for saving caller-saved registers. Could you help me point out my
>>>> misunderstanding? Thank you.
>
> I think the confusion comes from apparent equivalence of kernel struct 
> sigcontext and glibc mcontext_t as they appear in respective struct 
> ucontext definitions.
> I've enumerated the actual RV structs below to keep them handy in one 
> place for discussion.
>
>>> struct sigcontext is allocated by the kernel, so you can have pointers
>>> in reserved fields to out-of-line start, or after struct sigcontext.
>
> In this scheme, would the actual V regfile contents (at the 
> out-of-line location w.r.t kernel sigcontext) be anonymous for glibc 
> i.e. do we not need to expose them to glibc userspace ABI ?
>
>
>>> I don't know how the kernel implements this, but there is considerable
>>> flexibility and extensibility.  The main issue comes from small stacks
>>> which are incompatible with large register files.
>
> Simplistically, the Linux kernel needs to preserve the V regfile across
> task switch. The necessary evil that follows is preserving V across
> signal-handling (sigaction/sigreturn).
>
> In the RV kernel we have the following:
>
> struct rt_sigframe {
>   struct siginfo info;
>   struct ucontext uc;
> };
>
> struct ucontext {
>    unsigned long uc_flags;
>    struct ucontext *uc_link;
>    stack_t uc_stack;
>    sigset_t uc_sigmask;
>    __u8 __unused[1024 / 8 - sizeof(sigset_t)];     // for sigset_t expansion
>    struct sigcontext uc_mcontext;
> };
>
> struct sigcontext {
>    struct user_regs_struct sc_regs;
>    union __riscv_fp_state sc_fpregs;
> +  __u8 sc_extn[4096+128] __attribute__((__aligned__(16)));   // handle 128B V regs
> };
>
> The sc_extn[] would have V state (regfile + control state) in kernel 
> defined format.
>
> As I understand it, you are suggesting that, to prevent an ABI break, we
> should not add anything to kernel struct sigcontext, i.e. do something
> like this:
>
> struct rt_sigframe {
>   struct siginfo info;
>   struct ucontext uc;
> +__u8 sc_extn[4096+128] __attribute__((__aligned__(16)));
> };
>
> So kernel signal handling can continue to save/restore the V regfile on
> the user stack, without making it part of the actual struct sigcontext.
> The V regs would then not be explicitly visible to userspace at all. Is
> that feasible? I know that SA_SIGINFO handlers can access the scalar/fp
> regs; they would not be able to do the same for V.
> Is there a POSIX requirement for SA_SIGINFO handlers to be able to access
> all machine regs saved by signal handling?
>
> An alternate approach is what Vincent did originally: add sc_extn to
> struct sigcontext. Here, to prevent ABI breakage, we can choose not to
> reflect this in the glibc sigcontext. But the question remains: is
> that OK?
>
> The other topic is changing glibc mcontext_t to add V-regs. It would
> seem one has to, as mcontext_t is "visually equivalent" to struct
> sigcontext in the respective ucontext structs. But the userspace
> *context routine semantics only require callee-saved regs to be saved,
> which V regs are not per the psABI [2]. So it looks like this can be
> avoided, which is what Vincent did in the v2 series [3].
>
>
> [1] 
> https://sourceware.org/pipermail/libc-alpha/2021-September/130899.html
> [2] 
> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
> [3] https://sourceware.org/pipermail/libc-alpha/2022-January/135416.html


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-09  3:39             ` RISCV kernel struct sigcontext expansion for V regs and potential glibc ABI break (was Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.) Vineet Gupta
  2022-12-09  4:03               ` Vineet Gupta
@ 2022-12-20 20:05               ` Vineet Gupta
  2022-12-21 15:53                 ` Vincent Chen
  1 sibling, 1 reply; 79+ messages in thread
From: Vineet Gupta @ 2022-12-20 20:05 UTC (permalink / raw)
  To: Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy, Andy Chiu,
	Greentime Hu, Vincent Chen, Aaron Durbin, Andrew de los Reyes
  Cc: linux-riscv, GNU C Library

[-- Attachment #1: Type: text/plain, Size: 10252 bytes --]

Hi folks,

Apologies for the extraneous CC (and the top post), but I would really 
appreciate some feedback on this to close on the V-ext plumbing support 
in kernel/glibc. This is one of the two contentious issues (other being 
prctl enable) preventing us from getting to an RVV enabled SW ecosystem.

The premise is: to preserve V-ext registers across signal handling,
the natural way is to add V reg storage to kernel struct sigcontext,
where scalar / fp regs are currently saved. But this doesn’t seem to be
the right way to go:

1. It breaks the userspace ABI (even if user programs were recompiled)
because the RV glibc port, for historical reasons, defines its own version
of struct sigcontext (vs. relying on the kernel-exported UAPI header).

2. Even if we were to expand sigcontext (in both kernel and glibc, which
is always hard to time), there's still a (different) ABI breakage for
existing binaries, despite the earlier proposed __extension__ union trick
[2], since the size of the sigcontext struct changes underneath them.

3. glibc {set,get,*}context() routines use struct mcontext_t, which is
analogous to kernel struct sigcontext (in the respective ucontext structs
[1]). Thus ideally mcontext_t would need to be expanded too, but it need
not be, given its semantics of saving callee-saved regs only: per the
current psABI, RVV regs are caller-saved/call-clobbered [3]. Apparently
this connection of sigcontext to mcontext_t is also historical, as some
arches used/still use sigreturn to restore regs in setcontext [4].

Does anyone disagree that 1-3 are valid reasons?

So the proposal here is to *not* add V-ext state to kernel sigcontext 
but instead dynamically to struct rt_sigframe, similar to aarch64 
kernel. This avoids touching glibc sigcontext as well.

struct rt_sigframe {
   struct siginfo info;
   struct ucontext uc;
+/* C99 flexible array member: handles implementation-defined VLEN-wide regs */
+__u8 sc_extn[] __attribute__((__aligned__(16)));
};
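
A flexible array member contributes nothing to sizeof, so the space to reserve on the user stack has to be computed at run time from the fixed part plus the extension payload. A minimal sketch of that calculation, using a stand-in struct rather than the real kernel types (the 200-byte placeholder and all names here are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the frame; the real siginfo/ucontext members are elided. */
struct demo_sigframe {
    unsigned char fixed[200];                              /* placeholder for info + uc */
    unsigned char sc_extn[] __attribute__((aligned(16)));  /* flexible array member */
};

/* Round n up to a multiple of align (align must be a power of two). */
size_t round_up_pow2(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Total bytes to reserve for a frame carrying ext_bytes of extension state. */
size_t demo_frame_size(size_t ext_bytes)
{
    return round_up_pow2(sizeof(struct demo_sigframe) + ext_bytes, 16);
}
```

The alignment attribute on the flexible member pads the fixed part so that the extension area always starts on a 16-byte boundary.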

The only downside to this is that SA_SIGINFO signal handlers don’t have 
direct access to V state (but it seems aarch64 kernel doesn’t either).

Does anyone really disagree with this proposal?

Attached is a proof-of-concept kernel patch which implements this 
proposal with no need for any corresponding glibc change.
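
The patch tags each block of extension state with a small header (magic, size) and ends the list with a zero-magic record, in the style of arm64. Walking such a list can be sketched in plain userspace C as below. The struct mirrors __riscv_ctx_hdr from the patch and the magic value is the patch's RVV_MAGIC; everything prefixed demo_ is hypothetical. The sketch only walks a buffer and says nothing about how a signal handler would locate the records.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_RVV_MAGIC 0x53465457u   /* value of RVV_MAGIC in the patch */
#define DEMO_END_MAGIC 0x0u

struct demo_ctx_hdr {                /* same layout as __riscv_ctx_hdr */
    uint32_t magic;
    uint32_t size;                   /* whole record, header included */
};

/* Return the byte offset of the first record tagged `magic`,
 * or -1 if it is absent or the list is malformed. */
long demo_find_record(const unsigned char *buf, size_t len, uint32_t magic)
{
    size_t off = 0;

    assert(buf != NULL);
    while (off + sizeof(struct demo_ctx_hdr) <= len) {
        struct demo_ctx_hdr h;

        memcpy(&h, buf + off, sizeof h);   /* avoids unaligned loads */
        if (h.magic == DEMO_END_MAGIC)
            return -1;                     /* terminator reached */
        if (h.size < sizeof h || h.size > len - off)
            return -1;                     /* malformed record */
        if (h.magic == magic)
            return (long)off;
        off += h.size;
    }
    return -1;
}

/* Build a two-record sample (one 32-byte RVV record, then END)
 * and look `magic` up in it. */
long demo_sample_lookup(uint32_t magic)
{
    unsigned char buf[64] = { 0 };
    struct demo_ctx_hdr rvv = { DEMO_RVV_MAGIC, 32 };
    struct demo_ctx_hdr end = { DEMO_END_MAGIC, 0 };

    memcpy(buf, &rvv, sizeof rvv);
    memcpy(buf + 32, &end, sizeof end);
    return demo_find_record(buf, sizeof buf, magic);
}
```

Unknown magics are skipped by size, which is what lets later extensions add records without breaking older parsers.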

Thx,
-Vineet


[1] ucontext in kernel and glibc respectively.

kernel: arch/riscv/include/uapi/asm/ucontext.h

struct ucontext {
  unsigned long uc_flags;
  struct ucontext *uc_link;
  stack_t uc_stack;
  sigset_t uc_sigmask;
  __u8 __unused[1024 / 8 - sizeof(sigset_t)];
  struct sigcontext uc_mcontext;
};

glibc: sysdeps/unix/sysv/linux/riscv/sys/ucontext.h

typedef struct ucontext_t
   {
     unsigned long int  __uc_flags;
     struct ucontext_t *uc_link;
     stack_t            uc_stack;
     sigset_t           uc_sigmask;
     /* padding to allow future sigset_t expansion */
     char   __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
     mcontext_t uc_mcontext;
   } ucontext_t;

[2] https://sourceware.org/pipermail/libc-alpha/2022-January/135610.html
[3] 
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
[4] https://sourceware.org/legacy-ml/libc-alpha/2014-04/msg00006.html





[-- Attachment #2: 0001-riscv-Add-sigcontext-save-restore-for-vector.patch --]
[-- Type: text/x-patch, Size: 10635 bytes --]

From 169eea1ef072c8403277a66313b00258080ac92c Mon Sep 17 00:00:00 2001
From: Vineet Gupta <vineetg@rivosinc.com>
Date: Wed, 21 Sep 2022 14:43:52 -0700
Subject: [PATCH] riscv: Add sigcontext save/restore for vector

V state needs to be preserved across signal handling on user stack.
To avoid glibc ABI break, this is not added to struct sigcontext (unlike
int/fp regs) but to struct rt_sigframe. Also this is all done
dynamically (vs. some static allocation) to cleanly handle implementation
defined VLEN wide V-regs.

We also borrow arm64 style of "context header" to tag the extension
state to allow for easy integration of future extensions.

Co-developed-by: Vincent Chen <vincent.chen@sifive.com>
Co-developed-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
[vineetg: reworked to not change struct sigcontext,
          wireup init_rt_signal_env]
---
 arch/riscv/include/asm/processor.h       |   1 +
 arch/riscv/include/uapi/asm/sigcontext.h |  18 +++
 arch/riscv/kernel/asm-offsets.c          |   2 +
 arch/riscv/kernel/setup.c                |   2 +
 arch/riscv/kernel/signal.c               | 171 +++++++++++++++++++++--
 5 files changed, 186 insertions(+), 8 deletions(-)

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 95917a2b24f9..854854b377b2 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -85,6 +85,7 @@ int riscv_of_parent_hartid(struct device_node *node, unsigned long *hartid);
 
 extern void riscv_fill_hwcap(void);
 extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
+void init_rt_signal_env(void);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/riscv/include/uapi/asm/sigcontext.h b/arch/riscv/include/uapi/asm/sigcontext.h
index 84f2dfcfdbce..411bf6985784 100644
--- a/arch/riscv/include/uapi/asm/sigcontext.h
+++ b/arch/riscv/include/uapi/asm/sigcontext.h
@@ -8,6 +8,24 @@
 
 #include <asm/ptrace.h>
 
+/* The Magic number for signal context frame header. */
+#define RVV_MAGIC	0x53465457
+#define END_MAGIC	0x0
+
+/* The size of END signal context header. */
+#define END_HDR_SIZE	0x0
+
+/* Every optional extension state needs to have the hdr. */
+struct __riscv_ctx_hdr {
+	__u32 magic;
+	__u32 size;
+};
+
+struct __sc_riscv_v_state {
+	struct __riscv_ctx_hdr head;
+	struct __riscv_v_state v_state;
+} __attribute__((aligned(16)));
+
 /*
  * Signal context structure
  *
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index 37e3e6a8d877..80316ef7bb78 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -75,6 +75,8 @@ void asm_offsets(void)
 	OFFSET(TSK_STACK_CANARY, task_struct, stack_canary);
 #endif
 
+	OFFSET(RISCV_V_STATE_MAGIC, __riscv_ctx_hdr, magic);
+	OFFSET(RISCV_V_STATE_SIZE, __riscv_ctx_hdr, size);
 	OFFSET(RISCV_V_STATE_VSTART, __riscv_v_state, vstart);
 	OFFSET(RISCV_V_STATE_VL, __riscv_v_state, vl);
 	OFFSET(RISCV_V_STATE_VTYPE, __riscv_v_state, vtype);
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 2dfc463b86bb..aa0eedd3b890 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -299,6 +299,8 @@ void __init setup_arch(char **cmdline_p)
 	riscv_init_cbom_blocksize();
 	riscv_fill_hwcap();
 	apply_boot_alternatives();
+	/* needs to be after riscv_fill_hwcap */
+	init_rt_signal_env();
 }
 
 static int __init topology_init(void)
diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
index 5c591123c440..ee234c319e5b 100644
--- a/arch/riscv/kernel/signal.c
+++ b/arch/riscv/kernel/signal.c
@@ -21,15 +21,27 @@
 #include <asm/csr.h>
 
 extern u32 __user_rt_sigreturn[2];
+static size_t rvv_sc_size;
 
 #define DEBUG_SIG 0
 
 struct rt_sigframe {
 	struct siginfo info;
-	struct ucontext uc;
 #ifndef CONFIG_MMU
 	u32 sigreturn_code[2];
 #endif
+	struct ucontext uc;
+	/*
+	 * Placeholder for additional state for V ext (and others in future).
+	 *  - Not added to struct sigcontext (unlike int/fp regs) to remain
+	 *    compatible with existing glibc struct sigcontext
+	 *  - Not added here explicitly either to allow for
+	 *     - Implementation defined VLEN wide V reg
+	 *     - Ability to do this per process
+	 * The actual V state struct is defined in uapi header.
+	 * Note: The alignment of 16 is ABI mandated for stack entries.
+	 */
+	__u8 sc_extn[] __attribute__((__aligned__(16)));
 };
 
 #ifdef CONFIG_FPU
@@ -86,16 +98,142 @@ static long save_fp_state(struct pt_regs *regs,
 #define restore_fp_state(task, regs) (0)
 #endif
 
-static long restore_sigcontext(struct pt_regs *regs,
-	struct sigcontext __user *sc)
+#ifdef CONFIG_RISCV_ISA_V
+
+static long save_v_state(struct pt_regs *regs, void **sc_vec)
+{
+	/*
+	 * Put __sc_riscv_v_state into the user's signal context space pointed
+	 * to by sc_vec; datap points to the address right after
+	 * __sc_riscv_v_state.
+	 */
+	struct __sc_riscv_v_state __user *state = (struct __sc_riscv_v_state *) (*sc_vec);
+	void __user *datap = state + 1;
+	long err;
+
+	err = __put_user(RVV_MAGIC, &state->head.magic);
+	err |= __put_user(rvv_sc_size, &state->head.size);
+
+	vstate_save(current, regs);
+	/* Copy additional vstate (except V regfile). */
+	err |= __copy_to_user(&state->v_state, &current->thread.vstate,
+			      RISCV_V_STATE_DATAP);
+	if (unlikely(err))
+		return err;
+
+	/* Copy the pointer datap itself. */
+	err = __put_user(datap, &state->v_state.datap);
+	if (unlikely(err))
+		return err;
+
+	/* Copy the V regfile to user space datap. */
+	err = __copy_to_user(datap, current->thread.vstate.datap, riscv_vsize);
+
+	*sc_vec += rvv_sc_size;
+
+	return err;
+}
+
+static long restore_v_state(struct pt_regs *regs, void **sc_vec)
+{
+	long err;
+	struct __sc_riscv_v_state __user *state = (struct __sc_riscv_v_state *)(*sc_vec);
+	void __user *datap;
+
+	/* ctx_hdr check for RVV_MAGIC already done in caller. */
+
+	/* Copy everything of __sc_riscv_v_state except datap. */
+	err = __copy_from_user(&current->thread.vstate, &state->v_state,
+			       RISCV_V_STATE_DATAP);
+	if (unlikely(err))
+		return err;
+
+	/* Copy the pointer datap itself. */
+	err = __get_user(datap, &state->v_state.datap);
+	if (unlikely(err))
+		return err;
+
+	/* Copy the whole vector content from user space datap. */
+	err = __copy_from_user(current->thread.vstate.datap, datap, riscv_vsize);
+	if (unlikely(err))
+		return err;
+
+	vstate_restore(current, regs);
+
+	*sc_vec += rvv_sc_size;
+
+	return err;
+}
+
+#else
+#define save_v_state(task, regs) (0)
+#define restore_v_state(task, regs) (0)
+#endif
+
+static long restore_sigcontext(struct rt_sigframe __user *frame,
+			       struct pt_regs *regs)
 {
+	struct sigcontext __user *sc = &frame->uc.uc_mcontext;
+	void *sc_extn = &frame->sc_extn;
 	long err;
+
 	/* sc_regs is structured the same as the start of pt_regs */
 	err = __copy_from_user(regs, &sc->sc_regs, sizeof(sc->sc_regs));
 	/* Restore the floating-point state. */
 	if (has_fpu())
 		err |= restore_fp_state(regs, &sc->sc_fpregs);
+
+	while (!err) {
+		struct __riscv_ctx_hdr *head = (struct __riscv_ctx_hdr *)sc_extn;
+		__u32 magic, size;
+
+		err |= __get_user(magic, &head->magic);
+		err |= __get_user(size, &head->size);
+		if (err)
+			goto done;
+
+		switch (magic) {
+		case END_MAGIC:
+			if (size != END_HDR_SIZE)
+				goto invalid;
+			goto done;
+		case RVV_MAGIC:
+			if (!has_vector() || (size != rvv_sc_size))
+				goto invalid;
+			err |= restore_v_state(regs, &sc_extn);
+			break;
+		default:
+			goto invalid;
+		}
+	}
+done:
 	return err;
+
+invalid:
+	return -EINVAL;
+}
+
+static size_t cal_rt_frame_size(void)
+{
+	struct rt_sigframe __user *frame;
+	static size_t frame_size;
+	size_t total_context_size = 0;
+
+	if (frame_size)
+		goto done;
+
+	total_context_size = sizeof(*frame);
+
+	if (has_vector())
+		total_context_size += rvv_sc_size;
+
+	/* Add a __riscv_ctx_hdr for END signal context header. */
+	total_context_size += sizeof(struct __riscv_ctx_hdr);
+
+	frame_size = round_up(total_context_size, 16);
+done:
+	return frame_size;
+
 }
 
 SYSCALL_DEFINE0(rt_sigreturn)
@@ -104,13 +242,14 @@ SYSCALL_DEFINE0(rt_sigreturn)
 	struct rt_sigframe __user *frame;
 	struct task_struct *task;
 	sigset_t set;
+	size_t frame_size = cal_rt_frame_size();
 
 	/* Always make any pending restarted system calls return -EINTR */
 	current->restart_block.fn = do_no_restart_syscall;
 
 	frame = (struct rt_sigframe __user *)regs->sp;
 
-	if (!access_ok(frame, sizeof(*frame)))
+	if (!access_ok(frame, frame_size))
 		goto badframe;
 
 	if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set)))
@@ -118,7 +257,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
 
 	set_current_blocked(&set);
 
-	if (restore_sigcontext(regs, &frame->uc.uc_mcontext))
+	if (restore_sigcontext(frame, regs))
 		goto badframe;
 
 	if (restore_altstack(&frame->uc.uc_stack))
@@ -141,15 +280,24 @@ SYSCALL_DEFINE0(rt_sigreturn)
 }
 
 static long setup_sigcontext(struct rt_sigframe __user *frame,
-	struct pt_regs *regs)
+			     struct pt_regs *regs)
 {
 	struct sigcontext __user *sc = &frame->uc.uc_mcontext;
+	void *sc_extn = &frame->sc_extn;
 	long err;
+
 	/* sc_regs is structured the same as the start of pt_regs */
 	err = __copy_to_user(&sc->sc_regs, regs, sizeof(sc->sc_regs));
 	/* Save the floating-point state. */
 	if (has_fpu())
 		err |= save_fp_state(regs, &sc->sc_fpregs);
+	/* Save the vector state. */
+	if (has_vector())
+		err |= save_v_state(regs, &sc_extn);
+
+	/* Put END __riscv_ctx_hdr at the end. */
+	err |= __put_user(END_MAGIC, &((struct __riscv_ctx_hdr *)sc_extn)->magic);
+	err |= __put_user(END_HDR_SIZE, &((struct __riscv_ctx_hdr *)sc_extn)->size);
 	return err;
 }
 
@@ -180,10 +328,11 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
 	struct pt_regs *regs)
 {
 	struct rt_sigframe __user *frame;
+	size_t frame_size = cal_rt_frame_size();
 	long err = 0;
 
-	frame = get_sigframe(ksig, regs, sizeof(*frame));
-	if (!access_ok(frame, sizeof(*frame)))
+	frame = get_sigframe(ksig, regs, frame_size);
+	if (!access_ok(frame, frame_size))
 		return -EFAULT;
 
 	err |= copy_siginfo_to_user(&frame->info, &ksig->info);
@@ -329,3 +478,9 @@ asmlinkage __visible void do_notify_resume(struct pt_regs *regs,
 	if (thread_info_flags & _TIF_NOTIFY_RESUME)
 		resume_user_mode_work(regs);
 }
+
+void __init init_rt_signal_env(void)
+{
+	/* Vector regfile + control regs. */
+	rvv_sc_size = sizeof(struct __sc_riscv_v_state) + riscv_vsize;
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-20 20:05               ` Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break Vineet Gupta
@ 2022-12-21 15:53                 ` Vincent Chen
  2022-12-21 19:45                   ` Vineet Gupta
  0 siblings, 1 reply; 79+ messages in thread
From: Vincent Chen @ 2022-12-21 15:53 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy, Andy Chiu,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

Hi Vineet,
Thank you for starting this discussion thread to build consensus and
propose a way to solve this problem. I don't object to your proposal;
I just don't understand why my solution is inappropriate. IIUC, the
kernel uses struct sigcontext to preserve the register context before
entering the signal handler. Because the memory region that holds the
saved register context is in user space, user space can parse that
same region through the same struct sigcontext. Therefore, we don't
want an ABI break that makes this mechanism fail across different
kernel and Glibc combinations. Back to my approach: as you mentioned,
my patch changes the size of struct sigcontext. However, this size
difference does not seem to break the above mechanism. I enumerate the
possible cases below for discussion.

1. Kernel without RVV support + user program using original Glibc sigcontext.
This is the current Glibc case. It has no problems.

2. Kernel with RVV support + user program using the new sigcontext definition.
The mechanism works because the sigcontext definition in the kernel
matches the definition in user programs.

3. Kernel without RVV support + user program using the new sigcontext definition.
Because the kernel does not store the vector register context to memory,
the __reserved[4224] in the Glibc sigcontext is unneeded. The struct
sigcontext in user programs therefore wastes a lot of memory on
__reserved[4224] if user programs allocate memory for it. The
mechanism itself still works.

4. Kernel with RVV support + user program using original Glibc sigcontext.
In this case, the kernel needs to save the vector register context to
memory. The user program may encounter memory issues if user space
does not reserve enough memory for the kernel to create the
sigcontext. However, we can't seem to avoid this case since there is
no flexible memory area in struct sigcontext for future expansion.

From the above enumeration, the 3rd case is a problem for my approach.
It may be solved by replacing the __reserved[4224] in struct
sigcontext with a C99 flexible array member. The new patch then
becomes:

--- a/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
+++ b/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h

@@ -22,10 +22,28 @@
 # error "Never use <bits/sigcontext.h> directly; include <signal.h> instead."
 #endif

+#define sigcontext kernel_sigcontext
+#include <asm/sigcontext.h>
+#undef sigcontext

 struct sigcontext {
 /* gregs[0] holds the program counter. */
- unsigned long int gregs[32];
- unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
+ __extension__ union {
+ unsigned long int gregs[32];
+ /* Kernel uses struct user_regs_struct to save x1-x31 and pc
+ to the signal context, so please use sc_regs to access
+ these registers from the signal context. */
+ struct user_regs_struct sc_regs;
+ };

+ __extension__ union {
+ unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
+ /* Kernel uses struct __riscv_fp_state to save f0-f31 and fcsr
+ to the signal context, so please use sc_fpregs to access these
+ fpu registers from the signal context. */
+ union __riscv_fp_state sc_fpregs;
+ };
+
+ __u8 sc_extn[] __attribute__((__aligned__(16)));
 };

 #endif


This change reduces the wasted memory to 16 bytes in the worst case.
In the best case, sc_extn is located at a 16-byte aligned address and
the size of struct sigcontext stays the same.

If the above inference is acceptable, I want to mention some
advantages of my patch. This approach allows user programs to directly
access the vector register context. In addition, new user programs can
use the kernel-defined struct sigcontext to access the register
context. The memory layout of the FPU registers in the kernel-defined
struct sigcontext differs from the one in the Glibc-defined struct
sigcontext, which probably causes user programs to read wrong FPU
register values from the context. My approach helps user programs get
the correct FPU registers because they can use the kernel-defined
struct sigcontext to access the FPU register context. It will help
RISC-V users get rid of the historical burden in Glibc sigcontext.h.


Thanks,
Vincent Chen

On Wed, Dec 21, 2022 at 4:05 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>
> Hi folks,
>
> Apologies for the extraneous CC (and the top post), but I would really
> appreciate some feedback on this to close on the V-ext plumbing support
> in kernel/glibc. This is one of the two contentious issues (other being
> prctl enable) preventing us from getting to an RVV enabled SW ecosystem.
>
> The premise is : for preserving V-ext registers across signal handling,
> the natural way is to add V reg storage to kernel struct sigcontext
> where scalar / fp regs are currently saved. But this doesn’t seem to be
> the right way to go:
>
> 1. Breaks the userspace ABI (even if user programs were recompiled)
> because RV glibc port for historical reasons has defined its own version
> of struct sigcontext (vs. relying on kernel exported UAPI header).
>
> 2. Even if we were to expand sigcontext (in both kernel and glibc, which
> is always hard to time) there's still a (different) ABI breakage for
> existing binaries despite earlier proposed __extension__ union trick [2]
> since it still breaks old binaries w.r.t. size of the sigcontext struct.
>
> 3. glibc {set,get,*}context() routines use struct mcontext_t which is
> analogous to kernel struct sigcontext (in respective ucontext structs
> [1]). Thus ideally mcontext_t needs to be expanded too but need not be,
> given its semantics to save callee-saved regs only : per current psABI
> RVV regs are caller-saved/call-clobbered [3]. Apparently this
> connection of sigcontext to mcontext_t is also historical as some arches
> used/still-use sigreturn to restore regs in setcontext [4]
>
> Does anyone disagree that 1-3 are valid reasons?
>
> So the proposal here is to *not* add V-ext state to kernel sigcontext
> but instead dynamically to struct rt_sigframe, similar to aarch64
> kernel. This avoids touching glibc sigcontext as well.
>
> struct rt_sigframe {
>    struct siginfo info;
>    struct ucontext uc;
> +/* C99 flexible array member to handle implementation-defined
> +   VLEN-wide regs */
> +__u8 sc_extn[] __attribute__((__aligned__(16)));
> };
>
> The only downside to this is that SA_SIGINFO signal handlers don’t have
> direct access to V state (but it seems aarch64 kernel doesn’t either).
>
> Does anyone really disagree with this proposal?
>
> Attached is a proof-of-concept kernel patch which implements this
> proposal with no need for any corresponding glibc change.
>
> Thx,
> -Vineet
>
>
> [1] ucontext in kernel and glibc, respectively.
>
> kernel: arch/riscv/include/uapi/asm/ucontext.h
>
> struct ucontext {
>   unsigned long uc_flags;
>   struct ucontext *uc_link;
>   stack_t uc_stack;
>   sigset_t uc_sigmask;
>   __u8 __unused[1024 / 8 - sizeof(sigset_t)];
>   struct sigcontext uc_mcontext;
> }
>
> glibc: sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
>
> typedef struct ucontext_t
>    {
>      unsigned long int  __uc_flags;
>      struct ucontext_t *uc_link;
>      stack_t            uc_stack;
>      sigset_t           uc_sigmask;
>      /* padding to allow future sigset_t expansion */
>      char   __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
>       mcontext_t uc_mcontext;
> } ucontext_t;
>
> [2] https://sourceware.org/pipermail/libc-alpha/2022-January/135610.html
> [3]
> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
> [4] https://sourceware.org/legacy-ml/libc-alpha/2014-04/msg00006.html
>
>
>
>
> On 12/8/22 19:39, Vineet Gupta wrote:
> > Hi Florian,
> >
> > P.S. Since I'm revisiting a year old thread with some new CC
> > recipients, here's the link to original patch/thread [1]
> >
> > On 9/17/21 20:04, Vincent Chen wrote:
> >> On Thu, Sep 16, 2021 at 4:14 PM Florian Weimer <fweimer@redhat.com>
> >> wrote:
> >>>>>> This changes the size of struct ucontext_t, which is an ABI break
> >>>>>> (getcontext callers are supposed to provide their own object).
> >>>>>>
> >>>> The riscv vector registers are all caller-saved registers except for
> >>>> VCSR. Therefore, the struct mcontext_t needs to reserve a space for
> >>>> it. In addition, RISCV ISA is growing, so I also hope the struct
> >>>> mcontext_t has a space for future expansion. Based on the above ideas,
> >>>> I reserved a 5K space here.
> >>> You have reserved space in ucontext_t that you could use for this.
> >>>
> >> Sorry, I cannot really understand what you mean. The following is the
> >> contents of ucontext_t
> >> typedef struct ucontext_t
> >>    {
> >>      unsigned long int  __uc_flags;
> >>      struct ucontext_t *uc_link;
> >>      stack_t            uc_stack;
> >>      sigset_t           uc_sigmask;
> >>      /* There's some padding here to allow sigset_t to be expanded in
> >> the
> >>         future.  Though this is unlikely, other architectures put
> >> uc_sigmask
> >>         at the end of this structure and explicitly state it can be
> >>         expanded, so we didn't want to box ourselves in here. */
> >>      char               __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
> >>      /* We can't put uc_sigmask at the end of this structure because
> >> we need
> >>         to be able to expand sigcontext in the future.  For example, the
> >>         vector ISA extension will almost certainly add ISA state.  We
> >> want
> >>         to ensure all user-visible ISA state can be saved and
> >> restored via a
> >>         ucontext, so we're putting this at the end in order to allow for
> >>         infinite extensibility.  Since we know this will be extended
> >> and we
> >>         assume sigset_t won't be extended an extreme amount, we're
> >>         prioritizing this.  */
> >>      mcontext_t uc_mcontext;
> >>    } ucontext_t;
> >>
> >> Currently, we only reserve a space, __glibc_reserved[], for the future
> >> expansion of sigset_t.
> >> Do you mean I could use __glibc_reserved[] to for future expansion of
> >> ISA as well?
> >
> > Given unlikely sigset expansion, we could in theory use some of those
> > reserved fields to store pointers (offsets) to actual V state, but not
> > for actual V state which is way too large for non-embedded machines
> > with typical 128 or even wider V regs.
> >
> >
> >>
> >>>>>> This shouldn't be necessary if the additional vector registers are
> >>>>>> caller-saved.
> >>>> Here I am a little confused about the usage of struct mcontext_t. As
> >>>> far as I know, the struct mcontext_t is used to save the
> >>>> machine-specific information in user context operation. Therefore, in
> >>>> this case, the struct mcontext_t is allowed to reserve the space only
> >>>> for saving caller-saved registers. However, in the signal handler, the
> >>>> user seems to be allowed to use uc_mcontext whose data type is struct
> >>>> mcontext_t to access the content of the signal context. In this case,
> >>>> the struct mcontext_t may need to be the same as the struct sigcontext
> >>>> defined at kernel. However, it will have a conflict with your
> >>>> suggestion because the struct sigcontext cannot just reserve a space
> >>>> for saving caller-saved registers. Could you help me point out my
> >>>> misunderstanding? Thank you.
> >
> > I think the confusion comes from apparent equivalence of kernel struct
> > sigcontext and glibc mcontext_t as they appear in respective struct
> > ucontext definitions.
> > I've enumerated the actual RV structs below to keep them handy in one
> > place for discussion.
> >
> >>> struct sigcontext is allocated by the kernel, so you can have pointers
> >>> in reserved fields to out-of-line start, or after struct sigcontext.
> >
> > In this scheme, would the actual V regfile contents (at the
> > out-of-line location w.r.t kernel sigcontext) be anonymous for glibc
> > i.e. do we not need to expose them to glibc userspace ABI ?
> >
> >
> >>> I don't know how the kernel implements this, but there is considerable
> >>> flexibility and extensibility.  The main issues comes from small stacks
> >>> which are incompatible with large register files.
> >
> > Simplistically, Linux kernel needs to preserve the V regfile across
> > task switch. The necessary evil that follows is preserving V across
> > signal-handling (sigaction/sigreturn).
> >
> > In RV kernel we have following:
> >
> > struct rt_sigframe {
> >   struct siginfo info;
> >   struct ucontext uc;
> > };
> >
> > struct ucontext {
> >    unsigned long uc_flags;
> >    struct ucontext *uc_link;
> >    stack_t uc_stack;
> >    sigset_t uc_sigmask;
> >    __u8 __unused[1024 / 8 - sizeof(sigset_t)];     // this is for
> > sigset_t expansion
> >    struct sigcontext uc_mcontext;
> > };
> >
> > struct sigcontext {
> >    struct user_regs_struct sc_regs;
> >    union __riscv_fp_state sc_fpregs;
> > +  __u8 sc_extn[4096+128] __attribute__((__aligned__(16)));   //
> > handle 128B V regs
> > };
> >
> > The sc_extn[] would have V state (regfile + control state) in kernel
> > defined format.
> >
> > As I understand it, you are suggesting to prevent ABI break, we should
> > not add anything to kernel struct sigcontext i.e. do something like this
> >
> > struct rt_sigframe {
> >   struct siginfo info;
> >   struct ucontext uc;
> > +__u8 sc_extn[4096+128] __attribute__((__aligned__(16)));
> > }
> >
> > So kernel sig handling can continue to save/restore the V regfile on
> > user stack, w/o making it part of actual struct sigcontext.
> > So they are not explicitly visible to userspace at all - is that
> > feasible ? I know that SA_SIGINFO handlers can access the scalar/fp
> > regs, they won't do it V.
> > Is there a POSIX req for SA_SIGINFO handlers being able to access all
> > machine regs saved by signal handling.
> >
> > An alternate approach is what Vincent did originally, to add sc_extn to
> > struct sigcontext. Here to prevent ABI breakage, we can choose to not
> > reflect this in the glibc sigcontext. But the question remains, is
> > that OK ?
> >
> > The other topic is changing glibc mcontext_t to add V-regs. It would
> > seem one has to as mcontext is "visually equivalent" to struct
> > sigcontext in the respective ucontext structs. But in userspace
> > *context routine semantics only require callee-regs to be saved, which
> > V regs are not per psABI [2]. So looks like this can be avoided which
> > is what Vincent did in v2 series [3]
> >
> >
> > [1]
> > https://sourceware.org/pipermail/libc-alpha/2021-September/130899.html
> > [2]
> > https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
> > [3] https://sourceware.org/pipermail/libc-alpha/2022-January/135416.html

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-21 15:53                 ` Vincent Chen
@ 2022-12-21 19:45                   ` Vineet Gupta
  2022-12-21 19:52                     ` Vineet Gupta
                                       ` (2 more replies)
  0 siblings, 3 replies; 79+ messages in thread
From: Vineet Gupta @ 2022-12-21 19:45 UTC (permalink / raw)
  To: Vincent Chen
  Cc: Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy, Andy Chiu,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

Hi Vincent,

On 12/21/22 07:53, Vincent Chen wrote:
> Hi Vineet,
> Thank you for creating this discussion thread to get some consensus
> and propose a way to solve this problem. Actually, I don't object to
> your proposal. I just don't understand why my solution is
> inappropriate.

It is not inappropriate; in fact, it is more natural to do it your way :-)
And if everything were rebuilt, there would be no issue. As some reviewers
pointed out, the issue is existing binaries built with the smaller
sigcontext breaking once sigcontext is expanded in the kernel and/or
glibc itself.


> IIUC, the struct sigcontext is used by the kernel to
> preserve the context of the register before entering the signal
> handler. Because the memory region used to save the register context
> is in user space, user space can obtain register context through the
> same struct sigcontext to parse the same memory region. Therefore, we
> don't want to break ABI to cause this mechanism to fail in the
> different kernel and Glibc combinations. Back to my approach, as you
> mentioned that my patch changes the size of struct sigcontext.
> However, this size difference does not seem to break the above
> mechanism. I enumerate the possible case below for discussion.
>
> 1. Kernel without RVV support + user program using original Glibc sigcontext.
> This is the current Glibc case. It has no problems.
>
> 2. Kernel with RVV support + user program using the new sigcontext definition
> The mechanism can work smoothly because the sigcontext definition in
> the kernel matches the definition in user programs.

Right, but what about existing binaries? Imagine they had

struct foo {
     struct sigcontext s;
     int bar;
};

Now with sigcontext expanded, bar is not at the expected location in memory.

> 3. Kernel without RVV support + user program using the new sigcontext definition
> Because the kernel does not store vector registers context to memory,
> the __reserved[4224] in GLIBC sigcontext is unneeded. Therefore, the
> struct sigcontext in user programs will waste a lot of memory due to
> __reserved[4224] if user programs allocate memory for it. But, the
> mechanism still can work smoothly.
>
> 4. Kernel with RVV support + user program using original Glibc sigcontext
> In this case, the kernel needs to save vector registers context to
> memory. The user program may encounter memory issues if the user space
> does not reserve enough memory size for the kernel to create the
> sigcontext. However, we can't seem to avoid this case since there is
> no flexible memory area in struct sigcontext for future expansion.
>
>  From the above enumeration, my approach in the 3rd case will be a
> problem. But, it may be solved by replacing the __reserved[4224] in
> struct sigcontext with the " C99 flexible length array". Therefore,
> the new patch will become below.
>
> --- a/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
> +++ b/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
>
> @@ -22,10 +22,28 @@
>   # error "Never use <bits/sigcontext.h> directly; include <signal.h> instead."
>   #endif
>
> +#define sigcontext kernel_sigcontext
> +#include <asm/sigcontext.h>
> +#undef sigcontext
>
>   struct sigcontext {
>   /* gregs[0] holds the program counter. */
> - unsigned long int gregs[32];
> - unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
> + __extension__ union {
> + unsigned long int gregs[32];
> + /* Kernel uses struct user_regs_struct to save x1-x31 and pc
> + to the signal context, so please use sc_regs to access these
> + these registers from the signal context. */
> + struct user_regs_struct sc_regs;
> + };
>
> + __extension__ union {
> + unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
> + /* Kernel uses struct __riscv_fp_state to save f0-f31 and fcsr
> + to the signal context, so please use sc_fpregs to access these
> + fpu registers from the signal context. */
> + union __riscv_fp_state sc_fpregs;
> + };
> +
> + __u8 sc_extn[] __attribute__((__aligned__(16)));
>   };
>
>   #endif
>
>
> This change can reduce memory waste size to 16 bytes in the worst
> case. The best case happens when the sc_extn locates at a 16-byte
> aligned address. The size of the struct sigcontext is still the same.

It's a neat trick. But the additional stack alignment means we could
still potentially be changing the size of sigcontext, even if only by
16 bytes, again for existing binaries.

I agree that struct sigcontext is not something people commonly use in
their code. I'm also not sure whether the concern about breaking
existing binaries that use struct sigcontext is a real problem or a
theoretical exercise. Hence I wanted some of the maintainers to weigh
in. I don't have issues with your approach, just that in the prior 2
reviews it seemed to be a no go.


> If the above inference is acceptable, I want to mention some
> advantages of my patch. This approach allows user programs to directly
> access the vector register context.

Correct, that is very true.

> Besides, new user programs can use
> kernel-defined struct sigcontext to access the context of the
> register. Actually, the memory layout of the FPU register in
> kernel-defined struct sigcontext is different from the Glibc-defined
> struct sigcontext. It probably causes the user programs to get the
> wrong value of FPU registers from the context. Therefore, my approach
> can help user programs get the correct FPU registers because the user
> program is able to use kernel-defined struct sigcontext to access the
> FPU register context. It will help RISC-V users get rid of the
> historical burden in Glibc sigcontext.h.

Indeed.

Thx,
-Vineet



>
>
> Thanks,
> Vincent Chen
>
> On Wed, Dec 21, 2022 at 4:05 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>> Hi folks,
>>
>> Apologies for the extraneous CC (and the top post), but I would really
>> appreciate some feedback on this to close on the V-ext plumbing support
>> in kernel/glibc. This is one of the two contentious issues (other being
>> prctl enable) preventing us from getting to an RVV enabled SW ecosystem.
>>
>> The premise is : for preserving V-ext registers across signal handling,
>> the natural way is to add V reg storage to kernel struct sigcontext
>> where scalar / fp regs are currently saved. But this doesn’t seem to be
>> the right way to go:
>>
>> 1. Breaks the userspace ABI (even if user programs were recompiled)
>> because RV glibc port for historical reasons has defined its own version
>> of struct sigcontext (vs. relying on kernel exported UAPI header).
>>
>> 2. Even if we were to expand sigcontext (in both kernel and glibc, which
>> is always hard to time) there's still a (different) ABI breakage for
>> existing binaries despite earlier proposed __extension__ union trick [2]
>> since it still breaks old binaries w.r.t. size of the sigcontext struct.
>>
>> 3. glibc {set,get,*}context() routines use struct mcontext_t which is
>> analogous to kernel struct sigcontext (in respective ucontext structs
>> [1]). Thus ideally mcontext_t needs to be expanded too but need not be,
>> given its semantics to save callee-saved regs only : per current psABI
>> RVV regs are caller-saved/call-clobbered [3]. Apparently this
>> connection of sigcontext to mcontext_t is also historical as some arches
>> used/still-use sigreturn to restore regs in setcontext [4]
>>
>> Does anyone disagree that 1-3 are valid reasons?
>>
>> So the proposal here is to *not* add V-ext state to kernel sigcontext
>> but instead dynamically to struct rt_sigframe, similar to aarch64
>> kernel. This avoids touching glibc sigcontext as well.
>>
>> struct rt_sigframe {
>>     struct siginfo info;
>>     struct ucontext uc;
>> +/* C99 flexible array member to handle implementation-defined
>> +   VLEN-wide regs */
>> +__u8 sc_extn[] __attribute__((__aligned__(16)));
>> };
>>
>> The only downside to this is that SA_SIGINFO signal handlers don’t have
>> direct access to V state (but it seems aarch64 kernel doesn’t either).
>>
>> Does anyone really disagree with this proposal?
>>
>> Attached is a proof-of-concept kernel patch which implements this
>> proposal with no need for any corresponding glibc change.
>>
>> Thx,
>> -Vineet
>>
>>
>> [1] ucontext in kernel and glibc, respectively.
>>
>> kernel: arch/riscv/include/uapi/asm/ucontext.h
>>
>> struct ucontext {
>>    unsigned long uc_flags;
>>    struct ucontext *uc_link;
>>    stack_t uc_stack;
>>    sigset_t uc_sigmask;
>>    __u8 __unused[1024 / 8 - sizeof(sigset_t)];
>>    struct sigcontext uc_mcontext;
>> }
>>
>> glibc: sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
>>
>> typedef struct ucontext_t
>>     {
>>       unsigned long int  __uc_flags;
>>       struct ucontext_t *uc_link;
>>       stack_t            uc_stack;
>>       sigset_t           uc_sigmask;
>>       /* padding to allow future sigset_t expansion */
>>       char   __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
>>        mcontext_t uc_mcontext;
>> } ucontext_t;
>>
>> [2] https://sourceware.org/pipermail/libc-alpha/2022-January/135610.html
>> [3]
>> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
>> [4] https://sourceware.org/legacy-ml/libc-alpha/2014-04/msg00006.html
>>
>>
>>
>>
>> On 12/8/22 19:39, Vineet Gupta wrote:
>>> Hi Florian,
>>>
>>> P.S. Since I'm revisiting a year old thread with some new CC
>>> recipients, here's the link to original patch/thread [1]
>>>
>>> On 9/17/21 20:04, Vincent Chen wrote:
>>>> On Thu, Sep 16, 2021 at 4:14 PM Florian Weimer <fweimer@redhat.com>
>>>> wrote:
>>>>>>>> This changes the size of struct ucontext_t, which is an ABI break
>>>>>>>> (getcontext callers are supposed to provide their own object).
>>>>>>>>
>>>>>> The riscv vector registers are all caller-saved registers except for
>>>>>> VCSR. Therefore, the struct mcontext_t needs to reserve a space for
>>>>>> it. In addition, RISCV ISA is growing, so I also hope the struct
>>>>>> mcontext_t has a space for future expansion. Based on the above ideas,
>>>>>> I reserved a 5K space here.
>>>>> You have reserved space in ucontext_t that you could use for this.
>>>>>
>>>> Sorry, I cannot really understand what you mean. The following is the
>>>> contents of ucontext_t
>>>> typedef struct ucontext_t
>>>>     {
>>>>       unsigned long int  __uc_flags;
>>>>       struct ucontext_t *uc_link;
>>>>       stack_t            uc_stack;
>>>>       sigset_t           uc_sigmask;
>>>>       /* There's some padding here to allow sigset_t to be expanded in
>>>> the
>>>>          future.  Though this is unlikely, other architectures put
>>>> uc_sigmask
>>>>          at the end of this structure and explicitly state it can be
>>>>          expanded, so we didn't want to box ourselves in here. */
>>>>       char               __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
>>>>       /* We can't put uc_sigmask at the end of this structure because
>>>> we need
>>>>          to be able to expand sigcontext in the future.  For example, the
>>>>          vector ISA extension will almost certainly add ISA state.  We
>>>> want
>>>>          to ensure all user-visible ISA state can be saved and
>>>> restored via a
>>>>          ucontext, so we're putting this at the end in order to allow for
>>>>          infinite extensibility.  Since we know this will be extended
>>>> and we
>>>>          assume sigset_t won't be extended an extreme amount, we're
>>>>          prioritizing this.  */
>>>>       mcontext_t uc_mcontext;
>>>>     } ucontext_t;
>>>>
>>>> Currently, we only reserve a space, __glibc_reserved[], for the future
>>>> expansion of sigset_t.
>>>> Do you mean I could use __glibc_reserved[] to for future expansion of
>>>> ISA as well?
>>> Given unlikely sigset expansion, we could in theory use some of those
>>> reserved fields to store pointers (offsets) to actual V state, but not
>>> for actual V state which is way too large for non-embedded machines
>>> with typical 128 or even wider V regs.
>>>
>>>
>>>>>>>> This shouldn't be necessary if the additional vector registers are
>>>>>>>> caller-saved.
>>>>>> Here I am a little confused about the usage of struct mcontext_t. As
>>>>>> far as I know, the struct mcontext_t is used to save the
>>>>>> machine-specific information in user context operation. Therefore, in
>>>>>> this case, the struct mcontext_t is allowed to reserve the space only
>>>>>> for saving caller-saved registers. However, in the signal handler, the
>>>>>> user seems to be allowed to use uc_mcontext whose data type is struct
>>>>>> mcontext_t to access the content of the signal context. In this case,
>>>>>> the struct mcontext_t may need to be the same as the struct sigcontext
>>>>>> defined at kernel. However, it will have a conflict with your
>>>>>> suggestion because the struct sigcontext cannot just reserve a space
>>>>>> for saving caller-saved registers. Could you help me point out my
>>>>>> misunderstanding? Thank you.
>>> I think the confusion comes from apparent equivalence of kernel struct
>>> sigcontext and glibc mcontext_t as they appear in respective struct
>>> ucontext definitions.
>>> I've enumerated the actual RV structs below to keep them handy in one
>>> place for discussion.
>>>
>>>>> struct sigcontext is allocated by the kernel, so you can have pointers
>>>>> in reserved fields to out-of-line start, or after struct sigcontext.
>>> In this scheme, would the actual V regfile contents (at the
>>> out-of-line location w.r.t kernel sigcontext) be anonymous for glibc
>>> i.e. do we not need to expose them to glibc userspace ABI ?
>>>
>>>
>>>>> I don't know how the kernel implements this, but there is considerable
>>>>> flexibility and extensibility.  The main issues comes from small stacks
>>>>> which are incompatible with large register files.
>>> Simplistically, Linux kernel needs to preserve the V regfile across
>>> task switch. The necessary evil that follows is preserving V across
>>> signal-handling (sigaction/sigreturn).
>>>
>>> In RV kernel we have following:
>>>
>>> struct rt_sigframe {
>>>    struct siginfo info;
>>>    struct ucontext uc;
>>> };
>>>
>>> struct ucontext {
>>>     unsigned long uc_flags;
>>>     struct ucontext *uc_link;
>>>     stack_t uc_stack;
>>>     sigset_t uc_sigmask;
>>>     __u8 __unused[1024 / 8 - sizeof(sigset_t)];     // this is for
>>> sigset_t expansion
>>>     struct sigcontext uc_mcontext;
>>> };
>>>
>>> struct sigcontext {
>>>     struct user_regs_struct sc_regs;
>>>     union __riscv_fp_state sc_fpregs;
>>> +  __u8 sc_extn[4096+128] __attribute__((__aligned__(16)));   //
>>> handle 128B V regs
>>> };
>>>
>>> The sc_extn[] would have V state (regfile + control state) in kernel
>>> defined format.
>>>
>>> As I understand it, you are suggesting to prevent ABI break, we should
>>> not add anything to kernel struct sigcontext i.e. do something like this
>>>
>>> struct rt_sigframe {
>>>    struct siginfo info;
>>>    struct ucontext uc;
>>> +__u8 sc_extn[4096+128] __attribute__((__aligned__(16)));
>>> }
>>>
>>> So kernel sig handling can continue to save/restore the V regfile on
>>> user stack, w/o making it part of actual struct sigcontext.
>>> So they are not explicitly visible to userspace at all - is that
>>> feasible ? I know that SA_SIGINFO handlers can access the scalar/fp
>>> regs, they won't do it V.
>>> Is there a POSIX req for SA_SIGINFO handlers being able to access all
>>> machine regs saved by signal handling.
>>>
>>> An alternate approach is what Vincent did originally, to add sc_extn to
>>> struct sigcontext. Here to prevent ABI breakage, we can choose to not
>>> reflect this in the glibc sigcontext. But the question remains, is
>>> that OK ?
>>>
>>> The other topic is changing glibc mcontext_t to add V-regs. It would
>>> seem one has to as mcontext is "visually equivalent" to struct
>>> sigcontext in the respective ucontext structs. But in userspace
>>> *context routine semantics only require callee-regs to be saved, which
>>> V regs are not per psABI [2]. So looks like this can be avoided which
>>> is what Vincent did in v2 series [3]
>>>
>>>
>>> [1]
>>> https://sourceware.org/pipermail/libc-alpha/2021-September/130899.html
>>> [2]
>>> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
>>> [3] https://sourceware.org/pipermail/libc-alpha/2022-January/135416.html


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-21 19:45                   ` Vineet Gupta
@ 2022-12-21 19:52                     ` Vineet Gupta
  2022-12-22  3:37                       ` Vincent Chen
  2022-12-22  5:32                       ` Richard Henderson via Libc-alpha
  2022-12-22  1:50                     ` Vincent Chen
  2022-12-22  5:34                     ` Richard Henderson via Libc-alpha
  2 siblings, 2 replies; 79+ messages in thread
From: Vineet Gupta @ 2022-12-21 19:52 UTC (permalink / raw)
  To: Vincent Chen
  Cc: Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy, Andy Chiu,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library



On 12/21/22 11:45, Vineet Gupta wrote:
>
> 4. Kernel with RVV support + user program using original Glibc sigcontext
> In this case, the kernel needs to save vector registers context to
> memory. The user program may encounter memory issues if the user space
> does not reserve enough memory size for the kernel to create the
> sigcontext. However, we can't seem to avoid this case since there is
> no flexible memory area in struct sigcontext for future expansion.

This is not an issue if we don't change sigcontext (in kernel and
glibc): it is essentially the case of existing binaries.
The kernel still saves the regs on the user stack, in rt_sigframe; it's just
that userspace is not able to access them in SA_SIGINFO signal handlers.
aarch64 has implemented this approach, and it is likely they can't
do that either for SVE regs.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-21 19:45                   ` Vineet Gupta
  2022-12-21 19:52                     ` Vineet Gupta
@ 2022-12-22  1:50                     ` Vincent Chen
  2022-12-22  5:34                     ` Richard Henderson via Libc-alpha
  2 siblings, 0 replies; 79+ messages in thread
From: Vincent Chen @ 2022-12-22  1:50 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy, Andy Chiu,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

On Thu, Dec 22, 2022 at 3:45 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>
> Hi Vincent,
>
> On 12/21/22 07:53, Vincent Chen wrote:
> > Hi Vineet,
> > Thank you for creating this discussion thread to get some consensus
> > and propose a way to solve this problem. Actually, I don't object to
> > your proposal. I just don't understand why my solution is
> > inappropriate.
>
> It is not inappropriate, in fact it is more natural to do it your way :-)
> And if everything was rebuilt there was no issue. As some reviewers also
> pointed out the issue was with existing binaries with smaller sigcontext
> breaking with expanded sigcontext in kernel and/or glibc itself.

Thank you for your detailed explanations :-) I still have some
questions and hope you can help me clarify them.
>
>
> > IIUC, the struct sigcontext is used by the kernel to
> > preserve the context of the register before entering the signal
> > handler. Because the memory region used to save the register context
> > is in user space, user space can obtain register context through the
> > same struct sigcontext to parse the same memory region. Therefore, we
> > don't want to break ABI to cause this mechanism to fail in the
> > different kernel and Glibc combinations. Back to my approach, as you
> > mentioned that my patch changes the size of struct sigcontext.
> > However, this size difference does not seem to break the above
> > mechanism. I enumerate the possible case below for discussion.
> >
> > 1. Kernel without RVV support + user program using original Glibc sigcontext.
> > This is the current Glibc case. It has no problems.
> >
> > 2. Kernel with RVV support + user program using the new sigcontext definition
> > The mechanism can work smoothly because the sigcontext definition in
> > the kernel matches the definition in user programs.
>
> Right but what about existing binaries. Imagine if they had
>
> struct foo{
>      struct sigcontext s;
>      int bar;
> }
>
> Now with sigcontext expanded, bar is not at the expected location in memory.

I did indeed fail to consider this case. I guess the following example is
one of the cases you have in mind.
1. a.out
#include <signal.h>     // pulls in <bits/sigcontext.h>
...
struct foo {
   struct sigcontext s;
   int bar;
} sc;
void lala(struct foo *ptr);   // defined in lala.so
int main (void) {
   lala(&sc);
   return 0;
}

2. lala.so
#include <signal.h>
struct foo {
   struct sigcontext s;
   int bar;
};
void lala(struct foo *ptr) {
   // accesses ptr->bar at the offset lala.so was compiled with
}
If lala.so and a.out are compiled against different sizes of
struct sigcontext, bar will clearly end up at different offsets in the
two objects. But, as you mentioned, I am also curious whether this
example is a real problem or just a theoretical exercise.
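
The layout concern in this example can be made concrete with a small
sketch. The two struct definitions below are simplified stand-ins (the
names old_sigcontext and new_sigcontext are hypothetical, not the real
headers): one models the current glibc layout, the other the same layout
with a fixed 4224-byte area appended. The sketch only shows how far bar
moves; it says nothing about how common such embedding is in practice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the current glibc struct sigcontext. */
struct old_sigcontext {
    unsigned long gregs[32];
    unsigned long long fpregs[66] __attribute__((aligned(16)));
};

/* Hypothetical stand-in for a sigcontext grown by a fixed vector area. */
struct new_sigcontext {
    unsigned long gregs[32];
    unsigned long long fpregs[66] __attribute__((aligned(16)));
    unsigned char sc_extn[4096 + 128] __attribute__((aligned(16)));
};

/* The same user struct, as seen by objects built against each header. */
struct foo_old { struct old_sigcontext s; int bar; };
struct foo_new { struct new_sigcontext s; int bar; };

static_assert(sizeof(struct new_sigcontext) > sizeof(struct old_sigcontext),
              "the appended area must grow the struct");

/* Offset of bar as each object file would compute it. */
size_t bar_offset_old(void) { return offsetof(struct foo_old, bar); }
size_t bar_offset_new(void) { return offsetof(struct foo_new, bar); }
```

An a.out built against the first layout and a lala.so built against the
second would disagree about where bar lives by the difference between
the two return values.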

>
> > 3. Kernel without RVV support + user program using the new sigcontext definition
> > Because the kernel does not store vector registers context to memory,
> > the __reserved[4224] in GLIBC sigcontext is unneeded. Therefore, the
> > struct sigcontext in user programs will waste a lot of memory due to
> > __reserved[4224] if user programs allocate memory for it. But, the
> > mechanism still can work smoothly.
> >
> > 4. Kernel with RVV support + user program using original Glibc sigcontext
> > In this case, the kernel needs to save vector registers context to
> > memory. The user program may encounter memory issues if the user space
> > does not reserve enough memory size for the kernel to create the
> > sigcontext. However, we can't seem to avoid this case since there is
> > no flexible memory area in struct sigcontext for future expansion.
> >
> >  From the above enumeration, my approach in the 3rd case will be a
> > problem. But, it may be solved by replacing the __reserved[4224] in
> > struct sigcontext with the " C99 flexible length array". Therefore,
> > the new patch will become below.
> >
> > --- a/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
> > +++ b/sysdeps/unix/sysv/linux/riscv/bits/sigcontext.h
> >
> > @@ -22,10 +22,28 @@
> >   # error "Never use <bits/sigcontext.h> directly; include <signal.h> instead."
> >   #endif
> >
> > +#define sigcontext kernel_sigcontext
> > +#include <asm/sigcontext.h>
> > +#undef sigcontext
> >
> >   struct sigcontext {
> >   /* gregs[0] holds the program counter. */
> > - unsigned long int gregs[32];
> > - unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
> > + __extension__ union {
> > + unsigned long int gregs[32];
> > + /* Kernel uses struct user_regs_struct to save x1-x31 and pc
> > + to the signal context, so please use sc_regs to access these
> > + these registers from the signal context. */
> > + struct user_regs_struct sc_regs;
> > + };
> >
> > + __extension__ union {
> > + unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
> > + /* Kernel uses struct __riscv_fp_state to save f0-f31 and fcsr
> > + to the signal context, so please use sc_fpregs to access these
> > + fpu registers from the signal context. */
> > + union __riscv_fp_state sc_fpregs;
> > + };
> > +
> > + __u8 sc_extn[] __attribute__((__aligned__(16)));
> >   };
> >
> >   #endif
> >
> >
> > This change can reduce memory waste size to 16 bytes in the worst
> > case. The best case happens when the sc_extn locates at a 16-byte
> > aligned address. The size of the struct sigcontext is still the same.
>
> Its a neat trick. But the additional stack alignment means we could
> still potentially changing the size of sigcontext - even if by 16 bytes
> - again for existing binaries.
>
> I agree that struct sigcontext is not something people commonly use in
> their code. And also not sure if the concern of breaking existing
> binaries with struct sigcontext is a real problem or a theoretical
> exercise. Hence I wanted some of the maintainers to weigh-in. I don't
> have issues with your approach, just that in the prior 2 reviews it
> seemed it was a no go.

I agree with you that we need more maintainers to weigh in to find an
appropriate solution. In my opinion, if usage like the prior example is
not widespread, maybe it is a good time to get rid of the historical
burden.

Thanks,
Vincent

>
>
> > If the above inference is acceptable, I want to mention some
> > advantages of my patch. This approach allows user programs to directly
> > access the vector register context.
>
> Correct, that is very true.
>
> > Besides, new user programs can use
> > kernel-defined struct sigcontext to access the context of the
> > register. Actually, the memory layout of the FPU register in
> > kernel-defined struct sigcontext is different from the Glibc-defined
> > struct sigcontext. It probably causes the user programs to get the
> > wrong value of FPU registers from the context. Therefore, my approach
> > can help user programs get the correct FPU registers because the user
> > program is able to use kernel-defined struct sigcontext to access the
> > FPU register context. It will help RISC-V users get rid of the
> > historical burden in Glibc sigcontext.h.
>
> Indeed.
>
> Thx,
> -Vineet
>
>
>
> >
> >
> > Thanks,
> > Vincent Chen
> >
> > On Wed, Dec 21, 2022 at 4:05 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
> >> Hi folks,
> >>
> >> Apologies for the extraneous CC (and the top post), but I would really
> >> appreciate some feedback on this to close on the V-ext plumbing support
> >> in kernel/glibc. This is one of the two contentious issues (other being
> >> prctl enable) preventing us from getting to an RVV enabled SW ecosystem.
> >>
> >> The premise is : for preserving V-ext registers across signal handling,
> >> the natural way is to add V reg storage to kernel struct sigcontext
> >> where scalar / fp regs are currently saved. But this doesn’t seem to be
> >> the right way to go:
> >>
> >> 1. Breaks the userspace ABI (even if user programs were recompiled)
> >> because RV glibc port for historical reasons has defined its own version
> >> of struct sigcontext (vs. relying on kernel exported UAPI header).
> >>
> >> 2. Even if we were to expand sigcontext (in both kernel and glibc, which
> >> is always hard to time) there's still a (different) ABI breakage for
> >> existing binaries despite earlier proposed __extension__ union trick [2]
> >> since it still breaks old binaries w.r.t. size of the sigcontext struct.
> >>
> >> 3. glibc {set,get,*}context() routines use struct mcontext_t which is
> >> analogous to kernel struct sigcontext (in respective ucontext structs
> >> [1]). Thus ideally mcontext_t needs to be expanded too but need not be,
> >> given its semantics to save callee-saved regs only : per current psABI
> >> RVVV regs are caller-saved/call-clobbered [3]. Apparently this
> >> connection of sigcontext to mcontext_t is also historical as some arches
> >> used/still-use sigreturn to restore regs in setcontext [4]
> >>
> >> Does anyone disagree that 1-3 are not valid reasons.
> >>
> >> So the proposal here is to *not* add V-ext state to kernel sigcontext
> >> but instead dynamically to struct rt_sigframe, similar to aarch64
> >> kernel. This avoids touching glibc sigcontext as well.
> >>
> >> struct rt_sigframe {
> >>     struct siginfo info;
> >>     struct ucontext uc;
> >> +__u8 sc_extn[] __attribute__((__aligned__(16))); // C99 flexible length
> >> array to handle implementation defined VLEN wide regs
> >> }
> >>
> >> The only downside to this is that SA_SIGINFO signal handlers don’t have
> >> direct access to V state (but it seems aarch64 kernel doesn’t either).
> >>
> >> Does anyone really disagree with this proposal.
> >>
> >> Attached is a proof-of-concept kernel patch which implements this
> >> proposal with no need for any corresponding glibc change.
> >>
> >> Thx,
> >> -Vineet
> >>
> >>
> >> [1] ucontex in kernel and glibc respectively.
> >>
> >> kernel: arch/riscv/include/uapi/asm/ucontext.h
> >>
> >> struct ucontext {
> >>    unsigned long uc_flags;
> >>    struct ucontext *uc_link;
> >>    stack_t uc_stack;
> >>    sigset_t uc_sigmask;
> >>    __u8 __unused[1024 / 8 - sizeof(sigset_t)];
> >>    struct sigcontext uc_mcontext;
> >> }
> >>
> >> glibc: sysdeps/unix/sysv/linux/riscv/sys/ucontext.h
> >>
> >> typedef struct ucontext_t
> >>     {
> >>       unsigned long int  __uc_flags;
> >>       struct ucontext_t *uc_link;
> >>       stack_t            uc_stack;
> >>       sigset_t           uc_sigmask;
> >>       /* padding to allow future sigset_t expansion */
> >>       char   __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
> >>        mcontext_t uc_mcontext;
> >> } ucontext_t;
> >>
> >> [2] https://sourceware.org/pipermail/libc-alpha/2022-January/135610.html
> >> [3]
> >> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
> >> [4] https://sourceware.org/legacy-ml/libc-alpha/2014-04/msg00006.html
> >>
> >>
> >>
> >>
> >> On 12/8/22 19:39, Vineet Gupta wrote:
> >>> Hi Florian,
> >>>
> >>> P.S. Since I'm revisiting a year old thread with some new CC
> >>> recipients, here's the link to original patch/thread [1]
> >>>
> >>> On 9/17/21 20:04, Vincent Chen wrote:
> >>>> On Thu, Sep 16, 2021 at 4:14 PM Florian Weimer <fweimer@redhat.com>
> >>>> wrote:
> >>>>>>>> This changes the size of struct ucontext_t, which is an ABI break
> >>>>>>>> (getcontext callers are supposed to provide their own object).
> >>>>>>>>
> >>>>>> The riscv vector registers are all caller-saved registers except for
> >>>>>> VCSR. Therefore, the struct mcontext_t needs to reserve a space for
> >>>>>> it. In addition, RISCV ISA is growing, so I also hope the struct
> >>>>>> mcontext_t has a space for future expansion. Based on the above ideas,
> >>>>>> I reserved a 5K space here.
> >>>>> You have reserved space in ucontext_t that you could use for this.
> >>>>>
> >>>> Sorry, I cannot really understand what you mean. The following is the
> >>>> contents of ucontext_t
> >>>> typedef struct ucontext_t
> >>>>     {
> >>>>       unsigned long int  __uc_flags;
> >>>>       struct ucontext_t *uc_link;
> >>>>       stack_t            uc_stack;
> >>>>       sigset_t           uc_sigmask;
> >>>>       /* There's some padding here to allow sigset_t to be expanded in
> >>>> the
> >>>>          future.  Though this is unlikely, other architectures put
> >>>> uc_sigmask
> >>>>          at the end of this structure and explicitly state it can be
> >>>>          expanded, so we didn't want to box ourselves in here. */
> >>>>       char               __glibc_reserved[1024 / 8 - sizeof (sigset_t)];
> >>>>       /* We can't put uc_sigmask at the end of this structure because
> >>>> we need
> >>>>          to be able to expand sigcontext in the future.  For example, the
> >>>>          vector ISA extension will almost certainly add ISA state.  We
> >>>> want
> >>>>          to ensure all user-visible ISA state can be saved and
> >>>> restored via a
> >>>>          ucontext, so we're putting this at the end in order to allow for
> >>>>          infinite extensibility.  Since we know this will be extended
> >>>> and we
> >>>>          assume sigset_t won't be extended an extreme amount, we're
> >>>>          prioritizing this.  */
> >>>>       mcontext_t uc_mcontext;
> >>>>     } ucontext_t;
> >>>>
> >>>> Currently, we only reserve a space, __glibc_reserved[], for the future
> >>>> expansion of sigset_t.
> >>>> Do you mean I could use __glibc_reserved[] to for future expansion of
> >>>> ISA as well?
> >>> Given unlikely sigset expansion, we could in theory use some of those
> >>> reserved fields to store pointers (offsets) to actual V state, but not
> >>> for actual V state which is way too large for non-embedded machines
> >>> with typical 128 or even wider V regs.
> >>>
> >>>
> >>>>>>>> This shouldn't be necessary if the additional vector registers are
> >>>>>>>> caller-saved.
> >>>>>> Here I am a little confused about the usage of struct mcontext_t. As
> >>>>>> far as I know, the struct mcontext_t is used to save the
> >>>>>> machine-specific information in user context operation. Therefore, in
> >>>>>> this case, the struct mcontext_t is allowed to reserve the space only
> >>>>>> for saving caller-saved registers. However, in the signal handler, the
> >>>>>> user seems to be allowed to use uc_mcontext whose data type is struct
> >>>>>> mcontext_t to access the content of the signal context. In this case,
> >>>>>> the struct mcontext_t may need to be the same as the struct sigcontext
> >>>>>> defined at kernel. However, it will have a conflict with your
> >>>>>> suggestion because the struct sigcontext cannot just reserve a space
> >>>>>> for saving caller-saved registers. Could you help me point out my
> >>>>>> misunderstanding? Thank you.
> >>> I think the confusion comes from apparent equivalence of kernel struct
> >>> sigcontext and glibc mcontext_t as they appear in respective struct
> >>> ucontext definitions.
> >>> I've enumerated the actual RV structs below to keep them handy in one
> >>> place for discussion.
> >>>
> >>>>> struct sigcontext is allocated by the kernel, so you can have pointers
> >>>>> in reserved fields to out-of-line start, or after struct sigcontext.
> >>> In this scheme, would the actual V regfile contents (at the
> >>> out-of-line location w.r.t kernel sigcontext) be anonymous for glibc
> >>> i.e. do we not need to expose them to glibc userspace ABI ?
> >>>
> >>>
> >>>>> I don't know how the kernel implements this, but there is considerable
> >>>>> flexibility and extensibility.  The main issues comes from small stacks
> >>>>> which are incompatible with large register files.
> >>> Simplistically, Linux kernel needs to preserve the V regfile across
> >>> task switch. The necessary evil that follows is preserving V across
> >>> signal-handling (sigaction/sigreturn).
> >>>
> >>> In RV kernel we have following:
> >>>
> >>> struct rt_sigframe {
> >>>    struct siginfo info;
> >>>    struct ucontext uc;
> >>> };
> >>>
> >>> struct ucontext {
> >>>     unsigned long uc_flags;
> >>>     struct ucontext *uc_link;
> >>>     stack_t uc_stack;
> >>>     sigset_t uc_sigmask;
> >>>     __u8 __unused[1024 / 8 - sizeof(sigset_t)];     // this is for
> >>> sigset_t expansion
> >>>     struct sigcontext uc_mcontext;
> >>> };
> >>>
> >>> struct sigcontext {
> >>>     struct user_regs_struct sc_regs;
> >>>     union __riscv_fp_state sc_fpregs;
> >>> +  __u8 sc_extn[4096+128] __attribute__((__aligned__(16)));   //
> >>> handle 128B V regs
> >>> };
> >>>
> >>> The sc_extn[] would have V state (regfile + control state) in kernel
> >>> defined format.
> >>>
> >>> As I understand it, you are suggesting to prevent ABI break, we should
> >>> not add anything to kernel struct sigcontext i.e. do something like this
> >>>
> >>> struct rt_sigframe {
> >>>    struct siginfo info;
> >>>    struct ucontext uc;
> >>> +__u8 sc_extn[4096+128] __attribute__((__aligned__(16)));
> >>> }
> >>>
> >>> So kernel sig handling can continue to save/restore the V regfile on
> >>> user stack, w/o making it part of actual struct sigcontext.
> >>> So they are not explicitly visible to userspace at all - is that
> >>> feasible ? I know that SA_SIGINFO handlers can access the scalar/fp
> >>> regs, they won't do it V.
> >>> Is there a POSIX req for SA_SIGINFO handlers being able to access all
> >>> machine regs saved by signal handling.
> >>>
> >>> An alternate approach is what Vincent did originally, to add sc_exn to
> >>> struct sigcontext. Here to prevent ABI breakage, we can choose to not
> >>> reflect this in the glibc sigcontext. But the question remains, is
> >>> that OK ?
> >>>
> >>> The other topic is changing glibc mcontext_t to add V-regs. It would
> >>> seem one has to as mcontext is "visually equivalent" to struct
> >>> sigcontext in the respective ucontext structs. But in unserspace
> >>> *context routine semantics only require callee-regs to be saved, which
> >>> V regs are not per psABI [2]. So looks like this can be avoided which
> >>> is what Vincent did in v2 series [3]
> >>>
> >>>
> >>> [1]
> >>> https://sourceware.org/pipermail/libc-alpha/2021-September/130899.html
> >>> [2]
> >>> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
> >>> [3] https://sourceware.org/pipermail/libc-alpha/2022-January/135416.html
>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-21 19:52                     ` Vineet Gupta
@ 2022-12-22  3:37                       ` Vincent Chen
  2022-12-22 19:25                         ` Vineet Gupta
  2022-12-22  5:32                       ` Richard Henderson via Libc-alpha
  1 sibling, 1 reply; 79+ messages in thread
From: Vincent Chen @ 2022-12-22  3:37 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy, Andy Chiu,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

On Thu, Dec 22, 2022 at 3:52 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>
>
>
> On 12/21/22 11:45, Vineet Gupta wrote:
> >
> > 4. Kernel with RVV support + user program using original Glibc sigcontext
> > In this case, the kernel needs to save vector registers context to
> > memory. The user program may encounter memory issues if the user space
> > does not reserve enough memory size for the kernel to create the
> > sigcontext. However, we can't seem to avoid this case since there is
> > no flexible memory area in struct sigcontext for future expansion.
>
> This is not an issue, if we don't change sigcontext (in kernel and
> glibc) - it is essentially the case of existing binaries.
> kernel still saves regs on user stack, in rt_sigframe, its just that
> userspace is not able to access them in SA_SIGINFO signal handers.
> aarch64 have this implemented this approach and it is likely they can't
> do that either for SVE regs.

Sorry, I didn't describe the case clearly. As you mentioned, with your
patch the kernel will save the vector registers to the user stack or a
user-specified memory region via struct rt_sigframe. But an existing
binary compiled with the original sigcontext.h will assume that the
kernel only needs sizeof(struct sigcontext) to save these registers.
It will not be aware that the RVV extension is supported, and it will
not expect a kernel with RVV support to need an extra, large memory
region on its stack or in the specified memory region to save the
vector register context. In this case, the user program will encounter
memory corruption if the memory region it specified is not large
enough to hold the vector register context.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-21 19:52                     ` Vineet Gupta
  2022-12-22  3:37                       ` Vincent Chen
@ 2022-12-22  5:32                       ` Richard Henderson via Libc-alpha
  2022-12-22 18:33                         ` Andy Chiu
  2022-12-22 20:30                         ` Vineet Gupta
  1 sibling, 2 replies; 79+ messages in thread
From: Richard Henderson via Libc-alpha @ 2022-12-22  5:32 UTC (permalink / raw)
  To: Vineet Gupta, Vincent Chen
  Cc: Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy, Andy Chiu,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

On 12/21/22 11:52, Vineet Gupta wrote:
> This is not an issue, if we don't change sigcontext (in kernel and glibc) - it is 
> essentially the case of existing binaries. kernel still saves regs on user stack, in
> rt_sigframe, its just that userspace is not able to access them in SA_SIGINFO signal
> handers. aarch64 have this implemented this approach and it is likely they can't do
> that either for SVE regs.

aarch64 can certainly access the SVE regs on the signal stack.  It simply requires that 
you parse the chain of extensions within __reserved to find it.
It's quite well designed, really.

What you can't do is "only" declare a sigcontext_t and be able to construct a new context, 
nor copy the entire context via structure assignment.

There is room within the risc-v context for a similar scheme, via

     sigcontext.sc_fpregs.q.reserved[3]

E.g.

     reserved[0] -> magic
     reserved[1] -> displacement to "extension area"
     reserved[2] -> size of "extension area"

Thus the area can be located anywhere within 4GB and expand to 4GB.
Not that I'd hope any signal frame would require 4GB.  :-)
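
A minimal sketch of how a handler might follow such a descriptor. The
struct is a stand-in with the same shape as __riscv_q_ext_state; the
magic value, the function name, and the choice to measure the
displacement from the start of the FP state are all assumptions made
for illustration, not anything the kernel defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical magic value marking a valid extension descriptor. */
#define EXT_MAGIC 0x53465652u

/* Stand-in with the same shape as the kernel's __riscv_q_ext_state. */
struct q_ext_state {
    uint64_t f[64] __attribute__((aligned(16)));
    uint32_t fcsr;
    uint32_t reserved[3];
};

/* Returns a pointer to the extension area and stores its size in
   *size, or returns NULL if no descriptor is present.
   Assumption: reserved[1] is a byte displacement from the start of
   the FP state. */
void *find_ext_area(struct q_ext_state *q, size_t *size)
{
    if (q->reserved[0] != EXT_MAGIC)
        return NULL;               /* all-zero reserved[]: no extension */
    *size = q->reserved[2];
    return (unsigned char *)q + q->reserved[1];
}
```

An old binary never looks at reserved[], so it keeps working; a new one
checks the magic before trusting the displacement and size.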


r~

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-21 19:45                   ` Vineet Gupta
  2022-12-21 19:52                     ` Vineet Gupta
  2022-12-22  1:50                     ` Vincent Chen
@ 2022-12-22  5:34                     ` Richard Henderson via Libc-alpha
  2 siblings, 0 replies; 79+ messages in thread
From: Richard Henderson via Libc-alpha @ 2022-12-22  5:34 UTC (permalink / raw)
  To: Vineet Gupta, Vincent Chen
  Cc: Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy, Andy Chiu,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

On 12/21/22 11:45, Vineet Gupta wrote:
>> + __extension__ union {
>> + unsigned long long int fpregs[66] __attribute__ ((__aligned__ (16)));
>> + /* Kernel uses struct __riscv_fp_state to save f0-f31 and fcsr
>> + to the signal context, so please use sc_fpregs to access these
>> + fpu registers from the signal context. */
>> + union __riscv_fp_state sc_fpregs;
>> + };
>> +
>> + __u8 sc_extn[] __attribute__((__aligned__(16)));
>>   };
>>
>>   #endif
>>
>>
>> This change can reduce memory waste size to 16 bytes in the worst
>> case. The best case happens when the sc_extn locates at a 16-byte
>> aligned address. The size of the struct sigcontext is still the same.
> 
> Its a neat trick. But the additional stack alignment means we could still potentially 
> changing the size of sigcontext - even if by 16 bytes - again for existing binaries.

The riscv sigcontext is already aligned by 16, via __riscv_q_ext_state, fwiw.
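
A quick compile-time check of that point, as a sketch. The two structs
are simplified stand-ins for the glibc definition quoted above (names
are hypothetical); the only difference between them is the trailing
16-byte-aligned flexible array member.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the existing layout, already 16-byte aligned via fpregs. */
struct sigcontext_plain {
    unsigned long gregs[32];
    unsigned long long fpregs[66] __attribute__((aligned(16)));
};

/* Same layout plus a trailing flexible array member. */
struct sigcontext_fam {
    unsigned long gregs[32];
    unsigned long long fpregs[66] __attribute__((aligned(16)));
    unsigned char sc_extn[] __attribute__((aligned(16)));
};

/* The flexible member adds no storage and no extra padding here. */
static_assert(sizeof(struct sigcontext_fam) == sizeof(struct sigcontext_plain),
              "flexible array member must not change the size");
static_assert(offsetof(struct sigcontext_fam, sc_extn)
              == sizeof(struct sigcontext_plain),
              "extension area starts right after the existing fields");
```

If both assertions hold on the target, the extra 16 bytes of padding
mentioned above never materialise for this layout.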


r~

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-22  5:32                       ` Richard Henderson via Libc-alpha
@ 2022-12-22 18:33                         ` Andy Chiu
  2022-12-22 20:27                           ` Vineet Gupta
                                             ` (2 more replies)
  2022-12-22 20:30                         ` Vineet Gupta
  1 sibling, 3 replies; 79+ messages in thread
From: Andy Chiu @ 2022-12-22 18:33 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Vineet Gupta, Vincent Chen, Florian Weimer, Rich Felker,
	Andrew Waterman, Palmer Dabbelt, Kito Cheng,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

On Thu, Dec 22, 2022 at 1:32 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
> E.g.
>
>      reserved[0] -> magic
>      reserved[1] -> displacement to "extension area"
>      reserved[2] -> size of "extension area"
>
> Thus the area can be located anywhere within 4GB and expand to 4GB.
> Not that I'd hope any signal frame would require 4GB.  :-)
>

By encoding the extension magic into the fp reserved space and attaching
the actual Vector state underneath, it is possible to make no size
changes to the sigcontext itself. In fact, the comment in
__riscv_q_ext_state says those bytes were purposely reserved for
sigcontext expansion. If this is the case, then maybe we should just
use that reserved space.

struct __riscv_q_ext_state {
        __u64 f[64] __attribute__((aligned(16)));
        __u32 fcsr;
        /*
         * Reserved for expansion of sigcontext structure.  Currently zeroed
         * upon signal, and must be zero upon sigreturn.
         */
        __u32 reserved[3];
};

Here is a way to keep the size and layout of sigcontext while still
letting the kernel write Vector state onto a user's signal stack. This
approach also lets user space use the existing reserved space to get
context from new extensions. We introduce a new struct,
__riscv_extra_ext_header, in a union with __riscv_fp_state in
sigcontext. __riscv_extra_ext_header is the same size as
__riscv_fp_state. The only purpose of the struct is to point to the
magic header of a following extension, e.g. Vector, located at the
reserved space. If no further extension follows, then all of
those bytes should be zero.

 struct sigcontext {
        struct user_regs_struct sc_regs;
-       union __riscv_fp_state sc_fpregs;
+       union {
+               union __riscv_fp_state sc_fpregs;
+               struct __riscv_extra_ext_header sc_extdesc;
+       };
 };
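
To illustrate how such an overlay can leave the size and the existing
fields untouched, here is a sketch with stand-in types. The field layout
of extra_ext_header and ext_hdr below is an assumption made for this
example (see the PoC tree for the actual definition); only
q_ext_state mirrors the kernel struct quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the kernel's __riscv_q_ext_state. */
struct q_ext_state {
    uint64_t f[64] __attribute__((aligned(16)));
    uint32_t fcsr;
    uint32_t reserved[3];
};

/* Hypothetical header describing the extension that follows. */
struct ext_hdr {
    uint32_t magic;
    uint32_t size;
};

/* Hypothetical overlay: padding covers f[] and fcsr, so only the
   reserved words of the FP state are reinterpreted. */
struct extra_ext_header {
    uint32_t padding[129] __attribute__((aligned(16)));
    uint32_t reserved;
    struct ext_hdr hdr;
};

union fp_or_ext {
    struct q_ext_state sc_fpregs;
    struct extra_ext_header sc_extdesc;
};

static_assert(sizeof(struct extra_ext_header) == sizeof(struct q_ext_state),
              "overlay must not change the size of the FP state");
static_assert(offsetof(struct extra_ext_header, reserved)
              == offsetof(struct q_ext_state, reserved),
              "overlay must start at the reserved words");
```

With this shape, hdr lands on reserved[1] and reserved[2], and the
register and fcsr bytes are never aliased by anything but padding.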

I wrote a PoC patch for this and pushed it to the following git tree:
https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13
I tested it on an rv32 QEMU virt machine, and user space can get/set
Vector registers normally. I haven't tested it on rv64 yet, but there
should be no difference. The patch is not the final version, and maybe
I missed some basic ideas. But if everyone agrees with this approach,
then I would like to start formalizing and submitting the series.

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-22  3:37                       ` Vincent Chen
@ 2022-12-22 19:25                         ` Vineet Gupta
  2022-12-23  2:27                           ` Vincent Chen
  0 siblings, 1 reply; 79+ messages in thread
From: Vineet Gupta @ 2022-12-22 19:25 UTC (permalink / raw)
  To: Vincent Chen
  Cc: Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy, Andy Chiu,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library


On 12/21/22 19:37, Vincent Chen wrote:
> On Thu, Dec 22, 2022 at 3:52 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>>
>>
>> On 12/21/22 11:45, Vineet Gupta wrote:
>>> 4. Kernel with RVV support + user program using original Glibc sigcontext
>>> In this case, the kernel needs to save vector registers context to
>>> memory. The user program may encounter memory issues if the user space
>>> does not reserve enough memory size for the kernel to create the
>>> sigcontext. However, we can't seem to avoid this case since there is
>>> no flexible memory area in struct sigcontext for future expansion.
>> This is not an issue, if we don't change sigcontext (in kernel and
>> glibc) - it is essentially the case of existing binaries.
>> kernel still saves regs on user stack, in rt_sigframe, its just that
>> userspace is not able to access them in SA_SIGINFO signal handers.
>> aarch64 have this implemented this approach and it is likely they can't
>> do that either for SVE regs.
> Sorry, I don't clearly describe the case. As you mentioned, the kernel
> will save the vector registers to the user stack or user-specified
> memory region by struct rt_sigframe in your patch. But, if there is an
> existing binary compiled with the original sigcontext.h, it will
> assume that the kernel only occupies the sizeof(struct sigcontext) to
> save these registers. It will not aware the RVV extension is supported
> and not expect that the kernel with RVV support needs an extra huge
> memory region on its stack or specified memory region to save vector
> registers context. In this case, the user program will encounter
> memory corruption issues if the size of the memory region specified by
> the user program is not enough to store these vector registers'
> context.

No, it will not. In this scheme struct sigcontext remains the same as
before. The kernel copies the RVV context not into sigcontext, but
beyond the canonical sigcontext, in the layout specified by
rt_sigframe. Please take a look at my patch again. It works.

Again, I don't care which scheme we follow, I just want to make forward
progress.

-Vineet


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-22 18:33                         ` Andy Chiu
@ 2022-12-22 20:27                           ` Vineet Gupta
  2022-12-28 10:53                             ` Andy Chiu
  2022-12-22 22:33                           ` Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break Richard Henderson via Libc-alpha
  2022-12-22 23:47                           ` Conor Dooley via Libc-alpha
  2 siblings, 1 reply; 79+ messages in thread
From: Vineet Gupta @ 2022-12-22 20:27 UTC (permalink / raw)
  To: Andy Chiu, Richard Henderson
  Cc: Vincent Chen, Florian Weimer, Rich Felker, Andrew Waterman,
	Palmer Dabbelt, Kito Cheng, Christoph Müllner, davidlt,
	Arnd Bergmann, Björn Töpel, Philipp Tomsich,
	Szabolcs Nagy, Greentime Hu, Aaron Durbin, Andrew de los Reyes,
	linux-riscv, GNU C Library



On 12/22/22 10:33, Andy Chiu wrote:
> On Thu, Dec 22, 2022 at 1:32 PM Richard Henderson
> <richard.henderson@linaro.org> wrote:
>> E.g.
>>
>>       reserved[0] -> magic
>>       reserved[1] -> displacement to "extension area"
>>       reserved[2] -> size of "extension area"
>>
>> Thus the area can be located anywhere within 4GB and expand to 4GB.
>> Not that I'd hope any signal frame would require 4GB.  :-)
>>
> By encoding the extension magic into fp reserved space, and attaching
> actual Vector states underneath, it is possible to make no size
> changes to the sigcontext itself. In fact the comment section of
> __riscv_q_ext_state specifies those bytes were purposely reserved for
> sigcontext expansion. If this is the case then maybe we should just
> use those reserved spaces anyway.
>
> struct __riscv_q_ext_state {
>          __u64 f[64] __attribute__((aligned(16)));
>          __u32 fcsr;
>          /*
>           * Reserved for expansion of sigcontext structure.  Currently zeroed
>           * upon signal, and must be zero upon sigreturn.
>           */
>          __u32 reserved[3];
> };
>
> Here is a way that keeps the size and layout of sigcontext, while it
> also manages to let the kernel write Vector state into an user's
> signal stack. This approach also lets the user space leverage existing
> reserved space to get context from new extensions. We introduce a new
> struct, __riscv_extra_ext_header, unioning with __riscv_fp_state in
> sigcontext. __riscv_extra_ext_header is the same size as
> __riscv_fp_state. The only purpose of the struct is to point to the
> magic header of a following extension, e.g. Vector, located at the
> reserved space. If there is no more extension to come, then all of
> those bytes should be zeros.
>
>   struct sigcontext {
>          struct user_regs_struct sc_regs;
> -       union __riscv_fp_state sc_fpregs;
> +       union {
> +               union __riscv_fp_state sc_fpregs;
> +               struct __riscv_extra_ext_header sc_extdesc;
> +       };
>   };
>
> I wrote a PoC patch for this and it has been pushed into the following git tree:
> https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13
> I tested it on a rv32 QEMU virt machine and the user space can get/set
> Vector registers normally. I haven't tested it on rv64 yet but it
> should be no difference. The patch is not the final version and maybe
> I missed some basic ideas. But if everyone agrees with this approach
> then I would like to start formalizing and submit the series.

This approach looks perfect. Let's productize it and fold this patch
into the respective patch(es).
We would then need fixups to not unconditionally enable V on fork/execve
and to hook that up to a prctl.
Let me work on that and provide something on top of your series.

-Vineet

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-22  5:32                       ` Richard Henderson via Libc-alpha
  2022-12-22 18:33                         ` Andy Chiu
@ 2022-12-22 20:30                         ` Vineet Gupta
  2022-12-22 21:38                           ` Andrew Waterman
  1 sibling, 1 reply; 79+ messages in thread
From: Vineet Gupta @ 2022-12-22 20:30 UTC (permalink / raw)
  To: Richard Henderson, Vincent Chen
  Cc: Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy, Andy Chiu,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library



On 12/21/22 21:32, Richard Henderson wrote:
> On 12/21/22 11:52, Vineet Gupta wrote:
>> This is not an issue, if we don't change sigcontext (in kernel and 
>> glibc) - it is essentially the case of existing binaries. kernel 
>> still saves regs on user stack, in
>> rt_sigframe, its just that userspace is not able to access them in 
>> SA_SIGINFO signal
>> handers. aarch64 have this implemented this approach and it is likely 
>> they can't do
>> that either for SVE regs.
>
> aarch64 can certainly access the SVE regs on the signal stack.  It 
> simply requires that you parse the chain of extensions within 
> __reserved to find it.
> It's quite well designed, really.

Yep, I've been staring at it this week and really appreciate the
extensible design. Indeed, one can go through the existing __reserved
in sigcontext to access that from userspace.


>
> What you can't do is "only" declare a sigcontext_t and be able to 
> construct a new context, nor copy the entire context via structure 
> assignment.
>
> There is room within the risc-v context for a similar scheme, via
>
>     sigcontext.sc_fpregs.q.reserved[3]
>
> E.g.
>
>     reserved[0] -> magic
>     reserved[1] -> displacement to "extension area"
>     reserved[2] -> size of "extension area"
>
> Thus the area can be located anywhere within 4GB and expand to 4GB.
> Not that I'd hope any signal frame would require 4GB.  :-)

Looks like we almost missed this. Thx for the pointer Richard.
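
For illustration, here is a minimal sketch of how user space might
follow the reserved[] scheme quoted above. It is not code from any
posted patch: the demo_* names, the magic value, and the choice to
count the displacement in bytes from the start of the FP state are all
assumptions made for the example. Zeroed reserved words are treated as
"no extension", in line with the existing comment that they are zeroed
upon signal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic value for this sketch; not taken from any patch. */
#define DEMO_EXT_MAGIC 0x5256u

/* Same layout as the __riscv_q_ext_state quoted earlier in the thread. */
struct demo_q_ext_state {
	uint64_t f[64] __attribute__((aligned(16)));
	uint32_t fcsr;
	uint32_t reserved[3];	/* [0] magic, [1] displacement, [2] size */
};

/*
 * Locate the extension area described by reserved[].
 * Assumption: the displacement is counted in bytes from the start of *st.
 * frame_bytes is the number of valid bytes starting at st.
 * Returns NULL if no extension is advertised or the area is out of bounds.
 */
const void *demo_find_ext(const struct demo_q_ext_state *st,
			  size_t frame_bytes, uint32_t *size_out)
{
	uint32_t disp = st->reserved[1];
	uint32_t size = st->reserved[2];

	if (st->reserved[0] != DEMO_EXT_MAGIC)
		return NULL;		/* zeroed words: nothing follows */
	if (disp < sizeof(*st))
		return NULL;		/* would overlap the FP state */
	if ((uint64_t)disp + size > frame_bytes)
		return NULL;		/* area runs past the frame */

	*size_out = size;
	return (const unsigned char *)st + disp;
}
```

The bounds checks are done in 64-bit arithmetic so that a large
displacement plus size cannot wrap around, which matters because both
fields can describe anything up to 4GB.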

-Vineet


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-22 20:30                         ` Vineet Gupta
@ 2022-12-22 21:38                           ` Andrew Waterman
  0 siblings, 0 replies; 79+ messages in thread
From: Andrew Waterman @ 2022-12-22 21:38 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Richard Henderson, Vincent Chen, Florian Weimer, Rich Felker,
	Palmer Dabbelt, Kito Cheng, Christoph Müllner, davidlt,
	Arnd Bergmann, Björn Töpel, Philipp Tomsich,
	Szabolcs Nagy, Andy Chiu, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

On Thu, Dec 22, 2022 at 12:30 PM Vineet Gupta <vineetg@rivosinc.com> wrote:
>
>
>
> On 12/21/22 21:32, Richard Henderson wrote:
> > On 12/21/22 11:52, Vineet Gupta wrote:
> >> This is not an issue, if we don't change sigcontext (in kernel and
> >> glibc) - it is essentially the case of existing binaries. kernel
> >> still saves regs on user stack, in
> >> rt_sigframe, its just that userspace is not able to access them in
> >> SA_SIGINFO signal
> >> handers. aarch64 have this implemented this approach and it is likely
> >> they can't do
> >> that either for SVE regs.
> >
> > aarch64 can certainly access the SVE regs on the signal stack.  It
> > simply requires that you parse the chain of extensions within
> > __reserved to find it.
> > It's quite well designed, really.
>
> Yep I've been staring at it this week and really appreciate the
> extensible design. Indeed one can do thru the existing __reserved in
> sigcontext to access that from userspace.

Sorry y'all had to reverse-engineer our logic: this was exactly our
intent for those reserved words when we defined the current ABI.  It's
also why the current ABI requires them to be zero: as a sentinel to
signify the end of the list of extension areas.

>
>
> >
> > What you can't do is "only" declare a sigcontext_t and be able to
> > construct a new context, nor copy the entire context via structure
> > assignment.
> >
> > There is room within the risc-v context for a similar scheme, via
> >
> >     sigcontext.sc_fpregs.q.reserved[3]
> >
> > E.g.
> >
> >     reserved[0] -> magic
> >     reserved[1] -> displacement to "extension area"
> >     reserved[2] -> size of "extension area"
> >
> > Thus the area can be located anywhere within 4GB and expand to 4GB.
> > Not that I'd hope any signal frame would require 4GB.  :-)
>
> Looks like we almost missed this. Thx for the pointer Richard.
>
> -Vineet
>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-22 18:33                         ` Andy Chiu
  2022-12-22 20:27                           ` Vineet Gupta
@ 2022-12-22 22:33                           ` Richard Henderson via Libc-alpha
  2022-12-22 23:47                           ` Conor Dooley via Libc-alpha
  2 siblings, 0 replies; 79+ messages in thread
From: Richard Henderson via Libc-alpha @ 2022-12-22 22:33 UTC (permalink / raw)
  To: Andy Chiu
  Cc: Vineet Gupta, Vincent Chen, Florian Weimer, Rich Felker,
	Andrew Waterman, Palmer Dabbelt, Kito Cheng,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

On 12/22/22 10:33, Andy Chiu wrote:
> I wrote a PoC patch for this and it has been pushed into the following git tree:
> https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13

I had a look at your include/uapi/, and it looks good.
Mere nits:

> struct __riscv_q_ext_state {
> 	__u64 f[64] __attribute__((aligned(16)));
> 	__u32 fcsr;
> 	/*
> 	 * Reserved for expansion of sigcontext structure.  Currently zeroed
> 	 * upon signal, and must be zero upon sigreturn.
> 	 */
> 	__u32 reserved[3];
> };
> 
> struct __riscv_ctx_hdr {
> 	__u32 magic;
> 	__u32 size;
> 	__u32 reserved;
> };

Thinking about the _next_ extension on the chain, perhaps drop the 3rd word from here, so 
that (&hdr + 1) is 8-byte aligned (which may be enough depending on what the extension 
contains)?

> struct __riscv_extra_ext_header {
> 	__u64 ignored[64] __attribute__((aligned(16)));
> 	__u32 padding;
> 	/*
> 	 * Reserved for expansion of sigcontext structure.  Currently zeroed
> 	 * upon signal, and must be zero upon sigreturn.
> 	 */
> 	struct __riscv_ctx_hdr hdr;
> };

     __u32 __padding[129]
or
     __u64 __padding[65]

depending on your answer to the above?

It might reduce confusion to move (or replicate, for redundancy) the aligned(16) from the 
innermost __riscv_q_ext_state.f[] to the outermost sc_fpregs and/or sigcontext.
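
To make the two padding choices concrete, here is a small sketch that
checks the arithmetic at compile time. It reflects one reading of the
suggestion (a three-word header paired with __u32 padding[129], a
two-word header paired with __u64 padding[65]); the demo_* names are
hypothetical and fixed-width C types stand in for the kernel's __u32
and __u64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the __riscv_q_ext_state quoted above. */
struct demo_q_ext_state {
	uint64_t f[64] __attribute__((aligned(16)));
	uint32_t fcsr;
	uint32_t reserved[3];
};

/* Variant A: three-word header, as in the PoC. */
struct demo_hdr3 { uint32_t magic; uint32_t size; uint32_t reserved; };
struct demo_extra3 {
	uint32_t padding[129] __attribute__((aligned(16)));
	struct demo_hdr3 hdr;
};

/* Variant B: two-word header, with the third word dropped. */
struct demo_hdr2 { uint32_t magic; uint32_t size; };
struct demo_extra2 {
	uint64_t padding[65] __attribute__((aligned(16)));
	struct demo_hdr2 hdr;
};

/* Both overlays must be exactly as large as the Q-extension FP state. */
_Static_assert(sizeof(struct demo_q_ext_state) == 528, "FP state size");
_Static_assert(sizeof(struct demo_extra3) == 528, "variant A size");
_Static_assert(sizeof(struct demo_extra2) == 528, "variant B size");

/* Variant A's header sits exactly on top of reserved[3]. */
_Static_assert(offsetof(struct demo_extra3, hdr) ==
	       offsetof(struct demo_q_ext_state, reserved), "A overlay");

/* Variant B's header starts on an 8-byte boundary, one word later. */
_Static_assert(offsetof(struct demo_extra2, hdr) == 520, "B overlay");

/* Bytes by which the data following a header misses 8-byte alignment. */
size_t demo_misalign_after(size_t hdr_size)
{
	return hdr_size % 8;
}
```

With a 12-byte header, whatever follows an 8-byte-aligned header starts
4 bytes off an 8-byte boundary; with an 8-byte header it stays aligned,
which is the point about (&hdr + 1) for the next extension on the
chain.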

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-22 18:33                         ` Andy Chiu
  2022-12-22 20:27                           ` Vineet Gupta
  2022-12-22 22:33                           ` Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break Richard Henderson via Libc-alpha
@ 2022-12-22 23:47                           ` Conor Dooley via Libc-alpha
  2022-12-22 23:58                             ` Vineet Gupta
  2 siblings, 1 reply; 79+ messages in thread
From: Conor Dooley via Libc-alpha @ 2022-12-22 23:47 UTC (permalink / raw)
  To: Andy Chiu
  Cc: Richard Henderson, Vineet Gupta, Vincent Chen, Florian Weimer,
	Rich Felker, Andrew Waterman, Palmer Dabbelt, Kito Cheng,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

On Fri, Dec 23, 2022 at 02:33:26AM +0800, Andy Chiu wrote:
 
> I wrote a PoC patch for this and it has been pushed into the following git tree:
> https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13
> I tested it on a rv32 QEMU virt machine and the user space can get/set
> Vector registers normally. I haven't tested it on rv64 yet but it
> should be no difference. The patch is not the final version and maybe
> I missed some basic ideas.

> But if everyone agrees with this approach
> then I would like to start formalizing and submit the series.

Between yourself and the Rivos folk, you should probably sort out who is
doing what with the series at the very least, so that you're not both
working on "competing" v13s...



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-22 23:47                           ` Conor Dooley via Libc-alpha
@ 2022-12-22 23:58                             ` Vineet Gupta
  0 siblings, 0 replies; 79+ messages in thread
From: Vineet Gupta @ 2022-12-22 23:58 UTC (permalink / raw)
  To: Conor Dooley, Andy Chiu
  Cc: Richard Henderson, Vincent Chen, Florian Weimer, Rich Felker,
	Andrew Waterman, Palmer Dabbelt, Kito Cheng,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library



On 12/22/22 15:47, Conor Dooley wrote:
> On Fri, Dec 23, 2022 at 02:33:26AM +0800, Andy Chiu wrote:
>   
>> I wrote a PoC patch for this and it has been pushed into the following git tree:
>> https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13
>> I tested it on a rv32 QEMU virt machine and the user space can get/set
>> Vector registers normally. I haven't tested it on rv64 yet but it
>> should be no difference. The patch is not the final version and maybe
>> I missed some basic ideas.
>> But if everyone agrees with this approach
>> then I would like to start formalizing and submit the series.
> Between yourself and the Rivos folk, you should probably sort out who is
> doing what with the series at the very least, so that you're not both
> working on "competing" v13s...

No, we are not competing ;-)
I'm mostly facilitating, since this got stuck in a stalemate and the
original contributors had gone radio silent for a while.

-Vineet

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-22 19:25                         ` Vineet Gupta
@ 2022-12-23  2:27                           ` Vincent Chen
  2022-12-23 19:42                             ` Vineet Gupta
  0 siblings, 1 reply; 79+ messages in thread
From: Vincent Chen @ 2022-12-23  2:27 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy, Andy Chiu,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

On Fri, Dec 23, 2022 at 3:25 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>
>
> On 12/21/22 19:37, Vincent Chen wrote:
> > On Thu, Dec 22, 2022 at 3:52 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
> >>
> >>
> >> On 12/21/22 11:45, Vineet Gupta wrote:
> >>> 4. Kernel with RVV support + user program using original Glibc sigcontext
> >>> In this case, the kernel needs to save vector registers context to
> >>> memory. The user program may encounter memory issues if the user space
> >>> does not reserve enough memory size for the kernel to create the
> >>> sigcontext. However, we can't seem to avoid this case since there is
> >>> no flexible memory area in struct sigcontext for future expansion.
> >> This is not an issue, if we don't change sigcontext (in kernel and
> >> glibc) - it is essentially the case of existing binaries.
> >> kernel still saves regs on user stack, in rt_sigframe, its just that
> >> userspace is not able to access them in SA_SIGINFO signal handers.
> >> aarch64 have this implemented this approach and it is likely they can't
> >> do that either for SVE regs.
> > Sorry, I don't clearly describe the case. As you mentioned, the kernel
> > will save the vector registers to the user stack or user-specified
> > memory region by struct rt_sigframe in your patch. But, if there is an
> > existing binary compiled with the original sigcontext.h, it will
> > assume that the kernel only occupies the sizeof(struct sigcontext) to
> > save these registers. It will not aware the RVV extension is supported
> > and not expect that the kernel with RVV support needs an extra huge
> > memory region on its stack or specified memory region to save vector
> > registers context. In this case, the user program will encounter
> > memory corruption issues if the size of the memory region specified by
> > the user program is not enough to store these vector registers'
> > context.
>
> No, it will not. In this scheme struct sigcontext remains same as
> before. Kernel is copying the RVV context not in sigcontext, but beyond
> the canonical sigcontext, in layout specified in the rt_sigframe. Please
> take a look at my patch again. It works.

If I understand correctly, in your patch the kernel uses rt_sigframe
to back up all register contexts in user space, including the RVV
registers. Therefore, the user program needs to reserve enough memory
for the kernel, namely sizeof(rt_sigframe) plus rvv_sc_size. However,
existing RISC-V programs do not expect rvv_sc_size. Therefore, some
memory of an existing program may be corrupted when the kernel backs
up the RVV register context.

>
> Again I don't care what scheme we follow, I just want o make forward
> progress.
>

 I understand your thoughts and I sincerely appreciate everything you do.

> -Vineet
>

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-23  2:27                           ` Vincent Chen
@ 2022-12-23 19:42                             ` Vineet Gupta
  0 siblings, 0 replies; 79+ messages in thread
From: Vineet Gupta @ 2022-12-23 19:42 UTC (permalink / raw)
  To: Vincent Chen
  Cc: Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Kito Cheng, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy, Andy Chiu,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library


On 12/22/22 18:27, Vincent Chen wrote:
> If I understand correctly, in your patch, the kernel uses rt_sigframe
> to back up all register contexts in the user space, including RVV
> registers.

Discussing this is a moot point, but still...

> Therefore, the user program needs to reserve enough memory
> space for the kernel, which enough size of this memory space is the
> sizeof(rt_sigframe) plus rvv_sc_size.

In my patch, rt_sigframe has a C99 flexible array member, so it doesn't
add any extra space on its own.
The total size increase is the same whether we add it to the kernel
sigcontext or to rt_sigframe. And since the glibc sigcontext is not
changed, the application is unaware of rvv_sc_size in either case.
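
As a small illustration of the flexible array point, here is a sketch
with stand-in types. These are not the kernel's real rt_sigframe or
sigcontext; the demo_* names and the 32-register stand-in are
assumptions for the example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the unchanged, fixed-size context. */
struct demo_sigcontext {
	uint64_t regs[32];
};

/* Frame without any extension area. */
struct demo_frame_fixed {
	struct demo_sigcontext sc;
};

/* Frame with a C99 flexible array member for extension state. */
struct demo_frame_flex {
	struct demo_sigcontext sc;
	unsigned char ext[];	/* contributes nothing to sizeof */
};

/* Bytes actually needed on the stack for a given extension size. */
size_t demo_frame_bytes(size_t ext_bytes)
{
	return sizeof(struct demo_frame_flex) + ext_bytes;
}
```

sizeof(struct demo_frame_flex) equals sizeof(struct demo_frame_fixed),
so the extra stack usage comes only from the ext_bytes added at run
time, wherever the extension area is declared.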

> However, the rvv_sc_size is
> unexpected to existing RISC-V programs.

Again, I'm not sure how that differs between the two cases.

> Therefore, some memory of the
> existing program may be corrupted by the kernel when the kernel backs
> up the RVV registers context.

The kernel builds the signal frame on top of the existing user stack.

setup_rt_frame
     get_sigframe
           sp = regs->sp;

So it can't possibly corrupt any existing user stack area. Sure, when
expanding the stack, the user stack rlimit etc. may be hit when doing
put_user. But again, that is the same for both approaches.
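
A rough user-space model of that placement, for illustration only. The
demo_* helpers are hypothetical and the 16-byte alignment is an
assumption of this sketch, not a quote of the kernel code. It shows
both halves of the argument: the frame always lands below the current
sp, and a frame that is too large for the remaining stack is detected
as not fitting rather than overwriting live data above sp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Place a frame of frame_size bytes below sp, aligned down to 16 bytes. */
uintptr_t demo_place_frame(uintptr_t sp, size_t frame_size)
{
	return (sp - frame_size) & ~(uintptr_t)0xf;
}

/*
 * Check whether the frame stays at or above stack_lo, the lowest usable
 * address of the stack (or alternate signal stack) being used.
 */
int demo_frame_fits(uintptr_t sp, size_t frame_size, uintptr_t stack_lo)
{
	if (sp < stack_lo || frame_size > sp - stack_lo)
		return 0;	/* not enough room at all */
	return demo_place_frame(sp, frame_size) >= stack_lo;
}
```

The check depends only on the total frame size, so it gives the same
answer whether the vector state is accounted to sigcontext or to
rt_sigframe.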

FWIW, the kernel with my patch can be found below; it survives a full
glibc testsuite run w/o any regression, so it definitely works w/o any
obvious user memory corruption.

git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/linux.git 
#rvv-v13.2-use-rt_sigframe

-Vineet

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-22 20:27                           ` Vineet Gupta
@ 2022-12-28 10:53                             ` Andy Chiu
  2023-01-03 19:17                               ` Vineet Gupta
  0 siblings, 1 reply; 79+ messages in thread
From: Andy Chiu @ 2022-12-28 10:53 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Richard Henderson, Vincent Chen, Florian Weimer, Rich Felker,
	Andrew Waterman, Palmer Dabbelt, Kito Cheng,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

On Fri, Dec 23, 2022 at 4:28 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
> This approach looks perfect. Lets productize it to fold this patch into
> the respective patch(es).
> We would then need fixups to not unconditionally enable V on fork/execve
> and hook that up to a prctl.
> Let me work on that and provide something on top of your series.

Hi Vineet, I have folded the approach into the Vector series
according to the suggestions, which makes it more formal than the PoC.
Additionally, I picked up your prctl patch and added a Kconfig option
to compile a kernel that won't unconditionally enable V. Please tell me
if this does not seem right to you. I will submit the series if this
looks good to you, and we can discuss further details in that
thread. Here is the tree, thanks:

https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13.1-newapi-prctl

-Andy

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2022-12-28 10:53                             ` Andy Chiu
@ 2023-01-03 19:17                               ` Vineet Gupta
  2023-01-04 16:34                                 ` Andy Chiu
  0 siblings, 1 reply; 79+ messages in thread
From: Vineet Gupta @ 2023-01-03 19:17 UTC (permalink / raw)
  To: Andy Chiu
  Cc: Richard Henderson, Vincent Chen, Florian Weimer, Rich Felker,
	Andrew Waterman, Palmer Dabbelt, Kito Cheng,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

Hi Andy,

On 12/28/22 02:53, Andy Chiu wrote:
> On Fri, Dec 23, 2022 at 4:28 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>> This approach looks perfect. Lets productize it to fold this patch into
>> the respective patch(es).
>> We would then need fixups to not unconditionally enable V on fork/execve
>> and hook that up to a prctl.
>> Let me work on that and provide something on top of your series.
> Hi Vineet, I have included the approach into the Vector series
> according to suggestions, which makes it formaler than the PoC one.
> Additionally, I picked up your prctl patch and added a kconfig to
> compile a kernel that won't unconditionally enable V. Please tell me
> if this does not seem right to you.

The prctl support in there is really rudimentary and incomplete. There's 
more work needed to use the dynamic state of enablement - for say signal 
frame etc. The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a 
hack bolted on top.
It would be best to drop it in the current state and rework properly 
based on your patches.

> I will submit the series if this
> seems well to you and let's discuss some more details further in that
> thread. Here is the tree, thanks:
>
> https://github.com/sifive/riscv-linux/tree/dev/andyc/for-next-v13.1-newapi-prctl

I would also suggest dropping the 2 patches for in-kernel enablement
from your submission, as that might require some more thinking/design
and builds naturally on top of the baseline patches.

Thx,
-Vineet

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2023-01-03 19:17                               ` Vineet Gupta
@ 2023-01-04 16:34                                 ` Andy Chiu
  2023-01-04 20:46                                   ` Vineet Gupta
  0 siblings, 1 reply; 79+ messages in thread
From: Andy Chiu @ 2023-01-04 16:34 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Richard Henderson, Vincent Chen, Florian Weimer, Rich Felker,
	Andrew Waterman, Palmer Dabbelt, Kito Cheng,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library

Hi Vineet,

On Wed, Jan 4, 2023 at 3:17 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
> The prctl support in there is really rudimentary and incomplete. There's
> more work needed to use the dynamic state of enablement - for say signal
> frame etc.
Yes, I agree that signal and ptrace need special handling if we turn
off Vector with prctl. For example, we may not need to save/restore
vector context on context switches and signal handling. And we may
have to prevent ptrace from setting/getting vector context in such a
case. I can implement this in the series if this is what you're
looking for. Or you could share the code somewhere so that I could
merge it.

> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
> hack bolted on top.
IIUC, most opinions suggested that we should keep the default Vector
state ON in this thread:
https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u
So IMHO adding a build option for those who prefer not to
unconditionally enable V should be sufficient.

> I would also suggesting dropping the 2 patches for in-kernel enablement
> for your submission as it might require some more thinking/design and
> builds naturally on top of the baseline patches.
Yes, I agree. Those patches were heavily copied from arm neon, which
will not benefit from hardware feature on riscv-V. I will refine those
patches and submit them independently, on top of the baseline patches.

Thanks,
Andy

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2023-01-04 16:34                                 ` Andy Chiu
@ 2023-01-04 20:46                                   ` Vineet Gupta
  2023-01-04 21:29                                     ` Philipp Tomsich
  0 siblings, 1 reply; 79+ messages in thread
From: Vineet Gupta @ 2023-01-04 20:46 UTC (permalink / raw)
  To: Andy Chiu
  Cc: Richard Henderson, Vincent Chen, Florian Weimer, Rich Felker,
	Andrew Waterman, Palmer Dabbelt, Kito Cheng,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Philipp Tomsich, Szabolcs Nagy,
	Greentime Hu, Aaron Durbin, Andrew de los Reyes, linux-riscv,
	GNU C Library



On 1/4/23 08:34, Andy Chiu wrote:
> Hi Vineet,
>
> On Wed, Jan 4, 2023 at 3:17 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
>> The prctl support in there is really rudimentary and incomplete. There's
>> more work needed to use the dynamic state of enablement - for say signal
>> frame etc.
> Yes, I agree that signal and ptrace need special handling if we'd turn
> off Vector with prctl. For example, we may not need to save/restore
> vector context on context switches and signal handlings. And we may
> have to prevent ptrace from setting/getting vector context in such
> case. I can implement this into the series if this is what you're
> looking for.

Perfect. This is exactly the coverage I was hoping to see. Go for it.

>> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
>> hack bolted on top.
> IIUC, most opinions suggested that we should keep the default Vector
> state to ON in thread:
> https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u

Actually, community feedback is that they *don't* want the default
vector state to be on, due to power implications and increased stack
and memory usage for vector contents (in that thread and elsewhere as
well). So we should keep it disabled by default, but indeed we could
have that Kconfig option to enable it. Granted, distro kernels will
keep it disabled by default; this lets vendors enable it selectively
until the full userspace enabling bits are in place.

> So IMHO adding a build option to those who prefer not to
> unconditionally enable V should be sufficient.

As above, it should be the other way round.

Thx,
-Vineet

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2023-01-04 20:46                                   ` Vineet Gupta
@ 2023-01-04 21:29                                     ` Philipp Tomsich
  2023-01-04 21:37                                       ` Andrew Waterman
  2023-01-04 22:43                                       ` Vineet Gupta
  0 siblings, 2 replies; 79+ messages in thread
From: Philipp Tomsich @ 2023-01-04 21:29 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Andy Chiu, Richard Henderson, Vincent Chen, Florian Weimer,
	Rich Felker, Andrew Waterman, Palmer Dabbelt, Kito Cheng,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

On Wed, 4 Jan 2023 at 21:46, Vineet Gupta <vineetg@rivosinc.com> wrote:
>
>
>
> On 1/4/23 08:34, Andy Chiu wrote:
> > Hi Vineet,
> >
> > On Wed, Jan 4, 2023 at 3:17 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
> >> The prctl support in there is really rudimentary and incomplete. There's
> >> more work needed to use the dynamic state of enablement - for say signal
> >> frame etc.
> > Yes, I agree that signal and ptrace need special handling if we'd turn
> > off Vector with prctl. For example, we may not need to save/restore
> > vector context on context switches and signal handlings. And we may
> > have to prevent ptrace from setting/getting vector context in such
> > case. I can implement this into the series if this is what you're
> > looking for.
>
> Perfect. This is exactly the coverage I was hoping to see. Go for it.
>
> >> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
> >> hack bolted on top.
> > IIUC, most opinions suggested that we should keep the default Vector
> > state to ON in thread:
> > https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u
>
> Actually community feedback is that they *don't * want the default
> vector state to be on due to power implications, increased stack and
> memory usage for vector contents (in that thread and else where as
> well). So we should keep it disabled by default, but indeed we could
> have that Kconfig option to enable it. Granted distro kernels will keep
> it disabled by default, this lets vendors enable it selectively until
> the full userspace enabling bits are in place.

Should we punt this to the ELF (e.g., using a RISC-V specific
attribute) and take a per-process decision on whether to start in ON
or OFF?
I don't feel fully comfortable with a KCONFIG that could change and
invalidate the assumptions a userspace process could have made…

Alternatively, we could establish the convention of having two stub
libraries that set up either enabled or disabled state from their
.init_array to provide a mechanism for folks that want to make an
explicit assumption.  Although this may try to overdesign a solution
for a non-issue.

Philipp.

>
> > So IMHO adding a build option to those who prefer not to
> > unconditionally enable V should be sufficient.
>
> As above, it should be other way round.
>
> Thx,
> -Vineet

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2023-01-04 21:29                                     ` Philipp Tomsich
@ 2023-01-04 21:37                                       ` Andrew Waterman
  2023-01-04 22:43                                       ` Vineet Gupta
  1 sibling, 0 replies; 79+ messages in thread
From: Andrew Waterman @ 2023-01-04 21:37 UTC (permalink / raw)
  To: Philipp Tomsich
  Cc: Vineet Gupta, Andy Chiu, Richard Henderson, Vincent Chen,
	Florian Weimer, Rich Felker, Palmer Dabbelt, Kito Cheng,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

On Wed, Jan 4, 2023 at 1:29 PM Philipp Tomsich <philipp.tomsich@vrull.eu> wrote:
>
> On Wed, 4 Jan 2023 at 21:46, Vineet Gupta <vineetg@rivosinc.com> wrote:
> >
> >
> >
> > On 1/4/23 08:34, Andy Chiu wrote:
> > > Hi Vineet,
> > >
> > > On Wed, Jan 4, 2023 at 3:17 AM Vineet Gupta <vineetg@rivosinc.com> wrote:
> > >> The prctl support in there is really rudimentary and incomplete. There's
> > >> more work needed to use the dynamic state of enablement - for say signal
> > >> frame etc.
> > > Yes, I agree that signal and ptrace need special handling if we'd turn
> > > off Vector with prctl. For example, we may not need to save/restore
> > > vector context on context switches and signal handlings. And we may
> > > have to prevent ptrace from setting/getting vector context in such
> > > case. I can implement this into the series if this is what you're
> > > looking for.
> >
> > Perfect. This is exactly the coverage I was hoping to see. Go for it.
> >
> > >> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
> > >> hack bolted on top.
> > > IIUC, most opinions suggested that we should keep the default Vector
> > > state to ON in thread:
> > > https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u
> >
> > Actually community feedback is that they *don't * want the default
> > vector state to be on due to power implications, increased stack and
> > memory usage for vector contents (in that thread and else where as
> > well). So we should keep it disabled by default, but indeed we could
> > have that Kconfig option to enable it. Granted distro kernels will keep
> > it disabled by default, this lets vendors enable it selectively until
> > the full userspace enabling bits are in place.
>
> Should we punt this to the ELF (e.g., using a RISC-V specific
> attribute) and take a per-process decision on whether to start in ON
> or OFF?
> I don't feel fully comfortable with a KCONFIG that could change and
> invalidate the assumptions a userspace process could have made…

I am supremely confident we will eventually have userspace that
unconditionally wants V (for optimized C library routines at minimum),
and that it will follow very closely on the heels of V becoming
mainstream.  So, your proposal to embed this information in the ELF
header (so that the kernel can enable V automatically on program load,
or so the dynamic loader can execute the `prctl` call on library load,
or whatever) seems more forward-looking to me than making this a
Kconfig option.

>
> Alternatively, we could establish the convention of having two stub
> libraries that set up either enabled or disable state from their
> .init_array to provide a mechanism for folks that want to make an
> explicit assumption.  Although this may try to overdesign a solution
> for a non-issue.
>
> Philipp.
>
> >
> > > So IMHO adding a build option to those who prefer not to
> > > unconditionally enable V should be sufficient.
> >
> > As above, it should be other way round.
> >
> > Thx,
> > -Vineet

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2023-01-04 21:29                                     ` Philipp Tomsich
  2023-01-04 21:37                                       ` Andrew Waterman
@ 2023-01-04 22:43                                       ` Vineet Gupta
  2023-01-09 13:33                                         ` Kito Cheng
  1 sibling, 1 reply; 79+ messages in thread
From: Vineet Gupta @ 2023-01-04 22:43 UTC (permalink / raw)
  To: Philipp Tomsich
  Cc: Andy Chiu, Richard Henderson, Vincent Chen, Florian Weimer,
	Rich Felker, Andrew Waterman, Palmer Dabbelt, Kito Cheng,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library



On 1/4/23 13:29, Philipp Tomsich wrote:
>>>> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
>>>> hack bolted on top.
>>> IIUC, most opinions suggested that we should keep the default Vector
>>> state to ON in thread:
>>> https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u
>> Actually community feedback is that they *don't * want the default
>> vector state to be on due to power implications, increased stack and
>> memory usage for vector contents (in that thread and else where as
>> well). So we should keep it disabled by default, but indeed we could
>> have that Kconfig option to enable it. Granted distro kernels will keep
>> it disabled by default, this lets vendors enable it selectively until
>> the full userspace enabling bits are in place.
> Should we punt this to the ELF (e.g., using a RISC-V specific
> attribute) and take a per-process decision on whether to start in ON
> or OFF?
> I don't feel fully comfortable with a KCONFIG that could change and
> invalidate the assumptions a userspace process could have made…

The Kconfig is just a stop gap for vendors to enable V development while 
the full userspace stuff is sorted out.

Indeed the RISCV_ATTRIBUTES section has -march info, but we need to do some
development around it to parse it and use it.
There are still corner cases, such as a non-V executable dlopen-ing a DSO, so
a kernel ELF parser doing this might not cover all cases.
Similar logic will need to be added to the glibc loader, eventually.

Adding the full plumbing is a chicken-and-egg problem.


> Alternatively, we could establish the convention of having two stub
> libraries that set up either enabled or disable state from their
> .init_array to provide a mechanism for folks that want to make an
> explicit assumption.  Although this may try to overdesign a solution
> for a non-issue.

I was thinking more along the lines of x86 GLIBC_TUNABLES to enable it 
via env/sub-shell on a per-task basis - the tunable hook could in turn 
verify that Vector support does exist - or it could invoke the prctl 
unconditionally (which would fail if V didn't exist etc).
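
To make the tunable idea a bit more concrete, here is a minimal sketch of
what such a hook could look like. Everything in it is an assumption for
illustration: the tunable name "glibc.cpu.riscv_vector" does not exist in
glibc, and the enable callback merely stands in for whatever prctl the
kernel series ends up providing. It only shows the "invoke unconditionally
and let it fail" variant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tunable name: glibc does not define this today. */
#define V_TUNABLE "glibc.cpu.riscv_vector"

/* True if a GLIBC_TUNABLES-style string ("a=1:b=2") sets V_TUNABLE=1. */
static bool tunable_requests_v(const char *tunables)
{
    assert(tunables != NULL);
    const size_t name_len = strlen(V_TUNABLE);

    for (const char *p = tunables; *p != '\0';) {
        const char *end = strchr(p, ':');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (len == name_len + 2 && strncmp(p, V_TUNABLE, name_len) == 0
            && p[name_len] == '=' && p[name_len + 1] == '1')
            return true;
        if (end == NULL)
            break;
        p = end + 1;
    }
    return false;
}

/* Stand-in for the eventual prctl: returns 0 on success, -1 on failure. */
typedef int (*enable_v_fn)(void);

/* Stubs used only to exercise the sketch. */
static int enable_v_ok(void)   { return 0; }
static int enable_v_fail(void) { return -1; }

/* 0: nothing requested or enable succeeded; -1: requested but failed. */
static int apply_v_tunable(const char *tunables, enable_v_fn enable_v)
{
    if (!tunable_requests_v(tunables))
        return 0;
    return enable_v() == 0 ? 0 : -1;
}
```

With this shape the hook needs no separate "does V exist" probe: on a
non-V system the enable call itself reports the failure.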

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2023-01-04 22:43                                       ` Vineet Gupta
@ 2023-01-09 13:33                                         ` Kito Cheng
  2023-01-09 19:16                                           ` Vineet Gupta
  0 siblings, 1 reply; 79+ messages in thread
From: Kito Cheng @ 2023-01-09 13:33 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Philipp Tomsich, Andy Chiu, Richard Henderson, Vincent Chen,
	Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

Hi Vineet:

> >>>> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
> >>>> hack bolted on top.
> >>> IIUC, most opinions suggested that we should keep the default Vector
> >>> state to ON in thread:
> >>> https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u
> >> Actually community feedback is that they *don't * want the default
> >> vector state to be on due to power implications, increased stack and
> >> memory usage for vector contents (in that thread and else where as
> >> well). So we should keep it disabled by default, but indeed we could
> >> have that Kconfig option to enable it. Granted distro kernels will keep
> >> it disabled by default, this lets vendors enable it selectively until
> >> the full userspace enabling bits are in place.
> > Should we punt this to the ELF (e.g., using a RISC-V specific
> > attribute) and take a per-process decision on whether to start in ON
> > or OFF?
> > I don't feel fully comfortable with a KCONFIG that could change and
> > invalidate the assumptions a userspace process could have made…
>
> The Kconfig is just a stop gap for vendors to enable V development while
> the full userspace stuff is sorted out.
>
> Indeed RISCV_ATTRIBUTES section has -march info, but we need to do some
> development around it to parse it and use it.

I don't think RISCV_ATTRIBUTES is the right place to check that - even if the
program is compiled without V, it can still enable V and get a performance
benefit through ifuncs in glibc, and some 3rd-party libraries might also be
optimized with the V ext.

And don't forget other shared libraries in the system:
are we going to check all dependent libraries at program load time?
That would require resolving the library dependencies in the kernel.
Or do we intend to enable V only if the executable is compiled with V?

> There are still corner cases such as non-V executable dlopen a dso - so
> kernel elf parser doing this might not cover all cases.
>
> Similar logic will need to be added to glibc loader - eventually.
>
> Adding the full plumbing is a chicken-and-egg problem.
>
>
> > Alternatively, we could establish the convention of having two stub
> > libraries that set up either enabled or disable state from their
> > .init_array to provide a mechanism for folks that want to make an
> > explicit assumption.  Although this may try to overdesign a solution
> > for a non-issue.
>
> I was thinking more along the lines of x86 GLIBC_TUNABLES to enable it
> via env/sub-shell on a per-task basis - the tunable hook could in turn
> verify that Vector support does exist - or it could invoke the prctl
> unconditionally (which would fail if V didn't exist etc).

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2023-01-09 13:33                                         ` Kito Cheng
@ 2023-01-09 19:16                                           ` Vineet Gupta
  2023-01-10 13:21                                             ` Kito Cheng
  0 siblings, 1 reply; 79+ messages in thread
From: Vineet Gupta @ 2023-01-09 19:16 UTC (permalink / raw)
  To: Kito Cheng
  Cc: Philipp Tomsich, Andy Chiu, Richard Henderson, Vincent Chen,
	Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

Hi Kito,

On 1/9/23 05:33, Kito Cheng wrote:
> Hi Vineet:
>
>>>>>> The new Kconfig CONFIG_RISCV_VSTATE_INIT_ALL seems like a
>>>>>> hack bolted on top.
>>>>> IIUC, most opinions suggested that we should keep the default Vector
>>>>> state to ON in thread:
>>>>> https://lore.kernel.org/all/20220921214439.1491510-17-stillson@rivosinc.com/T/#u
>>>> Actually community feedback is that they *don't * want the default
>>>> vector state to be on due to power implications, increased stack and
>>>> memory usage for vector contents (in that thread and else where as
>>>> well). So we should keep it disabled by default, but indeed we could
>>>> have that Kconfig option to enable it. Granted distro kernels will keep
>>>> it disabled by default, this lets vendors enable it selectively until
>>>> the full userspace enabling bits are in place.
>>> Should we punt this to the ELF (e.g., using a RISC-V specific
>>> attribute) and take a per-process decision on whether to start in ON
>>> or OFF?
>>> I don't feel fully comfortable with a KCONFIG that could change and
>>> invalidate the assumptions a userspace process could have made…
>> The Kconfig is just a stop gap for vendors to enable V development while
>> the full userspace stuff is sorted out.
>>
>> Indeed RISCV_ATTRIBUTES section has -march info, but we need to do some
>> development around it to parse it and use it.
> I don't think RISCV_ATTRIBUTES is the right place to check that -

What timing. I just finished testing an initial kernel patch to parse the
ELF section and am on to tag parsing now ;-)
https://git.kernel.org/pub/scm/linux/kernel/git/vgupta/linux.git/log/?h=topic-elf-attr 

> even if the
> program compiles without V, it still can enable V and then get performance
> benefit by ifunc in glibc, or even some 3rd party libraries might also be
> optimized with V ext.

Right, the kernel can only handle the dynamic executable and/or the loader
itself. If V is used distro-wide, we are covered.
It can then also pass this info along (V enabled as HWCAP*, no need for
everything)

But you are not suggesting that there is a scenario with executable 
built somehow with V instructions (even .byte encoded) but not have that 
info encoded in RV_ATTR_TAG_arch string. And I'd argue that it is user 
error, they need to make sure that -march had 'v' passed to compiler 
and/or assembler.

> And don't forget other shared libraries in the system,

No, I've not forgotten about shared libs (and there's also the case of a
non-V-built executable dlopen-ing a V-built DSO), which can't be handled by
the above.

> are we going to check all dependent libraries at program load time?
> it will require resolving the library dependency at kernel.
> Or we intend to enable V only if executable compiles with V?

So we need a similar parsing in glibc loader which creates a union of "V 
enabled in any lib" and then invokes the prctl to enable, if it is not 
already.
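
As a rough sketch of that loader-side union (not actual glibc code): the
helper names below are made up, and it assumes the loader can already
obtain each object's Tag_RISCV_arch string in lowercase form. The check
treats only the single-letter 'v' extension as "V enabled" and skips
multi-letter z*/s*/x* extensions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if an arch string such as "rv64gcv" or "rv64i2p0_..._v1p0" lists V. */
static bool arch_has_v(const char *arch)
{
    assert(arch != NULL);
    if (strncmp(arch, "rv32", 4) != 0 && strncmp(arch, "rv64", 4) != 0)
        return false;

    const char *p = arch + 4;
    while (*p != '\0') {
        if (*p == '_') {
            p++;
            continue;
        }
        /* Multi-letter extensions (z*, s*, x*): skip to the next '_'. */
        if (*p == 'z' || *p == 's' || *p == 'x') {
            while (*p != '\0' && *p != '_')
                p++;
            continue;
        }
        if (*p == 'v')
            return true;
        p++; /* some other single-letter extension */
        /* Skip an optional version suffix of the form <major>p<minor>. */
        while (isdigit((unsigned char)*p))
            p++;
        if (*p == 'p' && isdigit((unsigned char)p[1])) {
            p++;
            while (isdigit((unsigned char)*p))
                p++;
        }
    }
    return false;
}

/* Union over all loaded objects: true if any of them declares V. */
static bool any_object_needs_v(const char *const *arch_strings, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (arch_has_v(arch_strings[i]))
            return true;
    return false;
}
```

The loader would call the enabling prctl once if any_object_needs_v()
returns true and V is not already on; the same check would have to run
again for objects brought in later via dlopen.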

-Vineet

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break
  2023-01-09 19:16                                           ` Vineet Gupta
@ 2023-01-10 13:21                                             ` Kito Cheng
  2023-01-10 18:07                                               ` Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break) Vineet Gupta
  0 siblings, 1 reply; 79+ messages in thread
From: Kito Cheng @ 2023-01-10 13:21 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Philipp Tomsich, Andy Chiu, Richard Henderson, Vincent Chen,
	Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

Hi Vineet:


> But you are not suggesting that there is a scenario with executable
> built somehow with V instructions (even .byte encoded) but not have that
> info encoded in RV_ATTR_TAG_arch string. And I'd argue that it is user
> error, they need to make sure that -march had 'v' passed to compiler
> and/or assembler.


The concept of the Tag_RISCV_arch attribute is the minimal execution
environment requirement of the executable or shared library. Using
glibc as an example: we can compile glibc with rv64gc only, and it
can still contain vector-optimized routines like memcpy, and
those functions are resolved by ifunc, which means those routines are
only used when the vector extension is available. So the Tag_RISCV_arch
for glibc is rv64gc, not rv64gcv, since V is not a minimal execution
environment requirement.
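
To illustrate the runtime-selection pattern, here is a small model that
uses an ordinary function pointer instead of a real ifunc so that it
builds anywhere. The function names are invented for this sketch, and the
"vector" variant is only a placeholder; in a real library it would be the
one function built with V enabled. The hwcap value is passed in as a
parameter here; on Linux it would come from getauxval(AT_HWCAP).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Linux reports single-letter RISC-V extensions in AT_HWCAP, one bit per
   letter ('A' is bit 0), so V is bit 21. */
#define HWCAP_ISA_V (1UL << ('V' - 'A'))

typedef void *(*memcpy_fn)(void *dst, const void *src, size_t n);

/* Baseline variant: only needs the minimal (rv64gc) environment. */
static void *memcpy_generic(void *dst, const void *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* Placeholder for a V-optimized variant; it just defers to memcpy here. */
static void *memcpy_vector(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* What an ifunc resolver would do: pick a variant from the hwcap bits. */
static memcpy_fn resolve_memcpy(unsigned long hwcap)
{
    return (hwcap & HWCAP_ISA_V) ? memcpy_vector : memcpy_generic;
}
```

The library as a whole still only requires rv64gc, because the vector
variant is never reached unless the V bit is reported at run time.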

My expectation is that most distros will still distribute rv64gc for a
while and then optimize functions with the vector extension for some
libraries, and that vector code will be guarded by some runtime check
mechanism, maybe IFUNC, so Tag_RISCV_arch for those libraries won't
contain V.

This is not clear in the psABI spec, but we intend to fix that in the future:
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/292

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-10 13:21                                             ` Kito Cheng
@ 2023-01-10 18:07                                               ` Vineet Gupta
  2023-01-11  1:22                                                 ` Richard Henderson via Libc-alpha
  0 siblings, 1 reply; 79+ messages in thread
From: Vineet Gupta @ 2023-01-10 18:07 UTC (permalink / raw)
  To: Kito Cheng
  Cc: Philipp Tomsich, Andy Chiu, Richard Henderson, Vincent Chen,
	Florian Weimer, Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

Hi Kito,

On 1/10/23 05:21, Kito Cheng wrote:
> Hi Vineet:
>
>
>> But you are not suggesting that there is a scenario with executable
>> built somehow with V instructions (even .byte encoded) but not have that
>> info encoded in RV_ATTR_TAG_arch string. And I'd argue that it is user
>> error, they need to make sure that -march had 'v' passed to compiler
>> and/or assembler.
>
> The concept of Tag_RISCV_arch attribute is minimal execution
> environment requirement of the executable or shared libraries; use
> glibc as an example, we can compile glibc with rv64gc only and then it
> can contain vector optimized routines like memcpy and memcpy, and
> those function are resolved by ifunc, which means only use those
> routines when vector extension are available, so the Tag_RISCV_arch
> for the glibc is rv64gc, not rv64gcv since V is not minimal execution
> environment requirement.

I understand where you are coming from. This "minimal" info can be used 
in a "compile-once-used-multiple" kind of a paradigm where a glibc with 
V enabled ifunc can still run on non-V hardware.


> My expectation is most distro will still distribute with rv64gc for a
> while and then optimize function with vector extension for some
> libraries, and those vector code will guarded with some runtime check
> mechanism maybe IFUNC, so Tag_RISCV_arch for those libraries won't
> contain V.

Yes, the bulk of glibc might not have vector code, but those V ifunc routines
do, and IMO this information needs to be recorded somewhere in the ELF.
Case in point being the current issue with how to enable the V unit.
The community wants a per-process enable, using an explicit prctl from
userspace (since RV doesn't have a fault-on-first-use hardware mechanism,
unlike some of the other arches). But how does the glibc loader know to
invoke prctl? We can't just rely on user env GLIBC_TUNABLES etc. since
that might not be accurate. It needs something concrete, which IMO can
come from ELF attributes. If not, do you have suggestions on how to
solve this issue?

Granted the case of executable itself using V insns directly is less 
likely than the linked/dlopen dso, so we can punt this being done in 
kernel elf loader and do it in the glibc loader for the DT_NEEDED dsos.

> It's not clear in psABI spec, but intend to fix in future:
> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/292

Please don't change the semantics of Tag_RISCV_arch itself. Keep the
minimum if you want, but also have something which reflects the absolute
-march used to build. If nothing else, it can be used to annotate binaries
with how they were built.

Thx,
-Vineet

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-10 18:07                                               ` Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break) Vineet Gupta
@ 2023-01-11  1:22                                                 ` Richard Henderson via Libc-alpha
  2023-01-11  4:28                                                   ` Jeff Law
                                                                     ` (2 more replies)
  0 siblings, 3 replies; 79+ messages in thread
From: Richard Henderson via Libc-alpha @ 2023-01-11  1:22 UTC (permalink / raw)
  To: Vineet Gupta, Kito Cheng
  Cc: Philipp Tomsich, Andy Chiu, Vincent Chen, Florian Weimer,
	Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

On 1/10/23 10:07, Vineet Gupta wrote:
> Yes bulk of glibc might not have vector code, but those V ifunc routines do and IMO this 
> information needs to be recorded somewhere in the elf. Case in point being the current 
> issue with how to enable V unit. Community wants a per-process enable, using an explicit 
> prctl from userspace (since RV doesn't have fault-on-first use hardware mechanism unlike 
> some of the other arches). But how does the glibc loader know to invoke prctl. We can't 
> just rely on user env GLIBC_TUNABLES etc since that might not be accurate. It needs
> something concrete which IMO can come from elf attributes. If not, do you have suggestions
> on how to solve this issue ?

Why not just fault on first use to enable?  That's vastly less complicated than trying to 
plumb anything through elf resulting in a prctl.

You might say "but the fault could fail to allocate memory", but honestly, the prctl isn't 
able to fail either -- if it doesn't work, the process must exit.  And, surely, there's 
some minimal vector configuration for which the allocation must succeed.


r~

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-11  1:22                                                 ` Richard Henderson via Libc-alpha
@ 2023-01-11  4:28                                                   ` Jeff Law
  2023-01-11  4:57                                                     ` Richard Henderson via Libc-alpha
  2023-01-11  5:05                                                   ` Anup Patel
  2023-01-11  5:23                                                   ` Richard Henderson via Libc-alpha
  2 siblings, 1 reply; 79+ messages in thread
From: Jeff Law @ 2023-01-11  4:28 UTC (permalink / raw)
  To: Richard Henderson, Vineet Gupta, Kito Cheng
  Cc: Philipp Tomsich, Andy Chiu, Vincent Chen, Florian Weimer,
	Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library



On 1/10/23 18:22, Richard Henderson wrote:
> On 1/10/23 10:07, Vineet Gupta wrote:
>> Yes bulk of glibc might not have vector code, but those V ifunc 
>> routines do and IMO this information needs to be recorded somewhere in 
>> the elf. Case in point being the current issue with how to enable V 
>> unit. Community wants a per-process enable, using an explicit prctl 
>> from userspace (since RV doesn't have fault-on-first use hardware 
>> mechanism unlike some of the other arches). But how does the glibc 
>> loader know to invoke prctl. We can't just rely on user env 
>> GLIBC_TUNABLES etc since that might not be accurate. It needs something
>> concrete which IMO can come from elf attributes. If not, do you have 
>> suggestions on how to solve this issue ?
> 
> Why not just fault on first use to enable?  That's vastly less 
> complicated than trying to plumb anything through elf resulting in a prctl.
Well, the answer is in Vineet's paragraph -- the hardware apparently 
doesn't have fault-on-first-use which is mighty unfortunate.

Jeff

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-11  4:28                                                   ` Jeff Law
@ 2023-01-11  4:57                                                     ` Richard Henderson via Libc-alpha
  2023-01-11  5:07                                                       ` Jeff Law
  0 siblings, 1 reply; 79+ messages in thread
From: Richard Henderson via Libc-alpha @ 2023-01-11  4:57 UTC (permalink / raw)
  To: Jeff Law, Vineet Gupta, Kito Cheng
  Cc: Philipp Tomsich, Andy Chiu, Vincent Chen, Florian Weimer,
	Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

On 1/10/23 20:28, Jeff Law wrote:
> 
> 
> On 1/10/23 18:22, Richard Henderson wrote:
>> On 1/10/23 10:07, Vineet Gupta wrote:
>>> Yes bulk of glibc might not have vector code, but those V ifunc routines do and IMO 
>>> this information needs to be recorded somewhere in the elf. Case in point being the 
>>> current issue with how to enable V unit. Community wants a per-process enable, using an 
>>> explicit prctl from userspace (since RV doesn't have fault-on-first use hardware 
>>> mechanism unlike some of the other arches). But how does the glibc loader know to 
>>> invoke prctl. We can't just rely on user env GLIBC_TUNABLES etc since that might not be
>>> accurate. It needs something concrete which IMO can come from elf attributes. If not,
>>> do you have suggestions on how to solve this issue ?
>>
>> Why not just fault on first use to enable?  That's vastly less complicated than trying 
>> to plumb anything through elf resulting in a prctl.
> Well, the answer is in Vineet's paragraph -- the hardware apparently doesn't have 
> fault-on-first-use which is mighty unfortunate.

Nonsense -- sstatus.vs stores {off, initial, clean, dirty} state, just like fpu.
Now treat the vector unit just like fpu lazy migration.


r~

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-11  1:22                                                 ` Richard Henderson via Libc-alpha
  2023-01-11  4:28                                                   ` Jeff Law
@ 2023-01-11  5:05                                                   ` Anup Patel
  2023-01-11  5:23                                                   ` Richard Henderson via Libc-alpha
  2 siblings, 0 replies; 79+ messages in thread
From: Anup Patel @ 2023-01-11  5:05 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Vineet Gupta, Kito Cheng, Philipp Tomsich, Andy Chiu,
	Vincent Chen, Florian Weimer, Rich Felker, Andrew Waterman,
	Palmer Dabbelt, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

On Wed, Jan 11, 2023 at 6:53 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 1/10/23 10:07, Vineet Gupta wrote:
> > Yes bulk of glibc might not have vector code, but those V ifunc routines do and IMO this
> > information needs to be recorded somewhere in the elf. Case in point being the current
> > issue with how to enable V unit. Community wants a per-process enable, using an explicit
> > prctl from userspace (since RV doesn't have fault-on-first use hardware mechanism unlike
> > some of the other arches). But how does the glibc loader know to invoke prctl. We can't
> > just rely on user env GLIBC_TUNABLES etc since that might not be accurate. It needs
> > something concrete which IMO can come from elf attributes. If not, do you have suggestions
> > on how to solve this issue ?
>
> Why not just fault on first use to enable?  That's vastly less complicated than trying to
> plumb anything through elf resulting in a prctl.
>
> You might say "but the fault could fail to allocate memory", but honestly, the prctl isn't
> able to fail either -- if it doesn't work, the process must exit.  And, surely, there's
> some minimal vector configuration for which the allocation must succeed.

IMO, this is a very good suggestion.

For the benefit of everyone, both sstatus.FS and sstatus.VS have the
following states:
1. Off (0): All off and any access to float / vector will result in exception
2. Initial (1): None dirty or clean, some on
3. Clean (2): None dirty, some clean
4. Dirty (3): Some dirty

For float, we are setting sstatus.FS = 1 (Initial) in start_thread() by
default for all tasks and we are doing lazy save-restore in fstate_save()
and fstate_restore().

For vector, we can take a different approach where start_thread()
will by default set sstatus.VS = 0 (Off) for all tasks. Now whenever
any task accesses vector state, Linux RISC-V will get an exception
and at that point in time we can allocate memory for the vector
state and also set sstatus.VS = 1 (Initial) for that task. The save
restore of the vector state can still be lazy for the tasks using it.
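
For illustration only, the flow above could be modelled in plain C roughly
as follows. This is a userspace toy, not kernel code: the struct and
function names are invented for the sketch, calloc() stands in for the
kernel allocation, and a real trap handler would also decode the faulting
instruction to confirm it is a vector instruction before enabling
anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* The four sstatus.VS encodings listed above. */
enum vs_state { VS_OFF = 0, VS_INITIAL = 1, VS_CLEAN = 2, VS_DIRTY = 3 };

/* Toy per-task state: the VS field plus a lazily allocated save area. */
struct task {
    enum vs_state vs;
    void *vstate;
    size_t vstate_size;
};

/* New tasks start with the vector unit off and no save area. */
static void start_thread_sketch(struct task *t, size_t vstate_size)
{
    assert(t != NULL);
    t->vs = VS_OFF;
    t->vstate = NULL;
    t->vstate_size = vstate_size;
}

/* Called on the illegal-instruction trap raised by a vector access.
   Returns true if the instruction should simply be retried. */
static bool handle_vector_trap(struct task *t)
{
    assert(t != NULL);
    if (t->vs != VS_OFF)
        return false; /* V was already on: a genuine illegal instruction */

    t->vstate = calloc(1, t->vstate_size);
    if (t->vstate == NULL)
        return false; /* allocation failed: caller has to kill the task */

    t->vs = VS_INITIAL;
    return true;
}
```

Tasks that never touch V stay in the Off state and never pay for the save
area, which is the point of defaulting to Off.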

Regards,
Anup

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-11  4:57                                                     ` Richard Henderson via Libc-alpha
@ 2023-01-11  5:07                                                       ` Jeff Law
  2023-01-11  6:00                                                         ` Andy Chiu
  0 siblings, 1 reply; 79+ messages in thread
From: Jeff Law @ 2023-01-11  5:07 UTC (permalink / raw)
  To: Richard Henderson, Vineet Gupta, Kito Cheng
  Cc: Philipp Tomsich, Andy Chiu, Vincent Chen, Florian Weimer,
	Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library



On 1/10/23 21:57, Richard Henderson wrote:
> On 1/10/23 20:28, Jeff Law wrote:
>>
>>
>> On 1/10/23 18:22, Richard Henderson wrote:
>>> On 1/10/23 10:07, Vineet Gupta wrote:
>>>> Yes bulk of glibc might not have vector code, but those V ifunc 
>>>> routines do and IMO this information needs to be recorded somewhere 
>>>> in the elf. Case in point being the current issue with how to enable 
>>>> V unit. Community wants a per-process enable, using an explicit 
>>>> prctl from userspace (since RV doesn't have fault-on-first use 
>>>> hardware mechanism unlike some of the other arches). But how does 
>>>> the glibc loader know to invoke prctl. We can't just rely on user 
>>>> env GLIBC_TUNABLE etc since that might not be accurate. It needs 
>>>> something concrete which IMO can come from elf attributes. If not, 
>>>> do you have suggestions on how to solve this issue ?
>>>
>>> Why not just fault on first use to enable?  That's vastly less 
>>> complicated than trying to plumb anything through elf resulting in a 
>>> prctl.
>> Well, the answer is in Vineet's paragraph -- the hardware apparently 
>> doesn't have fault-on-first-use which is mighty unfortunate.
> 
> Nonsense -- sstatus.vs stores {off, initial, clean, dirty} state, just 
> like fpu.
> Now treat the vector unit just like fpu lazy migration.
Then let's do something sensible.    Manually enabling via prctl seems 
silly if we have fault on first use.

jeff

* Re: Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-11  1:22                                                 ` Richard Henderson via Libc-alpha
  2023-01-11  4:28                                                   ` Jeff Law
  2023-01-11  5:05                                                   ` Anup Patel
@ 2023-01-11  5:23                                                   ` Richard Henderson via Libc-alpha
  2 siblings, 0 replies; 79+ messages in thread
From: Richard Henderson via Libc-alpha @ 2023-01-11  5:23 UTC (permalink / raw)
  To: Vineet Gupta, Kito Cheng
  Cc: Philipp Tomsich, Andy Chiu, Vincent Chen, Florian Weimer,
	Rich Felker, Andrew Waterman, Palmer Dabbelt,
	Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

On 1/10/23 17:22, Richard Henderson wrote:
> And, surely, there's some minimal vector configuration for which the allocation must succeed.

To answer my own question here, no, there does not seem to be a way to cap VLMAX in the OS 
or the hypervisor -- vsetvli rd, x0, e8 will always set VL to the VLMAX for which the cpu 
is configured.

(ARM SVE can artificially limit the vector length.  Linux chooses the default vector 
length so that state fits within the existing 4k signal stack frame.  This is good enough 
for the vector usage within e.g. strlen.  In order to take advantage of any larger vector 
length the hardware may support, one must use a prctl.)


r~

* Re: Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-11  5:07                                                       ` Jeff Law
@ 2023-01-11  6:00                                                         ` Andy Chiu
  2023-01-11  6:20                                                           ` Jeff Law
  0 siblings, 1 reply; 79+ messages in thread
From: Andy Chiu @ 2023-01-11  6:00 UTC (permalink / raw)
  To: Jeff Law
  Cc: Richard Henderson, Vineet Gupta, Kito Cheng, Philipp Tomsich,
	Vincent Chen, Florian Weimer, Rich Felker, Andrew Waterman,
	Palmer Dabbelt, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

On Wed, Jan 11, 2023 at 1:07 PM Jeff Law <jlaw@ventanamicro.com> wrote:
>
>
>
> On 1/10/23 21:57, Richard Henderson wrote:
> > On 1/10/23 20:28, Jeff Law wrote:
> >>
> >>
> >> On 1/10/23 18:22, Richard Henderson wrote:
> >>> On 1/10/23 10:07, Vineet Gupta wrote:
> >>>> Yes bulk of glibc might not have vector code, but those V ifunc
> >>>> routines do and IMO this information needs to be recorded somewhere
> >>>> in the elf. Case in point being the current issue with how to enable
> >>>> V unit. Community wants a per-process enable, using an explicit
> >>>> prctl from userspace (since RV doesn't have fault-on-first use
> >>>> hardware mechanism unlike some of the other arches). But how does
> >>>> the glibc loader know to invoke prctl. We can't just rely on user
> >>>> env GLIBC_TUNABLE etc since that might not be accurate. It needs
> >>>> something concrete which IMO can come from elf attributes. If not,
> >>>> do you have suggestions on how to solve this issue ?
> >>>
> >>> Why not just fault on first use to enable?  That's vastly less
> >>> complicated than trying to plumb anything through elf resulting in a
> >>> prctl.
> >> Well, the answer is in Vineet's paragraph -- the hardware apparently
> >> doesn't have fault-on-first-use which is mighty unfortunate.
> >
> > Nonsense -- sstatus.vs stores {off, initial, clean, dirty} state, just
> > like fpu.
> > Now treat the vector unit just like fpu lazy migration.
> Then let's do something sensible.    Manually enabling via prctl seems
> silly if we have fault on first use.
Yes, faulting on first use is a viable approach. However, my
concern is that on a system whose libraries have common V-optimized
routines such as memcpy and memset, essentially every process would
trap to m-mode at startup. That may cost more than a prctl syscall.
And if every process on the system wants to benefit from V-optimized
ifuncs, then an additional prctl call at start time seems tedious as
well.

Andy

* Re: Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-11  6:00                                                         ` Andy Chiu
@ 2023-01-11  6:20                                                           ` Jeff Law
  2023-01-11  9:28                                                             ` Andy Chiu
  0 siblings, 1 reply; 79+ messages in thread
From: Jeff Law @ 2023-01-11  6:20 UTC (permalink / raw)
  To: Andy Chiu
  Cc: Richard Henderson, Vineet Gupta, Kito Cheng, Philipp Tomsich,
	Vincent Chen, Florian Weimer, Rich Felker, Andrew Waterman,
	Palmer Dabbelt, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library



On 1/10/23 23:00, Andy Chiu wrote:
> On Wed, Jan 11, 2023 at 1:07 PM Jeff Law <jlaw@ventanamicro.com> wrote:
>>
>>
>>
>> On 1/10/23 21:57, Richard Henderson wrote:
>>> On 1/10/23 20:28, Jeff Law wrote:
>>>>
>>>>
>>>> On 1/10/23 18:22, Richard Henderson wrote:
>>>>> On 1/10/23 10:07, Vineet Gupta wrote:
>>>>>> Yes bulk of glibc might not have vector code, but those V ifunc
>>>>>> routines do and IMO this information needs to be recorded somewhere
>>>>>> in the elf. Case in point being the current issue with how to enable
>>>>>> V unit. Community wants a per-process enable, using an explicit
>>>>>> prctl from userspace (since RV doesn't have fault-on-first use
>>>>>> hardware mechanism unlike some of the other arches). But how does
>>>>>> the glibc loader know to invoke prctl. We can't just rely on user
>>>>>> env GLIBC_TUNABLE etc since that might not be accurate. It needs
>>>>>> something concrete which IMO can come from elf attributes. If not,
>>>>>> do you have suggestions on how to solve this issue ?
>>>>>
>>>>> Why not just fault on first use to enable?  That's vastly less
>>>>> complicated than trying to plumb anything through elf resulting in a
>>>>> prctl.
>>>> Well, the answer is in Vineet's paragraph -- the hardware apparently
>>>> doesn't have fault-on-first-use which is mighty unfortunate.
>>>
>>> Nonsense -- sstatus.vs stores {off, initial, clean, dirty} state, just
>>> like fpu.
>>> Now treat the vector unit just like fpu lazy migration.
>> Then let's do something sensible.    Manually enabling via prctl seems
>> silly if we have fault on first use.
> Yes, faulting on first use is a viable way of approaching. However, my
> concern is that doing this on a system with libraries having common
> V-optimized routines such as memcpy, memset would essentially trap
> every process to m-mode starting up. This might take more cost than a
> prctl syscall. And if every process on the system wants to be
> benefited from V-optimized ifuncs, then having an additional prctl to
> call at start time seems tedious as well.
>
It's not perfect, but it's workable.  Explicitly turning things on seems 
like madness.  It boils down to having to annotate every binary and DSO 
and also adds complexity to JITs, the dynamic loader and probably all 
kinds of places we haven't thought through yet.

Fault on first use is well understood and has been implemented on many 
architectures through the decades, even with its warts.

jeff

* Re: Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-11  6:20                                                           ` Jeff Law
@ 2023-01-11  9:28                                                             ` Andy Chiu
  2023-01-11 12:13                                                               ` Andy Chiu
  0 siblings, 1 reply; 79+ messages in thread
From: Andy Chiu @ 2023-01-11  9:28 UTC (permalink / raw)
  To: Jeff Law
  Cc: Richard Henderson, Vineet Gupta, Kito Cheng, Philipp Tomsich,
	Vincent Chen, Florian Weimer, Rich Felker, Andrew Waterman,
	Palmer Dabbelt, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

On Wed, Jan 11, 2023 at 2:20 PM Jeff Law <jlaw@ventanamicro.com> wrote:
> Fault on first use is well understood and has been implemented on many
> architectures through the decades, even with its warts.

Unfortunately, we don't have a direct way of telling whether an
illegal instruction exception was caused by illegitimate use of V
instructions. Unlike ARM64, where reading ESR_EL1.EC is enough to
distinguish the fault, we may have to perform a sw decode on the
faulting instruction and then see whether it is a first-use fault or
a more general illegal instruction fault.

Yes, we may just enable V for a process whenever the illegal
instruction turns out to be an OP-V major opcode, or a LOAD/STORE-FP
with a vector-encoded width. But it could get messy if later
extensions would also like to be enabled by a first-use fault (e.g.
ARM has SME after SVE). And implementing this decoding logic in sw
just seems redundant to me because hw has already done it for us.
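
The opcode check described above can be sketched as follows. This is
an illustration only, not the code from any kernel series; the
function name insn_is_vector() is hypothetical. It assumes the RVV
1.0 encoding: OP-V is major opcode 0x57, and vector loads/stores share
LOAD-FP (0x07) and STORE-FP (0x27) with scalar FP, using width values
0b000, 0b101, 0b110 and 0b111 in bits 14:12.

```c
#include <stdbool.h>
#include <stdint.h>

#define OPC_MASK     0x7fu
#define OPC_LOAD_FP  0x07u
#define OPC_STORE_FP 0x27u
#define OPC_OP_V     0x57u

/* Width field (bits 14:12) of a LOAD-FP/STORE-FP instruction.
 * Scalar FP uses 1..4; vector element widths use 0, 5, 6 and 7. */
static bool width_is_vector(uint32_t insn) {
  uint32_t width = (insn >> 12) & 0x7u;
  return width == 0u || width >= 5u;
}

/* Returns true if a 32-bit instruction word belongs to the V
 * extension according to its major opcode (and width, for memory ops). */
bool insn_is_vector(uint32_t insn) {
  switch (insn & OPC_MASK) {
  case OPC_OP_V:
    return true;
  case OPC_LOAD_FP:
  case OPC_STORE_FP:
    return width_is_vector(insn);
  default:
    return false;
  }
}
```

For instance, vadd.vv v1, v2, v3 (0x022180d7) and vle8.v v1, (a0)
(0x02050087) are classified as vector, while flw fa0, 0(a0)
(0x00052507) is not.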

Besides, ARM64 has separate trap controls for the use of FP-related
units in EL1 and EL0, so SIMD running in kernel mode does not need an
additional instruction to enable the unit. I assume that these kinds
of CSR-controlling instructions have to flush hw internal buffers to
some extent, which adds latency.

Andy

* Re: Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-11  9:28                                                             ` Andy Chiu
@ 2023-01-11 12:13                                                               ` Andy Chiu
  2023-01-23 12:17                                                                 ` Conor Dooley via Libc-alpha
  0 siblings, 1 reply; 79+ messages in thread
From: Andy Chiu @ 2023-01-11 12:13 UTC (permalink / raw)
  To: Jeff Law
  Cc: Richard Henderson, Vineet Gupta, Kito Cheng, Philipp Tomsich,
	Vincent Chen, Florian Weimer, Rich Felker, Andrew Waterman,
	Palmer Dabbelt, Christoph Müllner, davidlt, Arnd Bergmann,
	Björn Töpel, Szabolcs Nagy, Greentime Hu, Aaron Durbin,
	Andrew de los Reyes, linux-riscv, GNU C Library

On Wed, Jan 11, 2023 at 5:28 PM Andy Chiu <andy.chiu@sifive.com> wrote:
>
> On Wed, Jan 11, 2023 at 2:20 PM Jeff Law <jlaw@ventanamicro.com> wrote:
> > Fault on first use is well understood and has been implemented on many
> > architectures through the decades, even with its warts.
>
> Unfortunately, we don't have a direct way of acknowledging if an
> illegal instruction is caused by illegitimate use of V instructions.
> Unlike ARM64, where reading ESR_EL1.EC is enough to distinguish the
> fault, we may have to perform a sw decode on the faulting instruction.
> Then see if it is the first-use fault, or a more general illegal
> instruction fault.
After further consideration, I think this could be minor. The
first V-instruction of a valid program that uses Vector is limited to
vset{i}vl{i}, vl<nf>r, or vs<nf>r, and perhaps some r/w of
vector-specific CSRs. Decoding these instructions should be relatively
constrained and easy. And we need this decoding only once for each
process since we don't have to do lazy save/restore.
>
> Yes, we may just enable V for a process whenever we find an OP-V major
> opcode, or a LOAD/STORE-FP with vector-encoded width on illegal
> instruction. But it could be kind of messy, IF, later extensions would
> also like to be enabled at first-use-fault. (e.g. ARM has SME followed
> by SVE). And implementing this decoding logic in sw just seems
> redundant to me because hw has already done that for us.
Let's limit our discussion to the scope of VS enablement for now.
>
> Besides, ARM64 has individual mappings of traps for the use of
> FP-related units in EL1 and EL0. So SIMD running in kernel mode would
> not take additional instruction to enable the unit. I assume these
> kinds of CSR-controlling instructions would have to flush hw internal
> buffers to some extent. And doing these takes additional latencies.
We already do some VS/FS settings on the entry of kernel code. So this
should be minor as well.

Anyway, I agree that faulting on first use is a better way to make
per-process control of VS feasible. Sorry for disturbing the list.

Thanks,
Andy

* Re: Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-11 12:13                                                               ` Andy Chiu
@ 2023-01-23 12:17                                                                 ` Conor Dooley via Libc-alpha
  2023-01-23 13:29                                                                   ` Andy Chiu
  0 siblings, 1 reply; 79+ messages in thread
From: Conor Dooley via Libc-alpha @ 2023-01-23 12:17 UTC (permalink / raw)
  To: Andy Chiu
  Cc: Jeff Law, Richard Henderson, Vineet Gupta, Kito Cheng,
	Philipp Tomsich, Vincent Chen, Florian Weimer, Rich Felker,
	Andrew Waterman, Palmer Dabbelt, Christoph Müllner, davidlt,
	Arnd Bergmann, Björn Töpel, Szabolcs Nagy, Greentime Hu,
	Aaron Durbin, Andrew de los Reyes, linux-riscv, GNU C Library

Hey Andy!

On Wed, Jan 11, 2023 at 08:13:27PM +0800, Andy Chiu wrote:
> On Wed, Jan 11, 2023 at 5:28 PM Andy Chiu <andy.chiu@sifive.com> wrote:
> >
> > On Wed, Jan 11, 2023 at 2:20 PM Jeff Law <jlaw@ventanamicro.com> wrote:
> > > Fault on first use is well understood and has been implemented on many
> > > architectures through the decades, even with its warts.
> >
> > Unfortunately, we don't have a direct way of acknowledging if an
> > illegal instruction is caused by illegitimate use of V instructions.
> > Unlike ARM64, where reading ESR_EL1.EC is enough to distinguish the
> > fault, we may have to perform a sw decode on the faulting instruction.
> > Then see if it is the first-use fault, or a more general illegal
> > instruction fault.
> After taking more considerations, I think this could be minor. The
> first V-instruction of a valid program that uses Vector is limited to
> vset{i}vl{i}, vl<nf>r, or vs<nf>r. And perhaps some r/w of
> vector-specific CSRs. Decoding these instructions should be relatively
> constrained and easy. And we need this decoding only once for each
> process since we don't have to do lazy save/restore.
> >
> > Yes, we may just enable V for a process whenever we find an OP-V major
> > opcode, or a LOAD/STORE-FP with vector-encoded width on illegal
> > instruction. But it could be kind of messy, IF, later extensions would
> > also like to be enabled at first-use-fault. (e.g. ARM has SME followed
> > by SVE). And implementing this decoding logic in sw just seems
> > redundant to me because hw has already done that for us.
> Let's limit our discussion to the scope of VS enablement for now.
> >
> > Besides, ARM64 has individual mappings of traps for the use of
> > FP-related units in EL1 and EL0. So SIMD running in kernel mode would
> > not take additional instruction to enable the unit. I assume these
> > kinds of CSR-controlling instructions would have to flush hw internal
> > buffers to some extent. And doing these takes additional latencies.
> We already do some VS/FS settings on the entry of kernel code. So this
> should be minor as well.
> 
> Anyway, I agree that faulting on first-uses is a better way to make
> per-process control of VS feasible.

> Sorry for disturbing the list.

Meh, all of these discussions seem worthwhile to me!

Now that things have died down though, I'm curious - what are your
plans? Still going to submit another version of this series?

Thanks,
Conor.


* Re: Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break)
  2023-01-23 12:17                                                                 ` Conor Dooley via Libc-alpha
@ 2023-01-23 13:29                                                                   ` Andy Chiu
  0 siblings, 0 replies; 79+ messages in thread
From: Andy Chiu @ 2023-01-23 13:29 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Jeff Law, Richard Henderson, Vineet Gupta, Kito Cheng,
	Philipp Tomsich, Vincent Chen, Florian Weimer, Rich Felker,
	Andrew Waterman, Palmer Dabbelt, Christoph Müllner, davidlt,
	Arnd Bergmann, Björn Töpel, Szabolcs Nagy, Greentime Hu,
	Aaron Durbin, Andrew de los Reyes, linux-riscv, GNU C Library

Hey Conor,

On Mon, Jan 23, 2023 at 8:18 PM Conor Dooley <conor.dooley@microchip.com> wrote:
> Meh, all of these discussions seem worthwhile to me!
>
> Now that things have died down though, I'm curious - what are your
> plans? Still going to submit another version of this series?
>
Yes, we have implemented most of it and are planning to send the
series in the coming days. Thanks to Vineet, who is helping me sort
out some last bits before the submission. Here are some points related
to this thread that will be in v13:

1. Allocate the V context in the first-use trap.
2. Drop the prctl V control because it conflicts with the idea of the
first-use trap.
3. sigframe/ptrace will not have a V context if a process's VS is off.
4. If the kernel is compiled with CONFIG_RISCV_ISA_V enabled, then the
auxv always reports the size of the sigframe as if there is a V context.
This is because user space may need information from auxv to set up an
alternative signal stack, and it may not know if it would use V. ARM64
also reports the size assuming all extensions are used.
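
As an illustration of the last point, user space could size an
alternate signal stack from the auxv entry roughly as below. This is
a sketch, not part of the series: min_sigstack_size() and
install_altstack() are hypothetical helpers, and the fallback
definition of AT_MINSIGSTKSZ as 51 is an assumption for older headers
that lack it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* for stack_t and sigaltstack() */
#endif
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/auxv.h>

#ifndef AT_MINSIGSTKSZ
#define AT_MINSIGSTKSZ 51  /* assumption: value from the Linux uapi headers */
#endif

/* Minimum signal stack size: the kernel-reported value if present,
 * but never less than the compile-time MINSIGSTKSZ. */
size_t min_sigstack_size(void) {
  unsigned long from_kernel = getauxval(AT_MINSIGSTKSZ);  /* 0 if absent */
  size_t fallback = (size_t)MINSIGSTKSZ;
  return from_kernel > fallback ? (size_t)from_kernel : fallback;
}

/* Allocate and register an alternate signal stack with some headroom
 * for the handler's own frames. Returns 0 on success, -1 on failure. */
int install_altstack(size_t headroom) {
  stack_t ss;

  ss.ss_size = min_sigstack_size() + headroom;
  ss.ss_sp = malloc(ss.ss_size);
  if (ss.ss_sp == NULL)
    return -1;
  ss.ss_flags = 0;

  if (sigaltstack(&ss, NULL) != 0) {
    free(ss.ss_sp);
    return -1;
  }
  return 0;
}
```

Because the reported size assumes a V context is present, a stack
sized this way stays large enough even if the process only starts
using V after the stack has been set up.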

Thanks,
Andy

Thread overview: 79+ messages
2021-09-13  1:41 [RFC patch 0/5] RISC-V: Add vector ISA support Vincent Chen
2021-09-13  1:41 ` [RFC patch 1/5] RISC-V: Remove riscv-specific sigcontext.h Vincent Chen
2021-09-13  1:41 ` [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion Vincent Chen
2021-09-13 13:44   ` Florian Weimer via Libc-alpha
2021-09-13 13:52     ` Rich Felker
2021-09-16  8:02       ` Vincent Chen
2021-09-16  8:14         ` Florian Weimer via Libc-alpha
2021-09-18  3:04           ` Vincent Chen
2022-12-09  3:39             ` RISCV kernel struct sigcontext expansion for V regs and potential glibc ABI break (was Re: [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion.) Vineet Gupta
2022-12-09  4:03               ` Vineet Gupta
2022-12-20 20:05               ` Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break Vineet Gupta
2022-12-21 15:53                 ` Vincent Chen
2022-12-21 19:45                   ` Vineet Gupta
2022-12-21 19:52                     ` Vineet Gupta
2022-12-22  3:37                       ` Vincent Chen
2022-12-22 19:25                         ` Vineet Gupta
2022-12-23  2:27                           ` Vincent Chen
2022-12-23 19:42                             ` Vineet Gupta
2022-12-22  5:32                       ` Richard Henderson via Libc-alpha
2022-12-22 18:33                         ` Andy Chiu
2022-12-22 20:27                           ` Vineet Gupta
2022-12-28 10:53                             ` Andy Chiu
2023-01-03 19:17                               ` Vineet Gupta
2023-01-04 16:34                                 ` Andy Chiu
2023-01-04 20:46                                   ` Vineet Gupta
2023-01-04 21:29                                     ` Philipp Tomsich
2023-01-04 21:37                                       ` Andrew Waterman
2023-01-04 22:43                                       ` Vineet Gupta
2023-01-09 13:33                                         ` Kito Cheng
2023-01-09 19:16                                           ` Vineet Gupta
2023-01-10 13:21                                             ` Kito Cheng
2023-01-10 18:07                                               ` Auto-enabling V unit and/or use of elf attributes (was Re: Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break) Vineet Gupta
2023-01-11  1:22                                                 ` Richard Henderson via Libc-alpha
2023-01-11  4:28                                                   ` Jeff Law
2023-01-11  4:57                                                     ` Richard Henderson via Libc-alpha
2023-01-11  5:07                                                       ` Jeff Law
2023-01-11  6:00                                                         ` Andy Chiu
2023-01-11  6:20                                                           ` Jeff Law
2023-01-11  9:28                                                             ` Andy Chiu
2023-01-11 12:13                                                               ` Andy Chiu
2023-01-23 12:17                                                                 ` Conor Dooley via Libc-alpha
2023-01-23 13:29                                                                   ` Andy Chiu
2023-01-11  5:05                                                   ` Anup Patel
2023-01-11  5:23                                                   ` Richard Henderson via Libc-alpha
2022-12-22 22:33                           ` Adding V-ext regs to signal context w/o expanding kernel struct sigcontext to avoid glibc ABI break Richard Henderson via Libc-alpha
2022-12-22 23:47                           ` Conor Dooley via Libc-alpha
2022-12-22 23:58                             ` Vineet Gupta
2022-12-22 20:30                         ` Vineet Gupta
2022-12-22 21:38                           ` Andrew Waterman
2022-12-22  1:50                     ` Vincent Chen
2022-12-22  5:34                     ` Richard Henderson via Libc-alpha
2021-09-16 23:56         ` [RFC patch 2/5] RISC-V: Reserve about 5K space in mcontext_t to support future ISA expansion Ben Woodard via Libc-alpha
2021-09-18  3:15           ` Vincent Chen
2021-09-20 16:41             ` DJ Delorie via Libc-alpha
2021-09-20 17:10               ` Florian Weimer via Libc-alpha
2021-10-01  1:43                 ` Vincent Chen
2021-10-01 12:08                   ` Adhemerval Zanella via Libc-alpha
2021-09-17 17:03         ` Rich Felker
2021-09-18  3:19           ` Vincent Chen
2021-09-13  1:41 ` [RFC patch 3/5] RISC-V: Save and restore VCSR when doing user context switch Vincent Chen
2021-09-14 23:48   ` Joseph Myers
2021-09-15  0:13     ` Andrew Waterman
2021-09-16  9:20       ` Vincent Chen
2021-10-01 13:04   ` Adhemerval Zanella via Libc-alpha
2021-09-13  1:41 ` [RFC patch 4/5] RISC-V: Extend MINSIGSTKSZ and SIGSTKSZ to backup RVV registers Vincent Chen
2021-09-13 13:51   ` Rich Felker
2021-09-16  9:25     ` Vincent Chen
2021-09-13  1:41 ` [RFC 5/5] RISC-V: Expand PTHREAD_STACK_MIN to support RVV environment Vincent Chen
2021-09-14 23:43   ` Joseph Myers
2021-09-15 10:42     ` Florian Weimer via Libc-alpha
2021-09-15 14:31       ` H.J. Lu via Libc-alpha
2021-09-16 10:21         ` Vincent Chen
2021-09-13 19:11 ` [RFC patch 0/5] RISC-V: Add vector ISA support Vineet Gupta via Libc-alpha
2021-09-15 19:37   ` Jim Wilson
2021-11-09 19:21 ` Darius Rad
2021-11-09 19:30   ` Andrew Waterman
2021-11-09 22:03     ` Darius Rad
2021-11-09 22:18       ` Andrew Waterman
2021-11-10 11:39         ` Darius Rad
