unofficial mirror of libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Palmer Dabbelt <palmer@dabbelt.com>
To: christoph.muellner@vrull.eu
Cc: libc-alpha@sourceware.org, adhemerval.zanella@linaro.org,
	Darius Rad <darius@bluespec.com>,
	Andrew Waterman <andrew@sifive.com>,
	philipp.tomsich@vrull.eu,  Evan Green <evan@rivosinc.com>,
	kito.cheng@sifive.com, jeffreyalaw@gmail.com,
	 Vineet Gupta <vineetg@rivosinc.com>,
	christoph.muellner@vrull.eu
Subject: Re: [RFC PATCH 3/3] RISC-V: Implement CPU yielding for busy loops with Zihintpause/Zawrs
Date: Thu, 18 Apr 2024 10:17:38 -0700 (PDT)	[thread overview]
Message-ID: <mhng-08eebb15-4e67-4616-abdd-64ea7522ab70@palmer-ri-x1c9a> (raw)
In-Reply-To: <20240418094635.3502009-4-christoph.muellner@vrull.eu>

On Thu, 18 Apr 2024 02:46:35 PDT (-0700), christoph.muellner@vrull.eu wrote:
> The macro atomic_spin_nop can be used to implement arch-specific
> CPU yielding that is used in busy loops (e.g. in pthread_spin_lock).
> This patch introduces an ifunc-based implementation for RISC-V,
> that uses Zihintpause's PAUSE instruction for that matter (as PAUSE
> is a HINT instruction there is not dependency to Zihintpause at
> runtime).  Further, we test for Zawrs via hwprobe() and if found
> we use WRS.STO instead of PAUSE.
>
> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> ---
>  sysdeps/riscv/multiarch/cpu-relax_generic.S   | 31 +++++++++++++++
>  sysdeps/riscv/multiarch/cpu-relax_zawrs.S     | 28 +++++++++++++
>  .../unix/sysv/linux/riscv/atomic-machine.h    |  3 ++
>  .../unix/sysv/linux/riscv/multiarch/Makefile  |  8 ++++
>  .../sysv/linux/riscv/multiarch/cpu-relax.c    | 39 +++++++++++++++++++
>  .../linux/riscv/multiarch/ifunc-impl-list.c   | 32 +++++++++++++--
>  6 files changed, 137 insertions(+), 4 deletions(-)
>  create mode 100644 sysdeps/riscv/multiarch/cpu-relax_generic.S
>  create mode 100644 sysdeps/riscv/multiarch/cpu-relax_zawrs.S
>  create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/cpu-relax.c
>
> diff --git a/sysdeps/riscv/multiarch/cpu-relax_generic.S b/sysdeps/riscv/multiarch/cpu-relax_generic.S
> new file mode 100644
> index 0000000000..d3ccfdce84
> --- /dev/null
> +++ b/sysdeps/riscv/multiarch/cpu-relax_generic.S
> @@ -0,0 +1,31 @@
> +/* CPU strand yielding for busy loops.  RISC-V version.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +.option push
> +.option arch, +zihintpause
> +ENTRY (__cpu_relax_generic)
> +	/* While we can use the `pause` instruction without
> +	   the need of Zihintpause (because it is a HINT instruction),
> +	   we still have to enable Zihintpause for the assembler.  */
> +	pause
> +	ret
> +END (__cpu_relax_generic)
> +.option pop
> diff --git a/sysdeps/riscv/multiarch/cpu-relax_zawrs.S b/sysdeps/riscv/multiarch/cpu-relax_zawrs.S
> new file mode 100644
> index 0000000000..6d27b354df
> --- /dev/null
> +++ b/sysdeps/riscv/multiarch/cpu-relax_zawrs.S
> @@ -0,0 +1,28 @@
> +/* CPU strand yielding for busy loops.  RISC-V version with Zawrs.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +.option push
> +.option arch, +zawrs
> +ENTRY (__cpu_relax_zawrs)
> +	wrs.sto

This has the same forward progress/eventual success violation as the 
code you sent for GCC and Linux does.  It doesn't really matter if the 
user of the reservation is in a builtin, an asm block, or a function.  
The compiler just doesn't know about those reservation rules and isn't 
going to generate code that follows them.

That said, as far as I can tell the same grey area in the WRS spec that 
I pointed out during one of the Linux reviews still exists.   So if 
there's some hardware that has a more generous set of LR->WRS rules than 
the ISA-required LR->SC rules that's fine, we just need to know what the 
hardware actually is so we can write down the rules and then stick to 
them -- it's just a performance thing for WRS, so having weaker rules 
seems entirely plausible.

I don't know of any hardware with WRS, but sorry if I missed one.  Do 
you have something publicly availiable that behaves this way?

> +	ret
> +END (__cpu_relax_zawrs)
> +.option pop
> diff --git a/sysdeps/unix/sysv/linux/riscv/atomic-machine.h b/sysdeps/unix/sysv/linux/riscv/atomic-machine.h
> index c1c9d949a0..02b9b7a421 100644
> --- a/sysdeps/unix/sysv/linux/riscv/atomic-machine.h
> +++ b/sysdeps/unix/sysv/linux/riscv/atomic-machine.h
> @@ -178,4 +178,7 @@
>  # error "ISAs that do not subsume the A extension are not supported"
>  #endif /* !__riscv_atomic */
>
> +extern void __cpu_relax (void);
> +#define atomic_spin_nop() __cpu_relax()
> +
>  #endif /* bits/atomic.h */
> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> index fcef5659d4..0cdf37a38b 100644
> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
> @@ -1,3 +1,11 @@
> +# nscd uses atomic_spin_nop which in turn requires cpu_relax
> +ifeq ($(subdir),nscd)
> +sysdep_routines += \
> +  cpu-relax \
> +  cpu-relax_generic \
> +  cpu-relax_zawrs
> +endif
> +
>  ifeq ($(subdir),string)
>  sysdep_routines += \
>    memcpy \
> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/cpu-relax.c b/sysdeps/unix/sysv/linux/riscv/multiarch/cpu-relax.c
> new file mode 100644
> index 0000000000..5aeb120e21
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/cpu-relax.c
> @@ -0,0 +1,39 @@
> +/* Multiple versions of cpu-relax.
> +   All versions must be listed in ifunc-impl-list.c.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +# include <ifunc-init.h>
> +# include <riscv-ifunc.h>
> +# include <sys/hwprobe.h>
> +
> +void __cpu_relax (void);
> +extern __typeof (__cpu_relax) __cpu_relax_generic attribute_hidden;
> +extern __typeof (__cpu_relax) __cpu_relax_zawrs attribute_hidden;
> +
> +static inline __typeof (__cpu_relax) *
> +select_cpu_relax_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
> +{
> +  unsigned long long int v;
> +  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
> +      && (v & RISCV_HWPROBE_EXT_ZAWRS))
> +    return __cpu_relax_zawrs;
> +
> +  return __cpu_relax_generic;
> +}
> +
> +riscv_libc_ifunc (__cpu_relax, select_cpu_relax_ifunc);
> diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> index 9f806d7a9e..9c7a8c2e1f 100644
> --- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
> @@ -20,24 +20,48 @@
>  #include <string.h>
>  #include <sys/hwprobe.h>
>
> +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
> +
> +void __cpu_relax (void);
> +
>  size_t
>  __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
>  			size_t max)
>  {
>    size_t i = max;
> +  struct riscv_hwprobe pairs[] = {
> +    { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 },
> +    { .key = RISCV_HWPROBE_KEY_CPUPERF_0 },
> +  };
>
>    bool fast_unaligned = false;
> +  bool has_zawrs = false;
> +
> +  if (__riscv_hwprobe (pairs, ARRAY_SIZE (pairs), 0, NULL, 0) == 0)
> +    {
> +      struct riscv_hwprobe *pair;
>
> -  struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
> -  if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
> -      && (pair.value & RISCV_HWPROBE_MISALIGNED_MASK)
> +      /* RISCV_HWPROBE_KEY_IMA_EXT_0  */
> +      pair = &pairs[0];
> +      if (pair->value & RISCV_HWPROBE_EXT_ZAWRS)
> +	has_zawrs = true;
> +
> +      /* RISCV_HWPROBE_KEY_CPUPERF_0  */
> +      pair = &pairs[1];
> +      if ((pair->value & RISCV_HWPROBE_MISALIGNED_MASK)
>            == RISCV_HWPROBE_MISALIGNED_FAST)
> -    fast_unaligned = true;
> +	fast_unaligned = true;
> +    }
>
>    IFUNC_IMPL (i, name, memcpy,
>  	      IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
>  			      __memcpy_noalignment)
>  	      IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_generic))
>
> +  IFUNC_IMPL (i, name, __cpu_relax,
> +	      IFUNC_IMPL_ADD (array, i, __cpu_relax, has_zawrs,
> +			      __cpu_relax_zawrs)
> +	      IFUNC_IMPL_ADD (array, i, __cpu_relax, 1, __cpu_relax_generic))
> +
>    return 0;
>  }

  reply	other threads:[~2024-04-18 17:18 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-18  9:46 [RFC PATCH 0/3] RISC-V: Use WRS.STO for atomic_spin_nop Christoph Müllner
2024-04-18  9:46 ` [RFC PATCH 1/3] RISC-V: Sync hwprobe: Sync extension bits with Linux 6.8 Christoph Müllner
2024-04-18  9:46 ` [RFC PATCH 2/3] RISC-V: hwprobe: Add Zawrs test bit Christoph Müllner
2024-04-18  9:46 ` [RFC PATCH 3/3] RISC-V: Implement CPU yielding for busy loops with Zihintpause/Zawrs Christoph Müllner
2024-04-18 17:17   ` Palmer Dabbelt [this message]
2024-04-18 20:03     ` Vineet Gupta
2024-04-18 20:25       ` Christoph Müllner
2024-04-18 20:19     ` Christoph Müllner
2024-04-18 20:36       ` Vineet Gupta
2024-04-18 21:10         ` Palmer Dabbelt
2024-04-19 14:09           ` Andrew Jones

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/libc/involved.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mhng-08eebb15-4e67-4616-abdd-64ea7522ab70@palmer-ri-x1c9a \
    --to=palmer@dabbelt.com \
    --cc=adhemerval.zanella@linaro.org \
    --cc=andrew@sifive.com \
    --cc=christoph.muellner@vrull.eu \
    --cc=darius@bluespec.com \
    --cc=evan@rivosinc.com \
    --cc=jeffreyalaw@gmail.com \
    --cc=kito.cheng@sifive.com \
    --cc=libc-alpha@sourceware.org \
    --cc=philipp.tomsich@vrull.eu \
    --cc=vineetg@rivosinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).