unofficial mirror of libc-alpha@sourceware.org
* [PATCH v2 0/3] glibc-hwcaps support for LD_LIBRARY_PATH
@ 2020-10-12 15:21 Florian Weimer via Libc-alpha
  2020-10-12 15:21 ` [PATCH 1/3] elf: Add " Florian Weimer via Libc-alpha
                   ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-12 15:21 UTC
  To: libc-alpha; +Cc: Paul A. Clarke

This sub-series implements glibc-hwcaps for LD_LIBRARY_PATH, with
auto-detected paths for x86-64 and POWER ("power10" being untested).

Paul, I hope I have addressed your comments regarding magic bits with
the new helper function, _dl_hwcaps_subdirs_build_bitmask.
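
To illustrate what the helper computes, here is a minimal stand-alone
sketch that mirrors its logic outside of glibc.  The name build_bitmask
and the precondition assert are assumptions made for this illustration;
they are not part of the patch.  Because the most preferred subdirectory
is listed first and maps to bit 0, a CPU that supports only some of the
listed levels has its leading entries inactive, so the set bits are the
trailing ones.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-alone copy of the logic in
   _dl_hwcaps_subdirs_build_bitmask.  SUBDIRS is the number of entries
   in the colon-separated list, ACTIVE the number of trailing entries
   that the CPU supports.  Bit 0 corresponds to the first entry.  */
static uint32_t
build_bitmask (int subdirs, int active)
{
  /* Assumed precondition; the patch itself does not assert this.  */
  assert (0 <= active && active <= subdirs && subdirs <= 32);

  /* Leading entries that are not active.  */
  int inactive = subdirs - active;
  if (inactive == 32)
    /* Avoid an undefined 32-bit shift below.  */
    return 0;

  /* One bit for each entry in the list.  */
  uint32_t mask = subdirs < 32 ? (1U << subdirs) - 1 : UINT32_MAX;

  /* Clear the bits of the leading, inactive entries.  */
  return mask ^ ((1U << inactive) - 1);
}
```

For a three-entry list with two active entries, the sketch yields 0x6:
the first (most preferred) entry is skipped and the remaining two are
searched.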

Thanks,
Florian

Florian Weimer (3):
  elf: Add glibc-hwcaps support for LD_LIBRARY_PATH
  x86_64: Add glibc-hwcaps support
  powerpc64le: Add glibc-hwcaps support

 elf/Makefile                                  |  66 ++++++++-
 elf/dl-hwcaps-subdirs.c                       |  29 ++++
 elf/dl-hwcaps.c                               | 138 ++++++++++++++---
 elf/dl-hwcaps.h                               | 103 +++++++++++++
 elf/dl-hwcaps_split.c                         |  77 ++++++++++
 elf/dl-load.c                                 |   7 +-
 elf/dl-main.h                                 |  11 +-
 elf/dl-support.c                              |   5 +-
 elf/dl-usage.c                                |  68 ++++++++-
 elf/markermodMARKER-VALUE.c                   |  29 ++++
 elf/rtld.c                                    |  18 +++
 elf/tst-dl-hwcaps_split.c                     | 139 ++++++++++++++++++
 elf/tst-glibc-hwcaps-mask.c                   |  31 ++++
 elf/tst-glibc-hwcaps-prepend.c                |  32 ++++
 elf/tst-glibc-hwcaps.c                        |  28 ++++
 sysdeps/generic/ldsodefs.h                    |  20 ++-
 sysdeps/powerpc/powerpc64/le/Makefile         |  22 +++
 .../powerpc/powerpc64/le/dl-hwcaps-subdirs.c  |  39 +++++
 .../powerpc/powerpc64/le/tst-glibc-hwcaps.c   |  54 +++++++
 sysdeps/x86_64/Makefile                       |  36 ++++-
 sysdeps/x86_64/dl-hwcaps-subdirs.c            |  66 +++++++++
 sysdeps/x86_64/tst-glibc-hwcaps.c             |  65 ++++++++
 22 files changed, 1049 insertions(+), 34 deletions(-)
 create mode 100644 elf/dl-hwcaps-subdirs.c
 create mode 100644 elf/dl-hwcaps_split.c
 create mode 100644 elf/markermodMARKER-VALUE.c
 create mode 100644 elf/tst-dl-hwcaps_split.c
 create mode 100644 elf/tst-glibc-hwcaps-mask.c
 create mode 100644 elf/tst-glibc-hwcaps-prepend.c
 create mode 100644 elf/tst-glibc-hwcaps.c
 create mode 100644 sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
 create mode 100644 sysdeps/powerpc/powerpc64/le/tst-glibc-hwcaps.c
 create mode 100644 sysdeps/x86_64/dl-hwcaps-subdirs.c
 create mode 100644 sysdeps/x86_64/tst-glibc-hwcaps.c

-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill



* [PATCH 1/3] elf: Add glibc-hwcaps support for LD_LIBRARY_PATH
  2020-10-12 15:21 [PATCH v2 0/3] glibc-hwcaps support for LD_LIBRARY_PATH Florian Weimer via Libc-alpha
@ 2020-10-12 15:21 ` Florian Weimer via Libc-alpha
  2020-10-13 16:28   ` Paul A. Clarke via Libc-alpha
  2020-10-20 17:23   ` Paul A. Clarke via Libc-alpha
  2020-10-12 15:21 ` [PATCH v2 2/3] x86_64: Add glibc-hwcaps support Florian Weimer via Libc-alpha
  2020-10-12 15:22 ` [PATCH v2 3/3] powerpc64le: " Florian Weimer via Libc-alpha
  2 siblings, 2 replies; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-12 15:21 UTC
  To: libc-alpha

This hacks non-power-set processing into _dl_important_hwcaps.
Once the legacy hwcaps handling goes away, the subdirectory
handling needs to be reworked, but it is premature to do this
while both approaches are still supported.
---
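Note for readers (not part of the commit message): the non-power-set
processing boils down to walking a colon-separated list, dropping
entries whose bit is clear in a bitmask or whose name is missing from an
optional name list, and storing each survivor as "glibc-hwcaps/NAME/".
The following is a minimal stand-alone sketch of that idea.  The names
split_next, list_contains and collect are hypothetical, and strcspn and
snprintf are used for brevity here, which the ld.so code in the patch
does not do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical iterator state, modelled on struct dl_hwcaps_split.  */
struct split
{
  const char *segment;		/* Start of the current segment.  */
  size_t length;		/* Bytes until ':' or NUL.  */
};

/* Advance to the next non-empty segment.  A NULL subject is treated
   as the empty string.  */
static bool
split_next (struct split *s)
{
  if (s->segment == NULL)
    return false;
  s->segment += s->length;	/* Skip the previous segment.  */
  while (*s->segment == ':')	/* Consume delimiters.  */
    ++s->segment;
  if (*s->segment == '\0')
    return false;
  s->length = strcspn (s->segment, ":");
  return true;
}

/* Return true if LIST contains NAME.  A NULL list matches anything.  */
static bool
list_contains (const char *list, const char *name, size_t name_length)
{
  if (list == NULL)
    return true;
  struct split s = { list, 0 };
  while (split_next (&s))
    if (s.length == name_length
	&& memcmp (s.segment, name, name_length) == 0)
      return true;
  return false;
}

/* Write the selected search subdirectories to OUT, separated by
   spaces, and return how many were selected.  Bit 0 of BITMASK
   belongs to the first entry in SUBDIRS.  */
static size_t
collect (const char *subdirs, uint32_t bitmask, const char *mask,
	 char *out, size_t out_size)
{
  assert (out_size > 0);
  size_t count = 0;
  size_t used = 0;
  out[0] = '\0';
  struct split s = { subdirs, 0 };
  while (split_next (&s))
    {
      /* One bit is consumed per entry, selected or not.  */
      bool active = bitmask & 1;
      bitmask >>= 1;
      if (!active || !list_contains (mask, s.segment, s.length))
	continue;
      int n = snprintf (out + used, out_size - used, "%sglibc-hwcaps/%.*s/",
			count > 0 ? " " : "", (int) s.length, s.segment);
      assert (n >= 0 && (size_t) n < out_size - used);
      used += n;
      ++count;
    }
  return count;
}
```

With the made-up list "c:b:a", a bitmask of 0x6 and no name mask, the
sketch selects "glibc-hwcaps/b/" and "glibc-hwcaps/a/".  A name mask
that matches no entry selects nothing, which is the situation
tst-glibc-hwcaps-mask exercises with --glibc-hwcaps-mask does-not-exist.
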
 elf/Makefile                   |  66 ++++++++++++++--
 elf/dl-hwcaps-subdirs.c        |  29 +++++++
 elf/dl-hwcaps.c                | 138 +++++++++++++++++++++++++++-----
 elf/dl-hwcaps.h                | 103 ++++++++++++++++++++++++
 elf/dl-hwcaps_split.c          |  77 ++++++++++++++++++
 elf/dl-load.c                  |   7 +-
 elf/dl-main.h                  |  11 ++-
 elf/dl-support.c               |   5 +-
 elf/dl-usage.c                 |  68 +++++++++++++++-
 elf/markermodMARKER-VALUE.c    |  29 +++++++
 elf/rtld.c                     |  18 +++++
 elf/tst-dl-hwcaps_split.c      | 139 +++++++++++++++++++++++++++++++++
 elf/tst-glibc-hwcaps-mask.c    |  31 ++++++++
 elf/tst-glibc-hwcaps-prepend.c |  32 ++++++++
 elf/tst-glibc-hwcaps.c         |  28 +++++++
 sysdeps/generic/ldsodefs.h     |  20 +++--
 16 files changed, 768 insertions(+), 33 deletions(-)
 create mode 100644 elf/dl-hwcaps-subdirs.c
 create mode 100644 elf/dl-hwcaps_split.c
 create mode 100644 elf/markermodMARKER-VALUE.c
 create mode 100644 elf/tst-dl-hwcaps_split.c
 create mode 100644 elf/tst-glibc-hwcaps-mask.c
 create mode 100644 elf/tst-glibc-hwcaps-prepend.c
 create mode 100644 elf/tst-glibc-hwcaps.c

diff --git a/elf/Makefile b/elf/Makefile
index f10cc59e7c..4983f7a2c0 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -59,7 +59,8 @@ elide-routines.os = $(all-dl-routines) dl-support enbl-secure dl-origin \
 # ld.so uses those routines, plus some special stuff for being the program
 # interpreter and operating independent of libc.
 rtld-routines	= rtld $(all-dl-routines) dl-sysdep dl-environ dl-minimal \
-  dl-error-minimal dl-conflict dl-hwcaps dl-usage
+  dl-error-minimal dl-conflict dl-hwcaps dl-hwcaps_split dl-hwcaps-subdirs \
+  dl-usage
 all-rtld-routines = $(rtld-routines) $(sysdep-rtld-routines)
 
 CFLAGS-dl-runtime.c += -fexceptions -fasynchronous-unwind-tables
@@ -210,14 +211,14 @@ tests += restest1 preloadtest loadfail multiload origtest resolvfail \
 	 tst-filterobj tst-filterobj-dlopen tst-auxobj tst-auxobj-dlopen \
 	 tst-audit14 tst-audit15 tst-audit16 \
 	 tst-single_threaded tst-single_threaded-pthread \
-	 tst-tls-ie tst-tls-ie-dlmopen \
-	 argv0test
+	 tst-tls-ie tst-tls-ie-dlmopen argv0test \
+	 tst-glibc-hwcaps tst-glibc-hwcaps-prepend tst-glibc-hwcaps-mask
 #	 reldep9
 tests-internal += loadtest unload unload2 circleload1 \
 	 neededtest neededtest2 neededtest3 neededtest4 \
 	 tst-tls3 tst-tls6 tst-tls7 tst-tls8 tst-dlmopen2 \
 	 tst-ptrguard1 tst-stackguard1 tst-libc_dlvsym \
-	 tst-create_format1 tst-tls-surplus
+	 tst-create_format1 tst-tls-surplus tst-dl-hwcaps_split
 tests-container += tst-pldd tst-dlopen-tlsmodid-container \
   tst-dlopen-self-container
 test-srcs = tst-pathopt
@@ -329,7 +330,10 @@ modules-names = testobj1 testobj2 testobj3 testobj4 testobj5 testobj6 \
 		tst-single_threaded-mod3 tst-single_threaded-mod4 \
 		tst-tls-ie-mod0 tst-tls-ie-mod1 tst-tls-ie-mod2 \
 		tst-tls-ie-mod3 tst-tls-ie-mod4 tst-tls-ie-mod5 \
-		tst-tls-ie-mod6
+		tst-tls-ie-mod6 markermod1-1 markermod1-2 markermod1-3 \
+		markermod2-1 markermod2-2 \
+		markermod3-1 markermod3-2 markermod3-3 \
+		markermod4-1 markermod4-2 markermod4-3 markermod4-4 \
 
 # Most modules build with _ISOMAC defined, but those filtered out
 # depend on internal headers.
@@ -1812,3 +1816,55 @@ $(objpfx)argv0test.out: tst-rtld-argv0.sh $(objpfx)ld.so \
             '$(test-wrapper-env)' '$(run_program_env)' \
             '$(rpath-link)' 'test-argv0' > $@; \
     $(evaluate-test)
+
+# Most likely search subdirectories across multiple architectures.
+glibc-hwcaps-first-subdirs = power9 x86-64-v2
+
+# The test modules are parameterized by preprocessor macros.
+LDFLAGS-markermod1-1.so += -Wl,-soname,markermod1.so
+LDFLAGS-markermod2-1.so += -Wl,-soname,markermod2.so
+LDFLAGS-markermod3-1.so += -Wl,-soname,markermod3.so
+LDFLAGS-markermod4-1.so += -Wl,-soname,markermod4.so
+$(objpfx)markermod%.os : markermodMARKER-VALUE.c
+	$(compile-command.c) \
+	  -DMARKER=marker$(firstword $(subst -, ,$*)) \
+	  -DVALUE=$(lastword $(subst -, ,$*))
+$(objpfx)markermod1.so: $(objpfx)markermod1-1.so
+	cp $< $@
+$(objpfx)markermod2.so: $(objpfx)markermod2-1.so
+	cp $< $@
+$(objpfx)markermod3.so: $(objpfx)markermod3-1.so
+	cp $< $@
+$(objpfx)markermod4.so: $(objpfx)markermod4-1.so
+	cp $< $@
+
+# tst-glibc-hwcaps-prepend checks that --glibc-hwcaps-prepend is
+# preferred over auto-detected subdirectories.
+$(objpfx)tst-glibc-hwcaps-prepend: $(objpfx)markermod1-1.so
+$(objpfx)glibc-hwcaps/prepend-markermod1/markermod1.so: \
+  $(objpfx)markermod1-2.so
+	$(make-target-directory)
+	cp $< $@
+$(objpfx)glibc-hwcaps/%/markermod1.so: $(objpfx)markermod1-3.so
+	$(make-target-directory)
+	cp $< $@
+$(objpfx)tst-glibc-hwcaps-prepend.out: \
+  $(objpfx)tst-glibc-hwcaps-prepend $(objpfx)markermod1.so \
+  $(patsubst %,$(objpfx)glibc-hwcaps/%/markermod1.so,prepend-markermod1 \
+  $(glibc-hwcaps-first-subdirs))
+	$(test-wrapper) $(rtld-prefix) \
+	  --glibc-hwcaps-prepend prepend-markermod1 \
+	  $< > $@; \
+	$(evaluate-test)
+
+# tst-glibc-hwcaps-mask checks that --glibc-hwcaps-mask can be used to
+# suppress all auto-detected subdirectories.
+$(objpfx)tst-glibc-hwcaps-mask: $(objpfx)markermod1-1.so
+$(objpfx)tst-glibc-hwcaps-mask.out: \
+  $(objpfx)tst-glibc-hwcaps-mask $(objpfx)markermod1.so \
+  $(patsubst %,$(objpfx)glibc-hwcaps/%/markermod1.so,\
+  $(glibc-hwcaps-first-subdirs))
+	$(test-wrapper) $(rtld-prefix) \
+	  --glibc-hwcaps-mask does-not-exist \
+	  $< > $@; \
+	$(evaluate-test)
diff --git a/elf/dl-hwcaps-subdirs.c b/elf/dl-hwcaps-subdirs.c
new file mode 100644
index 0000000000..60c6d59731
--- /dev/null
+++ b/elf/dl-hwcaps-subdirs.c
@@ -0,0 +1,29 @@
+/* Architecture-specific glibc-hwcaps subdirectories.  Generic version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <dl-hwcaps.h>
+
+/* In the generic version, there are no subdirectories defined.  */
+
+const char _dl_hwcaps_subdirs[] = "";
+
+uint32_t
+_dl_hwcaps_subdirs_active (void)
+{
+  return 0;
+}
diff --git a/elf/dl-hwcaps.c b/elf/dl-hwcaps.c
index 44dbac099f..f611f3a1a6 100644
--- a/elf/dl-hwcaps.c
+++ b/elf/dl-hwcaps.c
@@ -26,20 +26,97 @@
 #include <dl-procinfo.h>
 #include <dl-hwcaps.h>
 
+/* This is the result of counting the substrings in a colon-separated
+   hwcaps string.  */
+struct hwcaps_counts
+{
+  /* Number of substrings.  */
+  size_t count;
+
+  /* Sum of the individual substring lengths (without separators or
+     null terminators).  */
+  size_t total_length;
+
+  /* Maximum length of an individual substring.  */
+  size_t maximum_length;
+};
+
+/* Update *COUNTS according to the contents of HWCAPS.  Skip over
+   entries whose bit is not set in BITMASK or whose name is not in MASK.  */
+static void
+update_hwcaps_counts (struct hwcaps_counts *counts, const char *hwcaps,
+		      uint32_t bitmask, const char *mask)
+{
+  struct dl_hwcaps_split_masked sp;
+  _dl_hwcaps_split_masked_init (&sp, hwcaps, bitmask, mask);
+  while (_dl_hwcaps_split_masked (&sp))
+    {
+      ++counts->count;
+      counts->total_length += sp.split.length;
+      if (sp.split.length > counts->maximum_length)
+	counts->maximum_length = sp.split.length;
+    }
+}
+
+/* State for copy_hwcaps.  Must be initialized to point to
+   the storage areas for the array and the strings themselves.  */
+struct copy_hwcaps
+{
+  struct r_strlenpair *next_pair;
+  char *next_string;
+};
+
+/* Copy HWCAPS into the string pairs and strings, advancing *TARGET.
+   Skip entries whose bit is not set in BITMASK or that are not in MASK.  */
+static void
+copy_hwcaps (struct copy_hwcaps *target, const char *hwcaps,
+	     uint32_t bitmask, const char *mask)
+{
+  struct dl_hwcaps_split_masked sp;
+  _dl_hwcaps_split_masked_init (&sp, hwcaps, bitmask, mask);
+  while (_dl_hwcaps_split_masked (&sp))
+    {
+      target->next_pair->str = target->next_string;
+      char *slash = __mempcpy (__mempcpy (target->next_string,
+					  GLIBC_HWCAPS_PREFIX,
+					  strlen (GLIBC_HWCAPS_PREFIX)),
+			       sp.split.segment, sp.split.length);
+      *slash = '/';
+      target->next_pair->len
+	= strlen (GLIBC_HWCAPS_PREFIX) + sp.split.length + 1;
+      ++target->next_pair;
+      target->next_string = slash + 1;
+    }
+}
+
 /* Return an array of useful/necessary hardware capability names.  */
 const struct r_strlenpair *
-_dl_important_hwcaps (size_t *sz, size_t *max_capstrlen)
+_dl_important_hwcaps (const char *glibc_hwcaps_prepend,
+		      const char *glibc_hwcaps_mask,
+		      size_t *sz, size_t *max_capstrlen)
 {
   uint64_t hwcap_mask = GET_HWCAP_MASK();
   /* Determine how many important bits are set.  */
   uint64_t masked = GLRO(dl_hwcap) & hwcap_mask;
   size_t cnt = GLRO (dl_platform) != NULL;
   size_t n, m;
-  size_t total;
   struct r_strlenpair *result;
   struct r_strlenpair *rp;
   char *cp;
 
+  /* glibc-hwcaps subdirectories.  These are exempted from the power
+     set construction below.  */
+  uint32_t hwcaps_subdirs_active = _dl_hwcaps_subdirs_active ();
+  struct hwcaps_counts hwcaps_counts =  { 0, };
+  update_hwcaps_counts (&hwcaps_counts, glibc_hwcaps_prepend, -1, NULL);
+  update_hwcaps_counts (&hwcaps_counts, _dl_hwcaps_subdirs,
+			hwcaps_subdirs_active, glibc_hwcaps_mask);
+
+  /* Each hwcaps subdirectory has a GLIBC_HWCAPS_PREFIX string prefix
+     and a "/" suffix once stored in the result.  */
+  size_t total = (hwcaps_counts.count * (strlen (GLIBC_HWCAPS_PREFIX) + 1)
+		  + hwcaps_counts.total_length);
+
   /* Count the number of bits set in the masked value.  */
   for (n = 0; (~((1ULL << n) - 1) & masked) != 0; ++n)
     if ((masked & (1ULL << n)) != 0)
@@ -74,10 +151,10 @@ _dl_important_hwcaps (size_t *sz, size_t *max_capstrlen)
 
   /* Determine the total size of all strings together.  */
   if (cnt == 1)
-    total = temp[0].len + 1;
+    total += temp[0].len + 1;
   else
     {
-      total = temp[0].len + temp[cnt - 1].len + 2;
+      total += temp[0].len + temp[cnt - 1].len + 2;
       if (cnt > 2)
 	{
 	  total <<= 1;
@@ -94,26 +171,48 @@ _dl_important_hwcaps (size_t *sz, size_t *max_capstrlen)
 	}
     }
 
-  /* The result structure: we use a very compressed way to store the
-     various combinations of capability names.  */
-  *sz = 1 << cnt;
-  result = (struct r_strlenpair *) malloc (*sz * sizeof (*result) + total);
-  if (result == NULL)
+  *sz = hwcaps_counts.count + (1 << cnt);
+
+  /* This is the overall result, including both glibc-hwcaps
+     subdirectories and the legacy hwcaps subdirectories using the
+     power set construction.  */
+  struct r_strlenpair *overall_result
+    = malloc (*sz * sizeof (*result) + total);
+  if (overall_result == NULL)
     _dl_signal_error (ENOMEM, NULL, NULL,
 		      N_("cannot create capability list"));
 
+  /* Fill in the glibc-hwcaps subdirectories.  */
+  {
+    struct copy_hwcaps target;
+    target.next_pair = overall_result;
+    target.next_string = (char *) (overall_result + *sz);
+    copy_hwcaps (&target, glibc_hwcaps_prepend, -1, NULL);
+    copy_hwcaps (&target, _dl_hwcaps_subdirs,
+		 hwcaps_subdirs_active, glibc_hwcaps_mask);
+    /* Set up the write target for the power set construction.  */
+    result = target.next_pair;
+    cp = target.next_string;
+  }
+
+
+  /* Power set construction begins here.  We use a very compressed way
+     to store the various combinations of capability names.  */
+
   if (cnt == 1)
     {
-      result[0].str = (char *) (result + *sz);
+      result[0].str = cp;
       result[0].len = temp[0].len + 1;
-      result[1].str = (char *) (result + *sz);
+      result[1].str = cp;
       result[1].len = 0;
-      cp = __mempcpy ((char *) (result + *sz), temp[0].str, temp[0].len);
+      cp = __mempcpy (cp, temp[0].str, temp[0].len);
       *cp = '/';
-      *sz = 2;
-      *max_capstrlen = result[0].len;
+      if (result[0].len > hwcaps_counts.maximum_length)
+	*max_capstrlen = result[0].len;
+      else
+	*max_capstrlen = hwcaps_counts.maximum_length;
 
-      return result;
+      return overall_result;
     }
 
   /* Fill in the information.  This follows the following scheme
@@ -124,7 +223,7 @@ _dl_important_hwcaps (size_t *sz, size_t *max_capstrlen)
 	      #3: 0, 3			1001
      This allows the representation of all possible combinations of
      capability names in the string.  First generate the strings.  */
-  result[1].str = result[0].str = cp = (char *) (result + *sz);
+  result[1].str = result[0].str = cp;
 #define add(idx) \
       cp = __mempcpy (__mempcpy (cp, temp[idx].str, temp[idx].len), "/", 1);
   if (cnt == 2)
@@ -191,7 +290,10 @@ _dl_important_hwcaps (size_t *sz, size_t *max_capstrlen)
   while (--n != 0);
 
   /* The maximum string length.  */
-  *max_capstrlen = result[0].len;
+  if (result[0].len > hwcaps_counts.maximum_length)
+    *max_capstrlen = result[0].len;
+  else
+    *max_capstrlen = hwcaps_counts.maximum_length;
 
-  return result;
+  return overall_result;
 }
diff --git a/elf/dl-hwcaps.h b/elf/dl-hwcaps.h
index b66da59b89..9071367038 100644
--- a/elf/dl-hwcaps.h
+++ b/elf/dl-hwcaps.h
@@ -16,6 +16,11 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
+#ifndef _DL_HWCAPS_H
+#define _DL_HWCAPS_H
+
+#include <stdint.h>
+
 #include <elf/dl-tunables.h>
 
 #if HAVE_TUNABLES
@@ -28,3 +33,101 @@
 #  define GET_HWCAP_MASK() (0)
 # endif
 #endif
+
+#define GLIBC_HWCAPS_SUBDIRECTORY "glibc-hwcaps"
+#define GLIBC_HWCAPS_PREFIX GLIBC_HWCAPS_SUBDIRECTORY "/"
+
+/* Used by _dl_hwcaps_split below, to split strings at ':'
+   separators.  */
+struct dl_hwcaps_split
+{
+  const char *segment;          /* Start of the current segment.  */
+  size_t length;                /* Number of bytes until ':' or NUL.  */
+};
+
+/* Prepare *S to parse SUBJECT, for future _dl_hwcaps_split calls.  If
+   SUBJECT is NULL, it is treated as the empty string.  */
+static inline void
+_dl_hwcaps_split_init (struct dl_hwcaps_split *s, const char *subject)
+{
+  s->segment = subject;
+  /* The initial call to _dl_hwcaps_split will not skip anything.  */
+  s->length = 0;
+}
+
+/* Extract the next non-empty string segment, up to ':' or the null
+   terminator.  Return true if one more segment was found, or false if
+   the end of the string was reached.  On success, S->segment is the
+   start of the segment found, and S->length is its length.
+   (Typically, S->segment[S->length] is not null.)  */
+_Bool _dl_hwcaps_split (struct dl_hwcaps_split *s) attribute_hidden;
+
+/* Similar to dl_hwcaps_split, but with bit-based and name-based
+   masking.  */
+struct dl_hwcaps_split_masked
+{
+  struct dl_hwcaps_split split;
+
+  /* For use by the iterator implementation.  */
+  const char *mask;
+  uint32_t bitmask;
+};
+
+/* Prepare *S for iteration with _dl_hwcaps_split_masked.  Only HWCAP
+   names in SUBJECT whose bit is set in BITMASK and whose name is in
+   MASK will be returned.  SUBJECT must not contain empty HWCAP names.
+   If MASK is NULL, no name-based masking is applied.  Likewise for
+   BITMASK if BITMASK is -1 (infinite number of bits).  */
+static inline void
+_dl_hwcaps_split_masked_init (struct dl_hwcaps_split_masked *s,
+                              const char *subject,
+                              uint32_t bitmask, const char *mask)
+{
+  _dl_hwcaps_split_init (&s->split, subject);
+  s->bitmask = bitmask;
+  s->mask = mask;
+}
+
+/* Like _dl_hwcaps_split, but apply masking.  */
+_Bool _dl_hwcaps_split_masked (struct dl_hwcaps_split_masked *s)
+  attribute_hidden;
+
+/* Returns true if the colon-separated HWCAP list HWCAPS contains the
+   capability NAME (with length NAME_LENGTH).  If HWCAPS is NULL, the
+   function returns true.  */
+_Bool _dl_hwcaps_contains (const char *hwcaps, const char *name,
+                           size_t name_length) attribute_hidden;
+
+/* Colon-separated string of glibc-hwcaps subdirectories, without the
+   "glibc-hwcaps/" prefix.  The most preferred subdirectory needs to
+   be listed first.  */
+extern const char _dl_hwcaps_subdirs[] attribute_hidden;
+
+/* Returns a bitmap of active subdirectories in _dl_hwcaps_subdirs.
+   Bit 0 (the LSB) corresponds to the first substring in
+   _dl_hwcaps_subdirs, bit 1 to the second substring, and so on.
+   There is no direct correspondence between HWCAP bitmasks and this
+   bitmask.  */
+uint32_t _dl_hwcaps_subdirs_active (void) attribute_hidden;
+
+/* Returns a bitmask that marks the last ACTIVE subdirectories in a
+   _dl_hwcaps_subdirs string (containing SUBDIRS directories in
+   total) as active.  Intended for use in _dl_hwcaps_subdirs_active
+   implementations.  */
+static inline uint32_t
+_dl_hwcaps_subdirs_build_bitmask (int subdirs, int active)
+{
+  /* Leading subdirectories that are not active.  */
+  int inactive = subdirs - active;
+  if (inactive == 32)
+    return 0;
+
+  uint32_t mask;
+  if (subdirs < 32)
+    mask = (1U << subdirs) - 1;
+  else
+    mask = -1;
+  return mask ^ ((1U << inactive) - 1);
+}
+
+#endif /* _DL_HWCAPS_H */
diff --git a/elf/dl-hwcaps_split.c b/elf/dl-hwcaps_split.c
new file mode 100644
index 0000000000..95225e9f40
--- /dev/null
+++ b/elf/dl-hwcaps_split.c
@@ -0,0 +1,77 @@
+/* Hardware capability support for run-time dynamic loader.  String splitting.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <dl-hwcaps.h>
+#include <stdbool.h>
+#include <string.h>
+
+_Bool
+_dl_hwcaps_split (struct dl_hwcaps_split *s)
+{
+  if (s->segment == NULL)
+    return false;
+
+  /* Skip over the previous segment.   */
+  s->segment += s->length;
+
+  /* Consume delimiters.  This also avoids returning an empty
+     segment.  */
+  while (*s->segment == ':')
+    ++s->segment;
+  if (*s->segment == '\0')
+    return false;
+
+  /* This could use strchrnul, but we would have to link the function
+     into ld.so for that.  */
+  const char *colon = strchr (s->segment, ':');
+  if (colon == NULL)
+    s->length = strlen (s->segment);
+  else
+    s->length = colon - s->segment;
+  return true;
+}
+
+_Bool
+_dl_hwcaps_split_masked (struct dl_hwcaps_split_masked *s)
+{
+  while (true)
+    {
+      if (!_dl_hwcaps_split (&s->split))
+        return false;
+      bool active = s->bitmask & 1;
+      s->bitmask >>= 1;
+      if (active && _dl_hwcaps_contains (s->mask,
+                                         s->split.segment, s->split.length))
+        return true;
+    }
+}
+
+_Bool
+_dl_hwcaps_contains (const char *hwcaps, const char *name, size_t name_length)
+{
+  if (hwcaps == NULL)
+    return true;
+
+  struct dl_hwcaps_split split;
+  _dl_hwcaps_split_init (&split, hwcaps);
+  while (_dl_hwcaps_split (&split))
+    if (split.length == name_length
+        && memcmp (split.segment, name, name_length) == 0)
+      return true;
+  return false;
+}
diff --git a/elf/dl-load.c b/elf/dl-load.c
index f3201e7c14..9020f1646f 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -682,7 +682,9 @@ cache_rpath (struct link_map *l,
 
 
 void
-_dl_init_paths (const char *llp, const char *source)
+_dl_init_paths (const char *llp, const char *source,
+		const char *glibc_hwcaps_prepend,
+		const char *glibc_hwcaps_mask)
 {
   size_t idx;
   const char *strp;
@@ -697,7 +699,8 @@ _dl_init_paths (const char *llp, const char *source)
 
 #ifdef SHARED
   /* Get the capabilities.  */
-  capstr = _dl_important_hwcaps (&ncapstr, &max_capstrlen);
+  capstr = _dl_important_hwcaps (glibc_hwcaps_prepend, glibc_hwcaps_mask,
+				 &ncapstr, &max_capstrlen);
 #endif
 
   /* First set up the rest of the default search directory entries.  */
diff --git a/elf/dl-main.h b/elf/dl-main.h
index b51256d3b4..566713a0d1 100644
--- a/elf/dl-main.h
+++ b/elf/dl-main.h
@@ -84,6 +84,14 @@ struct dl_main_state
   /* The preload list passed as a command argument.  */
   const char *preloadarg;
 
+  /* Additional glibc-hwcaps subdirectories to search first.
+     Colon-separated list.  */
+  const char *glibc_hwcaps_prepend;
+
+  /* Mask for the internal glibc-hwcaps subdirectories.
+     Colon-separated list.  */
+  const char *glibc_hwcaps_mask;
+
   enum rtld_mode mode;
 
   /* True if any of the debugging options is enabled.  */
@@ -98,7 +106,8 @@ struct dl_main_state
 static inline void
 call_init_paths (const struct dl_main_state *state)
 {
-  _dl_init_paths (state->library_path, state->library_path_source);
+  _dl_init_paths (state->library_path, state->library_path_source,
+                  state->glibc_hwcaps_prepend, state->glibc_hwcaps_mask);
 }
 
 /* Print ld.so usage information and exit.  */
diff --git a/elf/dl-support.c b/elf/dl-support.c
index afbc94df54..3264262f4e 100644
--- a/elf/dl-support.c
+++ b/elf/dl-support.c
@@ -323,7 +323,10 @@ _dl_non_dynamic_init (void)
 
   /* Initialize the data structures for the search paths for shared
      objects.  */
-  _dl_init_paths (getenv ("LD_LIBRARY_PATH"), "LD_LIBRARY_PATH");
+  _dl_init_paths (getenv ("LD_LIBRARY_PATH"), "LD_LIBRARY_PATH",
+		  /* No glibc-hwcaps selection support in statically
+		     linked binaries.  */
+		  NULL, NULL);
 
   /* Remember the last search directory added at startup.  */
   _dl_init_all_dirs = GL(dl_all_dirs);
diff --git a/elf/dl-usage.c b/elf/dl-usage.c
index 796ad38b43..e22a9c3942 100644
--- a/elf/dl-usage.c
+++ b/elf/dl-usage.c
@@ -83,7 +83,7 @@ print_search_path_for_help (struct dl_main_state *state)
 {
   if (__rtld_search_dirs.dirs == NULL)
     /* The run-time search paths have not yet been initialized.  */
-    _dl_init_paths (state->library_path, state->library_path_source);
+    call_init_paths (state);
 
   _dl_printf ("\nShared library search path:\n");
 
@@ -132,6 +132,67 @@ print_hwcap_1_finish (bool *first)
     _dl_printf (")\n");
 }
 
+/* Print the header for print_hwcaps_subdirectories.  */
+static void
+print_hwcaps_subdirectories_header (bool *nothing_printed)
+{
+  if (*nothing_printed)
+    {
+      _dl_printf ("\n\
+Subdirectories of glibc-hwcaps directories, in priority order:\n");
+      *nothing_printed = false;
+    }
+}
+
+/* Print the HWCAP name itself, indented.  */
+static void
+print_hwcaps_subdirectories_name (const struct dl_hwcaps_split *split)
+{
+  _dl_write (STDOUT_FILENO, "  ", 2);
+  _dl_write (STDOUT_FILENO, split->segment, split->length);
+}
+
+/* Print the list of recognized glibc-hwcaps subdirectories.  */
+static void
+print_hwcaps_subdirectories (const struct dl_main_state *state)
+{
+  bool nothing_printed = true;
+  struct dl_hwcaps_split split;
+
+  /* The prepended glibc-hwcaps subdirectories.  */
+  _dl_hwcaps_split_init (&split, state->glibc_hwcaps_prepend);
+  while (_dl_hwcaps_split (&split))
+    {
+      print_hwcaps_subdirectories_header (&nothing_printed);
+      print_hwcaps_subdirectories_name (&split);
+      bool first = true;
+      print_hwcap_1 (&first, true, "searched");
+      print_hwcap_1_finish (&first);
+    }
+
+  /* The built-in glibc-hwcaps subdirectories.  Do the filtering
+     manually, so that more precise diagnostics are possible.  */
+  uint32_t mask = _dl_hwcaps_subdirs_active ();
+  _dl_hwcaps_split_init (&split, _dl_hwcaps_subdirs);
+  while (_dl_hwcaps_split (&split))
+    {
+      print_hwcaps_subdirectories_header (&nothing_printed);
+      print_hwcaps_subdirectories_name (&split);
+      bool first = true;
+      print_hwcap_1 (&first, mask & 1, "supported");
+      bool listed = _dl_hwcaps_contains (state->glibc_hwcaps_mask,
+                                         split.segment, split.length);
+      print_hwcap_1 (&first, !listed, "masked");
+      print_hwcap_1 (&first, (mask & 1) && listed, "searched");
+      print_hwcap_1_finish (&first);
+      mask >>= 1;
+    }
+
+  if (nothing_printed)
+    _dl_printf ("\n\
+No subdirectories of glibc-hwcaps directories are searched.\n");
+}
+
 /* Write a list of hwcap subdirectories to standard output.  See
  _dl_important_hwcaps in dl-hwcaps.c.  */
 static void
@@ -186,6 +247,10 @@ setting environment variables (which would be inherited by subprocesses).\n\
   --inhibit-cache       Do not use " LD_SO_CACHE "\n\
   --library-path PATH   use given PATH instead of content of the environment\n\
                         variable LD_LIBRARY_PATH\n\
+  --glibc-hwcaps-prepend LIST\n\
+                        search glibc-hwcaps subdirectories in LIST\n\
+  --glibc-hwcaps-mask LIST\n\
+                        only search built-in subdirectories if in LIST\n\
   --inhibit-rpath LIST  ignore RUNPATH and RPATH information in object names\n\
                         in LIST\n\
   --audit LIST          use objects named in LIST as auditors\n\
@@ -198,6 +263,7 @@ This program interpreter self-identifies as: " RTLD "\n\
 ",
               argv0);
   print_search_path_for_help (state);
+  print_hwcaps_subdirectories (state);
   print_legacy_hwcap_directories ();
   _exit (EXIT_SUCCESS);
 }
diff --git a/elf/markermodMARKER-VALUE.c b/elf/markermodMARKER-VALUE.c
new file mode 100644
index 0000000000..99bdcf71a4
--- /dev/null
+++ b/elf/markermodMARKER-VALUE.c
@@ -0,0 +1,29 @@
+/* Source file template for building shared objects with marker functions.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* MARKER and VALUE must be set on the compiler command line.  */
+
+#ifndef MARKER
+# error MARKER not defined
+#endif
+
+int
+MARKER (void)
+{
+  return VALUE;
+}
diff --git a/elf/rtld.c b/elf/rtld.c
index fcf4bb70b1..1b2f17191d 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -289,6 +289,8 @@ dl_main_state_init (struct dl_main_state *state)
   state->library_path_source = NULL;
   state->preloadlist = NULL;
   state->preloadarg = NULL;
+  state->glibc_hwcaps_prepend = NULL;
+  state->glibc_hwcaps_mask = NULL;
   state->mode = rtld_mode_normal;
   state->any_debug = false;
   state->version_info = false;
@@ -1244,6 +1246,22 @@ dl_main (const ElfW(Phdr) *phdr,
 	  {
 	    argv0 = _dl_argv[2];
 
+	    _dl_skip_args += 2;
+	    _dl_argc -= 2;
+	    _dl_argv += 2;
+	  }
+	else if (strcmp (_dl_argv[1], "--glibc-hwcaps-prepend") == 0
+		 && _dl_argc > 2)
+	  {
+	    state.glibc_hwcaps_prepend = _dl_argv[2];
+	    _dl_skip_args += 2;
+	    _dl_argc -= 2;
+	    _dl_argv += 2;
+	  }
+	else if (strcmp (_dl_argv[1], "--glibc-hwcaps-mask") == 0
+		 && _dl_argc > 2)
+	  {
+	    state.glibc_hwcaps_mask = _dl_argv[2];
 	    _dl_skip_args += 2;
 	    _dl_argc -= 2;
 	    _dl_argv += 2;
diff --git a/elf/tst-dl-hwcaps_split.c b/elf/tst-dl-hwcaps_split.c
new file mode 100644
index 0000000000..929c99a23b
--- /dev/null
+++ b/elf/tst-dl-hwcaps_split.c
@@ -0,0 +1,139 @@
+/* Unit tests for dl-hwcaps.c.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <array_length.h>
+#include <dl-hwcaps.h>
+#include <string.h>
+#include <support/check.h>
+
+static void
+check_split_masked (const char *input, int32_t bitmask, const char *mask,
+                    const char *expected[], size_t expected_length)
+{
+  struct dl_hwcaps_split_masked split;
+  _dl_hwcaps_split_masked_init (&split, input, bitmask, mask);
+  size_t index = 0;
+  while (_dl_hwcaps_split_masked (&split))
+    {
+      TEST_VERIFY_EXIT (index < expected_length);
+      TEST_COMPARE_BLOB (expected[index], strlen (expected[index]),
+                         split.split.segment, split.split.length);
+      ++index;
+    }
+  TEST_COMPARE (index, expected_length);
+}
+
+static void
+check_split (const char *input,
+             const char *expected[], size_t expected_length)
+{
+  struct dl_hwcaps_split split;
+  _dl_hwcaps_split_init (&split, input);
+  size_t index = 0;
+  while (_dl_hwcaps_split (&split))
+    {
+      TEST_VERIFY_EXIT (index < expected_length);
+      TEST_COMPARE_BLOB (expected[index], strlen (expected[index]),
+                         split.segment, split.length);
+      ++index;
+    }
+  TEST_COMPARE (index, expected_length);
+
+  /* Reuse the test cases with masking that does not actually remove
+     anything.  */
+  check_split_masked (input, -1, NULL, expected, expected_length);
+  check_split_masked (input, -1, input, expected, expected_length);
+}
+
+static int
+do_test (void)
+{
+  /* Splitting tests, without masking.  */
+  check_split (NULL, NULL, 0);
+  check_split ("", NULL, 0);
+  check_split (":", NULL, 0);
+  check_split ("::", NULL, 0);
+
+  {
+    const char *expected[] = { "first" };
+    check_split ("first", expected, array_length (expected));
+    check_split (":first", expected, array_length (expected));
+    check_split ("first:", expected, array_length (expected));
+    check_split (":first:", expected, array_length (expected));
+  }
+
+  {
+    const char *expected[] = { "first", "second" };
+    check_split ("first:second", expected, array_length (expected));
+    check_split ("first::second", expected, array_length (expected));
+    check_split (":first:second", expected, array_length (expected));
+    check_split ("first:second:", expected, array_length (expected));
+    check_split (":first:second:", expected, array_length (expected));
+  }
+
+  /* Splitting tests with masking.  */
+  {
+    const char *expected[] = { "first" };
+    check_split_masked ("first", 3, "first:second",
+                        expected, array_length (expected));
+    check_split_masked ("first:second", 3, "first:",
+                        expected, array_length (expected));
+    check_split_masked ("first:second", 1, NULL,
+                        expected, array_length (expected));
+  }
+  {
+    const char *expected[] = { "second" };
+    check_split_masked ("first:second", 3, "second",
+                        expected, array_length (expected));
+    check_split_masked ("first:second:third", -1, "second:",
+                        expected, array_length (expected));
+    check_split_masked ("first:second", 2, NULL,
+                        expected, array_length (expected));
+    check_split_masked ("first:second:third", 2, "first:second",
+                        expected, array_length (expected));
+  }
+
+  /* Tests for _dl_hwcaps_contains.  */
+  TEST_VERIFY (_dl_hwcaps_contains (NULL, "first", strlen ("first")));
+  TEST_VERIFY (_dl_hwcaps_contains (NULL, "", 0));
+  TEST_VERIFY (! _dl_hwcaps_contains ("", "first", strlen ("first")));
+  TEST_VERIFY (! _dl_hwcaps_contains ("firs", "first", strlen ("first")));
+  TEST_VERIFY (_dl_hwcaps_contains ("firs", "first", strlen ("first") - 1));
+  for (int i = 0; i < strlen ("first"); ++i)
+    TEST_VERIFY (! _dl_hwcaps_contains ("first", "first", i));
+  TEST_VERIFY (_dl_hwcaps_contains ("first", "first", strlen ("first")));
+  TEST_VERIFY (_dl_hwcaps_contains ("first:", "first", strlen ("first")));
+  TEST_VERIFY (_dl_hwcaps_contains ("first:second",
+                                    "first", strlen ("first")));
+  TEST_VERIFY (_dl_hwcaps_contains (":first:second", "first",
+                                    strlen ("first")));
+  TEST_VERIFY (_dl_hwcaps_contains ("first:second", "second",
+                                    strlen ("second")));
+  TEST_VERIFY (_dl_hwcaps_contains ("first:second:", "second",
+                                    strlen ("second")));
+  for (int i = 0; i < strlen ("second"); ++i)
+    TEST_VERIFY (!_dl_hwcaps_contains ("first:second:", "second", i));
+
+  return 0;
+}
+
+#include <support/test-driver.c>
+
+/* Rebuild the sources here because the object file is built for
+   inclusion into the dynamic loader.  */
+#include "dl-hwcaps_split.c"
diff --git a/elf/tst-glibc-hwcaps-mask.c b/elf/tst-glibc-hwcaps-mask.c
new file mode 100644
index 0000000000..27b09b358c
--- /dev/null
+++ b/elf/tst-glibc-hwcaps-mask.c
@@ -0,0 +1,31 @@
+/* Test that --glibc-hwcaps-mask works.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <support/check.h>
+
+extern int marker1 (void);
+
+static int
+do_test (void)
+{
+  /* The marker1 function in elf/markermod1.so returns 1.  */
+  TEST_COMPARE (marker1 (), 1);
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/elf/tst-glibc-hwcaps-prepend.c b/elf/tst-glibc-hwcaps-prepend.c
new file mode 100644
index 0000000000..57d7319f14
--- /dev/null
+++ b/elf/tst-glibc-hwcaps-prepend.c
@@ -0,0 +1,32 @@
+/* Test that --glibc-hwcaps-prepend works.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <support/check.h>
+
+extern int marker1 (void);
+
+static int
+do_test (void)
+{
+  /* The marker1 function in
+     glibc-hwcaps/prepend-markermod1/markermod1.so returns 2.  */
+  TEST_COMPARE (marker1 (), 2);
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/elf/tst-glibc-hwcaps.c b/elf/tst-glibc-hwcaps.c
new file mode 100644
index 0000000000..28f47cf891
--- /dev/null
+++ b/elf/tst-glibc-hwcaps.c
@@ -0,0 +1,28 @@
+/* Stub test for glibc-hwcaps.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+
+static int
+do_test (void)
+{
+  puts ("info: generic tst-glibc-hwcaps (tests nothing)");
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index 382eeb9be0..0b2babc70c 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -1047,8 +1047,13 @@ extern struct r_debug *_dl_debug_initialize (ElfW(Addr) ldbase, Lmid_t ns)
      attribute_hidden;
 
 /* Initialize the basic data structure for the search paths.  SOURCE
-   is either "LD_LIBRARY_PATH" or "--library-path".  */
-extern void _dl_init_paths (const char *library_path, const char *source)
+   is either "LD_LIBRARY_PATH" or "--library-path".
+   GLIBC_HWCAPS_PREPEND adds additional glibc-hwcaps subdirectories to
+   search.  GLIBC_HWCAPS_MASK is used to filter the built-in
+   subdirectories if not NULL.  */
+extern void _dl_init_paths (const char *library_path, const char *source,
+			    const char *glibc_hwcaps_prepend,
+			    const char *glibc_hwcaps_mask)
   attribute_hidden;
 
 /* Gather the information needed to install the profiling tables and start
@@ -1072,9 +1077,14 @@ extern void _dl_show_auxv (void) attribute_hidden;
 extern char *_dl_next_ld_env_entry (char ***position) attribute_hidden;
 
 /* Return an array with the names of the important hardware
-   capabilities.  The length of the array is written to *SZ, and the
-   maximum of all strings length is written to *MAX_CAPSTRLEN.  */
-const struct r_strlenpair *_dl_important_hwcaps (size_t *sz,
+   capabilities.  PREPEND is a colon-separated list of glibc-hwcaps
+   directories to search first.  MASK is a colon-separated list used
+   to filter the built-in glibc-hwcaps subdirectories.  The length of
+   the array is written to *SZ, and the maximum length of all strings
+   is written to *MAX_CAPSTRLEN.  */
+const struct r_strlenpair *_dl_important_hwcaps (const char *prepend,
+						 const char *mask,
+						 size_t *sz,
 						 size_t *max_capstrlen)
   attribute_hidden;
 
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
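
Note for readers of this archive (not part of the patch): the list
handling exercised by elf/tst-dl-hwcaps_split.c can be sketched in
standalone C as below.  All names ending in _sketch are hypothetical and
are not the helpers from elf/dl-hwcaps_split.c.  The behavior is inferred
from the test cases only: empty segments are skipped, bit I of the
bitmask selects segment I of the input, and a NULL mask list permits
every segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical iterator over a colon-separated list.  This is not the
   struct dl_hwcaps_split from the patch.  */
struct split_sketch
{
  const char *segment;  /* Start of the current segment, or NULL.  */
  size_t length;        /* Length of the current segment.  */
};

static void
split_sketch_init (struct split_sketch *s, const char *input)
{
  s->segment = input;
  s->length = 0;
}

/* Advance to the next non-empty segment.  Return false at the end.  */
static bool
split_sketch_next (struct split_sketch *s)
{
  if (s->segment == NULL)
    return false;
  const char *p = s->segment + s->length;
  while (*p == ':')
    ++p;
  if (*p == '\0')
    {
      s->segment = NULL;
      s->length = 0;
      return false;
    }
  s->segment = p;
  s->length = strcspn (p, ":");
  return true;
}

/* Return true if LIST is NULL or has a segment equal to NAME.  */
static bool
contains_sketch (const char *list, const char *name, size_t name_length)
{
  if (list == NULL)
    return true;
  struct split_sketch s;
  split_sketch_init (&s, list);
  while (split_sketch_next (&s))
    if (s.length == name_length
        && memcmp (s.segment, name, name_length) == 0)
      return true;
  return false;
}

/* Join the selected segments of INPUT with ':' in OUT and return how
   many were stored.  Segment I is selected if bit I of BITMASK is set
   and the segment is permitted by MASK.  */
static size_t
select_sketch (const char *input, uint32_t bitmask, const char *mask,
               char *out, size_t out_size)
{
  assert (out != NULL && out_size > 0);
  struct split_sketch s;
  size_t count = 0;
  size_t used = 0;
  out[0] = '\0';
  split_sketch_init (&s, input);
  for (uint32_t bit = 1; split_sketch_next (&s); bit <<= 1)
    {
      if ((bitmask & bit) == 0
          || !contains_sketch (mask, s.segment, s.length))
        continue;
      /* Space for an optional separator, the segment, and the NUL.  */
      size_t needed = (count > 0) + s.length + 1;
      if (out_size - used < needed)
        break;
      if (count > 0)
        out[used++] = ':';
      memcpy (out + used, s.segment, s.length);
      used += s.length;
      out[used] = '\0';
      ++count;
    }
  return count;
}
```

With these assumptions, select_sketch ("first:second:third", 2,
"first:second", buf, sizeof buf) stores "second" in buf, which mirrors
the last masked case in the test.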



* [PATCH v2 2/3] x86_64: Add glibc-hwcaps support
  2020-10-12 15:21 [PATCH v2 0/3] glibc-hwcaps support for LD_LIBRARY_PATH Florian Weimer via Libc-alpha
  2020-10-12 15:21 ` [PATCH 1/3] elf: Add " Florian Weimer via Libc-alpha
@ 2020-10-12 15:21 ` Florian Weimer via Libc-alpha
  2020-10-12 18:11   ` H.J. Lu via Libc-alpha
  2020-10-12 15:22 ` [PATCH v2 3/3] powerpc64le: " Florian Weimer via Libc-alpha
  2 siblings, 1 reply; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-12 15:21 UTC (permalink / raw)
  To: libc-alpha

The subdirectories match those in the x86-64 psABI:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9a6b9396884b67c7c
---
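
Illustration only, not part of the commit: a sketch of the bitmask
convention used by _dl_hwcaps_subdirs_active.  The real helper,
_dl_hwcaps_subdirs_build_bitmask, is defined in elf/dl-hwcaps.h in
patch 1/3 and is not reproduced here; build_bitmask_sketch is a
hypothetical stand-in.  It assumes that bit I corresponds to entry I of
_dl_hwcaps_subdirs (as in the tst-dl-hwcaps_split.c cases).  Because
the list is in preference order and the checks run in reverse
preference order, the ACTIVE entries sit at the end of the list and
occupy the highest of the SUBDIRS bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the helper from elf/dl-hwcaps.h.  Return a
   mask in which the last ACTIVE of SUBDIRS list entries are set.  */
static uint32_t
build_bitmask_sketch (int subdirs, int active)
{
  assert (subdirs >= 0 && subdirs <= 32);
  assert (active >= 0 && active <= subdirs);

  /* One bit per list entry.  Shifting a 32-bit value by 32 is
     undefined, so that case is handled separately.  */
  uint32_t all = (subdirs == 32
                  ? UINT32_MAX : (UINT32_C (1) << subdirs) - 1);

  /* Leading (most preferred) entries that are not active.  */
  int inactive = subdirs - active;
  uint32_t leading = (inactive == 32
                      ? UINT32_MAX : (UINT32_C (1) << inactive) - 1);

  return all & ~leading;
}
```

Under these assumptions, a CPU that passes only the x86-64-v2 check
yields build_bitmask_sketch (3, 1) == 4, which selects the third entry
of "x86-64-v4:x86-64-v3:x86-64-v2".  A CPU that passes all three checks
yields 7.
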
 sysdeps/x86_64/Makefile            | 36 +++++++++++++++-
 sysdeps/x86_64/dl-hwcaps-subdirs.c | 66 ++++++++++++++++++++++++++++++
 sysdeps/x86_64/tst-glibc-hwcaps.c  | 65 +++++++++++++++++++++++++++++
 3 files changed, 166 insertions(+), 1 deletion(-)
 create mode 100644 sysdeps/x86_64/dl-hwcaps-subdirs.c
 create mode 100644 sysdeps/x86_64/tst-glibc-hwcaps.c

diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile
index 42b97c5cc7..16030715e7 100644
--- a/sysdeps/x86_64/Makefile
+++ b/sysdeps/x86_64/Makefile
@@ -144,7 +144,41 @@ CFLAGS-tst-auditmod10b.c += $(AVX512-CFLAGS)
 CFLAGS-tst-avx512-aux.c += $(AVX512-CFLAGS)
 CFLAGS-tst-avx512mod.c += $(AVX512-CFLAGS)
 endif
-endif
+
+$(objpfx)tst-glibc-hwcaps: \
+  $(objpfx)markermod2-1.so $(objpfx)markermod3-1.so $(objpfx)markermod4-1.so
+$(objpfx)tst-glibc-hwcaps.out: \
+  $(objpfx)markermod2.so \
+    $(objpfx)glibc-hwcaps/x86-64-v2/markermod2.so \
+  $(objpfx)markermod3.so \
+    $(objpfx)glibc-hwcaps/x86-64-v2/markermod3.so \
+    $(objpfx)glibc-hwcaps/x86-64-v3/markermod3.so \
+  $(objpfx)markermod4.so \
+    $(objpfx)glibc-hwcaps/x86-64-v2/markermod4.so \
+    $(objpfx)glibc-hwcaps/x86-64-v3/markermod4.so \
+    $(objpfx)glibc-hwcaps/x86-64-v4/markermod4.so \
+
+$(objpfx)glibc-hwcaps/x86-64-v2/markermod2.so: $(objpfx)markermod2-2.so
+	$(make-target-directory)
+	cp $< $@
+$(objpfx)glibc-hwcaps/x86-64-v2/markermod3.so: $(objpfx)markermod3-2.so
+	$(make-target-directory)
+	cp $< $@
+$(objpfx)glibc-hwcaps/x86-64-v3/markermod3.so: $(objpfx)markermod3-3.so
+	$(make-target-directory)
+	cp $< $@
+$(objpfx)glibc-hwcaps/x86-64-v2/markermod4.so: $(objpfx)markermod4-2.so
+	$(make-target-directory)
+	cp $< $@
+$(objpfx)glibc-hwcaps/x86-64-v3/markermod4.so: $(objpfx)markermod4-3.so
+	$(make-target-directory)
+	cp $< $@
+$(objpfx)glibc-hwcaps/x86-64-v4/markermod4.so: $(objpfx)markermod4-4.so
+	$(make-target-directory)
+	cp $< $@
+
+
+endif # $(subdir) == elf
 
 ifeq ($(subdir),csu)
 gen-as-const-headers += tlsdesc.sym rtld-offsets.sym
diff --git a/sysdeps/x86_64/dl-hwcaps-subdirs.c b/sysdeps/x86_64/dl-hwcaps-subdirs.c
new file mode 100644
index 0000000000..c4d8b3a02a
--- /dev/null
+++ b/sysdeps/x86_64/dl-hwcaps-subdirs.c
@@ -0,0 +1,66 @@
+/* Architecture-specific glibc-hwcaps subdirectories.  x86_64 version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <dl-hwcaps.h>
+#include <cpu-features.h>
+
+const char _dl_hwcaps_subdirs[] = "x86-64-v4:x86-64-v3:x86-64-v2";
+enum { subdirs_count = 3 };
+
+uint32_t
+_dl_hwcaps_subdirs_active (void)
+{
+  int active = 0;
+
+  /* Test in reverse preference order.  */
+
+  /* x86-64-v2.  */
+  if (!(CPU_FEATURE_USABLE (CMPXCHG16B)
+        && CPU_FEATURE_USABLE (LAHF64_SAHF64)
+        && CPU_FEATURE_USABLE (POPCNT)
+        && CPU_FEATURE_USABLE (SSE3)
+        && CPU_FEATURE_USABLE (SSE4_1)
+        && CPU_FEATURE_USABLE (SSE4_2)
+        && CPU_FEATURE_USABLE (SSSE3)))
+    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
+  ++active;
+
+  /* x86-64-v3.  */
+  if (!(CPU_FEATURE_USABLE (AVX)
+        && CPU_FEATURE_USABLE (AVX2)
+        && CPU_FEATURE_USABLE (BMI1)
+        && CPU_FEATURE_USABLE (BMI2)
+        && CPU_FEATURE_USABLE (F16C)
+        && CPU_FEATURE_USABLE (FMA)
+        && CPU_FEATURE_USABLE (LZCNT)
+        && CPU_FEATURE_USABLE (MOVBE)
+        && CPU_FEATURE_USABLE (OSXSAVE)))
+    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
+  ++active;
+
+  /* x86-64-v4.  */
+  if (!(CPU_FEATURE_USABLE (AVX512F)
+        && CPU_FEATURE_USABLE (AVX512BW)
+        && CPU_FEATURE_USABLE (AVX512CD)
+        && CPU_FEATURE_USABLE (AVX512DQ)
+        && CPU_FEATURE_USABLE (AVX512VL)))
+    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
+  ++active;
+
+  return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
+}
diff --git a/sysdeps/x86_64/tst-glibc-hwcaps.c b/sysdeps/x86_64/tst-glibc-hwcaps.c
new file mode 100644
index 0000000000..b46e7cb236
--- /dev/null
+++ b/sysdeps/x86_64/tst-glibc-hwcaps.c
@@ -0,0 +1,65 @@
+/* glibc-hwcaps subdirectory test.  x86_64 version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include <support/check.h>
+#include <sys/param.h>
+
+extern int marker2 (void);
+extern int marker3 (void);
+extern int marker4 (void);
+
+/* Return the x86-64-vN level, 1 for the baseline.  */
+static int
+compute_level (void)
+{
+  /* These checks are not entirely accurate because they are limited
+     by GCC capabilities.  Wrong results should only occur with
+     inconsistent CPU models involving virtualization.  */
+  if (!(__builtin_cpu_supports ("sse3")
+        && __builtin_cpu_supports ("sse4.1")
+        && __builtin_cpu_supports ("sse4.2")
+        && __builtin_cpu_supports ("ssse3")))
+    return 1;
+  if (!(__builtin_cpu_supports ("avx")
+        && __builtin_cpu_supports ("avx2")
+        && __builtin_cpu_supports ("bmi")
+        && __builtin_cpu_supports ("bmi2")
+        && __builtin_cpu_supports ("fma")))
+    return 2;
+  if (!(__builtin_cpu_supports ("avx512f")
+        && __builtin_cpu_supports ("avx512bw")
+        && __builtin_cpu_supports ("avx512cd")
+        && __builtin_cpu_supports ("avx512dq")
+        && __builtin_cpu_supports ("avx512vl")))
+    return 3;
+  return 4;
+}
+
+static int
+do_test (void)
+{
+  int level = compute_level ();
+  printf ("info: detected x86-64 micro-architecture level: %d\n", level);
+  TEST_COMPARE (marker2 (), MIN (level, 2));
+  TEST_COMPARE (marker3 (), MIN (level, 3));
+  TEST_COMPARE (marker4 (), MIN (level, 4));
+  return 0;
+}
+
+#include <support/test-driver.c>




* [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-10-12 15:21 [PATCH v2 0/3] glibc-hwcaps support for LD_LIBRARY_PATH Florian Weimer via Libc-alpha
  2020-10-12 15:21 ` [PATCH 1/3] elf: Add " Florian Weimer via Libc-alpha
  2020-10-12 15:21 ` [PATCH v2 2/3] x86_64: Add glibc-hwcaps support Florian Weimer via Libc-alpha
@ 2020-10-12 15:22 ` Florian Weimer via Libc-alpha
  2020-10-13 16:36   ` Paul A. Clarke via Libc-alpha
                     ` (3 more replies)
  2 siblings, 4 replies; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-12 15:22 UTC (permalink / raw)
  To: libc-alpha; +Cc: Paul A. Clarke

The "power10" and "power9" subdirectories are selected.
---
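
Illustration only, not part of the commit: the expected values in
tst-glibc-hwcaps.c follow from the Makefile rules in this patch.  The
power9 copy of markermodN.so is built from markermodN-2.so and the
power10 copy of markermod3.so from markermod3-3.so.  The sketch below
assumes that markermodN-V.so makes markerN return V and that the
top-level markermodN.so is the V == 1 variant; both rules live in
elf/Makefile from patch 1/3, which is not shown here.  The helper name
expected_marker_sketch is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper.  LEVEL is the detected hardware level, BASELINE
   the level that uses the top-level object, and VARIANTS the number of
   builds of the module (baseline included).  Return the value the
   marker function is expected to produce.  */
static int
expected_marker_sketch (int level, int baseline, int variants)
{
  assert (level >= baseline && variants >= 1);

  /* Rank 1 is the baseline build, rank 2 the next subdirectory, and so
     on.  A module with fewer variants saturates at its best build.  */
  int rank = level - baseline + 1;
  return rank < variants ? rank : variants;
}
```

For POWER the baseline is 8, so the rank is level - 7 as in the test:
on power9, marker2 and marker3 are both expected to return 2, and on
power10 marker3 is expected to return 3 while marker2 stays at 2.  The
same arithmetic with baseline 1 gives the MIN (level, N) expressions of
the x86_64 test.
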
 sysdeps/powerpc/powerpc64/le/Makefile         | 22 ++++++++
 .../powerpc/powerpc64/le/dl-hwcaps-subdirs.c  | 39 ++++++++++++++
 .../powerpc/powerpc64/le/tst-glibc-hwcaps.c   | 54 +++++++++++++++++++
 3 files changed, 115 insertions(+)
 create mode 100644 sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
 create mode 100644 sysdeps/powerpc/powerpc64/le/tst-glibc-hwcaps.c

diff --git a/sysdeps/powerpc/powerpc64/le/Makefile b/sysdeps/powerpc/powerpc64/le/Makefile
index 033dc77b01..74715677ed 100644
--- a/sysdeps/powerpc/powerpc64/le/Makefile
+++ b/sysdeps/powerpc/powerpc64/le/Makefile
@@ -188,3 +188,25 @@ ifeq ($(subdir),nptl)
 CFLAGS-tst-thread_local1.cc += -mno-float128
 CFLAGS-tst-minstack-throw.cc += -mno-float128
 endif
+
+ifeq ($(subdir),elf)
+$(objpfx)tst-glibc-hwcaps: \
+  $(objpfx)markermod2-1.so $(objpfx)markermod3-1.so
+$(objpfx)tst-glibc-hwcaps.out: \
+  $(objpfx)markermod2.so \
+    $(objpfx)glibc-hwcaps/power9/markermod2.so \
+  $(objpfx)markermod3.so \
+    $(objpfx)glibc-hwcaps/power9/markermod3.so \
+    $(objpfx)glibc-hwcaps/power10/markermod3.so \
+
+$(objpfx)glibc-hwcaps/power9/markermod2.so: $(objpfx)markermod2-2.so
+	$(make-target-directory)
+	cp $< $@
+$(objpfx)glibc-hwcaps/power9/markermod3.so: $(objpfx)markermod3-2.so
+	$(make-target-directory)
+	cp $< $@
+$(objpfx)glibc-hwcaps/power10/markermod3.so: $(objpfx)markermod3-3.so
+	$(make-target-directory)
+	cp $< $@
+
+endif # $(subdir) == elf
diff --git a/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
new file mode 100644
index 0000000000..1fa3735a8c
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
@@ -0,0 +1,39 @@
+/* Architecture-specific glibc-hwcaps subdirectories.  powerpc64le version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <dl-hwcaps.h>
+#include <ldsodefs.h>
+
+const char _dl_hwcaps_subdirs[] = "power10:power9";
+enum { subdirs_count = 2 };
+
+uint32_t
+_dl_hwcaps_subdirs_active (void)
+{
+  int active = 0;
+
+  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00) == 0)
+    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
+  ++active;
+
+  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1) == 0)
+    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
+  ++active;
+
+  return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
+}
diff --git a/sysdeps/powerpc/powerpc64/le/tst-glibc-hwcaps.c b/sysdeps/powerpc/powerpc64/le/tst-glibc-hwcaps.c
new file mode 100644
index 0000000000..e510fca80a
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/tst-glibc-hwcaps.c
@@ -0,0 +1,54 @@
+/* glibc-hwcaps subdirectory test.  powerpc64le version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include <string.h>
+#include <support/check.h>
+#include <sys/auxv.h>
+#include <sys/param.h>
+
+extern int marker2 (void);
+extern int marker3 (void);
+
+/* Return the POWER level, 8 for the baseline.  */
+static int
+compute_level (void)
+{
+  const char *platform = (const char *) getauxval (AT_PLATFORM);
+  if (strcmp (platform, "power8") == 0)
+    return 8;
+  if (strcmp (platform, "power9") == 0)
+    return 9;
+  if (strcmp (platform, "power10") == 0)
+    return 10;
+  printf ("warning: unrecognized AT_PLATFORM value: %s\n", platform);
+  /* Assume that the new platform supports POWER10.  */
+  return 10;
+}
+
+static int
+do_test (void)
+{
+  int level = compute_level ();
+  printf ("info: detected POWER level: %d\n", level);
+  TEST_COMPARE (marker2 (), MIN (level - 7, 2));
+  TEST_COMPARE (marker3 (), MIN (level - 7, 3));
+  return 0;
+}
+
+#include <support/test-driver.c>



* Re: [PATCH v2 2/3] x86_64: Add glibc-hwcaps support
  2020-10-12 15:21 ` [PATCH v2 2/3] x86_64: Add glibc-hwcaps support Florian Weimer via Libc-alpha
@ 2020-10-12 18:11   ` H.J. Lu via Libc-alpha
  2020-10-13  9:29     ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 32+ messages in thread
From: H.J. Lu via Libc-alpha @ 2020-10-12 18:11 UTC (permalink / raw)
  To: Florian Weimer; +Cc: GNU C Library

On Mon, Oct 12, 2020 at 8:23 AM Florian Weimer via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> The subdirectories match those in the x86-64 psABI:
>
> https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9a6b9396884b67c7c
> ---
>  sysdeps/x86_64/Makefile            | 36 +++++++++++++++-
>  sysdeps/x86_64/dl-hwcaps-subdirs.c | 66 ++++++++++++++++++++++++++++++
>  sysdeps/x86_64/tst-glibc-hwcaps.c  | 65 +++++++++++++++++++++++++++++
>  3 files changed, 166 insertions(+), 1 deletion(-)
>  create mode 100644 sysdeps/x86_64/dl-hwcaps-subdirs.c
>  create mode 100644 sysdeps/x86_64/tst-glibc-hwcaps.c
>
> diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile
> index 42b97c5cc7..16030715e7 100644
> --- a/sysdeps/x86_64/Makefile
> +++ b/sysdeps/x86_64/Makefile
> @@ -144,7 +144,41 @@ CFLAGS-tst-auditmod10b.c += $(AVX512-CFLAGS)
>  CFLAGS-tst-avx512-aux.c += $(AVX512-CFLAGS)
>  CFLAGS-tst-avx512mod.c += $(AVX512-CFLAGS)
>  endif
> -endif
> +
> +$(objpfx)tst-glibc-hwcaps: \
> +  $(objpfx)markermod2-1.so $(objpfx)markermod3-1.so $(objpfx)markermod4-1.so
> +$(objpfx)tst-glibc-hwcaps.out: \
> +  $(objpfx)markermod2.so \
> +    $(objpfx)glibc-hwcaps/x86-64-v2/markermod2.so \
> +  $(objpfx)markermod3.so \
> +    $(objpfx)glibc-hwcaps/x86-64-v2/markermod3.so \
> +    $(objpfx)glibc-hwcaps/x86-64-v3/markermod3.so \
> +  $(objpfx)markermod4.so \
> +    $(objpfx)glibc-hwcaps/x86-64-v2/markermod4.so \
> +    $(objpfx)glibc-hwcaps/x86-64-v3/markermod4.so \
> +    $(objpfx)glibc-hwcaps/x86-64-v4/markermod4.so \
> +
> +$(objpfx)glibc-hwcaps/x86-64-v2/markermod2.so: $(objpfx)markermod2-2.so
> +       $(make-target-directory)
> +       cp $< $@
> +$(objpfx)glibc-hwcaps/x86-64-v2/markermod3.so: $(objpfx)markermod3-2.so
> +       $(make-target-directory)
> +       cp $< $@
> +$(objpfx)glibc-hwcaps/x86-64-v3/markermod3.so: $(objpfx)markermod3-3.so
> +       $(make-target-directory)
> +       cp $< $@
> +$(objpfx)glibc-hwcaps/x86-64-v2/markermod4.so: $(objpfx)markermod4-2.so
> +       $(make-target-directory)
> +       cp $< $@
> +$(objpfx)glibc-hwcaps/x86-64-v3/markermod4.so: $(objpfx)markermod4-3.so
> +       $(make-target-directory)
> +       cp $< $@
> +$(objpfx)glibc-hwcaps/x86-64-v4/markermod4.so: $(objpfx)markermod4-4.so
> +       $(make-target-directory)
> +       cp $< $@
> +
> +
> +endif # $(subdir) == elf
>
>  ifeq ($(subdir),csu)
>  gen-as-const-headers += tlsdesc.sym rtld-offsets.sym
> diff --git a/sysdeps/x86_64/dl-hwcaps-subdirs.c b/sysdeps/x86_64/dl-hwcaps-subdirs.c
> new file mode 100644
> index 0000000000..c4d8b3a02a
> --- /dev/null
> +++ b/sysdeps/x86_64/dl-hwcaps-subdirs.c
> @@ -0,0 +1,66 @@
> +/* Architecture-specific glibc-hwcaps subdirectories.  x86 version.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <dl-hwcaps.h>
> +#include <cpu-features.h>
>

Please use <sys/platform/x86.h>.

> +const char _dl_hwcaps_subdirs[] = "x86-64-v4:x86-64-v3:x86-64-v2";
> +enum { subdirs_count = 3 };
> +
> +uint32_t
> +_dl_hwcaps_subdirs_active (void)
> +{
> +  int active = 0;
> +
> +  /* Test in reverse preference order.  */
> +
> +  /* x86-64-v2.  */
> +  if (!(CPU_FEATURE_USABLE (CMPXCHG16B)
> +        && CPU_FEATURE_USABLE (LAHF64_SAHF64)
> +        && CPU_FEATURE_USABLE (POPCNT)
> +        && CPU_FEATURE_USABLE (SSE3)
> +        && CPU_FEATURE_USABLE (SSE4_1)
> +        && CPU_FEATURE_USABLE (SSE4_2)
> +        && CPU_FEATURE_USABLE (SSSE3)))
> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +  ++active;
> +
> +  /* x86-64-v3.  */
> +  if (!(CPU_FEATURE_USABLE (AVX)
> +        && CPU_FEATURE_USABLE (AVX2)
> +        && CPU_FEATURE_USABLE (BMI1)
> +        && CPU_FEATURE_USABLE (BMI2)
> +        && CPU_FEATURE_USABLE (F16C)
> +        && CPU_FEATURE_USABLE (FMA)
> +        && CPU_FEATURE_USABLE (LZCNT)
> +        && CPU_FEATURE_USABLE (MOVBE)
> +        && CPU_FEATURE_USABLE (OSXSAVE)))
> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +  ++active;
> +
> + /* x86-64-v4.  */
> +  if (!(CPU_FEATURE_USABLE (AVX512F)
> +        && CPU_FEATURE_USABLE (AVX512BW)
> +        && CPU_FEATURE_USABLE (AVX512CD)
> +        && CPU_FEATURE_USABLE (AVX512DQ)
> +        && CPU_FEATURE_USABLE (AVX512VL)))
> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +  ++active;
> +
> +  return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +}
> diff --git a/sysdeps/x86_64/tst-glibc-hwcaps.c b/sysdeps/x86_64/tst-glibc-hwcaps.c
> new file mode 100644
> index 0000000000..b46e7cb236
> --- /dev/null
> +++ b/sysdeps/x86_64/tst-glibc-hwcaps.c
> @@ -0,0 +1,65 @@
> +/* glibc-hwcaps subdirectory test.  x86_64 version.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <stdio.h>
> +#include <support/check.h>
> +#include <sys/param.h>

Please include <sys/platform/x86.h>.

> +extern int marker2 (void);
> +extern int marker3 (void);
> +extern int marker4 (void);
> +
> +/* Return the x86-64-vN level, 1 for the baseline.  */
> +static int
> +compute_level (void)
> +{
> +  /* These checks are not entirely accurate because they are limited
> +     by GCC capabilities.  But wrong results will only result from
> +     inconsistent CPU models involving virtualization.  */

Please use

  const struct cpu_features *cpu_features
    = __x86_get_cpu_features (COMMON_CPUID_INDEX_MAX);

  if (CPU_FEATURE_USABLE_P (cpu_features, CMPXCHG16B)
      && CPU_FEATURE_USABLE_P (cpu_features, LAHF64_SAHF64)
      && CPU_FEATURE_USABLE_P (cpu_features, POPCNT)
      && CPU_FEATURE_USABLE_P (cpu_features, MMX)
      && CPU_FEATURE_USABLE_P (cpu_features, SSE)
      && CPU_FEATURE_USABLE_P (cpu_features, SSE2)
      && CPU_FEATURE_USABLE_P (cpu_features, SSE3)
      && CPU_FEATURE_USABLE_P (cpu_features, SSSE3)
      && CPU_FEATURE_USABLE_P (cpu_features, SSE4_1)
      && CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
...

> +  if (!(__builtin_cpu_supports ("sse3")
> +        && __builtin_cpu_supports ("sse4.1")
> +        && __builtin_cpu_supports ("sse4.2")
> +        && __builtin_cpu_supports ("ssse3")))
> +    return 1;
> +  if (!(__builtin_cpu_supports ("avx")
> +        && __builtin_cpu_supports ("avx2")
> +        && __builtin_cpu_supports ("bmi")
> +        && __builtin_cpu_supports ("bmi2")
> +        && __builtin_cpu_supports ("fma")))
> +    return 2;
> +  if (!(__builtin_cpu_supports ("avx512f")
> +        && __builtin_cpu_supports ("avx512bw")
> +        && __builtin_cpu_supports ("avx512cd")
> +        && __builtin_cpu_supports ("avx512dq")
> +        && __builtin_cpu_supports ("avx512vl")))
> +    return 3;
> +  return 4;
> +}
> +
> +static int
> +do_test (void)
> +{
> +  int level = compute_level ();
> +  printf ("info: detected x86-64 micro-architecture level: %d\n", level);
> +  TEST_COMPARE (marker2 (), MIN (level, 2));
> +  TEST_COMPARE (marker3 (), MIN (level, 3));
> +  TEST_COMPARE (marker4 (), MIN (level, 4));
> +  return 0;
> +}
> +
> +#include <support/test-driver.c>
> --
> Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
> Commercial register: Amtsgericht Muenchen, HRB 153243,
> Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
>
>


-- 
H.J.


* Re: [PATCH v2 2/3] x86_64: Add glibc-hwcaps support
  2020-10-12 18:11   ` H.J. Lu via Libc-alpha
@ 2020-10-13  9:29     ` Florian Weimer via Libc-alpha
  2020-10-13 11:02       ` H.J. Lu via Libc-alpha
  0 siblings, 1 reply; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-13  9:29 UTC (permalink / raw)
  To: H.J. Lu via Libc-alpha

* H. J. Lu via Libc-alpha:

>> diff --git a/sysdeps/x86_64/dl-hwcaps-subdirs.c b/sysdeps/x86_64/dl-hwcaps-subdirs.c
>> new file mode 100644
>> index 0000000000..c4d8b3a02a
>> --- /dev/null
>> +++ b/sysdeps/x86_64/dl-hwcaps-subdirs.c

>> +#include <dl-hwcaps.h>
>> +#include <cpu-features.h>
>>
>
> Please use <sys/platform/x86.h>.

How does this work?  There is no wrapper header for it, so its
functionality is not really available within ld.so.

>> diff --git a/sysdeps/x86_64/tst-glibc-hwcaps.c b/sysdeps/x86_64/tst-glibc-hwcaps.c
>> new file mode 100644
>> index 0000000000..b46e7cb236
>> --- /dev/null
>> +++ b/sysdeps/x86_64/tst-glibc-hwcaps.c

>> +#include <stdio.h>
>> +#include <support/check.h>
>> +#include <sys/param.h>
>
> Please include <sys/platform/x86.h>.

I don't want to test the implementation against itself.  I'd rather use
the GCC functionality and if necessary, add the additional feature
checks by open-coding the CPU analysis.
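
A minimal sketch of what such open-coded analysis might look like for the x86-64-v2 feature list from the patch, assuming GCC's <cpuid.h> and its __get_cpuid helper.  The function name is hypothetical and not part of the patch, and only raw CPUID bits are examined; the v3 and v4 levels would additionally need an OS-state check (XGETBV), which this sketch leaves out:

```c
#include <stdbool.h>

#if defined (__x86_64__) || defined (__i386__)
# include <cpuid.h>
#endif

/* Hypothetical helper, not part of the patch: returns true if CPUID
   reports every x86-64-v2 feature bit listed in the patch.  On other
   architectures it simply returns false.  */
bool
cpuid_reports_x86_64_v2 (void)
{
#if defined (__x86_64__) || defined (__i386__)
  unsigned int eax, ebx, ecx, edx;

  /* Leaf 1, ECX: SSE3 (bit 0), SSSE3 (bit 9), CMPXCHG16B (bit 13),
     SSE4.1 (bit 19), SSE4.2 (bit 20), POPCNT (bit 23).  */
  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return false;
  const unsigned int leaf1_ecx
    = (1U << 0) | (1U << 9) | (1U << 13)
      | (1U << 19) | (1U << 20) | (1U << 23);
  if ((ecx & leaf1_ecx) != leaf1_ecx)
    return false;

  /* Leaf 0x80000001, ECX bit 0: LAHF/SAHF available in 64-bit mode.  */
  if (!__get_cpuid (0x80000001, &eax, &ebx, &ecx, &edx))
    return false;
  return (ecx & 1U) != 0;
#else
  return false;
#endif
}
```

A test could combine a helper like this with the __builtin_cpu_supports checks, so that the expected level does not depend on glibc's own CPU feature tables.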

Thanks,
Florian

* Re: [PATCH v2 2/3] x86_64: Add glibc-hwcaps support
  2020-10-13  9:29     ` Florian Weimer via Libc-alpha
@ 2020-10-13 11:02       ` H.J. Lu via Libc-alpha
  2020-10-13 11:24         ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 32+ messages in thread
From: H.J. Lu via Libc-alpha @ 2020-10-13 11:02 UTC (permalink / raw)
  To: Florian Weimer; +Cc: H.J. Lu via Libc-alpha

On Tue, Oct 13, 2020 at 2:29 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * H. J. Lu via Libc-alpha:
>
> >> diff --git a/sysdeps/x86_64/dl-hwcaps-subdirs.c b/sysdeps/x86_64/dl-hwcaps-subdirs.c
> >> new file mode 100644
> >> index 0000000000..c4d8b3a02a
> >> --- /dev/null
> >> +++ b/sysdeps/x86_64/dl-hwcaps-subdirs.c
>
> >> +#include <dl-hwcaps.h>
> >> +#include <cpu-features.h>
> >>
> >
> > Please use <sys/platform/x86.h>.
>
> How does this work?  There is no wrapper header for it, so its
> functionality is not really available within ld.so.

You are right.

> >> diff --git a/sysdeps/x86_64/tst-glibc-hwcaps.c b/sysdeps/x86_64/tst-glibc-hwcaps.c
> >> new file mode 100644
> >> index 0000000000..b46e7cb236
> >> --- /dev/null
> >> +++ b/sysdeps/x86_64/tst-glibc-hwcaps.c
>
> >> +#include <stdio.h>
> >> +#include <support/check.h>
> >> +#include <sys/param.h>
> >
> > Please include <sys/platform/x86.h>.
>
> I don't want to test the implementation against itself.  I'd rather use
> the GCC functionality and if necessary, add the additional feature
> checks by open-coding the CPU analysis.

You need to check all ISAs included in each ISA level.
There are testcases to check that <sys/platform/x86.h>
is correct.

-- 
H.J.


* Re: [PATCH v2 2/3] x86_64: Add glibc-hwcaps support
  2020-10-13 11:02       ` H.J. Lu via Libc-alpha
@ 2020-10-13 11:24         ` Florian Weimer via Libc-alpha
  2020-10-13 11:43           ` H.J. Lu via Libc-alpha
  0 siblings, 1 reply; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-13 11:24 UTC (permalink / raw)
  To: H.J. Lu via Libc-alpha

* H. J. Lu via Libc-alpha:

> On Tue, Oct 13, 2020 at 2:29 AM Florian Weimer <fweimer@redhat.com> wrote:
>>
>> * H. J. Lu via Libc-alpha:
>>
>> >> diff --git a/sysdeps/x86_64/dl-hwcaps-subdirs.c b/sysdeps/x86_64/dl-hwcaps-subdirs.c
>> >> new file mode 100644
>> >> index 0000000000..c4d8b3a02a
>> >> --- /dev/null
>> >> +++ b/sysdeps/x86_64/dl-hwcaps-subdirs.c
>>
>> >> +#include <dl-hwcaps.h>
>> >> +#include <cpu-features.h>
>> >>
>> >
>> > Please use <sys/platform/x86.h>.
>>
>> How does this work?  There is no wrapper header for it, so its
>> functionality is not really available within ld.so.
>
> You are right.

So this part is okay?

>> >> diff --git a/sysdeps/x86_64/tst-glibc-hwcaps.c b/sysdeps/x86_64/tst-glibc-hwcaps.c
>> >> new file mode 100644
>> >> index 0000000000..b46e7cb236
>> >> --- /dev/null
>> >> +++ b/sysdeps/x86_64/tst-glibc-hwcaps.c
>>
>> >> +#include <stdio.h>
>> >> +#include <support/check.h>
>> >> +#include <sys/param.h>
>> >
>> > Please include <sys/platform/x86.h>.
>>
>> I don't want to test the implementation against itself.  I'd rather use
>> the GCC functionality and if necessary, add the additional feature
>> checks by open-coding the CPU analysis.
>
> You need to check all ISAs included in each ISA level.
> There are testcases to check that <sys/platform/x86.h>
> is correct.

Do you still want me to use <sys/platform/x86.h> then?

Thanks,
Florian

* Re: [PATCH v2 2/3] x86_64: Add glibc-hwcaps support
  2020-10-13 11:24         ` Florian Weimer via Libc-alpha
@ 2020-10-13 11:43           ` H.J. Lu via Libc-alpha
  0 siblings, 0 replies; 32+ messages in thread
From: H.J. Lu via Libc-alpha @ 2020-10-13 11:43 UTC (permalink / raw)
  To: Florian Weimer; +Cc: H.J. Lu via Libc-alpha

On Tue, Oct 13, 2020 at 4:25 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * H. J. Lu via Libc-alpha:
>
> > On Tue, Oct 13, 2020 at 2:29 AM Florian Weimer <fweimer@redhat.com> wrote:
> >>
> >> * H. J. Lu via Libc-alpha:
> >>
> >> >> diff --git a/sysdeps/x86_64/dl-hwcaps-subdirs.c b/sysdeps/x86_64/dl-hwcaps-subdirs.c
> >> >> new file mode 100644
> >> >> index 0000000000..c4d8b3a02a
> >> >> --- /dev/null
> >> >> +++ b/sysdeps/x86_64/dl-hwcaps-subdirs.c
> >>
> >> >> +#include <dl-hwcaps.h>
> >> >> +#include <cpu-features.h>
> >> >>
> >> >
> >> > Please use <sys/platform/x86.h>.
> >>
> >> How does this work?  There is no wrapper header for it, so its
> >> functionality is not really available within ld.so.
> >
> > You are right.
>
> So this part is okay?

Yes.

> >> >> diff --git a/sysdeps/x86_64/tst-glibc-hwcaps.c b/sysdeps/x86_64/tst-glibc-hwcaps.c
> >> >> new file mode 100644
> >> >> index 0000000000..b46e7cb236
> >> >> --- /dev/null
> >> >> +++ b/sysdeps/x86_64/tst-glibc-hwcaps.c
> >>
> >> >> +#include <stdio.h>
> >> >> +#include <support/check.h>
> >> >> +#include <sys/param.h>
> >> >
> >> > Please include <sys/platform/x86.h>.
> >>
> >> I don't want to test the implementation against itself.  I'd rather use
> >> the GCC functionality and if necessary, add the additional feature
> >> checks by open-coding the CPU analysis.
> >
> > You need to check all ISAs included in each ISA level.
> > There are testcases to check that <sys/platform/x86.h>
> > is correct.
>
> Do you still want me to use <sys/platform/x86.h> then?
>

Yes.

-- 
H.J.


* Re: [PATCH 1/3] elf: Add glibc-hwcaps support for LD_LIBRARY_PATH
  2020-10-12 15:21 ` [PATCH 1/3] elf: Add " Florian Weimer via Libc-alpha
@ 2020-10-13 16:28   ` Paul A. Clarke via Libc-alpha
  2020-10-14 13:58     ` Florian Weimer via Libc-alpha
  2020-10-20 17:23   ` Paul A. Clarke via Libc-alpha
  1 sibling, 1 reply; 32+ messages in thread
From: Paul A. Clarke via Libc-alpha @ 2020-10-13 16:28 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha

On Mon, Oct 12, 2020 at 05:21:44PM +0200, Florian Weimer via Libc-alpha wrote:
> This hacks non-power-set processing into _dl_important_hwcaps.
> Once the legacy hwcaps handling goes away, the subdirectory
> handling needs to be reworked, but it is premature to do this
> while both approaches are still supported.
> ---
[snip]
> diff --git a/elf/dl-hwcaps.h b/elf/dl-hwcaps.h
> index b66da59b89..9071367038 100644
> --- a/elf/dl-hwcaps.h
> +++ b/elf/dl-hwcaps.h
> @@ -16,6 +16,11 @@
>     License along with the GNU C Library; if not, see
>     <https://www.gnu.org/licenses/>.  */
> 
> +#ifndef _DL_HWCAPS_H
> +#define _DL_HWCAPS_H
> +
> +#include <stdint.h>
> +
>  #include <elf/dl-tunables.h>
> 
>  #if HAVE_TUNABLES
> @@ -28,3 +33,101 @@
>  #  define GET_HWCAP_MASK() (0)
>  # endif
>  #endif
> +
> +#define GLIBC_HWCAPS_SUBDIRECTORY "glibc-hwcaps"
> +#define GLIBC_HWCAPS_PREFIX GLIBC_HWCAPS_SUBDIRECTORY "/"
> +
> +/* Used by _dl_hwcaps_split below, to split strings at ':'
> +   separators.  */
> +struct dl_hwcaps_split
> +{
> +  const char *segment;          /* Start of the current segment.  */
> +  size_t length;                /* Number of bytes until ':' or NUL.  */
> +};
> +
> +/* Prepare *S to parse SUBJECT, for future _dl_hwcaps_split calls.  If
> +   SUBJECT is NULL, it is treated as the empty string.  */
> +static inline void
> +_dl_hwcaps_split_init (struct dl_hwcaps_split *s, const char *subject)
> +{
> +  s->segment = subject;
> +  /* The initial call to _dl_hwcaps_split will not skip anything.  */
> +  s->length = 0;
> +}
> +
> +/* Extract the next non-empty string segment, up to ':' or the null
> +   terminator.  Return true if one more segment was found, or false if
> +   the end of the string was reached.  On success, S->segment is the
> +   start of the segment found, and S->length is its length.
> +   (Typically, S->segment[S->length] is not null.)  */
> +_Bool _dl_hwcaps_split (struct dl_hwcaps_split *s) attribute_hidden;
> +
> +/* Similar to dl_hwcaps_split, but with bit-based and name-based
> +   masking.  */
> +struct dl_hwcaps_split_masked
> +{
> +  struct dl_hwcaps_split split;
> +
> +  /* For use by the iterator implementation.  */
> +  const char *mask;
> +  uint32_t bitmask;
> +};
> +
> +/* Prepare *S for iteration with _dl_hwcaps_split_masked.  Only HWCAP
> +   names in SUBJECT whose bit is set in BITMASK and whose name is in
> +   MASK will be returned.  SUBJECT must not contain empty HWCAP names.
> +   If MASK is NULL, no name-based masking is applied.  Likewise for
> +   BITMASK if BITMASK is -1 (infinite number of bits).  */
> +static inline void
> +_dl_hwcaps_split_masked_init (struct dl_hwcaps_split_masked *s,
> +                              const char *subject,
> +                              uint32_t bitmask, const char *mask)
> +{
> +  _dl_hwcaps_split_init (&s->split, subject);
> +  s->bitmask = bitmask;
> +  s->mask = mask;
> +}
> +
> +/* Like _dl_hwcaps_split, but apply masking.  */
> +_Bool _dl_hwcaps_split_masked (struct dl_hwcaps_split_masked *s)
> +  attribute_hidden;
> +
> +/* Returns true if the colon-separated HWCAP list HWCAPS contains the
> +   capability NAME (with length NAME_LENGTH).  If HWCAPS is NULL, the
> +   function returns true.  */
> +_Bool _dl_hwcaps_contains (const char *hwcaps, const char *name,
> +                           size_t name_length) attribute_hidden;
> +
> +/* Colon-separated string of glibc-hwcaps subdirectories, without the
> +   "glibc-hwcaps/" prefix.  The most preferred subdirectory needs to
> +   be listed first.  */
> +extern const char _dl_hwcaps_subdirs[] attribute_hidden;

Should we note the limitations, that the number of subdirectories must
be <= 32?

> +
> +/* Returns a bitmap of active subdirectories in _dl_hwcaps_subdirs.
> +   Bit 0 (the LSB) corresponds to the first substring in
> +   _dl_hwcaps_subdirs, bit 1 to the second substring, and so on.
> +   There is no direct correspondence between HWCAP bitmasks and this
> +   bitmask.  */
> +uint32_t _dl_hwcaps_subdirs_active (void) attribute_hidden;
> +
> +/* Returns a bitmask that marks the last ACTIVE subdirectories in a
> +   _dl_hwcaps_subdirs_active string (containing SUBDIRS directories in
> +   total) as active.  Intended for use in _dl_hwcaps_subdirs_active
> +   implementations.  */
> +static inline uint32_t
> +_dl_hwcaps_subdirs_build_bitmask (int subdirs, int active)
> +{
> +  /* Leading subdirectories that are not active.  */
> +  int inactive = subdirs - active;
> +  if (inactive == 32)
> +    return 0;
> +
> +  uint32_t mask;
> +  if (subdirs < 32)
> +    mask = (1U << subdirs) - 1;
> +  else
> +    mask = -1;
> +  return mask ^ ((1U << inactive) - 1);

Should we validate any inputs in this function, that:
- subdirs <= 32
- active <= 32 and active <= subdirs

While validating this function, I created an equivalent:
        if (subdirs == 0) return 0;
        if (active == 32) return -1;
        uint32_t mask = -1;
        /* Mask to include all subdirs.  */
        mask >>= 32 - subdirs;
        /* Unmask all inactive.  */
        mask &= ~(mask >> active);
        return mask;
...I found this more readable, but it's subjective.
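
The equivalence claimed above can be checked exhaustively over the valid input range.  The following is a standalone sketch, not glibc code: the function names are hypothetical, the shift counts are written in terms of subdirs and active, and it assumes a 32-bit unsigned int and inputs satisfying 0 <= active <= subdirs <= 32:

```c
#include <stdint.h>

/* Formulation from the patch (XOR of two low-bit masks).  */
uint32_t
bitmask_xor (int subdirs, int active)
{
  /* Leading subdirectories that are not active.  */
  int inactive = subdirs - active;
  if (inactive == 32)
    return 0;

  uint32_t mask;
  if (subdirs < 32)
    mask = (1U << subdirs) - 1;
  else
    mask = UINT32_MAX;
  return mask ^ ((1U << inactive) - 1);
}

/* Shift-based alternative from the review.  */
uint32_t
bitmask_shift (int subdirs, int active)
{
  if (subdirs == 0)
    return 0;
  if (active == 32)
    return UINT32_MAX;
  uint32_t mask = UINT32_MAX;
  /* Mask to include all subdirs.  */
  mask >>= 32 - subdirs;
  /* Unmask all inactive.  */
  mask &= ~(mask >> active);
  return mask;
}

/* Count the valid (subdirs, active) pairs on which the two
   formulations disagree.  */
int
count_mismatches (void)
{
  int mismatches = 0;
  for (int subdirs = 0; subdirs <= 32; ++subdirs)
    for (int active = 0; active <= subdirs; ++active)
      if (bitmask_xor (subdirs, active) != bitmask_shift (subdirs, active))
        ++mismatches;
  return mismatches;
}
```

Both early returns in each variant exist only to avoid a shift by 32, which is undefined for a 32-bit operand.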

Also, this routine makes a broad assumption that active subdirectories
are all contiguous and at the head of the list.  Maybe this should be
renamed _dl_hwcaps_subdirs_build_range_bitmask (or ..._top_range_...),
with an updated comment that reflects its limited use-case.

> +}
> +
> +#endif /* _DL_HWCAPS_H */
[snip]

PC


* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-10-12 15:22 ` [PATCH v2 3/3] powerpc64le: " Florian Weimer via Libc-alpha
@ 2020-10-13 16:36   ` Paul A. Clarke via Libc-alpha
  2020-10-20 17:23   ` Paul A. Clarke via Libc-alpha
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 32+ messages in thread
From: Paul A. Clarke via Libc-alpha @ 2020-10-13 16:36 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha

On Mon, Oct 12, 2020 at 05:22:11PM +0200, Florian Weimer via Libc-alpha wrote:
> The "power10" and "power9" subdirectories are selected.
> ---
[snip]
> diff --git a/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> new file mode 100644
> index 0000000000..1fa3735a8c
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> @@ -0,0 +1,39 @@
> +/* Architecture-specific glibc-hwcaps subdirectories.  powerpc64le version.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <dl-hwcaps.h>
> +#include <ldsodefs.h>
> +
> +const char _dl_hwcaps_subdirs[] = "power10:power9";
> +enum { subdirs_count = 2 };
> +
> +uint32_t
> +_dl_hwcaps_subdirs_active (void)
> +{
> +  int active = 0;
> +
> +  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00) == 0)
> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);

Since active == 0, this is just "return 0", but it's consistent with
below... OK.

> +  ++active;
> +
> +  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1) == 0)
> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +  ++active;
> +
> +  return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +}

OK.

The maintainers will need to be careful with ordering, in both the 
_dl_hwcaps_subdirs string and the code within the function.
The string must be in priority order and the code stanzas must be in
reverse priority order, and the result must be a contiguous range.

Comments might help.
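
The ordering constraint can be illustrated with a standalone sketch.  It assumes a subdirectory string of "power10:power9" as in the patch, but the feature bit constants and function names below are placeholders invented for the example, not the real PPC_FEATURE2_* values or glibc interfaces:

```c
#include <stdint.h>

/* Placeholder feature bits for illustration only.  */
enum { SKETCH_ARCH_3_00 = 1 << 0, SKETCH_ARCH_3_1 = 1 << 1 };

/* Same arithmetic as the helper in the patch: mark the last ACTIVE of
   SUBDIRS entries as active.  */
uint32_t
sketch_build_bitmask (int subdirs, int active)
{
  int inactive = subdirs - active;
  if (inactive == 32)
    return 0;
  uint32_t mask = subdirs < 32 ? (1U << subdirs) - 1 : UINT32_MAX;
  return mask ^ ((1U << inactive) - 1);
}

/* String order (most preferred first):  "power10:power9"
   Test order (least preferred first):   power9, then power10.
   Bit 0 corresponds to "power10", bit 1 to "power9".  */
uint32_t
sketch_subdirs_active (unsigned long int hwcap2)
{
  enum { subdirs_count = 2 };
  int active = 0;

  /* power9: last in the string, so it is tested first.  */
  if ((hwcap2 & SKETCH_ARCH_3_00) == 0)
    return sketch_build_bitmask (subdirs_count, active);
  ++active;

  /* power10: first in the string, so it is tested last.  */
  if ((hwcap2 & SKETCH_ARCH_3_1) == 0)
    return sketch_build_bitmask (subdirs_count, active);
  ++active;

  return sketch_build_bitmask (subdirs_count, active);
}
```

With only the first placeholder bit set the result is 0x2 (just "power9"); with both set it is 0x3.  A power10 bit without the power9 bit yields 0, which is what keeps the selected range contiguous.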

LGTM.

PC


* Re: [PATCH 1/3] elf: Add glibc-hwcaps support for LD_LIBRARY_PATH
  2020-10-13 16:28   ` Paul A. Clarke via Libc-alpha
@ 2020-10-14 13:58     ` Florian Weimer via Libc-alpha
  2020-10-14 15:14       ` Paul A. Clarke via Libc-alpha
  0 siblings, 1 reply; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-14 13:58 UTC (permalink / raw)
  To: Paul A. Clarke; +Cc: libc-alpha

* Paul A. Clarke:

>> +/* Returns true if the colon-separated HWCAP list HWCAPS contains the
>> +   capability NAME (with length NAME_LENGTH).  If HWCAPS is NULL, the
>> +   function returns true.  */
>> +_Bool _dl_hwcaps_contains (const char *hwcaps, const char *name,
>> +                           size_t name_length) attribute_hidden;
>> +
>> +/* Colon-separated string of glibc-hwcaps subdirectories, without the
>> +   "glibc-hwcaps/" prefix.  The most preferred subdirectory needs to
>> +   be listed first.  */
>> +extern const char _dl_hwcaps_subdirs[] attribute_hidden;
>
> Should we note the limitations, that the number of subdirectories must
> be <= 32?

Fair enough, I'm going to expand the comment.

>> +/* Returns a bitmap of active subdirectories in _dl_hwcaps_subdirs.
>> +   Bit 0 (the LSB) corresponds to the first substring in
>> +   _dl_hwcaps_subdirs, bit 1 to the second substring, and so on.
>> +   There is no direct correspondence between HWCAP bitmasks and this
>> +   bitmask.  */
>> +uint32_t _dl_hwcaps_subdirs_active (void) attribute_hidden;
>> +
>> +/* Returns a bitmask that marks the last ACTIVE subdirectories in a
>> +   _dl_hwcaps_subdirs_active string (containing SUBDIRS directories in
>> +   total) as active.  Intended for use in _dl_hwcaps_subdirs_active
>> +   implementations.  */
>> +static inline uint32_t
>> +_dl_hwcaps_subdirs_build_bitmask (int subdirs, int active)
>> +{
>> +  /* Leading subdirectories that are not active.  */
>> +  int inactive = subdirs - active;
>> +  if (inactive == 32)
>> +    return 0;
>> +
>> +  uint32_t mask;
>> +  if (subdirs < 32)
>> +    mask = (1U << subdirs) - 1;
>> +  else
>> +    mask = -1;
>> +  return mask ^ ((1U << inactive) - 1);
>
> Should we validate any inputs in this function, that:
> - subdirs <= 32
> - active <= 32 and active <= subdirs

Violating these preconditions results in undefined behavior at compile
time, so I expected GCC (and Clang) to warn about that.  But no such
luck there.  I asked two colleagues about what we can do on the GCC
side.  I do think GCC should warn about this under -Wall because it
returns a totally made-up value.

I think if we can get that fixed in GCC mainline, we don't have to
clutter our code with asserts.
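
For comparison, a run-time validated variant might look like the sketch below.  It is not part of the patch: the function name is hypothetical, and it assumes that assert is acceptable in the calling context, which may not hold for every ld.so code path:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical checked variant of the bitmask helper.  The two
   asserts spell out the preconditions under which none of the shifts
   below reaches the width of the type.  */
uint32_t
checked_build_bitmask (int subdirs, int active)
{
  assert (subdirs >= 0 && subdirs <= 32);
  assert (active >= 0 && active <= subdirs);

  /* Leading subdirectories that are not active.  */
  int inactive = subdirs - active;
  if (inactive == 32)
    return 0;

  uint32_t mask;
  if (subdirs < 32)
    mask = (1U << subdirs) - 1;
  else
    mask = UINT32_MAX;
  return mask ^ ((1U << inactive) - 1);
}
```

The trade-off is the one described above: the checks document the contract, but a compile-time diagnostic would catch the same mistakes without adding code to the helper.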

> While validating this function, I created an equivalent:
>         if (subdirs == 0) return 0;
>         if (active == 32) return -1;
>         uint32_t mask = -1;
>         /* Mask to include all subdirs.  */
>         mask >>= 32 - subdirs;
>         /* Unmask all inactive.  */
>         mask &= ~(mask >> active);
>         return mask;
> ...I found this more readable, but it's subjective.

Yeah, what we really want here is LDB or DPB, or Erlang's bit syntax. 8-/

> Also, this routine makes a broad assumption that active subdirectories
> are all contiguous and at the head of the list.  Maybe this should be
> renamed _dl_hwcaps_subdirs_build_range_bitmask (or ..._top_range_...),
> with an updated comment that reflects its limited use-case.

That makes sense.  I think we can delay that until such a target
arrives.

Do you have further comments on this code?  Anyone else?

Thanks,
Florian

* Re: [PATCH 1/3] elf: Add glibc-hwcaps support for LD_LIBRARY_PATH
  2020-10-14 13:58     ` Florian Weimer via Libc-alpha
@ 2020-10-14 15:14       ` Paul A. Clarke via Libc-alpha
  2020-10-14 15:19         ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 32+ messages in thread
From: Paul A. Clarke via Libc-alpha @ 2020-10-14 15:14 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha

On Wed, Oct 14, 2020 at 03:58:59PM +0200, Florian Weimer via Libc-alpha wrote:
> * Paul A. Clarke:
> 
> >> +/* Returns true if the colon-separated HWCAP list HWCAPS contains the
> >> +   capability NAME (with length NAME_LENGTH).  If HWCAPS is NULL, the
> >> +   function returns true.  */
> >> +_Bool _dl_hwcaps_contains (const char *hwcaps, const char *name,
> >> +                           size_t name_length) attribute_hidden;
> >> +
> >> +/* Colon-separated string of glibc-hwcaps subdirectories, without the
> >> +   "glibc-hwcaps/" prefix.  The most preferred subdirectory needs to
> >> +   be listed first.  */
> >> +extern const char _dl_hwcaps_subdirs[] attribute_hidden;
> >
> > Should we note the limitations, that the number of subdirectories must
> > be <= 32?
> 
> Fair enough, I'm going to expand the comment.

OK.

> >> +/* Returns a bitmap of active subdirectories in _dl_hwcaps_subdirs.
> >> +   Bit 0 (the LSB) corresponds to the first substring in
> >> +   _dl_hwcaps_subdirs, bit 1 to the second substring, and so on.
> >> +   There is no direct correspondence between HWCAP bitmasks and this
> >> +   bitmask.  */
> >> +uint32_t _dl_hwcaps_subdirs_active (void) attribute_hidden;
> >> +
> >> +/* Returns a bitmask that marks the last ACTIVE subdirectories in a
> >> +   _dl_hwcaps_subdirs_active string (containing SUBDIRS directories in
> >> +   total) as active.  Intended for use in _dl_hwcaps_subdirs_active
> >> +   implementations.  */
> >> +static inline uint32_t
> >> +_dl_hwcaps_subdirs_build_bitmask (int subdirs, int active)
> >> +{
> >> +  /* Leading subdirectories that are not active.  */
> >> +  int inactive = subdirs - active;
> >> +  if (inactive == 32)
> >> +    return 0;
> >> +
> >> +  uint32_t mask;
> >> +  if (subdirs < 32)
> >> +    mask = (1U << subdirs) - 1;
> >> +  else
> >> +    mask = -1;
> >> +  return mask ^ ((1U << inactive) - 1);
> >
> > Should we validate any inputs in this function, that:
> > - subdirs <= 32
> > - active <= 32 and active <= subdirs
> 
> > Violating these preconditions results in undefined behavior at compile
> time, so I expected GCC (and Clang) to warn about that.  But no such
> luck there.  I asked two colleagues about what we can do on the GCC
> side.  I do think GCC should warn about this under -Wall because it
> returns a totally made-up value.
> 
> I think if we can get that fixed in GCC mainline, we don't have to
> clutter our code with asserts.
> 

With sufficient visibility, GCC can issue such warnings:
test.c:4:34: warning: left shift count >= width of type [-Wshift-count-overflow]
  printf ("%x << 32 = %x\n", r, r << 32);

...so maybe it already "just works", but your patches don't exercise
that because they aren't broken.  :-)

> > While validating this function, I created an equivalent:
> >         if (subdirs == 0) return 0;
> >         if (active == 32) return -1;
> >         uint32_t mask = -1;
> >         /* Mask to include all subdirs.  */
> >         mask >>= 32 - subdirs;
> >         /* Unmask all inactive.  */
> >         mask &= ~(mask >> active);
> >         return mask;
> > ...I found this more readable, but it's subjective.
> 
> Yeah, what we really want here is LDB or DPB, or Erlang's bit syntax. 8-/
> 
> > Also, this routine makes a broad assumption that active subdirectories
> > are all contiguous and at the head of the list.  Maybe this should be
> > renamed _dl_hwcaps_subdirs_build_range_bitmask (or ..._top_range_...),
> > with an updated comment that reflects its limited use-case.
> 
> > That makes sense.  I think we can delay that until such a target
> arrives.

I'm not sure what you mean here.

PC


* Re: [PATCH 1/3] elf: Add glibc-hwcaps support for LD_LIBRARY_PATH
  2020-10-14 15:14       ` Paul A. Clarke via Libc-alpha
@ 2020-10-14 15:19         ` Florian Weimer via Libc-alpha
  0 siblings, 0 replies; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-14 15:19 UTC (permalink / raw)
  To: Paul A. Clarke; +Cc: libc-alpha

* Paul A. Clarke:

>> Violating these preconditions results in undefined behavior at compile
>> time, so I expected GCC (and Clang) to warn about that.  But no such
>> luck there.  I asked two colleagues about what we can do on the GCC
>> side.  I do think GCC should warn about this under -Wall because it
>> returns a totally made-up value.
>> 
>> I think if we can get that fixed in GCC mainline, we don't have to
>> clutter our code with asserts.
>> 
>
> With sufficient visibility, GCC can issue such warnings:
> test.c:4:34: warning: left shift count >= width of type [-Wshift-count-overflow]
>   printf ("%x << 32 = %x\n", r, r << 32);
>
> ...so maybe it already "just works", but your patches don't exercise
> that because they aren't broken.  :-)

I don't get the warning even when this happens.  I tried with a
reproducer.  I don't know why that happens; GCC compiles the expression
to a constant, so it must have seen the undefined computation.

>> > While validating this function, I created an equivalent:
>> >         if (subdirs == 0) return 0;
>> >         if (active == 32) return -1;
>> >         uint32_t mask = -1;
>> >         /* Mask to include all subdirs.  */
>> >         mask >>= 32 - subdirs;
>> >         /* Unmask all inactive.  */
>> >         mask &= ~(mask >> active);
>> >         return mask;
>> > ...I found this more readable, but it's subjective.
>> 
>> Yeah, what we really want here is LDB or DPB, or Erlang's bit syntax. 8-/
>> 
>> > Also, this routine makes a broad assumption that active subdirectories
>> > are all contiguous and at the head of the list.  Maybe this should be
>> > renamed _dl_hwcaps_subdirs_build_range_bitmask (or ..._top_range_...),
>> > with an updated comment that reflects its limited use-case.
>> 
>> That makes sense.  I think we can delay that until such a target
>> arrives.
>
> I'm not sure what you mean here.

I'm going to use this comment for _dl_hwcaps_subdirs_build_bitmask:

/* Returns a bitmask that marks the last ACTIVE subdirectories in a
   _dl_hwcaps_subdirs_active string (containing SUBDIRS directories in
   total) as active.  Intended for use in _dl_hwcaps_subdirs_active
   implementations (if a contiguous tail of the list in
   _dl_hwcaps_subdirs is selected).  */
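
To make the bit-to-name correspondence concrete (bit 0 is the first name in the string, so a "contiguous tail" means the high bits), here is a standalone sketch.  It does not use the _dl_hwcaps_split iterator from the patch; the function name is hypothetical and the splitting is done with plain strcspn:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the first (most preferred)
   name in the colon-separated list SUBDIRS whose bit is set in
   BITMASK, and store its length in *LENGTH.  Bit 0 corresponds to the
   first name.  Returns NULL if no listed name is active.  */
const char *
first_active_subdir (const char *subdirs, uint32_t bitmask, size_t *length)
{
  unsigned int bit = 0;
  const char *p = subdirs;
  while (*p != '\0' && bit < 32)
    {
      size_t len = strcspn (p, ":");
      if (len > 0)
        {
          if (bitmask & (UINT32_C (1) << bit))
            {
              *length = len;
              return p;
            }
          ++bit;
        }
      /* Advance past the segment and its separator, if any.  */
      p += len;
      if (*p == ':')
        ++p;
    }
  *length = 0;
  return NULL;
}
```

For "x86-64-v4:x86-64-v3:x86-64-v2" and a bitmask of 0x6 (the last two names active), the first active name is "x86-64-v3".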

We can rename the function if we add something else for building
bitmasks and there's potential for confusion around that.

Thanks,
Florian

* Re: [PATCH 1/3] elf: Add glibc-hwcaps support for LD_LIBRARY_PATH
  2020-10-12 15:21 ` [PATCH 1/3] elf: Add " Florian Weimer via Libc-alpha
  2020-10-13 16:28   ` Paul A. Clarke via Libc-alpha
@ 2020-10-20 17:23   ` Paul A. Clarke via Libc-alpha
  1 sibling, 0 replies; 32+ messages in thread
From: Paul A. Clarke via Libc-alpha @ 2020-10-20 17:23 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha

On Mon, Oct 12, 2020 at 05:21:44PM +0200, Florian Weimer via Libc-alpha wrote:
> This hacks non-power-set processing into _dl_important_hwcaps.
> Once the legacy hwcaps handling goes away, the subdirectory
> handling needs to be reworked, but it is premature to do this
> while both approaches are still supported.

Why is the subject "...for LD_LIBRARY_PATH"?

> ---
[snip]
> diff --git a/elf/Makefile b/elf/Makefile
> index f10cc59e7c..4983f7a2c0 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -59,7 +59,8 @@ elide-routines.os = $(all-dl-routines) dl-support enbl-secure dl-origin \
>  # ld.so uses those routines, plus some special stuff for being the program
>  # interpreter and operating independent of libc.
>  rtld-routines	= rtld $(all-dl-routines) dl-sysdep dl-environ dl-minimal \
> -  dl-error-minimal dl-conflict dl-hwcaps dl-usage
> +  dl-error-minimal dl-conflict dl-hwcaps dl-hwcaps_split dl-hwcaps-subdirs \
> +  dl-usage
>  all-rtld-routines = $(rtld-routines) $(sysdep-rtld-routines)
> 
>  CFLAGS-dl-runtime.c += -fexceptions -fasynchronous-unwind-tables
> @@ -210,14 +211,14 @@ tests += restest1 preloadtest loadfail multiload origtest resolvfail \
>  	 tst-filterobj tst-filterobj-dlopen tst-auxobj tst-auxobj-dlopen \
>  	 tst-audit14 tst-audit15 tst-audit16 \
>  	 tst-single_threaded tst-single_threaded-pthread \
> -	 tst-tls-ie tst-tls-ie-dlmopen \
> -	 argv0test
> +	 tst-tls-ie tst-tls-ie-dlmopen argv0test \
> +	 tst-glibc-hwcaps tst-glibc-hwcaps-prepend tst-glibc-hwcaps-mask
>  #	 reldep9
>  tests-internal += loadtest unload unload2 circleload1 \
>  	 neededtest neededtest2 neededtest3 neededtest4 \
>  	 tst-tls3 tst-tls6 tst-tls7 tst-tls8 tst-dlmopen2 \
>  	 tst-ptrguard1 tst-stackguard1 tst-libc_dlvsym \
> -	 tst-create_format1 tst-tls-surplus
> +	 tst-create_format1 tst-tls-surplus tst-dl-hwcaps_split
>  tests-container += tst-pldd tst-dlopen-tlsmodid-container \
>    tst-dlopen-self-container
>  test-srcs = tst-pathopt
> @@ -329,7 +330,10 @@ modules-names = testobj1 testobj2 testobj3 testobj4 testobj5 testobj6 \
>  		tst-single_threaded-mod3 tst-single_threaded-mod4 \
>  		tst-tls-ie-mod0 tst-tls-ie-mod1 tst-tls-ie-mod2 \
>  		tst-tls-ie-mod3 tst-tls-ie-mod4 tst-tls-ie-mod5 \
> -		tst-tls-ie-mod6
> +		tst-tls-ie-mod6 markermod1-1 markermod1-2 markermod1-3 \
> +		markermod2-1 markermod2-2 \
> +		markermod3-1 markermod3-2 markermod3-3 \
> +		markermod4-1 markermod4-2 markermod4-3 markermod4-4 \
> 
>  # Most modules build with _ISOMAC defined, but those filtered out
>  # depend on internal headers.
> @@ -1812,3 +1816,55 @@ $(objpfx)argv0test.out: tst-rtld-argv0.sh $(objpfx)ld.so \
>              '$(test-wrapper-env)' '$(run_program_env)' \
>              '$(rpath-link)' 'test-argv0' > $@; \
>      $(evaluate-test)
> +
> +# Most likely search subdirectories across multiple architectures.
> +glibc-hwcaps-first-subdirs = power9 x86-64-v2

It'll be challenging for mortals to know where this information comes from
and how to keep it updated when it gets stale.

> +# The test modules are parameterized by preprocessor macros.
> +LDFLAGS-markermod1-1.so += -Wl,-soname,markermod1.so
> +LDFLAGS-markermod2-1.so += -Wl,-soname,markermod2.so
> +LDFLAGS-markermod3-1.so += -Wl,-soname,markermod3.so
> +LDFLAGS-markermod4-1.so += -Wl,-soname,markermod4.so
> +$(objpfx)markermod%.os : markermodMARKER-VALUE.c
> +	$(compile-command.c) \
> +	  -DMARKER=marker$(firstword $(subst -, ,$*)) \
> +	  -DVALUE=$(lastword $(subst -, ,$*))
> +$(objpfx)markermod1.so: $(objpfx)markermod1-1.so
> +	cp $< $@
> +$(objpfx)markermod2.so: $(objpfx)markermod2-1.so
> +	cp $< $@
> +$(objpfx)markermod3.so: $(objpfx)markermod3-1.so
> +	cp $< $@
> +$(objpfx)markermod4.so: $(objpfx)markermod4-1.so
> +	cp $< $@
> +
> +# tst-glibc-hwcaps-prepend checks that --glibc-hwcaps-prepend is
> +# preferred over auto-detected subdirectories.
> +$(objpfx)tst-glibc-hwcaps-prepend: $(objpfx)markermod1-1.so
> +$(objpfx)glibc-hwcaps/prepend-markermod1/markermod1.so: \
> +  $(objpfx)markermod1-2.so
> +	$(make-target-directory)
> +	cp $< $@
> +$(objpfx)glibc-hwcaps/%/markermod1.so: $(objpfx)markermod1-3.so
> +	$(make-target-directory)
> +	cp $< $@
> +$(objpfx)tst-glibc-hwcaps-prepend.out: \
> +  $(objpfx)tst-glibc-hwcaps-prepend $(objpfx)markermod1.so \
> +  $(patsubst %,$(objpfx)glibc-hwcaps/%/markermod1.so,prepend-markermod1 \
> +  $(glibc-hwcaps-first-subdirs))

Should this last line be indented a bit, since it is comprised of parameters
from the preceding line?

> +	$(test-wrapper) $(rtld-prefix) \
> +	  --glibc-hwcaps-prepend prepend-markermod1 \
> +	  $< > $@; \
> +	$(evaluate-test)
> +
> +# tst-glibc-hwcaps-mask checks that --glibc-hwcaps-mask can be used to
> +# suppress all auto-detected subdirectories.
> +$(objpfx)tst-glibc-hwcaps-mask: $(objpfx)markermod1-1.so
> +$(objpfx)tst-glibc-hwcaps-mask.out: \
> +  $(objpfx)tst-glibc-hwcaps-mask $(objpfx)markermod1.so \
> +  $(patsubst %,$(objpfx)glibc-hwcaps/%/markermod1.so,\
> +  $(glibc-hwcaps-first-subdirs))

Ditto.

> +	$(test-wrapper) $(rtld-prefix) \
> +	  --glibc-hwcaps-mask does-not-exist \
> +	  $< > $@; \
> +	$(evaluate-test)

[snip]
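
To make the pattern rule above easier to follow: for a stem such as
"1-2", $(subst -, ,$*) yields "1 2", so the module is compiled with
-DMARKER=marker1 and -DVALUE=2.  The contents of
markermodMARKER-VALUE.c are not shown in this part of the thread; the
following is only a sketch of what such a macro-parameterized module
might look like, with default definitions added (an assumption) so
that it compiles on its own.

```c
/* Sketch of a macro-parameterized test module.  The build described
   above passes -DMARKER=... and -DVALUE=... on the command line; the
   defaults below are assumptions made for this standalone example.  */
#ifndef MARKER
# define MARKER marker1
#endif
#ifndef VALUE
# define VALUE 2
#endif

/* With the defaults, this defines "int marker1 (void)" returning 2.  */
int
MARKER (void)
{
  return VALUE;
}
```

A test can then tell which copy of markermod1.so the loader picked by
looking at the value returned from marker1: the copy placed in the
prepend-markermod1 subdirectory is built from markermod1-2, so it
would return 2 under this scheme.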

PC


* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-10-12 15:22 ` [PATCH v2 3/3] powerpc64le: " Florian Weimer via Libc-alpha
  2020-10-13 16:36   ` Paul A. Clarke via Libc-alpha
@ 2020-10-20 17:23   ` Paul A. Clarke via Libc-alpha
  2020-10-29 16:26   ` Florian Weimer via Libc-alpha
  2020-10-30 23:10   ` Tulio Magno Quites Machado Filho via Libc-alpha
  3 siblings, 0 replies; 32+ messages in thread
From: Paul A. Clarke via Libc-alpha @ 2020-10-20 17:23 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha

On Mon, Oct 12, 2020 at 05:22:11PM +0200, Florian Weimer via Libc-alpha wrote:
> The "power10" and "power9" subdirectories are selected.
> ---
[snip]
> diff --git a/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> new file mode 100644
> index 0000000000..1fa3735a8c
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> @@ -0,0 +1,39 @@
> +/* Architecture-specific glibc-hwcaps subdirectories.  powerpc64le version.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <dl-hwcaps.h>
> +#include <ldsodefs.h>
> +
> +const char _dl_hwcaps_subdirs[] = "power10:power9";
> +enum { subdirs_count = 2 };
> +
> +uint32_t
> +_dl_hwcaps_subdirs_active (void)
> +{
> +  int active = 0;
> +
> +  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00) == 0)
> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +  ++active;
> +
> +  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1) == 0)
> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +  ++active;
> +
> +  return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +}
[snip]

LGTM

PC


* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-10-12 15:22 ` [PATCH v2 3/3] powerpc64le: " Florian Weimer via Libc-alpha
  2020-10-13 16:36   ` Paul A. Clarke via Libc-alpha
  2020-10-20 17:23   ` Paul A. Clarke via Libc-alpha
@ 2020-10-29 16:26   ` Florian Weimer via Libc-alpha
  2020-10-30 23:10   ` Tulio Magno Quites Machado Filho via Libc-alpha
  3 siblings, 0 replies; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-10-29 16:26 UTC (permalink / raw)
  To: Florian Weimer via Libc-alpha
  Cc: Paul A. Clarke, Tulio Magno Quites Machado Filho

* Florian Weimer via Libc-alpha:

> diff --git a/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> new file mode 100644
> index 0000000000..1fa3735a8c
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> @@ -0,0 +1,39 @@
> +/* Architecture-specific glibc-hwcaps subdirectories.  powerpc64le version.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <dl-hwcaps.h>
> +#include <ldsodefs.h>
> +
> +const char _dl_hwcaps_subdirs[] = "power10:power9";
> +enum { subdirs_count = 2 };
> +
> +uint32_t
> +_dl_hwcaps_subdirs_active (void)
> +{
> +  int active = 0;
> +
> +  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00) == 0)
> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +  ++active;
> +
> +  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1) == 0)
> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +  ++active;
> +
> +  return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +}

This patch came up in an off-list discussion.  There were some concerns
with it:

* The bits PPC_FEATURE2_ARCH_3_00 and PPC_FEATURE2_ARCH_3_1 may not be
  sufficient to select the "power9" and "power10" subdirectories.  Or
  worded differently, -mcpu=power9 and -mcpu=power10 in GCC enable more
  than that.  For -mcpu=power9, I see _ARCH_PWR9, __FLOAT128_HARDWARE__,
  __POWER9_VECTOR__.  For -mcpu=power10, I see _ARCH_PWR10, __MMA__,
  __PCREL__.  This suggests to me that we need to check additional
  AT_HWCAP/AT_HWCAP2 bits for these directories.

* The names "power9" and "power10" may be too implementation-specific in
  the future.

I would like to postpone this patch for now, but I'd like to resolve
both issues.  Any suggestions how to proceed?
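
As an illustration of the first concern, a small probe along these
lines shows which of the predefined macros named above a given -mcpu
setting turns on.  The function name report_power_macros is made up
for this sketch; the macro names are the ones listed in the message.

```c
#include <stdio.h>

/* Print which of the predefined macros named in the message the
   compiler defines for the current flags, and return how many of
   them are defined.  */
int
report_power_macros (void)
{
  int count = 0;
#ifdef _ARCH_PWR9
  puts ("_ARCH_PWR9");
  ++count;
#endif
#ifdef __FLOAT128_HARDWARE__
  puts ("__FLOAT128_HARDWARE__");
  ++count;
#endif
#ifdef __POWER9_VECTOR__
  puts ("__POWER9_VECTOR__");
  ++count;
#endif
#ifdef _ARCH_PWR10
  puts ("_ARCH_PWR10");
  ++count;
#endif
#ifdef __MMA__
  puts ("__MMA__");
  ++count;
#endif
#ifdef __PCREL__
  puts ("__PCREL__");
  ++count;
#endif
  return count;
}
```

Calling this from a main function and building once with
-mcpu=power9 and once with -mcpu=power10 should show the difference;
on a non-POWER compiler none of the macros is expected to be defined.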

Thanks,
Florian



* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-10-12 15:22 ` [PATCH v2 3/3] powerpc64le: " Florian Weimer via Libc-alpha
                     ` (2 preceding siblings ...)
  2020-10-29 16:26   ` Florian Weimer via Libc-alpha
@ 2020-10-30 23:10   ` Tulio Magno Quites Machado Filho via Libc-alpha
  2020-11-02 10:15     ` Florian Weimer via Libc-alpha
  3 siblings, 1 reply; 32+ messages in thread
From: Tulio Magno Quites Machado Filho via Libc-alpha @ 2020-10-30 23:10 UTC (permalink / raw)
  To: Florian Weimer, libc-alpha; +Cc: Paul A. Clarke

Florian Weimer via Libc-alpha <libc-alpha@sourceware.org> writes:

> The "power10" and "power9" subdirectories are selected.

Tested on power10.

Should this patch also include modification to
glibc-hwcaps-first-subdirs-for-tests ?

Is it intentional that other architectures end up with the following file?

    elf/glibc-hwcaps/x86-64-v2/markermod1.so

> diff --git a/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> new file mode 100644
> index 0000000000..1fa3735a8c
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
> @@ -0,0 +1,39 @@
>...
> +
> +uint32_t
> +_dl_hwcaps_subdirs_active (void)
> +{
> +  int active = 0;
> +
> +  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00) == 0)
> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
> +  ++active;
> +
> +  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1) == 0)
> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);

This is the tricky part.  I like your proposal to match with the behavior
of -mcpu.
In that case we would have:

  power9:
    ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00) == 0
     || (GLRO (dl_hwcap2) & PPC_FEATURE2_HAS_IEEE128) == 0
     || (GLRO (dl_hwcap) & PPC_FEATURE_HAS_ALTIVEC) == 0
     || (GLRO (dl_hwcap) & PPC_FEATURE_HAS_VSX) == 0)

  power10:
    /* power10 also requires altivec, vsx and ieee128 availability, but these
       features have already been tested.  */
    ((GLRO (dl_hwcap2) & (PPC_FEATURE2_ARCH_3_1 | PPC_FEATURE2_MMA))
     != (PPC_FEATURE2_ARCH_3_1 | PPC_FEATURE2_MMA))

This would mean that a processor that implements POWER ISA 3.0, but does not
implement altivec, would not be able to benefit from that particular library
build, falling back to the general build (power8), e.g. microwatt would fall
in this category right now.
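
Put together, the selection could be sketched as a function that
counts how many subdirectories are active.  This is only an
illustration, not a proposed patch: count_active_subdirs and the
SKETCH_ constants are hypothetical names, and the bit values are
assumed to mirror the Linux powerpc <asm/cputable.h> header, so they
should be checked against the real headers.  A feature set counts as
present only if every bit in its mask is set, which is why each test
compares against the full mask.

```c
#include <stdint.h>

/* Bit values assumed to mirror the Linux powerpc uapi header
   <asm/cputable.h>; verify them against your kernel headers.  The
   SKETCH_ prefix marks them as local to this example.  */
#define SKETCH_HAS_ALTIVEC    UINT64_C (0x10000000)  /* AT_HWCAP */
#define SKETCH_HAS_VSX        UINT64_C (0x00000080)  /* AT_HWCAP */
#define SKETCH2_ARCH_3_00     UINT64_C (0x00800000)  /* AT_HWCAP2 */
#define SKETCH2_HAS_IEEE128   UINT64_C (0x00400000)  /* AT_HWCAP2 */
#define SKETCH2_ARCH_3_1      UINT64_C (0x00040000)  /* AT_HWCAP2 */
#define SKETCH2_MMA           UINT64_C (0x00020000)  /* AT_HWCAP2 */

/* Return how many entries of "power10:power9" are usable, counted
   from the end of the list: 0 (none), 1 (power9) or 2 (both).  */
int
count_active_subdirs (uint64_t hwcap, uint64_t hwcap2)
{
  const uint64_t p9_hwcap = SKETCH_HAS_ALTIVEC | SKETCH_HAS_VSX;
  const uint64_t p9_hwcap2 = SKETCH2_ARCH_3_00 | SKETCH2_HAS_IEEE128;
  const uint64_t p10_hwcap2 = SKETCH2_ARCH_3_1 | SKETCH2_MMA;

  /* power9: every required bit must be present.  */
  if ((hwcap & p9_hwcap) != p9_hwcap
      || (hwcap2 & p9_hwcap2) != p9_hwcap2)
    return 0;

  /* power10: the power9 requirements were already checked above.  */
  if ((hwcap2 & p10_hwcap2) != p10_hwcap2)
    return 1;

  return 2;
}
```

Under these checks, a CPU that reports ISA 3.0 but no Altivec gets 0
active subdirectories, which matches the microwatt case described
above.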

>   * The names "power9" and "power10" may be too implementation-specific in
>     the future.

I do agree, but I don't have a better suggestion.
It's hard to be future-proof here.
I don't think that using the POWER ISA level would help much though,
because new processors may decide to not implement a particular feature
in the future that we believe is essential right now.

-- 
Tulio Magno


* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-10-30 23:10   ` Tulio Magno Quites Machado Filho via Libc-alpha
@ 2020-11-02 10:15     ` Florian Weimer via Libc-alpha
  2020-11-03 15:14       ` Tulio Magno Quites Machado Filho via Libc-alpha
  0 siblings, 1 reply; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-11-02 10:15 UTC (permalink / raw)
  To: Tulio Magno Quites Machado Filho; +Cc: libc-alpha, Paul A. Clarke

* Tulio Magno Quites Machado Filho:

> Florian Weimer via Libc-alpha <libc-alpha@sourceware.org> writes:
>
>> The "power10" and "power9" subdirectories are selected.
>
> Tested on power10.
>
> Should this patch also include modification to
> glibc-hwcaps-first-subdirs-for-tests ?

Yes, it should.  In the posted version, it depended on power9 being
included in the first patch that added the generic test.

> Is it intentional that other architectures end up with the following file?
>
>     elf/glibc-hwcaps/x86-64-v2/markermod1.so

Yes, it doesn't hurt, and it allows us to keep the test generic.

>> diff --git a/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
>> new file mode 100644
>> index 0000000000..1fa3735a8c
>> --- /dev/null
>> +++ b/sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c
>> @@ -0,0 +1,39 @@
>>...
>> +
>> +uint32_t
>> +_dl_hwcaps_subdirs_active (void)
>> +{
>> +  int active = 0;
>> +
>> +  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00) == 0)
>> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
>> +  ++active;
>> +
>> +  if ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_1) == 0)
>> +    return _dl_hwcaps_subdirs_build_bitmask (subdirs_count, active);
>
> This is the tricky part.  I like your proposal to match with the behavior
> of -mcpu.
> In that case we would have:
>
>   power9:
>     ((GLRO (dl_hwcap2) & PPC_FEATURE2_ARCH_3_00) == 0
>      || (GLRO (dl_hwcap2) & PPC_FEATURE2_HAS_IEEE128) == 0
>      || (GLRO (dl_hwcap) & PPC_FEATURE_HAS_ALTIVEC) == 0
>      || (GLRO (dl_hwcap) & PPC_FEATURE_HAS_VSX) == 0)
>
>   power10:
>     /* power10 also requires altivec, vsx and ieee128 availability, but these
>        features have already been tested.  */
>     ((GLRO (dl_hwcap2) & (PPC_FEATURE2_ARCH_3_1 | PPC_FEATURE2_MMA))
>      != (PPC_FEATURE2_ARCH_3_1 | PPC_FEATURE2_MMA))
>
> This would mean that a processor that implements POWER ISA 3.0, but does not
> implement altivec, would not be able to benefit from that particular library
> build, falling back to the general build (power8), e.g. microwatt would fall
> in this category right now.

I think we need documentation what it means for a processor to implement
ISA 3.0, and not altivec.  Or for that matter, what an implementation of
powerpc64le-*-linux-gnu without altivec looks like.  Presumably, it will
be different yet again from the original hardware used during
architecture bring-up.

Once we have no-altivec support for powerpc64le in the GNU toolchain, we
probably should add a power8 subdirectory that stores libraries that use
altivec/vsx.  (This directory would be searched even on systems which
are built to use altivec because they use the POWER8 baseline.)

>>   * The names "power9" and "power10" may be too implementation-specific in
>>     the future.
>
> I do agree, but I don't have a better suggestion.
> It's hard to be future-proof here.
> I don't think that using the POWER ISA level would help much though,
> because new processors may decide to not implement a particular feature
> in the future that we believe is essential right now.

One way to address this is to restrict the selected ISA features for
each level to something that has the most benefit and is unlikely to go
away.

ISA features that cannot be automatically and pervasively used by compilers
can be excluded as well.  MMA could be in that category, and I think
cryptography-related instructions generally are.

Thanks,
Florian



* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-02 10:15     ` Florian Weimer via Libc-alpha
@ 2020-11-03 15:14       ` Tulio Magno Quites Machado Filho via Libc-alpha
  2020-11-03 16:29         ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 32+ messages in thread
From: Tulio Magno Quites Machado Filho via Libc-alpha @ 2020-11-03 15:14 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha, Paul A. Clarke

Florian Weimer <fweimer@redhat.com> writes:

> I think we need documentation what it means for a processor to implement
> ISA 3.0, and not altivec.  Or for that matter, what an implementation of
> powerpc64le-*-linux-gnu without altivec looks like.  Presumably, it will
> be different yet again from the original hardware used during
> architecture bring-up.

I think we already have this documented in a couple of places in different
documents from the OpenPOWER Foundation.

Up to POWER ISA 2.07 (included) [1], there existed categories of features that
processors may not implement.  Category Vector (aka. altivec) and category
Vector-Scalar Extension (aka. vsx) exist there and are listed in the
OpenPOWER Instruction Set Architecture Profile specification v1.1.0 [2] as
required in the OpenPOWER chip architecture.

POWER ISA 3.0 [3] dropped support for categories, but the OpenPOWER ISA
Compliance Definition 2.0 [4] lists specific sections of the POWER ISA document
that are required, with references to the Vector and Vector-Scalar chapters.

POWER ISA 3.1 [5] added the concept of OpenISA compliancy subset (page vi).
In this new concept, both Vector and Vector-Scalar are required for Linux
compliancy (aka. SIMD in ISA 3.1).

Furthermore, the POWER 64-bit ELF V2 ABI 1.5 [6], which is under public review,
states that:

    It expects an OpenPOWER-compliant processor to implement at least Power ISA
    V2.07B with all OpenPOWER Architecture instruction categories as well as
    OpenPOWER-defined implementation characteristics for some
    implementation-specific features.

It's also worth mentioning that it's removing the list of categories that was
duplicated from [2] which used to mention Vector and Vector-Scalar.

With all that said, it's clear that these requirements add barriers for
new processors to start booting Linux. In order to minimize that, I've been
working on some ifunc functions in glibc so that they do not make assumptions
based on IBM processors and are either adapted to work on new processors or
are correctly ignored.

[1] https://openpowerfoundation.org/?resource_lib=ibm-power-isa-version-2-07-b
[2] http://cdn.openpowerfoundation.org/wp-content/uploads/resources/isa-profile/content/ch_profile.html
[3] https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0
[4] http://cdn.openpowerfoundation.org/wp-content/uploads/resources/openpower-isa-thts-V2-1/content/_Ref436814652.html
[5] https://ibm.box.com/s/hhjfw0x0lrbtyzmiaffnbxh2fuo0fog0
[6] https://openpowerfoundation.org/?resource_lib=64-bit-elf-v2-abi-specification-review-draft

> Once we have no-altivec support for powerpc64le in the GNU toolchain, we
> probably should add a power8 subdirectory that stores libraries that use
> altivec/vsx.  (This directory would be searched even on systems which
> are built to use altivec because they use the POWER8 baseline.)

Per my previous explanation, I don't think this is necessary for
OpenPOWER-compliant systems.

>>>   * The names "power9" and "power10" may be too implementation-specific in
>>>     the future.
>>
>> I do agree, but I don't have a better suggestion.
>> It's hard to be future-proof here.
>> I don't think that using the POWER ISA level would help much though,
>> because new processors may decide to not implement a particular feature
>> in the future that we believe is essential right now.
>
> One way to address this is to restrict the selected ISA features for
> each level to something that has the most benefit and is unlikely to go
> away.

Having just reviewed the OpenPOWER ISA Compliance documents [2] [4], I see that
they do use the terms "power8" and "power9", so I feel better about using
them.  Notice that a definition for power10 is not available yet, except
for the OpenISA compliancy subset from the POWER ISA 3.1, which does not
mention power10.

> ISA features that cannot be automatically and pervasively used by compilers
> can be excluded as well.  MMA could be in that category, and I think
> cryptography-related instructions generally are.

MMA is indeed optional for Linux.
AFAICS, cryptography-related instructions are part of SIMD and should be
required for Linux.

-- 
Tulio Magno


* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-03 15:14       ` Tulio Magno Quites Machado Filho via Libc-alpha
@ 2020-11-03 16:29         ` Florian Weimer via Libc-alpha
  2020-11-03 23:02           ` Segher Boessenkool
  0 siblings, 1 reply; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-11-03 16:29 UTC (permalink / raw)
  To: Tulio Magno Quites Machado Filho; +Cc: libc-alpha, Paul A. Clarke

* Tulio Magno Quites Machado Filho:

> Florian Weimer <fweimer@redhat.com> writes:
>
>> I think we need documentation what it means for a processor to implement
>> ISA 3.0, and not altivec.  Or for that matter, what an implementation of
>> powerpc64le-*-linux-gnu without altivec looks like.  Presumably, it will
>> be different yet again from the original hardware used during
>> architecture bring-up.
>
> I think we already have this documented in a couple of places in different
> documents from the OpenPOWER Foundation.

Yes, but all those documents say that Altivec + VSX are required for
powerpc64le-*-linux-gnu.  Your summary below seems to re-confirm that.

My point was that if we want powerpc64le-*-linux-gnu to stand for
something different (without Altivec/VSX), we need (new) documentation
that says what it means.

>> ISA features that cannot be automatically and pervasively used by compilers
>> can be excluded as well.  MMA could be in that category, and I think
>> cryptography-related instructions generally are.
>
> MMA is indeed optional for Linux.
> AFAICS, cryptography-related instructions are part of SIMD and should be
> required for Linux.

Then I think we should change GCC not to enable MMA with -mcpu=power10.

The other change is that I should check for PPC_FEATURE2_HAS_IEEE128 for
power9, and add a comment that ALTIVEC and VSX are implied by the place
in the source tree (I deliberately made all this specific to
powerpc64le-*-linux-gnu on the glibc side, like we didn't define new ABI
levels for i386).

Thanks,
Florian



* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-03 16:29         ` Florian Weimer via Libc-alpha
@ 2020-11-03 23:02           ` Segher Boessenkool
  2020-11-04  8:28             ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 32+ messages in thread
From: Segher Boessenkool @ 2020-11-03 23:02 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha, Paul A. Clarke

Hi!

(Cc: Bill)

On Tue, Nov 03, 2020 at 05:29:08PM +0100, Florian Weimer via Libc-alpha wrote:
> * Tulio Magno Quites Machado Filho:
> > Florian Weimer <fweimer@redhat.com> writes:
> >> I think we need documentation what it means for a processor to implement
> >> ISA 3.0, and not altivec.  Or for that matter, what an implementation of
> >> powerpc64le-*-linux-gnu without altivec looks like.  Presumably, it will
> >> be different yet again from the original hardware used during
> >> architecture bring-up.
> >
> > I think we already have this documented in a couple of places in different
> > documents from the OpenPOWER Foundation.
> 
> Yes, but all those documents say that Altivec + VSX are required for
> powerpc64le-*-linux-gnu.  Your summary below seems to re-confirm that.
> 
> My point was that if we want powerpc64le-*-linux-gnu to stand for
> something different (without Altivec/VSX), we need (new) documentation
> that says what it means.

Not going to happen.  A new powerpc64le-linux-novec triple (or whatever
naming, this is just an example) can be made of course, but the existing
name will keep standing for the existing ABI!

> >> ISA features that cannot be automatically and pervasively used by compilers
> >> can be excluded as well.  MMA could be in that category, and I think
> >> cryptography-related instructions generally are.
> >
> > MMA is indeed optional for Linux.
> > AFAICS, cryptography-related instructions are part of SIMD and should be
> > required for Linux.
> 
> Then I think we should change GCC not to enable MMA with -mcpu=power10.

No.

-mcpu=power10 enables MMA.  If you do not want all Power10 features, you
should not use -mcpu=power10.  It is that simple.

Since powerpc64le-linux requires Power8 or later, you always have VMX
and VSX enabled there.  In exactly that same way.

GCC never generates anything MMA that the user did not explicitly ask
for in the source code (with builtins, say), so this is not an issue.
Compare this with hardware DFP.

> The other change is that I should check for PPC_FEATURE2_HAS_IEEE128 for
> power9, and add a comment that ALTIVEC and VSX are implied by the place
> in the source tree (I deliberately made all this specific to
> powerpc64le-*-linux-gnu on the glibc side, like we didn't define new ABI
> levels for i386).

All the VMX, VSX, QP float, MMA, whatever stuff is all the same on *all*
GCC Power targets, not just those that are powerpc64le-linux.  If you
use -mcpu=whatever, the resulting program can only be run on CPUs with
all features that <whatever> has.  This is what -mcpu= means.


Segher


* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-03 23:02           ` Segher Boessenkool
@ 2020-11-04  8:28             ` Florian Weimer via Libc-alpha
  2020-11-04 19:36               ` Segher Boessenkool
  2020-11-16 14:51               ` Tulio Magno Quites Machado Filho via Libc-alpha
  0 siblings, 2 replies; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-11-04  8:28 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: libc-alpha, Paul A. Clarke

* Segher Boessenkool:

> Not going to happen.  A new powerpc64le-linux-novec triple (or whatever
> naming, this is just an example) can be made of course, but the existing
> name will keep standing for the existing ABI!

Fine with me.

>> >> ISA features that cannot be automatically and pervasively used by compilers
>> >> can be excluded as well.  MMA could be in that category, and I think
>> >> cryptography-related instructions generally are.
>> >
>> > MMA is indeed optional for Linux.
>> > AFAICS, cryptography-related instructions are part of SIMD and should be
>> > required for Linux.
>> 
>> Then I think we should change GCC not to enable MMA with -mcpu=power10.
>
> No.
>
> -mcpu=power10 enables MMA.  If you do not want all Power10 features, you
> should not use -mcpu=power10.  It is that simple.

Then we need a different name, or require MMA for the "power10"
glibc-hwcaps subdirectory.

> Since powerpc64le-linux requires Power8 or later, you always have VMX
> and VSX enabled there.  In exactly that same way.
>
> GCC never generates anything MMA that the user did not explicitly ask
> for in the source code (with builtins, say), so this is not an issue.
> Compare this with hardware DFP.

GCC defines __MMA__ for -mcpu=power10, and source code will eventually be
sensitive to that macro.
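
A sketch of what "sensitive to that macro" could look like in a
library follows.  The function name matmul_backend is hypothetical,
and the MMA path is only indicated by a comment rather than real
builtins.

```c
/* Return the name of the code path that was selected at compile
   time.  A library built with -mcpu=power10 would take the first
   branch without any run-time check.  */
const char *
matmul_backend (void)
{
#ifdef __MMA__
  /* A real library would use MMA builtins on this path, so the
     resulting object requires MMA support at run time.  */
  return "mma";
#else
  return "generic";
#endif
}
```

The point is that the choice is made once, by the compiler flags, so
an object built this way can only be loaded safely on a system that
actually provides MMA.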

I think it is important that -mcpu=power10 and the "power10"
subdirectory mean the same thing.

Thanks,
Florian



* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-04  8:28             ` Florian Weimer via Libc-alpha
@ 2020-11-04 19:36               ` Segher Boessenkool
  2020-11-04 19:56                 ` Florian Weimer via Libc-alpha
  2020-11-16 14:51               ` Tulio Magno Quites Machado Filho via Libc-alpha
  1 sibling, 1 reply; 32+ messages in thread
From: Segher Boessenkool @ 2020-11-04 19:36 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha, Paul A. Clarke

Hi Florian,

On Wed, Nov 04, 2020 at 09:28:33AM +0100, Florian Weimer wrote:
> * Segher Boessenkool:
> 
> > Not going to happen.  A new powerpc64le-linux-novec triple (or whatever
> > naming, this is just an example) can be made of course, but the existing
> > name will keep standing for the existing ABI!
> 
> Fine with me.
> 
> >> >> ISA features that cannot be automatically and pervasively used by compilers
> >> >> can be excluded as well.  MMA could be in that category, and I think
> >> >> cryptography-related instructions generally are.
> >> >
> >> > MMA is indeed optional for Linux.
> >> > AFAICS, cryptography-related instructions are part of SIMD and should be
> >> > required for Linux.
> >> 
> >> Then I think we should change GCC not to enable MMA with -mcpu=power10.
> >
> > No.
> >
> > -mcpu=power10 enables MMA.  If you do not want all Power10 features, you
> > should not use -mcpu=power10.  It is that simple.
> 
> Then we need a different name, or require MMA for the "power10"
> glibc-hwcaps subdirectory.

Or do nothing.  Glibc doesn't use any MMA code, does it?  This is never
generated automatically, you need to really ask for it in your source
code.

> > Since powerpc64le-linux requires Power8 or later, you always have VMX
> > and VSX enabled there.  In exactly that same way.
> >
> > GCC never generates anything MMA that the user did not explicitly ask
> > for in the source code (with builtins, say), so this is not an issue.
> > Compare this with hardware DFP.
> 
> GCC defines __MMA__ for -mcpu=power10, and source code will eventually be
> sensitive to that macro.

That macro simply says that source code can use MMA builtins and the
like.  Is it important for glibc whether user code uses MMA?

> I think it is important that -mcpu=power10 and the "power10"
> subdirectory mean the same thing.

-mcpu=power10 means "generate code optimised for power10" (and: "it will
probably not run on cpus not compatible to power10").

Is that what that subdir holds as well?


Segher


* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-04 19:36               ` Segher Boessenkool
@ 2020-11-04 19:56                 ` Florian Weimer via Libc-alpha
  2020-11-04 21:58                   ` Segher Boessenkool
  0 siblings, 1 reply; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-11-04 19:56 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: libc-alpha, Paul A. Clarke

* Segher Boessenkool:

>> > -mcpu=power10 enables MMA.  If you do not want all Power10 features, you
>> > should not use -mcpu=power10.  It is that simple.
>> 
>> Then we need a different name, or require MMA for the "power10"
>> glibc-hwcaps subdirectory.
>
> Or do nothing.  Glibc doesn't use any MMA code, does it?  This is never
> generated automatically, you need to really ask for it in your source
> code.

glibc's internal use does not matter in this context.  Programmers must
be able to drop their own libraries built with -mcpu=power10 into the
power10 subdirectory.  If GCC turns on MMA by default for this switch
and glibc selects the power10 subdirectory without checking for MMA
support, then this isn't guaranteed to work.

We have been through this with x86-64 already.  I don't want to produce
the same bug.

>> > Since powerpc64le-linux requires Power8 or later, you always have VMX
>> > and VSX enabled there.  In exactly that same way.
>> >
>> > GCC never generates anything MMA that the user did not explicitly ask
>> > for in the source code (with builtins, say), so this is not an issue.
>> > Compare this with hardware DFP.
>> 
>> GCC defines __MMA__ for -mcpu=power10, and source code will eventually be
>> sensitive to that macro.
>
> That macro simply says that source code can use MMA builtins and the
> like.  Is it important for glibc whether user code uses MMA?

Yes, we can only load code that is built to use MMA unconditionally
(potentially) if the system supports MMA.

And in the future, GCC might recognize common code patterns with
-ftree-loop-vectorize and replace them with MMA intrinsics.
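
To make the __MMA__ concern concrete, here is a minimal sketch of a library whose source is sensitive to that macro. It is an illustration only, not code from this patch series; `gemm_variant` is a hypothetical name, and the only fact taken from the thread is that GCC defines __MMA__ for -mcpu=power10.

```c
/* Hypothetical library helper: reports which kernel variant was
   selected at compile time.  */
static const char *
gemm_variant (void)
{
#ifdef __MMA__
  /* Compiled with MMA enabled, e.g. via -mcpu=power10.  A real
     library would place MMA builtins on this path, so the resulting
     DSO must only be loaded on systems that support MMA.  */
  return "mma";
#else
  /* Compiled without MMA: portable fallback path.  */
  return "generic";
#endif
}
```

The choice is made when the library is built, not when it is loaded, which is why the loader's directory selection has to match the compiler switch.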

>> I think it is important that -mcpu=power10 and the "power10"
>> subdirectory mean the same thing.
>
> -mcpu=power10 means "generate code optimised for power10" (and: "it will
> probably not run on cpus not compatible to power10").
>
> Is that what that subdir holds as well?

Yes, that's the idea.  The programmer can also drop a -mcpu=power9
library into the power9 subdirectory.  The difference to the existing
AT_PLATFORM subdirectory is that on POWER10, both subdirectories (power9
and power10) are searched.  With AT_PLATFORM on POWER10, only power10
would be searched.  (And we're also fixing a silent on-disk format
change in the way AT_PLATFORM libraries are represented in ld.so.cache.)
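
The difference between the two lookup schemes can be sketched as follows. This is a simplified model under stated assumptions, not the code in elf/dl-hwcaps.c: the function names and the table are hypothetical, and searching the newest level first is an assumption that the paragraph above does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical level table, newest first.  The thread only names
   "power9" and "power10".  */
static const char *const hwcaps_levels[] = { "power10", "power9" };
enum { HWCAPS_LEVEL_COUNT = sizeof hwcaps_levels / sizeof hwcaps_levels[0] };

/* glibc-hwcaps style: list every level the CPU supports.
   SUPPORTED_COUNT is the number of supported levels, counted from
   the oldest.  Returns the number of entries written to OUT.  */
static size_t
hwcaps_search_list (size_t supported_count, const char *out[], size_t out_len)
{
  assert (out != NULL);
  if (supported_count > HWCAPS_LEVEL_COUNT)
    supported_count = HWCAPS_LEVEL_COUNT;
  size_t n = 0;
  for (size_t i = HWCAPS_LEVEL_COUNT - supported_count;
       i < HWCAPS_LEVEL_COUNT && n < out_len; i++)
    out[n++] = hwcaps_levels[i];
  return n;
}

/* AT_PLATFORM style: only the directory that matches the platform
   string exactly is considered.  */
static size_t
platform_search_list (const char *platform, const char *out[], size_t out_len)
{
  assert (out != NULL);
  if (platform == NULL || out_len == 0)
    return 0;
  out[0] = platform;
  return 1;
}
```

In this model, a POWER10 system (supported_count == 2) yields both "power10" and "power9", while the AT_PLATFORM variant yields "power10" only.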

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-04 19:56                 ` Florian Weimer via Libc-alpha
@ 2020-11-04 21:58                   ` Segher Boessenkool
  2020-11-05 11:40                     ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 32+ messages in thread
From: Segher Boessenkool @ 2020-11-04 21:58 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha, Paul A. Clarke

On Wed, Nov 04, 2020 at 08:56:05PM +0100, Florian Weimer wrote:
> * Segher Boessenkool:
> 
> >> > -mcpu=power10 enables MMA.  If you do not want all Power10 features, you
> >> > should not use -mcpu=power10.  It is that simple.
> >> 
> >> Then we need a different name, or require MMA for the "power10"
> >> glibc-hwcaps subdirectory.
> >
> > Or do nothing.  Glibc doesn't use any MMA code, does it?  This is never
> > generated automatically, you need to really ask for it in your source
> > code.
> 
> glibc's internal use does not matter in this context.  Programmers must
> be able to drop their own libraries built with -mcpu=power10 into the
> power10 subdirectory.  If GCC turns on MMA by default for this switch
> and glibc selects the power10 subdirectory without checking for MMA
> support, then this isn't guaranteed to work.

Are you saying that it is *normal* for people to put very different code
into libc like this?  Wow.

> We have been through this with x86-64 already.  I don't want to produce
> the same bug.

No code in libc should ever use MMA, imnsho.

> >> > Since powerpc64le-linux requires Power8 or later, you always have VMX
> >> > and VSX enabled there.  In exactly that same way.
> >> >
> >> > GCC never generates anything MMA that the user did not explicitly ask
> >> > for in the source code (with builtins, say), so this is not an issue.
> >> > Compare this with hardware DFP.
> >> 
> >> GCC defines __MMA__ for -mcpu=power10, and source code will eventually be
> >> sensitive to that macro.
> >
> > That macro simply says that source code can use MMA builtins and the
> > like.  Is it important for glibc whether user code uses MMA?
> 
> Yes, we can only load code that is built to use MMA unconditionally
> (potentially) if the system supports MMA.
> 
> And in the future, GCC might recognize common code patterns with
> -ftree-loop-vectorize and replace them with MMA intrinsics.

No, not really.  Maybe 20 years from now though, sure.

MMA uses its own register set.  Moves to other regs are expensive.  You
cannot pass those MMA registers around at all, either.

So sure, only load modules with MMA code if the hwcap says you have MMA,
but I don't see why you would refuse power10 code without it.  But, ask
Tulio, of course -- I just don't see the point of having a separate
hwcap for it at all, if you do not use it!

> >> I think it is important that -mcpu=power10 and the "power10"
> >> subdirectory mean the same thing.
> >
> > -mcpu=power10 means "generate code optimised for power10" (and: "it will
> > probably not run on cpus not compatible to power10").
> >
> > Is that what that subdir holds as well?
> 
> Yes, that's the idea.  The programmer can also drop a -mcpu=power9
> library into the power9 subdirectory.  The difference to the existing
> AT_PLATFORM subdirectory is that on POWER10, both subdirectories (power9
> and power10) are searched.  With AT_PLATFORM on POWER10, only power10
> would be searched.  (And we're also fixing a silent on-disk format
> change in the way AT_PLATFORM libraries are represented in ld.so.cache.)

HtH, cheers,


Segher

* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-04 21:58                   ` Segher Boessenkool
@ 2020-11-05 11:40                     ` Florian Weimer via Libc-alpha
  2020-11-05 21:42                       ` Segher Boessenkool
  0 siblings, 1 reply; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-11-05 11:40 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: libc-alpha, Paul A. Clarke

* Segher Boessenkool:

> On Wed, Nov 04, 2020 at 08:56:05PM +0100, Florian Weimer wrote:
>> * Segher Boessenkool:
>> 
>> >> > -mcpu=power10 enables MMA.  If you do not want all Power10 features, you
>> >> > should not use -mcpu=power10.  It is that simple.
>> >> 
>> >> Then we need a different name, or require MMA for the "power10"
>> >> glibc-hwcaps subdirectory.
>> >
>> > Or do nothing.  Glibc doesn't use any MMA code, does it?  This is never
>> > generated automatically, you need to really ask for it in your source
>> > code.
>> 
>> glibc's internal use does not matter in this context.  Programmers must
>> be able to drop their own libraries built with -mcpu=power10 into the
>> power10 subdirectory.  If GCC turns on MMA by default for this switch
>> and glibc selects the power10 subdirectory without checking for MMA
>> support, then this isn't guaranteed to work.
>
> Are you saying that it is *normal* for people to put very different code
> into libc like this?  Wow.
>
>> We have been through this with x86-64 already.  I don't want to produce
>> the same bug.
>
> No code in libc should ever use MMA, imnsho.

Oh, I see now.  I think we don't agree about the scope of the
glibc-hwcaps feature.

It's going to be used for ELF multilibs in general, not just glibc
components.  So a BLAS implementation could use it and drop its
implementation DSOs into the appropriate directories.

That's why vector features such as MMA matter in this context.

Given this additional context, I hope we can agree that a rule for
programmers like “build with -mcpu=power10 for the power10 glibc-hwcaps
subdirectory” has a lot of value.

Thanks,
Florian

* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-05 11:40                     ` Florian Weimer via Libc-alpha
@ 2020-11-05 21:42                       ` Segher Boessenkool
  2020-11-09 18:32                         ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 32+ messages in thread
From: Segher Boessenkool @ 2020-11-05 21:42 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha, Paul A. Clarke

Hi Florian,

On Thu, Nov 05, 2020 at 12:40:29PM +0100, Florian Weimer wrote:
> * Segher Boessenkool:
> > Are you saying that it is *normal* for people to put very different code
> > into libc like this?  Wow.
> >
> >> We have been through this with x86-64 already.  I don't want to produce
> >> the same bug.
> >
> > No code in libc should ever use MMA, imnsho.
> 
> Oh, I see now.  I think we don't agree about the scope of the
> glibc-hwcaps feature.

I did not know about this new feature at all, I thought this was about
the existing power10/ etc. directories :-)

> It's going to be used for ELF multilibs in general, not just glibc
> components.  So a BLAS implementation could use it and drop its
> implementation DSOs into the appropriate directories.
> 
> That's why vector features such as MMA matter in this context.

Sure.

> Given this additional context, I hope we can agree that a rule for
> programmers like “build with -mcpu=power10 for the power10 glibc-hwcaps
> subdirectory” has a lot of value.

But you cannot change what -mcpu=power10 means...  It should keep the
same scheme as we have used for very long already.

If there is anything else GCC can do that is *not* a huge problem for
all our users, we can talk about that of course (but on the GCC lists!)


Segher

* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-05 21:42                       ` Segher Boessenkool
@ 2020-11-09 18:32                         ` Florian Weimer via Libc-alpha
  0 siblings, 0 replies; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-11-09 18:32 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: libc-alpha, Paul A. Clarke

* Segher Boessenkool:

> Hi Florian,
>
> On Thu, Nov 05, 2020 at 12:40:29PM +0100, Florian Weimer wrote:
>> * Segher Boessenkool:
>> > Are you saying that it is *normal* for people to put very different code
>> > into libc like this?  Wow.
>> >
>> >> We have been through this with x86-64 already.  I don't want to produce
>> >> the same bug.
>> >
>> > No code in libc should ever use MMA, imnsho.
>> 
>> Oh, I see now.  I think we don't agree about the scope of the
>> glibc-hwcaps feature.
>
> I did not know about this new feature at all, I thought this was about
> the existing power10/ etc. directories :-)
>
>> It's going to be used for ELF multilibs in general, not just glibc
>> components.  So a BLAS implementation could use it and drop its
>> implementation DSOs into the appropriate directories.
>> 
>> That's why vector features such as MMA matter in this context.
>
> Sure.
>
>> Given this additional context, I hope we can agree that a rule for
>> programmers like “build with -mcpu=power10 for the power10 glibc-hwcaps
>> subdirectory” has a lot of value.
>
> But you cannot change what -mcpu=power10 means...  It should keep the
> same scheme as we have used for very long already.

I don't have a problem with selecting the power10 subdirectory for CPUs
which support ISA 3.1 *and* MMA (plus ISA 3.0 plus float128 in
hardware).  This would align glibc with the current GCC option.

The new glibc scheme allows us to add something between power10 and
power9 easily once the need arises (without backwards-incompatible
/etc/ld.so.cache format changes *cough* *cough*).
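
The selection rule described above can be sketched as a pure function over an AT_HWCAP2 value. This is an illustration, not the code in sysdeps/powerpc/powerpc64/le/dl-hwcaps-subdirs.c: the SKETCH_* names and both functions are hypothetical, and the bit values are written from memory of the Linux powerpc uapi headers, so they should be checked against <bits/hwcap.h> before any real use.

```c
#include <stdbool.h>

/* AT_HWCAP2 bits (assumed values; verify against <bits/hwcap.h>).  */
#define SKETCH_ARCH_3_00   0x00800000UL  /* ISA 3.0 */
#define SKETCH_HAS_IEEE128 0x00400000UL  /* float128 in hardware */
#define SKETCH_ARCH_3_1    0x00040000UL  /* ISA 3.1 */
#define SKETCH_MMA         0x00020000UL  /* Matrix-Multiply Assist */

/* power9: ISA 3.0 plus hardware float128.  */
static bool
sketch_selects_power9 (unsigned long hwcap2)
{
  unsigned long mask = SKETCH_ARCH_3_00 | SKETCH_HAS_IEEE128;
  return (hwcap2 & mask) == mask;
}

/* power10: everything power9 requires, plus ISA 3.1 *and* MMA, so
   that a library built with -mcpu=power10 can be loaded safely.  */
static bool
sketch_selects_power10 (unsigned long hwcap2)
{
  unsigned long mask = SKETCH_ARCH_3_1 | SKETCH_MMA;
  return sketch_selects_power9 (hwcap2) && (hwcap2 & mask) == mask;
}
```

On a running system the input would come from getauxval (AT_HWCAP2) in <sys/auxv.h>. A CPU that reports ISA 3.1 without MMA would fall back to power9 under this rule, which is the gap a future intermediate subdirectory could fill.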

Thanks,
Florian

* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-04  8:28             ` Florian Weimer via Libc-alpha
  2020-11-04 19:36               ` Segher Boessenkool
@ 2020-11-16 14:51               ` Tulio Magno Quites Machado Filho via Libc-alpha
  2020-11-16 19:35                 ` Segher Boessenkool
  1 sibling, 1 reply; 32+ messages in thread
From: Tulio Magno Quites Machado Filho via Libc-alpha @ 2020-11-16 14:51 UTC (permalink / raw)
  To: Florian Weimer, Segher Boessenkool; +Cc: libc-alpha, Paul A. Clarke

Florian Weimer <fweimer@redhat.com> writes:

> GCC defines __MMA__ for -mcpu=power10, and source code will eventually be
> sensitive to that macro.
>
> I think it is important that -mcpu=power10 and the "power10"
> subdirectory mean the same thing.

I agree to stick with -mcpu values.

It was a bad interpretation on my part when I used OpenPOWER Foundation's
documents to explain the availability of features on GCC.  These are different
usages.
GCC uses -mcpu to:

    ... specify a specific processor.  Code generated under those options
    runs best on that processor, and may not run at all on others.

So, -mcpu=power10 is the IBM POWER10 processor that supports MMA.

But then, we're back to this point you had raised:

> I think we need documentation of what it means for a processor to implement
> ISA 3.0, and not altivec.

Unfortunately, I think this documentation will only exist after a new -mcpu
value is created.

-- 
Tulio Magno

* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-16 14:51               ` Tulio Magno Quites Machado Filho via Libc-alpha
@ 2020-11-16 19:35                 ` Segher Boessenkool
  2020-11-23 10:20                   ` Florian Weimer via Libc-alpha
  0 siblings, 1 reply; 32+ messages in thread
From: Segher Boessenkool @ 2020-11-16 19:35 UTC (permalink / raw)
  To: Tulio Magno Quites Machado Filho
  Cc: Florian Weimer, libc-alpha, Paul A. Clarke

Hi!

On Mon, Nov 16, 2020 at 11:51:27AM -0300, Tulio Magno Quites Machado Filho wrote:
> But then, we're back to this point you had raised:
> 
> > I think we need documentation of what it means for a processor to implement
> > ISA 3.0, and not altivec.
> 
> Unfortunately, I think this documentation will only exist after a new -mcpu
> value is created.

Nothing that uses the vector registers can work, and disabling all
instructions that touch a vector register gets you 99.9% there.  This is
completely analogous with -msoft-float (which really means "no FPRs").

With -mno-altivec you also have no VSCR and VRSAVE registers, just like
you lose FPSCR with -msoft-float.

But yes, this should be documented in the ABI, it matters for the
calling sequences etc.

A "normal" compilation may use the vector registers if the -mcpu= you
used allows that, so it cannot run on hardware without vector regs.  The
compiler can also use vector registers if you did not explicitly ask for
them (again analogous to the floating point registers), if you do not
explicitly forbid it of course.


Segher

* Re: [PATCH v2 3/3] powerpc64le: Add glibc-hwcaps support
  2020-11-16 19:35                 ` Segher Boessenkool
@ 2020-11-23 10:20                   ` Florian Weimer via Libc-alpha
  0 siblings, 0 replies; 32+ messages in thread
From: Florian Weimer via Libc-alpha @ 2020-11-23 10:20 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: libc-alpha, Paul A. Clarke

* Segher Boessenkool:

> On Mon, Nov 16, 2020 at 11:51:27AM -0300, Tulio Magno Quites Machado Filho wrote:
>> But then, we're back to this point you had raised:
>> 
>> > I think we need documentation of what it means for a processor to implement
>> > ISA 3.0, and not altivec.
>> 
>> Unfortunately, I think this documentation will only exist after a new -mcpu
>> value is created.
>
> Nothing that uses the vector registers can work, and disabling all
> instructions that touch a vector register gets you 99.9% there.  This is
> completely analogous with -msoft-float (which really means "no FPRs").
>
> With -mno-altivec you also have no VSCR and VRSAVE registers, just like
> you lose FPSCR with -msoft-float.
>
> But yes, this should be documented in the ABI, it matters for the
> calling sequences etc.
>
> A "normal" compilation may use the vector registers if the -mcpu= you
> used allows that, so it cannot run on hardware without vector regs.  The
> compiler can also use vector registers if you did not explicitly ask for
> them (again analogous to the floating point registers), if you do not
> explicitly forbid it of course.

Fair enough.  I take it that the subdirectory logic in the patch is
correct.  If changes are needed, they will come in the form of new
subdirectories.

Thanks,
Florian

Thread overview: 32+ messages
2020-10-12 15:21 [PATCH v2 0/3] glibc-hwcaps support for LD_LIBRARY_PATH Florian Weimer via Libc-alpha
2020-10-12 15:21 ` [PATCH 1/3] elf: Add " Florian Weimer via Libc-alpha
2020-10-13 16:28   ` Paul A. Clarke via Libc-alpha
2020-10-14 13:58     ` Florian Weimer via Libc-alpha
2020-10-14 15:14       ` Paul A. Clarke via Libc-alpha
2020-10-14 15:19         ` Florian Weimer via Libc-alpha
2020-10-20 17:23   ` Paul A. Clarke via Libc-alpha
2020-10-12 15:21 ` [PATCH v2 2/3] x86_64: Add glibc-hwcaps support Florian Weimer via Libc-alpha
2020-10-12 18:11   ` H.J. Lu via Libc-alpha
2020-10-13  9:29     ` Florian Weimer via Libc-alpha
2020-10-13 11:02       ` H.J. Lu via Libc-alpha
2020-10-13 11:24         ` Florian Weimer via Libc-alpha
2020-10-13 11:43           ` H.J. Lu via Libc-alpha
2020-10-12 15:22 ` [PATCH v2 3/3] powerpc64le: " Florian Weimer via Libc-alpha
2020-10-13 16:36   ` Paul A. Clarke via Libc-alpha
2020-10-20 17:23   ` Paul A. Clarke via Libc-alpha
2020-10-29 16:26   ` Florian Weimer via Libc-alpha
2020-10-30 23:10   ` Tulio Magno Quites Machado Filho via Libc-alpha
2020-11-02 10:15     ` Florian Weimer via Libc-alpha
2020-11-03 15:14       ` Tulio Magno Quites Machado Filho via Libc-alpha
2020-11-03 16:29         ` Florian Weimer via Libc-alpha
2020-11-03 23:02           ` Segher Boessenkool
2020-11-04  8:28             ` Florian Weimer via Libc-alpha
2020-11-04 19:36               ` Segher Boessenkool
2020-11-04 19:56                 ` Florian Weimer via Libc-alpha
2020-11-04 21:58                   ` Segher Boessenkool
2020-11-05 11:40                     ` Florian Weimer via Libc-alpha
2020-11-05 21:42                       ` Segher Boessenkool
2020-11-09 18:32                         ` Florian Weimer via Libc-alpha
2020-11-16 14:51               ` Tulio Magno Quites Machado Filho via Libc-alpha
2020-11-16 19:35                 ` Segher Boessenkool
2020-11-23 10:20                   ` Florian Weimer via Libc-alpha
