unofficial mirror of libc-alpha@sourceware.org
* [PATCH] PPC64: First in the series of patches implementing POWER8  vector math.
@ 2019-02-14 20:56 GT
  2019-02-14 21:17 ` Joseph Myers
  2019-02-15 16:45 ` Steve Ellcey
  0 siblings, 2 replies; 24+ messages in thread
From: GT @ 2019-02-14 20:56 UTC
  To: libc-alpha@sourceware.org

[-- Attachment #1: Type: text/plain, Size: 15 bytes --]

Empty Message

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-PPC64-First-in-the-series-of-patches-implementing-PO.patch --]
[-- Type: text/x-patch; name="0001-PPC64-First-in-the-series-of-patches-implementing-PO.patch", Size: 20766 bytes --]

From 1740326ba3e5bac6c1524ad3acf6672e08f454fe Mon Sep 17 00:00:00 2001
From: Bert Tenjy <bert.tenjy@gmail.com>
Date: Thu, 14 Feb 2019 19:20:42 +0000
Subject: [PATCH] PPC64: First in the series of patches implementing POWER8
 vector math.

Implements double-precision cosine using VSX vector capability. Algorithm for
cosine is from x86_64 [commit #2193311288] adapted to PPC64.

Name-mangling exactly duplicates SSE ISA of the x86_64 ABI. The details are
at <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

Adds tests of the new double-precision vector cosine.

[BZ #24205]
2019-02-14    <bert.tenjy@gmail.com>

        * sysdeps/powerpc/bits/math-vector.h: New file.
        * sysdeps/powerpc/fpu/libm-test-ulps (cos_vlen2): Added accuracy of
        double-precision vector cosine.
        * sysdeps/powerpc/powerpc64/fpu/Versions: New file.
        * sysdeps/powerpc/powerpc64/multiarch/Makefile (libmvec-sysdep_routines)
        (double-vlen-funcs,double-vlen-arch-ext-flags): Added build of VSX
        vector cos function and its tests.
        * sysdeps/powerpc/powerpc64/multiarch/math-tests-arch.h: New file.
        * sysdeps/powerpc/powerpc64/multiarch/test-double-vlen2-wrappers.c: New file.
        * sysdeps/powerpc/powerpc64/multiarch/test-double-vlen2.c: New file.
        * sysdeps/powerpc/powerpc64/multiarch/vec_d_cos2_core.c: New file.
        * sysdeps/powerpc/powerpc64/multiarch/vec_d_cos2_power8.c: New file.
        * sysdeps/powerpc/powerpc64/multiarch/vec_d_cos2_vmx.c: New file.
        * sysdeps/powerpc/powerpc64/multiarch/vec_d_trig_data.h: New file.
        * sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist: New file.
---
 ChangeLog                                     | 19 ++++
 sysdeps/powerpc/bits/math-vector.h            | 41 +++++++++
 sysdeps/powerpc/fpu/libm-test-ulps            |  3 +
 sysdeps/powerpc/powerpc64/fpu/Versions        |  6 ++
 .../powerpc/powerpc64/fpu/multiarch/Makefile  | 17 ++++
 .../powerpc64/fpu/multiarch/math-tests-arch.h | 19 ++++
 .../multiarch/test-double-vlen2-wrappers.c    | 24 ++++++
 .../fpu/multiarch/test-double-vlen2.c         | 23 +++++
 .../powerpc64/fpu/multiarch/vec_d_cos2_core.c | 31 +++++++
 .../fpu/multiarch/vec_d_cos2_power8.c         | 86 +++++++++++++++++++
 .../powerpc64/fpu/multiarch/vec_d_cos2_vmx.c  | 34 ++++++++
 .../powerpc64/fpu/multiarch/vec_d_trig_data.h | 86 +++++++++++++++++++
 .../linux/powerpc/powerpc64/libmvec.abilist   |  1 +
 13 files changed, 390 insertions(+)
 create mode 100644 sysdeps/powerpc/bits/math-vector.h
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/Versions
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/math-tests-arch.h
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_core.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_power8.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vmx.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
 create mode 100644 sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist

diff --git a/ChangeLog b/ChangeLog
index ab9f593a55..432405ea52 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,22 @@
+2019-02-14    <bert.tenjy@gmail.com>
+
+	* sysdeps/powerpc/bits/math-vector.h: New file.
+	* sysdeps/powerpc/fpu/libm-test-ulps (cos_vlen2): Added accuracy of
+	double-precision vector cosine.
+	* sysdeps/powerpc/powerpc64/fpu/Versions: New file.
+	* sysdeps/powerpc/powerpc64/multiarch/Makefile (libmvec-sysdep_routines)
+	(double-vlen-funcs,double-vlen-arch-ext-flags): Added build of VSX
+	vector cos function and its tests.
+	* sysdeps/powerpc/powerpc64/multiarch/math-tests-arch.h: New file.
+	* sysdeps/powerpc/powerpc64/multiarch/test-double-vlen2-wrappers.c: New file.
+	* sysdeps/powerpc/powerpc64/multiarch/test-double-vlen2.c: New file.
+	* sysdeps/powerpc/powerpc64/multiarch/vec_d_cos2_core.c: New file.
+	* sysdeps/powerpc/powerpc64/multiarch/vec_d_cos2_power8.c: New file.
+	* sysdeps/powerpc/powerpc64/multiarch/vec_d_cos2_vmx.c: New file.
+	* sysdeps/powerpc/powerpc64/multiarch/vec_d_trig_data.h: New file.
+	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist: New file.
+
+
 2019-02-14  Wilco Dijkstra  <wdijkstr@arm.com>
 
 	* benchtests/Makefile: Add malloc-simple benchmark.
diff --git a/sysdeps/powerpc/bits/math-vector.h b/sysdeps/powerpc/bits/math-vector.h
new file mode 100644
index 0000000000..7d3cedabcc
--- /dev/null
+++ b/sysdeps/powerpc/bits/math-vector.h
@@ -0,0 +1,41 @@
+/* Platform-specific SIMD declarations of math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _MATH_H
+# error "Never include <bits/math-vector.h> directly;\
+ include <math.h> instead."
+#endif
+
+/* Get default empty definitions for simd declarations.  */
+#include <bits/libm-simd-decl-stubs.h>
+
+#if defined _ARCH_PPC64 && defined __FAST_MATH__
+# if defined _OPENMP && _OPENMP >= 201307
+/* OpenMP case.  */
+#  define __DECL_SIMD_ARCH_PPC64 _Pragma ("omp declare simd notinbranch")
+# elif __GNUC_PREREQ (6,0)
+/* W/o OpenMP use GCC 6.* __attribute__ ((__simd__)).  */
+#  define __DECL_SIMD_ARCH_PPC64 __attribute__ ((__simd__ ("notinbranch")))
+# endif
+
+# ifdef __DECL_SIMD_ARCH_PPC64
+#  undef __DECL_SIMD_cos
+#  define __DECL_SIMD_cos __DECL_SIMD_ARCH_PPC64
+
+# endif
+#endif
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index 1eec27c1dc..d392b135a7 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -1311,6 +1311,9 @@ ifloat128: 2
 ildouble: 5
 ldouble: 5
 
+Function: "cos_vlen2":
+double: 2
+
 Function: "cosh":
 double: 1
 float: 1
diff --git a/sysdeps/powerpc/powerpc64/fpu/Versions b/sysdeps/powerpc/powerpc64/fpu/Versions
new file mode 100644
index 0000000000..9da7f92ffe
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/Versions
@@ -0,0 +1,6 @@
+libmvec {
+  GLIBC_2.30 {
+    _ZGVbN2v_cos;
+  }
+}
+
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
index 39b557604c..3e9d541e75 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
@@ -42,3 +42,20 @@ CFLAGS-e_hypotf-power7.c = -mcpu=power7
 CFLAGS-s_modf-ppc64.c += -fsignaling-nans
 CFLAGS-s_modff-ppc64.c += -fsignaling-nans
 endif
+
+ifeq ($(subdir),mathvec)
+libmvec-sysdep_routines += vec_d_cos2_core vec_d_cos2_power8 \
+			   vec_d_cos2_vmx
+endif
+
+# Variables for libmvec tests.
+ifeq ($(subdir),math)
+ifeq ($(build-mathvec),yes)
+libmvec-tests += double-vlen2
+
+double-vlen2-funcs = cos
+
+double-vlen2-arch-ext-cflags = -mvsx
+
+endif
+endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/math-tests-arch.h b/sysdeps/powerpc/powerpc64/fpu/multiarch/math-tests-arch.h
new file mode 100644
index 0000000000..e79b98480b
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/math-tests-arch.h
@@ -0,0 +1,19 @@
+/*
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdeps/generic/math-tests-arch.h>
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
new file mode 100644
index 0000000000..17e2cc0724
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
@@ -0,0 +1,24 @@
+/* Wrapper part of tests for VSX ISA versions of vector math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "test-double-vlen2.h"
+#include <altivec.h>
+
+#define VEC_TYPE vector double
+
+VECTOR_WRAPPER (WRAPPER_NAME (cos), _ZGVbN2v_cos)
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2.c
new file mode 100644
index 0000000000..ca1673f103
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2.c
@@ -0,0 +1,23 @@
+/* Tests for VSX ISA versions of vector math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "test-double-vlen2.h"
+
+#define TEST_VECTOR_cos 1
+
+#include "libm-test.c"
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_core.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_core.c
new file mode 100644
index 0000000000..1fee72d50e
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_core.c
@@ -0,0 +1,31 @@
+/* Multiple versions of vectorized cos function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <math.h>
+#include <shlib-compat.h>
+#include "init-arch.h"
+
+vector double _ZGVbN2v_cos(vector double x);
+
+extern __typeof (_ZGVbN2v_cos) _ZGVbN2v_cos_vmx attribute_hidden;
+extern __typeof (_ZGVbN2v_cos) _ZGVbN2v_cos_vsx   attribute_hidden;
+
+libc_ifunc (_ZGVbN2v_cos,
+           (hwcap2 & PPC_FEATURE2_ARCH_2_07)
+           ? _ZGVbN2v_cos_vsx
+           : _ZGVbN2v_cos_vmx);
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_power8.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_power8.c
new file mode 100644
index 0000000000..7d22434a68
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_power8.c
@@ -0,0 +1,86 @@
+/* Function cos vectorized with VSX.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <math.h>
+#include "vec_d_trig_data.h"
+
+vector double _ZGVbN2v_cos_vsx (vector double x)
+{
+
+/*
+   ARGUMENT RANGE REDUCTION:
+   Add Pi/2 to argument: X' = X+Pi/2
+ */
+    vector double x_prime = (vector double) *dHalfPI + x;
+
+/* Get absolute argument value: X' = |X'| */
+    vector double abs_x_prime = vec_abs(x_prime);
+
+/* Y = X'*InvPi + RS : right shifter add */
+    vector double y = (x_prime * (*dInvPI)) + *dRShifter;
+
+/* Check for large arguments path */
+    vector bool long long large_in = vec_cmpgt(abs_x_prime,*dRangeVal);
+
+/* N = Y - RS : right shifter sub */
+    vector double n = y - *dRShifter;
+
+/* SignRes = Y<<63 : shift LSB to MSB place for result sign */
+    vector double sign_res = (vector double) vec_sl((vector long long ) y,
+                                (vector unsigned long long) vec_splats(63));
+
+/* N = N - 0.5 */
+    n = n - *dOneHalf;
+
+/* R = X - N*Pi1 */
+    vector double r = x - (n * (*dPI1_FMA));
+
+/* R = R - N*Pi2 */
+    r = r - (n * (*dPI2_FMA));
+
+/* R = R - N*Pi3 */
+    r = r - (n * (*dPI3_FMA));
+
+/* R2 = R*R */
+    vector double r2 = r * r;
+
+/* Poly = C3+R2*(C4+R2*(C5+R2*(C6+R2*C7))) */
+    vector double poly = *dC3 + r2*(*dC4 + r2*(*dC5 + r2*(*dC6 + r2*(*dC7))));
+
+/* Poly = R+R*(R2*(C1+R2*(C2+R2*Poly))) */
+    poly = r + r*(r2*(*dC1 + r2*(*dC2 + r2*poly)));
+
+/*
+   RECONSTRUCTION:
+   Final sign setting: Res = Poly^SignRes */
+    vector double out = (vector double)
+        ((vector long long) poly ^ (vector long long) sign_res);
+
+    if(large_in[0])
+    {
+        out[0] = cos(x[0]);
+    }
+
+    if(large_in[1])
+    {
+        out[1] = cos(x[1]);
+    }
+
+    return out;
+
+} // _ZGVbN2v_cos_vsx
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vmx.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vmx.c
new file mode 100644
index 0000000000..6f92b15eb4
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vmx.c
@@ -0,0 +1,34 @@
+/* PowerPC64 default version of vectorized cos function.
+   Calls scalar cos version twice.
+
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <math.h>
+#include <altivec.h>
+
+vector double _ZGVbN2v_cos_vmx (vector double x)
+{
+
+    vector double out;
+
+    out[0] = cos(x[0]);
+    out[1] = cos(x[1]);
+
+    return out;
+
+} // _ZGVbN2v_cos_vmx
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
new file mode 100644
index 0000000000..968bc680ab
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
@@ -0,0 +1,86 @@
+/* Constants used in polynomial approximations for vectorized sin, cos,
+   and sincos functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef D_TRIG_DATA_H
+#define D_TRIG_DATA_H
+
+#include <altivec.h>
+
+vector unsigned long long dHalfPI_t   =
+{0x3ff921fb54442d18,0x3ff921fb54442d18};
+
+vector unsigned long long dInvPI_t    =
+{0x3fd45f306dc9c883,0x3fd45f306dc9c883};
+
+vector unsigned long long dRShifter_t =
+{0x4338000000000000,0x4338000000000000};
+
+vector unsigned long long dRangeVal_t =
+{0x4160000000000000,0x4160000000000000};
+
+vector unsigned long long dOneHalf_t  =
+{0x3fe0000000000000,0x3fe0000000000000};
+
+vector unsigned long long dPI1_FMA_t  =
+{0x400921fb54442d18,0x400921fb54442d18};
+
+vector unsigned long long dPI2_FMA_t  =
+{0x3ca1a62633145c06,0x3ca1a62633145c06};
+
+vector unsigned long long dPI3_FMA_t  =
+{0x395c1cd129024e09,0x395c1cd129024e09};
+
+vector unsigned long long dC7_t       =
+{0xbd69f0d60811aac8,0xbd69f0d60811aac8};
+
+vector unsigned long long dC6_t       =
+{0x3de60e6857a2f220,0x3de60e6857a2f220};
+
+vector unsigned long long dC5_t       =
+{0xbe5ae63546002231,0xbe5ae63546002231};
+
+vector unsigned long long dC4_t       =
+{0x3ec71de38030fea0,0x3ec71de38030fea0};
+
+vector unsigned long long dC3_t       =
+{0xbf2a01a019a5b86d,0xbf2a01a019a5b86d};
+
+vector unsigned long long dC2_t       =
+{0x3f8111111110a4a8,0x3f8111111110a4a8};
+
+vector unsigned long long dC1_t       =
+{0xbfc55555555554a7,0xbfc55555555554a7};
+
+vector double *dHalfPI     = (vector double *) &dHalfPI_t;
+vector double *dInvPI      = (vector double *) &dInvPI_t;
+vector double *dRShifter   = (vector double *) &dRShifter_t;
+vector double *dRangeVal   = (vector double *) &dRangeVal_t;
+vector double *dOneHalf    = (vector double *) &dOneHalf_t;
+vector double *dPI1_FMA    = (vector double *) &dPI1_FMA_t;
+vector double *dPI2_FMA    = (vector double *) &dPI2_FMA_t;
+vector double *dPI3_FMA    = (vector double *) &dPI3_FMA_t;
+vector double *dC7         = (vector double *) &dC7_t;
+vector double *dC6         = (vector double *) &dC6_t;
+vector double *dC5         = (vector double *) &dC5_t;
+vector double *dC4         = (vector double *) &dC4_t;
+vector double *dC3         = (vector double *) &dC3_t;
+vector double *dC2         = (vector double *) &dC2_t;
+vector double *dC1         = (vector double *) &dC1_t;
+
+#endif // D_TRIG_DATA_H
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
new file mode 100644
index 0000000000..656ce0541f
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
@@ -0,0 +1 @@
+GLIBC_2.30 _ZGVbN2v_cos F
-- 
2.20.1


* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-14 20:56 GT
@ 2019-02-14 21:17 ` Joseph Myers
  2019-02-15 20:33   ` GT
  2019-02-18 18:30   ` GT
  2019-02-15 16:45 ` Steve Ellcey
  1 sibling, 2 replies; 24+ messages in thread
From: Joseph Myers @ 2019-02-14 21:17 UTC
  To: GT; +Cc: libc-alpha@sourceware.org

If you have a copyright assignment / employer disclaimer on file at the 
FSF, could you give details of the names involved?

>         * sysdeps/powerpc/powerpc64/multiarch/test-double-vlen2.c: New file.

Why?  x86_64 doesn't have this.

> +#if defined _ARCH_PPC64 && defined __FAST_MATH__

Is _ARCH_PPC64 correct here - what's the status of support (in the GNU 
toolchain, Linux kernel, etc.) for -mpowerpc64 with the 32-bit ABI (which 
also defines _ARCH_PPC64)?

> diff --git a/sysdeps/powerpc/powerpc64/fpu/Versions b/sysdeps/powerpc/powerpc64/fpu/Versions
> new file mode 100644
> index 0000000000..9da7f92ffe
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/fpu/Versions
> @@ -0,0 +1,6 @@
> +libmvec {
> +  GLIBC_2.30 {
> +    _ZGVbN2v_cos;
> +  }
> +}
> +

You can't push commits that add blank lines at end of files; the 
repository blocks such pushes.

> diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/math-tests-arch.h b/sysdeps/powerpc/powerpc64/fpu/multiarch/math-tests-arch.h
> new file mode 100644
> index 0000000000..e79b98480b
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/math-tests-arch.h
> @@ -0,0 +1,19 @@
> +/*
> +   Copyright (C) 2019 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sysdeps/generic/math-tests-arch.h>

This file is useless (the only point of a sysdeps file that just includes 
the generic version would be to override a file in another sysdeps 
directory that would otherwise be used).

> +vector double _ZGVbN2v_cos_vsx (vector double x)
> +{
> +
> +/*
> +   ARGUMENT RANGE REDUCTION:
> +   Add Pi/2 to argument: X' = X+Pi/2
> + */
> +    vector double x_prime = (vector double) *dHalfPI + x;

Code formatting needs fixing to follow the GNU Coding Standards, here and 
elsewhere in the patch.

> +#include <altivec.h>
> +
> +vector unsigned long long dHalfPI_t   =
> +{0x3ff921fb54442d18,0x3ff921fb54442d18};

There are namespace problems here.  All external constants should have 
names in the implementation namespace, meaning two leading underscores.  
They should also be const; don't add new writable global variables.  In 
addition, use vector double and hex floats, rather than writing out the 
integer representation.  In addition, each constant should have a comment 
explaining its semantics, sufficiently precisely that someone could 
recompute the value and verify its correctness.
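
As a rough sketch of the hex-float form, the first few constants could
be written as below.  The sketch_ names and the double_bits helper are
hypothetical and exist only so the values can be checked against the
integer patterns in the patch; in glibc the names would need the
two-underscore prefix, and the values would be splatted into const
vector double objects, which is not shown because it needs a PowerPC
compiler.

```c
#include <stdint.h>
#include <string.h>

/* pi/2 as a double; added to the argument so that cos (x) can be
   evaluated as sin (x + pi/2).  */
const double sketch_half_pi = 0x1.921fb54442d18p+0;

/* 1/pi as a double; scales the shifted argument to multiples of pi.  */
const double sketch_inv_pi = 0x1.45f306dc9c883p-2;

/* 1.5 * 2^52; adding and then subtracting it rounds a double of modest
   magnitude to the nearest integer.  */
const double sketch_rshifter = 0x1.8p+52;

/* 2^23; arguments with larger magnitude take the scalar fallback.  */
const double sketch_range_val = 0x1p+23;

/* Return the IEEE 754 bit pattern of D (hypothetical helper, used only
   to compare against the integer representation).  */
uint64_t
double_bits (double d)
{
  uint64_t u;
  memcpy (&u, &d, sizeof u);
  return u;
}
```

With this, double_bits (sketch_half_pi) can be compared directly with
0x3ff921fb54442d18 from the patch, which is the kind of verification the
per-constant comments are meant to make possible.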

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-14 20:56 GT
  2019-02-14 21:17 ` Joseph Myers
@ 2019-02-15 16:45 ` Steve Ellcey
  2019-02-18 18:32   ` Tulio Magno Quites Machado Filho
  1 sibling, 1 reply; 24+ messages in thread
From: Steve Ellcey @ 2019-02-15 16:45 UTC
  To: libc-alpha@sourceware.org, tnggil@protonmail.com


I am curious: have patches been sent, or will patches be sent, to GCC to
generate calls to vector functions?  I do not see any of the
TARGET_SIMD_CLONE* macros defined for power8 in the GCC tree so I don't
see how it would ever generate calls to the vector functions.

I am also not clear on how we decide on the mangling of the vector
function names, particularly the 'b'.  I know that x86 uses 'b', 'c',
'd', or 'e' depending on the FP vectors available.  Aarch64 is using
'n' in its name mangling (and maybe something else for SVE).  Is it OK
for Power8 to use the same letter as x86?  I don't know if this is
covered in some standard or how the letters were chosen.
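
For reference, the name in the patch decomposes under the x86_64 vector
ABI scheme as _ZGV, ISA letter, mask letter, vector length, one letter
per parameter, an underscore, and the scalar name.  A minimal sketch of
that composition follows; simd_mangle is a hypothetical helper, and it
says nothing about whether 'b' is the right ISA letter for POWER8.

```c
#include <stdio.h>
#include <string.h>

/* Build a vector-function name following the x86_64 vector ABI scheme
   _ZGV<isa><mask><vlen><params>_<scalar name>.  Hypothetical helper:
   returns a pointer to a static buffer, so the result is overwritten by
   the next call.  */
const char *
simd_mangle (char isa, char mask, unsigned int vlen, const char *params,
             const char *scalar_name)
{
  static char buf[128];

  snprintf (buf, sizeof buf, "_ZGV%c%c%u%s_%s", isa, mask, vlen, params,
            scalar_name);
  return buf;
}
```

Here simd_mangle ('b', 'N', 2, "v", "cos") yields the symbol exported by
the patch: 'b' is the ISA letter x86_64 uses for SSE, 'N' means
unmasked, 2 is the number of lanes, and "v" marks a single vector
parameter.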

Steve Ellcey
sellcey@marvell.com

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8  vector math.
  2019-02-14 21:17 ` Joseph Myers
@ 2019-02-15 20:33   ` GT
  2019-02-15 21:07     ` Joseph Myers
  2019-02-18 18:30   ` GT
  1 sibling, 1 reply; 24+ messages in thread
From: GT @ 2019-02-15 20:33 UTC
  To: libc-alpha@sourceware.org

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, February 14, 2019 9:17 PM, Joseph Myers <joseph@codesourcery.com> wrote:

> If you have a copyright assignment / employer disclaimer on file at the
> FSF, could you give details of the names involved?
>

I emailed an electronically signed copyright assignment to assign@gnu.org a little while ago. The name is Bert Tenjy and there are no employers/schools/other who could claim any contributions by me.

> >         * sysdeps/powerpc/powerpc64/multiarch/test-double-vlen2.c: New file.
> >
>
> Why? x86_64 doesn't have this.
>

I have been shadowing the sequence of function implementation and testing done by x86_64. Looking further ahead, there was a significant reorganization of the testing procedure which removed multiple files, including the one in question here. I will rewrite PPC64 testing to fit into the current testing infrastructure.

> > +#if defined _ARCH_PPC64 && defined __FAST_MATH__
>
> Is _ARCH_PPC64 correct here - what's the status of support (in the GNU
> toolchain, Linux kernel, etc.) for -mpowerpc64 with the 32-bit ABI (which
> also defines _ARCH_PPC64)?
>

To address the support issues raised, a solution would be to have 'configure' verify that the compiler generates a valid executable. Then _ARCH_PPC64 would be replaced here by a macro determined at configuration time.

> > diff --git a/sysdeps/powerpc/powerpc64/fpu/Versions b/sysdeps/powerpc/powerpc64/fpu/Versions
> > new file mode 100644
> > index 0000000000..9da7f92ffe
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc64/fpu/Versions
> > @@ -0,0 +1,6 @@
> > +libmvec {
> > +  GLIBC_2.30 {
> > +    _ZGVbN2v_cos;
> > +  }
> > +}
> > +
>
> You can't push commits that add blank lines at end of files; the
> repository blocks such pushes.
>

Will ensure no such blank lines in any file.

> > diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/math-tests-arch.h b/sysdeps/powerpc/powerpc64/fpu/multiarch/math-tests-arch.h
> > new file mode 100644
> > index 0000000000..e79b98480b
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/math-tests-arch.h
> > @@ -0,0 +1,19 @@
> > +/*
> > +   Copyright (C) 2019 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <http://www.gnu.org/licenses/>.  */
> > +
> > +#include <sysdeps/generic/math-tests-arch.h>
>
> This file is useless (the only point of a sysdeps file that just includes
> the generic version would be to override a file in another sysdeps
> directory that would otherwise be used).
>

This has to do with the code not using the latest testing infrastructure. Fixing that should either preserve this file with overrides or eliminate its inclusion.

> > +vector double _ZGVbN2v_cos_vsx (vector double x)
> > +{
> > +
> > +/*
> > +   ARGUMENT RANGE REDUCTION:
> > +   Add Pi/2 to argument: X' = X+Pi/2
> > + */
> > +    vector double x_prime = (vector double) *dHalfPI + x;
>
> Code formatting needs fixing to follow the GNU Coding Standards, here and
> elsewhere in the patch.
>

Will look through the standard and adhere.

> > +#include <altivec.h>
> > +
> > +vector unsigned long long dHalfPI_t =
> > +{0x3ff921fb54442d18,0x3ff921fb54442d18};
>
> There are namespace problems here. All external constants should have
> names in the implementation namespace, meaning two leading underscores.
> They should also be const; don't add new writable global variables. In
> addition, use vector double and hex floats, rather than writing out the
> integer representation. In addition, each constant should have a comment
> explaining its semantics, sufficiently precisely that someone could
> recompute the value and verify its correctness.
>
>

Will fix.

===========
Thanks.
Bert Tenjy.

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-15 20:33   ` GT
@ 2019-02-15 21:07     ` Joseph Myers
  2019-02-16  2:16       ` GT
  0 siblings, 1 reply; 24+ messages in thread
From: Joseph Myers @ 2019-02-15 21:07 UTC
  To: GT; +Cc: libc-alpha@sourceware.org

On Fri, 15 Feb 2019, GT wrote:

> > > +#if defined _ARCH_PPC64 && defined __FAST_MATH__
> >
> > Is _ARCH_PPC64 correct here - what's the status of support (in the GNU
> > toolchain, Linux kernel, etc.) for -mpowerpc64 with the 32-bit ABI (which
> > also defines _ARCH_PPC64)?
> 
> To address the support issues raised, a solution would be to have 
> 'configure' verify that the compiler generates a valid executable. Then 
> _ARCH_PPC64 would be replaced here by a macro determined at 
> configuration time.

Since installed headers need to work for all multilibs that might share a 
compiler and a set of headers, a configure-time test isn't suitable here.  
I think __powerpc64__ is the correct thing to test as an ABI conditional 
(as opposed to an instruction set conditional) - it's what 
sysdeps/unix/sysv/linux/powerpc/bits/wordsize.h uses.
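
As a minimal sketch of that distinction (an illustration only, not taken
from the patch; both helper names are hypothetical): __powerpc64__ is
predefined only when targeting the 64-bit ABI, while _ARCH_PPC64 describes
the instruction set and is also defined for -mpowerpc64 with the 32-bit
ABI.

```c
/* Hypothetical helper: nonzero when the ABI conditional holds, i.e. the
   translation unit is being compiled for the 64-bit powerpc ABI.  */
static int
abi_is_powerpc64 (void)
{
#if defined __powerpc64__
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper: nonzero when the instruction-set conditional
   holds, which says nothing about the ABI in use.  */
static int
isa_is_ppc64 (void)
{
#if defined _ARCH_PPC64
  return 1;
#else
  return 0;
#endif
}
```

Because both are compiler-predefined macros, an installed header that
tests them gets the right answer for every multilib without any
configure-time input.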

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8  vector math.
  2019-02-15 21:07     ` Joseph Myers
@ 2019-02-16  2:16       ` GT
  0 siblings, 0 replies; 24+ messages in thread
From: GT @ 2019-02-16  2:16 UTC (permalink / raw
  To: libc-alpha@sourceware.org

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, February 15, 2019 9:07 PM, Joseph Myers <joseph@codesourcery.com> wrote:

> On Fri, 15 Feb 2019, GT wrote:
>
> > > > +#if defined _ARCH_PPC64 && defined FAST_MATH
> > >
> > > Is _ARCH_PPC64 correct here - what's the status of support (in the GNU
> > > toolchain, Linux kernel, etc.) for -mpowerpc64 with the 32-bit ABI (which
> > > also defines _ARCH_PPC64)?
> >
> > To address the support issues raised, a solution would be to have
> > 'configure' verify that the compiler generates a valid executable. Then
> > _ARCH_PPC64 would be replaced here by a macro determined at
> > configuration time.
>
> Since installed headers need to work for all multilibs that might share a
> compiler and a set of headers, a configure-time test isn't suitable here.
> I think __powerpc64__ is the correct thing to test as an ABI conditional
> (as opposed to an instruction set conditional) - it's what
> sysdeps/unix/sysv/linux/powerpc/bits/wordsize.h uses.
>

Going by Table 5.1 of the 64-bit ELFv2 ABI, __powerpc64__ does appear to be the right macro to test for. I will make that change.

===========
Bert Tenjy.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8  vector math.
  2019-02-14 21:17 ` Joseph Myers
  2019-02-15 20:33   ` GT
@ 2019-02-18 18:30   ` GT
  2019-02-18 18:38     ` Joseph Myers
  1 sibling, 1 reply; 24+ messages in thread
From: GT @ 2019-02-18 18:30 UTC (permalink / raw
  To: libc-alpha@sourceware.org

>
> > +#include <altivec.h>
> > +
> > +vector unsigned long long dHalfPI_t =
> > +{0x3ff921fb54442d18,0x3ff921fb54442d18};
>
> There are namespace problems here. All external constants should have
> names in the implementation namespace, meaning two leading underscores.
> They should also be const; don't add new writable global variables. In
> addition, use vector double and hex floats, rather than writing out the
> integer representation. In addition, each constant should have a comment
> explaining its semantics, sufficiently precisely that someone could
> recompute the value and verify its correctness.

I have addressed all the issues you raised except for one. I haven't been able to locate documentation that explains how the corrections to some constants used in the x86_64 code were derived: the corrections to PI when it is used in repeated FMA operations, and the corrections to the 1/factorial(n) coefficients in the approximation polynomial.
I put the question of how to obtain these derivations on the mailing list, and am also still looking online for a solution.
May I resubmit the patch, or is the lack of clarity regarding these constants a show-stopper?

Thanks.
Bert.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8  vector math.
  2019-02-15 16:45 ` Steve Ellcey
@ 2019-02-18 18:32   ` Tulio Magno Quites Machado Filho
  2019-02-18 19:13     ` GT
  0 siblings, 1 reply; 24+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2019-02-18 18:32 UTC (permalink / raw
  To: Steve Ellcey, libc-alpha@sourceware.org, tnggil@protonmail.com,
	andrew.n.senkevich
  Cc: William J. Schmidt, Segher Boessenkool

Steve Ellcey <sellcey@marvell.com> writes:

> I am curious, have patches been sent or will patches be sent to GCC to
> generate calls to vector functions.  I do not see any of the
> TARGET_SIMD_CLONE* macros defined for power8 in the GCC tree so I don't
> see how it would ever generate calls to the vector functions.

That still has to be implemented.

> I am also not clear on how we decide on the mangling of the vector
> function names, particularly the 'b'.  I know that x86 uses 'b', 'c',
> 'd', or 'e' depending on the FP vectors available.  Aarch64 is using
> 'n' in its name mangling (and maybe something else for SVE).  Is it OK
> for Power8 to use the same letter as x86?  I don't know if this is
> covered in some standard or how the letters were chosen.

It was proposed to the X86-64 System V Application Binary Interface, but
it was refused. [1]

Another question: in the C++ ABI, "_ZGV" is reserved for guard variables.
How is this name collision being treated?

Andrew, could you help answer these questions?

[1] https://groups.google.com/d/msg/x86-64-abi/LmppCfN1rZ4/TydP1Gxr4cIJ

-- 
Tulio Magno

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-18 18:30   ` GT
@ 2019-02-18 18:38     ` Joseph Myers
  0 siblings, 0 replies; 24+ messages in thread
From: Joseph Myers @ 2019-02-18 18:38 UTC (permalink / raw
  To: GT; +Cc: libc-alpha@sourceware.org

On Mon, 18 Feb 2019, GT wrote:

> I have addressed all the issues you raised except for one. I haven't
> been able to locate documentation that explains how the corrections to
> some constants used in the x86_64 code were derived: the corrections to
> PI when it is used in repeated FMA operations, and the corrections to
> the 1/factorial(n) coefficients in the approximation polynomial. I put
> the question of how to obtain these derivations on the mailing list,
> and am also still looking online for a solution. May I resubmit the
> patch, or is the lack of clarity regarding these constants a
> show-stopper?

Well, you could copy the comments from the x86_64 code, if these are the 
same constants.

I'd expect, for example, given the comments, that __dPI1_FMA + __dPI2_FMA 
+ __dPI3_FMA (using the x86_64 names) is an approximation to pi to about 
3*53 bits, which is something that should be easy to verify.
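
One way to carry out that check without a multi-precision library is
sketched below. It is an illustration under assumptions, not code from
either port: the helper names are hypothetical, the three values are the
d_pi1_fma, d_pi2_fma and d_pi3_fma constants from the PPC64 patch in this
thread, and the reference is the well-known leading hexadecimal expansion
of pi (3.243F6A88 85A308D3 13198A2E 03707344 A4093822...). Each double is
added exactly, bit by bit, into a fixed-point number with 160 fractional
bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fixed-point layout: limb 0 is the integer part, limbs 1..5 hold 160
   fractional bits, most significant limb first.  */
#define NLIMBS 6
#define FRAC_BITS 160

/* Leading hexadecimal digits of pi, truncated to 160 fractional bits.  */
static const uint32_t pi_ref[NLIMBS] =
  { 0x3, 0x243F6A88, 0x85A308D3, 0x13198A2E, 0x03707344, 0xA4093822 };

/* Add 2^(BIT - FRAC_BITS) to ACC, propagating carries upwards.  */
static void
add_bit (uint32_t acc[NLIMBS], int bit)
{
  assert (bit >= 0 && bit < 32 * NLIMBS);
  int limb = NLIMBS - 1 - bit / 32;
  uint64_t carry = UINT64_C (1) << (bit % 32);
  while (carry != 0 && limb >= 0)
    {
      uint64_t sum = (uint64_t) acc[limb] + carry;
      acc[limb] = (uint32_t) sum;
      carry = sum >> 32;
      limb--;
    }
}

/* Add the positive normal double D exactly to ACC.  */
static void
add_double (uint32_t acc[NLIMBS], double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  int biased = (int) ((bits >> 52) & 0x7ff);
  assert ((bits >> 63) == 0 && biased != 0 && biased != 0x7ff);
  /* D equals M * 2^(biased - 1075) with M a 53-bit integer.  */
  uint64_t m = (bits & ((UINT64_C (1) << 52) - 1)) | (UINT64_C (1) << 52);
  int lsb = biased - 1075 + FRAC_BITS;
  for (int k = 0; k < 53; k++)
    if ((m >> k) & 1)
      add_bit (acc, lsb + k);
}

/* Store the exact sum of the three pi parts in OUT.  */
static void
sum_pi_parts (uint32_t out[NLIMBS])
{
  static const double parts[3] =
    { 0x1.921fb54442d18p+1, 0x1.1a62633145c06p-53, 0x1.c1cd129024e09p-106 };
  for (int i = 0; i < NLIMBS; i++)
    out[i] = 0;
  for (int i = 0; i < 3; i++)
    add_double (out, parts[i]);
}
```

Comparing the result of sum_pi_parts with pi_ref limb by limb shows how
many leading bits agree; only the last limb should differ, and only by a
couple of units of 2^-160.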

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8  vector math.
  2019-02-18 18:32   ` Tulio Magno Quites Machado Filho
@ 2019-02-18 19:13     ` GT
  2019-02-19 19:38       ` Tulio Magno Quites Machado Filho
  0 siblings, 1 reply; 24+ messages in thread
From: GT @ 2019-02-18 19:13 UTC (permalink / raw
  To: libc-alpha@sourceware.org

> > I am also not clear on how we decide on the mangling of the vector
> > function names, particularly the 'b'. I know that x86 uses 'b', 'c',
> > 'd', or 'e' depending on the FP vectors available. Aarch64 is using
> > 'n' in its name mangling (and maybe something else for SVE). Is it OK
> > for Power8 to use the same letter as x86? I don't know if this is
> > covered in some standard or how the letters were chosen.
>
> It was proposed to the X86-64 System V Application Binary Interface, but
> it was refused. [1]

Meaning, Power8 cannot use the 'b'?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
@ 2019-02-18 23:48 GT
  2019-02-19  1:42 ` Joseph Myers
  0 siblings, 1 reply; 24+ messages in thread
From: GT @ 2019-02-18 23:48 UTC (permalink / raw
  To: libc-alpha@sourceware.org

[-- Attachment #1: Type: text/plain, Size: 15 bytes --]

Empty Message

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-PPC64-First-in-the-series-of-patches-implementing-PO.patch --]
[-- Type: text/x-patch; name="0001-PPC64-First-in-the-series-of-patches-implementing-PO.patch", Size: 16267 bytes --]

From 65fda4d67a61d7ae19e53001855cd4482e2bf59e Mon Sep 17 00:00:00 2001
From: Bert Tenjy <bert.tenjy@gmail.com>
Date: Mon, 18 Feb 2019 23:24:46 +0000
Subject: [PATCH] PPC64: First in the series of patches implementing POWER8
 vector math.

Implements double-precision cosine using VSX vector capability. Algorithm for
cosine is from x86_64 [commit #2193311288] adapted to PPC64.

Name-mangling exactly duplicates SSE ISA of the x86_64 ABI. The details are
at <https://groups.google.com/forum/#!topic/x86-64-abi/LmppCfN1rZ4>.

Adds tests of the new double-precision vector cosine.

[BZ #24205]
---
 ChangeLog                                     | 16 ++++
 sysdeps/powerpc/bits/math-vector.h            | 41 ++++++++
 sysdeps/powerpc/fpu/libm-test-ulps            |  3 +
 sysdeps/powerpc/powerpc64/fpu/Versions        |  5 +
 .../powerpc/powerpc64/fpu/multiarch/Makefile  | 17 ++++
 .../multiarch/test-double-vlen2-wrappers.c    | 24 +++++
 .../powerpc64/fpu/multiarch/vec_d_cos2_core.c | 30 ++++++
 .../fpu/multiarch/vec_d_cos2_power8.c         | 93 +++++++++++++++++++
 .../powerpc64/fpu/multiarch/vec_d_cos2_vmx.c  | 35 +++++++
 .../powerpc64/fpu/multiarch/vec_d_trig_data.h | 61 ++++++++++++
 .../linux/powerpc/powerpc64/libmvec.abilist   |  1 +
 11 files changed, 326 insertions(+)
 create mode 100644 sysdeps/powerpc/bits/math-vector.h
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/Versions
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_core.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_power8.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vmx.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
 create mode 100644 sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist

diff --git a/ChangeLog b/ChangeLog
index 312ef3bd8f..ccdd855136 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,19 @@
+2019-02-18    <bert.tenjy@gmail.com>
+
+	* sysdeps/powerpc/bits/math-vector.h: New file.
+	* sysdeps/powerpc/fpu/libm-test-ulps (cos_vlen2): Regenerated.
+	* sysdeps/powerpc/powerpc64/fpu/Versions: New file.
+	* sysdeps/powerpc/powerpc64/multiarch/Makefile (libmvec-sysdep_routines)
+            (double-vlen-funcs,double-vlen-arch-ext-flags): Added build of VSX
+            vector cos function and its tests.
+	* sysdeps/powerpc/powerpc64/multiarch/test-double-vlen2-wrappers.c: New file.
+	* sysdeps/powerpc/powerpc64/multiarch/vec_d_cos2_core.c: New file.
+	* sysdeps/powerpc/powerpc64/multiarch/vec_d_cos2_power8.c: New file.
+	* sysdeps/powerpc/powerpc64/multiarch/vec_d_cos2_vmx.c: New file.
+	* sysdeps/powerpc/powerpc64/multiarch/vec_d_trig_data.h: New file.
+	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist: New file.
+
+
 2019-02-18  Florian Weimer  <fweimer@redhat.com>
 
 	* resolv/compat-gethnamaddr.c (Dprintf): Remove definition.
diff --git a/sysdeps/powerpc/bits/math-vector.h b/sysdeps/powerpc/bits/math-vector.h
new file mode 100644
index 0000000000..a569b19e7a
--- /dev/null
+++ b/sysdeps/powerpc/bits/math-vector.h
@@ -0,0 +1,41 @@
+/* Platform-specific SIMD declarations of math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _MATH_H
+# error "Never include <bits/math-vector.h> directly;\
+ include <math.h> instead."
+#endif
+
+/* Get default empty definitions for simd declarations.  */
+#include <bits/libm-simd-decl-stubs.h>
+
+#if defined __POWERPC64__ && defined __FAST_MATH__
+# if defined _OPENMP && _OPENMP >= 201307
+/* OpenMP case.  */
+#  define __DECL_SIMD_ARCH_PPC64 _Pragma ("omp declare simd notinbranch")
+# elif __GNUC_PREREQ (6,0)
+/* W/o OpenMP use GCC 6.* __attribute__ ((__simd__)).  */
+#  define __DECL_SIMD_ARCH_PPC64 __attribute__ ((__simd__ ("notinbranch")))
+# endif
+
+# ifdef __DECL_SIMD_ARCH_PPC64
+#  undef __DECL_SIMD_cos
+#  define __DECL_SIMD_cos __DECL_SIMD_ARCH_PPC64
+
+# endif
+#endif
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index 1eec27c1dc..d392b135a7 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -1311,6 +1311,9 @@ ifloat128: 2
 ildouble: 5
 ldouble: 5
 
+Function: "cos_vlen2":
+double: 2
+
 Function: "cosh":
 double: 1
 float: 1
diff --git a/sysdeps/powerpc/powerpc64/fpu/Versions b/sysdeps/powerpc/powerpc64/fpu/Versions
new file mode 100644
index 0000000000..9a3e1211cc
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/Versions
@@ -0,0 +1,5 @@
+libmvec {
+  GLIBC_2.30 {
+    _ZGVbN2v_cos;
+  }
+}
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
index 39b557604c..51e89a2532 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
@@ -42,3 +42,20 @@ CFLAGS-e_hypotf-power7.c = -mcpu=power7
 CFLAGS-s_modf-ppc64.c += -fsignaling-nans
 CFLAGS-s_modff-ppc64.c += -fsignaling-nans
 endif
+
+ifeq ($(subdir),mathvec)
+libmvec-sysdep_routines += vec_d_cos2_core vec_d_cos2_power8 \
+                          vec_d_cos2_vmx
+endif
+
+# Variables for libmvec tests.
+ifeq ($(subdir),math)
+ifeq ($(build-mathvec),yes)
+libmvec-tests += double-vlen2
+
+double-vlen2-funcs = cos
+
+double-vlen2-arch-ext-cflags = -mvsx
+
+endif
+endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
new file mode 100644
index 0000000000..17e2cc0724
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
@@ -0,0 +1,24 @@
+/* Wrapper part of tests for VSX ISA versions of vector math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "test-double-vlen2.h"
+#include <altivec.h>
+
+#define VEC_TYPE vector double
+
+VECTOR_WRAPPER (WRAPPER_NAME (cos), _ZGVbN2v_cos)
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_core.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_core.c
new file mode 100644
index 0000000000..e089a8d844
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_core.c
@@ -0,0 +1,30 @@
+/* Multiple versions of vectorized cos function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <math.h>
+#include <shlib-compat.h>
+#include "init-arch.h"
+
+vector double _ZGVbN2v_cos (vector double x);
+
+extern __typeof (_ZGVbN2v_cos) _ZGVbN2v_cos_vmx attribute_hidden;
+extern __typeof (_ZGVbN2v_cos) _ZGVbN2v_cos_vsx attribute_hidden;
+
+libc_ifunc (_ZGVbN2v_cos,
+	    (hwcap2 & PPC_FEATURE2_ARCH_2_07)
+	    ? _ZGVbN2v_cos_vsx : _ZGVbN2v_cos_vmx);
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_power8.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_power8.c
new file mode 100644
index 0000000000..4f6c9cce6b
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_power8.c
@@ -0,0 +1,93 @@
+/* Function cos vectorized with VSX.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <math.h>
+#include "vec_d_trig_data.h"
+
+vector double
+_ZGVbN2v_cos_vsx (vector double x)
+{
+
+/*
+   ARGUMENT RANGE REDUCTION:
+   Add Pi/2 to argument: X' = X+Pi/2
+ */
+  vector double x_prime = (vector double) d_half_pi + x;
+
+/* Get absolute argument value: X' = |X'| */
+  vector double abs_x_prime = vec_abs (x_prime);
+
+/* Y = X'*InvPi + RS : right shifter add */
+  vector double y = (x_prime * d_inv_pi) + d_rshifter;
+
+/* Check for large arguments path */
+  vector bool long long large_in = vec_cmpgt (abs_x_prime, d_rangeval);
+
+/* N = Y - RS : right shifter sub */
+  vector double n = y - d_rshifter;
+
+/* SignRes = Y<<63 : shift LSB to MSB place for result sign */
+  vector double sign_res = (vector double) vec_sl ((vector long long) y,
+						   (vector unsigned long long)
+						   vec_splats (63));
+
+/* N = N - 0.5 */
+  n = n - d_one_half;
+
+/* R = X - N*Pi1 */
+  vector double r = x - (n * d_pi1_fma);
+
+/* R = R - N*Pi2 */
+  r = r - (n * d_pi2_fma);
+
+/* R = R - N*Pi3 */
+  r = r - (n * d_pi3_fma);
+
+/* R2 = R*R */
+  vector double r2 = r * r;
+
+/* Poly = C3+R2*(C4+R2*(C5+R2*(C6+R2*C7))) */
+  vector double poly = r2 * d_coeff7 + d_coeff6;
+  poly = poly * r2 + d_coeff5;
+  poly = poly * r2 + d_coeff4;
+  poly = poly * r2 + d_coeff3;
+
+/* Poly = R+R*(R2*(C1+R2*(C2+R2*Poly))) */
+  poly = poly * r2 + d_coeff2;
+  poly = poly * r2 + d_coeff1;
+  poly = poly * r2 * r + r;
+
+/*
+   RECONSTRUCTION:
+   Final sign setting: Res = Poly^SignRes */
+  vector double out
+    = (vector double) ((vector long long) poly ^ (vector long long) sign_res);
+
+  if (large_in[0] != 0)
+    {
+      out[0] = cos (x[0]);
+    }
+
+  if (large_in[1] != 0)
+    {
+      out[1] = cos (x[1]);
+    }
+
+  return out;
+
+}				/* _ZGVbN2v_cos_vsx */
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vmx.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vmx.c
new file mode 100644
index 0000000000..e32fd77b6c
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vmx.c
@@ -0,0 +1,35 @@
+/* PowerPC64 default version of vectorized cos function.
+   Calls scalar cos version twice.
+
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <math.h>
+#include <altivec.h>
+
+vector double
+_ZGVbN2v_cos_vmx (vector double x)
+{
+
+  vector double out;
+
+  out[0] = cos (x[0]);
+  out[1] = cos (x[1]);
+
+  return out;
+
+}				// _ZGVbN2v_cos_vmx
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
new file mode 100644
index 0000000000..ae33017324
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
@@ -0,0 +1,61 @@
+/* Constants used in polynomial approximations for vectorized sin, cos,
+   and sincos functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef D_TRIG_DATA_H
+#define D_TRIG_DATA_H
+
+#include <altivec.h>
+
+/* PI/2 */
+const vector double d_half_pi  = {0x1.921fb54442d18p+0, 0x1.921fb54442d18p+0};
+
+/* 1/PI */
+const vector double d_inv_pi   = {0x1.45f306dc9c883p-2, 0x1.45f306dc9c883p-2};
+
+/* right-shifter constant */
+const vector double d_rshifter = {0x1.8p+52, 0x1.8p+52};
+
+/* working range threshold */
+const vector double d_rangeval = {0x1p+23, 0x1p+23};
+
+/* 0.5 */
+const vector double d_one_half = {0x1p-1, 0x1p-1};
+
+/* Range reduction PI-based constants if FMA available:
+   PI high part (FMA available)
+ */
+const vector double d_pi1_fma = {0x1.921fb54442d18p+1, 0x1.921fb54442d18p+1};
+
+/* PI mid part  (FMA available) */
+const vector double d_pi2_fma = {0x1.1a62633145c06p-53, 0x1.1a62633145c06p-53};
+
+/* PI low part  (FMA available) */
+const vector double d_pi3_fma
+= {0x1.c1cd129024e09p-106,0x1.c1cd129024e09p-106};
+
+/* Polynomial coefficients (relative error 2^(-52.115)): */
+const vector double d_coeff7 = {-0x1.9f0d60811aac8p-41,-0x1.9f0d60811aac8p-41};
+const vector double d_coeff6 = {0x1.60e6857a2f22p-33,0x1.60e6857a2f22p-33};
+const vector double d_coeff5 = {-0x1.ae63546002231p-26,-0x1.ae63546002231p-26};
+const vector double d_coeff4 = {0x1.71de38030feap-19,0x1.71de38030feap-19};
+const vector double d_coeff3 = {-0x1.a01a019a5b86dp-13,-0x1.a01a019a5b86dp-13};
+const vector double d_coeff2 = {0x1.111111110a4a8p-7,0x1.111111110a4a8p-7};
+const vector double d_coeff1 = {-0x1.55555555554a7p-3,-0x1.55555555554a7p-3};
+
+#endif // D_TRIG_DATA_H
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
new file mode 100644
index 0000000000..656ce0541f
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
@@ -0,0 +1 @@
+GLIBC_2.30 _ZGVbN2v_cos F
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-18 23:48 GT
@ 2019-02-19  1:42 ` Joseph Myers
  2019-02-19 17:43   ` GT
  2019-02-24  3:33   ` GT
  0 siblings, 2 replies; 24+ messages in thread
From: Joseph Myers @ 2019-02-19  1:42 UTC (permalink / raw
  To: GT; +Cc: libc-alpha@sourceware.org

Please include the patch description in the body of your message, not just 
in an attachment - "Empty Message" isn't helpful.  When sending multiple 
patch versions, it's very helpful to describe what has changed since the 
previous version (after a "---" line or some other such indication, 
accepted by "git am", that the following text is not intended as part of 
the commit message).

I think you need to have explicit buy-in from powerpc toolchain 
maintainers that the ABI you have chosen is the desired one for this 
purpose, agreed for use by any toolchain that wishes to be able to use 
this functionality.  Please also confirm the GCC version that implements 
the ABI in question (when given the pragmas / attributes in the header you 
add).  To be clear, this information should be part of the commit message 
(in every version of the patch submission), not just in a one-off reply to 
this message.

Please also reference somewhere other than Google Groups for the ABI (both 
because Google Groups requires non-free JavaScript, contrary to GNU 
principles, and because Google Groups has a history of breaking URLs that 
used to work for linking to messages, so somewhere with a more reliably 
stable URL is important).  The glibc wiki has a copy of the x86_64 ABI at 
<https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt>.

Please describe in patch submissions how the patch was tested.  In this 
case, running at least the libm tests for both powerpc64 big-endian and 
powerpc64 little-endian, and verifying there are no failures, would seem 
appropriate.  But more tests are needed that the installed header really 
does work as expected (see below).

You seem to build both VSX and AltiVec versions of the functions - is that 
correct?  But I don't see any Makefile code that actually causes the 
versions intended to be VSX versions to be built with -mvsx.

What do you intend to happen with the tests (test the VSX version, test 
the AltiVec version, return without running tests) in each of the 
following cases: running on VSX hardware, running on non-VSX hardware with 
AltiVec, running on hardware without either?  How do you ensure that?  
(x86_64 has math-tests-arch.h to define CHECK_ARCH_EXT to avoid running 
tests on unsupported hardware.)
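
A run-time check of the kind such a guard would need might look like the
sketch below. It is only an assumption about one possible approach, not
what x86_64's math-tests-arch.h does: cpu_has_isa_2_07 is a hypothetical
helper, it tests the same PPC_FEATURE2_ARCH_2_07 bit that the ifunc
selector in the patch uses, and it reads the bit through getauxval on
Linux. On any other host it simply reports that the feature is absent.

```c
#if defined __linux__ && defined __powerpc__
# include <sys/auxv.h>
#endif

/* Value used by the Linux powerpc hwcap definitions; repeated here so
   that the sketch also builds on hosts that do not define it.  */
#ifndef PPC_FEATURE2_ARCH_2_07
# define PPC_FEATURE2_ARCH_2_07 0x80000000
#endif

/* Hypothetical helper: return 1 if the running CPU reports ISA 2.07
   (POWER8) support, 0 otherwise.  */
static int
cpu_has_isa_2_07 (void)
{
#if defined __linux__ && defined __powerpc__
  unsigned long int hwcap2 = getauxval (AT_HWCAP2);
  return (hwcap2 & PPC_FEATURE2_ARCH_2_07) != 0;
#else
  return 0;
#endif
}
```

A test wrapper could call such a helper first and return without running
the vector tests when it yields 0.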

What happens if you build for hardware without AltiVec?  It's of course 
fine for these functions to have an ABI that depends on AltiVec (so they 
never get called on such hardware) - do they need an explicit -maltivec 
option to ensure the vector types / ABI are available, or do they build OK 
for the correct ABI even without those options?

You're now testing a macro __POWERPC64__ in the header, but GCC doesn't 
predefine that macro, only __powerpc64__.  You need to test properly that 
the header you have really does work to cause appropriately built code, 
using the installed headers, to call these functions.  (glibc's own 
testsuite doesn't verify that, only that the internal functions work as 
expected by their mangled names.  So you need appropriate manual tests of 
using installed glibc with a suitable compiler to make sure calls get 
vectorized as expected and the vectorized calls work - and you should give 
details of that testing in the proposed commit message, to demonstrate 
that the patch has been sufficiently tested.)

A new feature like this needs a NEWS entry.

There are still coding style issues in the patch.  Comments inside a 
function are expected to be appropriately indented to match the code in 
the function, not have the "/*" at the left margin.  "if" blocks only 
containing a single statement should generally not use braces around it.  
C++-style "//" comments are not used.  Comments should end with ".  " 
(full stop, two spaces, end of comment), and start with a capital letter.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-19  1:42 ` Joseph Myers
@ 2019-02-19 17:43   ` GT
  2019-02-19 19:32     ` Tulio Magno Quites Machado Filho
  2019-02-24  3:33   ` GT
  1 sibling, 1 reply; 24+ messages in thread
From: GT @ 2019-02-19 17:43 UTC (permalink / raw
  To: libc-alpha@sourceware.org

> Please include the patch description in the body of your message, not just
> in an attachment - "Empty Message" isn't helpful. When sending multiple
> patch versions, it's very helpful to describe what has changed since the
> previous version (after a "---" line or some other such indication,
> accepted by "git am", that the following text is not intended as part of
> the commit message).
>

The next patch sent will have a description of differences from the one we
are discussing.

> I think you need to have explicit buy-in from powerpc toolchain
> maintainers that the ABI you have chosen is the desired one for this
> purpose, agreed for use by any toolchain that wishes to be able to use
> this functionality.

Tulio and/or Bill:
Can you confirm that the POWER8 Vector Function ABI will match the
SSE ISA sections of x86_64 Vector Function ABI?

> Please also confirm the GCC version that implements
> the ABI in question (when given the pragmas / attributes in the header you
> add). To be clear, this information should be part of the commit message
> (in every version of the patch submission), not just in a one-off reply to
> this message.
>

Patches going forward will include information on the minimum GCC version
required for POWER8 libmvec.

However:

Tulio and/or Bill:
Please affirm my understanding that at present GCC does NOT
yet implement the Vector Function ABI for POWER8.


> Please also reference somewhere other than Google Groups for the ABI (both
> because Google Groups requires non-free JavaScript, contrary to GNU
> principles, and because Google Groups has a history of breaking URLs that
> used to work for linking to messages, so somewhere with a more reliably
> stable URL is important). The glibc wiki has a copy of the x86_64 ABI at
> https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt.
>

The next patch will replace the link to Google Groups with the glibc URL
you've suggested.

> Please describe in patch submissions how the patch was tested. In this
> case, running at least the libm tests for both powerpc64 big-endian and
> powerpc64 little-endian, and verifying there are no failures, would seem
> appropriate.

Testing will be described in all patches.

=========
Remaining issues you raised will be answered shortly.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-19 17:43   ` GT
@ 2019-02-19 19:32     ` Tulio Magno Quites Machado Filho
  0 siblings, 0 replies; 24+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2019-02-19 19:32 UTC (permalink / raw
  To: GT, libc-alpha@sourceware.org; +Cc: William J. Schmidt

GT <tnggil@protonmail.com> writes:

>> I think you need to have explicit buy-in from powerpc toolchain
>> maintainers that the ABI you have chosen is the desired one for this
>> purpose, agreed for use by any toolchain that wishes to be able to use
>> this functionality.
>
> Tulio and/or Bill:
> Can you confirm that the POWER8 Vector Function ABI will match the
> SSE ISA sections of x86_64 Vector Function ABI?

Not yet.
We need to clarify the concerns that have been raised in your previous thread:
https://www.sourceware.org/ml/libc-alpha/2019-02/msg00455.html

We can only confirm the ABI after having answers to all those questions.

>> Please also confirm the GCC version that implements
>> the ABI in question (when given the pragmas / attributes in the header you
>> add). To be clear, this information should be part of the commit message
>> (in every version of the patch submission), not just in a one-off reply to
>> this message.
>>
>
> Patches going forward will include information on the minimum GCC version
> required for POWER8 libmvec.
>
> However:
>
> Tulio and/or Bill:
> Please affirm my understanding that at present GCC does NOT
> yet implement the Vector Function ABI for POWER8.

That's correct.

-- 
Tulio Magno

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8  vector math.
  2019-02-18 19:13     ` GT
@ 2019-02-19 19:38       ` Tulio Magno Quites Machado Filho
  0 siblings, 0 replies; 24+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2019-02-19 19:38 UTC (permalink / raw
  To: GT, libc-alpha@sourceware.org

GT <tnggil@protonmail.com> writes:

>> > I am also not clear on how we decide on the mangling of the vector
>> > function names, particularly the 'b'. I know that x86 uses 'b', 'c',
>> > 'd', or 'e' depending on the FP vectors available. Aarch64 is using
>> > 'n' in its name mangling (and maybe something else for SVE). Is it OK
>> > for Power8 to use the same letter as x86? I don't know if this is
>> > covered in some standard or how the letters were chosen.
>>
>> It was proposed to the X86-64 System V Application Binary Interface, but
>> it was refused. [1]
>
> Meaning, Power8 cannot use the 'b'?

That's still unanswered.
My comment was actually related to:

>>> I don't know if this is covered in some standard or how the letters
>>> were chosen.

-- 
Tulio Magno

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
@ 2019-02-19 22:36 GT
  2019-02-19 22:54 ` Joseph Myers
  0 siblings, 1 reply; 24+ messages in thread
From: GT @ 2019-02-19 22:36 UTC (permalink / raw
  To: libc-alpha@sourceware.org

> Please describe in patch submissions how the patch was tested. In this
> case, running at least the libm tests for both powerpc64 big-endian and
> powerpc64 little-endian, and verifying there are no failures, would seem
> appropriate.
>

This is the testing being done. I will add the description.

>
> You seem to build both VSX and AltiVec versions of the functions - is that
> correct? But I don't see any Makefile code that actually causes the
> versions intended to be VSX versions to be built with -mvsx.
>

There do need to be separate VSX and Altivec versions. Power8 hardware
automatically enables VSX functionality. But I will add the -mvsx flag to
the Makefile for VSX version builds occurring on non-Power8 systems.
And also add -maltivec for the Altivec version.

> What do you intend to happen with the tests (test the VSX version, test
> the AltiVec version, return without running tests) in each of the
> following cases: running on VSX hardware, running on non-VSX hardware with
> AltiVec, running on hardware without either? How do you ensure that?
> (x86_64 has math-tests-arch.h to define CHECK_ARCH_EXT to avoid running
> tests on unsupported hardware.)
>

There is a runtime test in vec_d_cos2_core.c which currently selects
between the VSX and Altivec versions. It reads hwcap and makes
the choice depending on the result returned for supported ISA. On
systems with no Altivec and no VSX, my intention is to fall back to
calling the scalar cosine twice and run the test anyway.
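
For illustration, the hwcap-based choice described above could look
roughly like the following. This is a minimal sketch, not the code in
vec_d_cos2_core.c: cpu_has_vsx and cos2_variant are hypothetical names,
and it assumes a glibc/Linux system where <sys/auxv.h> provides
getauxval and, on powerpc only, the PPC_FEATURE_HAS_VSX bit.

```c
#include <stdbool.h>
#include <sys/auxv.h>	/* getauxval, AT_HWCAP (glibc on Linux).  */

/* Hypothetical helper: report whether the running CPU advertises VSX.
   PPC_FEATURE_HAS_VSX is only defined on powerpc, so on any other
   architecture this sketch simply reports "no VSX".  */
static bool
cpu_has_vsx (void)
{
#ifdef PPC_FEATURE_HAS_VSX
  return (getauxval (AT_HWCAP) & PPC_FEATURE_HAS_VSX) != 0;
#else
  return false;
#endif
}

/* Hypothetical helper: name the implementation a dispatcher would
   pick.  Real code would return a function pointer instead.  */
static const char *
cos2_variant (void)
{
  return cpu_has_vsx () ? "vsx" : "scalar-fallback";
}
```

A dispatcher built this way would call the VSX routine when
cos2_variant () reports "vsx" and otherwise call the scalar cosine once
per vector element.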

> What happens if you build for hardware without AltiVec? It's of course
> fine for these functions to have an ABI that depends on AltiVec (so they
> never get called on such hardware) - do they need an explicit -maltivec
> option to ensure the vector types / ABI are available, or do they build OK
> for the correct ABI even without those options?
>

GCC fails if an attempt is made to compile code that uses Altivec
functionality without passing the -maltivec flag. Additionally it also
fails on attempts to build for pre-POWER7 hardware even when
-maltivec flag is given to the compiler. Is there need to add a
check in configure for attempts to build libmvec on non-Altivec, non-VSX
systems; or let the failure happen when make calls GCC?

> You're now testing a macro __POWERPC64__ in the header, but GCC doesn't
> predefine that macro, only __powerpc64__.

You are right. I will make the change.

> You need to test properly that
> the header you have really does work to cause appropriately built code,
> using the installed headers, to call these functions. (glibc's own
> testsuite doesn't verify that, only that the internal functions work as
> expected by their mangled names. So you need appropriate manual tests of
> using installed glibc with a suitable compiler to make sure calls get
> vectorized as expected and the vectorized calls work - and you should give
> details of that testing in the proposed commit message, to demonstrate
> that the patch has been sufficiently tested.)
>

As I understand the issue here, the only vector function that can be called
after glibc is built with this patch is the double-precision vector cosine.
Automatic vectorization of functions is a GCC feature that is as yet
unimplemented.

I will follow the instructions on how to test a new installed glibc given here:
https://sourceware.org/glibc/wiki/Testing/Builds

Even then only the vector cosine can be tested. Unless I've missed a
significant point.

> A new feature like this needs a NEWS entry.
>

Will add an entry similar to the one added when the x86_64 implementation
commenced.

> There are still coding style issues in the patch. Comments inside a
> function are expected to be appropriately indented to match the code in
> the function, not have the "/*" at the left margin. "if" blocks only
> containing a single statement should generally not use braces around it.
> C++-style "//" comments are not used. Comments should end with ". "
> (full stop, two spaces, end of comment), and start with a capital letter.
>

Will be fixed.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-19 22:36 [PATCH] PPC64: First in the series of patches implementing POWER8 vector math GT
@ 2019-02-19 22:54 ` Joseph Myers
  2019-02-22  0:21   ` GT
  0 siblings, 1 reply; 24+ messages in thread
From: Joseph Myers @ 2019-02-19 22:54 UTC (permalink / raw
  To: GT; +Cc: libc-alpha@sourceware.org

On Tue, 19 Feb 2019, GT wrote:

> GCC fails if an attempt is made to compile code that uses Altivec
> functionality without passing the -maltivec flag. Additionally it also
> fails on attempts to build for pre-POWER7 hardware even when
> -maltivec flag is given to the compiler. Is there need to add a
> check in configure for attempts to build libmvec on non-Altivec, non-VSX
> systems; or let the failure happen when make calls GCC?

It's best to detect the issue at configure time.  That's definitely needed 
if you want to enable libmvec by default as on x86_64.

I'm not sure why it would fail for pre-POWER7, if you pass -maltivec for 
the AltiVec versions and -mvsx for the versions requiring VSX.

> As I understand the issue here, the only vector function that can be called
> after glibc is built with this patch is the double-precision vector cosine.
> Automatic vectorization of functions is a GCC feature that is as yet
> unimplemented.

What are the effects of the pragma / attributes when using an existing GCC 
version?  Do they do nothing (code always gets built for scalar function 
calls), or do they cause compile errors or cause code to be built for an 
unsupported ABI?  That determines whether you might e.g. need 
__GNUC_PREREQ (10, 0) conditionals to avoid problems when this header gets 
used with an older GCC version not supporting the required ABI.
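
As a rough illustration of such a conditional (a sketch only: the
DEMO_* names and demo_cos are hypothetical, and the 10.0 threshold is
just the example version mentioned above, not a confirmed GCC release
with the required ABI support), the header could leave the declaration
undecorated unless the compiler is new enough:

```c
#include <features.h>	/* glibc: defines __GNUC_PREREQ.  */

#ifndef __GNUC_PREREQ
/* Fallback for C libraries without the macro, for this sketch only.  */
# define __GNUC_PREREQ(maj, min) 0
#endif

/* Only request SIMD variants when targeting powerpc64 with fast-math
   and a compiler at or above the assumed minimum version.  */
#if defined __powerpc64__ && defined __FAST_MATH__ && __GNUC_PREREQ (10, 0)
# define DEMO_DECL_SIMD_PPC64 __attribute__ ((__simd__ ("notinbranch")))
# define DEMO_SIMD_ENABLED 1
#else
# define DEMO_DECL_SIMD_PPC64
# define DEMO_SIMD_ENABLED 0
#endif

/* Hypothetical scalar prototype carrying the (possibly empty)
   attribute; older compilers see a plain declaration.  */
DEMO_DECL_SIMD_PPC64 extern double demo_cos (double x);
```

With an older compiler the macro expands to nothing, so code including
the header is always built with ordinary scalar calls.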

> Even then only the vector cosine can be tested. Unless I've missed a
> significant point.

Yes, the expectation would be that you'd verify that, with appropriate 
options, some code using cosine and including <math.h> gets compiled to 
use the new vector cosine function, and that the calls do work as expected 
(so the compiler's ABI matches the one used by these functions).  Later 
patches in the series, adding other functions, would have such tests run 
that they properly generate calls to those functions.

We had issues for x86_64 where it turned out the vector sincos functions 
were expecting an ABI different from that (arguments being vectors of 
pointers) generated by the compiler for the pragma in the header (see 
commit ee2196bb6766ca7e63a1ba22ebb7619a3266776a that fixed this 
inconsistency).  It's to avoid such issues that it's important to verify 
things end-to-end - that sources calling each function get compiled to 
code calling the corresponding vector function and execution of that code 
behaves properly.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-19 22:54 ` Joseph Myers
@ 2019-02-22  0:21   ` GT
  2019-02-22  1:42     ` Joseph Myers
  0 siblings, 1 reply; 24+ messages in thread
From: GT @ 2019-02-22  0:21 UTC (permalink / raw
  To: libc-alpha@sourceware.org

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, February 19, 2019 10:54 PM, Joseph Myers <joseph@codesourcery.com> wrote:

> On Tue, 19 Feb 2019, GT wrote:
>
> > GCC fails if an attempt is made to compile code that uses Altivec
> > functionality without passing the -maltivec flag. Additionally it also
> > fails on attempts to build for pre-POWER7 hardware even when
> > -maltivec flag is given to the compiler. Is there need to add a
> > check in configure for attempts to build libmvec on non-Altivec, non-VSX
> > systems; or let the failure happen when make calls GCC?
>
> It's best to detect the issue at configure time. That's definitely needed
> if you want to enable libmvec by default as on x86_64.
>

I've had trouble modifying configure.ac to detect attempts at building
libmvec on machines without Altivec/VSX. But perhaps the problem isn't
with changes I've made:

After making changes to sysdeps/powerpc/powerpc64/configure.ac I changed
directory to the root of the glibc source. Then I ran autoconf with no
arguments, as instructed in the manual. Checking the timestamps,
configure was not regenerated. I forced regeneration by running
'autoconf -f' and this time configure was regenerated at the root of
glibc source. But sysdeps/powerpc/powerpc64/configure was not
re-created. Running autoconf -v showed lines that begin with
'autom4te: forbidden tokens: .....'.

Just to be sure it isn't something I introduced, I cloned a new glibc
source tree and the only command run there thus far was autoconf -f -v.
The same lines appear noting forbidden tokens.

Is there an issue generating the configure script or are those messages
about forbidden tokens harmless?

And running autoconf at the glibc source-tree root directory should
regenerate configure scripts in all subdirectories which have
configure.ac in them, right?


> I'm not sure why it would fail for pre-POWER7, if you pass -maltivec for
> the AltiVec versions and -mvsx for the versions requiring VSX.
>

Turns out POWER7 is the earliest POWERn that implements v2.06 of the ISA.
v2.06 is when type vector double was introduced. So the behavior is
as expected.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-22  0:21   ` GT
@ 2019-02-22  1:42     ` Joseph Myers
  0 siblings, 0 replies; 24+ messages in thread
From: Joseph Myers @ 2019-02-22  1:42 UTC (permalink / raw
  To: GT; +Cc: libc-alpha@sourceware.org

On Fri, 22 Feb 2019, GT wrote:

> And running autoconf at the glibc source-tree root directory should
> regenerate configure scripts in all subdirectories which have
> configure.ac in them, right?

No.  It's "autoconf /path/to/configure.ac > /path/to/configure" to 
regenerate a subdirectory configure fragment.  See the Makefile rules.

> Turns out POWER7 is the earliest POWERn that implements v2.06 of the ISA.
> v2.06 is when type vector double was introduced. So the behavior is
> as expected.

I.e., any processor supporting this functionality has VSX.  So do you need 
the plain AltiVec versions at all, or just the VSX ones?  (At least for 
double; maybe if you do float functions later, AltiVec versions will be 
relevant there?)

(I'd suggest that, instead of configure tests, it would be best just to 
build the functions in question with appropriate -m options, and arrange 
for the tests not to run on processors not supporting the required 
functionality.  If that won't work for some reason, presumably you can 
give a clear explanation of that reason for the commit message.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-19  1:42 ` Joseph Myers
  2019-02-19 17:43   ` GT
@ 2019-02-24  3:33   ` GT
  2019-02-25 17:21     ` Joseph Myers
  1 sibling, 1 reply; 24+ messages in thread
From: GT @ 2019-02-24  3:33 UTC (permalink / raw
  To: libc-alpha@sourceware.org

> You're now testing a macro __POWERPC64__ in the header, but GCC doesn't
> predefine that macro, only __powerpc64__. You need to test properly that
> the header you have really does work to cause appropriately built code,
> using the installed headers, to call these functions. (glibc's own
> testsuite doesn't verify that, only that the internal functions work as
> expected by their mangled names. So you need appropriate manual tests of
> using installed glibc with a suitable compiler to make sure calls get
> vectorized as expected and the vectorized calls work - and you should give
> details of that testing in the proposed commit message, to demonstrate
> that the patch has been sufficiently tested.)
>

So, I'm trying to test the newly-built glibc to verify the installed
headers as asked for above. I am unable to compile against the new glibc.

The steps I followed are from the section "Compile against glibc in an
installed location" at this sourceware URL:
https://sourceware.org/glibc/wiki/Testing/Builds

The installation steps came from a section at the same URL with
title: "Building glibc with intent to install". Here I did everything
including the final step of obtaining gcc's helper library for
cancellation.

My intention is to compare outputs of the simple test programs given
here: https://sourceware.org/glibc/wiki/libmvec as Examples 1 and 2.
I first run each of them as given, using the normal non-vector
cosine. Then I replace the calls to cosine with manual, direct calls
to the new vector cosine, since GCC does not yet perform
vectorization for PPC64.

The error I get when I try to compile against the new glibc:

/usr/bin/ld:
cannot find /usr/lib64/libmvec_nonshared.a inside /home/fedora/my-install-glibc

collect2: error: ld returned 1 exit status

The compile command used:

gcc -L${SYSROOT}/usr/lib64/ -I${SYSROOT}/include/ --sysroot=${SYSROOT} \
-Wl,-rpath=${SYSROOT}/lib64 -Wl,--dynamic-linker=${SYSROOT}/lib64/ld-2.29.9000.so \
-O1 -fopenmp -ffast-math -lm -mvsx -o testcos testcos.c

SYSROOT is set to the same directory I used for DESTDIR in the make install step.

I'm uncertain on how to proceed. I appreciate any and all assistance in solving this.

Thanks.
Bert.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-24  3:33   ` GT
@ 2019-02-25 17:21     ` Joseph Myers
  2019-02-25 22:03       ` GT
  0 siblings, 1 reply; 24+ messages in thread
From: Joseph Myers @ 2019-02-25 17:21 UTC (permalink / raw
  To: GT; +Cc: libc-alpha@sourceware.org

On Sun, 24 Feb 2019, GT wrote:

> gcc -L${SYSROOT}/usr/lib64/ -I${SYSROOT}/include/ --sysroot=${SYSROOT} \
> -Wl,-rpath=${SYSROOT}/lib64 -Wl,--dynamic-linker=${SYSROOT}/lib64/ld-2.29.9000.so \
> -O1 -fopenmp -ffast-math -lm -mvsx -o testcos testcos.c

Don't use -L (or -I) options pointing into a sysroot; ld cares about 
whether it found a given library via a sysrooted or nonsysrooted path, so 
using such a -L option can result in a linker script in the sysroot being 
misinterpreted (absolute paths therein not being interpreted as relative 
to the sysroot).

However, I don't think that's your issue.  You should have 
libmvec_nonshared.a; it's needed to deal with cases when the compiler 
generates calls to _Z*___*_finite because of bits/math-finite.h being in 
effect together with bits/math-vector.h (see 
sysdeps/x86_64/fpu/svml_finite_alias.S and 
<https://gcc.gnu.org/ml/gcc/2015-06/msg00173.html>).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-25 17:21     ` Joseph Myers
@ 2019-02-25 22:03       ` GT
  2019-02-25 23:46         ` Joseph Myers
  0 siblings, 1 reply; 24+ messages in thread
From: GT @ 2019-02-25 22:03 UTC (permalink / raw
  To: libc-alpha@sourceware.org

> However, I don't think that's your issue. You should have
> libmvec_nonshared.a; it's needed to deal with cases when the compiler
> generates calls to _Z*___*_finite because of bits/math-finite.h being in
> effect together with bits/math-vector.h (see
> sysdeps/x86_64/fpu/svml_finite_alias.S and
> https://gcc.gnu.org/ml/gcc/2015-06/msg00173.html).
>

1. svml_finite_alias.S deals only with log/logf, exp/expf and pow/powf.
So does the discussion linked at the given url. At this time only the
cos function has been implemented. Do we then still require
libmvec_nonshared.a to be created?

2. I find a single reference in the entire source tree where
libmvec_nonshared.a doesn't appear in the dependency section of
a rule. That's in math/Makefile. Does this mean libmvec_nonshared.a
is created with 'make install'? I rather expected it would be built
by the initial 'make' which generates most other objects.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.
  2019-02-25 22:03       ` GT
@ 2019-02-25 23:46         ` Joseph Myers
  0 siblings, 0 replies; 24+ messages in thread
From: Joseph Myers @ 2019-02-25 23:46 UTC (permalink / raw
  To: GT; +Cc: libc-alpha@sourceware.org

On Mon, 25 Feb 2019, GT wrote:

> 1. svml_finite_alias.S deals only with log/logf, exp/expf and pow/powf.
> So does the discussion linked at the given url. At this time only the
> cos function has been implemented. Do we then still require
> libmvec_nonshared.a to be created?

The alternative would be to have some kind of conditional in 
math/Makefile, where it creates the linker script for libm.so, based on 
whatever logic determines whether there is libmvec_nonshared.a (i.e. 
whether there are any objects that go in it).  If you're adding new 
architecture support for libmvec that doesn't start off with any functions 
that bits/math-finite.h does anything with, you get to deal with adapting 
the generic code to handle that previously unsupported case.

> 2. I find a single reference in the entire source tree where
> libmvec_nonshared.a doesn't appear in the dependency section of
> a rule. That's in math/Makefile. Does this mean libmvec_nonshared.a
> is created with 'make install'? I rather expected it would be built

The linker scripts that are only usable in a glibc installation, not in a 
build context, are created at install time, yes.  (So is the manual, hence 
the build of the manual breaking from time to time when someone had a 
broken change to the manual and didn't run "make install" in testing.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH] PPC64: First in the series of patches implementing POWER8  vector math.
@ 2019-02-27  3:24 GT
  0 siblings, 0 replies; 24+ messages in thread
From: GT @ 2019-02-27  3:24 UTC (permalink / raw
  To: libc-alpha@sourceware.org

[-- Attachment #1: Type: text/plain, Size: 15 bytes --]

Empty Message

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-PPC64-First-in-the-series-of-patches-implementing-PO.patch --]
[-- Type: text/x-patch; name="0001-PPC64-First-in-the-series-of-patches-implementing-PO.patch", Size: 19082 bytes --]

From e7d282c21dd987a30d5b6eb674f32a501594273f Mon Sep 17 00:00:00 2001
From: Bert Tenjy <bert.tenjy@gmail.com>
Date: Wed, 27 Feb 2019 02:00:29 +0000
Subject: [PATCH] PPC64: First in the series of patches implementing POWER8
 vector math.

[BZ #24205]

Implements double-precision cosine using VSX vector capability. Algorithm for
cosine is from x86_64 [commit #2193311288] adapted to PPC64.

Name-mangling exactly duplicates SSE ISA of the x86_64 ABI. The details are at
<https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt>.

The patch has been tested on PPC64/POWER8 Little Endian and Big Endian. It is
tested using the framework created for libmvec on x86_64, which runs the tests
when 'make check' is issued. Tests of the new vector cosine function all pass.

Glibc built with this patch was installed using the procedure outlined at
<https://sourceware.org/glibc/wiki/Testing/Builds>. Compiling against the new
library created a test executable which computes cosines using the vector
version of the function. The results are at most 2 ulps away from the scalar
cosine. That is expected and is indicated in the comments describing the
algorithm, as obtained from x86_64 commit #2193311288.
---
 ChangeLog                                     | 17 ++++
 NEWS                                          | 13 +++
 sysdeps/powerpc/bits/math-vector.h            | 41 +++++++++
 sysdeps/powerpc/fpu/libm-test-ulps            |  3 +
 sysdeps/powerpc/powerpc64/fpu/Makefile        |  7 ++
 sysdeps/powerpc/powerpc64/fpu/Versions        |  5 ++
 .../powerpc/powerpc64/fpu/multiarch/Makefile  | 17 ++++
 .../multiarch/test-double-vlen2-wrappers.c    | 24 +++++
 .../powerpc64/fpu/multiarch/vec_d_cos2_vsx.c  | 88 +++++++++++++++++++
 .../powerpc64/fpu/multiarch/vec_d_trig_data.h | 60 +++++++++++++
 .../powerpc/powerpc64/fpu/vec_finite_alias.c  | 41 +++++++++
 .../linux/powerpc/powerpc64/libmvec.abilist   |  1 +
 12 files changed, 317 insertions(+)
 create mode 100644 sysdeps/powerpc/bits/math-vector.h
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/Makefile
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/Versions
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c
 create mode 100644 sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist

Notable differences from the previous patch and further commentary:

1. Renamed the main C source file from vec_d_cos2_power8.c to
vec_d_cos2_vsx.c. VSX functionality is also available on POWER7 and
POWER9, hence the change.

2. Removed vec_d_cos2_core.c and vec_d_cos2_vmx.c. The former did
ifunc selection between the latter and the main C implementation.
File vec_d_cos2_vmx.c was not a true Altivec implementation. It was
only a wrapper to the scalar cosine function.

3. A new file, vec_finite_alias.c, is a workaround until the vector
log function is implemented. It is needed so that libmvec_nonshared.a
is built. Without it, compiling against the newly-built glibc will
fail because that archive is missing.

4. __PPC64__ is the macro tested in math-vector.h. Table 5.1 of
the POWER ELFv2 ABI defines it and __powerpc64__ as synonyms.
The other macros in that file are all-uppercase and the choice
made preserves consistency.

5. GCC has no vectorizing support for PPC64. The OpenMP pragmas
are ignored and only scalar cosine calls are generated, exactly as when
libmvec doesn't exist.

6. The executables created to test against the new glibc installation
required a workaround, as did x86_64 when I tried to compile the
same test. The test is a modification of Example #1 at
<https://sourceware.org/glibc/wiki/libmvec>. The only change initially
is replacing the call to cos () with one to the vector version
_ZGVbN2v_cos (). Compilation then fails because the function has no
prototype. The solution for both PPC64 and x86_64 was to supply an
'extern <return type> _ZGVbN2v_cos (<in arg. type>)' forward
declaration, after which compilation produced an executable that used
the new vector cosine.

7. This patch is half of the requirement for BZ #24205. The other half is
implementing vector single-precision cosine. There are two outstanding
issues which I ask to be deferred to the patch for cosf: gracefully
terminating configure if the GCC used does not provide the VSX builtins
required to build libmvec, and runtime avoidance of tests of the vector
functions on machines without VSX hardware.
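
As an illustration of the manual check described in the commit message
and in item 6, the comparison of vector and scalar results can be
expressed as a distance in ulps. The following is a minimal sketch, not
the test program actually used: ulp_distance and within_ulps are
hypothetical helpers, they assume IEEE 754 binary64 doubles and reject
NaN, and the guarded declaration assumes the 'vector double' signature
used by vec_d_cos2_vsx.c in this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined __powerpc64__ && defined __VSX__
/* Forward declaration of the kind item 6 describes; only visible when
   building for powerpc64 with VSX enabled.  */
extern __vector double _ZGVbN2v_cos (__vector double);
#endif

/* Map a double onto an integer that increases monotonically with the
   value, so that adjacent doubles map to adjacent integers.  */
static int64_t
ordered_bits (double d)
{
  int64_t i;
  memcpy (&i, &d, sizeof i);
  /* Negative doubles are stored sign-magnitude; mirror them so that
     -0.0 maps to 0 and larger magnitudes map to smaller integers.  */
  return i < 0 ? INT64_MIN - i : i;
}

/* Number of representable doubles between A and B (0 if equal).  */
static uint64_t
ulp_distance (double a, double b)
{
  assert (a == a && b == b);	/* NaN is not supported.  */
  int64_t oa = ordered_bits (a);
  int64_t ob = ordered_bits (b);
  /* Subtract as unsigned to avoid signed overflow for distant values.  */
  return oa >= ob ? (uint64_t) oa - (uint64_t) ob
		  : (uint64_t) ob - (uint64_t) oa;
}

/* Nonzero if A and B differ by at most MAX_ULPS ulps.  */
static int
within_ulps (double a, double b, uint64_t max_ulps)
{
  return ulp_distance (a, b) <= max_ulps;
}
```

A test program would then call within_ulps (vector_result[i],
scalar_result, 2) for each element, matching the 2-ulp bound recorded
for cos_vlen2 in libm-test-ulps.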

diff --git a/ChangeLog b/ChangeLog
index 8096175cc9..654774d690 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,20 @@
+2019-02-27    <bert.tenjy@gmail.com>
+
+	[BZ #24205]
+	* sysdeps/powerpc/bits/math-vector.h: New file.
+	* sysdeps/powerpc/fpu/libm-test-ulps (cos_vlen2): Regenerated.
+	* sysdeps/powerpc/powerpc64/fpu/Makefile: New file.
+	* sysdeps/powerpc/powerpc64/fpu/Versions: Likewise.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libmvec-sysdep_routines)
+	(CFLAGS-vec_d_cos2_vsx.c, libmvec-tests, double-vlen2-funcs)
+	(double-vlen2-arch-ext-cflags): Added build of VSX vector cos function
+	and its tests.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c: New file.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c: Likewise.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h: Likewise.
+	* sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c: Likewise.
+	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist: Likewise.
+
 2019-02-26  Joseph Myers  <joseph@codesourcery.com>
 
 	* sysdeps/arm/sysdep.h (#if condition): Break lines before rather
diff --git a/NEWS b/NEWS
index 0a3b6c7a5a..fc08f11c51 100644
--- a/NEWS
+++ b/NEWS
@@ -5,6 +5,19 @@ See the end for copying conditions.
 Please send GNU C library bug reports via <https://sourceware.org/bugzilla/>
 using `glibc' in the "product" field.
 \f
+
+* Start of implementing vector math library libmvec on PPC64/POWER8.
+  The double-precision cosine now has a vector version.
+  GCC support for auto-vectorization of functions on PPC64 is not yet
+  available. Until that is done, the new vector math functions will be
+  inaccessible to applications.
+  Building libmvec for PPC64 VSX hardware is done at configuration with
+  --enable-mathvec. The default is to not build.
+  The library ABI specification is x86_64 Vector Function ABI.
+  More information on libmvec including a link to the ABI document is at:
+  <https://sourceware.org/glibc/wiki/libmvec>
+
+\f
 Version 2.30
 
 Major new features:
diff --git a/sysdeps/powerpc/bits/math-vector.h b/sysdeps/powerpc/bits/math-vector.h
new file mode 100644
index 0000000000..78d9db64bf
--- /dev/null
+++ b/sysdeps/powerpc/bits/math-vector.h
@@ -0,0 +1,41 @@
+/* Platform-specific SIMD declarations of math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _MATH_H
+# error "Never include <bits/math-vector.h> directly;\
+ include <math.h> instead."
+#endif
+
+/* Get default empty definitions for simd declarations.  */
+#include <bits/libm-simd-decl-stubs.h>
+
+#if defined __PPC64__ && defined __FAST_MATH__
+# if defined _OPENMP && _OPENMP >= 201307
+/* OpenMP case.  */
+#  define __DECL_SIMD_PPC64 _Pragma ("omp declare simd notinbranch")
+# elif __GNUC_PREREQ (6,0)
+/* W/o OpenMP use GCC 6.* __attribute__ ((__simd__)).  */
+#  define __DECL_SIMD_PPC64 __attribute__ ((__simd__ ("notinbranch")))
+# endif
+
+# ifdef __DECL_SIMD_PPC64
+#  undef __DECL_SIMD_cos
+#  define __DECL_SIMD_cos __DECL_SIMD_PPC64
+
+# endif
+#endif
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index 1eec27c1dc..d392b135a7 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -1311,6 +1311,9 @@ ifloat128: 2
 ildouble: 5
 ldouble: 5
 
+Function: "cos_vlen2":
+double: 2
+
 Function: "cosh":
 double: 1
 float: 1
diff --git a/sysdeps/powerpc/powerpc64/fpu/Makefile b/sysdeps/powerpc/powerpc64/fpu/Makefile
new file mode 100644
index 0000000000..21dc67ff73
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/Makefile
@@ -0,0 +1,7 @@
+ifeq ($(subdir),mathvec)
+libmvec-support += vec_finite_alias
+
+CFLAGS-vec_finite_alias.c += -mvsx
+
+libmvec-static-only-routines = vec_finite_alias
+endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/Versions b/sysdeps/powerpc/powerpc64/fpu/Versions
new file mode 100644
index 0000000000..9a3e1211cc
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/Versions
@@ -0,0 +1,5 @@
+libmvec {
+  GLIBC_2.30 {
+    _ZGVbN2v_cos;
+  }
+}
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
index 39b557604c..44c1c04c13 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
@@ -42,3 +42,20 @@ CFLAGS-e_hypotf-power7.c = -mcpu=power7
 CFLAGS-s_modf-ppc64.c += -fsignaling-nans
 CFLAGS-s_modff-ppc64.c += -fsignaling-nans
 endif
+
+ifeq ($(subdir),mathvec)
+libmvec-sysdep_routines += vec_d_cos2_vsx
+CFLAGS-vec_d_cos2_vsx.c += -mvsx
+endif
+
+# Variables for libmvec tests.
+ifeq ($(subdir),math)
+ifeq ($(build-mathvec),yes)
+libmvec-tests += double-vlen2
+
+double-vlen2-funcs = cos
+
+double-vlen2-arch-ext-cflags = -mvsx -DREQUIRE_VSX
+
+endif
+endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
new file mode 100644
index 0000000000..17e2cc0724
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
@@ -0,0 +1,24 @@
+/* Wrapper part of tests for VSX ISA versions of vector math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "test-double-vlen2.h"
+#include <altivec.h>
+
+#define VEC_TYPE vector double
+
+VECTOR_WRAPPER (WRAPPER_NAME (cos), _ZGVbN2v_cos)
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c
new file mode 100644
index 0000000000..ed8fe330c1
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c
@@ -0,0 +1,88 @@
+/* Function cos vectorized with VSX.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <math.h>
+#include "vec_d_trig_data.h"
+
+vector double
+_ZGVbN2v_cos (vector double x)
+{
+
+  /*
+   ARGUMENT RANGE REDUCTION:
+   Add Pi/2 to argument: X' = X+Pi/2.  */
+  vector double x_prime = (vector double) d_half_pi + x;
+
+  /* Get absolute argument value: X' = |X'|.  */
+  vector double abs_x_prime = vec_abs (x_prime);
+
+  /* Y = X'*InvPi + RS : right shifter add.  */
+  vector double y = (x_prime * d_inv_pi) + d_rshifter;
+
+  /* Check for large arguments path.  */
+  vector bool long long large_in = vec_cmpgt (abs_x_prime, d_rangeval);
+
+  /* N = Y - RS : right shifter sub.  */
+  vector double n = y - d_rshifter;
+
+  /* SignRes = Y<<63 : shift LSB to MSB place for result sign.  */
+  vector double sign_res = (vector double) vec_sl ((vector long long) y,
+						   (vector unsigned long long)
+						   vec_splats (63));
+
+  /* N = N - 0.5.  */
+  n = n - d_one_half;
+
+  /* R = X - N*Pi1.  */
+  vector double r = x - (n * d_pi1_fma);
+
+  /* R = R - N*Pi2.  */
+  r = r - (n * d_pi2_fma);
+
+  /* R = R - N*Pi3.  */
+  r = r - (n * d_pi3_fma);
+
+  /* R2 = R*R.  */
+  vector double r2 = r * r;
+
+  /* Poly = C3+R2*(C4+R2*(C5+R2*(C6+R2*C7))).  */
+  vector double poly = r2 * d_coeff7 + d_coeff6;
+  poly = poly * r2 + d_coeff5;
+  poly = poly * r2 + d_coeff4;
+  poly = poly * r2 + d_coeff3;
+
+  /* Poly = R+R*(R2*(C1+R2*(C2+R2*Poly))).  */
+  poly = poly * r2 + d_coeff2;
+  poly = poly * r2 + d_coeff1;
+  poly = poly * r2 * r + r;
+
+  /*
+     RECONSTRUCTION:
+     Final sign setting: Res = Poly^SignRes.  */
+  vector double out
+    = (vector double) ((vector long long) poly ^ (vector long long) sign_res);
+
+  if (large_in[0] != 0)
+    out[0] = cos (x[0]);
+
+  if (large_in[1] != 0)
+    out[1] = cos (x[1]);
+
+  return out;
+
+}
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
new file mode 100644
index 0000000000..4b2678928f
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
@@ -0,0 +1,60 @@
+/* Constants used in polynomial approximations for vectorized sin, cos,
+   and sincos functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef D_TRIG_DATA_H
+#define D_TRIG_DATA_H
+
+#include <altivec.h>
+
+/* PI/2.  */
+const vector double d_half_pi  = {0x1.921fb54442d18p+0, 0x1.921fb54442d18p+0};
+
+/* Inverse PI.  */
+const vector double d_inv_pi   = {0x1.45f306dc9c883p-2, 0x1.45f306dc9c883p-2};
+
+/* Right-shifter constant.  */
+const vector double d_rshifter = {0x1.8p+52, 0x1.8p+52};
+
+/* Working range threshold.  */
+const vector double d_rangeval = {0x1p+23, 0x1p+23};
+
+/* One-half.  */
+const vector double d_one_half = {0x1p-1, 0x1p-1};
+
+/* Range reduction PI-based constants if FMA available:
+   PI high part (FMA available).  */
+const vector double d_pi1_fma = {0x1.921fb54442d18p+1, 0x1.921fb54442d18p+1};
+
+/* PI mid part  (FMA available).  */
+const vector double d_pi2_fma = {0x1.1a62633145c06p-53, 0x1.1a62633145c06p-53};
+
+/* PI low part  (FMA available).  */
+const vector double d_pi3_fma
+= {0x1.c1cd129024e09p-106,0x1.c1cd129024e09p-106};
+
+/* Polynomial coefficients (relative error 2^(-52.115)).  */
+const vector double d_coeff7 = {-0x1.9f0d60811aac8p-41,-0x1.9f0d60811aac8p-41};
+const vector double d_coeff6 = {0x1.60e6857a2f22p-33,0x1.60e6857a2f22p-33};
+const vector double d_coeff5 = {-0x1.ae63546002231p-26,-0x1.ae63546002231p-26};
+const vector double d_coeff4 = {0x1.71de38030feap-19,0x1.71de38030feap-19};
+const vector double d_coeff3 = {-0x1.a01a019a5b86dp-13,-0x1.a01a019a5b86dp-13};
+const vector double d_coeff2 = {0x1.111111110a4a8p-7,0x1.111111110a4a8p-7};
+const vector double d_coeff1 = {-0x1.55555555554a7p-3,-0x1.55555555554a7p-3};
+
+#endif /* D_TRIG_DATA_H.  */
diff --git a/sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c b/sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c
new file mode 100644
index 0000000000..f1a062aadf
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c
@@ -0,0 +1,41 @@
+/* A temporary workaround until vector log is implemented.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <math.h>
+#include <altivec.h>
+
+/* We need this wrapper around the scalar log function so that
+   libmvec_nonshared.a is generated.  Otherwise, compiling against
+   the new glibc during testing fails because libmvec_nonshared.a
+   is missing.  */
+
+vector double
+_ZGVbN2v___log_finite (vector double x)
+{
+
+  /*
+   Calls the scalar log function twice, once for each
+   of the pair of doubles in the input argument.  */
+  vector double out;
+
+  out[0] = log (x[0]);
+  out[1] = log (x[1]);
+
+  return out;
+
+}
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
new file mode 100644
index 0000000000..656ce0541f
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
@@ -0,0 +1 @@
+GLIBC_2.30 _ZGVbN2v_cos F
-- 
2.20.1


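The exported names in this patch (_ZGVbN2v_cos, _ZGVbN2v___log_finite)
follow the x86_64 vector ABI layout _ZGV<isa><mask><vlen><params>_<name>,
where 'b' is the letter that ABI assigns to SSE, 'N' means unmasked, 2 is
the vector length and 'v' marks a vector parameter.  The helper below is a
sketch for taking such a name apart.  The struct and function names are
hypothetical, and it handles only the simple forms seen in this patch, not
every parameter encoding the ABI allows.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for the pieces of a vector-ABI mangled name.  */
struct vec_abi_name
{
  char isa;             /* ISA letter, e.g. 'b'.  */
  int masked;           /* 1 for 'M' (masked), 0 for 'N' (unmasked).  */
  int vlen;             /* Number of lanes.  */
  char params[16];      /* Parameter kinds, e.g. "v".  */
  const char *scalar;   /* Points into the input at the scalar name.  */
};

/* Return 0 on success, -1 if NAME does not have the expected shape.  */
int
parse_vec_abi_name (const char *name, struct vec_abi_name *out)
{
  if (strncmp (name, "_ZGV", 4) != 0)
    return -1;
  const char *p = name + 4;

  if (*p == '\0')
    return -1;
  out->isa = *p++;

  if (*p != 'N' && *p != 'M')
    return -1;
  out->masked = (*p++ == 'M');

  if (!isdigit ((unsigned char) *p))
    return -1;
  char *end;
  out->vlen = (int) strtol (p, &end, 10);
  p = end;

  /* Everything up to the next underscore describes the parameters.  */
  size_t n = 0;
  while (*p != '\0' && *p != '_')
    {
      if (n + 1 >= sizeof out->params)
        return -1;
      out->params[n++] = *p++;
    }
  out->params[n] = '\0';
  if (n == 0 || *p != '_')
    return -1;

  /* The scalar name may itself start with underscores.  */
  out->scalar = p + 1;
  return 0;
}
```

Parsing "_ZGVbN2v___log_finite" with this helper gives the scalar name
"__log_finite", since only the first underscore after the parameter
letters acts as the separator.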

