From: "H.J. Lu via Libc-alpha" <libc-alpha@sourceware.org>
To: IA32 System V Application Binary Interface <ia32-abi@googlegroups.com>
Cc: "Wang, Pengfei" <pengfei.wang@intel.com>,
llvm-dev@lists.llvm.org,
GNU C Library <libc-alpha@sourceware.org>,
GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support
Date: Tue, 13 Jul 2021 09:24:14 -0700
Message-ID: <CAMe9rOo_JwDHMUx1nSBkmjxj9B5BDrPNWK9jz_Y4oOYiDgVu4Q@mail.gmail.com>
In-Reply-To: <alpine.DEB.2.22.394.2107131536460.2044871@digraph.polyomino.org.uk>
[-- Attachment #1: Type: text/plain, Size: 1285 bytes --]
On Tue, Jul 13, 2021 at 8:41 AM Joseph Myers <joseph@codesourcery.com> wrote:
>
> On Tue, 13 Jul 2021, H.J. Lu wrote:
>
> > On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei <pengfei.wang@intel.com> wrote:
> > >
> > > > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> > >
> > > Can you please explain the behavior here? Is there a difference between _Float16 and _Complex _Float16 on return? I.e.,
> > > 1. In which case will _Float16 values be returned in both %xmm0 and %xmm1?
> > > 2. For a single _Float16 value, are the real and imaginary parts both returned in %xmm0, or in %xmm0 and %xmm1 respectively?
> >
> > Here is the v2 patch to add the missing _Float16 bits. The PDF file is at
> >
> > https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
>
> This PDF shows _Complex _Float16 as having a size of 2 bytes (should be
> 4-byte size, 2-byte alignment).
>
> It also seems to change double from 4-byte to 8-byte alignment, which is
> wrong. And it's inconsistent about whether it covers the long double =
> double (Android) case - it shows that case for _Complex long double but
> not for long double itself.
Here is the v3 patch with the fixes. I also updated the PDF file.
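For reference, the size/alignment point raised above can be sketched in C. This is only an illustration, not part of the patch: the `struct layout` type and the `complex_layout` and `check_complex_rows` helpers are hypothetical names, and the rule (a complex type is two scalar parts back to back, aligned like one scalar) is assumed from the existing complex rows of the table.

```c
#include <assert.h>
#include <stddef.h>

/* Size and alignment of a type, in bytes (hypothetical helper type). */
struct layout {
    size_t size;
    size_t align;
};

/* Scalar rows as listed in the v3 table. */
const struct layout float16_layout = { 2, 2 };
const struct layout float_layout   = { 4, 4 };
const struct layout double_layout  = { 8, 4 };

/* Assumed rule: _Complex T holds a real and an imaginary T back to
   back, so it is twice the size of T and aligned like a single T. */
struct layout complex_layout(struct layout scalar)
{
    struct layout c = { 2 * scalar.size, scalar.align };
    return c;
}

/* Check the derived values against the complex rows of the table. */
void check_complex_rows(void)
{
    assert(complex_layout(float16_layout).size == 4);   /* _Complex _Float16 */
    assert(complex_layout(float16_layout).align == 2);
    assert(complex_layout(float_layout).size == 8);     /* _Complex float */
    assert(complex_layout(float_layout).align == 4);
    assert(complex_layout(double_layout).size == 16);   /* _Complex double */
    assert(complex_layout(double_layout).align == 4);
}
```

Under that rule, `_Complex _Float16` comes out as 4 bytes with 2-byte alignment, which matches the corrected row.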
> --
> Joseph S. Myers
> joseph@codesourcery.com
>
--
H.J.
[-- Attachment #2: v3-0001-Add-optional-_Float16-support.patch --]
[-- Type: text/x-patch, Size: 7346 bytes --]
From a02a11ef0ea066cab57eb66ef392b21d243d2734 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Thu, 1 Jul 2021 13:58:00 -0700
Subject: [PATCH v3] Add optional _Float16 support
1. Pass _Float16 and _Complex _Float16 values on stack.
2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
---
low-level-sys-info.tex | 76 ++++++++++++++++++++++++++++++------------
1 file changed, 54 insertions(+), 22 deletions(-)
diff --git a/low-level-sys-info.tex b/low-level-sys-info.tex
index acaf30e..9ae7995 100644
--- a/low-level-sys-info.tex
+++ b/low-level-sys-info.tex
@@ -30,7 +30,8 @@ object, and the term \emph{\textindex{\sixteenbyte{}}} refers to a
\subsubsection{Fundamental Types}
Table~\ref{basic-types} shows the correspondence between ISO C
-scalar types and the processor scalar types. \code{__float80},
+scalar types and the processor scalar types. \code{_Float16},
+\code{__float80},
\code{__float128}, \code{__m64}, \code{__m128}, \code{__m256} and
\code{__m512} types are optional.
@@ -79,23 +80,28 @@ scalar types and the processor scalar types. \code{__float80},
& \texttt{\textit{any-type} *} & 4 & 4 & unsigned \fourbyte \\
& \texttt{\textit{any-type} (*)()} & & \\
\hline
- Floating-& \texttt{float} & 4 & 4 & single (IEEE-754) \\
\cline{2-5}
- point & \texttt{double} & 8 & 4 & double (IEEE-754) \\
- & \texttt{long double}$^{\dagger\dagger\dagger\dagger}$ & & & \\
+ & \texttt{_Float16}$^{\dagger\dagger\dagger\dagger\dagger}$ & 2 & 2 & 16-bit (IEEE-754) \\
+ \cline{2-5}
+ & \texttt{float} & 4 & 4 & single (IEEE-754) \\
+ \cline{2-5}
+ Floating- & \texttt{double} & 8 & 4 & double (IEEE-754) \\
+ point & \texttt{long double}$^{\dagger\dagger\dagger\dagger}$ & 8 & 4 & double (IEEE-754) \\
\cline{2-5}
& \texttt{__float80}$^{\dagger\dagger}$ & 12 & 4 & 80-bit extended (IEEE-754) \\
- & \texttt{long double}$^{\dagger\dagger\dagger\dagger}$ & & & \\
+ & \texttt{long double}$^{\dagger\dagger\dagger\dagger}$ & 12 & 4 & 80-bit extended (IEEE-754) \\
\cline{2-5}
& \texttt{__float128}$^{\dagger\dagger}$ & 16 & 16 & 128-bit extended (IEEE-754) \\
\hline
- Complex& \texttt{_Complex float} & 8 & 4 & complex single (IEEE-754) \\
+ & \texttt{_Complex _Float16} $^{\dagger\dagger\dagger\dagger\dagger}$ & 4 & 2 & complex 16-bit (IEEE-754) \\
\cline{2-5}
- Floating-& \texttt{_Complex double} & 16 & 4 & complex double (IEEE-754) \\
- point & \texttt{_Complex long double}$^{\dagger\dagger\dagger\dagger}$ & & & \\
+ & \texttt{_Complex float} & 8 & 4 & complex single (IEEE-754) \\
\cline{2-5}
- & \texttt{_Complex __float80}$^{\dagger\dagger}$ & 24 & 4 & complex 80-bit extended (IEEE-754) \\
- & \texttt{_Complex long double}$^{\dagger\dagger\dagger\dagger}$ & & & \\
+ Complex& \texttt{_Complex double} & 16 & 4 & complex double (IEEE-754) \\
+ Floating-& \texttt{_Complex long double}$^{\dagger\dagger\dagger\dagger}$ & 16 & 4 & complex double (IEEE-754) \\
+ \cline{2-5}
+ point & \texttt{_Complex __float80}$^{\dagger\dagger}$ & 24 & 4 & complex 80-bit extended (IEEE-754) \\
+ & \texttt{_Complex long double}$^{\dagger\dagger\dagger\dagger}$ & 24 & 4 & complex 80-bit extended (IEEE-754) \\
\cline{2-5}
& \texttt{_Complex __float128}$^{\dagger\dagger}$ & 32 & 16 & complex 128-bit extended (IEEE-754) \\
\hline
@@ -125,6 +131,8 @@ The \texttt{long double} type is 64-bit, the same as the \texttt{double}
type, on the Android{\texttrademark} platform. More information on the
Android{\texttrademark} platform is available from
\url{http://www.android.com/}.}\\
+\multicolumn{5}{p{13cm}}{\myfontsize $^{\dagger\dagger\dagger\dagger\dagger}$
+The \texttt{_Float16} type, from ISO/IEC TS 18661-3:2015, is optional.}\\
\end{tabular}
}
\end{table}
@@ -323,6 +331,7 @@ at the time of the call.
\begin{table}
\Hrule
\caption{Register Usage}
+ \myfontsize
\label{fig-reg-usage}
\begin{center}
\begin{tabular}{l|p{8.35cm}|l}
@@ -346,13 +355,29 @@ of some 64bit return types & No \\
\EBP & callee-saved register; optionally used as frame pointer & Yes \\
\ESI & callee-saved register & yes \\
\EDI & callee-saved register & yes \\
-\reg{xmm0}, \reg{ymm0} & scratch registers; also used to pass and return
-\code{__m128}, \code{__m256} parameters & No\\
-\reg{xmm1}--\reg{xmm2},& scratch registers; also used to pass
-\code{__m128}, & No \\
-\reg{ymm1}--\reg{ymm2} & \code{__m256} parameters & \\
-\reg{xmm3}--\reg{xmm7},& scratch registers & No \\
-\reg{ymm3}--\reg{ymm7} & & \\
+\reg{xmm0} & scratch register; also used to pass the first \code{__m128}
+ parameter and return \code{__m128}, \code{_Float16},
+ the real part of \code{_Complex _Float16} & No \\
+\reg{ymm0} & scratch register; also used to pass the first \code{__m256}
+ parameter and return \code{__m256} & No \\
+\reg{zmm0} & scratch register; also used to pass the first \code{__m512}
+ parameter and return \code{__m512} & No \\
+\reg{xmm1} & scratch register; also used to pass the second \code{__m128}
+ parameter and return the imaginary part of
+ \code{_Complex _Float16} & No \\
+\reg{ymm1} & scratch register; also used to pass the second \code{__m256}
+ parameter & No \\
+\reg{zmm1} & scratch register; also used to pass the second \code{__m512}
+ parameter & No \\
+\reg{xmm2} & scratch register; also used to pass the third \code{__m128}
+ parameter & No \\
+\reg{ymm2} & scratch register; also used to pass the third \code{__m256}
+ parameter & No \\
+\reg{zmm2} & scratch register; also used to pass the third \code{__m512}
+ parameter & No \\
+\reg{xmm3}--\reg{xmm7} & scratch registers & No \\
+\reg{ymm3}--\reg{ymm7} & scratch registers & No \\
+\reg{zmm3}--\reg{zmm7} & scratch registers & No \\
\reg{mm0} & scratch register; also used to pass and return
\code{__m64} parameter & No\\
\reg{mm1}--\reg{mm2} & used to pass \code{__m64} parameters & No\\
@@ -420,6 +445,8 @@ and \texttt{unions}) are always returned in memory.
& \texttt{\textit{any-type} *} & \EAX \\
& \texttt{\textit{any-type} (*)()} & \\
\hline
+ & \texttt{_Float16} & \reg{xmm0} \\
+ \cline{2-3}
& \texttt{float} & \reg{st0} \\
\cline{2-3}
Floating- & \texttt{double} & \reg{st0} \\
@@ -430,14 +457,19 @@ and \texttt{unions}) are always returned in memory.
\cline{2-3}
& \texttt{__float128} & memory \\
\hline
- & \texttt{_Complex float} & \EDX:\EAX \\
- & & The real part is returned in \EAX. The imaginary part is
+ & \texttt{_Complex _Float16} & \reg{xmm0}:\reg{xmm1} \\
+ & & The real part is returned in \reg{xmm0}. The imaginary part is
+ returned \\
+ & & in \reg{xmm1}.\\
+ \cline{2-3}
+ Complex & \texttt{_Complex float} & \EDX:\EAX \\
+ floating- & & The real part is returned in \EAX. The imaginary part is
returned \\
- Complex & & in \EDX.\\
+ point & & in \EDX.\\
\cline{2-3}
- floating- & \texttt{_Complex double} & memory \\
+ & \texttt{_Complex double} & memory \\
\cline{2-3}
- point & \texttt{_Complex long double} & memory \\
+ & \texttt{_Complex long double} & memory \\
\cline{2-3}
& \texttt{_Complex __float80} & memory \\
\cline{2-3}
--
2.31.1
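
The return-value rows visible in the hunks above can be summarized as a small lookup table in C. This is a sketch for illustration only: the `ret_loc` enum, the `ret_rules` table and the `return_location` function are hypothetical names, and only the rows that actually appear in the diff are listed (the rows elided between hunks are left out).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Where a floating-point return value is placed (hypothetical enum). */
enum ret_loc {
    RET_XMM0,       /* %xmm0 */
    RET_XMM0_XMM1,  /* real part in %xmm0, imaginary part in %xmm1 */
    RET_ST0,        /* %st0 */
    RET_EDX_EAX,    /* real part in %eax, imaginary part in %edx */
    RET_MEMORY,     /* returned in memory */
    RET_UNKNOWN     /* type not covered by this sketch */
};

struct ret_rule {
    const char *type;
    enum ret_loc loc;
};

/* Rows taken from the return-value table as shown in the patch hunks. */
static const struct ret_rule ret_rules[] = {
    { "_Float16",             RET_XMM0 },
    { "float",                RET_ST0 },
    { "double",               RET_ST0 },
    { "__float128",           RET_MEMORY },
    { "_Complex _Float16",    RET_XMM0_XMM1 },
    { "_Complex float",       RET_EDX_EAX },
    { "_Complex double",      RET_MEMORY },
    { "_Complex long double", RET_MEMORY },
    { "_Complex __float80",   RET_MEMORY },
};

/* Look up the return location for a type name spelled as in the table. */
enum ret_loc return_location(const char *type)
{
    assert(type != NULL);
    for (size_t i = 0; i < sizeof ret_rules / sizeof ret_rules[0]; i++) {
        if (strcmp(ret_rules[i].type, type) == 0)
            return ret_rules[i].loc;
    }
    return RET_UNKNOWN;
}
```

For example, `return_location("_Complex _Float16")` yields `RET_XMM0_XMM1`, while `return_location("_Float16")` yields `RET_XMM0`. Arguments of both types are passed on the stack, per item 1 of the commit message.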