* [PATCH, alpha]: Fix sysdeps/alpha/remqu.S clobbering $f3 reg
@ 2019-01-18 13:06 Uros Bizjak
2019-01-20 18:21 ` Uros Bizjak
2019-01-24 8:23 ` Richard Henderson
0 siblings, 2 replies; 7+ messages in thread
From: Uros Bizjak @ 2019-01-18 13:06 UTC (permalink / raw
To: libc-alpha; +Cc: Richard Henderson
[-- Attachment #1: Type: text/plain, Size: 497 bytes --]
Hello!
Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via
$y_is_neg path. There was missing restore of $f3 before the return
from the function.
The patch also reorders insns a bit, so it becomes similar as much as
possible to divqu.S.
Without the patch, math/big testcase from Go-1.11 testsuite (that
includes lots of corner cases that exercise remqu) FAIL, with patched
function, the testcase PASSes without problems.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Uros.
[-- Attachment #2: libc-alpha.diff.txt --]
[-- Type: text/plain, Size: 1439 bytes --]
diff --git a/sysdeps/alpha/remqu.S b/sysdeps/alpha/remqu.S
index c2c5caf3c20..7210628f973 100644
--- a/sysdeps/alpha/remqu.S
+++ b/sysdeps/alpha/remqu.S
@@ -59,20 +59,19 @@ __remqu:
subq Y, 1, AT
stt $f0, 0(sp)
and Y, AT, AT
+ excb
+ beq AT, $powerof2
stt $f1, 8(sp)
- excb
stt $f3, 48(sp)
- beq AT, $powerof2
cfi_rel_offset ($f0, 0)
cfi_rel_offset ($f1, 8)
cfi_rel_offset ($f3, 48)
+ mf_fpcr $f3
_ITOFT2 X, $f0, 16, Y, $f1, 24
- mf_fpcr $f3
cvtqt $f0, $f0
cvtqt $f1, $f1
-
blt X, $x_is_neg
divt/c $f0, $f1, $f0
@@ -94,12 +93,12 @@ __remqu:
mulq AT, Y, AT
ldt $f0, 0(sp)
ldt $f3, 48(sp)
- lda sp, FRAME(sp)
cfi_remember_state
cfi_restore ($f0)
cfi_restore ($f1)
cfi_restore ($f3)
cfi_def_cfa_offset (0)
+ lda sp, FRAME(sp)
.align 4
subq X, AT, RV
@@ -116,11 +115,13 @@ $x_is_neg:
cfi_rel_offset ($f2, 24)
_ITOFS AT, $f2, 16
+ .align 4
addt $f0, $f2, $f0
+ unop
divt/c $f0, $f1, $f0
+ unop
/* Ok, we've now the divide issued. Continue with other checks. */
- .align 4
ldt $f1, 8(sp)
unop
ldt $f2, 24(sp)
@@ -246,12 +247,16 @@ $y_is_neg:
quotient must be either 0 or 1, so the remainder must be X
or X-Y, so just compute it directly. */
cmpule Y, X, AT
+ excb
+ mt_fpcr $f3
subq X, Y, RV
ldt $f0, 0(sp)
+ ldt $f3, 48(sp)
cmoveq AT, X, RV
lda sp, FRAME(sp)
cfi_restore ($f0)
+ cfi_restore ($f3)
cfi_def_cfa_offset (0)
ret $31, (RA), 1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH, alpha]: Fix sysdeps/alpha/remqu.S clobbering $f3 reg
2019-01-18 13:06 [PATCH, alpha]: Fix sysdeps/alpha/remqu.S clobbering $f3 reg Uros Bizjak
@ 2019-01-20 18:21 ` Uros Bizjak
2019-01-24 8:23 ` Richard Henderson
1 sibling, 0 replies; 7+ messages in thread
From: Uros Bizjak @ 2019-01-20 18:21 UTC (permalink / raw
To: libc-alpha; +Cc: Richard Henderson
On Fri, Jan 18, 2019 at 2:06 PM Uros Bizjak <ubizjak@gmail.com>
wrote:Please also find the missing ChangeLog entry.
Please find missing ChangeLog entry in this message. Also, please note
I don't have write access to glibc repository.
Uros.
> Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via
> $y_is_neg path. There was missing restore of $f3 before the return
> from the function.
>
> The patch also reorders insns a bit, so it becomes similar as much as
> possible to divqu.S.
>
> Without the patch, math/big testcase from Go-1.11 testsuite (that
> includes lots of corner cases that exercise remqu) FAIL, with patched
> function, the testcase PASSes without problems.
2019-18-01 Uroš Bizjak <ubizjak@gmail.com>
* sysdeps/alpha/remqu.S (__remqu): Add missing restore
of $f3 register on $y_is_neg path. Reorder instructions
similar to divqu.S order.
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
>
> Uros.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, alpha]: Fix sysdeps/alpha/remqu.S clobbering $f3 reg
2019-01-18 13:06 [PATCH, alpha]: Fix sysdeps/alpha/remqu.S clobbering $f3 reg Uros Bizjak
2019-01-20 18:21 ` Uros Bizjak
@ 2019-01-24 8:23 ` Richard Henderson
2019-01-24 8:35 ` Uros Bizjak
2019-02-27 19:54 ` [PATCH, alpha]: Improve sysdeps/alpha/divqu.S and sysdeps/alpha/remqu.S Uros Bizjak
1 sibling, 2 replies; 7+ messages in thread
From: Richard Henderson @ 2019-01-24 8:23 UTC (permalink / raw
To: Uros Bizjak, libc-alpha
On 1/18/19 5:06 AM, Uros Bizjak wrote:
> Hello!
>
> Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via
> $y_is_neg path. There was missing restore of $f3 before the return
> from the function.
>
> The patch also reorders insns a bit, so it becomes similar as much as
> possible to divqu.S.
>
> Without the patch, math/big testcase from Go-1.11 testsuite (that
> includes lots of corner cases that exercise remqu) FAIL, with patched
> function, the testcase PASSes without problems.
> +++ b/sysdeps/alpha/remqu.S
> @@ -59,20 +59,19 @@ __remqu:
> subq Y, 1, AT
> stt $f0, 0(sp)
> and Y, AT, AT
> + excb
> + beq AT, $powerof2
>
> stt $f1, 8(sp)
> - excb
Why are you moving the excb above the powerof2 branch?
The path at powerof2 does not touch fpcr or issue fp insns.
> @@ -94,12 +93,12 @@ __remqu:
> mulq AT, Y, AT
> ldt $f0, 0(sp)
> ldt $f3, 48(sp)
> - lda sp, FRAME(sp)
> cfi_remember_state
> cfi_restore ($f0)
> cfi_restore ($f1)
> cfi_restore ($f3)
> cfi_def_cfa_offset (0)
> + lda sp, FRAME(sp)
This change is actively wrong wrt the unwind info.
> @@ -246,12 +247,16 @@ $y_is_neg:
> quotient must be either 0 or 1, so the remainder must be X
> or X-Y, so just compute it directly. */
> cmpule Y, X, AT
> + excb
> + mt_fpcr $f3
> subq X, Y, RV
> ldt $f0, 0(sp)
> + ldt $f3, 48(sp)
> cmoveq AT, X, RV
>
> lda sp, FRAME(sp)
> cfi_restore ($f0)
> + cfi_restore ($f3)
> cfi_def_cfa_offset (0)
> ret $31, (RA), 1
This appears to be the only change required to fix the bug.
Can you walk me through why the other changes?
r~
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, alpha]: Fix sysdeps/alpha/remqu.S clobbering $f3 reg
2019-01-24 8:23 ` Richard Henderson
@ 2019-01-24 8:35 ` Uros Bizjak
2019-01-24 9:04 ` Richard Henderson
2019-02-27 19:54 ` [PATCH, alpha]: Improve sysdeps/alpha/divqu.S and sysdeps/alpha/remqu.S Uros Bizjak
1 sibling, 1 reply; 7+ messages in thread
From: Uros Bizjak @ 2019-01-24 8:35 UTC (permalink / raw
To: Richard Henderson; +Cc: libc-alpha
On Thu, Jan 24, 2019 at 9:23 AM Richard Henderson <rth@twiddle.net> wrote:
>
> On 1/18/19 5:06 AM, Uros Bizjak wrote:
> > Hello!
> >
> > Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via
> > $y_is_neg path. There was missing restore of $f3 before the return
> > from the function.
> >
> > The patch also reorders insns a bit, so it becomes similar as much as
> > possible to divqu.S.
> >
> > Without the patch, math/big testcase from Go-1.11 testsuite (that
> > includes lots of corner cases that exercise remqu) FAIL, with patched
> > function, the testcase PASSes without problems.
>
>
> > +++ b/sysdeps/alpha/remqu.S
> > @@ -59,20 +59,19 @@ __remqu:
> > subq Y, 1, AT
> > stt $f0, 0(sp)
> > and Y, AT, AT
> > + excb
> > + beq AT, $powerof2
> >
> > stt $f1, 8(sp)
> > - excb
>
> Why are you moving the excb above the powerof2 branch?
> The path at powerof2 does not touch fpcr or issue fp insns.
This was meant to unify the flow with the __divqu assembly, which does
the above before calling DIVBYZERO. The idea was that __divqu is used
much more than __remqu, so the later should do the same as the former.
> > @@ -94,12 +93,12 @@ __remqu:
> > mulq AT, Y, AT
> > ldt $f0, 0(sp)
> > ldt $f3, 48(sp)
> > - lda sp, FRAME(sp)
> > cfi_remember_state
> > cfi_restore ($f0)
> > cfi_restore ($f1)
> > cfi_restore ($f3)
> > cfi_def_cfa_offset (0)
> > + lda sp, FRAME(sp)
>
> This change is actively wrong wrt the unwind info.
Again, this will match __divqu assembly. It looks that __divqu needs
to be fixed then.
> > @@ -246,12 +247,16 @@ $y_is_neg:
> > quotient must be either 0 or 1, so the remainder must be X
> > or X-Y, so just compute it directly. */
> > cmpule Y, X, AT
> > + excb
> > + mt_fpcr $f3
> > subq X, Y, RV
> > ldt $f0, 0(sp)
> > + ldt $f3, 48(sp)
> > cmoveq AT, X, RV
> >
> > lda sp, FRAME(sp)
> > cfi_restore ($f0)
> > + cfi_restore ($f3)
> > cfi_def_cfa_offset (0)
> > ret $31, (RA), 1
>
> This appears to be the only change required to fix the bug.
That is true. This part is the problematic part and clobbers $f3.
Should I resend the patch only with this part fixed?
> Can you walk me through why the other changes?
As said above, I was trying to make __remqu like __divqu, but it looks
that __divqu should be fixed in some places.
Thanks,
Uros.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, alpha]: Fix sysdeps/alpha/remqu.S clobbering $f3 reg
2019-01-24 8:35 ` Uros Bizjak
@ 2019-01-24 9:04 ` Richard Henderson
0 siblings, 0 replies; 7+ messages in thread
From: Richard Henderson @ 2019-01-24 9:04 UTC (permalink / raw
To: Uros Bizjak; +Cc: libc-alpha
On 1/24/19 12:35 AM, Uros Bizjak wrote:
> That is true. This part is the problematic part and clobbers $f3.
> Should I resend the patch only with this part fixed?
Yes please. We're in the final stages of release
and any changes should be most minimal.
We can fix the div/rem matchup and unwind errors
for the next release.
r~
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH, alpha]: Improve sysdeps/alpha/divqu.S and sysdeps/alpha/remqu.S
2019-01-24 8:23 ` Richard Henderson
2019-01-24 8:35 ` Uros Bizjak
@ 2019-02-27 19:54 ` Uros Bizjak
2019-04-01 9:02 ` Richard Henderson
1 sibling, 1 reply; 7+ messages in thread
From: Uros Bizjak @ 2019-02-27 19:54 UTC (permalink / raw
To: Richard Henderson; +Cc: libc-alpha
[-- Attachment #1: Type: text/plain, Size: 567 bytes --]
Attached patch improves and unifies sysdeps/alpha/divqu.S and
sysdeps/alpha/remqu.S.
* sysdeps/alpha/divqu.S (__divqu): Move save of $f0 and excb after
conditional branch to DIVBYZERO. Fix unwind info.
* sysdeps/alpha/remqu.S (__remqu): Move saves of $f0, $f1, $f2 and
excb after conditional branch to $powerof2. Add missing unop
instructions and .align directives and reorder instructions to match
__divqu.
Signed-off-by: Uroš Bizjak <ubizjak@gmail.com>
Patch was tested on alphaev68-linux-gnu, also by running complete
libgo testsuite.
Uros.
[-- Attachment #2: a.diff.txt --]
[-- Type: text/plain, Size: 1762 bytes --]
diff --git a/sysdeps/alpha/divqu.S b/sysdeps/alpha/divqu.S
index f5cedd0716..3165374b6d 100644
--- a/sysdeps/alpha/divqu.S
+++ b/sysdeps/alpha/divqu.S
@@ -56,10 +56,10 @@ __divqu:
that's done, we have at least 22 cycles until its results are
ready -- all the time in the world to figure out how we're
going to use the results. */
- stt $f0, 0(sp)
- excb
beq Y, DIVBYZERO
+ stt $f0, 0(sp)
+ excb
stt $f1, 8(sp)
stt $f3, 48(sp)
cfi_rel_offset ($f0, 0)
@@ -70,6 +70,7 @@ __divqu:
_ITOFT2 X, $f0, 16, Y, $f1, 24
cvtqt $f0, $f0
cvtqt $f1, $f1
+
blt X, $x_is_neg
divt/c $f0, $f1, $f0
@@ -90,12 +91,12 @@ __divqu:
ldt $f0, 0(sp)
ldt $f3, 48(sp)
+ lda sp, FRAME(sp)
cfi_remember_state
cfi_restore ($f0)
cfi_restore ($f1)
cfi_restore ($f3)
cfi_def_cfa_offset (0)
- lda sp, FRAME(sp)
ret $31, (RA), 1
.align 4
diff --git a/sysdeps/alpha/remqu.S b/sysdeps/alpha/remqu.S
index a240ee9735..3b6a62dd88 100644
--- a/sysdeps/alpha/remqu.S
+++ b/sysdeps/alpha/remqu.S
@@ -57,19 +57,19 @@ __remqu:
ready -- all the time in the world to figure out how we're
going to use the results. */
subq Y, 1, AT
- stt $f0, 0(sp)
and Y, AT, AT
+ beq AT, $powerof2
- stt $f1, 8(sp)
+ stt $f0, 0(sp)
excb
+ stt $f1, 8(sp)
stt $f3, 48(sp)
- beq AT, $powerof2
cfi_rel_offset ($f0, 0)
cfi_rel_offset ($f1, 8)
cfi_rel_offset ($f3, 48)
+ mf_fpcr $f3
_ITOFT2 X, $f0, 16, Y, $f1, 24
- mf_fpcr $f3
cvtqt $f0, $f0
cvtqt $f1, $f1
@@ -116,11 +116,13 @@ $x_is_neg:
cfi_rel_offset ($f2, 24)
_ITOFS AT, $f2, 16
+ .align 4
addt $f0, $f2, $f0
+ unop
divt/c $f0, $f1, $f0
+ unop
/* Ok, we've now the divide issued. Continue with other checks. */
- .align 4
ldt $f1, 8(sp)
unop
ldt $f2, 24(sp)
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH, alpha]: Improve sysdeps/alpha/divqu.S and sysdeps/alpha/remqu.S
2019-02-27 19:54 ` [PATCH, alpha]: Improve sysdeps/alpha/divqu.S and sysdeps/alpha/remqu.S Uros Bizjak
@ 2019-04-01 9:02 ` Richard Henderson
0 siblings, 0 replies; 7+ messages in thread
From: Richard Henderson @ 2019-04-01 9:02 UTC (permalink / raw
To: Uros Bizjak; +Cc: libc-alpha
On 2/28/19 2:54 AM, Uros Bizjak wrote:
> Attached patch improves and unifies sysdeps/alpha/divqu.S and
> sysdeps/alpha/remqu.S.
>
> * sysdeps/alpha/divqu.S (__divqu): Move save of $f0 and excb after
> conditional branch to DIVBYZERO. Fix unwind info.
> * sysdeps/alpha/remqu.S (__remqu): Move saves of $f0, $f1, $f2 and
> excb after conditional branch to $powerof2. Add missing unop
> instructions and .align directives and reorder instructions to match
> __divqu.
>
> Signed-off-by: Uroš Bizjak <ubizjak@gmail.com>
Thanks, committed.
r~
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2019-04-01 9:02 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-01-18 13:06 [PATCH, alpha]: Fix sysdeps/alpha/remqu.S clobbering $f3 reg Uros Bizjak
2019-01-20 18:21 ` Uros Bizjak
2019-01-24 8:23 ` Richard Henderson
2019-01-24 8:35 ` Uros Bizjak
2019-01-24 9:04 ` Richard Henderson
2019-02-27 19:54 ` [PATCH, alpha]: Improve sysdeps/alpha/divqu.S and sysdeps/alpha/remqu.S Uros Bizjak
2019-04-01 9:02 ` Richard Henderson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).