git@vger.kernel.org mailing list mirror (one of many)
From: linux@horizon.com
To: paulus@samba.org
Cc: git@vger.kernel.org, linux@horizon.com
Subject: Re: Revised PPC assembly implementation
Date: 25 Apr 2005 17:34:30 -0000
Message-ID: <20050425173430.11031.qmail@science.horizon.com>
In-Reply-To: <17004.47876.414.756912@cargo.ozlabs.ibm.com>

>> Which led to three questions:
>> - Is the stack set properly now?

> Not quite; you are saving 20 registers, so you need a 96-byte stack
> frame, like this:

> 	stwu	%r1,-96(%r1)
> 	stmw	%r13,16(%r1)
> 	...
> 	lmw	%r13,16(%r1)
> 	addi	%r1,%r1,96
> 	blr

Huh?  I'm saving 19 registers, r13..r31, and not saving 13, namely
r0..r12.
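
For what it's worth, the frame size comes out the same either way.  A
rough sketch of the arithmetic, assuming 4-byte registers, a save area
starting at offset 16 as in the quoted stmw, and a frame rounded up to
a 16-byte boundary (ppc32_frame_size is a made-up helper name, not part
of the patch):

```c
#include <assert.h>
#include <stddef.h>

/* Frame size needed to save first_reg..r31 with stmw, given the offset of
 * the save area and the stack alignment (assumed to be a power of two). */
static size_t ppc32_frame_size(unsigned first_reg, size_t save_offset,
                               size_t align)
{
    assert(first_reg <= 31 && align != 0 && (align & (align - 1)) == 0);
    size_t nregs = 32 - first_reg;           /* r13..r31 is 19 registers */
    size_t raw = save_offset + 4 * nregs;    /* 16 + 76 = 92 for r13..r31 */
    return (raw + align - 1) & ~(align - 1); /* round up: 92 becomes 96 */
}
```

Under those assumptions, saving 19 registers (r13..r31) and saving 20
(r12..r31) both need a 96-byte frame.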

The dodgy thing *I'm* thinking of is saving %r2 (the TOC pointer)
and using it as an extra temporary.  (The alternative is spilling
one of the "old" hash values to the stack, which is not
too big a disaster.)

>> - Is it any faster?

> I did 10 repetitions of my program that calls SHA1_Update with a
> 4096-byte block of zeroes 256,000 times.  With my version, the average
> time was 4.6191 seconds with a standard deviation of 0.0157.  With your
> version, the average was 4.6063 and the standard deviation 0.0148.  So
> I would say that your version is probably just a little faster - of the
> order of 0.3% faster.

Damn.  So that's actually *worse* than my earlier version, which achieved
an (also piddling) 2% speedup?
As you can see, I tried to make the addition tree bushier, but I guess
it didn't help.  Or the processor isn't out-of-order enough to find
the parallelism I made available.
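
For reference, the "0.3%" figure follows directly from the two quoted
averages.  A small sketch of the arithmetic (speedup_percent is a
hypothetical helper; the timings are the ones quoted above):

```c
#include <assert.h>

/* Relative time saved, in percent, going from old_secs to new_secs. */
static double speedup_percent(double old_secs, double new_secs)
{
    assert(old_secs > 0.0);
    return 100.0 * (old_secs - new_secs) / old_secs;
}
```

speedup_percent(4.6191, 4.6063) is a little under 0.28.  Note that the
difference between the averages (0.0128 s) is smaller than either quoted
standard deviation, so the gain is close to the noise of a single run.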

Damn, I wish I had that IBM pipeline profiling tool.  If it could
just tell me which cycles didn't have both ALUs busy, I could solve it
in relatively little time.

The place that could really use scheduling help is the G4, which has three
integer ALUs, but can only *think* about executing the bottom three entries
in the reorder queue.  So if one of those instructions isn't ready, it
stalls in the queue and idles the ALU with it.

Especially there, it may be necessary to interleave the EXPANDW code
with the round code to avoid having the (non-critical-path) EXPANDW
code scheduled ahead of critical-path round code.

The two critical-path inter-round dependencies are:
- summing into E to be rotated by 5 and added to D next round.
  (this is the "A<<<5" code in the current round)
- rotating B left for use in the next round's F(a,b,c) function.
  (this is the current round's C input)
Actually, the E variable isn't critical-path at all; it was last
modified several rounds ago.
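
The two dependencies above can be sketched in C.  This is only an
illustration, assuming the usual unrolled formulation in which the five
working variables swap roles each round instead of being copied; the
names rol, sha1_block and v are made up for this sketch and are not
taken from the assembly under discussion.  The F functions are written
with the standard (b,c,d) argument order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint32_t rol(uint32_t x, int n)
{
    assert(n >= 1 && n <= 31);      /* keep both shifts well defined */
    return (x << n) | (x >> (32 - n));
}

/* Process one 64-byte block, updating the five-word state h in place. */
static void sha1_block(uint32_t h[5], const unsigned char blk[64])
{
    uint32_t w[80], v[5];
    int t;

    for (t = 0; t < 16; t++)        /* load the block as big-endian words */
        w[t] = (uint32_t)blk[4*t] << 24 | (uint32_t)blk[4*t+1] << 16 |
               (uint32_t)blk[4*t+2] << 8 | blk[4*t+3];
    for (t = 16; t < 80; t++)       /* message expansion, off the critical path */
        w[t] = rol(w[t-3] ^ w[t-8] ^ w[t-14] ^ w[t-16], 1);

    memcpy(v, h, sizeof v);
    for (t = 0; t < 80; t++) {
        /* Roles shift by one slot per round: this round's e is the next
         * round's a, and this round's b is the next round's c. */
        uint32_t *a = &v[(80 - t) % 5], *b = &v[(81 - t) % 5];
        uint32_t *c = &v[(82 - t) % 5], *d = &v[(83 - t) % 5];
        uint32_t *e = &v[(84 - t) % 5];
        uint32_t f, k;

        if (t < 20)      { f = (*b & *c) | (~*b & *d);           k = 0x5A827999; }
        else if (t < 40) { f = *b ^ *c ^ *d;                     k = 0x6ED9EBA1; }
        else if (t < 60) { f = (*b & *c) | (*b & *d) | (*c & *d); k = 0x8F1BBCDC; }
        else             { f = *b ^ *c ^ *d;                     k = 0xCA62C1D6; }

        *e += rol(*a, 5) + f + k + w[t]; /* sum is rotated by 5 next round */
        *b = rol(*b, 30);                /* feeds the next round's F as c */
    }
    for (t = 0; t < 5; t++)         /* 80 rounds: roles are back in place */
        h[t] += v[t];
}
```

Here the old value of e going into each sum was last written several
rounds earlier, while rol(*a, 5) depends on the sum finished in the
round immediately before.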

Maybe I can improve the scheduling some more...
