git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Nicolas Pitre <nico@cam.org>
To: George Spelvin <linux@horizon.com>
Cc: bdonlan@gmail.com, johnflux@gmail.com, P@draigBrady.com,
	art.08.09@gmail.com, git@vger.kernel.org,
	torvalds@linux-foundation.org
Subject: Re: Linus' sha1 is much faster!
Date: Mon, 17 Aug 2009 13:06:45 -0400 (EDT)	[thread overview]
Message-ID: <alpine.LFD.2.00.0908171228570.6044@xanadu.home> (raw)
In-Reply-To: <20090817072315.4314.qmail@science.horizon.com>

On Mon, 17 Aug 2009, George Spelvin wrote:

> If it helps anyone resolve license issues, here's a from-FIPS-180-2
> implementation that's placed in the public domain.  That should be
> compatible with any license.
> 
> It uses Linus's and Artur's performance ideas, and some of Linus' macro
> ideas (in the rotate implementation), but tries to be textually different.
> Is there anything recognizable that anyone cares to clam copyright to?

I don't think this trick of making source code textually different from 
another work while still intimately mimicking the same structure entitles 
you to any copyright (or non copyright) claims over that other work.  I 
certainly wouldn't bet any dime for this standing up in court.  
Otherwise anyone could grab any copyrighted source code and perform a 
bunch of search-and-replace ops on it, and maybe some code reordering 
for good measure, to be able to claim own copyright on it. It is 
probably much safer to simply ask the people involved to agree with your 
relicensing.  And so far I don't see anyone with a stake in this 
fiercely wanting to stick to a particular license.

> It's not quite 100% finished, as I haven't benchmarked it against Linus's
> code yet, but it's functionally correct.
> 
> It's also clean with -W -Wall -Wextra.

Not if you try with the unaligned put_be32() as the destination pointer 
is marked const.

As to the actual result on ARM... Well, the assembly _looks_ much worse 
than Linus' version.  It uses a stack frame of 152 bytes instead of 64 
bytes.  The resulting binary is also 6868 bytes large compared to 6180 
bytes.  Surprisingly, the performance is not that bad (the reason for 
the underlined "looks" above) albeit still a bit worse, like 5% slower.  
I was expecting much worse than that.

One possible reason for the bad assembly is probably due to the fact 
that gcc is not smart enough to propagate constant address offsets 
across different pointer types.  For example, my first version of 
get_be32() was a macro that did this:

#define SHA_SRC(t) \
  ({ unsigned char *__d = (unsigned char *)&data[t]; \
     (__d[0] << 24) | (__d[1] << 16) | (__d[2] << 8) | (__d[3] << 0); })

With such a construct, gcc would always allocate a register to hold __d 
and then dereference that with an offset from 0 to 3.  Whereas:

#define SHA_SRC(t) \
   ({   unsigned char *__d = (unsigned char *)data; \
        (__d[(t)*4 + 0] << 24) | (__d[(t)*4 + 1] << 16) | \
        (__d[(t)*4 + 2] <<  8) | (__d[(t)*4 + 3] <<  0); })

does produce optimal assembly as only the register holding the data 
pointer is dereferenced with the absolute byte offset.  I suspect your 
usage of inline functions has the same effect as the first SHA_SRC 
definition above.

Also, wrt skipping the last 3 write back to the 16 word array...  For 
all the (limited) attempts I've made so far to do that, it always ended 
up making things worse.  I've yet to investigate why though.


Nicolas

  parent reply	other threads:[~2009-08-17 17:07 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-17  7:23 Linus' sha1 is much faster! George Spelvin
2009-08-17 14:20 ` Nicolas Pitre
2009-08-17 17:06 ` Nicolas Pitre [this message]
2009-08-17 17:20   ` Paolo Bonzini
2009-08-17 18:54   ` George Spelvin
2009-08-17 19:34     ` Nicolas Pitre
2009-08-17 23:12       ` George Spelvin
  -- strict thread matches above, loose matches on Subject: below --
2009-08-14 23:25 Pádraig Brady
2009-08-15 20:02 ` Bryan Donlan
2009-08-15 20:12   ` John Tapsell
2009-08-15 20:23     ` Linus Torvalds
2009-08-15 20:54       ` Linus Torvalds
2009-08-17  1:55         ` Nicolas Pitre
2009-08-26 11:39           ` Pádraig Brady
2017-04-20 21:35             ` galt
2017-04-20 21:38             ` galt
2009-08-17  8:22         ` Andreas Ericsson
2009-08-16  0:06     ` Theodore Tso
2009-08-16 19:25 ` Giuseppe Scrivano
2009-08-16 20:10   ` Linus Torvalds
2009-08-16 22:15     ` Giuseppe Scrivano
2009-08-16 22:47       ` Linus Torvalds
2009-08-17  1:53   ` Pádraig Brady
2009-08-17 10:51     ` Giuseppe Scrivano
2009-08-17 15:44       ` Steven Noonan
2009-08-17 16:22         ` Linus Torvalds
2009-08-17 21:43           ` Steven Noonan
2009-08-17 17:32         ` Giuseppe Scrivano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.2.00.0908171228570.6044@xanadu.home \
    --to=nico@cam.org \
    --cc=P@draigBrady.com \
    --cc=art.08.09@gmail.com \
    --cc=bdonlan@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=johnflux@gmail.com \
    --cc=linux@horizon.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).