Re: Hash algorithm analysis

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

From: Joan Daemen <jda@noekeon.org>
To: Dan Shumow <danshu@microsoft.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	"brian m. carlson" <sandals@crustytoothpaste.net>
Cc: Junio C Hamano <gitster@pobox.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Edward Thomson <ethomson@edwardthomson.com>,
	Jonathan Nieder <jrnieder@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	demerphq <demerphq@gmail.com>, Adam Langley <agl@google.com>
Subject: Re: Hash algorithm analysis
Date: Tue, 18 Sep 2018 17:18:28 +0200	[thread overview]
Message-ID: <64c1fc82-8830-bd62-7cc6-ef76ad06f6d5@noekeon.org> (raw)
In-Reply-To: <MWHPR21MB01738A46E37527E45E950937C42F0@MWHPR21MB0173.namprd21.prod.outlook.com>

Dear all,

when going over my todo list I was confronted with the mail of Dan Shumow on the successor of SHA-1 for git. I know the decision was made and it is not my intention to change it, but please see below some comments on Dan's arguments.

In short, I argue below that SHA256 has no serious advantages when compared to KangarooTwelve. In that light, the fact that SHA2 was designed behind closed doors (like SHA-1 was) should be enough reason to skip it entirely in an undertaking that takes open-source seriously.

Kind regards,

Joan

PS: In my comments below I use "we" as I discussed them with the members of the Keccak team being Gilles Van Assche, Michaël Peeters, Guido Bertoni, Ronny Van Keer and Seth Hoffert, and we agree on all of them.

On 30/07/2018 22:01, Dan Shumow wrote:
> Hello all.   Johannes, thanks for adding me to this discussion.
>
> So, as one of the coauthors of the SHA-1 collision detection code, I just wanted to chime in and say I'm glad to see the move to a longer hash function.  Though, as a cryptographer, I have a few thoughts on the matter that I thought I would share.
>
> I think that moving to SHA256 is a fine change, and I support it.
>
> I'm not anywhere near the expert in this that Joan Daeman is. 

Note that the correct spelling is "Daemen". But anyway, it is not a matter of being expert, but a matter of taking the arguments at face value. 

> I am someone who has worked in this space more or less peripherally.  However, I agree with Adam Langley that basically all of the finalists for a hash function replacement are about the same for the security needs of Git.  I think that, for this community, other software engineering considerations should be more important to the selection process.

We are also with Adam on the engineering considerations. We see the parallelism that K12 can exploit adaptively (unlike SHA256) as an example of such a consideration.

> I think Joan's survey of cryptanalysis papers and the numbers that he gives are interesting, and I had never seen the comparison laid out like that.  So, I think that there is a good argument to be made that SHA3 has had more cryptanalysis than SHA2.  Though, Joan, are the papers that you surveyed only focused on SHA2?  I'm curious if you think that the design/construction of SHA2, as it can be seen as an iteration of MD5/SHA1, means that the cryptanalysis papers on those constructions can be considered to apply to SHA2?  

This argument works both ways, i.e., the knowledge and experience of the symmetric cryptography community in general has also contributed to our choices in Keccak and in K12 (including the experience gained by Rijndael/AES). But in the end, the only objective metric we have for comparing public scrutiny is the amount of cryptanalysis (and analysis) published, and there Keccak simply scores better.

> Again, I'm not an expert in this, but I do know that Marc Steven's techniques for constructing collisions also provided some small cryptanalytic improvements against the SHA2 family as well.  I also think that while the paper survey is a good way to look over all of this, the more time in the position of high profile visibility that SHA2 has can give us some confidence as well.

High profile visibility to implementers does not mean more cryptanalysis, since users and implementers are usually not cryptanalysts. Actually, one of the reasons that SHA2 attracted much less cryptanalysis than you would expect due to its age is that during the SHA3 competition all cryptanalysts pointed their arrows to SHA3 candidates.

> Also something worth pointing out is that the connection SHA2 has to SHA1 means that if Marc Steven's cryptanalysis of MD5/SHA-1 were ever successfully applied to SHA2, the SHA1 collision detection approach could be applied there as well, thus providing a drop in replacement in that situation.  That said, we don't know that there is not a similar way of addressing issues with the SHA3/Sponge design.  It's just that because we haven't seen any weaknesses of this sort in similar designs, we just don't know what a similar approach would be there yet.  I don't want to put too much stock in this argument, it's just saying "Well, we already know how SHA2 is likely to break, and we've had fixes for similar things in the past."  This is pragmatic but not inspiring or confidence building.
>
> So, I also want to state my biases in favor of SHA2 as an employee of Microsoft.  Microsoft, being a corporation headquartered in a America, with the US Gov't as a major customer definitely prefers to defer to the US Gov't NIST standardization process.  And from that perspective SHA2 or SHA3 would be good choices.  I, personally, think that the NIST process is the best we have.  It is relatively transparent, and NIST employs a fair number of very competent cryptographers.  Also, I am encouraged by the widespread international participation that the NIST competitions and selection processes attract.

Of course, NIST has done (and is still doing) a great job at organizing public competitions, where all submissions have to include a design rationale and where the final selection is based on extensive openly published cryptanalysis and comparisons done by the cryptographic community. This is obviously AES and SHA3. However, NIST also put forward NSA designs as standards, without design rationale or public cryptanalysis whatsoever, and in some cases even with built-in backdoors
(EC-DRBG as Dan probably remembers). Examples of this are DES, SHA(0), SHA-1 and, yes, SHA2. The former we would call open-source philosophy and
the latter closed-source.

> As such, and reflecting this bias, in the internal discussions that Johannes alluded to, SHA2 and SHA3 were the primary suggestions.  There was a slight preference for SHA2 because SHA3 is not exposed through the windows cryptographic APIs (though Git does not use those, so this is a nonissue for this discussion.)

We find it cynical to bring up a Microsoft-internal argument that is actually not relevant to Git.

> I also wanted to thank Johannes for keeping the cryptographers that he discussed this with anonymous.  After all, cryptographers are known for being private.  And I wanted to say that Johannes did, in fact, accurately represent our internal discussions on the matter.

Our experience is that in the cryptographic community there are many outspoken individuals that fearlessly ventilate their opinions (sometimes even controversial ones).

> I also wanted to comment on the discussion of the "internal state having the same size as the output."  Linus referred to this several times.  This is known as narrow-pipe vs wide-pipe in the hash function design literature.  Linus is correct that wide-pipe designs are more in favor currently, and IIRC, all of the serious SHA3 candidates employed this.  That said, it did seem that in the discussion this was being equated with "length extension attacks."  And that connection is just not accurate.  Length extension attacks are primarily a motivation of the HMAC liked nested hashing design for MACs, because of a potential forgery attack.  Again, this doesn't really matter because the decision has been made despite this discussion.  I just wanted to set the record straight about this, as to avoid doing the right thing for the wrong reason (T.S. Elliot's "greatest treason.")

Indeed, vulnerability to length extension attacks and size of the internal state (chaining value in compression function based designs, or capacity in sponge) are two different things. Still, we would like to make a few points here.

1) There were SHA3 submissions that were narrow-pipe, i.e., finalist Blake is narrow-pipe.

2) SHA2 and its predecessors are vulnerable to length extension, SHA3 (or any of the SHA3 finalists) isn't. Length extension is a problem when using the hash function for MAC computation but this can be fixed by putting a construction on top of it. That construction is HMAC, that comes with some fixed overhead (up to a factor 4 for short messages).

3) The relatively large state in the sponge construction increases the generic strength against attacks when the input contains redundancy or
has a certain form. For instance, if the input is restricted to be text in ASCII (such as source code), then the collision-resistance grows
higher than the nominal 2^{c/2}. Such an effect does not exist with narrow-pipe Merkle-Damgård. (This may be what Linus had intuitively in mind.)

> One other thing that I wanted to throw out there for the future is that in the crypto community there is currently a very large push to post quantum cryptography.  Whether the threat of quantum computers is real or imagined this is a hot area of research, with a NIST competition to select post quantum asymmetric cryptographic algorithms.  That is not directly of concern to the selection of a hash function.  However, if we take this threat as legitimate, quantum computers reduce the strength of symmetric crypto, both encryption and hash functions, by 1/2.  

This is not what the experts say. In [1] a quantum algorithm is given that reduces the effort to generate a hash collision to 2^{n/3} (instead of 2^{n/2} classically). So according to [1] the strength reduction is 2/3 rather than 1/2. Moreover, in [2], Dan Bernstein takes a more detailed look at the actual cost of that algorithm and argues that the quantum algorithm of [1] performs worse than classical ones and that there is no security reduction at all for collision-resistance.

[1] Gilles Brassard, Peter Høyer, Alain Tapp, Quantum cryptanalysis of hash and claw-free functions, in LATIN'98 proceedings (1998), 163–169.

[2] Daniel J. Bernstein, Cost analysis of hash collisions: Will quantum computers make SHARCS obsolete? Workshop Record of SHARCS'09.

> So, if this is the direction that the crypto community ultimately goes in, 512bit hashes will be seen as standard over the next decade or so.  I don't think that this should be involved in this discussion, presently.   I'm just saying that not unlike the time when SHA1 was selected, I think that the replacement of a 256bit hash is on the horizon as well.
>
> Thanks,
> Dan Shumow
>

next prev parent reply	other threads:[~2018-09-18 15:27 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-09 20:56 State of NewHash work, future directions, and discussion brian m. carlson
2018-06-09 21:26 ` Ævar Arnfjörð Bjarmason
2018-06-09 22:49 ` Hash algorithm analysis brian m. carlson
2018-06-11 19:29   ` Jonathan Nieder
2018-06-11 20:20     ` Linus Torvalds
2018-06-11 23:27       ` Ævar Arnfjörð Bjarmason
2018-06-12  0:11         ` David Lang
2018-06-12  0:45         ` Linus Torvalds
2018-06-11 22:35     ` brian m. carlson
2018-06-12 16:21       ` Gilles Van Assche
2018-06-13 23:58         ` brian m. carlson
2018-06-15 10:33           ` Gilles Van Assche
2018-07-20 21:52     ` brian m. carlson
2018-07-21  0:31       ` Jonathan Nieder
2018-07-21 19:52       ` Ævar Arnfjörð Bjarmason
2018-07-21 20:25         ` brian m. carlson
2018-07-21 22:38       ` Johannes Schindelin
2018-07-21 23:09         ` Linus Torvalds
2018-07-21 23:59         ` brian m. carlson
2018-07-22  9:34           ` Eric Deplagne
2018-07-22 14:21             ` brian m. carlson
2018-07-22 14:55               ` Eric Deplagne
2018-07-26 10:05                 ` Johannes Schindelin
2018-07-22 15:23           ` Joan Daemen
2018-07-22 18:54             ` Adam Langley
2018-07-26 10:31             ` Johannes Schindelin
2018-07-23 12:40           ` demerphq
2018-07-23 12:48             ` Sitaram Chamarty
2018-07-23 12:55               ` demerphq
2018-07-23 18:23               ` Linus Torvalds
2018-07-23 17:57             ` Stefan Beller
2018-07-23 18:35             ` Jonathan Nieder
2018-07-24 19:01       ` Edward Thomson
2018-07-24 20:31         ` Linus Torvalds
2018-07-24 20:49           ` Jonathan Nieder
2018-07-24 21:13           ` Junio C Hamano
2018-07-24 22:10             ` brian m. carlson
2018-07-30  9:06               ` Johannes Schindelin
2018-07-30 20:01                 ` Dan Shumow
2018-08-03  2:57                   ` Jonathan Nieder
2018-09-18 15:18                   ` Joan Daemen [this message]
2018-09-18 15:32                     ` Jonathan Nieder
2018-09-18 16:50                     ` Linus Torvalds
2018-07-25  8:30             ` [PATCH 0/2] document that NewHash is now SHA-256 Ævar Arnfjörð Bjarmason
2018-07-25  8:30             ` [PATCH 1/2] doc hash-function-transition: note the lack of a changelog Ævar Arnfjörð Bjarmason
2018-07-25  8:30             ` [PATCH 2/2] doc hash-function-transition: pick SHA-256 as NewHash Ævar Arnfjörð Bjarmason
2018-07-25 16:45               ` Junio C Hamano
2018-07-25 17:25                 ` Jonathan Nieder
2018-07-25 21:32                   ` Junio C Hamano
2018-07-26 13:41                     ` [PATCH v2 " Ævar Arnfjörð Bjarmason
2018-08-03  7:20                       ` Jonathan Nieder
2018-08-03 16:40                         ` Junio C Hamano
2018-08-03 17:01                           ` Linus Torvalds
2018-08-03 16:42                         ` Linus Torvalds
2018-08-03 17:43                         ` Ævar Arnfjörð Bjarmason
2018-08-04  8:52                           ` Jonathan Nieder
2018-08-03 17:45                         ` brian m. carlson
2018-07-25 22:56                 ` [PATCH " brian m. carlson
2018-06-11 21:19   ` Hash algorithm analysis Ævar Arnfjörð Bjarmason
2018-06-21  8:20     ` Johannes Schindelin
2018-06-21 22:39     ` brian m. carlson
2018-06-11 18:09 ` State of NewHash work, future directions, and discussion Duy Nguyen
2018-06-12  1:28   ` brian m. carlson
2018-06-11 19:01 ` Jonathan Nieder
2018-06-12  2:28   ` brian m. carlson
2018-06-12  2:42     ` Jonathan Nieder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=64c1fc82-8830-bd62-7cc6-ef76ad06f6d5@noekeon.org \
    --to=jda@noekeon.org \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=agl@google.com \
    --cc=danshu@microsoft.com \
    --cc=demerphq@gmail.com \
    --cc=ethomson@edwardthomson.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=sandals@crustytoothpaste.net \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).