Re: Hash algorithm analysis

From: Linus Torvalds <torvalds@linux-foundation.org>
To: jda@noekeon.org
Cc: Dan Shumow <danshu@microsoft.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	Junio C Hamano <gitster@pobox.com>,
	Edward Thomson <ethomson@edwardthomson.com>,
	Jonathan Nieder <jrnieder@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	demerphq <demerphq@gmail.com>, Adam Langley <agl@google.com>
Subject: Re: Hash algorithm analysis
Date: Tue, 18 Sep 2018 09:50:17 -0700
Message-ID: <CA+55aFyYyh0EYSotDYMv+mk+QbRghQnn3MK7oRn=131VDm=HfQ@mail.gmail.com>
In-Reply-To: <64c1fc82-8830-bd62-7cc6-ef76ad06f6d5@noekeon.org>

On Tue, Sep 18, 2018 at 8:18 AM Joan Daemen <jda@noekeon.org> wrote:
>
> 3) The relatively large state in the sponge construction increases the generic strength against attacks when the input contains redundancy or
> has a certain form. For instance, if the input is restricted to be text in ASCII (such as source code), then the collision-resistance grows
> higher than the nominal 2^{c/2}. Such an effect does not exist with narrow-pipe Merkle-Damgård. (This may be what Linus had intuitively in mind.)

Replying to just this part:

No, what I had in mind was literally just exactly the kind of attack
that SHA1 broke for - attacking the internal state vector directly,
and not paying any penalty for it, because the state size is the same
as the final hash size.

The length extension attack is just the simplest and most trivial
version of that kind of attack - because the internal state vector
*is* the result, and you just continue using it.

But that trivial length extension thing is not the real problem, it's
just the absolutely simplest symptom of the real problem.

I think that the model where the internal state of the hash is the
same width as the final result is simply broken. It was what broke
SHA1, and that problem is shared with SHA2.

"Length extension" is just the simplest way to say "broken by design", imho.

Because the length extension attack is just the most trivial attack,
but it isn't the fundamental problem. It was just the first and the
cheapest attack found, but it was also the most special-cased and
least interesting. You need to have a very special case (with that
secret at the beginning etc) to make the pure length extension attack
interesting. And git has no secrets, so in that sense "length
extension" by itself is totally immaterial. But the basic problem of
internal hash size obviously wasn't.

So I would say that length extension is a direct result of the _real_
problem, which is that the hash exposes _all_ of the internal data.

That is what makes length extension possible - because you can just
continue from a known state, and there is absolutely nothing hidden -
and yes, that's a really easy special case where you don't even need
to actually break the hash at all.
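
As a rough illustration of "continue from a known state", here is a
minimal Python sketch of length extension against SHA-256. The helper
names (compress, md_padding, extend) are made up for this example; the
compression function follows the published SHA-256 specification, and
hashlib is used only to check the result. It assumes the attacker knows
the digest and the length of the original message, but not its contents.

```python
import hashlib
import struct

def _icbrt(n):
    """Integer cube root by binary search: largest r with r**3 <= n."""
    lo, hi = 0, 1 << 40
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def _primes(count):
    """First `count` primes by trial division (fine for count = 64)."""
    found, n = [], 2
    while len(found) < count:
        if all(n % p for p in found):
            found.append(n)
        n += 1
    return found

# Round constants: first 32 fractional bits of the cube roots of the
# first 64 primes, derived here instead of being typed out.
K = [_icbrt(p << 96) & 0xFFFFFFFF for p in _primes(64)]

def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF

def compress(state, block):
    """One SHA-256 compression step: 8-word state + 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & 0xFFFFFFFF)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & 0xFFFFFFFF
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & 0xFFFFFFFF
        a, b, c, d, e, f, g, h = (
            (t1 + t2) & 0xFFFFFFFF, a, b, c,
            (d + t1) & 0xFFFFFFFF, e, f, g,
        )
    return [(x + y) & 0xFFFFFFFF
            for x, y in zip(state, (a, b, c, d, e, f, g, h))]

def md_padding(length):
    """SHA-256 padding for a message of `length` bytes."""
    zeros = b"\x00" * ((55 - length) % 64)
    return b"\x80" + zeros + struct.pack(">Q", length * 8)

def extend(digest, original_len, suffix):
    """From SHA-256(m) and len(m) alone, compute the digest of
    m + glue + suffix, where glue is the padding of m."""
    state = list(struct.unpack(">8I", digest))  # the digest IS the state
    glue = md_padding(original_len)
    total = original_len + len(glue) + len(suffix)
    tail = suffix + md_padding(total)
    for i in range(0, len(tail), 64):
        state = compress(state, tail[i:i + 64])
    return glue, struct.pack(">8I", *state)

# The "attacker" below only uses the digest and the length.
original = b"unknown to the attacker"
glue, forged = extend(hashlib.sha256(original).digest(),
                      len(original), b"appended")
assert forged == hashlib.sha256(original + glue + b"appended").digest()
```

The forged digest matches what hashlib computes over the whole extended
message, even though extend() never looked at the original message. No
weakness in the round function is needed for this; it works purely
because the output reveals the complete chaining state.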

But I argue that it's _also_ one big part of what made SHAttered
practical, and I think the underlying problem is exactly the same.
When the internal state is the same size as the hash, you can attack
the internal state itself for basically the same cost as attacking the
whole hash.

So you can pick-and-choose the weakest point.

Which is basically exactly what SHAttered did. No, it wasn't the
trivial "just add to the end", but it used the exact same underlying
weakness as one part of the attack.

*This* is why I dislike SHA2. It has basically the exact same basic
weakness that we already know SHA1 fell for. The hashing details are
different, and hopefully that means that there aren't the same kind of
patterns that can be generated to do the "attack the internal hash
state" part, but I don't understand why people seem to ignore that
other fundamental issue.

Something like SHA-512/256 would have been better, but I think almost
nobody does that in hardware, which was one of the big advantages of
plain SHA2.
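
A small Python sketch of the difference, for illustration only. The
helper truncated_sha512 is hypothetical and is NOT the standardized
SHA-512/256 (which also uses different initial values); it only shows
the shape of the idea: a 512-bit chaining state with a 256-bit output,
so half of the final state never appears in the digest.

```python
import hashlib

def truncated_sha512(data: bytes, out_bytes: int = 32) -> bytes:
    """Hypothetical helper: SHA-512 with the output cut to out_bytes.

    Not the standardized SHA-512/256, which uses its own initial values.
    """
    return hashlib.sha512(data).digest()[:out_bytes]

# (name, chaining-state bits, digest)
rows = [
    ("SHA-256", 256, hashlib.sha256(b"hello").digest()),
    ("SHA-512 truncated", 512, truncated_sha512(b"hello")),
]

for name, state_bits, digest in rows:
    out_bits = 8 * len(digest)
    hidden = state_bits - out_bits
    print(f"{name}: {state_bits}-bit state, {out_bits}-bit output, "
          f"{hidden} state bits never revealed")
```

With plain SHA-256 the last column is zero, which is the property being
complained about here; with the truncated variant an attacker who sees
the digest is still missing 256 bits of the final state.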

The main reason I think SHA2 is acceptable is simply that 256 bits is
a lot. So even if somebody comes up with a shortcut that weakens it by
tens of bits, nobody really cares. Plus I'm obviously not a
cryptographer, so I didn't feel like I was going to fight it a lot.

But yes, I'd have probably gone with any of the other alternatives,
because I think it's a bit silly that we're switching hashes to
another hash that has (at least in part) the *exact* same issue as the
one people call broken.

(And yes, the hashing details are different, so it's "exactly the
same" only wrt that internal state part - not the bitpattern finding
part that made the attack on the internal state much cheaper. Real
cryptographers obviously found that "figure out the weakness of the
hashing" to be the more interesting and novel part over the trivial
internal hash size part).

That said..

The real reason I think SHA2 is the right choice was simply that there
needs to be a decision, and none of the choices were *wrong*.
Sometimes just the _act_ of making a decision is more important than
_what_ the decision is.

And hey, it is also likely that the reason _I_ get hung up on just the
size of the internal state is exactly that I am _not_ a cryptographer,
so that kind of high-level stuff is the part I understand.
When you start talking about why the exact rules of Merkle–Damgård
constructions work, my eyes just glaze over.

So I'm probably - no, certainly - myopic and looking at only one part
of the issue to begin with.

The end result is that I argued for more bits in the internal state
(and apparently wide vs narrow is the technical term), and I would
have seen parallel algorithms as a bonus for the large-file case. None
of which argued for SHA2.

But see above on why I think SHA2 is if not *the* right choice, at
least *a* right choice.

                    Linus
