From: Linus Torvalds
Date: Tue, 18 Sep 2018 09:50:17 -0700
Subject: Re: Hash algorithm analysis
To: jda@noekeon.org
Cc: Dan Shumow, Johannes Schindelin, "brian m. carlson", Junio C Hamano, Edward Thomson, Jonathan Nieder, Git Mailing List, demerphq, Adam Langley
In-Reply-To: <64c1fc82-8830-bd62-7cc6-ef76ad06f6d5@noekeon.org>
X-Mailing-List: git@vger.kernel.org

On Tue, Sep 18, 2018 at 8:18 AM Joan Daemen wrote:
>
> 3) The relatively large state in the sponge construction increases the generic strength against attacks when the input contains redundancy or
> has a certain form. For instance, if the input is restricted to be text in ASCII (such as source code), then the collision-resistance grows
> higher than the nominal 2^{c/2}. Such an effect does not exist with narrow-pipe Merkle-Damgård. (This may be what Linus had intuitively in mind.)

Replying to just this part:

No, what I had in mind was literally just exactly the kind of attack that SHA1 broke for - attacking the internal state vector directly, and not paying any penalty for it, because the state size is the same as the final hash size.

The length extension attack is just the simplest and most trivial version of that kind of attack - because the internal state vector *is* the result, and you just continue using it.

But that trivial length extension thing is not the real problem, it's just the absolutely simplest symptom of the real problem.

I think that the model where the internal state of the hash is the same width as the final result is simply broken. It was what broke SHA1, and that problem is shared with SHA2.

"Length extension" is just the simplest way to say "broken by design", imho.

Because the length extension attack is just the most trivial attack, but it isn't the fundamental problem. It was just the first and the cheapest attack found, but it was also the most special-cased and least interesting. You need to have a very special case (with that secret at the beginning etc) to make the pure length extension attack interesting. And git has no secrets, so in that sense "length extension" by itself is totally immaterial. But the basic problem of internal hash size obviously wasn't.

So I would say that length extension is a direct result of the _real_ problem, which is that the hash exposes _all_ of the internal data.

That is what makes length extension possible - because you can just continue from a known state, and there is absolutely nothing hidden - and yes, that's a really easy special case where you don't even need to actually break the hash at all.

But I argue that it's _also_ one big part of what made SHAttered practical, and I think the underlying problem is exactly the same. When the internal state is the same size as the hash, you can attack the internal state itself for basically the same cost as attacking the whole hash.
So you can pick-and-choose the weakest point. Which is basically exactly what SHAttered did. No, it wasn't the trivial "just add to the end", but it used the exact same underlying weakness as one part of the attack.

*This* is why I dislike SHA2. It has basically the exact same basic weakness that we already know SHA1 fell for. The hashing details are different, and hopefully that means that there aren't the same kind of patterns that can be generated to do the "attack the internal hash state" part, but I don't understand why people seem to ignore that other fundamental issue.

Something like SHA-512/256 would have been better, but I think almost nobody does that in hardware, which was one of the big advantages of plain SHA2.

The main reason I think SHA2 is acceptable is simply that 256 bits is a lot. So even if somebody comes up with a shortcut that weakens it by tens of bits, nobody really cares. Plus I'm obviously not a cryptographer, so I didn't feel like I was going to fight it a lot.

But yes, I'd have probably gone with any of the other alternatives, because I think it's a bit silly that we're switching hashes to another hash that has (at least in part) the *exact* same issue as the one people call broken.

(And yes, the hashing details are different, so it's "exactly the same" only wrt that internal state part - not the bitpattern finding part that made the attack on the internal state much cheaper. Real cryptographers obviously found that "figure out the weakness of the hashing" to be the more interesting and novel part over the trivial internal hash size part).

That said..

The real reason I think SHA2 is the right choice was simply that there needs to be a decision, and none of the choices were *wrong*. Sometimes just the _act_ of making a decision is more important than _what_ the decision is.
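
To make the "internal state vector *is* the result" point concrete, here is a toy sketch in Python. It is not SHA1 or SHA2: `compress`, `narrow_hash`, `wide_hash`, `extend`, the 16-byte block and the 8-byte digest are all made-up names and sizes for illustration, with truncated SHA-256 standing in as the compression function. It only shows the length extension symptom (continuing from a known state), not anything like the SHAttered collision search.

```python
import hashlib
import struct

BLOCK = 16  # toy block size in bytes (arbitrary choice for this sketch)


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: mix state and block, keep len(state) bytes."""
    return hashlib.sha256(state + block).digest()[: len(state)]


def pad(msg_len: int) -> bytes:
    """Merkle-Damgard style padding: 0x80, zeros, then the 8-byte message length."""
    zeros = (-(msg_len + 1 + 8)) % BLOCK
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", msg_len)


def absorb(state: bytes, data: bytes) -> bytes:
    """Run the compression function over block-aligned data."""
    assert len(data) % BLOCK == 0
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i : i + BLOCK])
    return state


def narrow_hash(msg: bytes) -> bytes:
    """Narrow pipe: the 8-byte digest is the entire final state."""
    return absorb(b"\x00" * 8, msg + pad(len(msg)))


def wide_hash(msg: bytes) -> bytes:
    """Wide pipe: 16 bytes of state, only 8 are published as the digest."""
    return absorb(b"\x00" * 16, msg + pad(len(msg)))[:8]


def extend(digest: bytes, orig_len: int, suffix: bytes):
    """Continue a narrow_hash from its digest, knowing only the message length."""
    glue = pad(orig_len)  # padding the original hash already absorbed
    total_len = orig_len + len(glue) + len(suffix)
    return glue, absorb(digest, suffix + pad(total_len))


if __name__ == "__main__":
    hidden = b"unknown-prefix:" + b"payload"  # never passed to extend()
    glue, forged = extend(narrow_hash(hidden), len(hidden), b";extra")
    print(forged == narrow_hash(hidden + glue + b";extra"))
```

With `narrow_hash` the digest can be fed straight back in as the state, so `extend` never needs the original message. With `wide_hash` half of the final state is never published, so the same trick has nothing to resume from. That is the narrow vs wide difference in its most trivial form, and truncated variants like SHA-512/256 follow the same idea.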
And hey, it is also likely that the reason _I_ get hung up on just the size of the internal state is that exactly because I am _not_ a cryptographer, that kind of high-level stuff is the part I understand. When you start talking about why the exact rules of Merkle–Damgård constructions work, my eyes just glaze over.

So I'm probably - no, certainly - myopic and looking at only one part of the issue to begin with.

The end result is that I argued for more bits in the internal state (and apparently wide vs narrow is the technical term), and I would have seen parallel algorithms as a bonus for the large-file case. None of which argued for SHA2.

But see above on why I think SHA2 is if not *the* right choice, at least *a* right choice.

             Linus