git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>,
	Ian Jackson <ijackson@chiark.greenend.org.uk>,
	Joey Hess <id@joeyh.name>, Git Mailing List <git@vger.kernel.org>
Subject: Re: SHA1 collisions found
Date: Thu, 2 Mar 2017 13:21:30 -0800	[thread overview]
Message-ID: <CA+55aFziZRA29foAMbM-HS5fiup7T0TuYf4XQ1kNT_SR7FfSgw@mail.gmail.com> (raw)
In-Reply-To: <xmqqk287be9l.fsf@gitster.mtv.corp.google.com>

On Thu, Mar 2, 2017 at 12:43 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> My reaction heavily depends on how that "object version" thing
> works.
>
> Would "object version" be like a truncated SHA-1 over the same data
> but with different IV or something, i.e. something that guarantees
> anybody would get the same result given the data to be hashed?

Yes, it does need to be that in practice. So what I was thinking the
object version would be is:

 (a) we actually take the object type into account explicitly.

 (b) we explicitly add another truncated hash.

The first part we can already do without any actual data structure
changes, since basically all users already know the type of an object
when they look it up.

So we already have information that we could use to narrow down the
hash collision case if we saw one.

There are some (very few) cases where we don't already explicitly have
the object type (a tag reference can be any object, for example, and
existing scripts might ask for "give me the type of this SHA1 object
with "git cat-file -t"), but that just goes back to the whole "yeah,
we'll handle legacy uses and we will look up objects even _without_
the extra version data, so it actually integrates well into the whole
notion.

Basically, once you accept that "hey, we'll just have a list of
objects with that hash", it just makes sense to narrow it down by the
object type we also already have.

But yes, the object type is obviously only two bits of information
(actually, considering the type distribution, probably just one bit),
and it's already encoded in the first hash, so it doesn't actually
help much as "collision avoidance" particularly once you have a
particular attack against that hash in place.

It's just that it *is* extra information that we already have, and
that is very natural to use once you start thinking of the hash lookup
as returning a list of objects. It also mitigates one of the worst
_confusions_ in git, and so basically mitigates the worst-case
downside of an attack basically for free, so it seems like a
no-brainer.

But the real new piece of object version would be a truncated second
hash of the object.

I don't think it matters too much what that second hash is, I would
say that we'd just approximate having a total of 256 bits of hash.

Since we already have basically 160 bits of fairly good hashing, and
roughly 128 bits of that isn't known to be attackable, we'd just use
another hash and truncate that to 128 bits. That would be *way*
overkill in practice, but maybe overkill is what we want. And it
wouldn't really expand the objects all that much more than just
picking a new 256-bit hash would do.

So you'd have to be able to attack both the full SHA1, _and_ whatever
other different good hash to 128 bits.

                Linus

PS.  if people think that SHA1 is of a good _size_, and only worry
about the known weaknesses of the hashing itself, we'd only need to
get back the bits that the attacks take away from brute force. That's
currently the 80 -> ~63 bits attack, so you'd really only want about
40 bits of second hash to claw us back back up to 80 bits of brute
force (again: brute force is basically sqrt() of the search space, so
half the bits, so adding 40 bits of hash adds 20 bits to the brute
force cost and you'd get back up to the 2**80 we started with).

So 128 bits of secondary hash really is much more than we'd need. 64
bits would probably be fine.

  reply	other threads:[~2017-03-03  1:43 UTC|newest]

Thread overview: 136+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-23 16:43 SHA1 collisions found Joey Hess
2017-02-23 17:00 ` David Lang
2017-02-23 17:02 ` Junio C Hamano
2017-02-23 17:12   ` David Lang
2017-02-23 20:49     ` Jakub Narębski
2017-02-23 20:57       ` Jeff King
2017-02-23 17:18   ` Junio C Hamano
2017-02-23 17:35   ` Joey Hess
2017-02-23 17:52     ` Linus Torvalds
2017-02-23 18:21       ` Joey Hess
2017-02-23 18:31         ` Joey Hess
2017-02-23 19:13           ` Morten Welinder
2017-02-24 15:52             ` Geert Uytterhoeven
2017-02-23 18:40         ` Linus Torvalds
2017-02-23 18:46           ` Jeff King
2017-02-23 19:09             ` Linus Torvalds
2017-02-23 19:32               ` Jeff King
2017-02-23 19:47                 ` Linus Torvalds
2017-02-23 19:57                   ` Jeff King
     [not found]                     ` <alpine.LFD.2.20.1702231428540.30435@i7.lan>
2017-02-23 22:43                       ` Jeff King
2017-02-23 22:50                         ` Linus Torvalds
2017-02-23 23:05                         ` Jeff King
2017-02-23 23:05                           ` [PATCH 1/3] add collision-detecting sha1 implementation Jeff King
2017-02-23 23:15                             ` Stefan Beller
2017-02-24  0:01                               ` Jeff King
2017-02-24  0:12                                 ` Linus Torvalds
2017-02-24  0:16                                   ` Jeff King
2017-02-23 23:05                           ` [PATCH 2/3] sha1dc: adjust header includes for git Jeff King
2017-02-23 23:06                           ` [PATCH 3/3] Makefile: add USE_SHA1DC knob Jeff King
2017-02-24 18:36                             ` HW42
2017-02-24 18:57                               ` Jeff King
2017-02-23 23:14                           ` SHA1 collisions found Linus Torvalds
2017-02-28 18:41                           ` Junio C Hamano
2017-02-28 19:07                             ` Junio C Hamano
2017-02-28 19:20                               ` Jeff King
2017-03-01  8:57                                 ` Dan Shumow
2017-02-28 19:34                               ` Linus Torvalds
2017-02-28 19:52                                 ` Shawn Pearce
2017-02-28 22:56                                   ` Linus Torvalds
2017-02-28 21:22                                 ` Dan Shumow
2017-02-28 22:50                                   ` Marc Stevens
2017-02-28 23:11                                     ` Linus Torvalds
2017-03-01 19:05                                       ` Jeff King
2017-02-23 20:47               ` Øyvind A. Holm
2017-02-23 20:46             ` Joey Hess
2017-02-23 18:42         ` Jeff King
2017-02-23 17:52     ` David Lang
2017-02-23 19:20   ` David Lang
2017-02-23 17:19 ` Linus Torvalds
2017-02-23 17:29   ` Linus Torvalds
2017-02-23 18:10   ` Joey Hess
2017-02-23 18:29     ` Linus Torvalds
2017-02-23 18:38     ` Junio C Hamano
2017-02-24  9:42 ` Duy Nguyen
2017-02-25 19:04   ` brian m. carlson
2017-02-27 13:29     ` René Scharfe
2017-02-28 13:25       ` brian m. carlson
2017-02-24 15:13 ` Ian Jackson
2017-02-24 17:04   ` ankostis
2017-02-24 17:23   ` Jason Cooper
2017-02-25 23:22     ` ankostis
2017-02-24 17:32   ` Junio C Hamano
2017-02-24 17:45     ` David Lang
2017-02-24 18:14       ` Junio C Hamano
2017-02-24 18:58         ` Stefan Beller
2017-02-24 19:20           ` Junio C Hamano
2017-02-24 20:05             ` ankostis
2017-02-24 20:32               ` Junio C Hamano
2017-02-25  0:31                 ` ankostis
2017-02-26  0:16                   ` Jason Cooper
2017-02-26 17:38                     ` brian m. carlson
2017-02-26 19:11                       ` Linus Torvalds
2017-02-26 21:38                         ` Ævar Arnfjörð Bjarmason
2017-02-26 21:52                           ` Jeff King
2017-02-27 13:00                             ` Transition plan for git to move to a new hash function Ian Jackson
2017-02-27 14:37                               ` Why BLAKE2? Markus Trippelsdorf
2017-02-27 15:42                                 ` Ian Jackson
2017-02-27 19:26                               ` Transition plan for git to move to a new hash function Tony Finch
2017-02-28 21:47                               ` brian m. carlson
2017-03-02 18:13                                 ` Ian Jackson
2017-03-04 22:49                                   ` brian m. carlson
2017-03-05 13:45                                     ` Ian Jackson
2017-03-05 23:45                                       ` brian m. carlson
2017-02-24 20:05             ` SHA1 collisions found Junio C Hamano
2017-02-24 20:33           ` Philip Oakley
2017-02-24 23:39     ` Jeff King
2017-02-25  0:39       ` Linus Torvalds
2017-02-25  0:54         ` Linus Torvalds
2017-02-25  1:16         ` Jeff King
2017-02-26 18:55           ` Junio C Hamano
2017-02-25  6:10         ` Junio C Hamano
2017-02-26  1:13           ` Jason Cooper
2017-02-26  5:18             ` Jeff King
2017-02-26 18:30               ` brian m. carlson
2017-03-02 21:46               ` Brandon Williams
2017-03-03 11:13                 ` Jeff King
2017-03-03 14:54                   ` Ian Jackson
2017-03-03 22:18                     ` Jeff King
2017-03-02 19:55         ` Linus Torvalds
2017-03-02 20:43           ` Junio C Hamano
2017-03-02 21:21             ` Linus Torvalds [this message]
2017-03-02 21:54               ` Joey Hess
2017-03-02 22:27                 ` Linus Torvalds
2017-03-03  1:50                   ` Mike Hommey
2017-03-03  2:19                     ` Linus Torvalds
2017-03-03 11:04           ` Jeff King
2017-03-03 21:47           ` Stefan Beller
2017-02-25  1:00       ` David Lang
2017-02-25  1:15         ` Stefan Beller
2017-02-25  1:21         ` Jeff King
2017-02-25  1:39           ` David Lang
2017-02-25  1:47             ` Jeff King
2017-02-25  1:56               ` David Lang
2017-02-25  2:28             ` Jacob Keller
2017-02-25  2:26           ` Jacob Keller
2017-02-25  5:39             ` grarpamp
2017-02-24 23:43     ` Ian Jackson
2017-02-25  0:06       ` Ian Jackson
2017-02-25 18:50     ` brian m. carlson
2017-02-25 19:26       ` Jeff King
2017-02-25 22:09         ` Mike Hommey
2017-02-26 17:38           ` brian m. carlson
2017-02-24 22:47 ` Jakub Narębski
2017-02-24 22:53   ` Santiago Torres
2017-02-24 23:05     ` Jakub Narębski
2017-02-24 23:24       ` Øyvind A. Holm
2017-02-24 23:06   ` Jeff King
2017-02-24 23:35     ` Jakub Narębski
2017-02-25 22:35     ` Lars Schneider
2017-02-26  0:46       ` Jeff King
2017-02-26 18:22         ` Junio C Hamano
2017-02-26 18:57     ` Thomas Braun
2017-02-26 21:30       ` Jeff King
2017-02-27  9:57         ` Geert Uytterhoeven
2017-02-27 10:43           ` Jeff King
2017-02-27 12:39             ` Morten Welinder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+55aFziZRA29foAMbM-HS5fiup7T0TuYf4XQ1kNT_SR7FfSgw@mail.gmail.com \
    --to=torvalds@linux-foundation.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=id@joeyh.name \
    --cc=ijackson@chiark.greenend.org.uk \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).