git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>,
	Ian Jackson <ijackson@chiark.greenend.org.uk>,
	Joey Hess <id@joeyh.name>, Git Mailing List <git@vger.kernel.org>
Subject: Re: SHA1 collisions found
Date: Fri, 24 Feb 2017 16:39:45 -0800	[thread overview]
Message-ID: <CA+55aFw6BLjPK-F0RGd9LT7X5xosKOXOxuhmKX65ZHn09r1xow@mail.gmail.com> (raw)
In-Reply-To: <20170224233929.p2yckbc6ksyox5nu@sigill.intra.peff.net>

On Fri, Feb 24, 2017 at 3:39 PM, Jeff King <peff@peff.net> wrote:
>
> One thing I worry about in a mixed-hash setting is how often the two
> will be mixed.

Honestly, I think that a primary goal for a new hash implementation
absolutely needs to be to minimize mixing.

Not for security issues, but because of combinatorics. You want to
have a model that basically reads old data, but that very aggressively
approaches "new data only" in order to avoid the situation where you
have basically the exact same tree state, just _represented_
differently.

For example, what I would suggest the rules be is something like this:

 - introduce new tag2/commit2/tree2/blob2 object type tags that imply
that they were hashed using the new hash

 - an old type obviously can never contain a pointer to a new type (ie
you can't have a "tree" object that contains a tree2 object or a blob2
object.

 - but also make the rule that a *new* type can never contain a
pointer to an old type, with the *very* specific exception that a
commit2 can have a parent that is of type "commit".

That way everything "converges" towards the new format: the only way
you can stay on the old format is if you only have old-format objects,
and once you have a new-format object all your objects are going to be
new format - except for the history.

Obviously, if somebody stays in old format, you might end up still
getting some object duplication when you continue to merge from him,
but that tree can never merge back without converting to new-format,
so it will be a temporary situation.

So you will end up with duplicate objects, and that's not good (think
of what it does to all our full-tree "diff" optimizations, for example
- you no longer get the "these sub-trees are identical" across a
format change), but realistically you'll have a very limited time of
that kind of duplication.

I'd furthermore suggest that from a UI standpoint, we'd

 - convert to 64-character hex numbers (32-byte hashes)

 - (as mentioned earlier) default to a 40-character abbreviation

 - make the old 40-character SHA1's just show up within the same
address space (so they'd also be encoded as 32-byte hashes, just with
the last 12 bytes zero).

 - you'd see in the "object->type" whether it's a new or old-style hash.

I suspect it shouldn't be too painful to do it that way.

                Linus

  reply	other threads:[~2017-02-25  0:39 UTC|newest]

Thread overview: 136+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-23 16:43 SHA1 collisions found Joey Hess
2017-02-23 17:00 ` David Lang
2017-02-23 17:02 ` Junio C Hamano
2017-02-23 17:12   ` David Lang
2017-02-23 20:49     ` Jakub Narębski
2017-02-23 20:57       ` Jeff King
2017-02-23 17:18   ` Junio C Hamano
2017-02-23 17:35   ` Joey Hess
2017-02-23 17:52     ` Linus Torvalds
2017-02-23 18:21       ` Joey Hess
2017-02-23 18:31         ` Joey Hess
2017-02-23 19:13           ` Morten Welinder
2017-02-24 15:52             ` Geert Uytterhoeven
2017-02-23 18:40         ` Linus Torvalds
2017-02-23 18:46           ` Jeff King
2017-02-23 19:09             ` Linus Torvalds
2017-02-23 19:32               ` Jeff King
2017-02-23 19:47                 ` Linus Torvalds
2017-02-23 19:57                   ` Jeff King
     [not found]                     ` <alpine.LFD.2.20.1702231428540.30435@i7.lan>
2017-02-23 22:43                       ` Jeff King
2017-02-23 22:50                         ` Linus Torvalds
2017-02-23 23:05                         ` Jeff King
2017-02-23 23:05                           ` [PATCH 1/3] add collision-detecting sha1 implementation Jeff King
2017-02-23 23:15                             ` Stefan Beller
2017-02-24  0:01                               ` Jeff King
2017-02-24  0:12                                 ` Linus Torvalds
2017-02-24  0:16                                   ` Jeff King
2017-02-23 23:05                           ` [PATCH 2/3] sha1dc: adjust header includes for git Jeff King
2017-02-23 23:06                           ` [PATCH 3/3] Makefile: add USE_SHA1DC knob Jeff King
2017-02-24 18:36                             ` HW42
2017-02-24 18:57                               ` Jeff King
2017-02-23 23:14                           ` SHA1 collisions found Linus Torvalds
2017-02-28 18:41                           ` Junio C Hamano
2017-02-28 19:07                             ` Junio C Hamano
2017-02-28 19:20                               ` Jeff King
2017-03-01  8:57                                 ` Dan Shumow
2017-02-28 19:34                               ` Linus Torvalds
2017-02-28 19:52                                 ` Shawn Pearce
2017-02-28 22:56                                   ` Linus Torvalds
2017-02-28 21:22                                 ` Dan Shumow
2017-02-28 22:50                                   ` Marc Stevens
2017-02-28 23:11                                     ` Linus Torvalds
2017-03-01 19:05                                       ` Jeff King
2017-02-23 20:47               ` Øyvind A. Holm
2017-02-23 20:46             ` Joey Hess
2017-02-23 18:42         ` Jeff King
2017-02-23 17:52     ` David Lang
2017-02-23 19:20   ` David Lang
2017-02-23 17:19 ` Linus Torvalds
2017-02-23 17:29   ` Linus Torvalds
2017-02-23 18:10   ` Joey Hess
2017-02-23 18:29     ` Linus Torvalds
2017-02-23 18:38     ` Junio C Hamano
2017-02-24  9:42 ` Duy Nguyen
2017-02-25 19:04   ` brian m. carlson
2017-02-27 13:29     ` René Scharfe
2017-02-28 13:25       ` brian m. carlson
2017-02-24 15:13 ` Ian Jackson
2017-02-24 17:04   ` ankostis
2017-02-24 17:23   ` Jason Cooper
2017-02-25 23:22     ` ankostis
2017-02-24 17:32   ` Junio C Hamano
2017-02-24 17:45     ` David Lang
2017-02-24 18:14       ` Junio C Hamano
2017-02-24 18:58         ` Stefan Beller
2017-02-24 19:20           ` Junio C Hamano
2017-02-24 20:05             ` ankostis
2017-02-24 20:32               ` Junio C Hamano
2017-02-25  0:31                 ` ankostis
2017-02-26  0:16                   ` Jason Cooper
2017-02-26 17:38                     ` brian m. carlson
2017-02-26 19:11                       ` Linus Torvalds
2017-02-26 21:38                         ` Ævar Arnfjörð Bjarmason
2017-02-26 21:52                           ` Jeff King
2017-02-27 13:00                             ` Transition plan for git to move to a new hash function Ian Jackson
2017-02-27 14:37                               ` Why BLAKE2? Markus Trippelsdorf
2017-02-27 15:42                                 ` Ian Jackson
2017-02-27 19:26                               ` Transition plan for git to move to a new hash function Tony Finch
2017-02-28 21:47                               ` brian m. carlson
2017-03-02 18:13                                 ` Ian Jackson
2017-03-04 22:49                                   ` brian m. carlson
2017-03-05 13:45                                     ` Ian Jackson
2017-03-05 23:45                                       ` brian m. carlson
2017-02-24 20:05             ` SHA1 collisions found Junio C Hamano
2017-02-24 20:33           ` Philip Oakley
2017-02-24 23:39     ` Jeff King
2017-02-25  0:39       ` Linus Torvalds [this message]
2017-02-25  0:54         ` Linus Torvalds
2017-02-25  1:16         ` Jeff King
2017-02-26 18:55           ` Junio C Hamano
2017-02-25  6:10         ` Junio C Hamano
2017-02-26  1:13           ` Jason Cooper
2017-02-26  5:18             ` Jeff King
2017-02-26 18:30               ` brian m. carlson
2017-03-02 21:46               ` Brandon Williams
2017-03-03 11:13                 ` Jeff King
2017-03-03 14:54                   ` Ian Jackson
2017-03-03 22:18                     ` Jeff King
2017-03-02 19:55         ` Linus Torvalds
2017-03-02 20:43           ` Junio C Hamano
2017-03-02 21:21             ` Linus Torvalds
2017-03-02 21:54               ` Joey Hess
2017-03-02 22:27                 ` Linus Torvalds
2017-03-03  1:50                   ` Mike Hommey
2017-03-03  2:19                     ` Linus Torvalds
2017-03-03 11:04           ` Jeff King
2017-03-03 21:47           ` Stefan Beller
2017-02-25  1:00       ` David Lang
2017-02-25  1:15         ` Stefan Beller
2017-02-25  1:21         ` Jeff King
2017-02-25  1:39           ` David Lang
2017-02-25  1:47             ` Jeff King
2017-02-25  1:56               ` David Lang
2017-02-25  2:28             ` Jacob Keller
2017-02-25  2:26           ` Jacob Keller
2017-02-25  5:39             ` grarpamp
2017-02-24 23:43     ` Ian Jackson
2017-02-25  0:06       ` Ian Jackson
2017-02-25 18:50     ` brian m. carlson
2017-02-25 19:26       ` Jeff King
2017-02-25 22:09         ` Mike Hommey
2017-02-26 17:38           ` brian m. carlson
2017-02-24 22:47 ` Jakub Narębski
2017-02-24 22:53   ` Santiago Torres
2017-02-24 23:05     ` Jakub Narębski
2017-02-24 23:24       ` Øyvind A. Holm
2017-02-24 23:06   ` Jeff King
2017-02-24 23:35     ` Jakub Narębski
2017-02-25 22:35     ` Lars Schneider
2017-02-26  0:46       ` Jeff King
2017-02-26 18:22         ` Junio C Hamano
2017-02-26 18:57     ` Thomas Braun
2017-02-26 21:30       ` Jeff King
2017-02-27  9:57         ` Geert Uytterhoeven
2017-02-27 10:43           ` Jeff King
2017-02-27 12:39             ` Morten Welinder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+55aFw6BLjPK-F0RGd9LT7X5xosKOXOxuhmKX65ZHn09r1xow@mail.gmail.com \
    --to=torvalds@linux-foundation.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=id@joeyh.name \
    --cc=ijackson@chiark.greenend.org.uk \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).