git@vger.kernel.org mailing list mirror (one of many)
From: Jason Cooper <jason@lakedaemon.net>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	demerphq <demerphq@gmail.com>,
	Brandon Williams <bmwill@google.com>,
	Junio C Hamano <gitster@pobox.com>,
	Jonathan Nieder <jrnieder@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	Stefan Beller <sbeller@google.com>,
	Jonathan Tan <jonathantanmy@google.com>,
	Jeff King <peff@peff.net>, David Lang <david@lang.hm>,
	"brian m. carlson" <sandals@crustytoothpaste.net>
Subject: Re: RFC v3: Another proposed hash function transition plan
Date: Mon, 2 Oct 2017 14:00:11 +0000
Message-ID: <20171002140011.GE31762@io.lakedaemon.net>
In-Reply-To: <alpine.DEB.2.21.1.1709262356360.40514@virtualbox>

Hi Johannes,

Thanks for the response.  Sorry for the delay.  Had a large deadline for
$dayjob.

On Wed, Sep 27, 2017 at 12:11:14AM +0200, Johannes Schindelin wrote:
> On Tue, 26 Sep 2017, Jason Cooper wrote:
> > On Thu, Sep 14, 2017 at 08:45:35PM +0200, Johannes Schindelin wrote:
> > > On Wed, 13 Sep 2017, Linus Torvalds wrote:
> > > > On Wed, Sep 13, 2017 at 6:43 AM, demerphq <demerphq@gmail.com> wrote:
> > > > > SHA3 however uses a completely different design where it mixes a 1088
> > > > > bit block into a 1600 bit state, for a leverage of 2:3, and the excess
> > > > > is *preserved between each block*.
> > > > 
> > > > Yes. And considering that the SHA1 attack was actually predicated on
> > > > the fact that each block was independent (no extra state between), I
> > > > do think SHA3 is a better model.
> > > > 
> > > > So I'd rather see SHA3-256 than SHA256.
> > 
> > Well, for what it's worth, we need to be aware that SHA3 is *different*.
> > In crypto, "different" = "bugs haven't been found yet".  :-P
> > 
> > And SHA2 is *known*.  So we have a pretty good handle on how it'll
> > weaken over time.
> 
> Here, you seem to agree with me.

Yep.

> > > SHA-256 got much more cryptanalysis than SHA3-256, and apart from the
> > > length-extension problem that does not affect Git's usage, there are no
> > > known weaknesses so far.
> > 
> > While I think that statement is true on its face (particularly when
> > including post-competition analysis), I don't think it's sufficient
> > justification to choose one over the other.
> 
> And here you don't.
> 
> I find that very confusing.

What I'm saying is that there is more to selecting a hash function for
git than just the cryptographic assessment.  In fact I would argue that
the primary cryptographic concern for git is "What is the likelihood
that we'll wake up one day to full collisions with no warning?"

To that, I'd argue that SHA-256's time in the field and SHA3-256's
competition give them both passing marks in that regard.  fwiw, I'd also
put Blake and Skein in there as well.

The chance that any of those will suffer sudden, catastrophic failure is
minimal.  IOW, we'll have warnings, and time to migrate to the next
function.

None of us can predict the future, but having a significant amount of
vetting reduces the chances of catastrophic failure.

> > > It would seem that the experts I talked to were much more concerned about
> > > that amount of attention than the particulars of the algorithm. My
> > > impression was that the new features of SHA3 were less studied than the
> > > well-known features of SHA2, and that the new-ness of SHA3 is not
> > > necessarily a good thing.
> > 
> > The only thing I really object to here is the abstract "experts".  We're
> > talking about cryptography and integrity here.  It's no longer
> > sufficient to cite anonymous experts.  Either they can put their
> > thoughts, opinions and analysis on record here, or it shouldn't be
> > considered.  Sorry.
> 
> Sorry, you are asking cryptography experts to spend their time on the Git
> mailing list. I tried to get them to speak out on the Git mailing list.
> They respectfully declined.

Ok, fair enough.  Just please understand that it's difficult to place
much weight on statements that we can't discuss with the person who made
them.

> > However, whether we choose SHA2 or SHA3 doesn't matter.
> 
> To you, it does not matter.

Well, I'd say it does not matter for *most* users.

> To me, it matters. To the several thousand developers working on Windows,
> probably the largest Git repository in active use, it matters. It matters
> because the speed difference that has little impact on you has a lot more
> impact on us.

Ahhh, so if I understand you correctly, you'd prefer SHA-256 over
SHA3-256 because it's more performant for your use case?  Well, that's a
completely different animal than cryptographic suitability.

Have you been able to crunch numbers yet?  Will you be able to share
some empirical data?  I'd love to see some comparisons between SHA1,
SHA-256, SHA512-256, and SHA3-256 for different git operations under
your work load.
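
As a rough starting point for that kind of comparison, something like
the Python sketch below could be used.  It is not taken from this
thread: the names blob_id and time_hashes are made up for illustration,
the 1 MiB payload is synthetic, and Python's hashlib is not the
implementation git itself uses, so any numbers would only be indicative.
SHA512-256 is included only if the local OpenSSL build exposes it.

```python
import hashlib
import timeit

# Candidate functions named in this thread; SHA512-256 is added only
# when the local hashlib/OpenSSL build makes it available.
CANDIDATES = ["sha1", "sha256", "sha3_256"]
if "sha512_256" in hashlib.algorithms_available:
    CANDIDATES.append("sha512_256")


def blob_id(data: bytes, algo: str = "sha1") -> str:
    """Hash data with git's blob framing: b'blob <size>' + NUL + content."""
    h = hashlib.new(algo)
    h.update(b"blob %d\0" % len(data))
    h.update(data)
    return h.hexdigest()


def time_hashes(data: bytes, repeat: int = 20) -> dict:
    """Return seconds spent hashing data `repeat` times, per algorithm."""
    return {
        algo: timeit.timeit(lambda a=algo: blob_id(data, a), number=repeat)
        for algo in CANDIDATES
    }


if __name__ == "__main__":
    payload = b"x" * (1 << 20)  # 1 MiB of synthetic data, not a real workload
    for algo, secs in time_hashes(payload).items():
        print(f"{algo:12s} {secs:.4f}s")
```

Timing real git operations (status, commit, fsck) on a real repository
would say far more than a micro-benchmark like this one.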

> > If SHA3 is chosen as the successor, it's going to get a *lot* more
> > adoption, and thus, a lot more analysis.  If cracks start to show, the
> > hard work of making git flexible is already done.  We can migrate to
> > SHA4/5/whatever in an orderly fashion with far less effort than the
> > transition away from SHA1.
> 
> Sure. And if XYZ789 is chosen, it's going to get a *lot* more adoption,
> too.
> 
> We think.
> 
> Let's be realistic. Git is pretty important to us, but it is not important
> enough to sway, say, Intel into announcing hardware support for SHA3.
> And if you try to force through *any* hash function only so that it gets
> more adoption and hence more support,

That's quite a jump from what I was saying.  I would never advise using
code in a production setting just to increase adoption.

What I /was/ saying: Let's say you don't get what you want, and SHA3-256
is chosen.  It's not the end of the world from a cryptographic PoV.
The hard work of making the git (and libgit2) codebases hash-flexible is
already done.  So, if you're correct, and SHA3 was too immature, the
increased visibility will help us discover that more quickly.  And, the
code will already be in a position to conduct an orderly migration.

Will it still be costly?  Yes.  But I would argue that it's naive to
think that we will be using git/sha3-256 or git/sha-256 10 to 15 years
from now.  It might be git, it might not.  But there *will* be another
migration of existing data (code, history, etc) from one object storage
model to another.  It might be git/SHA4-512, or hg/sha4-384.

So, we aren't looking for a perfect hash function in the naive belief
that we'll never have to change again.  Rather, we're choosing the next
hash function so that we can hold off another migration for as long as
possible.  After all, SHA4-512 doesn't exist yet. ;-)

> in the short run you will make life
> harder for developers on more obscure platforms, who may not easily get
> high-quality, high-speed implementations of anything but the very
> mainstream (which is, let's face it, MD5, SHA-1 and SHA-256). I know I
> would have cursed you for such a decision back when I had to work on AIX
> and IRIX.

I think you're assuming that all developers on obscure platforms have
a use case similar to your current one.  I've not heard of that being
the case.

> > For my use cases, as a user of git, I have a plan to maintain provable
> > integrity of existing objects stored in git under sha1 while migrating
> > away from sha1.  The same plan works for migrating away from SHA2 or
> > SHA3 when the time comes.
> 
> Please do not make the mistake of taking your use case to be a template
> for everybody's use case.

I wasn't.  But I will argue that my use case is valid.  Just as yours is.

> Migrating a large team away from any hash function to another one *will*
> be painful, and costly.

Assuming that it will never happen again would make that doubly costly.

> Migrating will be very costly for hosting companies like GitHub, Microsoft
> and BitBucket, too.

<with_my_business_hat_on>
GitHub and BitBucket have git as the core of their business model.  If
they aren't keeping an eye on the future path of git and maintaining
migration plans, shame on them.
</with_my_business_hat_on>

Thanks,

Jason.
