On Mon, Jul 18, 2016 at 09:00:06AM +0200, Johannes Schindelin wrote:
> Hi Brian,
>
> On Sun, 17 Jul 2016, brian m. carlson wrote:
>
> > On Sun, Jul 17, 2016 at 10:01:38AM +0200, Johannes Schindelin wrote:
> > > Out of curiosity: have you considered something like padding the SHA-1s
> > > with, say 0xa1, to the size of the new hash and using that padding to
> > > distinguish between old vs new hash?
> >
> > I'm going to end up having to do something similar because of the issue
> > of submodules. Submodules may still be SHA-1, while the main repo may
> > be a newer hash. I was going to zero-pad, however.
>
> I thought about zero-padding, but there are plenty of
> is_null_sha1()/is_null_oid() calls around. Of course, I assumed
> left-padding. But you may have thought of right-padding instead? That
> would make short name handling much easier, too.

I was going to right-pad.

> FWIW it never crossed my mind to allow different same-sized hash
> algorithms. So I never thought we'd need a way to distinguish, say,
> BLAKE2b-256 from SHA-256.
>
> Is there a good reason to add the maintenance burden of several 256-bit
> hash algorithms, apart from speed (which in my mind should decide which
> one to use, always, rather than letting the user choose)? It would also
> complicate transport even further, let alone subtree merges from
> differently-hashed repositories.

There are really three candidates:

* SHA-256 (the SHA-2 algorithm): While this looks good right now,
  cryptanalysis is advancing. This is not a good choice for a long-term
  solution.
* SHA3-256 (the SHA-3 algorithm): This is the conservative choice. It's
  also faster than SHA-256 on 64-bit systems. It has a very
  conservative security margin and is a good long-term choice.
* BLAKE2b-256: This is the blisteringly fast choice. It outperforms
  SHA-1 and even MD5 on 64-bit systems. This algorithm was designed so
  that nobody would have a reason to use an insecure algorithm.
  It will probably be secure for some time, but maybe not as long as
  SHA3-256.

I'm only considering 256-bit hashes, because anything longer won't fit
on an 80-column terminal in hex form.

The reason I had considered implementing both SHA3-256 and BLAKE2b-256
is that I want there to be no reason not to upgrade. People who need a
FIPS-approved algorithm or want a long-term, conservative choice should
use SHA3-256. People who want even better performance than current Git
would use BLAKE2b-256.

Performance comparison (my implementations):

SHA-1:    437 MiB/s
SHA-256:  196 MiB/s
SHA3-256: 273 MiB/s
BLAKE2b:  649 MiB/s

I hadn't thought about subtree merges, though.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204
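
Illustrative note on the right-padding idea discussed in the message: the
following is only a rough sketch, not Git's implementation. The names
pad_sha1, looks_like_padded_sha1 and hex_width are hypothetical and do not
exist in Git. It assumes a 20-byte SHA-1 is right-padded with zero bytes to
32 bytes, which keeps the hex prefix (and so abbreviated names) unchanged,
and it shows why a 256-bit ID fits on an 80-column terminal in hex while a
512-bit one would not.

```python
import hashlib

SHA1_LEN = 20      # bytes in a SHA-1 object ID
NEW_HASH_LEN = 32  # bytes in a 256-bit object ID (assumed new size)
PAD = b"\x00" * (NEW_HASH_LEN - SHA1_LEN)


def pad_sha1(sha1_digest: bytes) -> bytes:
    """Right-pad a 20-byte SHA-1 digest with zero bytes to 32 bytes."""
    if len(sha1_digest) != SHA1_LEN:
        raise ValueError("expected a 20-byte SHA-1 digest")
    return sha1_digest + PAD


def looks_like_padded_sha1(oid: bytes) -> bool:
    """Heuristic check: a 32-byte ID whose last 12 bytes are all zero.

    A genuine 256-bit digest would end in 12 zero bytes only with
    probability 2**-96, so this is a guess, not a proof.
    """
    return len(oid) == NEW_HASH_LEN and oid[SHA1_LEN:] == PAD


def hex_width(bits: int) -> int:
    """Number of hex characters needed to print a hash of `bits` bits."""
    return bits // 4


if __name__ == "__main__":
    old = hashlib.sha1(b"example").digest()
    padded = pad_sha1(old)
    # Right-padding leaves the leading hex digits untouched, so an
    # abbreviated name of the SHA-1 is still a prefix of the padded ID.
    print(padded.hex().startswith(old.hex()[:7]))
    print(hex_width(256), hex_width(512))
```

With left-padding the leading hex digits would all be zero, so
abbreviations would no longer identify the object; that is the short-name
concern raised in the quoted text.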