On 2022-06-24 at 10:52:36, Jeff King wrote:
> On Wed, Jun 22, 2022 at 12:29:59AM +0000, brian m. carlson wrote:
>
> > > We've since migrated our default hash function from SHA-1 to SHA-1DC
> > > (except on vanilla OSX, see [2]). It's a variant SHA-1 that detects the
> > > SHAttered attack implemented by the same researchers. I'm not aware of a
> > > current viable SHA-1 collision against the variant of SHA-1 that we
> > > actually use these days.
> >
> > That's true, but that still doesn't let you store the data. There is
> > some data that you can't store in a SHA-1 repository, and SHA-1DC is
> > extremely slow. Using SHA-256 can make things like indexing packs
> > substantially faster.
>
> I'm curious if you have numbers on this. I naively converted linux.git
> to sha256 by doing "fast-export | fast-import" (the latter in a sha256
> repo, of course, and then both repacked with "-f --window=250" to get
> reasonable apples-to-apples packs).

I did the same thing, except I just did a regular gc and not a custom
repack, and I created both a SHA-1 and SHA-256 repo from the same
original.

> Running "index-pack --verify" on the result takes about the same time
> (this is on an 8-core system, hence the real/user differences):
>
>   [sha1dc]
>   real    2m43.754s
>   user    10m52.452s
>   sys     0m36.745s
>
>   [sha256]
>   real    2m41.884s
>   user    12m23.344s
>   sys     0m35.222s

Here are my results:

[sha256]
time ~/checkouts/git/git index-pack --verify .git/objects/pack/pack-*.pack
~/checkouts/git/git index-pack --verify .git/objects/pack/pack-*.pack  2768.42s user 181.00s system 185% cpu 26:31.70 total

[sha1dc]
time ~/checkouts/git/git index-pack --verify .git/objects/pack/pack-*.pack
~/checkouts/git/git index-pack --verify .git/objects/pack/pack-*.pack  3041.28s user 184.84s system 199% cpu 26:54.74 total

Note that in my case, I'm using an accelerated hardware-based SHA-256
implementation (Nettle, which I will send a patch for soon).
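As a rough way to compare the two runs above on total CPU time (user plus
system) rather than wallclock time, here is a minimal Python sketch. The
numbers are copied from the timing output above; the helper names
(total_cpu, percent_saved) are made up for illustration and are not part
of Git.

```python
# CPU seconds copied from the two `time` outputs quoted above.
runs = {
    "sha256": {"user": 2768.42, "system": 181.00},
    "sha1dc": {"user": 3041.28, "system": 184.84},
}


def total_cpu(run):
    """Total CPU seconds consumed: user time plus system time."""
    return run["user"] + run["system"]


def percent_saved(baseline, candidate):
    """Percentage of the baseline's CPU time that the candidate saves."""
    return 100.0 * (baseline - candidate) / baseline


sha1dc_cpu = total_cpu(runs["sha1dc"])
sha256_cpu = total_cpu(runs["sha256"])
saving = percent_saved(sha1dc_cpu, sha256_cpu)

print(f"sha1dc: {sha1dc_cpu:.2f}s  sha256: {sha256_cpu:.2f}s  saving: {saving:.1f}%")
```

With these inputs the saving comes out a little under 9% of the SHA-1DC
run's CPU time.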
This is a brand new ThinkPad X1 Carbon Gen 10 with an i7-1280P (with 20
"cores" of different sizes).

So this is about 9% faster in terms of total CPU usage on SHA-256 with
that implementation. The wallclock time is less impressive here.

Of course, it might be slower in software, but considering that AMD has
had SHA-NI for some time, newer Intel processors have it, and ARM also
has SHA-2 acceleration instructions, it's likely it will be faster on
most recent machines assuming it's compiled appropriately.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA