On Sun, Jul 17, 2016 at 10:01:38AM +0200, Johannes Schindelin wrote:
> Out of curiosity: have you considered something like padding the SHA-1s
> with, say 0xa1, to the size of the new hash and using that padding to
> distinguish between old vs new hash?

I'm going to end up having to do something similar because of the issue
of submodules.  Submodules may still be SHA-1, while the main repo may
be a newer hash.  I was going to zero-pad, however.  I was also, at
least at first, going to force a separate .git dir for those, to avoid
having to try to store two separate types of objects in the same repo.
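
As a rough sketch of the padding idea (not what the refactor does
today): pad a raw 20-byte SHA-1 to the 32-byte width of a 256-bit hash
and treat an all-padding tail as "this is SHA-1".  The names pad_sha1
and looks_like_padded_sha1 are made up for illustration, and the pad
byte is a parameter so either zero or 0xa1 can be used.

```python
import hashlib

SHA1_LEN = 20      # bytes in a raw SHA-1 digest
NEW_HASH_LEN = 32  # bytes in a 256-bit digest such as SHA3-256


def pad_sha1(raw: bytes, pad_byte: int = 0x00) -> bytes:
    """Pad a raw SHA-1 digest to the width of the new hash (hypothetical helper)."""
    if len(raw) != SHA1_LEN:
        raise ValueError("expected a 20-byte SHA-1 digest")
    return raw + bytes([pad_byte]) * (NEW_HASH_LEN - SHA1_LEN)


def looks_like_padded_sha1(oid: bytes, pad_byte: int = 0x00) -> bool:
    """Heuristic: a full-width ID whose last 12 bytes are all padding."""
    padding = bytes([pad_byte]) * (NEW_HASH_LEN - SHA1_LEN)
    return len(oid) == NEW_HASH_LEN and oid[SHA1_LEN:] == padding


sha1_oid = hashlib.sha1(b"blob 0\0").digest()
padded = pad_sha1(sha1_oid)
```

The check is only a heuristic: a genuine 256-bit digest ends in twelve
particular bytes with probability 2^-96, and nothing here says which
256-bit algorithm a non-padded ID came from.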

The other limitation with this is that the length of a hash doesn't tell
you which algorithm produced it.  For example, I plan on implementing
SHA3-256, but it's also possible I might add BLAKE2b-256 for people for
whom SHA3-256 is too slow.  Both produce 256-bit output, so there's no
way to distinguish between those two algorithms by length alone.  Thus
allowing multiple hashes in the same repo won't work without a format
byte.

What I might do, however, is add multihash-style format information to
the on-disk format for non-SHA-1 repos.  Then SHA-1 compatibility could
come in a future iteration.  That would be compatible with the existing
refactor.
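
To make the format-byte point concrete, here is a minimal sketch of a
multihash-style prefix: one byte naming the algorithm, one byte giving
the digest length, then the digest.  The tag values 0x01 and 0x02 and
the function names are placeholders I picked for the example, not the
real multihash table and not a proposed on-disk format.

```python
import hashlib

# Hypothetical one-byte algorithm tags; values chosen only for this sketch.
FORMATS = {
    0x01: ("sha3-256", lambda data: hashlib.sha3_256(data).digest()),
    0x02: ("blake2b-256", lambda data: hashlib.blake2b(data, digest_size=32).digest()),
}


def tagged_oid(tag: int, data: bytes) -> bytes:
    """Hash data with the algorithm for `tag` and prepend tag and length bytes."""
    _name, hash_fn = FORMATS[tag]
    digest = hash_fn(data)
    return bytes([tag, len(digest)]) + digest


def parse_tagged_oid(oid: bytes):
    """Split a tagged ID into (algorithm name, raw digest), validating the length."""
    tag, length = oid[0], oid[1]
    if tag not in FORMATS:
        raise ValueError("unknown hash format byte")
    digest = oid[2:]
    if len(digest) != length:
        raise ValueError("digest length does not match length byte")
    return FORMATS[tag][0], digest


a = tagged_oid(0x01, b"example")
b = tagged_oid(0x02, b"example")
```

Both raw digests are 32 bytes, so without the leading byte the two IDs
could not be told apart; with it, parse_tagged_oid recovers the
algorithm name for each.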

> I guess that it would also be possible to introduce an opt-in "legacy mapper"
> which would generate a mapping locally of all objects' SHA-1 to whatever
> new hash you choose. Generating it locally would side-step the security
> issues of the SHA-1 algorithm. We would need to teach Git to pick that
> mapping up if available and use it, of course.

I think that might be easier.  Considering the number of tests that
hard-code object names, I might need that for the test suite.
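
A minimal sketch of what such a locally generated mapping could look
like, assuming the new hash is computed over the same "<type> <size>\0"
header plus content that SHA-1 object names use today.  That assumption,
the choice of SHA3-256, and the names object_id and build_legacy_map are
mine for illustration.  It also only makes sense as-is for blobs, since
trees and commits embed other object names in their content.

```python
import hashlib


def object_id(algo: str, obj_type: str, content: bytes) -> str:
    """Hash an object the way Git names loose objects: header, NUL, content."""
    header = f"{obj_type} {len(content)}".encode() + b"\0"
    return hashlib.new(algo, header + content).hexdigest()


def build_legacy_map(objects, new_algo: str = "sha3_256") -> dict:
    """Map each object's SHA-1 name to its name under new_algo (hypothetical)."""
    return {
        object_id("sha1", obj_type, content): object_id(new_algo, obj_type, content)
        for obj_type, content in objects
    }


legacy = build_legacy_map([("blob", b""), ("blob", b"hello\n")])
# A test that hard-codes the empty blob's SHA-1 name could look it up here:
new_name = legacy["e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"]
```

Because the table is computed from content already in the local repo,
nothing has to trust a SHA-1 name received from elsewhere.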
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204