This session was led by brian m. carlson. Supporting cast: Jonathan "jrnieder" Nieder, Derrick Stolee, Johannes "Dscho" Schindelin, Toon Claes, and Ævar Arnfjörð Bjarmason.

Notes:

 1. Summarizing where we are with what merged:
     1. We have full SHA256 support \o/
     2. Some minor glitches, updated the docs to reflect that
     3. It works. 2.30 is a good state.
     4. None of the major forges support it yet but that will come
 2. Interop between SHA256 repositories and SHA1 repositories
     1. Take each object we receive over the wire or create locally and give it a sha1 value as well
     2. We have a giant loose object index that maps sha256-id and sha1-id values. Hashmap
     3. Will be changed to some tree to allow prefix mapping
     4. index-pack has to take two passes over the pack, because you can’t map a commit before you’ve mapped the tree it points to (or more generally can’t map an object before the objects it references)
     5. Fortunately blobs don’t point to any other objects so this is relatively quick
 3. Submodules are tricky
     1. They come from a different repository so we don’t have anything to map to
     2. What I’m currently doing is requiring that the submodule be present locally and storing the mapping separately in the superproject
     3. The mapping isn’t sent over the wire. That could create some complexity around malicious histories
 4. For the same reason we don’t have partial clone working
     1. Might require an on-disk format bump
     2. jrnieder: taking a step back, the hash verifies the full history via the Merkle tree property.
     3. However, with partial clone we already relax this: it is no longer verifiable, locally.
     4. Therefore, we place a lot of trust on the server.
     5. The server could tell us more information about the edge commits, e.g. SHA-1<->SHA-256 mapping
     6. Stolee: if I am sha256 client, that’s what I want, you kind of decide up front what you want
     7. jrnieder: at $DAYJOB common partial clone scenario is triangular workflow
     8. Stolee: how likely are the multiple hosts not homogenous (all SHA-1, or all SHA-256)?
     9. brian: Valuable to be able to work in SHA256 and refer in input+output to SHA-1. If someone refers to a SHA-1, you still want to be able to see what they’re referring to, to interact with other people, even though SHA-1 is insecure
 5. Multi-pack index: doesn’t work, but won’t be hard to fix
 6. We write signatures for both objects. When you “git commit --gpg-sign”, it can sign in both formats
     1. Verifies in current format
 7. Timeframe for hosting providers moving to SHA256
     1. Dscho: should we have a multi provider meeting and coordinate that? Could be everyone waiting for others
     2. brian: cgit supports SHA256 already, allows self-hosting
     3. jrnieder: with interop, individuals can use SHA256 against servers that only support SHA-1. Then that creates pressure for the servers to support SHA256 for performance reasons
     4. brian: interop doesn’t exist yet. If GitHub decides I work on that for the next two months, I think I could do it. But requires the code getting written.
     5. Toon: we at gitlab have sha256 on our radar, but with a very low prio https://gitlab.com/groups/gitlab-org/-/epics/794
 8. jrnieder: Signing: very old Git versions won’t know to invalidate them when I commit --amend. How old is “very old”?
     1. brian: somewhere between 2.20 and 2.28. In 2.20 started treating everything with “gpgsig” at start as a potential signature.
     2. There were a couple of bugs I fixed in 2.30, working on signature interoperability. Tested with sha256.
     3. Updated the transition plan: in tags, the trailing signature is always the current signature, other ones go in the header.
 9. Updating other hosting provider glue to support sha256
     1. jrnieder: e.g. GitHub API, UIs, …. Is it hard, similar to the Git part, or a little easier?
     2. brian: hardest part is libgit2. Lots of hardcoded oids in its testsuite
     3. Libraries tend to be the hardest piece --- e.g. Gerrit will need JGit updates
     4. Dscho: gitk also has some references to hardcoded 40-length
     5. Ævar: some patches on the mailing list for gitk and git-gui to adapt them, from Carlos
     6. brian: hopefully the ecosystem learns from this experience and doesn’t just hardcode 64 here :)
10. Interop code only supports 2 algorithms. Hopefully finish this transition before we need the next one :)
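
To illustrate the ordering constraint from item 2 (an object can only be mapped once the objects it references are mapped), here is a minimal Python sketch. It is not Git's implementation: the dict-based object model, the names `oid`, `serialize`, `references` and `build_mapping`, and the simplified commit body (no author/committer lines, no tree entry sorting) are assumptions made for illustration. Only the "<type> <size>\0" header hashing and the tree entry layout follow Git's object format.

```python
import hashlib


def oid(kind, body, algo):
    """Hash '<kind> <size>\\0' + body, as Git does for object IDs."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.new(algo, header + body).hexdigest()


def references(obj):
    """Names of the objects this (toy) object points to."""
    if obj["kind"] == "tree":
        return [child for _, _, child in obj["entries"]]
    if obj["kind"] == "commit":
        return [obj["tree"], *obj["parents"]]
    return []  # blobs reference nothing


def serialize(obj, ids):
    """Serialize a toy object, embedding referenced IDs taken from `ids`."""
    if obj["kind"] == "blob":
        return obj["data"]
    if obj["kind"] == "tree":
        # Tree entry: '<mode> <name>\0' followed by the raw (binary) oid.
        return b"".join(
            f"{mode} {name}\0".encode() + bytes.fromhex(ids[child])
            for mode, name, child in obj["entries"]
        )
    if obj["kind"] == "commit":
        # Simplified: real commits also carry author and committer lines.
        lines = [f"tree {ids[obj['tree']]}"]
        lines += [f"parent {ids[p]}" for p in obj["parents"]]
        lines += ["", obj["message"]]
        return "\n".join(lines).encode()
    raise ValueError(f"unknown object kind: {obj['kind']}")


def build_mapping(objects):
    """Return {sha256_oid: sha1_oid} for a dict of name -> toy object."""
    sha1_ids, sha256_ids = {}, {}
    pending = []
    # First phase: blobs have no references, so hash them right away.
    for name, obj in objects.items():
        if obj["kind"] == "blob":
            sha1_ids[name] = oid("blob", obj["data"], "sha1")
            sha256_ids[name] = oid("blob", obj["data"], "sha256")
        else:
            pending.append(name)
    # Second phase: hash an object only after all its referents are mapped.
    while pending:
        ready = [n for n in pending
                 if all(r in sha1_ids for r in references(objects[n]))]
        if not ready:
            raise ValueError("missing or cyclic references")
        for name in ready:
            obj = objects[name]
            sha1_ids[name] = oid(obj["kind"], serialize(obj, sha1_ids), "sha1")
            sha256_ids[name] = oid(obj["kind"], serialize(obj, sha256_ids), "sha256")
        pending = [n for n in pending if n not in sha1_ids]
    return {sha256_ids[n]: sha1_ids[n] for n in objects}


objects = {
    "readme": {"kind": "blob", "data": b"hello\n"},
    "root": {"kind": "tree", "entries": [("100644", "README", "readme")]},
    "c1": {"kind": "commit", "tree": "root", "parents": [], "message": "initial\n"},
}
mapping = build_mapping(objects)
for sha256_id, sha1_id in mapping.items():
    print(sha256_id, "->", sha1_id)
```

The same object gets two different serializations because the IDs it embeds differ per algorithm, which is why the tree cannot be given a sha1 value until the blob's sha1 value is known, and the commit cannot be mapped until the tree is.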