mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <>
To: Jonathan Nieder <>
Cc: "Eric S. Raymond" <>,
	Jakub Narebski <>,
Subject: Re: RFC: Separate commit identification from Merkle hashing
Date: Thu, 23 May 2019 23:50:24 +0200	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

On Thu, May 23 2019, Jonathan Nieder wrote:

> Eric S. Raymond wrote:
>> Jakub Narebski <>:
>>> Currently Git makes use of the fact that SHA-1 and SHA-256 identifiers
>>> are of different lengths to distinguish them (see section "Meaning of
>>> signatures") in Documentation/technical/hash-function-transition.txt
>> That's the obvious hack.  As a future-proofing issue, though, I think
>> it would be unwise to count on all future hashes being of distinguishable
>> lengths.
> We're not counting on that.  As discussed in that section, future
> hashes can change the format.

I think both of you are also missing something that's implicit (but
unfortunately not very explicitly talked about) in that document, which
is that such hash transitions are assumed to have an out-of-bounds
temporal component to them.

I.e. let's assume that instead of SHA-256 we're switching to SHA-X,
which like SHA-1 is also a 20 byte hash function, so they're the same

You'd then get a git.git with SHA-1 today, next year you'd have A
SHA-1<->SHA-X mapping table, but the year after that we'd be fully on
SHA-X for new content.

So even though we carry code and lookup table for looking up the old
SHA-1 values we're not going to continue to pointlessly generate that
bidirectional mapping forever. We'll have some sort of gravestone marker
where we say "past this point it's SHA-X only".

That's not implemented or specified yet, but could e.g. be a magic ref
of some sort advertised by the server, and the client would enforce that
such a marker could only be made with the stronger hash function.

Thus a couple of years after that the SHA-1 -> SHA-X transition someone
generating a colliding tag where a new good SHA-X tag *could* point to
bad SHA-1 content won't be exploitable in practice. At that point
clients won't be downloading SHA-1'd content or generating the mapping
table anymore.

So I don't see why a format change for the tags is needed, it would only
matter *if* we have a full collision *and* the hashes are the same
length (which we have no plan for), *and* if we assume we don't have
some other mitigations in play.

      parent reply	other threads:[~2019-05-23 21:50 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-21  1:32 RFC: Separate commit identification from Merkle hashing Eric S. Raymond
2019-05-21  1:57 ` Jonathan Nieder
2019-05-21  2:38   ` Eric S. Raymond
2019-05-21  2:58     ` Jonathan Nieder
2019-05-21  3:31       ` Eric S. Raymond
2019-05-23 19:09 ` Jakub Narebski
2019-05-23 20:09   ` Jonathan Nieder
2019-05-23 20:53     ` Eric S. Raymond
2019-05-23 20:50   ` Eric S. Raymond
2019-05-23 20:54     ` Jonathan Nieder
2019-05-23 21:19       ` Eric S. Raymond
2019-05-23 21:39         ` Randall S. Becker
2019-05-23 21:50       ` Ævar Arnfjörð Bjarmason [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

  List information:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).