From: "H. Peter Anvin" <hpa@zytor.com>
To: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Migrating away from SHA-1?
Date: Tue, 12 Apr 2016 18:38:25 -0700
Message-ID: <570DA311.3000500@zytor.com>
In-Reply-To: <xmqqlh4imibd.fsf@gitster.mtv.corp.google.com>

On 04/12/16 18:03, Junio C Hamano wrote:
>>
>> and so on. Of course trees don't have any space for this; they have a
>> fixed-length for the hash part of each record, which is basically:
>>
>> <mode> <name> NUL <20-byte-sha1>
>>
>> So we'd probably need a "treev2" object type that gives room for an
>> algorithm byte (or we'd have to try to shove it into the mode, but since
>> old versions won't know the new algorithm anyway, I don't think it
>> solves that much...). Or you can just define for the whole tree object
>> (either implicit in its type, or in a header) that it always uses
>> algorithm X.
>
> This will hurt the performance a lot during the transition period as
> it no longer will be possible to rely on "most of the time a fine
> grained commit changes only a small part of the tree, and we can
> cheaply avoid descending into trees that haven't changed because we
> can tell that the corresponding tree objects in the pre- and post-
> trees have the same object name" optimization. But we cannot avoid
> it.
>

Not really, because a new tree can point to the algoX hash even for
existing objects, so unchanged subtrees still compare equal by name.

Perhaps the tree object can carry a format descriptor at the beginning,
something like:

<invalid mode number> <hash format used>
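
To make the descriptor idea concrete, here is a rough sketch of a reader
that accepts both layouts. Everything specific in it is an assumption for
illustration only: the mode "0" standing in for the invalid mode number,
the algorithm names, the HASH_SIZES table and the parse_tree() helper are
all hypothetical, not an existing or proposed Git format.

```python
# Hypothetical digest sizes per algorithm name (assumption, not a Git format).
HASH_SIZES = {"sha1": 20, "sha256": 32}


def parse_tree(data: bytes):
    """Parse '<mode> <name> NUL <hash>' records.

    If the tree starts with the hypothetical descriptor b"0 <algo>\\0",
    every following record uses that algorithm's hash length; otherwise
    the records are assumed to carry 20-byte SHA-1 hashes.
    """
    algo = "sha1"
    pos = 0
    # "0" is never a valid entry mode, so it can mark the descriptor.
    if data.startswith(b"0 "):
        end = data.index(b"\0")
        algo = data[2:end].decode()
        pos = end + 1
    if algo not in HASH_SIZES:
        raise ValueError(f"unknown hash format: {algo}")
    size = HASH_SIZES[algo]

    entries = []
    while pos < len(data):
        space = data.index(b" ", pos)
        nul = data.index(b"\0", space)
        mode = data[pos:space].decode()
        name = data[space + 1:nul].decode()
        oid = data[nul + 1:nul + 1 + size]
        if len(oid) != size:
            raise ValueError("truncated hash in tree record")
        entries.append((mode, name, oid.hex()))
        pos = nul + 1 + size
    return algo, entries
```

A tree without the descriptor parses exactly as before, which is the
point: only trees that opt in pay for the extra record.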
>> Transitioning to that would be something like:
>>
>> 0. Overhaul all of the git code to handle arbitrary-sized object ids.
>>
>> 1. Decide on the new algorithm and implement it in git.
>>
>> 2. Recognize parameterized object ids in commits and tags (designing
>> format, implementing the reading side).
>>
>> 3. Recognize parameterized object ids somehow in trees (designing
>> format, implementing the reading side).
>>
>> 4. Teach the object database to index objects by the new algorithm (or
>> possibly both algorithms).
>>
>> 5. Add a protocol extension so that both sides can decide which
>> algorithm is being used when they talk about oids.
>>
>> 6. Add a config option to write references in objects using the new
>> algorithm.
>>
>> 7. After a while, flip the config option on. Hopefully the readers
>> from steps 1-5 have percolated to the masses by then, and it's not
>> a horrible flag day.
>>
>> We're basically on step 0 right now. I'm sure I'm missing some
>> subtleties in there, too.
>
> One subtlety is that 7. "not a flag day" may not be a good thing.
>
> There has to be a section of a history that spans the transition,
> set of commits and trees that have pointers to both kinds of object
> names. The narrower such a section of the history, the more
> pleasant to use the result of the transition would be.
>
> Different projects that can have their own flag days at their own
> pace is a good thing, so the above observation does not invalidate
> your transition plan, though.

I don't think there is any way this can *not* be per-repository, and it
will somehow require a manual operation in order to preserve
cryptographic integrity. In some ways, the transition point and the
transition table become a special kind of tag object. There may have
to be more than one in the case of commits in multiple trees.
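
As a rough illustration of what such a transition table could contain,
the sketch below maps each blob's SHA-1 name to its name under a new
algorithm. The only established fact it relies on is that Git names an
object by hashing "<type> <length> NUL <content>". SHA-256 is merely a
stand-in for the undecided algoX, and object_id() and transition_table()
are hypothetical helpers. Trees and commits are left out because their
contents embed other object names and would have to be rewritten first.

```python
import hashlib


def object_id(obj_type: str, content: bytes, algo: str) -> str:
    """Name an object the way Git does: hash of '<type> <len>\\0' + content."""
    header = f"{obj_type} {len(content)}".encode() + b"\0"
    return hashlib.new(algo, header + content).hexdigest()


def transition_table(blobs, new_algo: str = "sha256"):
    """Map old SHA-1 blob names to their names under new_algo (assumed)."""
    return {
        object_id("blob", content, "sha1"): object_id("blob", content, new_algo)
        for content in blobs
    }


def serialize(table) -> bytes:
    """Sorted '<old> <new>' lines, so the table has a stable hash and
    could itself be stored and signed like a tag."""
    return "".join(f"{old} {new}\n" for old, new in sorted(table.items())).encode()
```

Signing the serialized table at the transition point is what would tie
the old names to the new ones; that step is the manual operation.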