From: Junio C Hamano <gitster@pobox.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Git Mailing List <git@vger.kernel.org>,
Stefan Beller <sbeller@google.com>,
bmwill@google.com, jonathantanmy@google.com,
Jeff King <peff@peff.net>, David Lang <david@lang.hm>,
"brian m. carlson" <sandals@crustytoothpaste.net>
Subject: Re: RFC v3: Another proposed hash function transition plan
Date: Fri, 08 Sep 2017 11:40:21 +0900
Message-ID: <xmqq1snh29re.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <xmqqa828733s.fsf@gitster.mtv.corp.google.com> (Junio C. Hamano's message of "Wed, 06 Sep 2017 15:28:23 +0900")
Junio C Hamano <gitster@pobox.com> writes:
> One thing I still do not know how I feel about after re-reading the
> thread, and I didn't find the above doc, is Linus's suggestion to
> use the objects themselves as NewHash-to-SHA-1 mapper [*1*].
> ...
> [Reference]
>
> *1* <CA+55aFxj7Vtwac64RfAz_u=U4tob4Xg+2pDBDFNpJdmgaTCmxA@mail.gmail.com>
I think this falls into the same category as the often-talked-about
addition of the "generation number" field. It is very tempting to
add these "mechanically derivable but expensive to compute" pieces
of information to the sha3-content while converting from
sha1-content and creating anew.
Because the "sha1-name" or the "generation number" can mechanically
be computed, as long as everybody agrees to _always_ place them in
the sha3-content, the same sha1-content will be converted into
exactly the same sha3-content without ambiguity, and converting them
back to sha1-content while pushing to an older repository will
correctly produce the original sha1-content, as it would just be a
matter of stripping these extra pieces of information.
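
A minimal sketch of that round trip, in Python. The header spelling
"sha1-name" and the choice to prepend it as the first line are
assumptions made for illustration only; the thread does not fix a
format. A real conversion would also have to rewrite the object names
embedded in the content (tree, parent and so on), which this sketch
ignores. Only the "<type> <size>\0<content>" naming rule is taken from
how Git names objects today.

```python
import hashlib

# Hypothetical header spelling; nothing in the thread settles on one.
SHA1_NAME_HEADER = b"sha1-name "


def object_name(obj_type: bytes, content: bytes, algo: str) -> str:
    """Hash "<type> <size>\\0<content>", the way Git names objects."""
    header = obj_type + b" " + str(len(content)).encode() + b"\0"
    return hashlib.new(algo, header + content).hexdigest()


def to_sha3_content(obj_type: bytes, sha1_content: bytes) -> bytes:
    """Prepend the mechanically derived sha1-name line to the content."""
    sha1_name = object_name(obj_type, sha1_content, "sha1").encode()
    return SHA1_NAME_HEADER + sha1_name + b"\n" + sha1_content


def to_sha1_content(sha3_content: bytes) -> bytes:
    """Strip the derived line again to recover the original content."""
    first_line, _, rest = sha3_content.partition(b"\n")
    if not first_line.startswith(SHA1_NAME_HEADER):
        raise ValueError("missing sha1-name header")
    return rest
```

Because the extra line is a pure function of the original content,
converting the same sha1-content twice gives byte-identical
sha3-content, and stripping the line gives back the input unchanged.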
The reason why I still feel a bit uneasy about adding these things
(aside from the fact that the sha1-name thing will be baggage we
would need to carry forever, even after we completely wean ourselves
off of the old hash) is that I am not sure what we should do when we
encounter sha3-content in the wild that has these things _wrong_.
An object that exists today in the SHA-1 world is fetched into the
new repository and converted to SHA-3 contents, and Linus's extra
"original SHA-1 name" field is added to the object's header while
recording the SHA-3 content. But for whatever reason, the original
SHA-1 name is recorded incorrectly in the resulting SHA-3 object.
The same thing could happen if we decide to bake "generation number"
in the SHA-3 commit objects. One possible definition would be that
a root commit will have gen #0; a commit with 1 or more parents will
get max(parents' gen numbers) + 1 as its gen number. But somebody
may botch the counting and record sum(parents' gen numbers) as its
gen number.
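
The definition above and the miscount can be sketched as follows. The
commit names and the dictionary representation of history are made up
for the example; the recursion is only suitable for small graphs.

```python
from typing import Dict, List


def generation_numbers(parents: Dict[str, List[str]]) -> Dict[str, int]:
    """Root commits get 0; others get max(parents' numbers) + 1."""
    gen: Dict[str, int] = {}

    def visit(commit: str) -> int:
        if commit not in gen:
            nums = [visit(p) for p in parents[commit]]
            gen[commit] = max(nums) + 1 if nums else 0
        return gen[commit]

    for commit in parents:
        visit(commit)
    return gen


def botched_number(commit: str, parents: Dict[str, List[str]],
                   gen: Dict[str, int]) -> int:
    """The miscount from the text: sum of the parents' numbers."""
    return sum(gen[p] for p in parents[commit])


# Hypothetical history: A is the root, M merges C and D.
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "M": ["C", "D"]}
gen = generation_numbers(history)
print(gen["M"], botched_number("M", history, gen))
```

For the merge M the correct number is max(2, 2) + 1 = 3, while the
botched count gives 2 + 2 = 4.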
In these cases, not just the sha3-content but also the resulting
SHA-3 object name would be different from the name of the object
that would have recorded the same contents correctly. So converting
back to the SHA-1 world from these botched SHA-3 contents may produce
the original contents, but we may end up with multiple
"plausible-looking" sets of SHA-3 objects that (claim to) correspond
to a single SHA-1 object, only one of which is valid.
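
To make the ambiguity concrete, here is a small sketch. The
"generation N" line, its position as the first line, and hashing the
bare content (without Git's "<type> <size>\0" prefix) are all
simplifying assumptions for illustration.

```python
import hashlib


def sha3_name(content: bytes) -> str:
    # Simplification: hashes the bare content, without an object header.
    return hashlib.sha3_256(content).hexdigest()


def add_generation(sha1_content: bytes, generation: int) -> bytes:
    """Prepend a hypothetical "generation N" line to the content."""
    return b"generation %d\n" % generation + sha1_content


def strip_generation(sha3_content: bytes) -> bytes:
    """Drop the first line to get back the sha1-content."""
    _, _, rest = sha3_content.partition(b"\n")
    return rest


original = b"author A <a@example.com> 0 +0000\n\nmerge\n"
good = add_generation(original, 3)   # correctly counted
bad = add_generation(original, 4)    # botched count

# Both strip back to the same sha1-content ...
assert strip_generation(good) == strip_generation(bad) == original
# ... yet they are two distinct objects in the SHA-3 world.
assert sha3_name(good) != sha3_name(bad)
```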
Our "git fsck" already treats certain kinds of brokenness (like a
tree whose entry has a mode that is 0-padded to the left) as broken
but still tolerates them. I am not sure if it is sufficient to
diagnose such objects and declare them broken and invalid when we see
sha3-content that records these "mechanically derivable but expensive
to compute" pieces of information incorrectly.
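
What such a diagnosis could look like, as a rough sketch: recompute
the derived value and compare it with what the object records. This is
not how "git fsck" is implemented; the function names and the
"sha1-name" first-line format are hypothetical.

```python
import hashlib
from typing import Optional

SHA1_NAME_HEADER = b"sha1-name "  # hypothetical header spelling


def git_sha1_name(obj_type: bytes, content: bytes) -> str:
    """SHA-1 over "<type> <size>\\0<content>", as Git names objects."""
    header = obj_type + b" " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def check_sha1_name(obj_type: bytes, sha3_content: bytes) -> Optional[str]:
    """Return None if the recorded name is right, else a diagnostic."""
    first_line, _, rest = sha3_content.partition(b"\n")
    if not first_line.startswith(SHA1_NAME_HEADER):
        return "missing sha1-name header"
    recorded = first_line[len(SHA1_NAME_HEADER):].decode("ascii", "replace")
    expected = git_sha1_name(obj_type, rest)
    if recorded != expected:
        return f"sha1-name mismatch: recorded {recorded}, expected {expected}"
    return None
```

Whether a mismatch reported this way should be a hard error or a
tolerated warning is exactly the open question.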
I am leaning towards saying "yes, catching in fsck is enough" and
suggesting to add generation number to sha3-content of the commit
objects, and to add even the "original sha1 name" thing if we find
a good use for it. But I cannot shake off this nagging feeling that
I am missing some huge problems that come with adding these fields
and opening ourselves to more classes of broken objects.
Thoughts?