From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Duy Nguyen <pclouds@gmail.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Herczeg Zsolt <zsolt94@gmail.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Git and SHA-1 security (again)
Date: Sun, 17 Jul 2016 22:04:17 +0000
Message-ID: <20160717220417.GE6644@vauxhall.crustytoothpaste.net>
In-Reply-To: <20160717162349.GB11276@thunk.org>

On Sun, Jul 17, 2016 at 12:23:49PM -0400, Theodore Ts'o wrote:
> On Sun, Jul 17, 2016 at 03:42:34PM +0000, brian m. carlson wrote:
> > As I said, I'm not planning on multiple hash support at first, but it
> > doesn't appear impossible if we go this route.  We might still have to
> > rewrite objects, but we can verify signatures over the legacy SHA-1
> > objects by forcing them into the old-style object format.
> 
> How hard would it be to make the on-disk format be multihash, even if
> there is no support for anything other than a single hash, at least
> for now?  That way we won't have to rewrite the objects twice.

Other than the amount of work to change reading from the on-disk format,
nothing prevents us from doing that, although I would recommend storing
the object database with the tag prefix if we do so (i.e., instead of
.git/objects/17, writing .git/objects/111417).  That future-proofs us
for when we change the hash.
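
To make the prefix idea concrete, here is a minimal sketch in Python.  It
assumes the multihash convention of one function-code byte followed by one
length byte (0x11 for SHA-1, 0x14 for 20 bytes), and it assumes the fan-out
directory simply grows to cover the two tag bytes plus the first hash byte,
which is how I read the .git/objects/111417 example.  The function name
loose_object_path is hypothetical; nothing like it exists in Git.

```python
# Multihash tag for SHA-1: function code 0x11, digest length 0x14 (20 bytes).
SHA1_CODE = 0x11
SHA1_LEN = 0x14


def loose_object_path(object_id_hex: str, tagged: bool) -> str:
    """Return an illustrative loose-object path for a SHA-1 object ID.

    tagged=False mimics today's layout (.git/objects/17/...).
    tagged=True prepends the two multihash bytes, so the fan-out
    directory becomes three bytes long (.git/objects/111417/...).
    This layout is an assumption made for illustration only.
    """
    raw = bytes.fromhex(object_id_hex)
    if len(raw) != SHA1_LEN:
        raise ValueError("expected a 20-byte SHA-1 object ID")
    fanout_bytes = 1
    if tagged:
        raw = bytes([SHA1_CODE, SHA1_LEN]) + raw
        fanout_bytes = 3
    hexed = raw.hex()
    split = fanout_bytes * 2  # two hex digits per byte
    return f".git/objects/{hexed[:split]}/{hexed[split:]}"
```

With an object ID starting in 17, the untagged call yields a path under
.git/objects/17 and the tagged call yields one under .git/objects/111417.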

I will say that the pack format will likely require some changes,
because it assumes things are 4-byte aligned.  It also assumes you can
use the object ID in the mmapped pack directly (4-byte aligned), which
you can no longer do.  We have some cases where we cast that memory
directly to struct object_id, which will no longer be valid, and even if
we add the two prefix bytes to struct object_id, that doesn't guarantee
that struct won't be aligned differently.

We could require that the pack format have two NUL bytes before the
hash, which would force it to be aligned.  We'd still have to make the
Git protocol negotiate the new extension and fail gracefully if the
version is too old.  We could do this by requiring a pack version 5,
which would simply cause older Gits to report errors.
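
The alignment arithmetic can be checked with a small Python sketch.  It is a
simplified model, not the real pack or index layout: it assumes a table of
back-to-back hash entries starting at a 4-byte-aligned offset, and the three
entry sizes (raw, tagged, padded) are my own illustrative names.

```python
RAW_SHA1 = 20          # today's entry: the bare 20-byte hash
TAGGED = 2 + 20        # two multihash prefix bytes, then the hash
PADDED = 2 + 2 + 20    # two NUL bytes, two prefix bytes, then the hash


def entry_offsets(entry_size: int, count: int, base: int = 0) -> list:
    """Byte offsets of `count` consecutive entries starting at `base`."""
    return [base + i * entry_size for i in range(count)]


def all_aligned(offsets, alignment: int = 4) -> bool:
    """True if every offset is a multiple of `alignment`."""
    return all(off % alignment == 0 for off in offsets)


raw_ok = all_aligned(entry_offsets(RAW_SHA1, 4))     # 0, 20, 40, 60
tagged_ok = all_aligned(entry_offsets(TAGGED, 4))    # 0, 22, 44, 66
padded_ok = all_aligned(entry_offsets(PADDED, 4))    # 0, 24, 48, 72
```

In this model the 22-byte entries drift off the 4-byte boundary at the second
entry, while the 24-byte padded entries stay aligned, and the hash inside each
padded entry starts at offset 4, which is aligned as well.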

It's a lot of work, and it's definitely a flag day.  That's why I had
planned to only do it with a new hash format: it would impact only
people who were moving to the new hash.  It also means that we get to
work out any problems with the design at that point and not be committed
to a design that might be inadequate.  This is a place where I don't
want to mess up.

> Personally, so long as the newer versions of the tree are secured, I
> wouldn't mind if the older commits stayed using SHA1 only.  The newer
> commits are the ones that are most important and security-critical
> anyway.  It seems like the main reason to rewrite all of the objects
> is to simplify the initial rollout of a newer hash algorithm, no?

The reason is that we can't have an unambiguous parse of the current
objects if two hash algorithms are in use.  Tree objects don't use a hex
encoding of hashes; they use a binary encoding.  It's therefore possible
to create an ambiguous tree representation.
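
Here is a contrived sketch in Python of what such an ambiguity looks like.
It assumes a parser that accepts either a 20-byte or a 32-byte raw hash after
each "<mode> <name>\0" header, with nothing in the entry saying which one was
used.  The function parse_all and the hash values are made up for the
illustration; real hashes would not be chosen this freely, but nothing in the
format itself rules the second reading out.

```python
OCTAL = b"01234567"


def parse_all(data: bytes, lengths=(20, 32)):
    """Return every way to split `data` into (name, raw_hash) tree entries.

    Each entry is "<octal mode> <name>\\0" followed by a raw hash whose
    length is one of `lengths`; the entry itself does not say which.
    """
    if not data:
        return [[]]
    sp = data.find(b" ")
    if sp <= 0 or any(c not in OCTAL for c in data[:sp]):
        return []
    nul = data.find(b"\0", sp + 1)
    if nul <= sp + 1:  # no terminator, or an empty name
        return []
    name, start = data[sp + 1:nul], nul + 1
    parses = []
    for n in lengths:
        if len(data) - start >= n:
            entry = (name, data[start:start + n])
            for rest in parse_all(data[start + n:], lengths):
                parses.append([entry] + rest)
    return parses


# A 20-byte hash for "a", then an entry "b" whose 32-byte hash happens to
# contain bytes that look like another entry header.
h1 = b"\x11" * 20
h2 = b"\xaa\xbb\xcc" + b"100644 c\0" + b"\xee" * 20
tree = b"100644 a\0" + h1 + b"100644 b\0" + h2

parses = parse_all(tree)
```

The same bytes parse both as entries a (20-byte hash) and b (32-byte hash),
and as entries a (32-byte hash) and c (20-byte hash).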

So when we look at a new hash, we need to provide an unambiguous way to
know what hash is in use.  The two choices are to either require all
objects to use the new hash, or to extend the objects to include the hash.
Until a couple days ago, I had planned to do the former.  I had not even
considered using a multihash approach due to the complexity.
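
For the second choice, a rough Python sketch of reading a self-describing
hash might look like the following.  It assumes single-byte multihash codes
(0x11 for SHA-1, 0x12 for SHA-256) followed by a single length byte; the
function read_tagged_hash and the table are hypothetical and only illustrate
why a tagged hash removes the guesswork about its length.

```python
# Function code -> digest length in bytes (SHA-1 and SHA-256 only).
HASH_LENGTHS = {0x11: 20, 0x12: 32}


def read_tagged_hash(buf: bytes, pos: int = 0):
    """Read one tagged hash from `buf` at `pos`.

    Returns (function_code, digest, next_position).  The two leading
    bytes say which hash follows and how long it is, so the reader
    never has to guess where the hash ends.
    """
    if pos + 2 > len(buf):
        raise ValueError("truncated hash tag")
    code, length = buf[pos], buf[pos + 1]
    if HASH_LENGTHS.get(code) != length:
        raise ValueError("unknown hash code or mismatched length")
    end = pos + 2 + length
    digest = buf[pos + 2:end]
    if len(digest) != length:
        raise ValueError("truncated digest")
    return code, digest, end


# A SHA-1-sized hash followed by a SHA-256-sized hash in one buffer.
buf = bytes([0x11, 0x14]) + b"\x01" * 20 + bytes([0x12, 0x20]) + b"\x02" * 32
first = read_tagged_hash(buf, 0)
second = read_tagged_hash(buf, first[2])
```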
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204
