git@vger.kernel.org mailing list mirror (one of many)
From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Hans Petter Selasky <hps@selasky.org>
Cc: git@vger.kernel.org
Subject: Re: Gitorious should use CRC128 / 256 / 512 instead of SHA-1
Date: Sat, 14 Jan 2023 23:59:23 +0000
Message-ID: <Y8NB21PExmifhyeQ@tapette.crustytoothpaste.net>
In-Reply-To: <9c0fda42-67ab-f406-489b-38a2d9bbcfc2@selasky.org>

On 2023-01-13 at 13:23:59, Hans Petter Selasky wrote:
> Hi,
> 
> Currently GIT only supports cryptographic hashes for its commit tags.
> 
> That means:
> 
> 1) It's very difficult to edit the history without also recomputing the hash
> tags for all commits after the needed change-point, which then means
> references to a repository is broken.

This is intentional.  Commit and tag signing relies on an unbroken Merkle
tree-like construction, so that signing a single commit or tag protects
the entire history behind it from being modified.
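
To illustrate why editing history forces every later ID to change, here
is a minimal sketch in Python.  It uses a made-up, simplified commit
encoding (toy_commit_id and chain_ids are hypothetical helpers, not Git's
real object format); the only point is that each ID is a hash over the
parent's ID plus the content.

```python
import hashlib


def toy_commit_id(parent_id: str, content: bytes) -> str:
    # Hypothetical simplified encoding, NOT Git's real commit format:
    # the ID covers the parent's ID and this commit's content.
    payload = b"parent " + parent_id.encode() + b"\n" + content
    return hashlib.sha1(payload).hexdigest()


def chain_ids(contents):
    # Build a linear history; each commit points at the previous one.
    ids = []
    parent = ""
    for content in contents:
        parent = toy_commit_id(parent, content)
        ids.append(parent)
    return ids


history = [b"initial import", b"add feature", b"fix bug"]
original = chain_ids(history)

# Editing the first commit changes its ID and every descendant's ID.
edited_root = chain_ids([b"initial import (edited)", b"add feature", b"fix bug"])
# Editing only the last commit leaves the earlier IDs untouched.
edited_tip = chain_ids([b"initial import", b"add feature", b"fix bug v2"])

print([a != b for a, b in zip(original, edited_root)])
print([a != b for a, b in zip(original, edited_tip)])
```

Because the tip ID depends on everything before it, a signature over
the tip implicitly covers the whole chain.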

> 2) Only a single bit error in the main repository can break everything!

git fsck is designed to detect this, and by default it's run every time
the repository is repacked (such as by git gc).  But yes, this is a
problem, and changing to an algorithm which isn't cryptographically
secure won't change that.  Prudent users back up data to prevent data
loss.
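
As a rough illustration of how such corruption is noticed, the sketch
below recomputes a blob's ID the way a SHA-1 repository names blobs
(SHA-1 over a "blob <size>\0" header followed by the content) and
compares it after flipping one bit.  The helper names git_blob_id and
flip_bit are made up for this example, and this is not how git fsck is
implemented; it only shows the underlying check.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    # A blob's ID in a SHA-1 repository: SHA-1 of "blob <size>\0" + content.
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    # Return a copy of data with exactly one bit inverted.
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


original = b"hello world\n"
stored_id = git_blob_id(original)

damaged = flip_bit(original, 3)
# The recomputed ID no longer matches the name the object was stored under.
print(git_blob_id(original) == stored_id)
print(git_blob_id(damaged) == stored_id)
```

Detection is all a hash can offer here; recovering the damaged object
still requires an intact copy, such as a backup or another clone.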

> 3) Illicit contents may be present in binary blobs, which in the future may
> be need to be removed without warrant and the only way to do that is by
> rebasing and force pushing, which will break "everything". It can be
> everything from child-porn to expired distribution licenses.

This is a problem in every Merkle tree-like system.  Most repositories
have some sort of code review or access control that generally prevents
people from pushing inappropriate content.  For example, if somebody
proposed to push any sort of pornography or other inappropriate content
(e.g., a racist screed) to one of my repositories or one of my
employer's, I'd refuse to approve or merge such a change, because
that wouldn't be appropriate for the repository.

I don't feel this is enough of a problem that using a Merkle tree-like
construction is a bad idea, given the benefits it offers.

> Therefore I propose the following changes to GIT.
> 
> 1) Use a CRC128 / 256 or 512 non-cryptographic based hashing algorithm as
> default.

As the person who wrote the SHA-256 support, I'm pleased to report that
adding a new hash algorithm isn't very difficult anymore.  The largest
part of the work is updating all the tests.  I've tried very hard to
make this substantially easier for everyone.

However, Git is moving in the direction of stronger cryptographic
algorithms, rather than insecure hashing algorithms.  I don't think your
proposal is a good idea, nor do I think it's likely to be adopted.

If it were adopted, the signing of commits and tags would be
meaningless, and because it would be trivial to create collisions[0], there
would clearly be some pairs of objects which could not be stored.  This
would make Git much less useful, and it might allow users to attempt to
forge or replace content without being detected.
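
To make "trivial" concrete, here is a sketch in Python that forges a
collision for CRC-32 (via zlib.crc32, standing in for the proposed wider
CRCs, which is an assumption of this example).  It finds a set of bit
flips whose combined effect on the checksum cancels out by Gaussian
elimination over GF(2); the function names are hypothetical.

```python
import zlib


def zero_effect_flip_mask(length: int) -> int:
    # For a fixed message length, flipping bit i changes the CRC by a
    # fixed 32-bit delta.  Among 33 such deltas, some non-empty subset
    # must XOR to zero; find it by elimination over GF(2).
    base = zlib.crc32(bytes(length))
    basis = {}  # top bit -> (reduced delta, mask of bit flips used)
    for i in range(33):
        probe = bytearray(length)
        probe[i // 8] |= 1 << (i % 8)
        value = zlib.crc32(bytes(probe)) ^ base
        combo = 1 << i
        while value:
            top = value.bit_length() - 1
            if top not in basis:
                basis[top] = (value, combo)
                break
            other_value, other_combo = basis[top]
            value ^= other_value
            combo ^= other_combo
        else:
            # value reduced to zero: these flips leave the CRC unchanged.
            return combo
    raise AssertionError("unreachable: 33 vectors in a 32-bit space")


def forge_same_crc(message: bytes) -> bytes:
    # Needs at least 5 bytes so that 33 distinct bit positions exist.
    if len(message) < 5:
        raise ValueError("message must be at least 5 bytes long")
    combo = zero_effect_flip_mask(len(message))
    forged = bytearray(message)
    for i in range(33):
        if (combo >> i) & 1:
            forged[i // 8] ^= 1 << (i % 8)
    return bytes(forged)


original = b"pay alice 10 dollars"
forged = forge_same_crc(original)
print(forged != original, zlib.crc32(forged) == zlib.crc32(original))
```

The work involved is a few dozen checksum evaluations, independent of
the message content, and the same approach scales to wider CRCs.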

That being said, you are free to create your own fork of the code which
does so, provided you comply with the terms of the license.

> 2) Add support for a CRC fixup field, which usually is zero, but when merges
> are needed, it can be non-zero, to allow the hash-tag-value to remain the
> same! This also allows for easy conversion of existing GIT repositories to
> the new scheme.

For the same reason as above, I don't think this is a good idea.

> 3) All git objects should be uncompressed.

This would dramatically increase the size of most repositories.  I've
easily seen repositories where the uncompressed contents exceed 1 TB in
size yet the repository is only double-digit gigabytes, if that.  Most
people will find the increase in disk usage unacceptable, and I'm
certain that includes Git hosters.
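
For a feel of the effect, the sketch below deflates some synthetic,
source-like text with zlib, the same compression library Git uses for
its objects.  The input is invented for the example, so the ratio it
prints is not representative of any real repository, and it ignores the
delta compression that packfiles add on top.

```python
import zlib

# Synthetic, repetitive "source code" (an assumption for illustration).
source_like = b"".join(
    b"int value_%d = compute(%d);\n" % (i, i) for i in range(2000)
)

compressed = zlib.compress(source_like)
ratio = len(source_like) / len(compressed)

print(len(source_like), len(compressed), round(ratio, 1))
```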

[0] CRC is linear and the following relations apply, which makes forgery
trivial (see https://en.wikipedia.org/wiki/Cyclic_redundancy_check):

CRC(x XOR y) = CRC(x) XOR CRC(y) XOR c for some c (which depends only on the common length of x and y)
CRC(x XOR y XOR z) = CRC(x) XOR CRC(y) XOR CRC(z)
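
These relations can be checked with a short Python sketch, using
zlib.crc32 as a stand-in for a wider CRC and random inputs of equal
length.  Taking c to be the CRC of an all-zero message of that length is
a property of this particular CRC variant; xor_bytes is a helper written
for this example.

```python
import os
import zlib


def xor_bytes(*chunks: bytes) -> bytes:
    # Bytewise XOR of equal-length byte strings.
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            out[i] ^= b
    return bytes(out)


n = 64
x, y, z = os.urandom(n), os.urandom(n), os.urandom(n)

# The constant c depends only on the message length.
c = zlib.crc32(bytes(n))

two_way = zlib.crc32(xor_bytes(x, y)) == zlib.crc32(x) ^ zlib.crc32(y) ^ c
three_way = zlib.crc32(xor_bytes(x, y, z)) == (
    zlib.crc32(x) ^ zlib.crc32(y) ^ zlib.crc32(z)
)

print(two_way, three_way)
```

Both checks hold for any equal-length inputs, which is what lets an
attacker compute the effect of a change without searching.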
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA
