git@vger.kernel.org mailing list mirror (one of many)
From: Ian Jackson <ijackson@chiark.greenend.org.uk>
To: "brian m. carlson" <sandals@crustytoothpaste.net>
Cc: "Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Jason Cooper" <git@lakedaemon.net>,
	ankostis <ankostis@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Stefan Beller" <sbeller@google.com>,
	"David Lang" <david@lang.hm>, "Joey Hess" <id@joeyh.name>
Subject: Re: Transition plan for git to move to a new hash function
Date: Sun, 5 Mar 2017 13:45:46 +0000
Message-ID: <22716.5770.95842.704242@chiark.greenend.org.uk>
In-Reply-To: <20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net>

brian m. carlson writes ("Re: Transition plan for git to move to a new hash function"):
> Instead, I was referring to areas like the notes code.  It has extensive
> use of the last byte as a type of lookup table key.  It's very dependent
> on having exactly one hash, since it will always want to use the last
> byte.

You mean note_tree_search ?  (My tree here may be a bit out of date.)
This doesn't seem difficult to fix.  The nontrivial changes would be
mostly confined to SUBTREE_SHA1_PREFIXCMP and GET_NIBBLE.

It's true that, like most of git, there's a lot of hardcoded `sha1'.


Are you arguing in favour of "replace git with git2 by simply
s/20/64/g; s/sha1/blake/g" ?  This seems to me to be a poor idea.
Takeup of the new `git2' would be very slow because of the pain
involved.

Any sensible method of moving to a new hash that isn't "make a
completely incompatible new version of git" is going to involve
teaching the code we have in git right now to handle new hashes as
well as sha1 hashes.

Even if the plan is to try to convert old data, rather than keep it
and be able to refer to it from new data, something will have to be
able to parse old packfiles, old commits, old tags, old notes,
etc. etc. etc.  Either that's going to be some separate conversion
utility, or it has to be the same code in git that's there already.[1]

The ability to handle both old-format and new-format data can be
achieved in the code by doing away with the hardcoded sha1s, so that
instead the hash is an abstract data type with operations like
"initialise", "compare", "get a nybble", etc.  We've already seen
patches going in this direction.
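
To make the abstract-data-type idea concrete, here is a minimal sketch
of what such a type might look like.  Every name in it (hash_id,
hash_init, hash_cmp, hash_nibble, HASH_NEW) is hypothetical and not
taken from git's source, and the 64-byte width for the new hash is only
borrowed from the s/20/64/ example above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: none of these names exist in git. */
enum hash_algo { HASH_SHA1, HASH_NEW };

#define HASH_MAX_RAWSZ 64	/* assumed width of the new hash */

struct hash_id {
	enum hash_algo algo;
	unsigned char bytes[HASH_MAX_RAWSZ];
};

/* Raw length in bytes for each algorithm. */
static size_t hash_rawsz(enum hash_algo algo)
{
	return algo == HASH_SHA1 ? 20 : HASH_MAX_RAWSZ;
}

/* "initialise": copy the raw bytes in and zero the unused tail. */
static void hash_init(struct hash_id *h, enum hash_algo algo,
		      const unsigned char *raw)
{
	h->algo = algo;
	memset(h->bytes, 0, sizeof(h->bytes));
	memcpy(h->bytes, raw, hash_rawsz(algo));
}

/* "compare": order by algorithm first, then by the raw bytes. */
static int hash_cmp(const struct hash_id *a, const struct hash_id *b)
{
	if (a->algo != b->algo)
		return a->algo < b->algo ? -1 : 1;
	return memcmp(a->bytes, b->bytes, hash_rawsz(a->algo));
}

/* "get a nybble": the n-th hex digit, counting from the front. */
static unsigned int hash_nibble(const struct hash_id *h, size_t n)
{
	unsigned char byte = h->bytes[n / 2];
	return (n % 2) ? (byte & 0x0f) : (byte >> 4);
}
```

Code such as the notes fanout lookup would then ask for a nybble or the
raw length through these operations rather than assuming a 20-byte
array.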

[1] I've heard suggestions here that instead we should expect users to
"git1 fast-export", which you would presumably feed into "git2
fast-import".  But what is `git1' here ?  Is it the current git
codebase frozen in time ?  I don't think it can be.  With this
conversion strategy, we will need to maintain git1 for decades.  It
will need portability fixes, security fixes, fixes for new hostile
compiler optimisations, and so on.  The difficulty of conversion means
there will be pressure to backport new features from `git2' to `git1'.
(Also this approach means that all signatures are definitively lost
during the conversion process.)

So if we want to provide both `git1' and `git2', it's still better to
compile `git' and `git2' from the same codebase.  But if we do that,
the resulting ifdeffery and/or other hash abstractions are most of the
work to be hash-agile.  It's just the difference between a
compile-time and runtime switch.
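
As a rough illustration of how small that difference is, the sketch
below selects the algorithm either at build time or at run time behind
the same accessor.  FIXED_HASH_ALGO, repo_algo, current_algo and
current_rawsz are made-up names for this example, not anything in git,
and the 64-byte width is again just the s/20/64/ figure.

```c
#include <stddef.h>

/* Hypothetical sketch: none of these names exist in git. */
enum hash_algo { HASH_SHA1, HASH_NEW };

#ifdef FIXED_HASH_ALGO
/* Compile-time switch: one binary per hash (`git' and `git2'). */
static enum hash_algo current_algo(void)
{
	return FIXED_HASH_ALGO;
}
#else
/* Runtime switch: one binary, algorithm chosen per repository. */
static enum hash_algo repo_algo = HASH_SHA1;

static enum hash_algo current_algo(void)
{
	return repo_algo;
}
#endif

/* Callers look the same either way: nothing hardcodes 20. */
static size_t current_rawsz(void)
{
	return current_algo() == HASH_SHA1 ? 20 : 64;
}
```

Either way, every caller has already had to stop assuming a fixed hash
length, which is the bulk of the work.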

I think the incompatible approach is much more work in the medium and
long term - and it leads to a longer transition period.


Bear in mind that our objective is not to minimise the time until the
new version of git is available.  Our objective is to minimise the
time until (most) people are using it.  An approach which takes longer
for the git community to develop, but which is easier to deploy, can
easily be better.

Or maybe the objective is to minimise overall effort.  In which case
more work on git, for an easier transition for all the users, seems
like a no-brainer.  I think this is arguably true even from the point
of view of effort amongst the community of git contributors.  git
contributors start out as git users - and if git's users are all busy
struggling with a difficult transition, they will have less time to
improve other stuff and will tend less to get involved upstream.  (And
they may be less inclined to feel that the git upstream developers
understand their needs well.)

The better alternative is to adopt a plan that has a clear and
straightforward transition for users, and ask git users to help with
implementation.

I think many git users, including sophisticated users and competent
organisations, are concerned about sha1.  Currently most of those
users will find it difficult to help, because it's not clear to them
what needs to be done.

Thanks,
Ian.
