git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Typesafer git hash patch
Date: Tue, 28 Feb 2017 20:19:19 +0000	[thread overview]
Message-ID: <20170228201918.dpqrrijibdapemaf@genre.crustytoothpaste.net> (raw)
In-Reply-To: <xmqqvarujdmv.fsf@gitster.mtv.corp.google.com>

[-- Attachment #1: Type: text/plain, Size: 3410 bytes --]

On Tue, Feb 28, 2017 at 11:53:28AM -0800, Junio C Hamano wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > That
> > actually can clean up some code, because right now we have duplicate
> > interfaces for some things that take an oid pointer and others take a
> > "const unsigned char *sha1", and that duplication essentially can go
> > away.
> 
> ... I understand that this is an eventual goal of "struct object_id"
> effort.

Yes, it is.

> > ..., and the lines are generally shorter, eg
> >
> > -               const unsigned char *mb = result->item->object.oid.hash;
> > -               if (!hashcmp(mb, current_bad_oid->hash)) {
> >
> > turns into
> >
> > +               const hash_t *mb = &result->item->object.oid;
> > +               if (!hashcmp(mb, current_bad_oid)) {
> 
> Hmph.  I somehow thought the longer term directio for the above code
> would be to turn it into
> 
> 		if (!oidcmp(&result->item->object.oid, &current_bad_oid))
> 
> instead.

It is.  The duplication is temporary.  We've actually removed some of
the original SHA-1 functions in favor of the object_id versions.

> Having said all that, I do not offhand see a huge benefit of the
> current layout that has one layer between the hash (i.e. oid.hash)
> and the object name (i.e. oid) over "there is no need for oid.hash;
> oid is just a hash", which you seem to be doing.
> 
> Except for cases where we need to be prepared to see two or more
> kinds of object names (in which case struct object_id would have an
> enum that tells you if oid.hash is SHA-1 or SHA-3-256), that is.  Of
> course we can encode that in hash_t itself (e.g. multihash).

Right, this is indeed a benefit.

The bigger issue is the assumptions in the code base that assume a given
hash size.  These assumptions are in a bunch of places, so pretty much
any time you see a number, you end up having to question what it means.
We have a 64 that means 40 (SHA-1 hex) plus 24 other values, for
example.

I've sent various series because I think that's the way that the Git
development community has most welcomed my contributions.  I have
another 69 patches in two series (which are as of yet untested).  I can
probably have almost all of the transition actually complete in another
couple weeks (because I have a day job which does not involve working on
Git).

I've hesitated to send additional patches to the list when my existing
changes haven't even hit next yet, since they're received little actual
testing.

I'm unsure if sending one enormous patch is actually going to be
welcome; I feel Junio and most of the regulars are probably not going to
be very excited about the idea, mostly because the reviewing task is
enormous, and the risk for breakage is also pretty high.  We want to
move away from SHA-1 as quickly as possible, yes, but we don't want Git
to become unusably buggy.

I guess what I'm saying is that I'm fine if we want to have a more
aggressive timeline, and I can probably make that happen, at least if we
can wait a few weeks.  If Junio and the other regulars are comfortable
merging one omnibus patch to get it moving even more quickly, I won't
stand in the way.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

  parent reply	other threads:[~2017-02-28 23:26 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-28  6:59 Typesafer git hash patch Linus Torvalds
     [not found] ` <xmqqvarujdmv.fsf@gitster.mtv.corp.google.com>
2017-02-28 20:19   ` brian m. carlson [this message]
2017-02-28 20:38     ` Linus Torvalds
2017-02-28 20:25   ` Linus Torvalds
2017-02-28 20:45     ` brian m. carlson
2017-02-28 20:26 ` Jeff King
2017-02-28 20:33   ` brian m. carlson
2017-02-28 20:37     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170228201918.dpqrrijibdapemaf@genre.crustytoothpaste.net \
    --to=sandals@crustytoothpaste.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).