git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Stefan Beller <sbeller@google.com>,
	Junio C Hamano <gitster@pobox.com>,
	Phillip Wood <phillip.wood@dunelm.org.uk>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: pushing for a new hash, was Re: [PATCH 2/3] rebase: Add tests for console output
Date: Fri, 2 Jun 2017 22:29:36 +0200	[thread overview]
Message-ID: <CACBZZX7JRA2niwt9wsGAxnzS+gWS8hTUgzWm8NaY1gs87o8xVQ@mail.gmail.com> (raw)
In-Reply-To: <20170602175455.GA30988@aiede.mtv.corp.google.com>

On Fri, Jun 2, 2017 at 7:54 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hi Dscho,
>
> Johannes Schindelin wrote:
>> On Thu, 1 Jun 2017, Stefan Beller wrote:
>
>>> We had a discussion off list how much of the test suite is in bad shape,
>>> and "$ git grep ^index" points out a lot of places as well.
>>
>> Maybe we should call out a specific month (or even a longer period) during
>> which we try to push toward that new hash function, and focus more on
>> those tasks (and on critical bug fixes, if any) than anything else.
>
> Thanks for offering. ;-)
>
> Here's a rough list of some useful tasks, in no particular order:
>
> 1. bc/object-id: This patch series continues, eliminating assumptions
>    about the size of object ids by encapsulating them in a struct.
>    One straightforward way to find code that still needs to be
>    converted is to grep for "sha" --- often the conversion patches
>    change function and variable names to refer to oid_ where they used
>    to use sha1_, making the stragglers easier to spot.
>
> 2. Hard-coded object ids in tests: As Stefan hinted, many tests beyond
>    t00* make assumptions about the exact values of object ids.  That's
>    bad for maintainability for other reasons beyond the hash function
>    transition, too.
>
>    It should be possible to suss them out by patching git's sha1
>    routine to use the ones-complement of sha1 (~sha1) instead and
>    seeing which tests fail.

I just hacked this up locally. While a lot of stuff fails, it's not
exactly an out of control garbage fire and could be churned through by
someone interested. A lot of it's just lazy sha1 hardcoding for no
good reason where we could use a tag, or e.g. test_cmp on ls-tree
output without the test really caring about the hash. Things that
really need to test the sha1 could be guarded by some new prereq.

Both of my attempts to fuzz SHA1DCInit though resulted in some
segfaults, I tried both changing "0x" to "~0x" in the ihv assignment,
and just calling SHA1DCUpdate(ctx, "x", 1) at the end of the function,
not sure why that's happening.

If someone knows offhand what I'm doing wrong here I might be
interested in going through this if I don't have to sift through the
segfaults. I.e. what's an easy hack to make all of git pretend that
"foo" hashes to "xfoo", "" to "x" etc.

> 4. When choosing a hash function, people may argue about performance.
>    It would be useful for run some benchmarks for git (running
>    the test suite, t/perf tests, etc) using a variety of hash
>    functions as input to such a discussion.

To the extent that such benchmarks matter, it seems prudent to heavily
weigh them in favor of whatever seems to be likely to be the more
common hash function going forward, since those are likely to get
faster through future hardware acceleration.

E.g. Intel announced Goldmont last year which according to one SHA-1
implementation improved from 9.5 cycles per byte to 2.7 cpb[1]. They
only have acceleration for SHA-1 and SHA-256[2]

1. https://github.com/weidai11/cryptopp/issues/139#issuecomment-264283385

2. https://en.wikipedia.org/wiki/Goldmont

  parent reply	other threads:[~2017-06-02 20:30 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-31 10:42 [PATCH 0/3] Add regression tests for recent rebase -i fixes Phillip Wood
2017-05-31 10:42 ` [PATCH 1/3] rebase -i: Add test for reflog message Phillip Wood
2017-06-01  2:00   ` Junio C Hamano
2017-05-31 10:42 ` [PATCH 2/3] rebase: Add tests for console output Phillip Wood
2017-05-31 19:02   ` Phillip Wood
2017-06-01  1:59     ` Junio C Hamano
2017-06-01 12:56   ` Johannes Schindelin
2017-06-01 23:40     ` Junio C Hamano
2017-06-01 23:47       ` Stefan Beller
2017-06-02 12:47         ` pushing for a new hash, was " Johannes Schindelin
2017-06-02 17:54           ` Jonathan Nieder
2017-06-02 18:05             ` Jonathan Nieder
2017-06-02 20:29             ` Ævar Arnfjörð Bjarmason [this message]
2017-06-15 10:38               ` Johannes Schindelin
2017-06-03  0:36             ` Junio C Hamano
2017-06-06 22:22             ` Johannes Schindelin
2017-06-06 22:45               ` Jonathan Nieder
2017-06-07  1:09                 ` Junio C Hamano
2017-06-07  2:18                   ` [PATCH] t4005: modernize style and drop hard coded sha1 Stefan Beller
2017-06-07 17:39                     ` Brandon Williams
2017-06-06 22:45               ` pushing for a new hash, was Re: [PATCH 2/3] rebase: Add tests for console output Stefan Beller
2017-06-06 22:52                 ` Jonathan Nieder
2017-06-07  0:34                 ` Samuel Lijin
2017-06-07 14:47                 ` Johannes Schindelin
2017-06-07 16:53                   ` Stefan Beller
2017-06-07 10:47     ` Phillip Wood
2017-06-09 16:39       ` Junio C Hamano
2017-06-14 10:18         ` Phillip Wood
2017-06-14 12:51       ` Johannes Schindelin
2017-05-31 10:42 ` [PATCH 3/3] rebase: Add tests for console output with conflicting stash Phillip Wood
2017-06-14 10:24 ` [PATCH v2 0/3] Add regression tests for rectent rebase -i fixes Phillip Wood
2017-06-14 10:24   ` [PATCH v2 1/3] rebase -i: Add test for reflog message Phillip Wood
2017-06-14 10:24   ` [PATCH v2 2/3] rebase: Add regression tests for console output Phillip Wood
2017-06-14 10:24   ` [PATCH v2 3/3] rebase: Add more " Phillip Wood
2017-06-14 20:35   ` [PATCH v2 0/3] Add regression tests for rectent rebase -i fixes Johannes Schindelin
2017-06-15 23:05   ` Junio C Hamano
2017-06-15 23:23     ` Junio C Hamano
2017-06-15 23:29       ` Junio C Hamano
2017-06-16 13:49         ` Johannes Schindelin
2017-06-16 18:43           ` Johannes Sixt
2017-06-16 21:05             ` Junio C Hamano
2017-06-19 19:45             ` Johannes Sixt
2017-06-19 20:02               ` Junio C Hamano
2017-06-19  9:49           ` Phillip Wood
2017-06-19 15:45             ` Junio C Hamano
2017-06-19  9:52         ` Phillip Wood
2017-06-19 17:56 ` [PATCH v3 0/4] Add regression tests for recent " Phillip Wood
2017-06-19 17:56   ` [PATCH v3 1/4] sequencer: print autostash messages to stderr Phillip Wood
2017-06-19 17:56   ` [PATCH v3 2/4] rebase -i: Add test for reflog message Phillip Wood
2017-06-19 17:56   ` [PATCH v3 3/4] rebase: Add regression tests for console output Phillip Wood
2017-06-19 17:56   ` [PATCH v3 4/4] rebase: Add more " Phillip Wood
2017-06-23  4:17   ` [PATCH v3 0/4] Add regression tests for recent rebase -i fixes Junio C Hamano
2017-06-23  5:07     ` Junio C Hamano
2017-06-23  9:53       ` Phillip Wood
2017-06-23 17:03         ` Junio C Hamano
2017-06-23 18:53           ` Junio C Hamano
2017-06-26  9:17             ` Phillip Wood
2017-06-23 19:01           ` Junio C Hamano
2017-06-26  9:23             ` Phillip Wood

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACBZZX7JRA2niwt9wsGAxnzS+gWS8hTUgzWm8NaY1gs87o8xVQ@mail.gmail.com \
    --to=avarab@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=phillip.wood@dunelm.org.uk \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).