git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Martin Ågren" <martin.agren@gmail.com>
To: "brian m. carlson" <sandals@crustytoothpaste.net>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Eric Sunshine" <sunshine@sunshineco.com>
Subject: Re: [PATCH 18/41] index-pack: abstract away hash function constant
Date: Wed, 25 Apr 2018 20:49:47 +0200	[thread overview]
Message-ID: <CAN0heSqpj9JfTrnMFRbquraxve9iTwoowgWRUhcD-gXHMg3V=g@mail.gmail.com> (raw)
In-Reply-To: <20180424235150.GD245996@genre.crustytoothpaste.net>

On 25 April 2018 at 01:51, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
> On Tue, Apr 24, 2018 at 11:50:16AM +0200, Martin Ågren wrote:
>> On 24 April 2018 at 01:39, brian m. carlson
>> <sandals@crustytoothpaste.net> wrote:
>> > The code for reading certain pack v2 offsets had a hard-coded 5
>> > representing the number of uint32_t words that we needed to skip over.
>> > Specify this value in terms of a value from the_hash_algo.
>> >
>> > Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
>> > ---
>> >  builtin/index-pack.c | 3 ++-
>> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/builtin/index-pack.c b/builtin/index-pack.c
>> > index d81473e722..c1f94a7da6 100644
>> > --- a/builtin/index-pack.c
>> > +++ b/builtin/index-pack.c
>> > @@ -1543,12 +1543,13 @@ static void read_v2_anomalous_offsets(struct packed_git *p,
>> >  {
>> >         const uint32_t *idx1, *idx2;
>> >         uint32_t i;
>> > +       const uint32_t hashwords = the_hash_algo->rawsz / sizeof(uint32_t);
>>
>> Should we round up? Or just what should we do if a length is not
>> divisible by 4? (I am not aware of any such hash functions, but one
>> could exist for all I know.) Another question is whether such an
>> index-pack v2 will ever contain non-SHA-1 oids to begin with. I can't
>> find anything suggesting that it could, but this is unfamiliar code to
>> me.
>
> I opted not to simply because I know that our current hash is 20 bytes
> and the new one will be 32, and I know those are both divisible by 4.  I
> feel confident that any future hash we choose will also be divisible by
> 4, and the code is going to be complicated if it isn't.
>
> I agree that pack v2 is not going to have anything but SHA-1.  However,
> writing all the code such that it's algorithm agnostic means that we can
> do testing of new algorithms by wholesale replacing the algorithm with a
> new one, which simplifies things considerably.

Ok. I do sort of wonder if a "successful" test run after globally
substituting Hash-Foo for SHA-1 (regardless of whether the size changes
or not) hints at a problem. That is, nowhere do we test that this code
uses 20-byte SHA-1s, regardless of what other hash functions are
available and configured. Of course, until soon, that did not really
have to be tested since there was only one hash function available to
choose from. As for identifying all the places that matter ... no idea.

Of course I can see how this helps get things to a point where Git does
not crash and burn because the hash has a different size, and where the
test suite doesn't spew failures because the initial chaining value of
"SHA-1" is changed.

Once that is accomplished, I sort of suspect that this code will want to
be updated to not always blindly use the_hash_algo, but to always work
with SHA-1 sizes. Or rather, this would turn into more generic code to
handle both "v2 with SHA-1" and "v3 with some hash function(s)". This
commit might be a good first step in that direction.

Long rambling short, yeah, I see your point.

Martin

  reply	other threads:[~2018-04-25 18:49 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-23 23:39 [PATCH 00/41] object_id part 13 brian m. carlson
2018-04-23 23:39 ` [PATCH 01/41] cache: add a function to read an object ID from a buffer brian m. carlson
2018-04-24  9:39   ` Martin Ågren
2018-05-01  9:36   ` Duy Nguyen
2018-05-01 23:58     ` brian m. carlson
2018-04-23 23:39 ` [PATCH 02/41] server-info: remove unused members from struct pack_info brian m. carlson
2018-04-24  9:41   ` Martin Ågren
2018-05-01  9:39   ` Duy Nguyen
2018-04-23 23:39 ` [PATCH 03/41] Remove unused member in struct object_context brian m. carlson
2018-05-01  9:50   ` Duy Nguyen
2018-04-23 23:39 ` [PATCH 04/41] packfile: remove unused member from struct pack_entry brian m. carlson
2018-05-01 10:01   ` Duy Nguyen
2018-04-23 23:39 ` [PATCH 05/41] packfile: convert has_sha1_pack to object_id brian m. carlson
2018-04-23 23:39 ` [PATCH 06/41] sha1_file: convert freshen functions " brian m. carlson
2018-04-23 23:39 ` [PATCH 07/41] packfile: convert find_pack_entry " brian m. carlson
2018-04-23 23:39 ` [PATCH 08/41] packfile: abstract away hash constant values brian m. carlson
2018-05-01 10:22   ` Duy Nguyen
2018-05-02  0:11     ` brian m. carlson
2018-05-02 15:26       ` Duy Nguyen
2018-05-02 23:05         ` brian m. carlson
2018-04-23 23:39 ` [PATCH 09/41] pack-objects: abstract away hash algorithm brian m. carlson
2018-05-01 10:26   ` Duy Nguyen
2018-04-23 23:39 ` [PATCH 10/41] pack-redundant: " brian m. carlson
2018-04-23 23:39 ` [PATCH 11/41] tree-walk: avoid hard-coded 20 constant brian m. carlson
2018-04-23 23:39 ` [PATCH 12/41] tree-walk: convert get_tree_entry_follow_symlinks to object_id brian m. carlson
2018-04-23 23:39 ` [PATCH 13/41] fsck: convert static functions to struct object_id brian m. carlson
2018-04-23 23:39 ` [PATCH 14/41] submodule-config: convert structures to object_id brian m. carlson
2018-04-23 23:39 ` [PATCH 15/41] split-index: convert struct split_index " brian m. carlson
2018-04-23 23:39 ` [PATCH 16/41] Update struct index_state to use struct object_id brian m. carlson
2018-04-23 23:39 ` [PATCH 17/41] pack-redundant: convert linked lists " brian m. carlson
2018-04-23 23:39 ` [PATCH 18/41] index-pack: abstract away hash function constant brian m. carlson
2018-04-24  9:50   ` Martin Ågren
2018-04-24 23:51     ` brian m. carlson
2018-04-25 18:49       ` Martin Ågren [this message]
2018-04-26 15:46         ` Duy Nguyen
2018-04-27 21:08           ` brian m. carlson
2018-04-28  5:41             ` Duy Nguyen
2018-04-23 23:39 ` [PATCH 19/41] commit: convert uses of get_sha1_hex to get_oid_hex brian m. carlson
2018-04-23 23:39 ` [PATCH 20/41] dir: convert struct untracked_cache_dir to object_id brian m. carlson
2018-04-23 23:39 ` [PATCH 21/41] http: eliminate hard-coded constants brian m. carlson
2018-04-24  9:53   ` Martin Ågren
2018-04-24 23:44     ` Junio C Hamano
2018-04-25  1:29       ` brian m. carlson
2018-04-23 23:39 ` [PATCH 22/41] revision: replace use of " brian m. carlson
2018-04-23 23:39 ` [PATCH 23/41] upload-pack: replace use of several " brian m. carlson
2018-04-24  7:53   ` Simon Ruderich
2018-04-23 23:39 ` [PATCH 24/41] diff: specify abbreviation size in terms of the_hash_algo brian m. carlson
2018-04-23 23:39 ` [PATCH 25/41] builtin/receive-pack: avoid hard-coded constants for push certs brian m. carlson
2018-04-24  9:58   ` Martin Ågren
2018-04-25  2:00     ` brian m. carlson
2018-04-25  5:06       ` Martin Ågren
2018-04-23 23:39 ` [PATCH 26/41] builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo brian m. carlson
2018-04-23 23:39 ` [PATCH 27/41] builtin/merge: switch tree functions to use object_id brian m. carlson
2018-04-23 23:39 ` [PATCH 28/41] merge: convert empty tree constant to the_hash_algo brian m. carlson
2018-04-23 23:39 ` [PATCH 29/41] sequencer: convert one use of EMPTY_TREE_SHA1_HEX brian m. carlson
2018-04-23 23:39 ` [PATCH 30/41] submodule: convert several uses " brian m. carlson
2018-04-23 23:39 ` [PATCH 31/41] wt-status: convert two " brian m. carlson
2018-04-24 10:03   ` Martin Ågren
2018-05-01  2:29     ` brian m. carlson
2018-04-23 23:39 ` [PATCH 32/41] builtin/receive-pack: convert one use " brian m. carlson
2018-04-23 23:39 ` [PATCH 33/41] builtin/reset: convert use of EMPTY_TREE_SHA1_BIN brian m. carlson
2018-04-23 23:39 ` [PATCH 34/41] sha1_file: convert cached object code to struct object_id brian m. carlson
2018-04-23 23:39 ` [PATCH 35/41] cache-tree: use is_empty_tree_oid brian m. carlson
2018-04-23 23:39 ` [PATCH 36/41] sequencer: use the_hash_algo for empty tree object ID brian m. carlson
2018-04-23 23:39 ` [PATCH 37/41] dir: use the_hash_algo for empty blob " brian m. carlson
2018-04-23 23:39 ` [PATCH 38/41] sha1_file: only expose empty object constants through git_hash_algo brian m. carlson
2018-04-23 23:39 ` [PATCH 39/41] Update shell scripts to compute empty tree object ID brian m. carlson
2018-05-01 10:42   ` Duy Nguyen
2018-05-04  1:29     ` brian m. carlson
2018-04-23 23:39 ` [PATCH 40/41] add--interactive: compute the empty tree value brian m. carlson
2018-04-23 23:39 ` [PATCH 41/41] merge-one-file: compute empty blob object ID brian m. carlson
2018-04-24  1:00   ` SZEDER Gábor
2018-04-24  1:03     ` brian m. carlson
2018-04-30 18:03 ` [PATCH 00/41] object_id part 13 Duy Nguyen
2018-04-30 23:59   ` brian m. carlson
2018-05-01 10:51     ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAN0heSqpj9JfTrnMFRbquraxve9iTwoowgWRUhcD-gXHMg3V=g@mail.gmail.com' \
    --to=martin.agren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    --cc=sandals@crustytoothpaste.net \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).