From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "brian m. carlson" <sandals@crustytoothpaste.net>
Cc: git@vger.kernel.org, Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 03/15] cache: add an algo member to struct object_id
Date: Sun, 11 Apr 2021 13:55:57 +0200 [thread overview]
Message-ID: <87mtu5f31u.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <20210410152140.3525040-4-sandals@crustytoothpaste.net>
On Sat, Apr 10 2021, brian m. carlson wrote:
> Now that we're working with multiple hash algorithms in the same repo,
> it's best if we label each object ID with its algorithm so we can
> determine how to format a given object ID. Add a member called algo to
> struct object_id.
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
> hash.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/hash.h b/hash.h
> index 3fb0c3d400..dafdcb3335 100644
> --- a/hash.h
> +++ b/hash.h
> @@ -181,6 +181,7 @@ static inline int hash_algo_by_ptr(const struct git_hash_algo *p)
>
> struct object_id {
> unsigned char hash[GIT_MAX_RAWSZ];
> + int algo;
Curiosity since I'm not as familiar as you with the multi-hash support
by far:
So struct object_id is GIT_MAX_RAWSZ, not two types of structs for
GIT_SHA1_RAWSZ and GIT_SHA256_RAWSZ. That pre-dates this series because
we'd like to not deal with two types of objects everywhere for SHA-1 and
SHA-256. Makes sense.
Before this series we'd memcmp them up to their actual length, but the
last GIT_MAX_RAWSZ-GIT_SHA1_RAWSZ would be uninitialized
Now we pad them out, so the last 96 bits of every SHA1 are 0000...;
Couldn't we also tell which hash an object is by memcmp-ing those last N
bits and see if they're all zero'd?
Feels a bit hackish, and we'd need to reconsider that method if we'd
ever support other same-length hashes.
But OTOH having these objects all padded out in memory to the same
length, but having to carry around a "what hash algo" is it yields the
arguably weird hack of having a per-hash NULL_OID, which has never been
an actual object of any hash type, but just a pseudo-object.
As another aside I had some local patches (just for playing around) to
implement SHA-256/160, i.e. a SHA-256-to-SHA-1-length that doesn't
officially exist. We'd store things as full-length SHA-256 internally,
but on anything that would format them (including plumbing output) we'd
emit the truncated version(s).
The idea was to support Git/SHA-256 when combined with legacy systems
who'd all need DB column changes to have different length hashes.
I abandoned it as insany sillyness after playing with it for about a
day, but it did reveal that much of the hash code now can assume
internal length == formatting length, which is why I'm 3 paragraphs into
this digression, i.e. maybe some of the code structure also makes having
a NULL_OID always be 256-bits when we want to format it as 160/256
painful...
next prev parent reply other threads:[~2021-04-11 11:56 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-10 15:21 [PATCH 00/15] SHA-256 / SHA-1 interop, part 1 brian m. carlson
2021-04-10 15:21 ` [PATCH 01/15] sha1-file: allow hashing objects literally with any algorithm brian m. carlson
2021-04-15 8:55 ` Denton Liu
2021-04-15 23:03 ` brian m. carlson
2021-04-16 15:04 ` Ævar Arnfjörð Bjarmason
2021-04-16 18:55 ` Junio C Hamano
2021-04-10 15:21 ` [PATCH 02/15] builtin/hash-object: allow literally hashing with a given algorithm brian m. carlson
2021-04-11 8:52 ` Ævar Arnfjörð Bjarmason
2021-04-11 21:07 ` brian m. carlson
2021-04-16 15:21 ` Ævar Arnfjörð Bjarmason
2021-04-16 17:27 ` Ævar Arnfjörð Bjarmason
2021-04-10 15:21 ` [PATCH 03/15] cache: add an algo member to struct object_id brian m. carlson
2021-04-11 11:55 ` Ævar Arnfjörð Bjarmason [this message]
2021-04-11 21:37 ` brian m. carlson
2021-04-13 12:12 ` Derrick Stolee
2021-04-14 1:08 ` brian m. carlson
2021-04-15 8:47 ` Ævar Arnfjörð Bjarmason
2021-04-15 23:51 ` brian m. carlson
2021-04-10 15:21 ` [PATCH 04/15] Always use oidread to read into " brian m. carlson
2021-04-10 15:21 ` [PATCH 05/15] hash: add a function to finalize object IDs brian m. carlson
2021-04-10 15:21 ` [PATCH 06/15] Use the final_oid_fn to finalize hashing of " brian m. carlson
2021-04-10 15:21 ` [PATCH 07/15] builtin/pack-redundant: avoid casting buffers to struct object_id brian m. carlson
2021-04-10 15:21 ` [PATCH 08/15] cache: compare the entire buffer for " brian m. carlson
2021-04-11 8:17 ` Chris Torek
2021-04-11 11:36 ` Ævar Arnfjörð Bjarmason
2021-04-11 21:05 ` brian m. carlson
2021-04-10 15:21 ` [PATCH 09/15] hash: set and copy algo field in " brian m. carlson
2021-04-11 11:57 ` Ævar Arnfjörð Bjarmason
2021-04-11 21:48 ` brian m. carlson
2021-04-11 22:12 ` Ævar Arnfjörð Bjarmason
2021-04-11 23:52 ` brian m. carlson
2021-04-12 11:02 ` [PATCH 0/2] C99: harder dependency on variadic macros Ævar Arnfjörð Bjarmason
2021-04-12 11:02 ` [PATCH 1/2] git-compat-util.h: clarify comment on GCC-specific code Ævar Arnfjörð Bjarmason
2021-04-13 7:57 ` Jeff King
2021-04-13 21:07 ` Junio C Hamano
2021-04-14 5:21 ` Jeff King
2021-04-14 6:12 ` Ævar Arnfjörð Bjarmason
2021-04-14 7:31 ` Jeff King
2021-05-21 2:06 ` Jonathan Nieder
2021-04-12 11:02 ` [PATCH 2/2] C99 support: remove non-HAVE_VARIADIC_MACROS code Ævar Arnfjörð Bjarmason
2021-04-12 17:58 ` Junio C Hamano
2021-04-13 8:00 ` Jeff King
2021-05-21 2:50 ` Jonathan Nieder
2021-04-12 12:14 ` [PATCH 0/2] C99: harder dependency on variadic macros Bagas Sanjaya
2021-04-12 12:41 ` Ævar Arnfjörð Bjarmason
2021-04-12 22:57 ` brian m. carlson
2021-04-12 23:19 ` Junio C Hamano
2022-01-28 11:11 ` [PATCH v2 0/2] C99: remove hardcoded-out !HAVE_VARIADIC_MACROS code Ævar Arnfjörð Bjarmason
2022-01-28 11:11 ` [PATCH v2 1/2] git-compat-util.h: clarify GCC v.s. C99-specific in comment Ævar Arnfjörð Bjarmason
2022-01-28 11:11 ` [PATCH v2 2/2] C99: remove hardcoded-out !HAVE_VARIADIC_MACROS code Ævar Arnfjörð Bjarmason
2022-01-28 22:40 ` Junio C Hamano
2022-02-19 10:41 ` [PATCH v3 0/3] C99: remove dead " Ævar Arnfjörð Bjarmason
2022-02-19 10:41 ` [PATCH v3 1/3] git-compat-util.h: clarify GCC v.s. C99-specific in comment Ævar Arnfjörð Bjarmason
2022-02-19 10:41 ` [PATCH v3 2/3] C99: remove hardcoded-out !HAVE_VARIADIC_MACROS code Ævar Arnfjörð Bjarmason
2022-02-19 10:41 ` [PATCH v3 3/3] trace.h: remove never-used TRACE_CONTEXT Ævar Arnfjörð Bjarmason
2022-02-20 12:02 ` Junio C Hamano
2022-02-20 12:38 ` Ævar Arnfjörð Bjarmason
2022-02-20 20:12 ` Junio C Hamano
2022-02-21 16:05 ` [PATCH v4 0/2] C99: remove dead !HAVE_VARIADIC_MACROS code Ævar Arnfjörð Bjarmason
2022-02-21 16:05 ` [PATCH v4 1/2] git-compat-util.h: clarify GCC v.s. C99-specific in comment Ævar Arnfjörð Bjarmason
2022-02-21 16:05 ` [PATCH v4 2/2] C99: remove hardcoded-out !HAVE_VARIADIC_MACROS code Ævar Arnfjörð Bjarmason
2021-04-12 10:53 ` [PATCH 09/15] hash: set and copy algo field in struct object_id Junio C Hamano
2021-04-12 11:13 ` Ævar Arnfjörð Bjarmason
2021-04-10 15:21 ` [PATCH 10/15] hash: provide per-algorithm null OIDs brian m. carlson
2021-04-11 14:03 ` Junio C Hamano
2021-04-11 21:51 ` brian m. carlson
2021-04-10 15:21 ` [PATCH 11/15] builtin/show-index: set the algorithm for object IDs brian m. carlson
2021-04-10 15:21 ` [PATCH 12/15] commit-graph: don't store file hashes as struct object_id brian m. carlson
2021-04-10 15:21 ` [PATCH 13/15] builtin/pack-objects: avoid using struct object_id for pack hash brian m. carlson
2021-04-10 15:21 ` [PATCH 14/15] hex: default to the_hash_algo on zero algorithm value brian m. carlson
2021-04-10 15:21 ` [PATCH 15/15] hex: print objects using the hash algorithm member brian m. carlson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87mtu5f31u.fsf@evledraar.gmail.com \
--to=avarab@gmail.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=sandals@crustytoothpaste.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).