git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Brandon Williams <bmwill@google.com>
Cc: git@vger.kernel.org, peff@peff.net, sandals@crustytoothpaste.net,
	Jonathan Nieder <jrnieder@gmail.com>
Subject: Re: [PATCH 25/33] notes-merge: convert verify_notes_filepair to struct object_id
Date: Sat, 03 Jun 2017 08:49:39 +0900	[thread overview]
Message-ID: <xmqqlgpax84s.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <20170602185528.GC57260@google.com> (Brandon Williams's message of "Fri, 2 Jun 2017 11:55:28 -0700")

Brandon Williams <bmwill@google.com> writes:

> On 06/02, Junio C Hamano wrote:
>> 
>> > -static int path_to_sha1(const char *path, unsigned char *sha1)
>> > +static int path_to_oid(const char *path, struct object_id *oid)
>> >  {
>> > -	char hex_sha1[40];
>> > +	char hex_oid[GIT_SHA1_HEXSZ];
>> >  	int i = 0;
>> > -	while (*path && i < 40) {
>> > +	while (*path && i < GIT_SHA1_HEXSZ) {
>> >  		if (*path != '/')
>> > -			hex_sha1[i++] = *path;
>> > +			hex_oid[i++] = *path;
>> >  		path++;
>> >  	}
>> 
>> It's no brainer to do s/GIT_SHA1_HEXSZ/GIT_MAX_HEXSZ/ for all of the
>> above, but ...
>
> I'll fix this.
>
>> 
>> > -	if (*path || i != 40)
>> > +	if (*path || i != GIT_SHA1_HEXSZ)
>> >  		return -1;
>> 
>> ... this one is tricky.  
>> 
>> What's in our envisioned future?  Are we expecing to see object
>> names, named with two or more hash functions, in a same repository?
>> If so, and one is 20 bytes and another one is 32 bytes, then this
>> should check 'i' against 40 and 64 and pass if 'i' is one of these
>> expected lengths?
>
> That's a good question, and at this point in time do we have an
> envisioned future?  From some of our conversations I seem to remember
> that we don't want a single repository to have objects based on two
> different hash functions, but rather some translation layer to convert
> between two hashing functions (for compatibility with other
> non-converted repos).  Though nothing has been settled upon yet so I
> don't know what the future would look like (and the best way to prepare
> for it).

I do not think we know precisely what we want yet.  The "MAX" in the
allocation size is an indication that we are allocating for the
largest variant possible in the version of Git being compiled, which
is a hint that we anticipate that we may have multiple variants with
different sizes.  And that is where my "against 40 and 64 ... one of
these expected lengths" come from.  

But there is no such set of macros that define acceptable/expected
lengths and that tells me that nobody among who defined, reviewed
and accepted the MAX macro has thought this part through yet.

I see Jonathan started talking about experiments with "different
hash" elsewhere; I am reading his message as assuming one hash at a
time, and the repository tells us what the hash is.  And I think
that is one plausible design.

If we were to go that route, then

    - MAX will be the maximum among hashes we can choose at the
      beginning of a process (perhaps by consulting the repository
      extension).

    - In addition to GIT_SHA1_HEXSZ, GIT_SHA2_HEXSZ, and friends, we
      need a variable that records the length of the hash chosen for
      the process at the startup, GIT_OIDHASH_HEXSZ.  That would
      become one of the fields in your "struct repository",

        #define GIT_OIDHASH_HEXSZ (the_repository.oidhash_hexsz)

      among other things like what the oid for an empty blob is.

I'd say in the meantime we can do something like the attached and
use it here, perhaps?

 cache.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/cache.h b/cache.h
index 53999418d3..cafb13db9f 100644
--- a/cache.h
+++ b/cache.h
@@ -70,6 +70,10 @@ unsigned long git_deflate_bound(git_zstream *, unsigned long);
 #define GIT_MAX_RAWSZ GIT_SHA1_RAWSZ
 #define GIT_MAX_HEXSZ GIT_SHA1_HEXSZ
 
+/* The length for the object name hash */
+#define GIT_OIDHASH_RAWSZ GIT_SHA1_RAWSZ
+#define GIT_OIDHASH_HEXSZ GIT_SHA1_HEXSZ
+
 struct object_id {
 	unsigned char hash[GIT_MAX_RAWSZ];
 };

  reply	other threads:[~2017-06-02 23:49 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-30 17:30 [PATCH 00/33] object id conversion (grep and diff) Brandon Williams
2017-05-30 17:30 ` [PATCH 01/33] notes: convert internal structures to struct object_id Brandon Williams
2017-05-30 17:30 ` [PATCH 02/33] notes: convert internal parts " Brandon Williams
2017-05-30 17:30 ` [PATCH 03/33] notes: convert for_each_note " Brandon Williams
2017-05-30 17:30 ` [PATCH 04/33] notes: make get_note return pointer " Brandon Williams
2017-07-15 18:15   ` René Scharfe
2017-07-17 17:49     ` Brandon Williams
2017-05-30 17:30 ` [PATCH 05/33] notes: convert format_display_notes " Brandon Williams
2017-05-30 17:30 ` [PATCH 06/33] builtin/notes: convert " Brandon Williams
2017-05-30 17:30 ` [PATCH 07/33] notes: convert some accessor functions " Brandon Williams
2017-05-30 17:30 ` [PATCH 08/33] grep: convert " Brandon Williams
2017-06-02  1:00   ` Junio C Hamano
2017-05-30 17:30 ` [PATCH 09/33] diff: convert get_stat_data " Brandon Williams
2017-05-30 17:30 ` [PATCH 10/33] diff: convert diff_index_show_file " Brandon Williams
2017-05-30 17:30 ` [PATCH 11/33] diff: convert diff_addremove " Brandon Williams
2017-05-30 17:30 ` [PATCH 12/33] diff: convert run_diff_files " Brandon Williams
2017-05-30 17:30 ` [PATCH 13/33] diff: convert diff_change " Brandon Williams
2017-05-30 17:30 ` [PATCH 14/33] diff: convert fill_filespec " Brandon Williams
2017-05-30 17:30 ` [PATCH 15/33] diff: convert reuse_worktree_file " Brandon Williams
2017-05-30 17:30 ` [PATCH 16/33] diff: finish conversion for prepare_temp_file " Brandon Williams
2017-05-31  0:41   ` Stefan Beller
2017-05-30 17:30 ` [PATCH 17/33] patch-ids: convert " Brandon Williams
2017-05-30 17:30 ` [PATCH 18/33] diff: convert diff_flush_patch_id " Brandon Williams
2017-05-30 17:30 ` [PATCH 19/33] combine-diff: convert diff_tree_combined " Brandon Williams
2017-05-30 17:30 ` [PATCH 20/33] combine-diff: convert find_paths_* " Brandon Williams
2017-05-31 17:49   ` Stefan Beller
2017-06-02  1:37     ` Junio C Hamano
2017-05-30 17:30 ` [PATCH 21/33] tree-diff: convert diff_root_tree_sha1 " Brandon Williams
2017-05-30 17:30 ` [PATCH 22/33] notes-merge: convert notes_merge* " Brandon Williams
2017-05-31 17:54   ` Stefan Beller
2017-05-31 22:00   ` brian m. carlson
2017-06-02  1:13     ` Junio C Hamano
2017-06-02 18:32     ` Brandon Williams
2017-05-30 17:30 ` [PATCH 23/33] notes-merge: convert merge_from_diffs " Brandon Williams
2017-05-31 18:04   ` Stefan Beller
2017-06-02  0:42     ` Junio C Hamano
2017-05-30 17:31 ` [PATCH 24/33] notes-merge: convert find_notes_merge_pair_ps " Brandon Williams
2017-05-30 17:31 ` [PATCH 25/33] notes-merge: convert verify_notes_filepair " Brandon Williams
2017-05-31 18:09   ` Stefan Beller
2017-06-02  0:47   ` Junio C Hamano
2017-06-02 18:55     ` Brandon Williams
2017-06-02 23:49       ` Junio C Hamano [this message]
2017-05-30 17:31 ` [PATCH 26/33] notes-merge: convert write_note_to_worktree " Brandon Williams
2017-05-30 17:31 ` [PATCH 27/33] diff-tree: convert diff_tree_sha1 " Brandon Williams
2017-05-30 17:31 ` [PATCH 28/33] builtin/diff-tree: cleanup references to sha1 Brandon Williams
2017-06-02  1:26   ` Junio C Hamano
2017-05-30 17:31 ` [PATCH 29/33] tree-diff: convert try_to_follow_renames to struct object_id Brandon Williams
2017-05-30 17:31 ` [PATCH 30/33] tree-diff: convert diff_tree_paths " Brandon Williams
2017-05-31 18:24   ` Stefan Beller
2017-05-31 21:29     ` Jeff King
2017-05-31 21:39       ` Jeff King
2017-06-02  1:31   ` Junio C Hamano
2017-07-15 17:18   ` René Scharfe
2017-07-15 17:22     ` brian m. carlson
2017-05-30 17:31 ` [PATCH 31/33] tree-diff: convert path_appendnew to object_id Brandon Williams
2017-06-02  1:32   ` Junio C Hamano
2017-05-30 17:31 ` [PATCH 32/33] diffcore-rename: use is_empty_blob_oid Brandon Williams
2017-05-30 17:31 ` [PATCH 33/33] diff: rename diff_fill_sha1_info to diff_fill_oid_info Brandon Williams
2017-05-31 22:06 ` [PATCH 00/33] object id conversion (grep and diff) brian m. carlson
2017-06-02 19:24   ` Brandon Williams
2017-06-02  1:34 ` Junio C Hamano
2017-06-02  5:08   ` Junio C Hamano
2017-06-02  7:19     ` Junio C Hamano
2017-06-02 18:22       ` Brandon Williams
2017-06-02 23:35         ` Junio C Hamano
2017-06-05 19:42           ` Brandon Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqlgpax84s.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=bmwill@google.com \
    --cc=git@vger.kernel.org \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).