From: "Eric W. Biederman" <ebiederm@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: <git@vger.kernel.org>, "brian m. carlson" <sandals@crustytoothpaste.net>
Subject: [PATCH 00/30] Initial support for multiple hash functions
Date: Wed, 27 Sep 2023 14:49:57 -0500
Message-ID: <87jzsbjt0a.fsf@gmail.froward.int.ebiederm.org>


I have been going over and over this patchset trying to figure
out if it is ready to be merged.  I don't know of any deficiencies
so it is at a point it could benefit from a set of eyes that
are not mine.

I had planned to wait a little bit longer but there are some on-going
conversations that could benefit from people seeing what it means for a
repository to support two hash functions at the same time.


A key part of the hash function transition plan is a way that a single
git repository can inter-operate with git repositories whose storage
hash function is SHA-1 and git repositories whose storage hash function
is SHA-256.

This interoperability can be defined in terms of two repositories: one
whose storage hash function is SHA-1 and another whose storage hash
function is SHA-256.  Those two repositories receive exactly the same
objects, but they store them in different but equivalent ways.

For a repository that has one storage hash function to inter-operate
with a repository that has a different storage hash function, the first
repository must be able to produce its objects as if they were stored
with the second hash function.

This series of changes focuses on implementing the pieces that allow
a repository that uses one storage hash function to produce the objects
that would have been stored with a second storage hash function.

The final patch in this series adds a test that creates two
repositories, one that uses SHA-1 as its storage hash function and the
other that uses SHA-256 as its storage hash function.  Identical
operations are performed on the two repositories, and their
compatibility objects are compared to verify they are the same.
In other words, the SHA-1 repository generates on the fly the objects
stored in the SHA-256 repository, and the SHA-256 repository generates
on the fly the objects stored in the SHA-1 repository.

There are two fundamental technologies for enabling this.
- The ability to convert a stored object into the object the
  other repository would have stored.
- The ability to remember a mapping between SHA-1 and SHA-256 oids
  of equivalent objects.

With such technologies it is very easy to implement user-facing changes.
To avoid locking git into poor decisions by accident I have done my best
to minimize the user-facing changes, while still building the internal
infrastructure that is needed for interoperability.

All of this work is inspired by earlier work on interoperability by
"brian m. carlson" and some of the key pieces of code are still his.

To get to the point where I can test if a SHA-1 and a SHA-256 repository
can on the fly generate each other, I have made some small user-facing
changes.

git rev-parse now supports --output-object-format as a way to query
the internal mapping tables between oids and report the equivalent
oid of the other format.

git cat-file, when given an oid that does not match the repository's
storage format, will now attempt to find the equivalent object that is
stored in the repository.  If it is found, cat-file dynamically
generates the object that would have been stored in a repository with
a different storage hash function and displays that object.
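
To illustrate what generating the other format's object involves, here
is a rough Python sketch of converting a tree.  It assumes the usual
tree entry layout ("<mode> <name>", a NUL byte, then the raw oid bytes).
The names parse_tree, convert_tree and raw_oid, and the use of a plain
dict as the mapping, are hypothetical and are not functions from this
series.  The sketch simply raises KeyError when an entry has no known
equivalent, which a real implementation would have to handle.

```python
import hashlib

# Raw (binary) oid length in bytes for each hash function.
RAW_LEN = {"sha1": 20, "sha256": 32}


def raw_oid(obj_type: str, body: bytes, algo: str) -> bytes:
    # Binary object id: hash of "<type> <size>", a NUL byte, and the body.
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.new(algo, header + body).digest()


def parse_tree(body: bytes, algo: str):
    # Yield (mode, name, raw_oid) for each entry of a tree object body.
    raw_len = RAW_LEN[algo]
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)      # end of the mode field
        nul = body.index(b"\x00", space)   # end of the entry name
        end = nul + 1 + raw_len
        yield body[pos:space], body[space + 1:nul], body[nul + 1:end]
        pos = end


def convert_tree(body: bytes, src_algo: str, mapping: dict[bytes, bytes]) -> bytes:
    # Rewrite every embedded oid using the mapping; modes and names are kept.
    out = bytearray()
    for mode, name, raw in parse_tree(body, src_algo):
        out += mode + b" " + name + b"\x00" + mapping[raw]
    return bytes(out)
```

Hashing the converted body with the other hash function then gives the
tree's oid in that format, which is the pair the mapping has to
remember.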

An additional file loose-object-index will be stored in ".git/objects/".

An additional option "extensions.compatObjectFormat" is implemented.
When set, git generates and stores mappings between the oids of objects
stored in the repository and the oids of the equivalent objects that
would be stored in a repository whose storage format was
extensions.compatObjectFormat.
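
As an illustration only, a SHA-256 repository that also tracks SHA-1
equivalents might carry a configuration along these lines.  This
assumes the new option accepts the same algorithm names ("sha1",
"sha256") as the existing extensions.objectFormat.

```
[core]
	repositoryformatversion = 1
[extensions]
	objectFormat = sha256
	compatObjectFormat = sha1
```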

Eric W. Biederman (23):
      object-file-convert: Stubs for converting from one object format to another
      oid-array: Teach oid-array to handle multiple kinds of oids
      object-names: Support input of oids in any supported hash
      repository: add a compatibility hash algorithm
      loose: Compatibilty short name support
      object-file: Update the loose object map when writing loose objects
      object-file: Add a compat_oid_in parameter to write_object_file_flags
      commit: Convert mergetag before computing the signature of a commit
      commit: Export add_header_signature to support handling signatures on tags
      tag: sign both hashes
      object: Factor out parse_mode out of fast-import and tree-walk into in object.h
      object-file-convert: Don't leak when converting tag objects
      object-file-convert: Convert commits that embed signed tags
      object-file: Update object_info_extended to reencode objects
      rev-parse: Add an --output-object-format parameter
      builtin/cat-file:  Let the oid determine the output algorithm
      tree-walk: init_tree_desc take an oid to get the hash algorithm
      object-file: Handle compat objects in check_object_signature
      builtin/ls-tree: Let the oid determine the output algorithm
      test-lib: Compute the compatibility hash so tests may use it
      t1006: Rename sha1 to oid
      t1006: Test oid compatibility with cat-file
      t1016-compatObjectFormat: Add tests to verify the conversion between objects

brian m. carlson (7):
      loose: add a mapping between SHA-1 and SHA-256 for loose objects
      commit: write commits for both hashes
      cache: add a function to read an OID of a specific algorithm
      object-file-convert: add a function to convert trees between algorithms
      object-file-convert: convert tag objects when writing
      object-file-convert: convert commit objects when writing
      repository: Implement extensions.compatObjectFormat

 Documentation/config/extensions.txt |  12 ++
 Documentation/git-rev-parse.txt     |  12 ++
 Makefile                            |   3 +
 archive.c                           |   3 +-
 builtin/am.c                        |   6 +-
 builtin/cat-file.c                  |  12 +-
 builtin/checkout.c                  |   8 +-
 builtin/clone.c                     |   2 +-
 builtin/commit.c                    |   2 +-
 builtin/fast-import.c               |  18 +-
 builtin/grep.c                      |   8 +-
 builtin/ls-tree.c                   |   5 +-
 builtin/merge.c                     |   3 +-
 builtin/pack-objects.c              |   6 +-
 builtin/read-tree.c                 |   2 +-
 builtin/rev-parse.c                 |  25 ++-
 builtin/stash.c                     |   5 +-
 builtin/tag.c                       |  45 ++++-
 cache-tree.c                        |   4 +-
 commit.c                            | 219 ++++++++++++++++-----
 commit.h                            |   1 +
 delta-islands.c                     |   2 +-
 diff-lib.c                          |   2 +-
 fsck.c                              |   6 +-
 hash-ll.h                           |   1 +
 hash.h                              |   9 +-
 http-push.c                         |   2 +-
 list-objects.c                      |   2 +-
 loose.c                             | 258 ++++++++++++++++++++++++
 loose.h                             |  22 +++
 match-trees.c                       |   4 +-
 merge-ort.c                         |  11 +-
 merge-recursive.c                   |   2 +-
 merge.c                             |   3 +-
 object-file-convert.c               | 277 ++++++++++++++++++++++++++
 object-file-convert.h               |  24 +++
 object-file.c                       | 212 ++++++++++++++++++--
 object-name.c                       |  49 +++--
 object-name.h                       |   3 +-
 object-store-ll.h                   |   7 +-
 object.c                            |   2 +
 object.h                            |  18 ++
 oid-array.c                         |  12 +-
 pack-bitmap-write.c                 |   2 +-
 packfile.c                          |   3 +-
 reflog.c                            |   2 +-
 repository.c                        |  14 ++
 repository.h                        |   4 +
 revision.c                          |   4 +-
 setup.c                             |  22 +++
 setup.h                             |   1 +
 t/helper/test-delete-gpgsig.c       |  62 ++++++
 t/helper/test-tool.c                |   1 +
 t/helper/test-tool.h                |   1 +
 t/t1006-cat-file.sh                 | 379 +++++++++++++++++++++---------------
 t/t1016-compatObjectFormat.sh       | 280 ++++++++++++++++++++++++++
 t/t1016/gpg                         |   2 +
 t/test-lib-functions.sh             |  17 +-
 tree-walk.c                         |  58 +++---
 tree-walk.h                         |   7 +-
 tree.c                              |   2 +-
 walker.c                            |   2 +-
 62 files changed, 1843 insertions(+), 349 deletions(-)

Eric

