git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: martin.agren@gmail.com, sandals@crustytoothpaste.net,
	me@ttaylorr.com, abhishekkumar8222@gmail.com,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH 2/3] commit-graph: use the hash version byte
Date: Fri, 14 Aug 2020 18:07:19 +0000	[thread overview]
Message-ID: <4bbfd345d16da4604dd20decda8ecb12372e4223.1597428440.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.703.git.1597428440.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

The commit-graph format reserved a byte among the header of the file to
store a "hash version". During the SHA-256 work, this was not modified
because file formats are not necessarily intended to work across hash
versions. If a repository has SHA-256 as its hash algorithm, it
automatically up-shifts the lengths of object names in all necessary
formats.

However, since we have this byte available for adjusting the version, we
can make the file formats more obviously incompatible instead of relying
on other context from the repository.

Update the oid_version() method in commit-graph.c to add a new value, 2,
for sha-256. This automatically writes the new value in a SHA-256
repository _and_ verifies the value is correct. This is a breaking
change relative to the current 'master' branch since 092b677 (Merge
branch 'bc/sha-256-cvs-svn-updates', 2020-08-13) but it is not breaking
relative to any released version of Git.

The test impact is relatively minor: the output of 'test-tool
read-graph' lists the header information, so those instances of '1' need
to be replaced with a variable determined by GIT_TEST_DEFAULT_HASH. A
more careful test is added that specifically creates a repository of
each type then swaps the commit-graph files. The important value here is
that the "git log" command succeeds while writing a message to stderr.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 .../technical/commit-graph-format.txt         |  9 ++++-
 commit-graph.c                                |  6 ++-
 t/t4216-log-bloom.sh                          |  8 +++-
 t/t5318-commit-graph.sh                       | 37 ++++++++++++++++++-
 t/t5324-split-commit-graph.sh                 |  8 +++-
 5 files changed, 62 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/commit-graph-format.txt b/Documentation/technical/commit-graph-format.txt
index 440541045d..6ddbceba15 100644
--- a/Documentation/technical/commit-graph-format.txt
+++ b/Documentation/technical/commit-graph-format.txt
@@ -42,8 +42,13 @@ HEADER:
   1-byte version number:
       Currently, the only valid version is 1.
 
-  1-byte Hash Version (1 = SHA-1)
-      We infer the hash length (H) from this value.
+  1-byte Hash Version
+      We infer the hash length (H) from this value:
+	1 => SHA-1
+	2 => SHA-256
+      If the hash type does not match the repository's hash algorithm, the
+      commit-graph file should be ignored with a warning presented to the
+      user.
 
   1-byte number (C) of "chunks"
 
diff --git a/commit-graph.c b/commit-graph.c
index e51c91dd5b..d03328d64c 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -179,7 +179,11 @@ static char *get_chain_filename(struct object_directory *odb)
 
 static uint8_t oid_version(void)
 {
-	return 1;
+	if (the_hash_algo->rawsz == GIT_SHA1_RAWSZ)
+		return 1;
+	if (the_hash_algo->rawsz == GIT_SHA256_RAWSZ)
+		return 2;
+	die(_("invalid hash version"));
 }
 
 static struct commit_graph *alloc_commit_graph(void)
diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh
index c21cc160f3..906af2799d 100755
--- a/t/t4216-log-bloom.sh
+++ b/t/t4216-log-bloom.sh
@@ -6,6 +6,12 @@ test_description='git log for a path with Bloom filters'
 GIT_TEST_COMMIT_GRAPH=0
 GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0
 
+OID_VERSION=1
+if [ "$GIT_DEFAULT_HASH" = "sha256" ]
+then
+	OID_VERSION=2
+fi
+
 test_expect_success 'setup test - repo, commits, commit graph, log outputs' '
 	git init &&
 	mkdir A A/B A/B/C &&
@@ -35,7 +41,7 @@ test_expect_success 'setup test - repo, commits, commit graph, log outputs' '
 graph_read_expect () {
 	NUM_CHUNKS=5
 	cat >expect <<- EOF
-	header: 43475048 1 1 $NUM_CHUNKS 0
+	header: 43475048 1 $OID_VERSION $NUM_CHUNKS 0
 	num_commits: $1
 	chunks: oid_fanout oid_lookup commit_metadata bloom_indexes bloom_data
 	EOF
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 044cf8a3de..5b65017676 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -5,6 +5,12 @@ test_description='commit graph'
 
 GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0
 
+OID_VERSION=1
+if [ "$GIT_DEFAULT_HASH" = "sha256" ]
+then
+	OID_VERSION=2
+fi
+
 test_expect_success 'setup full repo' '
 	mkdir full &&
 	cd "$TRASH_DIRECTORY/full" &&
@@ -77,7 +83,7 @@ graph_read_expect() {
 		NUM_CHUNKS=$((3 + $(echo "$2" | wc -w)))
 	fi
 	cat >expect <<- EOF
-	header: 43475048 1 1 $NUM_CHUNKS 0
+	header: 43475048 1 $OID_VERSION $NUM_CHUNKS 0
 	num_commits: $1
 	chunks: oid_fanout oid_lookup commit_metadata$OPTIONAL
 	EOF
@@ -412,6 +418,35 @@ test_expect_success 'replace-objects invalidates commit-graph' '
 	)
 '
 
+test_expect_success 'warn on improper hash version' '
+	git init --object-format=sha1 sha1 &&
+	(
+		cd sha1 &&
+		test_commit 1 &&
+		git commit-graph write --reachable &&
+		mv .git/objects/info/commit-graph ../cg-sha1
+	) &&
+	git init --object-format=sha256 sha256 &&
+	(
+		cd sha256 &&
+		test_commit 1 &&
+		git commit-graph write --reachable &&
+		mv .git/objects/info/commit-graph ../cg-sha256
+	) &&
+	(
+		cd sha1 &&
+		mv ../cg-sha256 .git/objects/info/commit-graph &&
+		git log -1 2>err &&
+		test_i18ngrep "commit-graph hash version 2 does not match version 1" err
+	) &&
+	(
+		cd sha256 &&
+		mv ../cg-sha1 .git/objects/info/commit-graph &&
+		git log -1 2>err &&
+		test_i18ngrep "commit-graph hash version 1 does not match version 2" err
+	)
+'
+
 # the verify tests below expect the commit-graph to contain
 # exactly the commits reachable from the commits/8 branch.
 # If the file changes the set of commits in the list, then the
diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh
index ea28d522b8..6f1a324f4f 100755
--- a/t/t5324-split-commit-graph.sh
+++ b/t/t5324-split-commit-graph.sh
@@ -6,6 +6,12 @@ test_description='split commit graph'
 GIT_TEST_COMMIT_GRAPH=0
 GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0
 
+OID_VERSION=1
+if [ "$GIT_DEFAULT_HASH" = "sha256" ]
+then
+	OID_VERSION=2
+fi
+
 test_expect_success 'setup repo' '
 	git init &&
 	git config core.commitGraph true &&
@@ -28,7 +34,7 @@ graph_read_expect() {
 		NUM_BASE=$2
 	fi
 	cat >expect <<- EOF
-	header: 43475048 1 1 3 $NUM_BASE
+	header: 43475048 1 $OID_VERSION 3 $NUM_BASE
 	num_commits: $1
 	chunks: oid_fanout oid_lookup commit_metadata
 	EOF
-- 
gitgitgadget


  parent reply	other threads:[~2020-08-14 18:07 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-14 18:07 [PATCH 0/3] SHA-256: Update commit-graph and multi-pack-index formats Derrick Stolee via GitGitGadget
2020-08-14 18:07 ` [PATCH 1/3] t/README: document GIT_TEST_DEFAULT_HASH Derrick Stolee via GitGitGadget
2020-08-14 19:02   ` Junio C Hamano
2020-08-14 20:39   ` Eric Sunshine
2020-08-14 20:41     ` Derrick Stolee
2020-08-14 18:07 ` Derrick Stolee via GitGitGadget [this message]
2020-08-14 19:05   ` [PATCH 2/3] commit-graph: use the hash version byte Junio C Hamano
2020-08-14 20:05     ` Taylor Blau
2020-08-14 20:11   ` brian m. carlson
2020-08-14 20:22     ` Junio C Hamano
2020-08-14 20:36     ` Derrick Stolee
2020-08-15 13:46       ` Martin Ågren
2020-08-14 18:07 ` [PATCH 3/3] multi-pack-index: use " Derrick Stolee via GitGitGadget
2020-08-14 20:14   ` brian m. carlson
2020-08-14 19:25 ` [PATCH 0/3] SHA-256: Update commit-graph and multi-pack-index formats Junio C Hamano
2020-08-14 20:34   ` Derrick Stolee
2020-08-14 21:41     ` Junio C Hamano
2020-08-17 14:04 ` [PATCH v2 " Derrick Stolee via GitGitGadget
2020-08-17 14:04   ` [PATCH v2 1/3] t/README: document GIT_TEST_DEFAULT_HASH Derrick Stolee via GitGitGadget
2020-08-17 14:04   ` [PATCH v2 2/3] commit-graph: use the "hash version" byte Derrick Stolee via GitGitGadget
2020-08-17 14:04   ` [PATCH v2 3/3] multi-pack-index: use hash version byte Derrick Stolee via GitGitGadget
2020-08-17 23:12   ` [PATCH v2 0/3] SHA-256: Update commit-graph and multi-pack-index formats brian m. carlson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4bbfd345d16da4604dd20decda8ecb12372e4223.1597428440.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=abhishekkumar8222@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=martin.agren@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=sandals@crustytoothpaste.net \
    --subject='Re: [PATCH 2/3] commit-graph: use the hash version byte' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Code repositories for project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).