git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Han Xin" <chiyutianyi@gmail.com>,
	"Jiang Xin" <worldhello.net@gmail.com>,
	"René Scharfe" <l.s.r@web.de>,
	"Derrick Stolee" <stolee@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v3 08/12] object-file API: split up and simplify check_object_signature()
Date: Sat,  5 Feb 2022 00:48:30 +0100	[thread overview]
Message-ID: <patch-v3-08.12-478d2915731-20220204T234435Z-avarab@gmail.com> (raw)
In-Reply-To: <cover-v3-00.12-00000000000-20220204T234435Z-avarab@gmail.com>

Split up the check_object_signature() function into that non-streaming
version (it accepts an already filled "buf"), and a new
stream_object_signature() which will retrieve the object from storage,
and hash it on-the-fly.

All of the callers of check_object_signature() were effectively
calling two different functions, if we go by cyclomatic
complexity. I.e. they'd either take the early "if (map)" branch and
return early, or not. This has been the case since the "if (map)"
condition was added in 090ea12671b (parse_object: avoid putting whole
blob in core, 2012-03-07).

We can then further simplify the resulting check_object_signature()
function since only one caller wanted to pass a non-NULL "buf" and a
non-NULL "real_oidp". That "read_loose_object()" codepath used by "git
fsck" can instead use hash_object_file() followed by oideq().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/fast-export.c |  2 +-
 builtin/index-pack.c  |  2 +-
 builtin/mktag.c       |  5 ++---
 cache.h               | 12 ++++++++----
 object-file.c         | 32 ++++++++++++++++++--------------
 object.c              |  4 ++--
 pack-check.c          |  9 ++++++---
 7 files changed, 38 insertions(+), 28 deletions(-)

diff --git a/builtin/fast-export.c b/builtin/fast-export.c
index 9f1c730e587..319859db30e 100644
--- a/builtin/fast-export.c
+++ b/builtin/fast-export.c
@@ -300,7 +300,7 @@ static void export_blob(const struct object_id *oid)
 		if (!buf)
 			die("could not read blob %s", oid_to_hex(oid));
 		if (check_object_signature(the_repository, oid, buf, size,
-					   type_name(type), NULL) < 0)
+					   type_name(type)) < 0)
 			die("oid mismatch in blob %s", oid_to_hex(oid));
 		object = parse_object_buffer(the_repository, oid, type,
 					     size, buf, &eaten);
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 416f60a98c1..e1927205a7b 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1413,7 +1413,7 @@ static void fix_unresolved_deltas(struct hashfile *f)
 			continue;
 
 		if (check_object_signature(the_repository, &d->oid, data, size,
-					   type_name(type), NULL) < 0)
+					   type_name(type)) < 0)
 			die(_("local object %s is corrupt"), oid_to_hex(&d->oid));
 
 		/*
diff --git a/builtin/mktag.c b/builtin/mktag.c
index 98d1e66f327..4d28eceeba6 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -61,9 +61,8 @@ static int verify_object_in_tag(struct object_id *tagged_oid, int *tagged_type)
 		    type_name(*tagged_type), type_name(type));
 
 	repl = lookup_replace_object(the_repository, tagged_oid);
-	ret = check_object_signature(the_repository, repl,
-				     buffer, size, type_name(*tagged_type),
-				     NULL);
+	ret = check_object_signature(the_repository, repl, buffer, size,
+				     type_name(*tagged_type));
 	free(buffer);
 
 	return ret;
diff --git a/cache.h b/cache.h
index 5d081952121..c068f99c793 100644
--- a/cache.h
+++ b/cache.h
@@ -1322,15 +1322,19 @@ int parse_loose_header(const char *hdr, struct object_info *oi);
 /**
  * With in-core object data in "buf", rehash it to make sure the
  * object name actually matches "oid" to detect object corruption.
- * With "buf" == NULL, try reading the object named with "oid" using
- * the streaming interface and rehash it to do the same.
  *
  * A negative value indicates an error, usually that the OID is not
  * what we expected, but it might also indicate another error.
  */
 int check_object_signature(struct repository *r, const struct object_id *oid,
-			   void *buf, unsigned long size, const char *type,
-			   struct object_id *real_oidp);
+			   void *buf, unsigned long size, const char *type);
+
+/**
+ * A streaming version of check_object_signature().
+ * Try reading the object named with "oid" using
+ * the streaming interface and rehash it to do the same.
+ */
+int stream_object_signature(struct repository *r, const struct object_id *oid);
 
 int finalize_object_file(const char *tmpfile, const char *filename);
 
diff --git a/object-file.c b/object-file.c
index d628f58c0d2..f9854922741 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1067,22 +1067,25 @@ int format_object_header(char *str, size_t size, enum object_type type,
 }
 
 int check_object_signature(struct repository *r, const struct object_id *oid,
-			   void *buf, unsigned long size, const char *type,
-			   struct object_id *real_oidp)
+			   void *buf, unsigned long size, const char *type)
 {
-	struct object_id tmp;
-	struct object_id *real_oid = real_oidp ? real_oidp : &tmp;
+	struct object_id real_oid;
+
+	hash_object_file(r->hash_algo, buf, size, type, &real_oid);
+
+	return !oideq(oid, &real_oid) ? -1 : 0;
+}
+
+int stream_object_signature(struct repository *r, const struct object_id *oid)
+{
+	struct object_id real_oid;
+	unsigned long size;
 	enum object_type obj_type;
 	struct git_istream *st;
 	git_hash_ctx c;
 	char hdr[MAX_HEADER_LEN];
 	int hdrlen;
 
-	if (buf) {
-		hash_object_file(r->hash_algo, buf, size, type, real_oid);
-		return !oideq(oid, real_oid) ? -1 : 0;
-	}
-
 	st = open_istream(r, oid, &obj_type, &size, NULL);
 	if (!st)
 		return -1;
@@ -1105,9 +1108,9 @@ int check_object_signature(struct repository *r, const struct object_id *oid,
 			break;
 		r->hash_algo->update_fn(&c, buf, readlen);
 	}
-	r->hash_algo->final_oid_fn(real_oid, &c);
+	r->hash_algo->final_oid_fn(&real_oid, &c);
 	close_istream(st);
-	return !oideq(oid, real_oid) ? -1 : 0;
+	return !oideq(oid, &real_oid) ? -1 : 0;
 }
 
 int git_open_cloexec(const char *name, int flags)
@@ -2611,9 +2614,10 @@ int read_loose_object(const char *path,
 			git_inflate_end(&stream);
 			goto out;
 		}
-		if (check_object_signature(the_repository, expected_oid,
-					   *contents, *size,
-					   oi->type_name->buf, real_oid) < 0)
+		hash_object_file(the_repository->hash_algo,
+				 *contents, *size, oi->type_name->buf,
+				 real_oid);
+		if (!oideq(expected_oid, real_oid))
 			goto out;
 	}
 
diff --git a/object.c b/object.c
index c37501fc120..d07678ef6da 100644
--- a/object.c
+++ b/object.c
@@ -279,7 +279,7 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
 	if ((obj && obj->type == OBJ_BLOB && repo_has_object_file(r, oid)) ||
 	    (!obj && repo_has_object_file(r, oid) &&
 	     oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
-		if (check_object_signature(r, repl, NULL, 0, NULL, NULL) < 0) {
+		if (stream_object_signature(r, repl) < 0) {
 			error(_("hash mismatch %s"), oid_to_hex(oid));
 			return NULL;
 		}
@@ -290,7 +290,7 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
 	buffer = repo_read_object_file(r, oid, &type, &size);
 	if (buffer) {
 		if (check_object_signature(r, repl, buffer, size,
-					   type_name(type), NULL) < 0) {
+					   type_name(type)) < 0) {
 			free(buffer);
 			error(_("hash mismatch %s"), oid_to_hex(repl));
 			return NULL;
diff --git a/pack-check.c b/pack-check.c
index 48d818ee7b4..d8d8f91de6b 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -127,7 +127,7 @@ static int verify_packfile(struct repository *r,
 
 		if (type == OBJ_BLOB && big_file_threshold <= size) {
 			/*
-			 * Let check_object_signature() check it with
+			 * Let stream_object_signature() check it with
 			 * the streaming interface; no point slurping
 			 * the data in-core only to discard.
 			 */
@@ -142,8 +142,11 @@ static int verify_packfile(struct repository *r,
 			err = error("cannot unpack %s from %s at offset %"PRIuMAX"",
 				    oid_to_hex(&oid), p->pack_name,
 				    (uintmax_t)entries[i].offset);
-		else if (check_object_signature(r, &oid, data, size,
-						type_name(type), NULL) < 0)
+		else if (data && check_object_signature(r, &oid, data, size,
+							type_name(type)) < 0)
+			err = error("packed %s from %s is corrupt",
+				    oid_to_hex(&oid), p->pack_name);
+		else if (!data && stream_object_signature(r, &oid) < 0)
 			err = error("packed %s from %s is corrupt",
 				    oid_to_hex(&oid), p->pack_name);
 		else if (fn) {
-- 
2.35.1.940.ge7a5b4b05f2


  parent reply	other threads:[~2022-02-04 23:49 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-01 14:53 [PATCH 00/10] object-file API: pass object enums, tidy up streaming interface Ævar Arnfjörð Bjarmason
2022-02-01 14:53 ` [PATCH 01/10] object-file.c: split up declaration of unrelated variables Ævar Arnfjörð Bjarmason
2022-02-01 14:53 ` [PATCH 02/10] object-file API: return "void", not "int" from hash_object_file() Ævar Arnfjörð Bjarmason
2022-02-01 14:53 ` [PATCH 03/10] object-file API: add a format_object_header() function Ævar Arnfjörð Bjarmason
2022-02-01 14:53 ` [PATCH 04/10] object-file API: have write_object_file() take "enum object_type" Ævar Arnfjörð Bjarmason
2022-02-01 18:58   ` Junio C Hamano
2022-02-01 14:53 ` [PATCH 05/10] object-file API: provide a hash_object_file_oideq() Ævar Arnfjörð Bjarmason
2022-02-01 19:08   ` Junio C Hamano
2022-02-01 20:56     ` Ævar Arnfjörð Bjarmason
2022-02-01 14:53 ` [PATCH 06/10] object-file API: replace some use of check_object_signature() Ævar Arnfjörð Bjarmason
2022-02-01 19:16   ` Junio C Hamano
2022-02-01 14:53 ` [PATCH 07/10] object-file API: have hash_object_file() take "enum object_type" Ævar Arnfjörð Bjarmason
2022-02-01 14:53 ` [PATCH 08/10] object-file API: replace check_object_signature() with stream_* Ævar Arnfjörð Bjarmason
2022-02-01 14:53 ` [PATCH 09/10] object-file.c: add a literal version of write_object_file_prepare() Ævar Arnfjörð Bjarmason
2022-02-01 14:53 ` [PATCH 10/10] object-file API: pass an enum to read_object_with_reference() Ævar Arnfjörð Bjarmason
2022-02-04 13:51 ` [PATCH v2 00/11] object-file API: pass object enums, tidy up streaming interface Ævar Arnfjörð Bjarmason
2022-02-04 13:51   ` [PATCH v2 01/11] object-file.c: split up declaration of unrelated variables Ævar Arnfjörð Bjarmason
2022-02-04 13:51   ` [PATCH v2 02/11] object-file API: return "void", not "int" from hash_object_file() Ævar Arnfjörð Bjarmason
2022-02-04 13:51   ` [PATCH v2 03/11] object-file API: add a format_object_header() function Ævar Arnfjörð Bjarmason
2022-02-04 13:51   ` [PATCH v2 04/11] object-file API: have write_object_file() take "enum object_type" Ævar Arnfjörð Bjarmason
2022-02-04 20:52     ` Junio C Hamano
2022-02-04 13:51   ` [PATCH v2 05/11] object API: correct "buf" v.s. "map" mismatch in *.c and *.h Ævar Arnfjörð Bjarmason
2022-02-04 20:54     ` Junio C Hamano
2022-02-04 13:51   ` [PATCH v2 06/11] object API: make check_object_signature() oideq()-like, move docs Ævar Arnfjörð Bjarmason
2022-02-04 21:15     ` Junio C Hamano
2022-02-04 13:51   ` [PATCH v2 07/11] object-file API: split up and simplify check_object_signature() Ævar Arnfjörð Bjarmason
2022-02-04 13:51   ` [PATCH v2 08/11] object API: rename hash_object_file_literally() to write_*() Ævar Arnfjörð Bjarmason
2022-02-04 13:51   ` [PATCH v2 09/11] object-file API: have hash_object_file() take "enum object_type" Ævar Arnfjörð Bjarmason
2022-02-04 13:51   ` [PATCH v2 10/11] object-file.c: add a literal version of write_object_file_prepare() Ævar Arnfjörð Bjarmason
2022-02-04 13:51   ` [PATCH v2 11/11] object-file API: pass an enum to read_object_with_reference() Ævar Arnfjörð Bjarmason
2022-02-04 23:48   ` [PATCH v3 00/12] object-file API: pass object enums, tidy up streaming interface Ævar Arnfjörð Bjarmason
2022-02-04 23:48     ` [PATCH v3 01/12] object-file.c: split up declaration of unrelated variables Ævar Arnfjörð Bjarmason
2022-02-04 23:48     ` [PATCH v3 02/12] object-file API: return "void", not "int" from hash_object_file() Ævar Arnfjörð Bjarmason
2022-02-04 23:48     ` [PATCH v3 03/12] object-file API: add a format_object_header() function Ævar Arnfjörð Bjarmason
2022-02-17  4:59       ` Jiang Xin
2022-02-17  9:21         ` Ævar Arnfjörð Bjarmason
2022-03-01  3:09           ` Jiang Xin
2022-02-04 23:48     ` [PATCH v3 04/12] object-file API: have write_object_file() take "enum object_type" Ævar Arnfjörð Bjarmason
2022-02-04 23:48     ` [PATCH v3 05/12] object API: correct "buf" v.s. "map" mismatch in *.c and *.h Ævar Arnfjörð Bjarmason
2022-02-04 23:48     ` [PATCH v3 06/12] object API docs: move check_object_signature() docs to cache.h Ævar Arnfjörð Bjarmason
2022-02-04 23:48     ` [PATCH v3 07/12] object API users + docs: check <0, not !0 with check_object_signature() Ævar Arnfjörð Bjarmason
2022-02-04 23:48     ` Ævar Arnfjörð Bjarmason [this message]
2022-02-04 23:48     ` [PATCH v3 09/12] object API: rename hash_object_file_literally() to write_*() Ævar Arnfjörð Bjarmason
2022-02-04 23:48     ` [PATCH v3 10/12] object-file API: have hash_object_file() take "enum object_type" Ævar Arnfjörð Bjarmason
2022-02-04 23:48     ` [PATCH v3 11/12] object-file.c: add a literal version of write_object_file_prepare() Ævar Arnfjörð Bjarmason
2022-02-04 23:48     ` [PATCH v3 12/12] object-file API: pass an enum to read_object_with_reference() Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=patch-v3-08.12-478d2915731-20220204T234435Z-avarab@gmail.com \
    --to=avarab@gmail.com \
    --cc=chiyutianyi@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=l.s.r@web.de \
    --cc=stolee@gmail.com \
    --cc=worldhello.net@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).