git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Taylor Blau <me@ttaylorr.com>, Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH 1/2] blob.c: remove buffer & size arguments to parse_blob_buffer()
Date: Sat, 10 Apr 2021 14:57:12 +0200	[thread overview]
Message-ID: <87y2dqfgbb.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <YHCUFwZMitOXELpq@coredump.intra.peff.net>


On Fri, Apr 09 2021, Jeff King wrote:

> On Fri, Apr 09, 2021 at 10:07:27AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> As noted in the comment introduced in 837d395a5c0 (Replace
>> parse_blob() with an explanatory comment, 2010-01-18) the old
>> parse_blob() function and the current parse_blob_buffer() exist merely
>> to provide consistency in the API.
>> 
>> We're not going to parse blobs like we "parse" commits, trees or
>> tags. So let's not have the parse_blob_buffer() take arguments that
>> pretends that we do. Its only use is to set the "parsed" flag.
>> 
>> See bd2c39f58f9 ([PATCH] don't load and decompress objects twice with
>> parse_object(), 2005-05-06) for the introduction of parse_blob_buffer().
>
> OK. Calling it parse_blob_buffer() is a little silly since it doesn't
> even take a buffer anymore. But I guess parse_blob() might imply that it
> actually loads the contents from disk to check them (which the other
> parse_foo() functions do), so that's not a good name.
>
> So this might be the least bad thing. Given that there are only two
> callers, just setting blob->object.parsed might not be unreasonable,
> either. But I don't think it's worth spending too much time on.
>
>> @@ -266,7 +266,7 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
>>  			error(_("hash mismatch %s"), oid_to_hex(oid));
>>  			return NULL;
>>  		}
>> -		parse_blob_buffer(lookup_blob(r, oid), NULL, 0);
>> +		parse_blob_buffer(lookup_blob(r, oid));
>>  		return lookup_object(r, oid);
>
> Not new in your patch, but I wondered if this could cause a segfault
> when lookup_blob() returns NULL. I _think_ the answer is "no". We'd hit
> this code path when either:
>
>   - lookup_object() returns an object with type OBJ_BLOB, in which case
>     lookup_blob() would return that same object
>
>   - lookup_object() returned NULL, in which case lookup_blob() will call
>     it again, get NULL again, and then auto-create the blob and return
>     it
>
> So I think it is OK. But there are a bunch of duplicate hash lookups in
> this code. It would be clearer and more efficient as:
>
> diff --git a/object.c b/object.c
> index 2c32691dc4..2dfa038f13 100644
> --- a/object.c
> +++ b/object.c
> @@ -262,12 +262,14 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
>  	if ((obj && obj->type == OBJ_BLOB && repo_has_object_file(r, oid)) ||
>  	    (!obj && repo_has_object_file(r, oid) &&
>  	     oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
> +		if (!obj)
> +			obj = create_object(r, oid, alloc_blob_node(r));
>  		if (check_object_signature(r, repl, NULL, 0, NULL) < 0) {
>  			error(_("hash mismatch %s"), oid_to_hex(oid));
>  			return NULL;
>  		}
> -		parse_blob_buffer(lookup_blob(r, oid), NULL, 0);
> -		return lookup_object(r, oid);
> +		parse_blob_buffer(obj, NULL, 0);
> +		return obj;
>  	}
>  
>  	buffer = repo_read_object_file(r, oid, &type, &size);
>
> but I doubt the efficiency matters much in practice. Those hash lookups
> will be lost in the noise of computing the hash of the blob contents.

I was trying to keep the changes smaller, but what about just doing this?:

diff --git a/blob.c b/blob.c
index 182718aba9..69293e7d8e 100644
--- a/blob.c
+++ b/blob.c
@@ -5,16 +5,16 @@
 
 const char *blob_type = "blob";
 
+struct blob *create_blob(struct repository *r, const struct object_id *oid)
+{
+	return create_object(r, oid, alloc_blob_node(r));
+}
+
 struct blob *lookup_blob(struct repository *r, const struct object_id *oid)
 {
 	struct object *obj = lookup_object(r, oid);
 	if (!obj)
-		return create_object(r, oid, alloc_blob_node(r));
-	return object_as_type(obj, OBJ_BLOB, 0);
-}
+		return create_blob(r, oid);
 
-int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size)
-{
-	item->object.parsed = 1;
-	return 0;
+	return object_as_type(obj, OBJ_BLOB, 0);
 }
diff --git a/blob.h b/blob.h
index 1664872055..ad34f0e9cc 100644
--- a/blob.h
+++ b/blob.h
@@ -9,10 +9,9 @@ struct blob {
 	struct object object;
 };
 
+struct blob *create_blob(struct repository *r, const struct object_id *oid);
 struct blob *lookup_blob(struct repository *r, const struct object_id *oid);
 
-int parse_blob_buffer(struct blob *item, void *buffer, unsigned long size);
-
 /**
  * Blobs do not contain references to other objects and do not have
  * structured data that needs parsing. However, code may use the
diff --git a/object.c b/object.c
index 78343781ae..2699431404 100644
--- a/object.c
+++ b/object.c
@@ -195,8 +195,7 @@ struct object *parse_object_buffer(struct repository *r, const struct object_id
 	if (type == OBJ_BLOB) {
 		struct blob *blob = lookup_blob(r, oid);
 		if (blob) {
-			if (parse_blob_buffer(blob, buffer, size))
-				return NULL;
+			blob->object.parsed = 1;
 			obj = &blob->object;
 		}
 	} else if (type == OBJ_TREE) {
@@ -262,12 +261,16 @@ struct object *parse_object(struct repository *r, const struct object_id *oid)
 	if ((obj && obj->type == OBJ_BLOB && repo_has_object_file(r, oid)) ||
 	    (!obj && repo_has_object_file(r, oid) &&
 	     oid_object_info(r, oid, NULL) == OBJ_BLOB)) {
+		struct blob *blob;
+		if (!obj)
+			blob = create_blob(r, oid);
 		if (check_object_signature(r, repl, NULL, 0, NULL) < 0) {
 			error(_("hash mismatch %s"), oid_to_hex(oid));
 			return NULL;
 		}
-		parse_blob_buffer(lookup_blob(r, oid), NULL, 0);
-		return lookup_object(r, oid);
+		obj = &blob->object;
+		obj->parsed = 1;
+		return obj;
 	}
 
 	buffer = repo_read_object_file(r, oid, &type, &size);

  parent reply	other threads:[~2021-04-10 12:57 UTC|newest]

Thread overview: 142+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-06-22  0:33 [PATCH 0/2] Pull objects of various types Daniel Barkalow
2005-06-22  0:35 ` [PATCH 1/2] Parse tags for absent objects Daniel Barkalow
2021-03-08 20:04   ` [PATCH 0/7] improve reporting of unexpected objects Ævar Arnfjörð Bjarmason
2021-03-28  2:13     ` [PATCH v2 00/10] " Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 01/10] object.c: stop supporting len == -1 in type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-03-28  5:35         ` Junio C Hamano
2021-03-28 15:46           ` Ævar Arnfjörð Bjarmason
2021-03-28 18:25             ` Junio C Hamano
2021-04-22 18:09               ` Felipe Contreras
2021-03-28  2:13       ` [PATCH v2 02/10] object.c: refactor type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 03/10] object.c: make type_from_string() return "enum object_type" Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 04/10] object-file.c: make oid_object_info() " Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 05/10] object-name.c: make dependency on object_type order more obvious Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 06/10] tree.c: fix misindentation in parse_tree_gently() Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 07/10] object.c: add a utility function for "expected type X, got Y" Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 08/10] object.c: add and use oid_is_type_or_die_msg() function Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 09/10] object tests: add test for unexpected objects in tags Ævar Arnfjörð Bjarmason
2021-03-28  2:13       ` [PATCH v2 10/10] tag: don't misreport type of tagged objects in errors Ævar Arnfjörð Bjarmason
2021-03-30  5:50         ` Junio C Hamano
2021-03-31 11:02           ` Jeff King
2021-03-31 18:05             ` Junio C Hamano
2021-03-31 18:31             ` Ævar Arnfjörð Bjarmason
2021-03-31 18:59               ` Jeff King
2021-03-31 20:46                 ` Ævar Arnfjörð Bjarmason
2021-04-01  7:54                   ` Jeff King
2021-04-01  8:32                     ` [PATCH] ref-filter: fix NULL check for parse object failure Jeff King
2021-04-01 13:56                       ` [PATCH v2 0/5] mktag tests & fix for-each-ref segfault Ævar Arnfjörð Bjarmason
2021-04-01 13:56                         ` [PATCH v2 1/5] mktag tests: parse out options in helper Ævar Arnfjörð Bjarmason
2021-04-01 13:56                         ` [PATCH v2 2/5] mktag tests: invert --no-strict test Ævar Arnfjörð Bjarmason
2021-04-01 13:56                         ` [PATCH v2 3/5] mktag tests: do fsck on failure Ævar Arnfjörð Bjarmason
2021-04-01 13:56                         ` [PATCH v2 4/5] mktag tests: test for maybe segfaulting for-each-ref Ævar Arnfjörð Bjarmason
2021-04-01 13:56                         ` [PATCH v2 5/5] ref-filter: fix NULL check for parse object failure Ævar Arnfjörð Bjarmason
2021-04-01 19:19                           ` Ramsay Jones
2021-04-01 19:56                         ` [PATCH v2 0/5] mktag tests & fix for-each-ref segfault Junio C Hamano
2021-04-02 11:37                           ` Ævar Arnfjörð Bjarmason
2021-04-02 20:51                             ` Junio C Hamano
2021-04-01 19:52                       ` [PATCH] ref-filter: fix NULL check for parse object failure Junio C Hamano
2021-03-31 18:41             ` [PATCH v2 10/10] tag: don't misreport type of tagged objects in errors Junio C Hamano
2021-03-31 19:00               ` Jeff King
2021-03-28  9:27       ` [PATCH v2 00/10] improve reporting of unexpected objects Jeff King
2021-03-29 13:34         ` Ævar Arnfjörð Bjarmason
2021-03-31 10:43           ` Jeff King
2021-04-09  8:07       ` [PATCH 0/2] blob/object.c: trivial readability improvements Ævar Arnfjörð Bjarmason
2021-04-09  8:07         ` [PATCH 1/2] blob.c: remove buffer & size arguments to parse_blob_buffer() Ævar Arnfjörð Bjarmason
2021-04-09 17:51           ` Jeff King
2021-04-09 22:31             ` Junio C Hamano
2021-04-10 12:57             ` Ævar Arnfjörð Bjarmason [this message]
2021-04-10 13:01               ` Ævar Arnfjörð Bjarmason
2021-04-13  8:25               ` Jeff King
2021-04-09  8:07         ` [PATCH 2/2] object.c: initialize automatic variable in lookup_object() Ævar Arnfjörð Bjarmason
2021-04-09 17:53           ` Jeff King
2021-04-09 22:32             ` Junio C Hamano
2021-04-09  8:32         ` [PATCH 0/6] {tag,object}*.c: refactorings + prep for a larger change Ævar Arnfjörð Bjarmason
2021-04-09  8:32           ` [PATCH 1/6] object.c: stop supporting len == -1 in type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-04-09 18:06             ` Jeff King
2021-04-09 18:10               ` Jeff King
2021-04-09  8:32           ` [PATCH 2/6] object.c: remove "gently" argument to type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-04-09 18:10             ` Jeff King
2021-04-09  8:32           ` [PATCH 3/6] object.c: make type_from_string() return "enum object_type" Ævar Arnfjörð Bjarmason
2021-04-09 18:14             ` Jeff King
2021-04-09 19:42               ` Ævar Arnfjörð Bjarmason
2021-04-09 21:29                 ` Jeff King
2021-04-09  8:32           ` [PATCH 4/6] object-file.c: make oid_object_info() " Ævar Arnfjörð Bjarmason
2021-04-09 18:24             ` Jeff King
2021-04-09  8:32           ` [PATCH 5/6] object-name.c: make dependency on object_type order more obvious Ævar Arnfjörð Bjarmason
2021-04-09 18:36             ` Jeff King
2021-04-09  8:32           ` [PATCH 6/6] tag.c: use type_from_string_gently() when parsing tags Ævar Arnfjörð Bjarmason
2021-04-09 18:42             ` Jeff King
2021-04-09  8:49           ` [PATCH 0/7] object.c: add and use "is expected" utility function + object_as_type() use Ævar Arnfjörð Bjarmason
2021-04-09  8:49             ` [PATCH 1/7] tree.c: fix misindentation in parse_tree_gently() Ævar Arnfjörð Bjarmason
2021-04-09  8:49             ` [PATCH 2/7] object.c: add a utility function for "expected type X, got Y" Ævar Arnfjörð Bjarmason
2021-04-09  8:49             ` [PATCH 3/7] object.c: add and use oid_is_type_or_die_msg() function Ævar Arnfjörð Bjarmason
2021-04-09  8:49             ` [PATCH 4/7] commit-graph: use obj->type, not object_as_type() Ævar Arnfjörð Bjarmason
2021-04-09  8:50             ` [PATCH 5/7] commit.c: don't use deref_tag() -> object_as_type() Ævar Arnfjörð Bjarmason
2021-04-09  8:50             ` [PATCH 6/7] object.c: normalize brace style in object_as_type() Ævar Arnfjörð Bjarmason
2021-04-09  8:50             ` [PATCH 7/7] object.c: remove "quiet" parameter from object_as_type() Ævar Arnfjörð Bjarmason
2021-04-20 13:36             ` [PATCH v2 0/8] object.c: add and use "is expected" utility function + object_as_type() use Ævar Arnfjörð Bjarmason
2021-04-20 13:36               ` [PATCH v2 1/8] tree.c: fix misindentation in parse_tree_gently() Ævar Arnfjörð Bjarmason
2021-04-20 13:36               ` [PATCH v2 2/8] object.c: add a utility function for "expected type X, got Y" Ævar Arnfjörð Bjarmason
2021-04-21 22:02                 ` Jonathan Tan
2021-04-22  6:10                   ` Ævar Arnfjörð Bjarmason
2021-04-20 13:36               ` [PATCH v2 3/8] object.c: add and use oid_is_type_or_die_msg() function Ævar Arnfjörð Bjarmason
2021-04-21 22:07                 ` Jonathan Tan
2021-04-21 23:28                 ` Josh Steadmon
2021-04-28  4:12                   ` Junio C Hamano
2021-04-20 13:36               ` [PATCH v2 4/8] commit-graph: use obj->type, not object_as_type() Ævar Arnfjörð Bjarmason
2021-04-20 13:36               ` [PATCH v2 5/8] branch tests: assert lookup_commit_reference_gently() error Ævar Arnfjörð Bjarmason
2021-04-20 13:36               ` [PATCH v2 6/8] commit.c: don't use deref_tag() -> object_as_type() Ævar Arnfjörð Bjarmason
2021-04-21 22:26                 ` Jonathan Tan
2021-04-20 13:36               ` [PATCH v2 7/8] object.c: normalize brace style in object_as_type() Ævar Arnfjörð Bjarmason
2021-04-20 13:37               ` [PATCH v2 8/8] object.c: remove "quiet" parameter from object_as_type() Ævar Arnfjörð Bjarmason
2021-04-20 13:00           ` [PATCH v2 00/10] {tag,object}*.c: refactorings + prep for a larger change Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 01/10] object.c: stop supporting len == -1 in type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 02/10] object.c: remove "gently" argument to type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 03/10] object.c: make type_from_string() return "enum object_type" Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 04/10] object-file.c: make oid_object_info() " Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 05/10] object-name.c: make dependency on object_type order more obvious Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 06/10] tag.c: use type_from_string_gently() when parsing tags Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 07/10] hash-object: pass along type length to object.c Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 08/10] hash-object: refactor nested else/if/if into else if/else if Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 09/10] hash-object: show usage on invalid --type Ævar Arnfjörð Bjarmason
2021-04-20 13:00             ` [PATCH v2 10/10] object.c: move type_from_string() code to its last user Ævar Arnfjörð Bjarmason
2021-04-20 12:50         ` [PATCH v2 00/10] object.c et al: tests, small bug fixes etc Ævar Arnfjörð Bjarmason
2021-04-20 12:50           ` [PATCH v2 01/10] cat-file tests: test for bogus type name handling Ævar Arnfjörð Bjarmason
2021-04-29  4:15             ` Junio C Hamano
2021-04-20 12:50           ` [PATCH v2 02/10] hash-object tests: more detailed test for invalid type Ævar Arnfjörð Bjarmason
2021-04-20 12:50           ` [PATCH v2 03/10] mktree tests: add test for invalid object type Ævar Arnfjörð Bjarmason
2021-04-20 12:50           ` [PATCH v2 04/10] object-file.c: take type id, not string, in read_object_with_reference() Ævar Arnfjörð Bjarmason
2021-04-29  4:37             ` Junio C Hamano
2021-04-20 12:50           ` [PATCH v2 05/10] {commit,tree,blob,tag}.c: add a create_{commit,tree,blob,tag}() Ævar Arnfjörð Bjarmason
2021-04-29  4:45             ` Junio C Hamano
2021-04-29 12:01               ` Ævar Arnfjörð Bjarmason
2021-04-20 12:50           ` [PATCH v2 06/10] blob.c: remove parse_blob_buffer() Ævar Arnfjörð Bjarmason
2021-04-29  4:51             ` Junio C Hamano
2021-04-20 12:50           ` [PATCH v2 07/10] object.c: simplify return semantic of parse_object_buffer() Ævar Arnfjörð Bjarmason
2021-04-20 12:50           ` [PATCH v2 08/10] object.c: don't go past "len" under die() in type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-04-29  4:55             ` Junio C Hamano
2021-04-20 12:50           ` [PATCH v2 09/10] mktree: stop setting *ntr++ to NIL Ævar Arnfjörð Bjarmason
2021-04-29  5:01             ` Junio C Hamano
2021-04-20 12:50           ` [PATCH v2 10/10] mktree: emit a more detailed error when the <type> is invalid Ævar Arnfjörð Bjarmason
2021-03-08 20:04   ` [PATCH 1/7] object.c: refactor type_from_string_gently() Ævar Arnfjörð Bjarmason
2021-03-08 20:52     ` Taylor Blau
2021-03-09 10:46     ` Jeff King
2021-03-08 20:04   ` [PATCH 2/7] object.c: make type_from_string() return "enum object_type" Ævar Arnfjörð Bjarmason
2021-03-08 20:56     ` Taylor Blau
2021-03-08 21:48     ` Junio C Hamano
2021-03-08 20:04   ` [PATCH 3/7] oid_object_info(): " Ævar Arnfjörð Bjarmason
2021-03-08 21:54     ` Junio C Hamano
2021-03-08 22:32       ` Junio C Hamano
2021-03-09 10:34     ` Jeff King
2021-03-08 20:04   ` [PATCH 4/7] tree.c: fix misindentation in parse_tree_gently() Ævar Arnfjörð Bjarmason
2021-03-08 20:04   ` [PATCH 5/7] object.c: add a utility function for "expected type X, got Y" Ævar Arnfjörð Bjarmason
2021-03-08 20:59     ` Taylor Blau
2021-03-08 22:15     ` Junio C Hamano
2021-03-08 20:04   ` [PATCH 6/7] object tests: add test for unexpected objects in tags Ævar Arnfjörð Bjarmason
2021-03-09 10:44     ` Jeff King
2021-03-28  1:35       ` Ævar Arnfjörð Bjarmason
2021-03-28  9:06         ` Jeff King
2021-03-28 15:39           ` Ævar Arnfjörð Bjarmason
2021-03-29  9:16             ` Jeff King
2021-03-08 20:04   ` [PATCH 7/7] tag: don't misreport type of tagged objects in errors Ævar Arnfjörð Bjarmason
2005-06-22  0:35 ` [PATCH 2/2] Pull misc objects Daniel Barkalow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87y2dqfgbb.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).