git@vger.kernel.org mailing list mirror (one of many)
* [PATCH] pack-write: use hashwrite_be32() instead of double-buffering array
@ 2020-11-01  8:52 René Scharfe
  2020-11-04 13:36 ` Jeff King
  0 siblings, 1 reply; 3+ messages in thread
From: René Scharfe @ 2020-11-01  8:52 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano

hashwrite() already buffers writes, so pass the fanout table entries
individually via hashwrite_be32(), which also does the endianness
conversion for us.  This avoids a memory copy, shortens the code and
reduces the number of magic numbers.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
Patch generated with -U8 for easier review.

 pack-write.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/pack-write.c b/pack-write.c
index a6cdb3c67c..23e19cc1ec 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -43,17 +43,16 @@ static int need_large_offset(off_t offset, const struct pack_idx_option *opts)
  */
 const char *write_idx_file(const char *index_name, struct pack_idx_entry **objects,
 			   int nr_objects, const struct pack_idx_option *opts,
 			   const unsigned char *sha1)
 {
 	struct hashfile *f;
 	struct pack_idx_entry **sorted_by_sha, **list, **last;
 	off_t last_obj_offset = 0;
-	uint32_t array[256];
 	int i, fd;
 	uint32_t index_version;

 	if (nr_objects) {
 		sorted_by_sha = objects;
 		list = sorted_by_sha;
 		last = sorted_by_sha + nr_objects;
 		for (i = 0; i < nr_objects; ++i) {
@@ -101,20 +100,19 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
 	for (i = 0; i < 256; i++) {
 		struct pack_idx_entry **next = list;
 		while (next < last) {
 			struct pack_idx_entry *obj = *next;
 			if (obj->oid.hash[0] != i)
 				break;
 			next++;
 		}
-		array[i] = htonl(next - sorted_by_sha);
+		hashwrite_be32(f, next - sorted_by_sha);
 		list = next;
 	}
-	hashwrite(f, array, 256 * 4);

 	/*
 	 * Write the actual SHA1 entries..
 	 */
 	list = sorted_by_sha;
 	for (i = 0; i < nr_objects; i++) {
 		struct pack_idx_entry *obj = *list++;
 		if (index_version < 2)
--
2.29.2
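
The equivalence the patch relies on can be checked in isolation. The following is only a sketch under stated assumptions, not git's code: `struct sink`, `sink_write()` and `sink_write_be32()` are hypothetical stand-ins for the hashfile API, and `first[]` is assumed to hold the first byte of each object ID in sorted order. It compares the old shape (temporary array plus one bulk write) with the new shape (one big-endian write per fanout entry).

```c
#include <arpa/inet.h>	/* htonl(); assumes a POSIX system */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hashfile: a plain append-only buffer. */
struct sink {
	unsigned char buf[256 * 4];
	size_t len;
};

static void sink_write(struct sink *s, const void *data, size_t n)
{
	assert(s->len + n <= sizeof(s->buf));
	memcpy(s->buf + s->len, data, n);
	s->len += n;
}

/* Rough analogue of hashwrite_be32(): convert to big-endian, then append. */
static void sink_write_be32(struct sink *s, uint32_t v)
{
	v = htonl(v);
	sink_write(s, &v, sizeof(v));
}

/* Old shape: fill a temporary array, then write it with a single call. */
static void fanout_via_array(struct sink *s, const unsigned char *first,
			     uint32_t nr)
{
	uint32_t array[256];
	uint32_t next = 0;
	int i;

	for (i = 0; i < 256; i++) {
		while (next < nr && first[next] == i)
			next++;
		array[i] = htonl(next);
	}
	sink_write(s, array, sizeof(array));
}

/* New shape: emit each cumulative count as soon as it is known. */
static void fanout_direct(struct sink *s, const unsigned char *first,
			  uint32_t nr)
{
	uint32_t next = 0;
	int i;

	for (i = 0; i < 256; i++) {
		while (next < nr && first[next] == i)
			next++;
		sink_write_be32(s, next);
	}
}

/* Returns 1 if both shapes produce byte-identical output. */
static int fanout_shapes_agree(const unsigned char *first, uint32_t nr)
{
	struct sink a = { {0}, 0 }, b = { {0}, 0 };

	fanout_via_array(&a, first, nr);
	fanout_direct(&b, first, nr);
	return a.len == b.len && !memcmp(a.buf, b.buf, a.len);
}
```

Calling `fanout_shapes_agree()` on any sorted list of first bytes should return 1, because the loop visits the slots in index order in both shapes.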


* Re: [PATCH] pack-write: use hashwrite_be32() instead of double-buffering array
  2020-11-01  8:52 [PATCH] pack-write: use hashwrite_be32() instead of double-buffering array René Scharfe
@ 2020-11-04 13:36 ` Jeff King
  2020-11-04 16:23   ` Junio C Hamano
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff King @ 2020-11-04 13:36 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git Mailing List, Junio C Hamano

On Sun, Nov 01, 2020 at 09:52:12AM +0100, René Scharfe wrote:

> hashwrite() already buffers writes, so pass the fanout table entries
> individually via hashwrite_be32(), which also does the endianness
> conversion for us.  This avoids a memory copy, shortens the code and
> reduces the number of magic numbers.

Yep, this seems trivially correct. The key observation is that we are
filling the array in order:

> @@ -101,20 +100,19 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
>  	for (i = 0; i < 256; i++) {
>  		struct pack_idx_entry **next = list;
>  		while (next < last) {
>  			struct pack_idx_entry *obj = *next;
>  			if (obj->oid.hash[0] != i)
>  				break;
>  			next++;
>  		}
> -		array[i] = htonl(next - sorted_by_sha);
> +		hashwrite_be32(f, next - sorted_by_sha);
>  		list = next;
>  	}
> -	hashwrite(f, array, 256 * 4);

Perhaps obvious, but I got bit trying to do another similar conversion
recently that was filling in the array out-of-order (not on the list,
but in some improvements in the bitmap code that haven't been sent in
yet).

-Peff
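
The pitfall described above can be reproduced with a toy example. This is a sketch under assumptions, not the bitmap code mentioned in the message: `struct sink`, `write_via_array()` and `write_streaming()` are hypothetical names, and `order[i]` is assumed to give the slot that the i-th computed value belongs to. Streaming each value as it is computed only matches the array version when the slots are filled in index order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define N 4

/* Hypothetical append-only sink standing in for a hashfile. */
struct sink {
	unsigned char buf[N * 4];
	size_t len;
};

/* Append one value in big-endian byte order. */
static void sink_write_be32(struct sink *s, uint32_t v)
{
	assert(s->len + 4 <= sizeof(s->buf));
	s->buf[s->len++] = (unsigned char)(v >> 24);
	s->buf[s->len++] = (unsigned char)(v >> 16);
	s->buf[s->len++] = (unsigned char)(v >> 8);
	s->buf[s->len++] = (unsigned char)v;
}

/* Fill slots in the order given by `order`, then serialize slots 0..N-1. */
static void write_via_array(struct sink *s, const int *order,
			    const uint32_t *vals)
{
	uint32_t array[N] = { 0 };
	int i;

	for (i = 0; i < N; i++)
		array[order[i]] = vals[i];
	for (i = 0; i < N; i++)
		sink_write_be32(s, array[i]);
}

/* Naive conversion: write each value at the moment it is computed. */
static void write_streaming(struct sink *s, const int *order,
			    const uint32_t *vals)
{
	int i;

	(void)order;	/* the target slot is lost when streaming */
	for (i = 0; i < N; i++)
		sink_write_be32(s, vals[i]);
}

/* Returns 1 if both approaches produce the same bytes. */
static int same_output(const int *order, const uint32_t *vals)
{
	struct sink a = { {0}, 0 }, b = { {0}, 0 };

	write_via_array(&a, order, vals);
	write_streaming(&b, order, vals);
	return !memcmp(a.buf, b.buf, sizeof(a.buf));
}
```

With `order = {0, 1, 2, 3}` the two outputs agree; with something like `order = {2, 0, 3, 1}` they do not, which is why the in-order fill in write_idx_file() is the key observation.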


* Re: [PATCH] pack-write: use hashwrite_be32() instead of double-buffering array
  2020-11-04 13:36 ` Jeff King
@ 2020-11-04 16:23   ` Junio C Hamano
  0 siblings, 0 replies; 3+ messages in thread
From: Junio C Hamano @ 2020-11-04 16:23 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, Git Mailing List

Jeff King <peff@peff.net> writes:

> On Sun, Nov 01, 2020 at 09:52:12AM +0100, René Scharfe wrote:
>
>> hashwrite() already buffers writes, so pass the fanout table entries
>> individually via hashwrite_be32(), which also does the endianness
>> conversion for us.  This avoids a memory copy, shortens the code and
>> reduces the number of magic numbers.
>
> Yep, this seems trivially correct. The key observation is that we are
> filling the array in order:
>
>> @@ -101,20 +100,19 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec
>>  	for (i = 0; i < 256; i++) {
>>  		struct pack_idx_entry **next = list;
>>  		while (next < last) {
>>  			struct pack_idx_entry *obj = *next;
>>  			if (obj->oid.hash[0] != i)
>>  				break;
>>  			next++;
>>  		}
>> -		array[i] = htonl(next - sorted_by_sha);
>> +		hashwrite_be32(f, next - sorted_by_sha);
>>  		list = next;
>>  	}
>> -	hashwrite(f, array, 256 * 4);
>
> Perhaps obvious, but I got bit trying to do another similar conversion
> recently that was filling in the array out-of-order...

Yeah, filling an array out of order and then writing the result in
order would obviously be different from writing out individual pieces
in the order the original code flow used to fill the array.  Worse,
the order in which the data items are fed to hashwrite() obviously
affects the hash computed at the end.  An example of too much
abstraction biting us? ;-)
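
The point about the hash can be illustrated with a toy streaming hash. This is a sketch only: 32-bit FNV-1a is used here as an assumed stand-in for the real hash, and the `toy_*` and `hash_*` names are hypothetical, not part of git. A streaming hash depends only on the sequence of bytes it is fed, so splitting the same bytes into smaller writes leaves the result unchanged, while feeding the same entries in a different order changes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy order-sensitive streaming hash (32-bit FNV-1a), standing in for SHA-1. */
struct toy_hash {
	uint32_t state;
};

static void toy_init(struct toy_hash *h)
{
	h->state = 2166136261u;	/* FNV offset basis */
}

static void toy_update(struct toy_hash *h, const void *data, size_t n)
{
	const unsigned char *p = data;

	while (n--) {
		h->state ^= *p++;
		h->state *= 16777619u;	/* FNV prime */
	}
}

/* Hash the whole buffer with one update call (like one big hashwrite()). */
static uint32_t hash_in_one_call(const unsigned char *data, size_t n)
{
	struct toy_hash h;

	toy_init(&h);
	toy_update(&h, data, n);
	return h.state;
}

/* Hash the same buffer in pieces of `chunk` bytes (like per-entry writes). */
static uint32_t hash_in_chunks(const unsigned char *data, size_t n,
			       size_t chunk)
{
	struct toy_hash h;
	size_t off;

	toy_init(&h);
	for (off = 0; off < n; off += chunk) {
		size_t len = n - off < chunk ? n - off : chunk;
		toy_update(&h, data + off, len);
	}
	return h.state;
}
```

For two big-endian entries 1 and 2, `hash_in_chunks(data, 8, 4)` equals `hash_in_one_call(data, 8)`, whereas hashing the entries in swapped order gives a different value.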

