git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "SZEDER Gábor via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: me@ttaylorr.com, szeder.dev@gmail.com, jnareb@gmail.com,
	peff@peff.net, garimasigit@gmail.com,
	"Derrick Stolee" <dstolee@microsoft.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>
Subject: [PATCH 08/10] commit-graph: simplify parse_commit_graph() #2
Date: Fri, 05 Jun 2020 13:00:30 +0000	[thread overview]
Message-ID: <83641b5e49e4f2abd2185fc1cde58eadad496cd4.1591362033.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.650.git.1591362032.gitgitgadget@gmail.com>

From: =?UTF-8?q?SZEDER=20G=C3=A1bor?= <szeder.dev@gmail.com>

The Chunk Lookup table stores the chunks' starting offset in the
commit-graph file, not their sizes.  Consequently, the size of a chunk
can only be calculated by subtracting its offset from the offset of
the subsequent chunk (or that of the terminating label).  This is
currenly implemented in a bit complicated way: as we iterate over the
entries of the Chunk Lookup table, we check the id of each chunk and
store its starting offset, then we check the id of the last seen chunk
and calculate its size using its previously saved offset.  At the
moment there is only one chunk for which we calculate its size, but
this patch series will add more, and the repeated chunk id checks are
not that pretty.

Instead let's read ahead the offset of the next chunk on each
iteration, so we can calculate the size of each chunk right away,
right where we store its starting offset.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 commit-graph.c | 26 +++++++++-----------------
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 9927762f18c..84206f0f512 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -230,8 +230,7 @@ struct commit_graph *parse_commit_graph(void *graph_map, int fd,
 	const unsigned char *data, *chunk_lookup;
 	uint32_t i;
 	struct commit_graph *graph;
-	uint64_t last_chunk_offset;
-	uint32_t last_chunk_id;
+	uint64_t next_chunk_offset;
 	uint32_t graph_signature;
 	unsigned char graph_version, hash_version;
 
@@ -281,18 +280,17 @@ struct commit_graph *parse_commit_graph(void *graph_map, int fd,
 		return NULL;
 	}
 
-	last_chunk_id = 0;
-	last_chunk_offset = 8;
 	chunk_lookup = data + 8;
-	for (i = 0; i <= graph->num_chunks; i++) {
+	next_chunk_offset = get_be64(chunk_lookup + 4);
+	for (i = 0; i < graph->num_chunks; i++) {
 		uint32_t chunk_id;
-		uint64_t chunk_offset;
+		uint64_t chunk_offset = next_chunk_offset;
 		int chunk_repeated = 0;
 
 		chunk_id = get_be32(chunk_lookup + 0);
-		chunk_offset = get_be64(chunk_lookup + 4);
 
 		chunk_lookup += GRAPH_CHUNKLOOKUP_WIDTH;
+		next_chunk_offset = get_be64(chunk_lookup + 4);
 
 		if (chunk_offset > graph_size - the_hash_algo->rawsz) {
 			error(_("commit-graph improper chunk offset %08x%08x"), (uint32_t)(chunk_offset >> 32),
@@ -312,8 +310,11 @@ struct commit_graph *parse_commit_graph(void *graph_map, int fd,
 		case GRAPH_CHUNKID_OIDLOOKUP:
 			if (graph->chunk_oid_lookup)
 				chunk_repeated = 1;
-			else
+			else {
 				graph->chunk_oid_lookup = data + chunk_offset;
+				graph->num_commits = (next_chunk_offset - chunk_offset)
+						     / graph->hash_len;
+			}
 			break;
 
 		case GRAPH_CHUNKID_DATA:
@@ -368,15 +369,6 @@ struct commit_graph *parse_commit_graph(void *graph_map, int fd,
 			free(graph);
 			return NULL;
 		}
-
-		if (last_chunk_id == GRAPH_CHUNKID_OIDLOOKUP)
-		{
-			graph->num_commits = (chunk_offset - last_chunk_offset)
-					     / graph->hash_len;
-		}
-
-		last_chunk_id = chunk_id;
-		last_chunk_offset = chunk_offset;
 	}
 
 	if (graph->chunk_bloom_indexes && graph->chunk_bloom_data) {
-- 
gitgitgadget


  parent reply	other threads:[~2020-06-05 13:01 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-05 13:00 [PATCH 00/10] Szeder's commit-graph cleanups Derrick Stolee via GitGitGadget
2020-06-05 13:00 ` [PATCH 01/10] tree-walk.c: don't match submodule entries for 'submod/anything' SZEDER Gábor via GitGitGadget
2020-06-05 13:00 ` [PATCH 02/10] commit-graph: fix parsing the Chunk Lookup table SZEDER Gábor via GitGitGadget
2020-06-05 13:00 ` [PATCH 03/10] commit-graph-format.txt: all multi-byte numbers are in network byte order SZEDER Gábor via GitGitGadget
2020-06-05 13:00 ` [PATCH 04/10] commit-slab: add a function to deep free entries on the slab SZEDER Gábor via GitGitGadget
2020-06-18 20:59   ` René Scharfe
2020-06-19 12:52     ` Derrick Stolee
2020-06-27 15:53   ` SZEDER Gábor
2020-06-05 13:00 ` [PATCH 05/10] diff.h: drop diff_tree_oid() & friends' return value SZEDER Gábor via GitGitGadget
2020-06-05 13:00 ` [PATCH 06/10] commit-graph: clean up #includes SZEDER Gábor via GitGitGadget
2020-06-05 13:00 ` [PATCH 07/10] commit-graph: simplify parse_commit_graph() #1 SZEDER Gábor via GitGitGadget
2020-06-05 13:00 ` SZEDER Gábor via GitGitGadget [this message]
2020-06-05 13:00 ` [PATCH 09/10] commit-graph: simplify write_commit_graph_file() #1 SZEDER Gábor via GitGitGadget
2020-06-05 13:00 ` [PATCH 10/10] commit-graph: simplify write_commit_graph_file() #2 SZEDER Gábor via GitGitGadget
2020-06-08 17:39 ` [PATCH 00/10] Szeder's commit-graph cleanups Junio C Hamano
2020-06-18  1:48 ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=83641b5e49e4f2abd2185fc1cde58eadad496cd4.1591362033.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=garimasigit@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).