git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: szeder.dev@gmail.com, me@ttaylorr.com,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH 14/15] commit-graph: restore duplicate chunk checks
Date: Thu, 03 Dec 2020 16:16:53 +0000	[thread overview]
Message-ID: <106dd51f75699fbf4fc1e46687124995f5ef0278.1607012215.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.804.git.1607012215.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

The previous change introduced read_table_of_contents() in the
chunk-format API, but dropped the duplicate chunk check from the
commit-graph parsing logic. This was done to keep flexibility in the
chunk-format API.

One way to restore this check is to have each chunk_read_fn method check
if it has run before. This is somewhat repetitive. If we determine that
the chunk-format API would be better off with a hard requirement that
chunks are never repeated, then this could be replaced with a check in
chunk-format.c.

For now, only restore the duplicate checks that previously existed in
the commit-graph parsing logic.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 commit-graph.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/commit-graph.c b/commit-graph.c
index 0a3ba147df..c0102fceba 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -289,10 +289,20 @@ static int verify_commit_graph_lite(struct commit_graph *g)
 	return 0;
 }
 
+static int report_duplicate(void)
+{
+	warning(_("duplicate chunk detected"));
+	return 1;
+}
+
 static int graph_read_oid_fanout(const unsigned char *chunk_start,
 				 size_t chunk_size, void *data)
 {
 	struct commit_graph *g = (struct commit_graph *)data;
+
+	if (g->chunk_oid_fanout)
+		return report_duplicate();
+
 	g->chunk_oid_fanout = (uint32_t*)chunk_start;
 	return 0;
 }
@@ -301,6 +311,10 @@ static int graph_read_oid_lookup(const unsigned char *chunk_start,
 				 size_t chunk_size, void *data)
 {
 	struct commit_graph *g = (struct commit_graph *)data;
+
+	if (g->chunk_oid_lookup)
+		return report_duplicate();
+
 	g->chunk_oid_lookup = chunk_start;
 	g->num_commits = chunk_size / g->hash_len;
 	return 0;
@@ -310,6 +324,10 @@ static int graph_read_data(const unsigned char *chunk_start,
 				 size_t chunk_size, void *data)
 {
 	struct commit_graph *g = (struct commit_graph *)data;
+
+	if (g->chunk_commit_data)
+		return report_duplicate();
+
 	g->chunk_commit_data = chunk_start;
 	return 0;
 }
@@ -318,6 +336,10 @@ static int graph_read_extra_edges(const unsigned char *chunk_start,
 				  size_t chunk_size, void *data)
 {
 	struct commit_graph *g = (struct commit_graph *)data;
+
+	if (g->chunk_extra_edges)
+		return report_duplicate();
+
 	g->chunk_extra_edges = chunk_start;
 	return 0;
 }
@@ -326,6 +348,10 @@ static int graph_read_base_graphs(const unsigned char *chunk_start,
 				  size_t chunk_size, void *data)
 {
 	struct commit_graph *g = (struct commit_graph *)data;
+
+	if (g->chunk_base_graphs)
+		return report_duplicate();
+
 	g->chunk_base_graphs = chunk_start;
 	return 0;
 }
@@ -334,6 +360,10 @@ static int graph_read_bloom_indices(const unsigned char *chunk_start,
 				    size_t chunk_size, void *data)
 {
 	struct commit_graph *g = (struct commit_graph *)data;
+
+	if (g->chunk_bloom_indexes)
+		return report_duplicate();
+
 	g->chunk_bloom_indexes = chunk_start;
 	return 0;
 }
@@ -343,6 +373,10 @@ static int graph_read_bloom_data(const unsigned char *chunk_start,
 {
 	struct commit_graph *g = (struct commit_graph *)data;
 	uint32_t hash_version;
+
+	if (g->chunk_bloom_data)
+		return report_duplicate();
+
 	g->chunk_bloom_data = chunk_start;
 	hash_version = get_be32(chunk_start);
 
-- 
gitgitgadget


  parent reply	other threads:[~2020-12-03 16:22 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-03 16:16 [PATCH 00/15] Refactor chunk-format into an API Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 01/15] commit-graph: anonymize data in chunk_write_fn Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 02/15] chunk-format: add API for writing table of contents Derrick Stolee via GitGitGadget
2020-12-08 17:56   ` Taylor Blau
2020-12-03 16:16 ` [PATCH 03/15] midx: rename pack_info to write_midx_context Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 04/15] midx: use context in write_midx_pack_names() Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 05/15] midx: add entries to write_midx_context Derrick Stolee via GitGitGadget
2020-12-03 21:42   ` Junio C Hamano
2020-12-04 13:39     ` Derrick Stolee
2020-12-08 18:00   ` Taylor Blau
2020-12-03 16:16 ` [PATCH 06/15] midx: add pack_perm " Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 07/15] midx: add num_large_offsets " Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 08/15] midx: convert chunk write methods to return int Derrick Stolee via GitGitGadget
2020-12-03 21:50   ` Junio C Hamano
2020-12-04 13:40     ` Derrick Stolee
2020-12-03 16:16 ` [PATCH 09/15] midx: drop chunk progress during write Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 10/15] midx: use chunk-format API in write_midx_internal() Derrick Stolee via GitGitGadget
2020-12-08 18:42   ` Taylor Blau
2020-12-10 14:36     ` Derrick Stolee
2020-12-03 16:16 ` [PATCH 11/15] midx: use 64-bit multiplication for chunk sizes Derrick Stolee via GitGitGadget
2020-12-03 22:00   ` Junio C Hamano
2020-12-08 18:43     ` Taylor Blau
2020-12-03 16:16 ` [PATCH 12/15] chunk-format: create write_chunks() Derrick Stolee via GitGitGadget
2020-12-08 18:45   ` Taylor Blau
2020-12-03 16:16 ` [PATCH 13/15] chunk-format: create chunk reading API Derrick Stolee via GitGitGadget
2020-12-03 22:17   ` Junio C Hamano
2020-12-04 13:47     ` Derrick Stolee
2020-12-04 20:17       ` Junio C Hamano
2020-12-03 22:43   ` Junio C Hamano
2020-12-04 13:45     ` Derrick Stolee
2020-12-03 16:16 ` Derrick Stolee via GitGitGadget [this message]
2020-12-07 13:43   ` [PATCH 14/15] commit-graph: restore duplicate chunk checks Derrick Stolee
2020-12-03 16:16 ` [PATCH 15/15] chunk-format: add technical docs Derrick Stolee via GitGitGadget
2020-12-04 12:48 ` [PATCH 00/15] Refactor chunk-format into an API René Scharfe
2020-12-04 13:57   ` Derrick Stolee
2020-12-04 19:42   ` Junio C Hamano
2020-12-08 18:49   ` Taylor Blau
2020-12-09 17:13     ` René Scharfe
2020-12-10  0:50       ` Taylor Blau
2020-12-10 14:30         ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=106dd51f75699fbf4fc1e46687124995f5ef0278.1607012215.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).