git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Jakub Narebski" <jnareb@gmail.com>, "Ted Ts'o" <tytso@mit.edu>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Clemens Buchacher" <drizzd@aon.at>,
	"Shawn O. Pearce" <spearce@spearce.org>
Subject: [RFC/PATCHv2 5/6] check commit generation cache validity against grafts
Date: Wed, 13 Jul 2011 03:06:16 -0400	[thread overview]
Message-ID: <20110713070616.GE18566@sigill.intra.peff.net> (raw)
In-Reply-To: <20110713064709.GA18499@sigill.intra.peff.net>

Some caches, like the commit generation cache, rely on the
shape of the history graph to be accurate. Because commits
are immutable, that shape should never change. However, our
view onto the graph is modified by grafts and replace refs;
if these change, the values in our cache are invalid and
should be regenerated.

We take a pretty heavy-handed approach, and simply throw out
and regenerate the whole cache when either grafts or replace
refs change. In theory we could be slightly more efficient
by comparing the view under which our cache was generated to
the current one. But doing that is complex, and requires
storing the old state.

Instead, we summarize all of the grafts and replace objects
with a single 20-byte sha1. Because the grafts and replace
refs don't tend to change very often, this is simple and
efficient enough.

The actual contents of what we stir into the sha1 are not
important, as long as:

  1. A given state is consistently represented across runs.

  2. Distinct states generate distinct input to the sha1
     function.

Signed-off-by: Jeff King <peff@peff.net>
---
New in this version of the series.

 Documentation/technical/api-metadata-cache.txt |    7 ++++
 cache.h                                        |    1 +
 commit.c                                       |   24 +++++++++++++++-
 commit.h                                       |    2 +
 metadata-cache.c                               |   16 ++++++++++
 metadata-cache.h                               |    3 ++
 replace_object.c                               |   15 ++++++++++
 t/t6070-commit-generations.sh                  |   36 ++++++++++++++++++++++++
 8 files changed, 103 insertions(+), 1 deletions(-)

diff --git a/Documentation/technical/api-metadata-cache.txt b/Documentation/technical/api-metadata-cache.txt
index 192a868..f43b1ba 100644
--- a/Documentation/technical/api-metadata-cache.txt
+++ b/Documentation/technical/api-metadata-cache.txt
@@ -128,3 +128,10 @@ Functions
 	Convenience wrapper for storing unsigned 32-bit integers. Note
 	that integers are stored on disk in network-byte order, so it is
 	safe to access caches from any architecture.
+
+`metadata_graph_validity`::
+
+	This function is intended to be used with `METADATA_CACHE_INIT`
+	as a validity function. It returns a SHA1 summarizing the
+	current state of any commit grafts and replace objects that
+	would affect the shape of the history graph.
diff --git a/cache.h b/cache.h
index bc9e5eb..50e8a1c 100644
--- a/cache.h
+++ b/cache.h
@@ -745,6 +745,7 @@ static inline const unsigned char *lookup_replace_object(const unsigned char *sh
 		return sha1;
 	return do_lookup_replace_object(sha1);
 }
+extern void replace_object_validity(git_SHA_CTX *ctx);
 
 /* Read and unpack a sha1 file into memory, write memory to a sha1 file */
 extern int sha1_object_info(const unsigned char *, unsigned long *);
diff --git a/commit.c b/commit.c
index fb37aa0..e72bb3e 100644
--- a/commit.c
+++ b/commit.c
@@ -246,6 +246,27 @@ int unregister_shallow(const unsigned char *sha1)
 	return 0;
 }
 
+void commit_graft_validity(git_SHA_CTX *ctx)
+{
+	int i;
+
+	prepare_commit_graft();
+
+	for (i = 0; i < commit_graft_nr; i++) {
+		const struct commit_graft *c = commit_graft[i];
+		git_SHA1_Update(ctx, c->sha1, 20);
+		if (c->nr_parent < 0)
+			git_SHA1_Update(ctx, "shallow", 7);
+		else {
+			uint32_t v = htonl(c->nr_parent);
+			int j;
+			git_SHA1_Update(ctx, &v, sizeof(v));
+			for (j = 0; j < c->nr_parent; j++)
+				git_SHA1_Update(ctx, c->parent[j], 20);
+		}
+	}
+}
+
 int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long size)
 {
 	const char *tail = buffer;
@@ -881,7 +902,8 @@ int commit_tree(const char *msg, unsigned char *tree,
 }
 
 static struct metadata_cache generations =
-	METADATA_CACHE_INIT("generations", sizeof(uint32_t), NULL);
+	METADATA_CACHE_INIT("generations", sizeof(uint32_t),
+			    metadata_graph_validity);
 
 static unsigned long commit_generation_recurse(struct commit *c)
 {
diff --git a/commit.h b/commit.h
index bff6b36..e6d144d 100644
--- a/commit.h
+++ b/commit.h
@@ -178,4 +178,6 @@ extern int commit_tree(const char *msg, unsigned char *tree,
 
 unsigned long commit_generation(const struct commit *commit);
 
+void commit_graft_validity(git_SHA_CTX *ctx);
+
 #endif /* COMMIT_H */
diff --git a/metadata-cache.c b/metadata-cache.c
index e2e5ff8..32d3c21 100644
--- a/metadata-cache.c
+++ b/metadata-cache.c
@@ -2,6 +2,7 @@
 #include "metadata-cache.h"
 #include "sha1-lookup.h"
 #include "object.h"
+#include "commit.h"
 
 static struct metadata_cache **autowrite;
 static int autowrite_nr;
@@ -335,3 +336,18 @@ void metadata_cache_add_uint32(struct metadata_cache *c,
 	value = htonl(value);
 	metadata_cache_add(c, obj, &value);
 }
+
+void metadata_graph_validity(unsigned char out[20])
+{
+	git_SHA_CTX ctx;
+
+	git_SHA1_Init(&ctx);
+
+	git_SHA1_Update(&ctx, "grafts", 6);
+	commit_graft_validity(&ctx);
+
+	git_SHA1_Update(&ctx, "replace", 7);
+	replace_object_validity(&ctx);
+
+	git_SHA1_Final(out, &ctx);
+}
diff --git a/metadata-cache.h b/metadata-cache.h
index 5b761e1..15484b5 100644
--- a/metadata-cache.h
+++ b/metadata-cache.h
@@ -37,4 +37,7 @@ void metadata_cache_add_uint32(struct metadata_cache *,
 			       const struct object *,
 			       uint32_t value);
 
+/* Common validity token functions */
+void metadata_graph_validity(unsigned char out[20]);
+
 #endif /* METADATA_CACHE_H */
diff --git a/replace_object.c b/replace_object.c
index d0b1548..9ec462b 100644
--- a/replace_object.c
+++ b/replace_object.c
@@ -115,3 +115,18 @@ const unsigned char *do_lookup_replace_object(const unsigned char *sha1)
 
 	return cur;
 }
+
+void replace_object_validity(git_SHA_CTX *ctx)
+{
+	int i;
+
+	if (!read_replace_refs)
+		return;
+
+	prepare_replace_object();
+
+	for (i = 0; i < replace_object_nr; i++) {
+		git_SHA1_Update(ctx, replace_object[i]->sha1[0], 20);
+		git_SHA1_Update(ctx, replace_object[i]->sha1[1], 20);
+	}
+}
diff --git a/t/t6070-commit-generations.sh b/t/t6070-commit-generations.sh
index 3e0f2ad..0aefd01 100755
--- a/t/t6070-commit-generations.sh
+++ b/t/t6070-commit-generations.sh
@@ -38,4 +38,40 @@ test_expect_success 'cached values are the same' '
 	test_cmp expect actual
 '
 
+cat >expect-grafted <<'EOF'
+1 six
+0 Merge branch 'other'
+EOF
+test_expect_success 'adding grafts invalidates generation cache' '
+	git rev-parse six^ >.git/info/grafts &&
+	git log --format="%G %s" >actual &&
+	test_cmp expect-grafted actual
+'
+
+test_expect_success 'removing graft invalidates cache' '
+	rm .git/info/grafts &&
+	git log --format="%G %s" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'setup replace ref' '
+	H=$(git rev-parse six^) &&
+	R=$(git cat-file commit $H |
+	    sed /^parent/d |
+	    git hash-object -t commit --stdin -w) &&
+	git update-ref refs/replace/$H $R
+'
+
+test_expect_success 'adding replace refs invalidates generation cache' '
+	git log --format="%G %s" >actual &&
+	test_cmp expect-grafted actual
+'
+
+test_expect_success 'cache respects replace-object settings' '
+	git --no-replace-objects log --format="%G %s" >actual &&
+	test_cmp expect actual &&
+	git log --format="%G %s" >actual &&
+	test_cmp expect-grafted actual
+'
+
 test_done
-- 
1.7.6.37.g989c6

  parent reply	other threads:[~2011-07-13  7:06 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-13  6:47 [RFC/PATCHv2 0/6] generation numbers for faster traversals Jeff King
2011-07-13  6:57 ` [RFC/PATCHv2 1/6] decorate: allow storing values instead of pointers Jeff King
2011-07-13 17:52   ` Jonathan Nieder
2011-07-13 20:08     ` Jeff King
2011-07-14 17:34       ` Jeff King
2011-07-14 17:51         ` [PATCH 1/3] implement generic key/value map Jeff King
2011-07-14 18:52           ` Bert Wesarg
2011-07-14 18:54             ` Bert Wesarg
2011-07-14 18:55               ` Jeff King
2011-07-14 19:07                 ` Bert Wesarg
2011-07-14 19:14                   ` Jeff King
2011-07-14 19:18                     ` Bert Wesarg
2011-07-14 17:52         ` [PATCH 2/3] fast-export: use object to uint32 map instead of "decorate" Jeff King
2011-07-15  9:40           ` Sverre Rabbelier
2011-07-15 20:00             ` Jeff King
2011-07-14 17:53         ` [PATCH 3/3] decorate: use "map" for the underlying implementation Jeff King
2011-07-14 21:06         ` [RFC/PATCHv2 1/6] decorate: allow storing values instead of pointers Junio C Hamano
2011-08-04 22:43           ` [RFC/PATCH 0/5] macro-based key/value maps Jeff King
2011-08-04 22:45             ` [PATCH 1/5] implement generic key/value map Jeff King
2011-08-04 22:46             ` [PATCH 2/5] fast-export: use object to uint32 map instead of "decorate" Jeff King
2011-08-04 22:46             ` [PATCH 3/5] decorate: use "map" for the underlying implementation Jeff King
2011-08-04 22:46             ` [PATCH 4/5] map: implement persistent maps Jeff King
2011-08-04 22:46             ` [PATCH 5/5] implement metadata cache subsystem Jeff King
2011-08-04 22:48             ` [RFC/PATCH 0/2] patch-id caching Jeff King
2011-08-04 22:49               ` [PATCH 1/2] cherry: read default config Jeff King
2011-08-04 22:49               ` [PATCH 2/2] cache patch ids on disk Jeff King
2011-08-04 22:52                 ` Jeff King
2011-08-05 11:03             ` [RFC/PATCH 0/5] macro-based key/value maps Jeff King
2011-08-05 15:31               ` René Scharfe
2011-08-06  6:30                 ` Jeff King
2011-07-13  7:04 ` [RFC/PATCHv2 2/6] add metadata-cache infrastructure Jeff King
2011-07-13  8:18   ` Bert Wesarg
2011-07-13  8:31     ` Jeff King
2011-07-13  8:45       ` Bert Wesarg
2011-07-13 19:18         ` Jeff King
2011-07-13 19:40       ` Junio C Hamano
2011-07-13 19:33   ` Junio C Hamano
2011-07-13 20:25     ` Jeff King
2011-07-13  7:05 ` [RFC/PATCHv2 3/6] commit: add commit_generation function Jeff King
2011-07-13 14:26   ` Eric Sunshine
2011-07-13  7:05 ` [RFC/PATCHv2 4/6] pretty: support %G to show the generation number of a commit Jeff King
2011-07-13  7:06 ` Jeff King [this message]
2011-07-13 14:26   ` [RFC/PATCHv2 5/6] check commit generation cache validity against grafts Eric Sunshine
2011-07-13 19:35     ` Jeff King
2011-07-13  7:06 ` [RFC/PATCHv2 6/6] limit "contains" traversals based on commit generation Jeff King
2011-07-13  7:23   ` Jeff King
2011-07-13 20:33     ` Junio C Hamano
2011-07-13 20:58       ` Jeff King
2011-07-13 21:12         ` Junio C Hamano
2011-07-13 21:18           ` Jeff King
2011-07-15 18:22   ` Junio C Hamano
2011-07-15 20:40     ` Jeff King
2011-07-15 21:04       ` Junio C Hamano
2011-07-15 21:14         ` Jeff King
2011-07-15 21:01 ` Generation numbers and replacement objects Jakub Narebski
2011-07-15 21:10   ` Jeff King
2011-07-16 21:10     ` Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110713070616.GE18566@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=drizzd@aon.at \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jnareb@gmail.com \
    --cc=jrnieder@gmail.com \
    --cc=spearce@spearce.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).