git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Jakub Narebski" <jnareb@gmail.com>, "Ted Ts'o" <tytso@mit.edu>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Clemens Buchacher" <drizzd@aon.at>
Subject: [PATCH 5/5] limit "contains" traversals based on commit generation
Date: Mon, 11 Jul 2011 12:18:46 -0400	[thread overview]
Message-ID: <20110711161846.GE10418@sigill.intra.peff.net> (raw)
In-Reply-To: <20110711161332.GA10057@sigill.intra.peff.net>

When looking for commits that contain other commits (e.g.,
via "git tag --contains"), we can end up traversing useless
portions of the graph. For example, if I am looking for a
tag that contains a commit made last week, there is not much
point in traversing portions of the history graph made five
years ago.

This optimization can provide massive speedups. For example,
doing "git tag --contains HEAD~1000" in the linux-2.6
repository goes from:

  real    0m3.139s
  user    0m3.044s
  sys     0m0.092s

to:

  real    0m0.035s
  user    0m0.028s
  sys     0m0.004s

We could use commit timestamps to know when we are going too
far back in history, but they are sometimes not trustworthy.
Extreme clock skew on committers' machines (or bugs in
commit-generating software) mean that we may stop the
traversal too early when seeing commits skewed into the
past.

Instead, we use the calculated commit generation, which is a
propery of the graph itself (but since we cache it, it's
still cheap to consult).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/tag.c |   20 +++++++++++++++++---
 1 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 63bce6e..df6de47 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -40,7 +40,8 @@ static int in_commit_list(const struct commit_list *want, struct commit *c)
 }
 
 static int contains_recurse(struct commit *candidate,
-			    const struct commit_list *want)
+			    const struct commit_list *want,
+			    unsigned long cutoff)
 {
 	struct commit_list *p;
 
@@ -57,9 +58,13 @@ static int contains_recurse(struct commit *candidate,
 	if (parse_commit(candidate) < 0)
 		return 0;
 
+	/* stop searching if we go too far back in time */
+	if (commit_generation(candidate) < cutoff)
+		return 0;
+
 	/* Otherwise recurse and mark ourselves for future traversals. */
 	for (p = candidate->parents; p; p = p->next) {
-		if (contains_recurse(p->item, want)) {
+		if (contains_recurse(p->item, want, cutoff)) {
 			candidate->object.flags |= TMP_MARK;
 			return 1;
 		}
@@ -70,7 +75,16 @@ static int contains_recurse(struct commit *candidate,
 
 static int contains(struct commit *candidate, const struct commit_list *want)
 {
-	return contains_recurse(candidate, want);
+	unsigned long cutoff = ULONG_MAX;
+	const struct commit_list *c;
+
+	for (c = want; c; c = c->next) {
+		unsigned long g = commit_generation(c->item);
+		if (g < cutoff)
+			cutoff = g;
+	}
+
+	return contains_recurse(candidate, want, cutoff);
 }
 
 static int show_reference(const char *refname, const unsigned char *sha1,
-- 
1.7.6.37.g989c6

      parent reply	other threads:[~2011-07-11 16:18 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-11 16:13 [RFC/PATCH 0/5] generation numbers for faster traversals Jeff King
2011-07-11 16:16 ` [PATCH 1/5] decorate: allow storing values instead of pointers Jeff King
2011-07-11 16:39   ` Jeff King
2011-07-11 19:06   ` Junio C Hamano
2011-07-11 21:20     ` Jeff King
2011-07-11 16:17 ` [PATCH 2/5] add object-cache infrastructure Jeff King
2011-07-11 16:46   ` Jeff King
2011-07-11 16:58     ` Shawn Pearce
2011-07-11 19:17   ` Junio C Hamano
2011-07-11 22:01     ` Jeff King
2011-07-11 23:21       ` Junio C Hamano
2011-07-11 23:42         ` Junio C Hamano
2011-07-12  0:03         ` Jeff King
2011-07-12 19:38           ` Clemens Buchacher
2011-07-12 19:45             ` Jeff King
2011-07-12 21:07               ` Clemens Buchacher
2011-07-12 21:15                 ` Clemens Buchacher
2011-07-12 21:36                 ` Jeff King
2011-07-14  8:04                   ` Clemens Buchacher
2011-07-14 16:26                     ` Illia Bobyr
2011-07-13  1:33                 ` John Szakmeister
2011-07-12  0:14         ` Illia Bobyr
2011-07-12  5:35           ` Jeff King
2011-07-12 21:52             ` Illia Bobyr
2011-07-12  6:36       ` Miles Bader
2011-07-12 10:41   ` Jakub Narebski
2011-07-12 17:57     ` Jeff King
2011-07-12 18:41       ` Junio C Hamano
2011-07-13  6:37         ` Jeff King
2011-07-13 17:49           ` Junio C Hamano
2011-07-11 16:18 ` [PATCH 3/5] commit: add commit_generation function Jeff King
2011-07-11 17:57   ` Clemens Buchacher
2011-07-11 21:10     ` Jeff King
2011-07-11 16:18 ` [PATCH 4/5] pretty: support %G to show the generation number of a commit Jeff King
2011-07-11 16:18 ` Jeff King [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110711161846.GE10418@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=drizzd@aon.at \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jnareb@gmail.com \
    --cc=jrnieder@gmail.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).