git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: Vicent Marti <vicent@github.com>
Subject: [PATCH v2 12/19] rev-list: add bitmap mode to speed up object lists
Date: Fri, 25 Oct 2013 02:04:11 -0400	[thread overview]
Message-ID: <20131025060411.GJ23098@sigill.intra.peff.net> (raw)
In-Reply-To: <20131025055521.GD11810@sigill.intra.peff.net>

From: Vicent Marti <tanoku@gmail.com>

The bitmap reachability index used to speed up the counting objects
phase during `pack-objects` can also be used to optimize a normal
rev-list if the only thing required are the SHA1s of the objects during
the list (i.e., not the path names at which trees and blobs were found).

Calling `git rev-list --objects --use-bitmap-index [committish]` will
perform an object iteration based on a bitmap result instead of actually
walking the object graph.

These are some example timings for `torvalds/linux` (warm cache,
best-of-five):

    $ time git rev-list --objects master > /dev/null

    real    0m34.191s
    user    0m33.904s
    sys     0m0.268s

    $ time git rev-list --objects --use-bitmap-index master > /dev/null

    real    0m1.041s
    user    0m0.976s
    sys     0m0.064s

Likewise, using `git rev-list --count --use-bitmap-index` will speed up
the counting operation by building the resulting bitmap and performing a
fast popcount (number of bits set on the bitmap) on the result.

Here are some sample timings of different ways to count commits in
`torvalds/linux`:

    $ time git rev-list master | wc -l
        399882

        real    0m6.524s
        user    0m6.060s
        sys     0m3.284s

    $ time git rev-list --count master
        399882

        real    0m4.318s
        user    0m4.236s
        sys     0m0.076s

    $ time git rev-list --use-bitmap-index --count master
        399882

        real    0m0.217s
        user    0m0.176s
        sys     0m0.040s

This also respects negative refs, so you can use it to count
a slice of history:

        $ time git rev-list --count v3.0..master
        144843

        real    0m1.971s
        user    0m1.932s
        sys     0m0.036s

        $ time git rev-list --use-bitmap-index --count v3.0..master
        real    0m0.280s
        user    0m0.220s
        sys     0m0.056s

Though note that the closer the endpoints, the less it helps. In the
traversal case, we have fewer commits to cross, so we take less time.
But the bitmap time is dominated by generating the pack revindex, which
is constant with respect to the refs given.

Note that you cannot yet get a fast --left-right count of a symmetric
difference (e.g., "--count --left-right master...topic"). The slow part
of that walk actually happens during the merge-base determination when
we parse "master...topic". Even though a count does not actually need to
know the real merge base (it only needs to take the symmetric difference
of the bitmaps), the revision code would require some refactoring to
handle this case.

Additionally, a `--test-bitmap` flag has been added that will perform
the same rev-list manually (i.e. using a normal revwalk) and using
bitmaps, and verify that the results are the same. This can be used to
exercise the bitmap code, and also to verify that the contents of the
.bitmap file are sane.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/git-rev-list.txt     |  1 +
 Documentation/rev-list-options.txt |  8 ++++++++
 builtin/rev-list.c                 | 39 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+)

diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt
index 045b37b..7a1585d 100644
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -55,6 +55,7 @@ SYNOPSIS
 	     [ \--reverse ]
 	     [ \--walk-reflogs ]
 	     [ \--no-walk ] [ \--do-walk ]
+	     [ \--use-bitmap-index ]
 	     <commit>... [ \-- <paths>... ]
 
 DESCRIPTION
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 5bdfb42..c236b85 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -274,6 +274,14 @@ See also linkgit:git-reflog[1].
 	Output excluded boundary commits. Boundary commits are
 	prefixed with `-`.
 
+ifdef::git-rev-list[]
+--use-bitmap-index::
+
+	Try to speed up the traversal using the pack bitmap index (if
+	one is available). Note that when traversing with `--objects`,
+	trees and blobs will not have their associated path printed.
+endif::git-rev-list[]
+
 --
 
 History Simplification
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 4fc1616..5209255 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -3,6 +3,8 @@
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "pack.h"
+#include "pack-bitmap.h"
 #include "builtin.h"
 #include "log-tree.h"
 #include "graph.h"
@@ -257,6 +259,18 @@ static int show_bisect_vars(struct rev_list_info *info, int reaches, int all)
 	return 0;
 }
 
+static int show_object_fast(
+	const unsigned char *sha1,
+	enum object_type type,
+	int exclude,
+	uint32_t name_hash,
+	struct packed_git *found_pack,
+	off_t found_offset)
+{
+	fprintf(stdout, "%s\n", sha1_to_hex(sha1));
+	return 1;
+}
+
 int cmd_rev_list(int argc, const char **argv, const char *prefix)
 {
 	struct rev_info revs;
@@ -265,6 +279,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	int bisect_list = 0;
 	int bisect_show_vars = 0;
 	int bisect_find_all = 0;
+	int use_bitmap_index = 0;
 
 	git_config(git_default_config, NULL);
 	init_revisions(&revs, prefix);
@@ -306,6 +321,14 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 			bisect_show_vars = 1;
 			continue;
 		}
+		if (!strcmp(arg, "--use-bitmap-index")) {
+			use_bitmap_index = 1;
+			continue;
+		}
+		if (!strcmp(arg, "--test-bitmap")) {
+			test_bitmap_walk(&revs);
+			return 0;
+		}
 		usage(rev_list_usage);
 
 	}
@@ -333,6 +356,22 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (bisect_list)
 		revs.limited = 1;
 
+	if (use_bitmap_index) {
+		if (revs.count && !revs.left_right && !revs.cherry_mark) {
+			uint32_t commit_count;
+			if (!prepare_bitmap_walk(&revs)) {
+				count_bitmap_commit_list(&commit_count, NULL, NULL, NULL);
+				printf("%d\n", commit_count);
+				return 0;
+			}
+		} else if (revs.tag_objects && revs.tree_objects && revs.blob_objects) {
+			if (!prepare_bitmap_walk(&revs)) {
+				traverse_bitmap_commit_list(&show_object_fast);
+				return 0;
+			}
+		}
+	}
+
 	if (prepare_revision_walk(&revs))
 		die("revision walk setup failed");
 	if (revs.tree_objects)
-- 
1.8.4.1.898.g8bf8a41.dirty

  parent reply	other threads:[~2013-10-25  6:04 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-24 17:59 [PATCH 0/19] pack bitmaps Jeff King
2013-10-24 17:59 ` [PATCH 01/19] sha1write: make buffer const-correct Jeff King
2013-10-24 18:00 ` [PATCH 02/19] revindex: Export new APIs Jeff King
2013-10-24 18:01 ` [PATCH 03/19] pack-objects: Refactor the packing list Jeff King
2013-10-24 18:01 ` [PATCH 04/19] pack-objects: factor out name_hash Jeff King
2013-10-24 18:01 ` [PATCH 05/19] revision: allow setting custom limiter function Jeff King
2013-10-24 18:01 ` [PATCH 06/19] sha1_file: export `git_open_noatime` Jeff King
2013-10-24 18:01 ` [PATCH 07/19] compat: add endianness helpers Jeff King
2013-10-26  7:55   ` Thomas Rast
2013-10-30  8:25     ` Jeff King
2013-10-30 17:06       ` Vicent Martí
2013-10-24 18:02 ` [PATCH 08/19] ewah: compressed bitmap implementation Jeff King
2013-10-24 23:34   ` Junio C Hamano
2013-10-25  3:15     ` Jeff King
2013-10-26  7:55   ` Thomas Rast
2013-10-24 18:03 ` [PATCH 09/19] documentation: add documentation for the bitmap format Jeff King
2013-10-25  1:16   ` Duy Nguyen
2013-10-25  3:21     ` Jeff King
2013-10-25  3:28       ` Duy Nguyen
2013-10-25 13:47       ` Shawn Pearce
2013-10-30  7:50         ` Jeff King
2013-10-30 10:23           ` Shawn Pearce
2013-10-30 16:11             ` Vicent Marti
2013-10-30 16:14             ` Vicent Marti
2013-10-24 18:03 ` [PATCH 10/19] pack-bitmap: add support for bitmap indexes Jeff King
2013-10-25 13:55   ` Shawn Pearce
2013-10-30  8:10     ` Jeff King
2013-10-30 10:27       ` Shawn Pearce
2013-10-30 15:47       ` Vicent Marti
2013-10-30 16:04         ` Shawn Pearce
2013-10-30 20:25         ` Jeff King
2013-10-24 18:04 ` [PATCH 11/19] pack-objects: use bitmaps when packing objects Jeff King
2013-10-25 14:14   ` Shawn Pearce
2013-10-30  8:21     ` Jeff King
2013-10-30 10:38       ` Shawn Pearce
2013-10-30 16:01         ` Vicent Marti
2013-10-24 18:06 ` [PATCH 12/19] rev-list: add bitmap mode to speed up object lists Jeff King
2013-10-25 14:00   ` Shawn Pearce
2013-10-30  8:12     ` Jeff King
2013-10-24 18:06 ` [PATCH 13/19] pack-objects: implement bitmap writing Jeff King
2013-10-25  1:21   ` Duy Nguyen
2013-10-25  3:22     ` Jeff King
2013-10-24 18:06 ` [PATCH 14/19] repack: stop using magic number for ARRAY_SIZE(exts) Jeff King
2013-10-24 18:07 ` [PATCH 15/19] repack: turn exts array into array-of-struct Jeff King
2013-10-24 18:07 ` [PATCH 16/19] repack: handle optional files created by pack-objects Jeff King
2013-10-24 18:08 ` [PATCH 17/19] repack: consider bitmaps when performing repacks Jeff King
2013-10-24 18:08 ` [PATCH 18/19] t: add basic bitmap functionality tests Jeff King
2013-10-24 18:08 ` [PATCH 19/19] pack-bitmap: implement optional name_hash cache Jeff King
2013-10-24 20:25 ` [PATCH 0/19] pack bitmaps Junio C Hamano
2013-10-25  3:07 ` Junio C Hamano
2013-10-25  5:55 ` [PATCHv2 " Jeff King
2013-10-25  5:57   ` [PATCH v2 01/19] sha1write: make buffer const-correct Jeff King
2013-10-25  6:02   ` [PATCH 02/19] revindex: Export new APIs Jeff King
2013-10-25  6:03   ` [PATCH v2 03/19] pack-objects: Refactor the packing list Jeff King
2013-10-25  6:03   ` [PATCH v2 04/19] pack-objects: factor out name_hash Jeff King
2013-10-25  6:03   ` [PATCH v2 05/19] revision: allow setting custom limiter function Jeff King
2013-10-25  6:03   ` [PATCH v2 06/19] sha1_file: export `git_open_noatime` Jeff King
2013-10-25  6:03   ` [PATCH v2 07/19] compat: add endianness helpers Jeff King
2013-10-25  6:03   ` [PATCH v2 08/19] ewah: compressed bitmap implementation Jeff King
2013-10-25  6:03   ` [PATCH v2 09/19] documentation: add documentation for the bitmap format Jeff King
2013-10-25  6:03   ` [PATCH v2 10/19] pack-bitmap: add support for bitmap indexes Jeff King
2013-10-25 23:06     ` Junio C Hamano
2013-10-26  0:26       ` Jeff King
2013-10-26  6:25         ` Jeff King
2013-10-28 15:22           ` Junio C Hamano
2013-10-30  7:00             ` Jeff King
2013-10-26 10:14     ` Duy Nguyen
2013-10-30  7:27       ` Jeff King
2013-10-25  6:03   ` [PATCH v2 11/19] pack-objects: use bitmaps when packing objects Jeff King
2013-10-26 10:25     ` Duy Nguyen
2013-10-30  7:36       ` Jeff King
2013-10-30 10:28         ` Duy Nguyen
2013-10-30 20:07           ` Jeff King
2013-10-31 12:03             ` Duy Nguyen
2013-10-25  6:04   ` Jeff King [this message]
2013-10-25  6:04   ` [PATCH v2 13/19] pack-objects: implement bitmap writing Jeff King
2013-10-25  6:04   ` [PATCH v2 14/19] repack: stop using magic number for ARRAY_SIZE(exts) Jeff King
2013-10-25  6:04   ` [PATCH v2 15/19] repack: turn exts array into array-of-struct Jeff King
2013-10-25  6:04   ` [PATCH v2 16/19] repack: handle optional files created by pack-objects Jeff King
2013-10-25  6:04   ` [PATCH v2 17/19] repack: consider bitmaps when performing repacks Jeff King
2013-10-25  6:04   ` [PATCH v2 18/19] t: add basic bitmap functionality tests Jeff King
2013-10-28 22:13     ` SZEDER Gábor
2013-10-30  7:39       ` Jeff King
2013-10-25  6:04   ` [PATCH v2 19/19] pack-bitmap: implement optional name_hash cache Jeff King
2013-10-26 10:19     ` [PATCH 20/19] count-objects: consider .bitmap without .pack/.idx pair garbage Nguyễn Thái Ngọc Duy
2013-10-30  6:59       ` Jeff King
2013-10-30 17:36         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131025060411.GJ23098@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=vicent@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).