git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 2/4] pack-bitmap.c: make object filtering functions generic
  2020-04-22 23:13 [PATCH 0/4] pack-bitmap: use bitmaps for traversals with '--filter=tree:0' Taylor Blau
@ 2020-04-22 23:13 ` Taylor Blau
  0 siblings, 0 replies; 13+ messages in thread
From: Taylor Blau @ 2020-04-22 23:13 UTC (permalink / raw)
  To: git; +Cc: peff, chriscool

In 4f3bd5606a (pack-bitmap: implement BLOB_NONE filtering, 2020-02-14),
filtering support for bitmaps was added for the 'LOFC_BLOB_NONE' filter.

In the future, we would like to add support for filters that behave as
if they exclude a certain type of object, for e.g., the tree depth
filter with depth 0.

To prepare for this, make some of the functions used for filtering more
generic, such as 'find_tip_blobs' and 'filter_bitmap_blob_none' so that
they can work over arbitrary object types.

To that end, create 'find_tip_objects' and
'filter_bitmap_exclude_type', and redefine the aforementioned functions
in terms of those.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap.c | 35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 49a8d10d0c..3693c9e62f 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -715,8 +715,9 @@ static int in_bitmapped_pack(struct bitmap_index *bitmap_git,
 	return 0;
 }
 
-static struct bitmap *find_tip_blobs(struct bitmap_index *bitmap_git,
-				     struct object_list *tip_objects)
+static struct bitmap *find_tip_objects(struct bitmap_index *bitmap_git,
+				       struct object_list *tip_objects,
+				       enum object_type type)
 {
 	struct bitmap *result = bitmap_new();
 	struct object_list *p;
@@ -724,7 +725,7 @@ static struct bitmap *find_tip_blobs(struct bitmap_index *bitmap_git,
 	for (p = tip_objects; p; p = p->next) {
 		int pos;
 
-		if (p->item->type != OBJ_BLOB)
+		if (p->item->type != type)
 			continue;
 
 		pos = bitmap_position(bitmap_git, &p->item->oid);
@@ -737,9 +738,10 @@ static struct bitmap *find_tip_blobs(struct bitmap_index *bitmap_git,
 	return result;
 }
 
-static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git,
-				    struct object_list *tip_objects,
-				    struct bitmap *to_filter)
+static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
+				       struct object_list *tip_objects,
+				       struct bitmap *to_filter,
+				       enum object_type type)
 {
 	struct eindex *eindex = &bitmap_git->ext_index;
 	struct bitmap *tips;
@@ -747,18 +749,21 @@ static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git,
 	eword_t mask;
 	uint32_t i;
 
+	if (type != OBJ_BLOB)
+		BUG("filter_bitmap_exclude_type: unsupported type '%d'", type);
+
 	/*
 	 * The non-bitmap version of this filter never removes
-	 * blobs which the other side specifically asked for,
+	 * objects which the other side specifically asked for,
 	 * so we must match that behavior.
 	 */
-	tips = find_tip_blobs(bitmap_git, tip_objects);
+	tips = find_tip_objects(bitmap_git, tip_objects, type);
 
 	/*
 	 * We can use the blob type-bitmap to work in whole words
 	 * for the objects that are actually in the bitmapped packfile.
 	 */
-	for (i = 0, init_type_iterator(&it, bitmap_git, OBJ_BLOB);
+	for (i = 0, init_type_iterator(&it, bitmap_git, type);
 	     i < to_filter->word_alloc && ewah_iterator_next(&mask, &it);
 	     i++) {
 		if (i < tips->word_alloc)
@@ -773,7 +778,7 @@ static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git,
 	 */
 	for (i = 0; i < eindex->count; i++) {
 		uint32_t pos = i + bitmap_git->pack->num_objects;
-		if (eindex->objects[i]->type == OBJ_BLOB &&
+		if (eindex->objects[i]->type == type &&
 		    bitmap_get(to_filter, pos) &&
 		    !bitmap_get(tips, pos))
 			bitmap_unset(to_filter, pos);
@@ -782,6 +787,14 @@ static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git,
 	bitmap_free(tips);
 }
 
+static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git,
+				    struct object_list *tip_objects,
+				    struct bitmap *to_filter)
+{
+	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
+				   OBJ_BLOB);
+}
+
 static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git,
 				     uint32_t pos)
 {
@@ -820,7 +833,7 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git,
 	eword_t mask;
 	uint32_t i;
 
-	tips = find_tip_blobs(bitmap_git, tip_objects);
+	tips = find_tip_objects(bitmap_git, tip_objects, OBJ_BLOB);
 
 	for (i = 0, init_type_iterator(&it, bitmap_git, OBJ_BLOB);
 	     i < to_filter->word_alloc && ewah_iterator_next(&mask, &it);
-- 
2.26.2.112.g65467a058e


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 0/4] pack-bitmap: use bitmaps for traversals with '--filter=tree:0'
@ 2020-05-04 23:12 Taylor Blau
  2020-05-04 23:12 ` [PATCH 1/4] list-objects-filter: treat NULL filter_options as "disabled" Taylor Blau
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Taylor Blau @ 2020-05-04 23:12 UTC (permalink / raw)
  To: git; +Cc: peff, chriscool

Hi,

This is a re-submission of mine and Peff's series in [1] to get some
more eyes on it. The intent of this series is to take advantage of
the type bitmaps when using '--filter=tree:0'.

We have been running these patches at GitHub for a couple of months now
to power fetches with '--filter=tree:0' (but no other values of N,
patches to come later to address this).

The content of the patches attached are the same as in
their original submission; the only change on my part has been to
rebase them over the latest from master.

Thanks in advance for your review.

[1]: https://lore.kernel.org/git/cover.1587597151.git.me@ttaylorr.com/

Jeff King (2):
  list-objects-filter: treat NULL filter_options as "disabled"
  pack-bitmap: pass object filter to fill-in traversal

Taylor Blau (2):
  pack-bitmap.c: make object filtering functions generic
  pack-bitmap.c: support 'tree:0' filtering

 list-objects-filter.c              |  3 ++
 pack-bitmap.c                      | 72 +++++++++++++++++++++++-------
 t/perf/p5310-pack-bitmaps.sh       | 10 +++++
 t/t6113-rev-list-bitmap-filters.sh | 21 +++++++++
 4 files changed, 90 insertions(+), 16 deletions(-)

--
2.26.0.113.ge9739cdccc

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/4] list-objects-filter: treat NULL filter_options as "disabled"
  2020-05-04 23:12 [PATCH 0/4] pack-bitmap: use bitmaps for traversals with '--filter=tree:0' Taylor Blau
@ 2020-05-04 23:12 ` Taylor Blau
  2020-05-05  5:07   ` Junio C Hamano
  2020-05-04 23:12 ` [PATCH 2/4] pack-bitmap.c: make object filtering functions generic Taylor Blau
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Taylor Blau @ 2020-05-04 23:12 UTC (permalink / raw)
  To: git; +Cc: peff, chriscool

From: Jeff King <peff@peff.net>

In most callers, we have an actual list_objects_filter_options struct,
and if no filtering is desired its "choice" element will be
LOFC_DISABLED. However, some code may have only a pointer to such a
struct which may be NULL (because _their_ callers didn't care about
filtering, either). Rather than forcing them to handle this explicitly
like:

  if (filter_options)
          traverse_commit_list_filtered(filter_options, revs,
	                                show_commit, show_object,
					show_data, NULL);
  else
          traverse_commit_list(revs, show_commit, show_object,
	                             show_data);

let's just treat a NULL filter_options the same as LOFC_DISABLED. We
only need a small change, since that option struct is converted into a
real filter only in the "init" function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 list-objects-filter.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/list-objects-filter.c b/list-objects-filter.c
index 1e8d4e763d..0a3ef3cab3 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -663,6 +663,9 @@ struct filter *list_objects_filter__init(
 
 	assert((sizeof(s_filters) / sizeof(s_filters[0])) == LOFC__COUNT);
 
+	if (!filter_options)
+		return NULL;
+
 	if (filter_options->choice >= LOFC__COUNT)
 		BUG("invalid list-objects filter choice: %d",
 		    filter_options->choice);
-- 
2.26.0.113.ge9739cdccc


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/4] pack-bitmap.c: make object filtering functions generic
  2020-05-04 23:12 [PATCH 0/4] pack-bitmap: use bitmaps for traversals with '--filter=tree:0' Taylor Blau
  2020-05-04 23:12 ` [PATCH 1/4] list-objects-filter: treat NULL filter_options as "disabled" Taylor Blau
@ 2020-05-04 23:12 ` Taylor Blau
  2020-05-05  5:12   ` Junio C Hamano
  2020-05-04 23:12 ` [PATCH 3/4] pack-bitmap.c: support 'tree:0' filtering Taylor Blau
  2020-05-04 23:12 ` [PATCH 4/4] pack-bitmap: pass object filter to fill-in traversal Taylor Blau
  3 siblings, 1 reply; 13+ messages in thread
From: Taylor Blau @ 2020-05-04 23:12 UTC (permalink / raw)
  To: git; +Cc: peff, chriscool

In 4f3bd5606a (pack-bitmap: implement BLOB_NONE filtering, 2020-02-14),
filtering support for bitmaps was added for the 'LOFC_BLOB_NONE' filter.

In the future, we would like to add support for filters that behave as
if they exclude a certain type of object, for e.g., the tree depth
filter with depth 0.

To prepare for this, make some of the functions used for filtering more
generic, such as 'find_tip_blobs' and 'filter_bitmap_blob_none' so that
they can work over arbitrary object types.

To that end, create 'find_tip_objects' and
'filter_bitmap_exclude_type', and redefine the aforementioned functions
in terms of those.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap.c | 35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 49a8d10d0c..3693c9e62f 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -715,8 +715,9 @@ static int in_bitmapped_pack(struct bitmap_index *bitmap_git,
 	return 0;
 }
 
-static struct bitmap *find_tip_blobs(struct bitmap_index *bitmap_git,
-				     struct object_list *tip_objects)
+static struct bitmap *find_tip_objects(struct bitmap_index *bitmap_git,
+				       struct object_list *tip_objects,
+				       enum object_type type)
 {
 	struct bitmap *result = bitmap_new();
 	struct object_list *p;
@@ -724,7 +725,7 @@ static struct bitmap *find_tip_blobs(struct bitmap_index *bitmap_git,
 	for (p = tip_objects; p; p = p->next) {
 		int pos;
 
-		if (p->item->type != OBJ_BLOB)
+		if (p->item->type != type)
 			continue;
 
 		pos = bitmap_position(bitmap_git, &p->item->oid);
@@ -737,9 +738,10 @@ static struct bitmap *find_tip_blobs(struct bitmap_index *bitmap_git,
 	return result;
 }
 
-static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git,
-				    struct object_list *tip_objects,
-				    struct bitmap *to_filter)
+static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
+				       struct object_list *tip_objects,
+				       struct bitmap *to_filter,
+				       enum object_type type)
 {
 	struct eindex *eindex = &bitmap_git->ext_index;
 	struct bitmap *tips;
@@ -747,18 +749,21 @@ static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git,
 	eword_t mask;
 	uint32_t i;
 
+	if (type != OBJ_BLOB)
+		BUG("filter_bitmap_exclude_type: unsupported type '%d'", type);
+
 	/*
 	 * The non-bitmap version of this filter never removes
-	 * blobs which the other side specifically asked for,
+	 * objects which the other side specifically asked for,
 	 * so we must match that behavior.
 	 */
-	tips = find_tip_blobs(bitmap_git, tip_objects);
+	tips = find_tip_objects(bitmap_git, tip_objects, type);
 
 	/*
 	 * We can use the blob type-bitmap to work in whole words
 	 * for the objects that are actually in the bitmapped packfile.
 	 */
-	for (i = 0, init_type_iterator(&it, bitmap_git, OBJ_BLOB);
+	for (i = 0, init_type_iterator(&it, bitmap_git, type);
 	     i < to_filter->word_alloc && ewah_iterator_next(&mask, &it);
 	     i++) {
 		if (i < tips->word_alloc)
@@ -773,7 +778,7 @@ static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git,
 	 */
 	for (i = 0; i < eindex->count; i++) {
 		uint32_t pos = i + bitmap_git->pack->num_objects;
-		if (eindex->objects[i]->type == OBJ_BLOB &&
+		if (eindex->objects[i]->type == type &&
 		    bitmap_get(to_filter, pos) &&
 		    !bitmap_get(tips, pos))
 			bitmap_unset(to_filter, pos);
@@ -782,6 +787,14 @@ static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git,
 	bitmap_free(tips);
 }
 
+static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git,
+				    struct object_list *tip_objects,
+				    struct bitmap *to_filter)
+{
+	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
+				   OBJ_BLOB);
+}
+
 static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git,
 				     uint32_t pos)
 {
@@ -820,7 +833,7 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git,
 	eword_t mask;
 	uint32_t i;
 
-	tips = find_tip_blobs(bitmap_git, tip_objects);
+	tips = find_tip_objects(bitmap_git, tip_objects, OBJ_BLOB);
 
 	for (i = 0, init_type_iterator(&it, bitmap_git, OBJ_BLOB);
 	     i < to_filter->word_alloc && ewah_iterator_next(&mask, &it);
-- 
2.26.0.113.ge9739cdccc


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/4] pack-bitmap.c: support 'tree:0' filtering
  2020-05-04 23:12 [PATCH 0/4] pack-bitmap: use bitmaps for traversals with '--filter=tree:0' Taylor Blau
  2020-05-04 23:12 ` [PATCH 1/4] list-objects-filter: treat NULL filter_options as "disabled" Taylor Blau
  2020-05-04 23:12 ` [PATCH 2/4] pack-bitmap.c: make object filtering functions generic Taylor Blau
@ 2020-05-04 23:12 ` Taylor Blau
  2020-05-05  5:25   ` Junio C Hamano
  2020-05-04 23:12 ` [PATCH 4/4] pack-bitmap: pass object filter to fill-in traversal Taylor Blau
  3 siblings, 1 reply; 13+ messages in thread
From: Taylor Blau @ 2020-05-04 23:12 UTC (permalink / raw)
  To: git; +Cc: peff, chriscool

In the previous patch, we made it easy to define other filters that
exclude all objects of a certain type. Use that in order to implement
bitmap-level filtering for the '--filter=tree:<n>' filter when 'n' is
equal to 0.

The general case is not helped by bitmaps, since for values of 'n > 0',
the object filtering machinery requires a full-blown tree traversal in
order to determine the depth of a given tree. Caching this is
non-obvious, too, since the same tree object can have a different depth
depending on the context (e.g., a tree was moved up in the directory
hierarchy between two commits).

But, the 'n = 0' case can be helped, and this patch does so. Running
p5310.11 in this tree and on master with the kernel, we can see that
this case is helped substantially:

  Test                                  master              this tree
  --------------------------------------------------------------------------------
  5310.11: rev-list count with tree:0   10.68(10.39+0.27)   0.06(0.04+0.01) -99.4%

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap.c                      | 25 ++++++++++++++++++++++++-
 t/perf/p5310-pack-bitmaps.sh       |  5 +++++
 t/t6113-rev-list-bitmap-filters.sh | 21 +++++++++++++++++++++
 3 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 3693c9e62f..195ee8cad0 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -749,7 +749,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
 	eword_t mask;
 	uint32_t i;
 
-	if (type != OBJ_BLOB)
+	if (type != OBJ_BLOB && type != OBJ_TREE)
 		BUG("filter_bitmap_exclude_type: unsupported type '%d'", type);
 
 	/*
@@ -867,6 +867,20 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git,
 	bitmap_free(tips);
 }
 
+static void filter_bitmap_tree_depth(struct bitmap_index *bitmap_git,
+				     struct object_list *tip_objects,
+				     struct bitmap *to_filter,
+				     unsigned long limit)
+{
+	if (limit)
+		BUG("filter_bitmap_tree_depth given non-zero limit");
+
+	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
+				   OBJ_TREE);
+	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
+				   OBJ_BLOB);
+}
+
 static int filter_bitmap(struct bitmap_index *bitmap_git,
 			 struct object_list *tip_objects,
 			 struct bitmap *to_filter,
@@ -890,6 +904,15 @@ static int filter_bitmap(struct bitmap_index *bitmap_git,
 		return 0;
 	}
 
+	if (filter->choice == LOFC_TREE_DEPTH &&
+	    filter->tree_exclude_depth == 0) {
+		if (bitmap_git)
+			filter_bitmap_tree_depth(bitmap_git, tip_objects,
+						 to_filter,
+						 filter->tree_exclude_depth);
+		return 0;
+	}
+
 	/* filter choice not handled */
 	return -1;
 }
diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh
index 80c53edca7..75ccf9f4e3 100755
--- a/t/perf/p5310-pack-bitmaps.sh
+++ b/t/perf/p5310-pack-bitmaps.sh
@@ -53,6 +53,11 @@ test_perf 'rev-list count with blob:limit=1k' '
 		--filter=blob:limit=1k >/dev/null
 '
 
+test_perf 'rev-list count with tree:0' '
+	git rev-list --use-bitmap-index --count --objects --all \
+		--filter=tree:0 >/dev/null
+'
+
 test_perf 'simulated partial clone' '
 	git pack-objects --stdout --all --filter=blob:none </dev/null >/dev/null
 '
diff --git a/t/t6113-rev-list-bitmap-filters.sh b/t/t6113-rev-list-bitmap-filters.sh
index 145603f124..2b551e6fd0 100755
--- a/t/t6113-rev-list-bitmap-filters.sh
+++ b/t/t6113-rev-list-bitmap-filters.sh
@@ -53,4 +53,25 @@ test_expect_success 'blob:limit filter with specified blob' '
 	test_bitmap_traversal expect actual
 '
 
+test_expect_success 'tree:0 filter' '
+	git rev-list --objects --filter=tree:0 HEAD >expect &&
+	git rev-list --use-bitmap-index \
+		     --objects --filter=tree:0 HEAD >actual &&
+	test_bitmap_traversal expect actual
+'
+
+test_expect_success 'tree:0 filter with specified blob, tree' '
+	git rev-list --objects --filter=tree:0 HEAD HEAD:two.t >expect &&
+	git rev-list --use-bitmap-index \
+		     --objects --filter=tree:0 HEAD HEAD:two.t >actual &&
+	test_bitmap_traversal expect actual
+'
+
+test_expect_success 'tree:1 filter' '
+	git rev-list --objects --filter=tree:1 HEAD >expect &&
+	git rev-list --use-bitmap-index \
+		     --objects --filter=tree:1 HEAD >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.26.0.113.ge9739cdccc


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/4] pack-bitmap: pass object filter to fill-in traversal
  2020-05-04 23:12 [PATCH 0/4] pack-bitmap: use bitmaps for traversals with '--filter=tree:0' Taylor Blau
                   ` (2 preceding siblings ...)
  2020-05-04 23:12 ` [PATCH 3/4] pack-bitmap.c: support 'tree:0' filtering Taylor Blau
@ 2020-05-04 23:12 ` Taylor Blau
  2020-05-05  5:40   ` Junio C Hamano
  3 siblings, 1 reply; 13+ messages in thread
From: Taylor Blau @ 2020-05-04 23:12 UTC (permalink / raw)
  To: git; +Cc: peff, chriscool

From: Jeff King <peff@peff.net>

Sometimes a bitmap traversal still has to walk some commits manually,
because those commits aren't included in the bitmap packfile (e.g., due
to a push or commit since the last full repack). If we're given an
object filter, we don't pass it down to this traversal. It's not
necessary for correctness because the bitmap code has its own filters to
post-process the bitmap result (which it must, to filter out the objects
that _are_ mentioned in the bitmapped packfile).

And with blob filters, there was no performance reason to pass along
those filters, either. The fill-in traversal could omit them from the
result, but it wouldn't save us any time to do so, since we'd still have
to walk each tree entry to see if it's a blob or not.

But now that we support tree filters, there's opportunity for savings. A
tree:depth=0 filter means we can avoid accessing trees entirely, since
we know we won't them (or any of the subtrees or blobs they point to).
The new test in p5310 shows this off (the "partial bitmap" state is one
where HEAD~100 and its ancestors are all in a bitmapped pack, but
HEAD~100..HEAD are not). Here are the results (run against linux.git):

  Test                                                  HEAD^               HEAD
  -------------------------------------------------------------------------------------------------
  [...]
  5310.16: rev-list with tree filter (partial bitmap)   0.19(0.17+0.02)     0.03(0.02+0.01) -84.2%

The absolute number of savings isn't _huge_, but keep in mind that we
only omitted 100 first-parent links (in the version of linux.git here,
that's 894 actual commits). In a more pathological case, we might have a
much larger proportion of non-bitmapped commits. I didn't bother
creating such a case in the perf script because the setup is expensive,
and this is plenty to show the savings as a percentage.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap.c                | 14 +++++++++-----
 t/perf/p5310-pack-bitmaps.sh |  5 +++++
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 195ee8cad0..4077e731e8 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -506,7 +506,8 @@ static int should_include(struct commit *commit, void *_data)
 static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
 				   struct rev_info *revs,
 				   struct object_list *roots,
-				   struct bitmap *seen)
+				   struct bitmap *seen,
+				   struct list_objects_filter_options *filter)
 {
 	struct bitmap *base = NULL;
 	int needs_walk = 0;
@@ -599,8 +600,9 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
 		show_data.bitmap_git = bitmap_git;
 		show_data.base = base;
 
-		traverse_commit_list(revs, show_commit, show_object,
-				     &show_data);
+		traverse_commit_list_filtered(filter, revs,
+					      show_commit, show_object,
+					      &show_data, NULL);
 	}
 
 	return base;
@@ -999,7 +1001,8 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs,
 
 	if (haves) {
 		revs->ignore_missing_links = 1;
-		haves_bitmap = find_objects(bitmap_git, revs, haves, NULL);
+		haves_bitmap = find_objects(bitmap_git, revs, haves, NULL,
+					    filter);
 		reset_revision_walk();
 		revs->ignore_missing_links = 0;
 
@@ -1007,7 +1010,8 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs,
 			BUG("failed to perform bitmap walk");
 	}
 
-	wants_bitmap = find_objects(bitmap_git, revs, wants, haves_bitmap);
+	wants_bitmap = find_objects(bitmap_git, revs, wants, haves_bitmap,
+				    filter);
 
 	if (!wants_bitmap)
 		BUG("failed to perform bitmap walk");
diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh
index 75ccf9f4e3..b3e725f031 100755
--- a/t/perf/p5310-pack-bitmaps.sh
+++ b/t/perf/p5310-pack-bitmaps.sh
@@ -91,4 +91,9 @@ test_perf 'pack to file (partial bitmap)' '
 	git pack-objects --use-bitmap-index --all pack2b </dev/null >/dev/null
 '
 
+test_perf 'rev-list with tree filter (partial bitmap)' '
+	git rev-list --use-bitmap-index --count --objects --all \
+		--filter=tree:0 >/dev/null
+'
+
 test_done
-- 
2.26.0.113.ge9739cdccc

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/4] list-objects-filter: treat NULL filter_options as "disabled"
  2020-05-04 23:12 ` [PATCH 1/4] list-objects-filter: treat NULL filter_options as "disabled" Taylor Blau
@ 2020-05-05  5:07   ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2020-05-05  5:07 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, chriscool, Emily Shaffer

Taylor Blau <me@ttaylorr.com> writes:

> From: Jeff King <peff@peff.net>
>
> In most callers, we have an actual list_objects_filter_options struct,
> and if no filtering is desired its "choice" element will be
> LOFC_DISABLED. However, some code may have only a pointer to such a
> struct which may be NULL (because _their_ callers didn't care about
> filtering, either). Rather than forcing them to handle this explicitly
> like:
>
>   if (filter_options)
>           traverse_commit_list_filtered(filter_options, revs,
> 	                                show_commit, show_object,
> 					show_data, NULL);
>   else
>           traverse_commit_list(revs, show_commit, show_object,
> 	                             show_data);
>
> let's just treat a NULL filter_options the same as LOFC_DISABLED. We
> only need a small change, since that option struct is converted into a
> real filter only in the "init" function.

This is not a problem the patch introduces, but anytime we improve
the revision traversal API, Documentation/MyFirstObjectWalk.txt
risks to be left behind (in the sense that it still is correct but
is now suboptimal) or outright broken (when we change the function
signature).  I wonder if we can do some "mark-up" to that tutorial
so that we can extract runnable code in some of the tests and make
sure the code snippet in the documentation is still OK.

As to the patch, there apparently is no existing caller that passes
NULL to the function (or they would be immediately segfaulting), so
by definition, this step alone cannot break anything, but at the
same time, it is a bit hard to assess if this is a good change with
this step alone.  So let's read on.

Thanks.

>
> Signed-off-by: Jeff King <peff@peff.net>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  list-objects-filter.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/list-objects-filter.c b/list-objects-filter.c
> index 1e8d4e763d..0a3ef3cab3 100644
> --- a/list-objects-filter.c
> +++ b/list-objects-filter.c
> @@ -663,6 +663,9 @@ struct filter *list_objects_filter__init(
>  
>  	assert((sizeof(s_filters) / sizeof(s_filters[0])) == LOFC__COUNT);
>  
> +	if (!filter_options)
> +		return NULL;
> +
>  	if (filter_options->choice >= LOFC__COUNT)
>  		BUG("invalid list-objects filter choice: %d",
>  		    filter_options->choice);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/4] pack-bitmap.c: make object filtering functions generic
  2020-05-04 23:12 ` [PATCH 2/4] pack-bitmap.c: make object filtering functions generic Taylor Blau
@ 2020-05-05  5:12   ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2020-05-05  5:12 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, chriscool

Taylor Blau <me@ttaylorr.com> writes:

> In 4f3bd5606a (pack-bitmap: implement BLOB_NONE filtering, 2020-02-14),
> filtering support for bitmaps was added for the 'LOFC_BLOB_NONE' filter.
>
> In the future, we would like to add support for filters that behave as
> if they exclude a certain type of object, for e.g., the tree depth
> filter with depth 0.
>
> To prepare for this, make some of the functions used for filtering more
> generic, such as 'find_tip_blobs' and 'filter_bitmap_blob_none' so that
> they can work over arbitrary object types.
>
> To that end, create 'find_tip_objects' and
> 'filter_bitmap_exclude_type', and redefine the aforementioned functions
> in terms of those.

OK, so instead of blobs at the tip, we find objects of requested
type at the tip, which makes sense.

Similarly, blob_none filter used to exclude blobs at the tip, but a
more generic form is to exclude objects of requested type at the
tip.

And these "more generic" variants are named quite intuitively.

Then the blob_none filter becomes a thin wrapper of the more generic
one, with hardcoded OBJ_BLOB type.

Nicely done.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] pack-bitmap.c: support 'tree:0' filtering
  2020-05-04 23:12 ` [PATCH 3/4] pack-bitmap.c: support 'tree:0' filtering Taylor Blau
@ 2020-05-05  5:25   ` Junio C Hamano
  2020-05-05 15:59     ` Taylor Blau
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2020-05-05  5:25 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, chriscool

Taylor Blau <me@ttaylorr.com> writes:

> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index 3693c9e62f..195ee8cad0 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -749,7 +749,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
>  	eword_t mask;
>  	uint32_t i;
>  
> -	if (type != OBJ_BLOB)
> +	if (type != OBJ_BLOB && type != OBJ_TREE)
>  		BUG("filter_bitmap_exclude_type: unsupported type '%d'", type);

OK.  This is the same as the previous step, but why would we even
need this guard?  find_tip_objects() is equipped to find tips of any
object type, iterating on the bitmap for "type", or flipping the
bits in the to_filter bitmap, does not have any limitation to the
blob type in the previous step, and there is no limitation to the
blob or tree types after this step, either, no?

> @@ -867,6 +867,20 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git,
>  	bitmap_free(tips);
>  }
>  
> +static void filter_bitmap_tree_depth(struct bitmap_index *bitmap_git,
> +				     struct object_list *tip_objects,
> +				     struct bitmap *to_filter,
> +				     unsigned long limit)
> +{
> +	if (limit)
> +		BUG("filter_bitmap_tree_depth given non-zero limit");

This one does make sense, because the code to exclude all trees and
all blobs we have below won't be able to cull only trees at a given
depth or deeper.

> +	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
> +				   OBJ_TREE);
> +	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
> +				   OBJ_BLOB);

And these two are quite straight-forward.

> +}
> +
>  static int filter_bitmap(struct bitmap_index *bitmap_git,
>  			 struct object_list *tip_objects,
>  			 struct bitmap *to_filter,
> @@ -890,6 +904,15 @@ static int filter_bitmap(struct bitmap_index *bitmap_git,
>  		return 0;
>  	}
>  
> +	if (filter->choice == LOFC_TREE_DEPTH &&
> +	    filter->tree_exclude_depth == 0) {
> +		if (bitmap_git)
> +			filter_bitmap_tree_depth(bitmap_git, tip_objects,
> +						 to_filter,
> +						 filter->tree_exclude_depth);

I briefly wondered if it is cleaner to read if we hardcode 0 as the
last argument.  But if the helper function ever learns how to filter
by tree with non-zero depth, we can only tweak the if() condition
without changing the call, so the way you wrote it is the right way.

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/4] pack-bitmap: pass object filter to fill-in traversal
  2020-05-04 23:12 ` [PATCH 4/4] pack-bitmap: pass object filter to fill-in traversal Taylor Blau
@ 2020-05-05  5:40   ` Junio C Hamano
  2020-05-05 16:00     ` Taylor Blau
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2020-05-05  5:40 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, chriscool

Taylor Blau <me@ttaylorr.com> writes:

> But now that we support tree filters, there's opportunity for savings. A
> tree:depth=0 filter means we can avoid accessing trees entirely, since
> we know we won't them (or any of the subtrees or blobs they point to).

"we know we won't _have_ them"?

> @@ -506,7 +506,8 @@ static int should_include(struct commit *commit, void *_data)
>  static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
>  				   struct rev_info *revs,
>  				   struct object_list *roots,
> -				   struct bitmap *seen)
> +				   struct bitmap *seen,
> +				   struct list_objects_filter_options *filter)
>  {
>  	struct bitmap *base = NULL;
>  	int needs_walk = 0;
> @@ -599,8 +600,9 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
>  		show_data.bitmap_git = bitmap_git;
>  		show_data.base = base;
>  
> -		traverse_commit_list(revs, show_commit, show_object,
> -				     &show_data);
> +		traverse_commit_list_filtered(filter, revs,
> +					      show_commit, show_object,
> +					      &show_data, NULL);

And then finally the change in step 1/4 pays off.

>  	}
>  
>  	return base;
> @@ -999,7 +1001,8 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs,
>  
>  	if (haves) {
>  		revs->ignore_missing_links = 1;
> -		haves_bitmap = find_objects(bitmap_git, revs, haves, NULL);
> +		haves_bitmap = find_objects(bitmap_git, revs, haves, NULL,
> +					    filter);
>  		reset_revision_walk();
>  		revs->ignore_missing_links = 0;
>  
> @@ -1007,7 +1010,8 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs,
>  			BUG("failed to perform bitmap walk");
>  	}
>  
> -	wants_bitmap = find_objects(bitmap_git, revs, wants, haves_bitmap);
> +	wants_bitmap = find_objects(bitmap_git, revs, wants, haves_bitmap,
> +				    filter);
>  
>  	if (!wants_bitmap)
>  		BUG("failed to perform bitmap walk");
> diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh
> index 75ccf9f4e3..b3e725f031 100755
> --- a/t/perf/p5310-pack-bitmaps.sh
> +++ b/t/perf/p5310-pack-bitmaps.sh
> @@ -91,4 +91,9 @@ test_perf 'pack to file (partial bitmap)' '
>  	git pack-objects --use-bitmap-index --all pack2b </dev/null >/dev/null
>  '
>  
> +test_perf 'rev-list with tree filter (partial bitmap)' '
> +	git rev-list --use-bitmap-index --count --objects --all \
> +		--filter=tree:0 >/dev/null
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] pack-bitmap.c: support 'tree:0' filtering
  2020-05-05  5:25   ` Junio C Hamano
@ 2020-05-05 15:59     ` Taylor Blau
  2020-05-05 18:20       ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Taylor Blau @ 2020-05-05 15:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, peff, chriscool

On Mon, May 04, 2020 at 10:25:46PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > diff --git a/pack-bitmap.c b/pack-bitmap.c
> > index 3693c9e62f..195ee8cad0 100644
> > --- a/pack-bitmap.c
> > +++ b/pack-bitmap.c
> > @@ -749,7 +749,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
> >  	eword_t mask;
> >  	uint32_t i;
> >
> > -	if (type != OBJ_BLOB)
> > +	if (type != OBJ_BLOB && type != OBJ_TREE)
> >  		BUG("filter_bitmap_exclude_type: unsupported type '%d'", type);
>
> OK.  This is the same as the previous step, but why would we even
> need this guard?  find_tip_objects() is equipped to find tips of any
> object type, iterating on the bitmap for "type", or flipping the
> bits in the to_filter bitmap, does not have any limitation to the
> blob type in the previous step, and there is no limitation to the
> blob or tree types after this step, either, no?

I think we need some sort of guard here, since we could receive any
value of object_type, but you're right that this isn't the right one. It
should probably be something like:

  if (type < OBJ_COMMIT || type > OBJ_TAG)

to pick out the sentinel values like OBJ_BAD and OBJ_NONE, as well as
the pack-specific types, like OBJ_OFS_DELTA and so on.

I fixed this locally, and will resend it along with the rest of v2 in a
day or so. Thanks for a review :).

> > @@ -867,6 +867,20 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git,
> >  	bitmap_free(tips);
> >  }
> >
> > +static void filter_bitmap_tree_depth(struct bitmap_index *bitmap_git,
> > +				     struct object_list *tip_objects,
> > +				     struct bitmap *to_filter,
> > +				     unsigned long limit)
> > +{
> > +	if (limit)
> > +		BUG("filter_bitmap_tree_depth given non-zero limit");
>
> This one does make sense, because the code to exclude all trees and
> all blobs we have below won't be able to cull only trees at a given
> depth or deeper.
>
> > +	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
> > +				   OBJ_TREE);
> > +	filter_bitmap_exclude_type(bitmap_git, tip_objects, to_filter,
> > +				   OBJ_BLOB);
>
> And these two are quite straight-forward.
>
> > +}
> > +
> >  static int filter_bitmap(struct bitmap_index *bitmap_git,
> >  			 struct object_list *tip_objects,
> >  			 struct bitmap *to_filter,
> > @@ -890,6 +904,15 @@ static int filter_bitmap(struct bitmap_index *bitmap_git,
> >  		return 0;
> >  	}
> >
> > +	if (filter->choice == LOFC_TREE_DEPTH &&
> > +	    filter->tree_exclude_depth == 0) {
> > +		if (bitmap_git)
> > +			filter_bitmap_tree_depth(bitmap_git, tip_objects,
> > +						 to_filter,
> > +						 filter->tree_exclude_depth);
>
> I briefly wondered if it is cleaner to read if we hardcode 0 as the
> last argument.  But if the helper function ever learns how to filter
> by tree with non-zero depth, we can only tweak the if() condition
> without changing the call, so the way you wrote it is the right way.
>
> Thanks.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/4] pack-bitmap: pass object filter to fill-in traversal
  2020-05-05  5:40   ` Junio C Hamano
@ 2020-05-05 16:00     ` Taylor Blau
  0 siblings, 0 replies; 13+ messages in thread
From: Taylor Blau @ 2020-05-05 16:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git, peff, chriscool

On Mon, May 04, 2020 at 10:40:43PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > But now that we support tree filters, there's opportunity for savings. A
> > tree:depth=0 filter means we can avoid accessing trees entirely, since
> > we know we won't them (or any of the subtrees or blobs they point to).
>
> "we know we won't _have_ them"?

Yep, fixed. Thanks for pointing it out.

> > @@ -506,7 +506,8 @@ static int should_include(struct commit *commit, void *_data)
> >  static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
> >  				   struct rev_info *revs,
> >  				   struct object_list *roots,
> > -				   struct bitmap *seen)
> > +				   struct bitmap *seen,
> > +				   struct list_objects_filter_options *filter)
> >  {
> >  	struct bitmap *base = NULL;
> >  	int needs_walk = 0;
> > @@ -599,8 +600,9 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
> >  		show_data.bitmap_git = bitmap_git;
> >  		show_data.base = base;
> >
> > -		traverse_commit_list(revs, show_commit, show_object,
> > -				     &show_data);
> > +		traverse_commit_list_filtered(filter, revs,
> > +					      show_commit, show_object,
> > +					      &show_data, NULL);
>
> And then finally the change in step 1/4 pays off.

Woohoo!

> >  	}
> >
> >  	return base;
> > @@ -999,7 +1001,8 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs,
> >
> >  	if (haves) {
> >  		revs->ignore_missing_links = 1;
> > -		haves_bitmap = find_objects(bitmap_git, revs, haves, NULL);
> > +		haves_bitmap = find_objects(bitmap_git, revs, haves, NULL,
> > +					    filter);
> >  		reset_revision_walk();
> >  		revs->ignore_missing_links = 0;
> >
> > @@ -1007,7 +1010,8 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs,
> >  			BUG("failed to perform bitmap walk");
> >  	}
> >
> > -	wants_bitmap = find_objects(bitmap_git, revs, wants, haves_bitmap);
> > +	wants_bitmap = find_objects(bitmap_git, revs, wants, haves_bitmap,
> > +				    filter);
> >
> >  	if (!wants_bitmap)
> >  		BUG("failed to perform bitmap walk");
> > diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh
> > index 75ccf9f4e3..b3e725f031 100755
> > --- a/t/perf/p5310-pack-bitmaps.sh
> > +++ b/t/perf/p5310-pack-bitmaps.sh
> > @@ -91,4 +91,9 @@ test_perf 'pack to file (partial bitmap)' '
> >  	git pack-objects --use-bitmap-index --all pack2b </dev/null >/dev/null
> >  '
> >
> > +test_perf 'rev-list with tree filter (partial bitmap)' '
> > +	git rev-list --use-bitmap-index --count --objects --all \
> > +		--filter=tree:0 >/dev/null
> > +'
> > +
> >  test_done

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] pack-bitmap.c: support 'tree:0' filtering
  2020-05-05 15:59     ` Taylor Blau
@ 2020-05-05 18:20       ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2020-05-05 18:20 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, peff, chriscool

Taylor Blau <me@ttaylorr.com> writes:

> I think we need some sort of guard here, since we could receive any
> value of object_type, but you're right that this isn't the right one. It
> should probably be something like:
>
>   if (type < OBJ_COMMIT || type > OBJ_TAG)
>
> to pick out the sentinel values like OBJ_BAD and OBJ_NONE, as well as
> the pack-specific types, like OBJ_OFS_DELTA and so on.

Yeah, it looked strange to start checking for OBJ_BLOB and OBJ_TREE
in commits that starts passing these types to the function, while
the code in the function was prepared to take any valid type, so
using the above condition from the get-go would probably be a lot
more sensible.


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-05-05 18:20 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-04 23:12 [PATCH 0/4] pack-bitmap: use bitmaps for traversals with '--filter=tree:0' Taylor Blau
2020-05-04 23:12 ` [PATCH 1/4] list-objects-filter: treat NULL filter_options as "disabled" Taylor Blau
2020-05-05  5:07   ` Junio C Hamano
2020-05-04 23:12 ` [PATCH 2/4] pack-bitmap.c: make object filtering functions generic Taylor Blau
2020-05-05  5:12   ` Junio C Hamano
2020-05-04 23:12 ` [PATCH 3/4] pack-bitmap.c: support 'tree:0' filtering Taylor Blau
2020-05-05  5:25   ` Junio C Hamano
2020-05-05 15:59     ` Taylor Blau
2020-05-05 18:20       ` Junio C Hamano
2020-05-04 23:12 ` [PATCH 4/4] pack-bitmap: pass object filter to fill-in traversal Taylor Blau
2020-05-05  5:40   ` Junio C Hamano
2020-05-05 16:00     ` Taylor Blau
  -- strict thread matches above, loose matches on Subject: below --
2020-04-22 23:13 [PATCH 0/4] pack-bitmap: use bitmaps for traversals with '--filter=tree:0' Taylor Blau
2020-04-22 23:13 ` [PATCH 2/4] pack-bitmap.c: make object filtering functions generic Taylor Blau

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).