git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Subject: [PATCH 0/13] combining object filters and bitmaps
Date: Wed, 12 Feb 2020 21:15:06 -0500
Message-ID: <20200213021506.GA1124607@coredump.intra.peff.net>

Partial clones are nice for saving bandwidth, but they can actually be
more expensive in CPU because many of the filters require the server to
do extra work to figure out which objects to send.

In particular, it's impossible for many filters to work with
reachability bitmaps, because the bitmap result doesn't provide enough
information. E.g., we can't evaluate a path-based sparse filter without
actually walking the trees to find out which objects are referred to at
which paths.

But in some cases the bitmaps _do_ give enough information, and it's
just a weakness of our implementation. E.g., it's easy to remove all of
the blobs from a bitmap result. So this patch series implements some of
the easier filters, while providing a framework for the bitmap code to
say "nope, that filter is too complex for me" so we can fall back to a
non-bitmap path.
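
As a rough illustration of why removing blobs from a bitmap result is
cheap (this is a toy sketch, not code from the series): assume one bit
per object and a separate per-type bitmap marking which positions are
blobs. The names toy_bitmap, toy_set, toy_get and toy_filter_blob_none
are hypothetical, and the fixed-size uncompressed array stands in for
whatever bitmap representation the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size bitmap: one bit per object position. */
#define TOY_NWORDS 4

struct toy_bitmap {
	uint64_t words[TOY_NWORDS];
};

/* Mark the object at position "pos" as present. */
static void toy_set(struct toy_bitmap *b, size_t pos)
{
	assert(pos < TOY_NWORDS * 64);
	b->words[pos / 64] |= (uint64_t)1 << (pos % 64);
}

/* Return 1 if the object at position "pos" is present, 0 otherwise. */
static int toy_get(const struct toy_bitmap *b, size_t pos)
{
	assert(pos < TOY_NWORDS * 64);
	return (int)((b->words[pos / 64] >> (pos % 64)) & 1);
}

/*
 * A blob:none-style filter on a bitmap result: clear every bit that is
 * also set in the "blobs" type bitmap. No tree walking is needed; the
 * cost is one AND-NOT per word.
 */
static void toy_filter_blob_none(struct toy_bitmap *result,
				 const struct toy_bitmap *blobs)
{
	size_t i;

	for (i = 0; i < TOY_NWORDS; i++)
		result->words[i] &= ~blobs->words[i];
}
```

A path-based filter has no equivalent of the "blobs" mask here, since a
bit position says nothing about the path at which an object was found.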

One thing I'm not wild about is that this basically creates a parallel
universe of filter implementations, as nothing is shared with the
code in list-objects-filter.c (beyond the option parsing stage).
But if you look at the implementations in patches 11 and 12, I think
that's how it has to be in order to be efficient. The existing filters
are inherently about hooking into the traversal, and the whole point of
reachability bitmaps is to avoid that traversal.

So I think the best we can do is carefully write bitmap-aware versions
of specific filters, and then fall back to a regular traversal as
necessary. I'd like to also give server providers some tools for
limiting which filters they're willing to support (so they can protect
themselves from people running expensive filters), but I've left that as
future work for now.
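
The "bitmap-aware where possible, regular traversal otherwise" shape
might be sketched as below. This is only a sketch under assumptions: the
enum values and the functions toy_bitmap_can_filter and toy_choose_path
are hypothetical names for illustration and do not correspond to
identifiers in the series. The set of supported filters mirrors the two
implemented here (BLOB_NONE and BLOB_LIMIT); a path-based sparse filter
stands in for anything the bitmap code cannot evaluate.

```c
/* Hypothetical filter kinds, for illustration only. */
enum toy_filter_choice {
	TOY_FILTER_DISABLED,
	TOY_FILTER_BLOB_NONE,
	TOY_FILTER_BLOB_LIMIT,
	TOY_FILTER_SPARSE_PATH
};

/* Which code path ends up producing the object list. */
enum toy_path {
	TOY_PATH_BITMAP,
	TOY_PATH_TRAVERSAL
};

/*
 * Return 1 if the bitmap side knows how to apply this filter, 0 if it
 * is "too complex" and the caller has to walk the object graph instead.
 */
static int toy_bitmap_can_filter(enum toy_filter_choice choice)
{
	switch (choice) {
	case TOY_FILTER_DISABLED:
	case TOY_FILTER_BLOB_NONE:
	case TOY_FILTER_BLOB_LIMIT:
		return 1;
	default:
		return 0;
	}
}

/* Prefer bitmaps when they exist and can honor the filter. */
static enum toy_path toy_choose_path(int have_bitmaps,
				     enum toy_filter_choice choice)
{
	if (have_bitmaps && toy_bitmap_can_filter(choice))
		return TOY_PATH_BITMAP;
	return TOY_PATH_TRAVERSAL;
}
```

Keeping the "can I handle this?" question separate from the filtering
itself means an unsupported filter costs nothing extra: the caller just
takes the traversal path it would have taken without bitmaps.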

Here's an outline of the series:

  [01/13]: pack-bitmap: factor out type iterator initialization
  [02/13]: pack-bitmap: fix leak of haves/wants object lists

    Some related fixes and cleanups in the bitmap area.

  [03/13]: rev-list: fallback to non-bitmap traversal when filtering
  [04/13]: rev-list: consolidate bitmap-disabling options
  [05/13]: rev-list: factor out bitmap-optimized routines

    It's much easier to exercise this code via rev-list, so this is some
    refactoring around the bitmap support there.

  [06/13]: rev-list: make --count work with --objects
  [07/13]: rev-list: allow bitmaps when counting objects

    These two aren't strictly necessary for the topic at hand, but have
    been bugging me for years. And they were made simple by the earlier
    refactoring, and provide a handy test mechanism.

  [08/13]: pack-bitmap: basic noop bitmap filter infrastructure
  [09/13]: rev-list: use bitmap filters for traversal
  [10/13]: bitmap: add bitmap_unset() function

    More refactoring in preparation for supporting filters.

  [11/13]: pack-bitmap: implement BLOB_NONE filtering
  [12/13]: pack-bitmap: implement BLOB_LIMIT filtering

    Some actual filters (tested with rev-list support added earlier).

  [13/13]: pack-objects: support filters with bitmaps

    And finally turning on support for pack generation. The final
    numeric result is (using linux.git with blob:none):

      Test                              HEAD^               HEAD
      ------------------------------------------------------------------------------
      5310.9: simulated partial clone   38.94(37.28+5.87)   11.06(11.27+4.07) -71.6%

    which isn't too shabby.

 builtin/pack-objects.c             |   3 +-
 builtin/rev-list.c                 | 124 +++++++++++---
 ewah/bitmap.c                      |   8 +
 ewah/ewok.h                        |   1 +
 object.c                           |   9 ++
 object.h                           |   2 +
 pack-bitmap.c                      | 252 +++++++++++++++++++++++++----
 pack-bitmap.h                      |   4 +-
 reachable.c                        |   2 +-
 t/perf/p5310-pack-bitmaps.sh       |  14 ++
 t/t5310-pack-bitmaps.sh            |  20 +++
 t/t6000-rev-list-misc.sh           |  12 ++
 t/t6113-rev-list-bitmap-filters.sh |  77 +++++++++
 13 files changed, 468 insertions(+), 60 deletions(-)
 create mode 100755 t/t6113-rev-list-bitmap-filters.sh

-Peff


Thread overview: 73+ messages
2020-02-13  2:15 Jeff King [this message]
2020-02-13  2:16 ` [PATCH 01/13] pack-bitmap: factor out type iterator initialization Jeff King
2020-02-13 17:45   ` Junio C Hamano
2020-02-13  2:16 ` [PATCH 02/13] pack-bitmap: fix leak of haves/wants object lists Jeff King
2020-02-13 18:12   ` Junio C Hamano
2020-02-13  2:17 ` [PATCH 03/13] rev-list: fallback to non-bitmap traversal when filtering Jeff King
2020-02-13 18:19   ` Junio C Hamano
2020-02-13 18:40     ` Jeff King
2020-02-13  2:17 ` [PATCH 04/13] rev-list: consolidate bitmap-disabling options Jeff King
2020-02-13  2:18 ` [PATCH 05/13] rev-list: factor out bitmap-optimized routines Jeff King
2020-02-13 18:34   ` Junio C Hamano
2020-02-13  2:19 ` [PATCH 06/13] rev-list: make --count work with --objects Jeff King
2020-02-13 19:14   ` Junio C Hamano
2020-02-13 20:27     ` Jeff King
2020-02-13  2:20 ` [PATCH 07/13] rev-list: allow bitmaps when counting objects Jeff King
2020-02-13 21:47   ` Junio C Hamano
2020-02-13 22:27     ` Jeff King
2020-02-13  2:20 ` [PATCH 08/13] pack-bitmap: basic noop bitmap filter infrastructure Jeff King
2020-02-13  2:21 ` [PATCH 09/13] rev-list: use bitmap filters for traversal Jeff King
2020-02-13 22:22   ` Junio C Hamano
2020-02-13 22:34     ` Jeff King
2020-02-13  2:21 ` [PATCH 10/13] bitmap: add bitmap_unset() function Jeff King
2020-02-13  2:23 ` [PATCH 11/13] pack-bitmap: implement BLOB_NONE filtering Jeff King
2020-02-13  2:25 ` [PATCH 12/13] pack-bitmap: implement BLOB_LIMIT filtering Jeff King
2020-02-13 23:17   ` Junio C Hamano
2020-02-13  2:25 ` [PATCH 13/13] pack-objects: support filters with bitmaps Jeff King
2020-02-14 18:21 ` [PATCH v2 0/15] combining object filters and bitmaps Jeff King
2020-02-14 18:22   ` [PATCH v2 01/15] pack-bitmap: factor out type iterator initialization Jeff King
2020-02-15  0:10     ` Taylor Blau
2020-02-14 18:22   ` [PATCH v2 02/15] pack-bitmap: fix leak of haves/wants object lists Jeff King
2020-02-15  0:15     ` Taylor Blau
2020-02-15  6:46       ` Jeff King
2020-02-18 17:58     ` Derrick Stolee
2020-02-18 20:02       ` Jeff King
2020-02-14 18:22   ` [PATCH v2 03/15] rev-list: fallback to non-bitmap traversal when filtering Jeff King
2020-02-15  0:22     ` Taylor Blau
2020-02-14 18:22   ` [PATCH v2 04/15] pack-bitmap: refuse to do a bitmap traversal with pathspecs Jeff King
2020-02-14 19:03     ` Junio C Hamano
2020-02-14 20:51       ` Jeff King
2020-02-14 18:22   ` [PATCH v2 05/15] rev-list: factor out bitmap-optimized routines Jeff King
2020-02-15  0:35     ` Taylor Blau
2020-02-14 18:22   ` [PATCH v2 06/15] rev-list: make --count work with --objects Jeff King
2020-02-15  0:42     ` Taylor Blau
2020-02-15  6:48       ` Jeff King
2020-02-16 23:34         ` Junio C Hamano
2020-02-18  5:24           ` Jeff King
2020-02-18 17:28             ` Junio C Hamano
2020-02-18 19:55               ` Jeff King
2020-02-18 21:19                 ` Junio C Hamano
2020-02-18 21:23                   ` Jeff King
2020-02-18 18:05     ` Derrick Stolee
2020-02-18 19:59       ` Jeff King
2020-02-14 18:22   ` [PATCH v2 07/15] rev-list: allow bitmaps when counting objects Jeff King
2020-02-15  0:45     ` Taylor Blau
2020-02-15  6:55       ` Jeff King
2020-02-16 23:36         ` Junio C Hamano
2020-02-14 18:22   ` [PATCH v2 08/15] t5310: factor out bitmap traversal comparison Jeff King
2020-02-15  2:14     ` Taylor Blau
2020-02-15  7:00       ` Jeff King
2020-02-14 18:22   ` [PATCH v2 09/15] rev-list: allow commit-only bitmap traversals Jeff King
2020-02-18 18:18     ` Derrick Stolee
2020-02-18 20:05       ` Jeff King
2020-02-18 20:11         ` Derrick Stolee
2020-02-14 18:22   ` [PATCH v2 10/15] pack-bitmap: basic noop bitmap filter infrastructure Jeff King
2020-02-14 18:22   ` [PATCH v2 11/15] rev-list: use bitmap filters for traversal Jeff King
2020-02-14 18:22   ` [PATCH v2 12/15] bitmap: add bitmap_unset() function Jeff King
2020-02-14 18:22   ` [PATCH v2 13/15] pack-bitmap: implement BLOB_NONE filtering Jeff King
2020-02-18 19:26     ` Derrick Stolee
2020-02-18 19:36       ` Derrick Stolee
2020-02-18 20:30         ` Jeff King
2020-02-18 20:24       ` Jeff King
2020-02-14 18:22   ` [PATCH v2 14/15] pack-bitmap: implement BLOB_LIMIT filtering Jeff King
2020-02-14 18:22   ` [PATCH v2 15/15] pack-objects: support filters with bitmaps Jeff King
