From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Christian Couder <christian.couder@gmail.com>
Subject: Re: [PATCH 0/7] rev-parse: implement object type filter
Date: Mon, 15 Mar 2021 12:25:40 +0100
Message-ID: <YE9ENGKkKRbzUL7I@ncase>
In-Reply-To: <YEk8iiDf/FMxzhIF@coredump.intra.peff.net>
On Wed, Mar 10, 2021 at 04:39:22PM -0500, Jeff King wrote:
> On Mon, Mar 01, 2021 at 01:20:26PM +0100, Patrick Steinhardt wrote:
>
> > Altogether, this ends up with the following queries, both of which have
> > been executed in a well-packed linux.git repository:
> >
> > # Previous query which uses object names as a heuristic to filter
> > # non-blob objects, which bars us from using bitmap indices because
> > # they cannot print paths.
> > $ time git rev-list --objects --filter=blob:limit=200 \
> > --object-names --all | sed -r '/^.{,41}$/d' | wc -l
> > 4502300
> >
> > real 1m23.872s
> > user 1m30.076s
> > sys 0m6.002s
> >
> > # New query.
> > $ time git rev-list --objects --filter-provided \
> > --filter=object:type=blob --filter=blob:limit=200 \
> > --use-bitmap-index --all | wc -l
> > 22585
> >
> > real 0m19.216s
> > user 0m16.768s
> > sys 0m2.450s
>
> Those produce very different answers. I guess because in the first one,
> you still have a bunch of tree objects, too. You'd do much better to get
> the actual types from cat-file, and filter on that. That also lets you
> use bitmaps for the traversal portion. E.g.:
>
> $ time git rev-list --use-bitmap-index --objects --filter=blob:limit=200 --all |
> git cat-file --buffer --batch-check='%(objecttype) %(objectname)' |
> perl -lne 'print $1 if /^blob (.*)/' | wc -l
> 14966
>
> real 0m6.248s
> user 0m7.810s
> sys 0m0.440s
>
> which is faster than what you showed above (this is on linux.git, but my
> result is different; maybe you have more refs than me?). But we should
> be able to do better purely internally, so I suspect my computer is just
> faster (or maybe your extra refs just aren't well-covered by bitmaps).
> Running with your patches I get:
>
> $ time git rev-list --objects --use-bitmap-index --all \
> --filter-provided --filter=object:type=blob \
> --filter=blob:limit=200 | wc -l
> 16339
>
> real 0m1.309s
> user 0m1.234s
> sys 0m0.079s
>
> which is indeed faster. It's quite curious that the answer is not the
> same, though! I think yours has some bugs. If I sort and diff the
> results, I see some commits mentioned in the output. Perhaps this is
> --filter-provided not working, as they all seem to be ref tips.
[snip]
I've found the issue: when converting filters to a combined filter via
`transform_to_combine_type()`, we reset the top-level filter via a call
to `memset()`. So for combined filters, the `--filter-provided` option
had no effect if it came before the second filter, because the
conversion cleared it.
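
To illustrate the failure mode, here is a minimal sketch in C. It is not
git's actual code: `struct filter_opts`, its `filter_provided` field, and
the `to_combine_*()` and `parse_*()` helpers are hypothetical stand-ins,
assuming only that the flag lives in the same struct that gets wiped by
`memset()` during the conversion.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a filter options struct; not git's real layout. */
enum filter_choice {
	FILTER_NONE = 0,
	FILTER_OBJECT_TYPE,
	FILTER_COMBINE
};

struct filter_opts {
	enum filter_choice choice;
	int filter_provided; /* assumed to be set by --filter-provided */
};

/* Buggy conversion: memset() wipes the whole struct, flag included. */
static void to_combine_buggy(struct filter_opts *opts)
{
	assert(opts);
	memset(opts, 0, sizeof(*opts));
	opts->choice = FILTER_COMBINE;
}

/* One possible fix: carry the flag across the reset. */
static void to_combine_fixed(struct filter_opts *opts)
{
	int keep;

	assert(opts);
	keep = opts->filter_provided;
	memset(opts, 0, sizeof(*opts));
	opts->choice = FILTER_COMBINE;
	opts->filter_provided = keep;
}

/* Simulates "--filter-provided --filter=A --filter=B". */
static int parse_flag_first(void (*to_combine)(struct filter_opts *))
{
	struct filter_opts opts = { FILTER_NONE, 0 };

	opts.filter_provided = 1;         /* --filter-provided */
	opts.choice = FILTER_OBJECT_TYPE; /* first --filter */
	to_combine(&opts);                /* second --filter converts */
	return opts.filter_provided;
}

/* Simulates "--filter=A --filter=B --filter-provided". */
static int parse_flag_last(void (*to_combine)(struct filter_opts *))
{
	struct filter_opts opts = { FILTER_NONE, 0 };

	opts.choice = FILTER_OBJECT_TYPE; /* first --filter */
	to_combine(&opts);                /* second --filter converts */
	opts.filter_provided = 1;         /* --filter-provided */
	return opts.filter_provided;
}
```

With the buggy conversion, the flag survives only when it is parsed after
the second filter; carrying it across the reset makes the result
independent of option order.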
Patrick