git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff Hostetler <git@jeffhostetler.com>
To: Matthew DeVore <matvore@comcast.net>
Cc: Matthew DeVore <matvore@google.com>,
	git@vger.kernel.org, jonathantanmy@google.com, jrn@google.com,
	dstolee@microsoft.com, jeffhost@microsoft.com,
	jrnieder@gmail.com, pclouds@gmail.com, emilyshaffer@google.com,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v2 4/9] list-objects-filter: implement composite filters
Date: Fri, 7 Jun 2019 13:58:01 -0400	[thread overview]
Message-ID: <5dc58484-b206-dfe0-981a-437d0a162444@jeffhostetler.com> (raw)
In-Reply-To: <20190606223251.GA7246@comcast.net>



On 6/6/2019 6:32 PM, Matthew DeVore wrote:
> On Mon, Jun 03, 2019 at 05:51:28PM -0400, Jeff Hostetler wrote:
>> Since we are assuming 'compose' is an AND operation, there may be an
>> opportunity to short-cut some of this loop for blobs.  That is, if the
>> object is a blob and any filter rejects it, it is omitted, so we don't
>> need to keep looping for that object.  (Tree objects cannot be short-cut
>> this way because a tree may appear at different depths or in different
>> sparse "cones" and may have to be reconsidered.)
> 
> Blobs are also treated almost the same way as tree objects in tree:<depth>
> filters - they can be included by tree:<depth> - so they also need to be
> reconsidered when found at different depths.
> 
> But I agree it's always true that if some prior filter has excluded a blob, the
> later filters don't even need to be *called at all* for that blob, unless
> perhaps it's found under a different tree later. I also think it may be too
> early to implement this optimization, since filter in a later release may just
> want to "know" about a blob even if it must be excluded in the final result.
> 
> Does the optimization apply to trees as well? Does a tree:<depth> filter still
> want to consider children of tree X if tree X has already been excluded by
> another filter? If it doesn't want to consider, we can short-circuit the checks
> very aggressively. If it does want to consider, we want the short-circuiting to
> be customizable at least for trees.
> 
> A minor point - I don't think that short-circuiting the for loop (breaking out
> early) is important, since it will be very rare that a combine: filter has more
> than 4 or so sub-filters anyway. Calling the filter_fn implementation and
> letting that do internal short-circuiting (informed by the previous filters'
> results) can, however, skip a lot of computation.
> 
>> So you could add an "affects blobs only" bit to the per-filter data
>> and try this out.  For example a "compose:blob:none+sparse:foo" should
>> perform better than "compose:sparse:foo+blob:none" but give the same
>> results.
> 
> Does "affects blobs only" mean the filter includes all non-blob objects?
> 

I just meant that the blobs:none and blobs:limit filters give you a hard
omit.  Other filters later in the chain cannot change or override that
answer (because of the AND assumption); it doesn't matter how deep or
shallow the blob is the tree.

In the case of the tree:depth filter, a blob deep in the tree should
be provisionally omitted in case it appears later in a shallow tree
and should be included.  The tree filter can't do a hard omit on a blob
(just like it can't do a hard omit on a tree node).

WRT your question about a later filter "just wanting to know" about
a blob, I'm not sure.

So yeah, let's wait on this.  We can always add it later as an
optimization if/when it becomes a perf problem (and we have more
experience using them in practice).

Jeff



  reply	other threads:[~2019-06-07 17:58 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-01  0:35 [PATCH v2 0/9] Filter combination Matthew DeVore
2019-06-01  0:35 ` [PATCH v2 1/9] list-objects-filter: make API easier to use Matthew DeVore
2019-06-01  0:35 ` [PATCH v2 2/9] list-objects-filter: put omits set in filter struct Matthew DeVore
2019-06-01  0:35 ` [PATCH v2 3/9] list-objects-filter-options: always supply *errbuf Matthew DeVore
2019-06-01  0:35 ` [PATCH v2 4/9] list-objects-filter: implement composite filters Matthew DeVore
2019-06-03 21:51   ` Jeff Hostetler
2019-06-06 22:32     ` Matthew DeVore
2019-06-07 17:58       ` Jeff Hostetler [this message]
2019-06-01  0:35 ` [PATCH v2 5/9] list-objects-filter-options: move error check up Matthew DeVore
2019-06-01  0:36 ` [PATCH v2 6/9] list-objects-filter-options: make filter_spec a strbuf Matthew DeVore
2019-06-10 20:13   ` Junio C Hamano
2019-06-11  0:34     ` Matthew DeVore
2019-06-11 17:33       ` Junio C Hamano
2019-06-11 18:44         ` Matthew DeVore
2019-06-11 21:34           ` Matthew DeVore
2019-06-11 21:48           ` Junio C Hamano
2019-06-12  0:37             ` Matthew DeVore
2019-06-12 14:55               ` Matthew DeVore
2019-06-01  0:36 ` [PATCH v2 7/9] list-objects-filter-options: allow mult. --filter Matthew DeVore
2019-06-01  0:36 ` [PATCH v2 8/9] list-objects-filter-options: clean up use of ALLOC_GROW Matthew DeVore
2019-06-03 22:07   ` Jacob Keller
2019-06-03 22:39     ` Matthew DeVore
2019-06-04  3:16       ` Jacob Keller
2019-06-01  0:36 ` [PATCH v2 9/9] list-objects-filter-options: make parser void Matthew DeVore
2019-06-03 21:35 ` [PATCH v2 0/9] Filter combination Jeff Hostetler
2019-06-13 21:51 ` [PATCH v3 00/10] " Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 01/10] list-objects-filter: make API easier to use Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 02/10] list-objects-filter: put omits set in filter struct Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 03/10] list-objects-filter-options: always supply *errbuf Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 04/10] list-objects-filter: implement composite filters Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 05/10] list-objects-filter-options: move error check up Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 06/10] list-objects-filter-options: make filter_spec a string_list Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 07/10] strbuf: give URL-encoding API a char predicate fn Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 08/10] list-objects-filter-options: allow mult. --filter Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 09/10] list-objects-filter-options: clean up use of ALLOC_GROW Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 10/10] list-objects-filter-options: make parser void Matthew DeVore
2019-06-14 19:50   ` [PATCH v3 00/10] Filter combination Junio C Hamano
2019-06-15  0:40 ` [PATCH v4 " Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 01/10] list-objects-filter: make API easier to use Matthew DeVore
2019-06-21 22:58     ` Jonathan Tan
2019-06-27  0:46       ` Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 02/10] list-objects-filter: put omits set in filter struct Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 03/10] list-objects-filter-options: always supply *errbuf Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 04/10] list-objects-filter: implement composite filters Matthew DeVore
2019-06-18  8:42     ` Johannes Schindelin
2019-06-18 20:22       ` Matthew DeVore
2019-06-21 18:17         ` Johannes Schindelin
2019-06-22  0:26     ` Jonathan Tan
2019-06-27 21:12       ` Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 05/10] list-objects-filter-options: move error check up Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 06/10] list-objects-filter-options: make filter_spec a string_list Matthew DeVore
2019-06-22  0:37     ` Jonathan Tan
2019-06-27 21:17       ` Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 07/10] strbuf: give URL-encoding API a char predicate fn Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 08/10] list-objects-filter-options: allow mult. --filter Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 09/10] list-objects-filter-options: clean up use of ALLOC_GROW Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 10/10] list-objects-filter-options: make parser void Matthew DeVore
2019-06-22  0:46     ` Jonathan Tan
2019-06-27 21:24       ` Matthew DeVore
2019-06-27 22:27         ` Matthew DeVore
2019-06-18  1:25   ` [PATCH v4 00/10] Filter combination Junio C Hamano
2019-06-27 22:54 ` [PATCH v5 " Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 01/10] list-objects-filter: encapsulate filter components Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 02/10] list-objects-filter: put omits set in filter struct Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 03/10] list-objects-filter-options: always supply *errbuf Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 04/10] list-objects-filter: implement composite filters Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 05/10] list-objects-filter-options: move error check up Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 06/10] list-objects-filter-options: make filter_spec a string_list Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 07/10] strbuf: give URL-encoding API a char predicate fn Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 08/10] list-objects-filter-options: allow mult. --filter Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 09/10] list-objects-filter-options: clean up use of ALLOC_GROW Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 10/10] list-objects-filter-options: make parser void Matthew DeVore
2019-06-28 16:05   ` [PATCH v5 00/10] Filter combination Junio C Hamano
2019-06-28 17:16   ` Jonathan Tan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5dc58484-b206-dfe0-981a-437d0a162444@jeffhostetler.com \
    --to=git@jeffhostetler.com \
    --cc=dstolee@microsoft.com \
    --cc=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jeffhost@microsoft.com \
    --cc=jonathantanmy@google.com \
    --cc=jrn@google.com \
    --cc=jrnieder@gmail.com \
    --cc=matvore@comcast.net \
    --cc=matvore@google.com \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).