git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Matthew DeVore <matvore@google.com>
Cc: jonathantanmy@google.com, jrn@google.com, git@vger.kernel.org,
	dstolee@microsoft.com, jeffhost@microsoft.com,
	jrnieder@gmail.com, matvore@comcast.net
Subject: Re: [RFC PATCH 3/3] list-objects-filter: implement composite filters
Date: Fri, 17 May 2019 12:25:47 +0900	[thread overview]
Message-ID: <xmqqwoip3gp0.fsf@gitster-ct.c.googlers.com> (raw)
In-Reply-To: <02a8c9b017d8df056d7e90aff907d6e0b5506467.1558030802.git.matvore@google.com> (Matthew DeVore's message of "Thu, 16 May 2019 11:56:51 -0700")

Matthew DeVore <matvore@google.com> writes:

> Allow combining filters such that only objects accepted by all filters
> are shown. The motivation for this is to allow getting directory
> listings without also fetching blobs. This can be done by combining
> blob:none with tree:<depth>. There are massive repositories that have
> larger-than-expected trees - even if you include only a single commit.
>
> The current usage requires passing the filter to rev-list, or sending
> it over the wire, as:
>
> 	combine:<FILTER1>+<FILTER2>
>
> (i.e.: git rev-list --filter=combine:tree:2+blob:limit=32k). This is
> potentially awkward because individual filters must be URL-encoded if
> they contain + or %. This can potentially be improved by supporting a
> repeated flag syntax, e.g.:
>
> 	$ git rev-list --filter=tree:2 --filter:blob:limit=32k

Shouldn't the second one say "--filter=blob:limit=32k" (i.e. the
first colon should be an equal sign)?

> Such usage is currently an error, so giving it a meaning is backwards-
> compatible.

Two minor comments.  

If combine means "must satisfy all of these", '+' is probably a poor
choice (perhaps we want '&' instead).  Also, it seems to me that
having to worry about url encoding and parsing encoded data
correctly and securely would be far more work than simply taking
multiple command line parameters, accumulating them in a string
list, and then at the end of command line parsing, building a
combined filter out of all of them at once (a degenerate case may
end up attempting to build a combined filter that combines a single
filter), iow just biting the bullet and do the "potentially be
improved" step from the beginning.

> +The form '--filter=combine:<filter1>+<filter2>+...<filterN>' combines
> +several filters. Only objects which are accepted by every filter are
> +included. Filters are joined by "+" and individual filters are %-encoded
> +(i.e. URL-encoded). Only the "%" and "+" characters must be encoded. For
> +instance, 'combine:tree:3+blob:none' and
> +'combine:tree%3A2+blob%3Anone' are equivalent, and result in only tree

I do not quite get this part.  The way I read the justification for
encoding given in the proposed commit log message was because a
filter name or filter parameter might want to include the '+' sign,
so a filter whose name is frotz+nitfol that takes a param may be
invoked standalone as "--filter=frotz+nitfol:param=1" needs to
protect its '+' from getting mistaken when used in the combine
filter, it should not be mistaken as a combination of 'frotz' filter
(with no parameter) and 'nitfol' filter (with param set to one), and
the way to do so is to say

	--filter="combine:frotz%2Bnitfol:param=1"

(this is a degenerate 'combination of a single filter', but you can
add another one by following it with '+' and another URLencoded
string).

So why are we allowing %3A there that does not even have to be
encoded?  Shouldn't it be an error?

In any case, I am not quite convinced that we need to complicate the
parameters with URLencoding, so I'd skip reviewing large part this
patch that is about "decoding".

Once the combined filter definition is built in-core, the code that
evaluates the intersection of all conditions seems to be written
sanely to me.

Thanks.

  reply	other threads:[~2019-05-17  3:25 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-28 15:55 Proposal: object negotiation for partial clones Matthew DeVore
2019-05-06 18:25 ` Jonathan Nieder
2019-05-06 19:28 ` Jonathan Tan
2019-05-06 19:46   ` Jonathan Nieder
2019-05-06 23:20     ` Matthew DeVore
2019-05-07  0:02       ` Jonathan Nieder
2019-05-06 22:47   ` Matthew DeVore
2019-05-07 18:34     ` Jonathan Tan
2019-05-07 21:57       ` Matthew DeVore
2019-05-09 18:00         ` Jonathan Tan
2019-05-14  0:09           ` Matthew DeVore
2019-05-14  0:16             ` Jonathan Nieder
2019-05-16 18:56               ` [RFC PATCH 0/3] implement composite filters Matthew DeVore
2019-05-16 18:56                 ` [RFC PATCH 1/3] list-objects-filter: refactor into a context struct Matthew DeVore
2019-05-16 18:56                 ` [RFC PATCH 2/3] list-objects-filter-options: error is localizeable Matthew DeVore
2019-05-16 18:56                 ` [RFC PATCH 3/3] list-objects-filter: implement composite filters Matthew DeVore
2019-05-17  3:25                   ` Junio C Hamano [this message]
2019-05-17 13:17                     ` Matthew DeVore
2019-05-19  1:12                       ` Junio C Hamano
2019-05-20 18:24                       ` Matthew DeVore
2019-05-20 18:28                       ` Matthew DeVore
2019-05-16 22:41                 ` [RFC PATCH 0/3] " Jonathan Tan
2019-05-17  0:01                   ` Matthew DeVore

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqwoip3gp0.fsf@gitster-ct.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=jeffhost@microsoft.com \
    --cc=jonathantanmy@google.com \
    --cc=jrn@google.com \
    --cc=jrnieder@gmail.com \
    --cc=matvore@comcast.net \
    --cc=matvore@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).