From: Jonathan Nieder <jrnieder@gmail.com>
To: Jeff Hostetler <git@jeffhostetler.com>
Cc: git@vger.kernel.org, gitster@pobox.com, peff@peff.net,
jonathantanmy@google.com, Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH v5 5/6] rev-list: add list-objects filtering support
Date: Wed, 22 Nov 2017 12:08:53 -0800
Message-ID: <20171122200853.GB11671@aiede.mtv.corp.google.com>
In-Reply-To: <20171121205852.15731-6-git@jeffhostetler.com>
Hi,
Jeff Hostetler wrote:
> Teach rev-list to use the filtering provided by the
> traverse_commit_list_filtered() interface to omit
> unwanted objects from the result.
>
> Object filtering is only allowed when one of the "--objects*"
> options are used.
micronit: the line widths seem to be uneven in these commit messages,
which is a bit distracting when reading.
> When the "--filter-print-omitted" option is used, the omitted
> objects are printed at the end. These are marked with a "~".
> This option can be combined with "--quiet" to get a list of
> just the omitted objects.
Neat. Can you give a quick example?
Using --quiet for this feels a bit odd, since it previously meant
to print nothing to stdout. I wonder if there's another way ---
e.g.
--print-omitted=(yes|no|only)
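To make the suggestion concrete, here is a minimal Python sketch of how the three proposed modes might select output. The function name `format_rev_list_output` and the mode handling are hypothetical and only illustrate the idea; the one detail taken from the patch is that omitted objects are marked with "~".

```python
def format_rev_list_output(shown, omitted, print_omitted="no"):
    """Return the lines a hypothetical --print-omitted=<mode> might emit.

    shown:   object names that passed the filter
    omitted: object names the filter dropped
    """
    if print_omitted not in ("yes", "no", "only"):
        raise ValueError(f"unknown mode: {print_omitted}")
    lines = []
    if print_omitted != "only":
        # "no" and "yes" both list the objects that passed the filter.
        lines.extend(shown)
    if print_omitted != "no":
        # "yes" and "only" list the omitted objects, marked with "~"
        # as described in the commit message.
        lines.extend("~" + name for name in omitted)
    return lines
```

With this shape, "only" would cover the case that currently needs --quiet, and --quiet could keep its old meaning.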
If I wanted to list all objects matching a filter, even objects
that are not reachable from any ref, is there a way to do that?
(Just curious, trying to think about this interface.)
> Add t6112 test.
This part doesn't need to be in the commit message. More generally,
anything I could more easily learn from the code or diffstat doesn't
need to be in the commit message: the commit message is about the
"why" more than the details of what changed in the code.
> In the future, we will introduce a "partial clone" mechanism
> wherein an object in a repo, obtained from a remote, may
> reference a missing object that can be dynamically fetched from
> that remote once needed. This "partial clone" mechanism will
> have a way, sometimes slow, of determining if a missing link
> is one of the links expected to be produced by this mechanism.
Does this mean the <filter-spec>s will be part of the wire protocol?
I'll look more carefully at them below with that in mind.
> This patch introduces handling of missing objects to help
> debugging and development of the "partial clone" mechanism,
> and once the mechanism is implemented, for a power user to
> perform operations that are missing-object aware without
> incurring the cost of checking if a missing link is expected.
I had trouble understanding what this paragraph is about. Can you
give an example?
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
> Documentation/git-rev-list.txt | 4 +-
> Documentation/rev-list-options.txt | 36 ++++++
> builtin/rev-list.c | 108 ++++++++++++++++-
> t/t6112-rev-list-filters-objects.sh | 225 ++++++++++++++++++++++++++++++++++++
> 4 files changed, 370 insertions(+), 3 deletions(-)
> create mode 100755 t/t6112-rev-list-filters-objects.sh
Looks reasonably concise, good.
[...]
> --- a/Documentation/git-rev-list.txt
> +++ b/Documentation/git-rev-list.txt
> @@ -47,7 +47,9 @@ SYNOPSIS
> [ --fixed-strings | -F ]
> [ --date=<format>]
> [ [ --objects | --objects-edge | --objects-edge-aggressive ]
> - [ --unpacked ] ]
> + [ --unpacked ]
> + [ --filter=<filter-spec> [ --filter-print-omitted ] ] ]
Does this mean --filter is only useful with --objects? E.g. I can't
use it to filter commits?
> + [ --missing=<missing-action> ]
--missing=(error|allow-any|print) would be more informative and about
equally concise.
Since this is mainly for debugging, does it have a different
compatibility guarantee from other options? Could it be named
accordingly to set expectations?
[...]
> +The form '--filter=blob:none' omits all blobs.
Sounds sensible.
> ++
> +The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n bytes
> +or units. The value may be zero.
On second thought, doesn't blob:limit=0 mean blob:none is not needed?
Is it for future consistency with tree:none?
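Reading the documented wording literally ("omits blobs larger than n bytes"), the two forms would differ only for empty blobs. A small Python sketch of that reading follows; `omits_blob` is a hypothetical helper, and the comparison the patch actually performs may differ from the documentation text.

```python
def omits_blob(size, filter_spec):
    """Decide whether a blob of `size` bytes is omitted.

    Assumes the literal documented semantics: blob:none omits every
    blob, blob:limit=<n> omits blobs strictly larger than n bytes.
    Unit suffixes are left out to keep the sketch short.
    """
    if filter_spec == "blob:none":
        return True
    prefix = "blob:limit="
    if filter_spec.startswith(prefix):
        limit = int(filter_spec[len(prefix):])
        return size > limit
    raise ValueError(f"unsupported filter-spec: {filter_spec}")
```

Under that assumption, `omits_blob(0, "blob:limit=0")` is false while `omits_blob(0, "blob:none")` is true, so an empty blob would still be sent with blob:limit=0.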
What units do [kmg] use? Are they GB, GiB, or one of the variants in
between?
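For reference, a sketch of one possible answer, assuming the suffixes follow the convention git uses for size values in its configuration (k, m and g as powers of 1024, case-insensitive). The helper name `parse_blob_limit` is hypothetical, and the documentation would need to state whichever convention the patch really uses.

```python
# Assumption: binary multiples, as in git's config size parsing.
UNIT_FACTORS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_blob_limit(spec):
    """Parse '<n>[kmg]' into a byte count under the assumption above."""
    spec = spec.strip().lower()
    factor = 1
    if spec and spec[-1] in UNIT_FACTORS:
        factor = UNIT_FACTORS[spec[-1]]
        spec = spec[:-1]
    # Accept only plain ASCII decimal digits for the numeric part.
    if not (spec.isascii() and spec.isdigit()):
        raise ValueError("expected <n>[kmg]")
    return int(spec) * factor
```

Under this reading, `blob:limit=1k` would mean 1024 bytes rather than 1000.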
> ++
> +The form '--filter=sparse:oid=<oid-ish>' uses a sparse-checkout
> +specification contained in the object (or the object that the expression
> +evaluates to) to omit blobs that would not be not required for a
> +sparse checkout on the requested refs.
This one makes me a little nervous because it would mean we're
planning on adding sparse-checkout specifications to the wire
protocol. Maybe that's okay --- they're already part of the on-disk
format --- but it makes me nervous because the sparse-checkout format
is not so great, as I believe MS has already noticed.
What is an <oid-ish>? Can it just say <blob>? How would this one
work when passed over the wire?
> ++
> +The form '--filter=sparse:path=<path>' similarly uses a sparse-checkout
> +specification contained in <path>.
Is this <path> relative to the cwd of the caller, or is it within some
commit?
Sorry it took so long to send this feedback and these questions.
I hope they are useful nevertheless.
Thanks,
Jonathan