From: William Sprent <williams@unity3d.com>
To: Eric Sunshine <sunshine@sunshineco.com>,
William Sprent via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] ls-tree: add --sparse-filter-oid argument
Date: Mon, 16 Jan 2023 13:14:02 +0100 [thread overview]
Message-ID: <26f026f6-6c17-7578-e341-d56fe53c95df@unity3d.com> (raw)
In-Reply-To: <CAPig+cRgZ0CrkqY7mufuWmhf6BC8yXjXXuOTEQjuz+Y0NA+N7Q@mail.gmail.com>
On 13/01/2023 15.17, Eric Sunshine wrote:
> On Wed, Jan 11, 2023 at 12:05 PM William Sprent via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> [...]
>> To fill this gap, add a new argument to ls-tree '--sparse-filter-oid'
>> which takes the object id of a blob containing sparse checkout patterns
>> that filters the output of 'ls-tree'. When filtering with given sparsity
>> patterns, 'ls-tree' only outputs blobs and commit objects that
>> match the given patterns.
>> [...]
>> Signed-off-by: William Sprent <williams@unity3d.com>
>
> This is not a proper review, but rather just some comments about
> issues I noticed while quickly running my eye over the patch. Many are
> just micro-nits about style; a few regarding the tests are probably
> actionable.
>
>> Note that one of the tests only pass when run on top of commit
>> 5842710dc2 (dir: check for single file cone patterns, 2023-01-03), which
>> was submitted separately and is currently is merged to 'next'.
>
> Thanks for mentioning this. It's exactly the sort of information the
> maintainer needs when applying your patch to his tree. And it can be
> helpful for reviewers too.
>
>> diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
>> @@ -48,6 +48,11 @@ OPTIONS
>> +--sparse-filter-oid=<blob-ish>::
>> + Omit showing tree objects and paths that do not match the
>> + sparse-checkout specification contained in the blob
>> + (or blob-expression) <blob-ish>.
>
> Good to see a documentation update. The SYNOPSIS probably deserves an
> update too.
>
Nice catch. I'll update it.
>> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
>> @@ -329,12 +331,79 @@ static struct ls_tree_cmdmode_to_fmt ls_tree_cmdmode_format[] = {
>> +static void sparse_filter_data__init(
>> + struct sparse_filter_data **d,
>> + const char *sparse_oid_name, read_tree_fn_t fn)
>> +{
>> + struct object_id sparse_oid;
>> + struct object_context oc;
>> +
>> + (*d) = xcalloc(1, sizeof(**d));
>> + (*d)->fn = fn;
>> + (*d)->pl.use_cone_patterns = core_sparse_checkout_cone;
>> +
>> + strbuf_init(&(*d)->full_path_buf, 0);
>> +
>> +
>
> Nit: too many blank lines.
>
>> + if (get_oid_with_context(the_repository,
>> + sparse_oid_name,
>> + GET_OID_BLOB, &sparse_oid, &oc))
>
> Pure nit: somewhat odd choice of wrapping; I'd have expected something
> along the lines of:
>
> if (get_oid_with_context(the_repository, sparse_oid_name, GET_OID_BLOB,
> &sparse_oid, &oc))
>
> or:
>
> if (get_oid_with_context(the_repository, sparse_oid_name,
> GET_OID_BLOB, &sparse_oid, &oc))
>
>> +static void sparse_filter_data__free(struct sparse_filter_data *d)
>> +{
>> + clear_pattern_list(&d->pl);
>> + strbuf_release(&d->full_path_buf);
>> + free(d);
>> +}
>
> Is the double-underscore convention in function names imported from
> elsewhere? I don't recall seeing it used in this codebase.
>
I took inspiration from the similar '--filter:sparse' argument for rev-list
which is implemented in 'list-objects-filter.c'. I've just given it a
search and it looks like it that convention isn't really used outside the
list-objects files, which makes it a bit awkward here.
I get a bunch more hits for '(init|free)_', so I will change them to that.
>> +static int path_matches_sparse_checkout_patterns(struct strbuf *full_path, struct pattern_list *pl, int dtype)
>> +{
>> + enum pattern_match_result match = recursively_match_path_with_sparse_patterns(full_path->buf, the_repository->index, dtype, pl);
>> + return match > 0;
>> +}
>> +
>> +
>
> Nit: too many blank lines
>
>> +static int filter_sparse(const struct object_id *oid, struct strbuf *base,
>> + const char *pathname, unsigned mode, void *context)
>> +{
>> +
>> + struct sparse_filter_data *data = context;
>
> Nit: unnecessary blank line after "{"
>
>> + int do_recurse = show_recursive(base->buf, base->len, pathname);
>> + if (object_type(mode) == OBJ_TREE)
>> + return do_recurse;
>> +
>> + strbuf_reset(&data->full_path_buf);
>> + strbuf_addbuf(&data->full_path_buf, base);
>> + strbuf_addstr(&data->full_path_buf, pathname);
>> +
>> + if (!path_matches_sparse_checkout_patterns(&data->full_path_buf, &data->pl, DT_REG))
>> + return 0;
>> +
>> + return data->fn(oid, base, pathname, mode, context);
>> +}
>> +
>> +
>
> Nit: too many blank lines
>
>> diff --git a/dir.c b/dir.c
>> @@ -1450,45 +1450,51 @@ int init_sparse_checkout_patterns(struct index_state *istate)
>> +static int path_in_sparse_checkout_1(const char *path,
>> + struct index_state *istate,
>> + int require_cone_mode)
>> +{
>> +
>> + /*
>
> Nit: unnecessary blank line after "{"
>
>> diff --git a/t/t3106-ls-tree-sparse-checkout.sh b/t/t3106-ls-tree-sparse-checkout.sh
>> new file mode 100755
>
> We often try to avoid introducing a new test script if the tests being
> added can fit well into an existing script. If you didn't find any
> existing script where these tests would fit, then creating a new
> script may be fine.
>
Well there's a couple of things. The only ls-tree related test file
that isn't too specific is 't3103-ls-tree-misc.sh'. However, the
'sparse-checkout' command leaks memory, so that would require setting
'TEST_PASSES_SANITIZE_LEAK' to false.
For tests related to sparse-checkout, we both have the large
't1092-sparse-checkout-compatibility.sh' file. But there is also a
collection of files like 't3705-add-sparse-checkout.sh'. I guess
both could work fine in this case. I think I went with the latter
because I personally find the smaller test files a bit easier to
parse.
If someone feels strongly the other way, I'll gladly change it.
>> @@ -0,0 +1,103 @@
>> +check_agrees_with_ls_files () {
>> + REPO=repo
>> + git -C $REPO submodule deinit -f --all
>> + git -C $REPO cat-file -p ${filter_oid} >${REPO}/.git/info/sparse-checkout
>> + git -C $REPO sparse-checkout init --cone 2>err
>> + git -C $REPO submodule init
>> + git -C $REPO ls-files -t| grep -v "^S "|cut -d" " -f2 >ls-files
>> + test_cmp ls-files actual
>> +}
>
> Several comments:
>
> Since the return code of this function is significant and callers care
> about it, you should &&-chain all of the code in the function itself
> (just like you do within a test) so that failure of any command in the
> function is noticed and causes the calling test to fail. It's a good
> idea to &&-chain the variable assignments at the top of the function
> too (just in case someone later inserts code above the assignments).
>
That was an oversight. Thanks for noticing.
> It's odd to see a mixture of $VAR and ${VAR}. It's better to be
> consistent. We typically use the $VAR form (though it's not exclusive
> by any means).
>
I'll remove the braces.
> Some shells complain when the pathname following ">" redirection
> operator is not quoted, so:
>
> git -C $REPO cat-file -p ${filter_oid} >"$REPO/.git/info/sparse-checkout" &&
>
> Style: add space around "|" pipe operator
>
> We usually avoid having a Git command upstream of the pipe since its
> exit code gets swallowed by the pipe, so we usually do this instead:
>
> git -C $REPO ls-files -t >out &&
> grep -v "^S " out | cut -d" " -f2 >ls-files &&
>
Good point. I'll do that.
> Minor: The two-command pipeline `grep -v "^S " | cut -d" " -f2
>> ls-files` could be expressed via a single `sed` invocation:
>
> sed -n "/^S /!s/^. //p" out &&
>
> Nit: The first argument to test_cmp() is typically named "expect".
>
> Error output is captured to file "err" but that file is never consulted.
>
I'll remove the redirection.
>> +check_same_result_in_bare_repo () {
>> + FULL=repo
>> + BARE=bare
>> + FILTER=$1
>> + git -C repo cat-file -p ${filter_oid}| git -C bare hash-object -w --stdin
>> + git -C bare ls-tree --name-only --filter-sparse-oid=${filter_oid} -r HEAD >bare-result
>> + test_cmp expect bare-result
>> +}
>
> Same comments as above, plus:
>
> What is the purpose of the variables FULL, BARE, FILTER? They don't
> seem to be used in the function.
>
> I suspect that the function should be using $FILTER internally rather
> than $filter_oid.
>
I think I probably had plans with them and then forgot. Thanks for noticing.
I'll keep (and use) FILTER, but remove the others.
>> +test_expect_success 'setup' '
>> + git init submodule &&
>> + (
>> + cd submodule &&
>> + test_commit file
>> + ) &&
>
> Minor: test_commit() accepts a -C option, so this could be done
> without the subshell:
>
> test_commit -C submodule file
>
>> + git init repo &&
>> + (
>> + cd repo &&
>> + mkdir dir &&
>> + test_commit dir/sub-file &&
>> + test_commit dir/sub-file2 &&
>> + mkdir dir2 &&
>> + test_commit dir2/sub-file1 &&
>> + test_commit dir2/sub-file2 &&
>> + test_commit top-file &&
>> + git clone ../submodule submodule &&
>> + git submodule add ./submodule &&
>> + git submodule absorbgitdirs &&
>> + git commit -m"add submodule" &&
>> + git sparse-checkout init --cone
>> + ) &&
>
> Here the subshell makes sense since so many commands are run in
> directory "repo." Fine.
Thanks for taking the time to give some comments. I think they all
make sense to me. Besides the comments I've explicitly acknowledged above,
I've also applied all of the minor/nit/style comments to my local tree.
next prev parent reply other threads:[~2023-01-16 12:14 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-11 17:01 [PATCH] ls-tree: add --sparse-filter-oid argument William Sprent via GitGitGadget
2023-01-13 14:17 ` Eric Sunshine
2023-01-13 20:01 ` Junio C Hamano
2023-01-16 15:13 ` William Sprent
2023-01-16 12:14 ` William Sprent [this message]
2023-01-23 11:46 ` [PATCH v2] " William Sprent via GitGitGadget
2023-01-23 13:00 ` Ævar Arnfjörð Bjarmason
2023-01-24 15:30 ` William Sprent
2023-01-23 13:06 ` Ævar Arnfjörð Bjarmason
2023-01-24 15:30 ` William Sprent
2023-01-24 20:11 ` Victoria Dye
2023-01-25 13:47 ` William Sprent
2023-01-25 18:32 ` Victoria Dye
2023-01-26 14:55 ` William Sprent
2023-01-25 5:11 ` Elijah Newren
2023-01-25 16:16 ` William Sprent
2023-01-26 3:25 ` Elijah Newren
2023-01-27 11:58 ` William Sprent
2023-01-28 16:45 ` Elijah Newren
2023-01-30 15:28 ` William Sprent
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=26f026f6-6c17-7578-e341-d56fe53c95df@unity3d.com \
--to=williams@unity3d.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).