From: Derrick Stolee <stolee@gmail.com>
To: Taylor Blau <me@ttaylorr.com>, Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, dstolee@microsoft.com, szeder.dev@gmail.com,
Jeff King <peff@peff.net>
Subject: Re: [PATCH 12/12] builtin/commit-graph.c: introduce '--max-new-filters=<n>'
Date: Mon, 14 Sep 2020 16:31:03 -0400 [thread overview]
Message-ID: <134d64a0-abb6-bdc9-2c05-7aded01a906a@gmail.com> (raw)
In-Reply-To: <20200914201258.GA12431@nand.local>
On 9/14/2020 4:12 PM, Taylor Blau wrote:
> On Fri, Sep 11, 2020 at 03:25:55PM -0400, Taylor Blau wrote:
>> On Fri, Sep 11, 2020 at 02:59:34PM -0400, Taylor Blau wrote:
>>> On Fri, Sep 11, 2020 at 01:52:16PM -0400, Jeff King wrote:
>>>> On Wed, Sep 09, 2020 at 11:24:00AM -0400, Taylor Blau wrote:
>>>>> +With the `--max-new-filters=<n>` option, generate at most `n` new Bloom
>>>>> +filters (if `--changed-paths` is specified). If `n` is `-1`, no limit is
>>>>> +enforced. Commits whose filters are not calculated are stored as a
>>>>> +length zero Bloom filter, and their bit is marked in the `BFXL` chunk.
>>>>> +Overrides the `commitGraph.maxNewFilters` configuration.
>>>>
>>>> The BFXL chunk doesn't exist anymore in this iteration, right?
>>>
>>> Ack; I'll have to drop that.
>>
>> Junio, I know that I've already sent one replacement patch. If you don't
>> mind, here's another (and if you do mind, I'm happy to re-roll the
>> series).
>
> Just kidding. Let's use *this* version which fixes a bug reading the
> commitGraph.maxNewFilters configuration. At this point, the
> fix-ups are:
>
> - This patch (attached below the scisors) instead of 12/12, and
>
> - This [1] patch instead of 10/12.
>
> [1]: https://lore.kernel.org/git/20200910154516.GA32117@nand.local/
>
> Let me know if you'd rather have a full re-roll.
It's getting a bit difficult to track all of these "use this instead"
patches. But, I'm not the one applying them, so maybe that's not actually
a problem.
You might need a re-roll, anyway, as I have a few comments here:
> --- 8< ---
>
> Subject: [PATCH] builtin/commit-graph.c: introduce '--max-new-filters=<n>'
You also introduce commitGraph.maxNewFitlers here, which is not
mentioned in the commit message anywhere. In fact, it might be
good to include it as a separate patch so its implementation and
tests can be isolated from the command-line functionality.
> +length zero Bloom filter. Overrides the `commitGraph.maxNewFilters`
> +configuration.
We have found it valuable to demonstrate these overrides in tests.
Let's inspect your tests for this.
> +test_bloom_filters_computed () {
> + commit_graph_args=$1
> + rm -f "$TRASH_DIRECTORY/trace.event" &&
> + GIT_TRACE2_EVENT="$TRASH_DIRECTORY/trace.event" git commit-graph write \
> + $commit_graph_args &&
> + grep "\"filter_not_computed\":$2" "$TRASH_DIRECTORY/trace.event" &&
> + grep "\"filter_trunc_large\":$3" "$TRASH_DIRECTORY/trace.event" &&
> + grep "\"filter_computed\":$4" "$TRASH_DIRECTORY/trace.event"
> +}
If the arguments were moved to the last parameter, then we could do a few
interesting things here.
test_bloom_filters_computed () {
NOT_COMPUTED="\"filter_not_computed\":$1" &&
shift &&
TRUNCATED="\"filter_trunc_large\":$1" &&
shift &&
COMPUTED="\"filter_computed\":$1" &&
shift &&
rm -f "$TRASH_DIRECTORY/trace.event" &&
GIT_TRACE2_EVENT="$TRASH_DIRECTORY/trace.event" git commit-graph write $@ &&
grep "$NOT_COMPUTED" "$TRASH_DIRECTORY/trace.event" &&
grep "$TRUNCATED" "$TRASH_DIRECTORY/trace.event" &&
grep "$COMPUTED" "$TRASH_DIRECTORY/trace.event"
}
(I have not tested this script. It might need some work.)
This would make your callers a bit cleaner-looking, for example:
test_expect_success 'Bloom generation is limited by --max-new-filters' '
(
cd limits &&
test_commit c2 filter &&
test_commit c3 filter &&
test_commit c4 no-filter &&
test_bloom_filters_computed 3 0 2 \
--reachable --changed-paths --split=replace --max-new-filters=2
)
'
At least, this looks nicer to me.
> +test_expect_success 'Bloom generation backfills previously-skipped filters' '
> + # Check specifying commitGraph.maxNewFilters over "git config" works.
> + test_config -C limits commitGraph.maxNewFilters 1 &&
> + (
> + cd limits &&
> + test_bloom_filters_computed "--reachable --changed-paths --split=replace" \
> + 4 0 1
> + )
> +'
Adding a case for `commitGraph.maxNewFilters=1` and `--max-new-filters=2` might
be interesting for the override rules.
> +
> +test_expect_success 'Bloom generation backfills empty commits' '
> + git init empty &&
> + test_when_finished "rm -fr empty" &&
> + (
> + cd empty &&
> + for i in $(test_seq 1 6)
> + do
> + git commit --allow-empty -m "$i"
> + done &&
> +
> + # Generate Bloom filters for empty commits 1-6, two at a time.
> + test_bloom_filters_computed "--reachable --changed-paths --max-new-filters=2" \
> + 4 0 2 &&
> + test_bloom_filters_computed "--reachable --changed-paths --max-new-filters=2" \
> + 4 0 2 &&
> + test_bloom_filters_computed "--reachable --changed-paths --max-new-filters=2" \
> + 4 0 2 &&
I'm concerned that the max-new-filters limit (2) is a divisor
of the full number of commits (6). It might be good to add one
more commit here and test again with a limit of 2. That would
handle both "equal to limit" and "less than limit" cases.
> + # Finally, make sure that once all commits have filters, that
> + # none are subsequently recomputed.
> + test_bloom_filters_computed "--reachable --changed-paths --max-new-filters=2" \
> + 6 0 0
> + )
> +'
Thanks,
-Stolee
next prev parent reply other threads:[~2020-09-14 20:32 UTC|newest]
Thread overview: 75+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-09 15:22 [PATCH 00/12] more miscellaneous Bloom filter improvements, redux Taylor Blau
2020-09-09 15:22 ` [PATCH 01/12] commit-graph: introduce 'get_bloom_filter_settings()' Taylor Blau
2020-09-09 15:22 ` [PATCH 02/12] t4216: use an '&&'-chain Taylor Blau
2020-09-09 15:22 ` [PATCH 03/12] commit-graph: pass a 'struct repository *' in more places Taylor Blau
2020-09-09 15:23 ` [PATCH 04/12] t/helper/test-read-graph.c: prepare repo settings Taylor Blau
2020-09-09 15:23 ` [PATCH 05/12] commit-graph: respect 'commitGraph.readChangedPaths' Taylor Blau
2020-09-09 15:23 ` [PATCH 06/12] commit-graph.c: store maximum changed paths Taylor Blau
2020-09-09 15:23 ` [PATCH 07/12] bloom: split 'get_bloom_filter()' in two Taylor Blau
2020-09-09 15:23 ` [PATCH 08/12] bloom: use provided 'struct bloom_filter_settings' Taylor Blau
2020-09-09 15:23 ` [PATCH 09/12] bloom/diff: properly short-circuit on max_changes Taylor Blau
2020-09-09 15:23 ` [PATCH 10/12] bloom: encode out-of-bounds filters as non-empty Taylor Blau
2020-09-10 3:35 ` Taylor Blau
2020-09-10 15:45 ` Taylor Blau
2020-09-11 18:15 ` Derrick Stolee
2020-09-09 15:23 ` [PATCH 11/12] commit-graph: rename 'split_commit_graph_opts' Taylor Blau
2020-09-09 15:24 ` [PATCH 12/12] builtin/commit-graph.c: introduce '--max-new-filters=<n>' Taylor Blau
2020-09-11 17:52 ` Jeff King
2020-09-11 18:59 ` Taylor Blau
2020-09-11 19:25 ` Taylor Blau
2020-09-14 20:12 ` Taylor Blau
2020-09-14 20:31 ` Derrick Stolee [this message]
2020-09-14 20:36 ` Taylor Blau
2020-09-15 0:59 ` Derrick Stolee
2020-09-15 4:31 ` Taylor Blau
2020-09-15 21:49 ` Junio C Hamano
2020-09-15 21:53 ` Taylor Blau
2020-09-11 19:47 ` Jeff King
2020-09-11 19:31 ` Junio C Hamano
2020-09-16 18:06 ` [PATCH v2 00/13] more miscellaneous Bloom filter improvements, redux Taylor Blau
2020-09-16 18:06 ` [PATCH v2 01/13] commit-graph: introduce 'get_bloom_filter_settings()' Taylor Blau
2020-09-16 18:07 ` [PATCH v2 02/13] t4216: use an '&&'-chain Taylor Blau
2020-09-16 18:07 ` [PATCH v2 03/13] commit-graph: pass a 'struct repository *' in more places Taylor Blau
2020-09-16 18:07 ` [PATCH v2 04/13] t/helper/test-read-graph.c: prepare repo settings Taylor Blau
2020-09-16 18:07 ` [PATCH v2 05/13] commit-graph: respect 'commitGraph.readChangedPaths' Taylor Blau
2020-09-16 18:07 ` [PATCH v2 06/13] commit-graph.c: store maximum changed paths Taylor Blau
2020-09-16 18:07 ` [PATCH v2 07/13] bloom: split 'get_bloom_filter()' in two Taylor Blau
2020-09-16 18:07 ` [PATCH v2 08/13] bloom: use provided 'struct bloom_filter_settings' Taylor Blau
2020-09-16 18:07 ` [PATCH v2 09/13] bloom/diff: properly short-circuit on max_changes Taylor Blau
2020-09-16 18:07 ` [PATCH v2 10/13] bloom: encode out-of-bounds filters as non-empty Taylor Blau
2020-09-17 22:13 ` SZEDER Gábor
2020-09-17 23:13 ` Taylor Blau
2020-09-18 0:52 ` Junio C Hamano
2020-09-18 1:15 ` Taylor Blau
2020-09-16 18:08 ` [PATCH v2 11/13] commit-graph: rename 'split_commit_graph_opts' Taylor Blau
2020-09-16 18:08 ` [PATCH v2 12/13] builtin/commit-graph.c: introduce '--max-new-filters=<n>' Taylor Blau
2020-09-18 9:23 ` SZEDER Gábor
2020-09-18 13:27 ` Taylor Blau
2020-09-16 18:08 ` [PATCH v2 13/13] commit-graph: introduce 'commitGraph.maxNewFilters' Taylor Blau
2020-09-16 22:51 ` [PATCH v2 00/13] more miscellaneous Bloom filter improvements, redux Derrick Stolee
2020-09-16 23:07 ` Junio C Hamano
2020-09-17 0:45 ` Taylor Blau
2020-09-17 0:59 ` Junio C Hamano
2020-09-17 1:10 ` Taylor Blau
2020-09-17 13:34 ` Taylor Blau
2020-09-17 13:38 ` Derrick Stolee
2020-09-18 2:58 ` [PATCH v3 " Taylor Blau
2020-09-18 2:58 ` [PATCH v3 01/13] commit-graph: introduce 'get_bloom_filter_settings()' Taylor Blau
2020-09-18 2:58 ` [PATCH v3 02/13] t4216: use an '&&'-chain Taylor Blau
2020-09-18 2:59 ` [PATCH v3 03/13] commit-graph: pass a 'struct repository *' in more places Taylor Blau
2020-09-18 2:59 ` [PATCH v3 04/13] t/helper/test-read-graph.c: prepare repo settings Taylor Blau
2020-09-18 2:59 ` [PATCH v3 05/13] commit-graph: respect 'commitGraph.readChangedPaths' Taylor Blau
2020-09-18 2:59 ` [PATCH v3 06/13] commit-graph.c: store maximum changed paths Taylor Blau
2020-09-18 2:59 ` [PATCH v3 07/13] bloom: split 'get_bloom_filter()' in two Taylor Blau
2020-09-18 2:59 ` [PATCH v3 08/13] bloom: use provided 'struct bloom_filter_settings' Taylor Blau
2020-09-18 16:27 ` SZEDER Gábor
2020-09-18 16:32 ` Taylor Blau
2020-09-18 2:59 ` [PATCH v3 09/13] bloom/diff: properly short-circuit on max_changes Taylor Blau
2020-09-18 2:59 ` [PATCH v3 10/13] bloom: encode out-of-bounds filters as non-empty Taylor Blau
2020-09-18 2:59 ` [PATCH v3 11/13] commit-graph: rename 'split_commit_graph_opts' Taylor Blau
2020-09-18 2:59 ` [PATCH v3 12/13] builtin/commit-graph.c: introduce '--max-new-filters=<n>' Taylor Blau
2020-09-18 2:59 ` [PATCH v3 13/13] commit-graph: introduce 'commitGraph.maxNewFilters' Taylor Blau
2020-09-18 13:29 ` Taylor Blau
2020-09-18 17:43 ` Junio C Hamano
2020-09-18 13:31 ` [PATCH v3 00/13] more miscellaneous Bloom filter improvements, redux Taylor Blau
2020-09-18 13:34 ` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=134d64a0-abb6-bdc9-2c05-7aded01a906a@gmail.com \
--to=stolee@gmail.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
--cc=peff@peff.net \
--cc=szeder.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).