From: Philip Oakley <philipoakley@iee.email>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org
Cc: Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH] bloom: ignore renames when computing changed paths
Date: Wed, 8 Apr 2020 20:13:17 +0100
Message-ID: <7b23c659-56b4-5ed1-eb66-eb112cbde8a3@iee.email>
In-Reply-To: <pull.601.git.1586363907252.gitgitgadget@gmail.com>
spelling nit.
On 08/04/2020 17:38, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The changed-path Bloom filters record an entry in the filter for
> every path that was changed. This includes every add and delete,
> regardless of whther a rename was detected. Detecting renames
s/whther/whether/
> causes significant performance issues, but also will trigger
> downloading missing blobs in partial clone.
>
> The simple fix is to disable rename detection when computing a
> changed-path Bloom filter.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
> bloom: ignore renames when computing changed paths
>
> I promised [1] I would adapt the commit that was dropped from
> gs/commit-graph-path-filter [2] on top of gs/commit-graph-path-filter
> and jt/avoid-prefetch-when-able-in-diff. However, I noticed that the
> change was extremely simple and has value without basing it on
> jt/avoid-prefetch-when-able-in-diff.
>
> This change applied to gs/commit-graph-path-filter has obvious CPU time
> improvements for computing changed-path Bloom filters (that I did not
> measure). The partial clone improvements require
> jt/avoid-prefetch-when-able-in-diff to be included, too, but the code
> does not depend on it at compile time.
>
> Thanks, -Stolee
>
> [1]
> https://lore.kernel.org/git/7de2f54b-8704-a0e1-12aa-0ca9d3d70f6f@gmail.com/
> [2]
> https://lore.kernel.org/git/55824cda89c1dca7756c8c2d831d6e115f4a9ddb.1585528298.git.gitgitgadget@gmail.com/
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-601%2Fderrickstolee%2Fdiff-and-bloom-filters-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-601/derrickstolee/diff-and-bloom-filters-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/601
>
> bloom.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/bloom.c b/bloom.c
> index c5b461d1cfe..dd9bab9bbd6 100644
> --- a/bloom.c
> +++ b/bloom.c
> @@ -189,6 +189,7 @@ struct bloom_filter *get_bloom_filter(struct repository *r,
>
>  	repo_diff_setup(r, &diffopt);
>  	diffopt.flags.recursive = 1;
> +	diffopt.detect_rename = 0;
>  	diffopt.max_changes = max_changes;
>  	diff_setup_done(&diffopt);
>
>
> base-commit: d5b873c832d832e44523d1d2a9d29afe2b84c84f
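
To illustrate the point made in the commit message above (every add and
delete gets its own entry, so a rename is simply recorded as two paths),
here is a minimal, self-contained sketch. It is not the code in bloom.c:
the names toy_filter, toy_change, toy_hash and the sizes FILTER_BITS and
NUM_HASHES are all hypothetical, and the hash is a seeded FNV-1a stand-in
rather than whatever git really uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILTER_BITS 512u /* illustrative size, not git's */
#define NUM_HASHES 7u    /* illustrative count, not git's */

struct toy_filter {
	uint8_t bits[FILTER_BITS / 8];
};

/* One changed path, as a diff without rename detection would report it. */
struct toy_change {
	char status; /* 'A' add, 'D' delete, 'M' modify */
	const char *path;
};

/* Seeded FNV-1a; a stand-in hash chosen only to keep the sketch short. */
uint32_t toy_hash(const char *s, uint32_t seed)
{
	uint32_t h = 2166136261u ^ seed;
	for (; *s; s++) {
		h ^= (uint8_t)*s;
		h *= 16777619u;
	}
	return h;
}

/* Set NUM_HASHES bits for the path, using double hashing. */
void toy_filter_add(struct toy_filter *f, const char *path)
{
	uint32_t h1, h2, i;

	assert(f && path);
	h1 = toy_hash(path, 0);
	h2 = toy_hash(path, 0x9e3779b9u);
	for (i = 0; i < NUM_HASHES; i++) {
		uint32_t bit = (h1 + i * h2) % FILTER_BITS;
		f->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
	}
}

/* Returns 0 for "definitely not changed", 1 for "maybe changed". */
int toy_filter_maybe_contains(const struct toy_filter *f, const char *path)
{
	uint32_t h1 = toy_hash(path, 0);
	uint32_t h2 = toy_hash(path, 0x9e3779b9u);
	uint32_t i;

	for (i = 0; i < NUM_HASHES; i++) {
		uint32_t bit = (h1 + i * h2) % FILTER_BITS;
		if (!(f->bits[bit / 8] & (1u << (bit % 8))))
			return 0;
	}
	return 1;
}

/*
 * Record every changed path. The status is deliberately ignored: a
 * rename shows up as one 'D' and one 'A', and both paths are added.
 */
void toy_filter_fill(struct toy_filter *f,
		     const struct toy_change *changes, size_t nr)
{
	size_t i;

	memset(f, 0, sizeof(*f));
	for (i = 0; i < nr; i++)
		toy_filter_add(f, changes[i].path);
}
```

Filling the filter from a 'D' entry for old/name.c and an 'A' entry for
new/name.c makes both paths answer "maybe changed", which is all a
path-limited history walk needs. Since the filter only stores paths,
pairing the two entries into a rename would not change its contents.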
Philip
Thread overview: 9+ messages
2020-04-08 16:38 [PATCH] bloom: ignore renames when computing changed paths Derrick Stolee via GitGitGadget
2020-04-08 19:11 ` Junio C Hamano
2020-04-08 19:13 ` Philip Oakley [this message]
2020-04-08 22:31 ` Jeff King
2020-04-09 11:56 ` Derrick Stolee
2020-04-09 13:47 ` Jeff King
2020-04-09 14:00 ` Derrick Stolee
2020-04-09 14:15 ` Jeff King
2020-04-09 13:00 ` [PATCH v2] " Derrick Stolee via GitGitGadget