From: Derrick Stolee <stolee@gmail.com>
To: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>,
git@vger.kernel.org
Cc: "Derrick Stolee" <dstolee@microsoft.com>,
"Jonathan Tan" <jonathantanmy@google.com>,
"Taylor Blau" <me@ttaylorr.com>,
"Junio C Hamano" <gitster@pobox.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Elijah Newren" <newren@gmail.com>
Subject: Re: [PATCH v2 0/8] Optimization batch 9: avoid detecting irrelevant renames
Date: Tue, 9 Mar 2021 17:08:37 -0500 [thread overview]
Message-ID: <9fce692f-86f9-59c1-07de-1eaabdbcb827@gmail.com> (raw)
In-Reply-To: <pull.845.v2.git.1615248599.gitgitgadget@gmail.com>
On 3/8/2021 7:09 PM, Elijah Newren via GitGitGadget wrote:> The basic idea here is:
>
> We only need expensive rename detection on the subset of files changed on
> both sides of history (for the most part).
>
> This is because:
>
> 1. The primary reason for rename detection in merges is enabling three-way
> content merges
> 2. The purpose of three-way content merges is reconciling changes when
>
> both sides of history modified some file 3. If a file was only modified by
> the side that renamed the file, then detecting the rename is irrelevant;
> we'll get the same answer without knowing about the rename. 4. (Well...there
> are rare cases where we need the rename for reasons other than three-way
> content merges. Patch 5 explains those.)
Makes sense. Don't compute information you won't need. I look forward to
trying to figure out the special cases here.
> Before Series After Series
> no-renames: 12.596 s ± 0.061 s 5.680 s ± 0.096 s
> mega-renames: 130.465 s ± 0.259 s 13.812 s ± 0.162 s
> just-one-mega: 3.958 s ± 0.010 s 506.0 ms ± 3.9 ms
These are _very_ impressive numbers for such a "simple" idea.
> However, interestingly, if we had ignored the basename-guided rename
> detection optimizations[2][3], then this optimization series would have
> improved the performance as follows:
>
> Before Basename Series After Just This Series
> no-renames: 13.815 s ± 0.062 s 5.728 s ± 0.104 s
> mega-renames: 1799.937 s ± 0.493 s 18.213 s ± 0.139 s
> just-one-mega 51.289 s ± 0.019 s 891.9 ms ± 7.0 ms
And here it is even more impressive. I see that your optimizations are
having combined effects but also are doing valuable things on their
own.
> We get best results by prioritizing them as follows:
>
> 1. exact rename detection
> 2. skip-because-unnecessary
> 3. basename-guided rename detection
This makes sense to me, since even the basename-guided rename is
doing some non-trivial work. It would be good to reduce that
effort.
> it means that our remaining optimization potential is
> somewhat limited, and subsequent optimization series will have to fight for
> much smaller gains.
This is a good place to end up. Let the code rest for a bit after
we are done here, and maybe we'll find new cases to care about
later. We could chase the long tail forever, but these steps are
a huge accomplishment!
Getting to reading now.
-Stolee
next prev parent reply other threads:[~2021-03-09 22:09 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-28 3:58 [PATCH 0/8] Optimization batch 9: avoid detecting irrelevant renames Elijah Newren via GitGitGadget
2021-02-28 3:58 ` [PATCH 1/8] diffcore-rename: enable filtering possible rename sources Elijah Newren via GitGitGadget
2021-02-28 3:58 ` [PATCH 2/8] merge-ort: precompute subset of sources for which we need rename detection Elijah Newren via GitGitGadget
2021-02-28 3:58 ` [PATCH 3/8] merge-ort: add data structures for an alternate tree traversal Elijah Newren via GitGitGadget
2021-02-28 3:58 ` [PATCH 4/8] merge-ort: introduce wrappers for " Elijah Newren via GitGitGadget
2021-02-28 3:58 ` [PATCH 5/8] merge-ort: precompute whether directory rename detection is needed Elijah Newren via GitGitGadget
2021-02-28 3:58 ` [PATCH 6/8] merge-ort: use relevant_sources to filter possible rename sources Elijah Newren via GitGitGadget
2021-02-28 3:58 ` [PATCH 7/8] merge-ort: skip rename detection entirely if possible Elijah Newren via GitGitGadget
2021-02-28 3:58 ` [PATCH 8/8] diffcore-rename: avoid doing basename comparisons for irrelevant sources Elijah Newren via GitGitGadget
2021-03-01 16:39 ` [PATCH 0/8] Optimization batch 9: avoid detecting irrelevant renames Elijah Newren
2021-03-04 7:54 ` Elijah Newren
2021-03-09 0:09 ` [PATCH v2 " Elijah Newren via GitGitGadget
2021-03-09 0:09 ` [PATCH v2 1/8] diffcore-rename: enable filtering possible rename sources Elijah Newren via GitGitGadget
2021-03-09 22:21 ` Derrick Stolee
2021-03-09 22:40 ` Elijah Newren
2021-03-09 22:45 ` Derrick Stolee
2021-03-09 0:09 ` [PATCH v2 2/8] merge-ort: precompute subset of sources for which we need rename detection Elijah Newren via GitGitGadget
2021-03-09 0:09 ` [PATCH v2 3/8] merge-ort: add data structures for an alternate tree traversal Elijah Newren via GitGitGadget
2021-03-09 0:09 ` [PATCH v2 4/8] merge-ort: introduce wrappers for " Elijah Newren via GitGitGadget
2021-03-09 23:06 ` Derrick Stolee
2021-03-09 0:09 ` [PATCH v2 5/8] merge-ort: precompute whether directory rename detection is needed Elijah Newren via GitGitGadget
2021-03-09 0:09 ` [PATCH v2 6/8] merge-ort: use relevant_sources to filter possible rename sources Elijah Newren via GitGitGadget
2021-03-09 0:09 ` [PATCH v2 7/8] merge-ort: skip rename detection entirely if possible Elijah Newren via GitGitGadget
2021-03-09 22:51 ` Derrick Stolee
2021-03-09 22:57 ` Elijah Newren
2021-03-09 0:09 ` [PATCH v2 8/8] diffcore-rename: avoid doing basename comparisons for irrelevant sources Elijah Newren via GitGitGadget
2021-03-09 22:08 ` Derrick Stolee [this message]
2021-03-10 15:08 ` [PATCH v2 0/8] Optimization batch 9: avoid detecting irrelevant renames Derrick Stolee
2021-03-11 0:38 ` [PATCH v3 " Elijah Newren via GitGitGadget
2021-03-11 0:38 ` [PATCH v3 1/8] diffcore-rename: enable filtering possible rename sources Elijah Newren via GitGitGadget
2021-03-11 0:38 ` [PATCH v3 2/8] merge-ort: precompute subset of sources for which we need rename detection Elijah Newren via GitGitGadget
2021-03-11 0:38 ` [PATCH v3 3/8] merge-ort: add data structures for an alternate tree traversal Elijah Newren via GitGitGadget
2021-03-11 0:38 ` [PATCH v3 4/8] merge-ort: introduce wrappers for " Elijah Newren via GitGitGadget
2021-03-11 0:38 ` [PATCH v3 5/8] merge-ort: precompute whether directory rename detection is needed Elijah Newren via GitGitGadget
2021-03-11 0:38 ` [PATCH v3 6/8] merge-ort: use relevant_sources to filter possible rename sources Elijah Newren via GitGitGadget
2021-03-11 0:38 ` [PATCH v3 7/8] merge-ort: skip rename detection entirely if possible Elijah Newren via GitGitGadget
2021-03-11 0:38 ` [PATCH v3 8/8] diffcore-rename: avoid doing basename comparisons for irrelevant sources Elijah Newren via GitGitGadget
2021-03-15 13:57 ` [PATCH v3 0/8] Optimization batch 9: avoid detecting irrelevant renames Derrick Stolee
2021-03-15 17:10 ` [PATCH v2 " Elijah Newren
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9fce692f-86f9-59c1-07de-1eaabdbcb827@gmail.com \
--to=stolee@gmail.com \
--cc=avarab@gmail.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
--cc=jonathantanmy@google.com \
--cc=me@ttaylorr.com \
--cc=newren@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).