From: Derrick Stolee <stolee@gmail.com>
To: Jakub Narebski <jnareb@gmail.com>,
Derrick Stolee <dstolee@microsoft.com>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
"Jeff King" <peff@peff.net>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [PATCH v4 08/10] commit: add short-circuit to paint_down_to_common()
Date: Tue, 1 May 2018 07:47:01 -0400 [thread overview]
Message-ID: <3e3ae7e4-9efd-3046-c0c2-7b3bf12d0d81@gmail.com> (raw)
In-Reply-To: <864ljsgwx1.fsf@gmail.com>
On 4/30/2018 6:19 PM, Jakub Narebski wrote:
> Derrick Stolee <dstolee@microsoft.com> writes:
>
>> When running 'git branch --contains', the in_merge_bases_many()
>> method calls paint_down_to_common() to discover if a specific
>> commit is reachable from a set of branches. Commits with lower
>> generation number are not needed to correctly answer the
>> containment query of in_merge_bases_many().
>>
>> Add a new parameter, min_generation, to paint_down_to_common() that
>> prevents walking commits with generation number strictly less than
>> min_generation. If 0 is given, then there is no functional change.
> This is thanks to the fact that generation numbers start at zero (as
> special case, though, with _ZERO), and we use strict inequality to avoid
> handling _ZERO etc. in a special way.
>
> As you wrote in response in previous version of this series, because
> paint_down_to_common() is file-local, there is no need to come up with
> symbolic name for GENERATION_NO_CUTOFF case.
>
> All right.
>
>> For in_merge_bases_many(), we can pass commit->generation as the
>> cutoff, and this saves time during 'git branch --contains' queries
>> that would otherwise walk "around" the commit we are inspecting.
> All right, and when using paint_down_to_common() to actually find merge
> bases, and not only check for containment, we cannot use cutoff.
> Therefore at least one call site needs to run it without functional
> change... which we can do. Good.
>
>> For a copy of the Linux repository, where HEAD is checked out at
>> v4.13~100, we get the following performance improvement for
>> 'git branch --contains' over the previous commit:
>>
>> Before: 0.21s
>> After: 0.13s
>> Rel %: -38%
> Nice.
>
>> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>> ---
>> commit.c | 18 ++++++++++++++----
>> 1 file changed, 14 insertions(+), 4 deletions(-)
> Let me reorder chunks a bit to make it easier to review.
>
>> diff --git a/commit.c b/commit.c
>> index 7bb007f56a..e2e16ea1a7 100644
>> --- a/commit.c
>> +++ b/commit.c
>> @@ -1070,7 +1080,7 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit *
>> if (commit->generation > min_generation)
>> return ret;
>>
>> - bases = paint_down_to_common(commit, nr_reference, reference);
>> + bases = paint_down_to_common(commit, nr_reference, reference, commit->generation);
>> if (commit->object.flags & PARENT2)
>> ret = 1;
>> clear_commit_marks(commit, all_flags);
>> @@ -808,11 +808,14 @@ static int queue_has_nonstale(struct prio_queue *queue)
>> }
>>
>> /* all input commits in one and twos[] must have been parsed! */
>> -static struct commit_list *paint_down_to_common(struct commit *one, int n, struct commit **twos)
>> +static struct commit_list *paint_down_to_common(struct commit *one, int n,
>> + struct commit **twos,
>> + int min_generation)
>> {
>> struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
>> struct commit_list *result = NULL;
>> int i;
>> + uint32_t last_gen = GENERATION_NUMBER_INFINITY;
>>
>> one->object.flags |= PARENT1;
>> if (!n) {
>> @@ -831,6 +834,13 @@ static struct commit_list *paint_down_to_common(struct commit *one, int n, struc
>> struct commit_list *parents;
>> int flags;
>>
>> + if (commit->generation > last_gen)
>> + BUG("bad generation skip");
>> + last_gen = commit->generation;
> Shouldn't we provide more information about where the problem is to the
> user, to make it easier to debug the repository / commit-graph data?
>
> Good to have this sanity check here.
This BUG() _should_ only be seen by developers who add callers which do
not load commits from the commit-graph file. There is a chance that
there are cases not covered by this patch and the added tests, though.
Hopefully we catch them all by dogfooding the feature before turning it
on by default.
I can add the following to help debug these bad situations:
+ BUG("bad generation skip %d > %d at %s",
+ commit->generation, last_gen,
+ oid_to_hex(&commit->object.oid));
>
>> +
>> + if (commit->generation < min_generation)
>> + break;
> So the reasoning for this, as far as I understand, is the following.
> Please correct me if I am wrong.
>
> The callsite with non-zero min_generation, in_merge_bases_many(), tries
> to find out if "commit" is an ancestor of one of the "references". At
> least one of "references" is above "commit", so in_merge_bases_many()
> uses paint_down_to_common() - but is interested only if "commit" was
> painted as reachable from one of "references".
>
> Thus we can interrupt the walk if we know that none of [considered]
> commits in the queue can reach "commit"/"one" - as if they were all
> STALE.
>
> The search is done using priority queue (a bit like in Dijkstra
> algorithm), with newer commits - with larger generation numbers -
> considered first. Thus if current commit has generation number less
> than min_generation cutoff, i.e. if it is below "commit", then all
> remaining commits in the queue are below cutoff.
>
> Good.
>
>> +
>> flags = commit->object.flags & (PARENT1 | PARENT2 | STALE);
>> if (flags == (PARENT1 | PARENT2)) {
>> if (!(commit->object.flags & RESULT)) {
>> @@ -879,7 +889,7 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co
>> return NULL;
>> }
>>
>> - list = paint_down_to_common(one, n, twos);
>> + list = paint_down_to_common(one, n, twos, 0);
> When calculating merge bases there is no such possibility of an early
> return due to generation number cutoff. All right then.
>
>>
>> while (list) {
>> struct commit *commit = pop_commit(&list);
>> @@ -946,7 +956,7 @@ static int remove_redundant(struct commit **array, int cnt)
>> filled_index[filled] = j;
>> work[filled++] = array[j];
>> }
>> - common = paint_down_to_common(array[i], filled, work);
>> + common = paint_down_to_common(array[i], filled, work, 0);
> Here we are interested not only if "one"/array[i] is reachable from
> "twos"/work, but also if "twos" is reachable from "one". Simple cutoff
> only works in one way, though I wonder if we couldn't use cutoff being
> minimum generation number of "one" and "twos" together.
>
> But that may be left for a separate commit (after checking that the
> above is correct).
>
> Not as simple and obvious as paint_down_to_common() used in
> in_merge_bases_any(), so it is all right.
Thanks for reporting this. Since we are only concerned about
reachability in this method, it is a good candidate to use
min_generation. It is also subtle enough that we should leave it as a
separate commit. Also, we can measure performance improvements
separately, as I will mention in my commit message (but I'll copy it here):
For a copy of the Linux repository, we measured the following
performance improvements:
git merge-base v3.3 v4.5
Before: 234 ms
After: 208 ms
Rel %: -11%
git merge-base v4.3 v4.5
Before: 102 ms
After: 83 ms
Rel %: -19%
The experiments above were chosen to demonstrate that we are
improving the filtering of the merge-base set. In the first
example, more time is spent walking the history to find the
set of merge bases before the remove_redundant() call. The
starting commits are closer together in the second example,
therefore more time is spent in remove_redundant(). The relative
change in performance differs as expected.
>
>> if (array[i]->object.flags & PARENT2)
>> redundant[i] = 1;
>> for (j = 0; j < filled; j++)
next prev parent reply other threads:[~2018-05-01 11:47 UTC|newest]
Thread overview: 162+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-04-03 16:51 [PATCH 0/6] Compute and consume generation numbers Derrick Stolee
2018-04-03 16:51 ` [PATCH 1/6] object.c: parse commit in graph first Derrick Stolee
2018-04-03 18:21 ` Jonathan Tan
2018-04-03 18:28 ` Jeff King
2018-04-03 18:32 ` Derrick Stolee
2018-04-03 16:51 ` [PATCH 2/6] commit: add generation number to struct commmit Derrick Stolee
2018-04-03 18:05 ` Brandon Williams
2018-04-03 18:28 ` Jeff King
2018-04-03 18:31 ` Derrick Stolee
2018-04-03 18:32 ` Brandon Williams
2018-04-03 18:44 ` Stefan Beller
2018-04-03 23:17 ` Ramsay Jones
2018-04-03 23:19 ` Jeff King
2018-04-03 18:24 ` Jonathan Tan
2018-04-03 16:51 ` [PATCH 3/6] commit-graph: compute generation numbers Derrick Stolee
2018-04-03 18:30 ` Jonathan Tan
2018-04-03 18:49 ` Stefan Beller
2018-04-03 16:51 ` [PATCH 4/6] commit: use generations in paint_down_to_common() Derrick Stolee
2018-04-03 18:31 ` Stefan Beller
2018-04-03 18:31 ` Jonathan Tan
2018-04-03 16:51 ` [PATCH 5/6] commit.c: use generation to halt paint walk Derrick Stolee
2018-04-03 19:01 ` Jonathan Tan
2018-04-03 16:51 ` [PATCH 6/6] commit-graph.txt: update future work Derrick Stolee
2018-04-03 19:04 ` Jonathan Tan
2018-04-03 16:56 ` [PATCH 0/6] Compute and consume generation numbers Derrick Stolee
2018-04-03 18:03 ` Brandon Williams
2018-04-03 18:29 ` Derrick Stolee
2018-04-03 18:47 ` Jeff King
2018-04-03 19:05 ` Jeff King
2018-04-04 15:45 ` [PATCH 7/6] ref-filter: use generation number for --contains Derrick Stolee
2018-04-04 15:45 ` [PATCH 8/6] commit: use generation numbers for in_merge_bases() Derrick Stolee
2018-04-04 15:48 ` Derrick Stolee
2018-04-04 17:01 ` Brandon Williams
2018-04-04 18:24 ` Jeff King
2018-04-04 18:53 ` Derrick Stolee
2018-04-04 18:59 ` Jeff King
2018-04-04 18:22 ` [PATCH 7/6] ref-filter: use generation number for --contains Jeff King
2018-04-04 19:06 ` Derrick Stolee
2018-04-04 19:16 ` Jeff King
2018-04-04 19:22 ` Derrick Stolee
2018-04-04 19:42 ` Jeff King
2018-04-04 19:45 ` Derrick Stolee
2018-04-04 19:46 ` Jeff King
2018-04-07 17:09 ` [PATCH 0/6] Compute and consume generation numbers Jakub Narebski
2018-04-07 16:55 ` Jakub Narebski
2018-04-08 1:06 ` Derrick Stolee
2018-04-11 19:32 ` Jakub Narebski
2018-04-11 19:58 ` Derrick Stolee
2018-04-14 16:52 ` Jakub Narebski
2018-04-21 20:44 ` Jakub Narebski
2018-04-23 13:54 ` Derrick Stolee
2018-04-09 16:41 ` [PATCH v2 00/10] " Derrick Stolee
2018-04-09 16:41 ` [PATCH v2 01/10] object.c: parse commit in graph first Derrick Stolee
2018-04-09 16:41 ` [PATCH v2 02/10] merge: check config before loading commits Derrick Stolee
2018-04-11 2:12 ` Junio C Hamano
2018-04-11 12:49 ` Derrick Stolee
2018-04-09 16:42 ` [PATCH v2 03/10] commit: add generation number to struct commmit Derrick Stolee
2018-04-09 17:59 ` Stefan Beller
2018-04-11 2:31 ` Junio C Hamano
2018-04-11 12:57 ` Derrick Stolee
2018-04-11 23:28 ` Junio C Hamano
2018-04-09 16:42 ` [PATCH v2 04/10] commit-graph: compute generation numbers Derrick Stolee
2018-04-11 2:51 ` Junio C Hamano
2018-04-11 13:02 ` Derrick Stolee
2018-04-11 18:49 ` Stefan Beller
2018-04-11 19:26 ` Eric Sunshine
2018-04-09 16:42 ` [PATCH v2 05/10] commit: use generations in paint_down_to_common() Derrick Stolee
2018-04-09 16:42 ` [PATCH v2 06/10] commit.c: use generation to halt paint walk Derrick Stolee
2018-04-11 3:02 ` Junio C Hamano
2018-04-11 13:24 ` Derrick Stolee
2018-04-09 16:42 ` [PATCH v2 07/10] commit-graph.txt: update future work Derrick Stolee
2018-04-12 9:12 ` Junio C Hamano
2018-04-12 11:35 ` Derrick Stolee
2018-04-13 9:53 ` Jakub Narebski
2018-04-09 16:42 ` [PATCH v2 08/10] ref-filter: use generation number for --contains Derrick Stolee
2018-04-09 16:42 ` [PATCH v2 09/10] commit: use generation numbers for in_merge_bases() Derrick Stolee
2018-04-09 16:42 ` [PATCH v2 10/10] commit: add short-circuit to paint_down_to_common() Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 0/9] Compute and consume generation numbers Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 1/9] commit: add generation number to struct commmit Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 2/9] commit-graph: compute generation numbers Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 3/9] commit: use generations in paint_down_to_common() Derrick Stolee
2018-04-18 14:31 ` Jakub Narebski
2018-04-18 14:46 ` Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 4/9] commit-graph.txt: update design document Derrick Stolee
2018-04-18 19:47 ` Jakub Narebski
2018-04-17 17:00 ` [PATCH v3 5/9] ref-filter: use generation number for --contains Derrick Stolee
2018-04-18 21:02 ` Jakub Narebski
2018-04-23 14:22 ` Derrick Stolee
2018-04-24 18:56 ` Jakub Narebski
2018-04-25 14:11 ` Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 6/9] commit: use generation numbers for in_merge_bases() Derrick Stolee
2018-04-18 22:15 ` Jakub Narebski
2018-04-23 14:31 ` Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 7/9] commit: add short-circuit to paint_down_to_common() Derrick Stolee
2018-04-18 23:19 ` Jakub Narebski
2018-04-23 14:40 ` Derrick Stolee
2018-04-23 21:38 ` Jakub Narebski
2018-04-24 12:31 ` Derrick Stolee
2018-04-19 8:32 ` Jakub Narebski
2018-04-17 17:00 ` [PATCH v3 8/9] commit-graph: always load commit-graph information Derrick Stolee
2018-04-17 17:50 ` Derrick Stolee
2018-04-19 0:02 ` Jakub Narebski
2018-04-23 14:49 ` Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 9/9] merge: check config before loading commits Derrick Stolee
2018-04-19 0:04 ` [PATCH v3 0/9] Compute and consume generation numbers Jakub Narebski
2018-04-23 14:54 ` Derrick Stolee
2018-04-25 14:37 ` [PATCH v4 00/10] " Derrick Stolee
2018-04-25 14:37 ` [PATCH v4 01/10] ref-filter: fix outdated comment on in_commit_list Derrick Stolee
2018-04-28 17:54 ` Jakub Narebski
2018-04-25 14:37 ` [PATCH v4 02/10] commit: add generation number to struct commmit Derrick Stolee
2018-04-28 22:35 ` Jakub Narebski
2018-04-30 12:05 ` Derrick Stolee
2018-04-25 14:37 ` [PATCH v4 03/10] commit-graph: compute generation numbers Derrick Stolee
2018-04-26 2:35 ` Junio C Hamano
2018-04-26 12:58 ` Derrick Stolee
2018-04-26 13:49 ` Derrick Stolee
2018-04-29 9:08 ` Jakub Narebski
2018-05-01 12:10 ` Derrick Stolee
2018-05-02 16:15 ` Jakub Narebski
2018-04-25 14:37 ` [PATCH v4 04/10] commit: use generations in paint_down_to_common() Derrick Stolee
2018-04-26 3:22 ` Junio C Hamano
2018-04-26 9:02 ` Jakub Narebski
2018-04-28 14:38 ` Jakub Narebski
2018-04-29 15:40 ` Jakub Narebski
2018-04-25 14:37 ` [PATCH v4 05/10] commit-graph: always load commit-graph information Derrick Stolee
2018-04-29 22:14 ` Jakub Narebski
2018-05-01 12:19 ` Derrick Stolee
2018-04-29 22:18 ` Jakub Narebski
2018-04-25 14:37 ` [PATCH v4 06/10] ref-filter: use generation number for --contains Derrick Stolee
2018-04-30 16:34 ` Jakub Narebski
2018-04-25 14:37 ` [PATCH v4 07/10] commit: use generation numbers for in_merge_bases() Derrick Stolee
2018-04-30 17:05 ` Jakub Narebski
2018-04-25 14:38 ` [PATCH v4 08/10] commit: add short-circuit to paint_down_to_common() Derrick Stolee
2018-04-30 22:19 ` Jakub Narebski
2018-05-01 11:47 ` Derrick Stolee [this message]
2018-05-02 13:05 ` Jakub Narebski
2018-05-02 13:42 ` Derrick Stolee
2018-04-25 14:38 ` [PATCH v4 09/10] merge: check config before loading commits Derrick Stolee
2018-04-30 22:54 ` Jakub Narebski
2018-05-01 11:52 ` Derrick Stolee
2018-05-02 11:41 ` Jakub Narebski
2018-04-25 14:38 ` [PATCH v4 10/10] commit-graph.txt: update design document Derrick Stolee
2018-04-30 23:32 ` Jakub Narebski
2018-05-01 12:00 ` Derrick Stolee
2018-05-02 7:57 ` Jakub Narebski
2018-04-25 14:40 ` [PATCH v4 00/10] Compute and consume generation numbers Derrick Stolee
2018-04-28 17:28 ` Jakub Narebski
2018-05-01 12:47 ` [PATCH v5 00/11] " Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 01/11] ref-filter: fix outdated comment on in_commit_list Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 02/11] commit: add generation number to struct commmit Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 03/11] commit-graph: compute generation numbers Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 04/11] commit: use generations in paint_down_to_common() Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 05/11] commit-graph: always load commit-graph information Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 06/11] ref-filter: use generation number for --contains Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 07/11] commit: use generation numbers for in_merge_bases() Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 08/11] commit: add short-circuit to paint_down_to_common() Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 09/11] commit: use generation number in remove_redundant() Derrick Stolee
2018-05-01 15:37 ` Derrick Stolee
2018-05-03 18:45 ` Jakub Narebski
2018-05-01 12:47 ` [PATCH v5 10/11] merge: check config before loading commits Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 11/11] commit-graph.txt: update design document Derrick Stolee
2018-05-03 11:18 ` [PATCH v5 00/11] Compute and consume generation numbers Jakub Narebski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3e3ae7e4-9efd-3046-c0c2-7b3bf12d0d81@gmail.com \
--to=stolee@gmail.com \
--cc=avarab@gmail.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jnareb@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).