From: Derrick Stolee <stolee@gmail.com>
To: Jakub Narebski <jnareb@gmail.com>
Cc: "Derrick Stolee" <dstolee@microsoft.com>,
git@vger.kernel.org, "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Stefan Beller" <sbeller@google.com>,
"Lars Schneider" <larsxschneider@gmail.com>,
"Jeff King" <peff@peff.net>
Subject: Re: [PATCH 0/6] Compute and consume generation numbers
Date: Wed, 11 Apr 2018 15:58:27 -0400 [thread overview]
Message-ID: <45320265-9fec-cee1-e82c-3ff719bb0435@gmail.com> (raw)
In-Reply-To: <868t9t5yjz.fsf@gmail.com>
On 4/11/2018 3:32 PM, Jakub Narebski wrote:
> What would you suggest as a good test that could imply performance? The
> Google Colab notebook linked to above includes a function to count
> number of commits (nodes / vertices in the commit graph) walked,
> currently in the worst case scenario.
The two main questions to consider are:
1. Can X reach Y?
2. What is the set of merge-bases between X and Y?
And the thing to measure is a commit count. If possible, it would be
good to count commits walked (commits whose parent list is enumerated)
and commits inspected (commits that were listed as a parent of some
walked commit). Walked commits require a commit parse -- albeit from the
commit-graph instead of the ODB now -- while inspected commits only
check the in-memory cache.
For git.git and Linux, I like to use the release tags as tests. They
provide a realistic view of the linear history, and maintenance releases
have their own history from the major releases.
> I have tried finding number of false positives for level (generation
> number) filter and for FELINE index, and number of false negatives for
> min-post intervals in the spanning tree (for DFS tree) for 10000
> randomly selected pairs of commits... but I don't think this is a good
> benchmark.
What is a false-positive? A case where gen(X) < gen(Y) but Y cannot
reach X? I do not think that is a great benchmark, but I guess it is
something to measure.
> I Linux kernel sources (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git)
> that has 750832 nodes and 811733 edges, and 563747941392 possible
> directed pairs, we have for 10000 randomly selected pairs of commits:
>
> level-filter has 91 = 0.91% [all] false positives
> FELINE index has 78 = 0.78% [all] false positives
> FELINE index has 1.16667 less false positives than level filter
>
> min-post spanning-tree intervals has 3641 = 36.41% [all] false
> negatives
Perhaps something you can do instead of sampling from N^2 commits in
total is to select a pair of generations (say, G = 20000, G' = 20100) or
regions of generations ( 20000 <= G <= 20050, 20100 <= G' <= 20150) and
see how many false positives you see by testing all pairs (one from each
level). The delta between the generations may need to be smaller to
actually have a large proportion of unreachable pairs. Try different
levels, since major version releases tend to "pinch" the commit graph to
a common history.
> For git.git repository (https://github.com/git/git.git) that has 52950
> nodes and 65887 edges the numbers are slighly more in FELINE index
> favor (also out of 10000 random pairs):
>
> level-filter has 504 = 9.11% false positives
> FELINE index has 125 = 2.26% false positives
> FELINE index has 4.032 less false positives than level filter
>
> This is for FELINE which does not use level / generatio-numbers filter.
Thanks,
-Stolee
next prev parent reply other threads:[~2018-04-11 19:58 UTC|newest]
Thread overview: 162+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-04-03 16:51 [PATCH 0/6] Compute and consume generation numbers Derrick Stolee
2018-04-03 16:51 ` [PATCH 1/6] object.c: parse commit in graph first Derrick Stolee
2018-04-03 18:21 ` Jonathan Tan
2018-04-03 18:28 ` Jeff King
2018-04-03 18:32 ` Derrick Stolee
2018-04-03 16:51 ` [PATCH 2/6] commit: add generation number to struct commmit Derrick Stolee
2018-04-03 18:05 ` Brandon Williams
2018-04-03 18:28 ` Jeff King
2018-04-03 18:31 ` Derrick Stolee
2018-04-03 18:32 ` Brandon Williams
2018-04-03 18:44 ` Stefan Beller
2018-04-03 23:17 ` Ramsay Jones
2018-04-03 23:19 ` Jeff King
2018-04-03 18:24 ` Jonathan Tan
2018-04-03 16:51 ` [PATCH 3/6] commit-graph: compute generation numbers Derrick Stolee
2018-04-03 18:30 ` Jonathan Tan
2018-04-03 18:49 ` Stefan Beller
2018-04-03 16:51 ` [PATCH 4/6] commit: use generations in paint_down_to_common() Derrick Stolee
2018-04-03 18:31 ` Stefan Beller
2018-04-03 18:31 ` Jonathan Tan
2018-04-03 16:51 ` [PATCH 5/6] commit.c: use generation to halt paint walk Derrick Stolee
2018-04-03 19:01 ` Jonathan Tan
2018-04-03 16:51 ` [PATCH 6/6] commit-graph.txt: update future work Derrick Stolee
2018-04-03 19:04 ` Jonathan Tan
2018-04-03 16:56 ` [PATCH 0/6] Compute and consume generation numbers Derrick Stolee
2018-04-03 18:03 ` Brandon Williams
2018-04-03 18:29 ` Derrick Stolee
2018-04-03 18:47 ` Jeff King
2018-04-03 19:05 ` Jeff King
2018-04-04 15:45 ` [PATCH 7/6] ref-filter: use generation number for --contains Derrick Stolee
2018-04-04 15:45 ` [PATCH 8/6] commit: use generation numbers for in_merge_bases() Derrick Stolee
2018-04-04 15:48 ` Derrick Stolee
2018-04-04 17:01 ` Brandon Williams
2018-04-04 18:24 ` Jeff King
2018-04-04 18:53 ` Derrick Stolee
2018-04-04 18:59 ` Jeff King
2018-04-04 18:22 ` [PATCH 7/6] ref-filter: use generation number for --contains Jeff King
2018-04-04 19:06 ` Derrick Stolee
2018-04-04 19:16 ` Jeff King
2018-04-04 19:22 ` Derrick Stolee
2018-04-04 19:42 ` Jeff King
2018-04-04 19:45 ` Derrick Stolee
2018-04-04 19:46 ` Jeff King
2018-04-07 17:09 ` [PATCH 0/6] Compute and consume generation numbers Jakub Narebski
2018-04-07 16:55 ` Jakub Narebski
2018-04-08 1:06 ` Derrick Stolee
2018-04-11 19:32 ` Jakub Narebski
2018-04-11 19:58 ` Derrick Stolee [this message]
2018-04-14 16:52 ` Jakub Narebski
2018-04-21 20:44 ` Jakub Narebski
2018-04-23 13:54 ` Derrick Stolee
2018-04-09 16:41 ` [PATCH v2 00/10] " Derrick Stolee
2018-04-09 16:41 ` [PATCH v2 01/10] object.c: parse commit in graph first Derrick Stolee
2018-04-09 16:41 ` [PATCH v2 02/10] merge: check config before loading commits Derrick Stolee
2018-04-11 2:12 ` Junio C Hamano
2018-04-11 12:49 ` Derrick Stolee
2018-04-09 16:42 ` [PATCH v2 03/10] commit: add generation number to struct commmit Derrick Stolee
2018-04-09 17:59 ` Stefan Beller
2018-04-11 2:31 ` Junio C Hamano
2018-04-11 12:57 ` Derrick Stolee
2018-04-11 23:28 ` Junio C Hamano
2018-04-09 16:42 ` [PATCH v2 04/10] commit-graph: compute generation numbers Derrick Stolee
2018-04-11 2:51 ` Junio C Hamano
2018-04-11 13:02 ` Derrick Stolee
2018-04-11 18:49 ` Stefan Beller
2018-04-11 19:26 ` Eric Sunshine
2018-04-09 16:42 ` [PATCH v2 05/10] commit: use generations in paint_down_to_common() Derrick Stolee
2018-04-09 16:42 ` [PATCH v2 06/10] commit.c: use generation to halt paint walk Derrick Stolee
2018-04-11 3:02 ` Junio C Hamano
2018-04-11 13:24 ` Derrick Stolee
2018-04-09 16:42 ` [PATCH v2 07/10] commit-graph.txt: update future work Derrick Stolee
2018-04-12 9:12 ` Junio C Hamano
2018-04-12 11:35 ` Derrick Stolee
2018-04-13 9:53 ` Jakub Narebski
2018-04-09 16:42 ` [PATCH v2 08/10] ref-filter: use generation number for --contains Derrick Stolee
2018-04-09 16:42 ` [PATCH v2 09/10] commit: use generation numbers for in_merge_bases() Derrick Stolee
2018-04-09 16:42 ` [PATCH v2 10/10] commit: add short-circuit to paint_down_to_common() Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 0/9] Compute and consume generation numbers Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 1/9] commit: add generation number to struct commmit Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 2/9] commit-graph: compute generation numbers Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 3/9] commit: use generations in paint_down_to_common() Derrick Stolee
2018-04-18 14:31 ` Jakub Narebski
2018-04-18 14:46 ` Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 4/9] commit-graph.txt: update design document Derrick Stolee
2018-04-18 19:47 ` Jakub Narebski
2018-04-17 17:00 ` [PATCH v3 5/9] ref-filter: use generation number for --contains Derrick Stolee
2018-04-18 21:02 ` Jakub Narebski
2018-04-23 14:22 ` Derrick Stolee
2018-04-24 18:56 ` Jakub Narebski
2018-04-25 14:11 ` Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 6/9] commit: use generation numbers for in_merge_bases() Derrick Stolee
2018-04-18 22:15 ` Jakub Narebski
2018-04-23 14:31 ` Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 7/9] commit: add short-circuit to paint_down_to_common() Derrick Stolee
2018-04-18 23:19 ` Jakub Narebski
2018-04-23 14:40 ` Derrick Stolee
2018-04-23 21:38 ` Jakub Narebski
2018-04-24 12:31 ` Derrick Stolee
2018-04-19 8:32 ` Jakub Narebski
2018-04-17 17:00 ` [PATCH v3 8/9] commit-graph: always load commit-graph information Derrick Stolee
2018-04-17 17:50 ` Derrick Stolee
2018-04-19 0:02 ` Jakub Narebski
2018-04-23 14:49 ` Derrick Stolee
2018-04-17 17:00 ` [PATCH v3 9/9] merge: check config before loading commits Derrick Stolee
2018-04-19 0:04 ` [PATCH v3 0/9] Compute and consume generation numbers Jakub Narebski
2018-04-23 14:54 ` Derrick Stolee
2018-04-25 14:37 ` [PATCH v4 00/10] " Derrick Stolee
2018-04-25 14:37 ` [PATCH v4 01/10] ref-filter: fix outdated comment on in_commit_list Derrick Stolee
2018-04-28 17:54 ` Jakub Narebski
2018-04-25 14:37 ` [PATCH v4 02/10] commit: add generation number to struct commmit Derrick Stolee
2018-04-28 22:35 ` Jakub Narebski
2018-04-30 12:05 ` Derrick Stolee
2018-04-25 14:37 ` [PATCH v4 03/10] commit-graph: compute generation numbers Derrick Stolee
2018-04-26 2:35 ` Junio C Hamano
2018-04-26 12:58 ` Derrick Stolee
2018-04-26 13:49 ` Derrick Stolee
2018-04-29 9:08 ` Jakub Narebski
2018-05-01 12:10 ` Derrick Stolee
2018-05-02 16:15 ` Jakub Narebski
2018-04-25 14:37 ` [PATCH v4 04/10] commit: use generations in paint_down_to_common() Derrick Stolee
2018-04-26 3:22 ` Junio C Hamano
2018-04-26 9:02 ` Jakub Narebski
2018-04-28 14:38 ` Jakub Narebski
2018-04-29 15:40 ` Jakub Narebski
2018-04-25 14:37 ` [PATCH v4 05/10] commit-graph: always load commit-graph information Derrick Stolee
2018-04-29 22:14 ` Jakub Narebski
2018-05-01 12:19 ` Derrick Stolee
2018-04-29 22:18 ` Jakub Narebski
2018-04-25 14:37 ` [PATCH v4 06/10] ref-filter: use generation number for --contains Derrick Stolee
2018-04-30 16:34 ` Jakub Narebski
2018-04-25 14:37 ` [PATCH v4 07/10] commit: use generation numbers for in_merge_bases() Derrick Stolee
2018-04-30 17:05 ` Jakub Narebski
2018-04-25 14:38 ` [PATCH v4 08/10] commit: add short-circuit to paint_down_to_common() Derrick Stolee
2018-04-30 22:19 ` Jakub Narebski
2018-05-01 11:47 ` Derrick Stolee
2018-05-02 13:05 ` Jakub Narebski
2018-05-02 13:42 ` Derrick Stolee
2018-04-25 14:38 ` [PATCH v4 09/10] merge: check config before loading commits Derrick Stolee
2018-04-30 22:54 ` Jakub Narebski
2018-05-01 11:52 ` Derrick Stolee
2018-05-02 11:41 ` Jakub Narebski
2018-04-25 14:38 ` [PATCH v4 10/10] commit-graph.txt: update design document Derrick Stolee
2018-04-30 23:32 ` Jakub Narebski
2018-05-01 12:00 ` Derrick Stolee
2018-05-02 7:57 ` Jakub Narebski
2018-04-25 14:40 ` [PATCH v4 00/10] Compute and consume generation numbers Derrick Stolee
2018-04-28 17:28 ` Jakub Narebski
2018-05-01 12:47 ` [PATCH v5 00/11] " Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 01/11] ref-filter: fix outdated comment on in_commit_list Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 02/11] commit: add generation number to struct commmit Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 03/11] commit-graph: compute generation numbers Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 04/11] commit: use generations in paint_down_to_common() Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 05/11] commit-graph: always load commit-graph information Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 06/11] ref-filter: use generation number for --contains Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 07/11] commit: use generation numbers for in_merge_bases() Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 08/11] commit: add short-circuit to paint_down_to_common() Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 09/11] commit: use generation number in remove_redundant() Derrick Stolee
2018-05-01 15:37 ` Derrick Stolee
2018-05-03 18:45 ` Jakub Narebski
2018-05-01 12:47 ` [PATCH v5 10/11] merge: check config before loading commits Derrick Stolee
2018-05-01 12:47 ` [PATCH v5 11/11] commit-graph.txt: update design document Derrick Stolee
2018-05-03 11:18 ` [PATCH v5 00/11] Compute and consume generation numbers Jakub Narebski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=45320265-9fec-cee1-e82c-3ff719bb0435@gmail.com \
--to=stolee@gmail.com \
--cc=avarab@gmail.com \
--cc=dstolee@microsoft.com \
--cc=git@vger.kernel.org \
--cc=jnareb@gmail.com \
--cc=larsxschneider@gmail.com \
--cc=peff@peff.net \
--cc=sbeller@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).