From: Jakub Narebski <jnareb@gmail.com>
To: Derrick Stolee <dstolee@microsoft.com>
Cc: "git\@vger.kernel.org" <git@vger.kernel.org>,
"gitster\@pobox.com" <gitster@pobox.com>,
"peff\@peff.net" <peff@peff.net>,
"avarab\@gmail.com" <avarab@gmail.com>
Subject: Re: [PATCH v4 03/10] commit-graph: compute generation numbers
Date: Sun, 29 Apr 2018 11:08:13 +0200
Message-ID: <86r2myidmq.fsf@gmail.com>
In-Reply-To: <20180425143735.240183-4-dstolee@microsoft.com> (Derrick Stolee's message of "Wed, 25 Apr 2018 14:37:56 +0000")
Derrick Stolee <dstolee@microsoft.com> writes:
> While preparing commits to be written into a commit-graph file, compute
> the generation numbers using a depth-first strategy.
Sidenote: for generation numbers it does not matter whether we use a
depth-first or a breadth-first strategy, but depth-first search is the
more natural choice because generation numbers need post-order processing
(parents before child).
>
> The only commits that are walked in this depth-first search are those
> without a precomputed generation number. Thus, computation time will be
> relative to the number of new commits to the commit-graph file.
A question: what happens if the existing commit graph is from an older
version of Git and has _ZERO for generation numbers?
Answer: I see that we treat both _INFINITY (not in commit-graph) and
_ZERO (in commit-graph but not computed) as generation numbers that have
not been computed. All right.
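The "not computed" rule described above can be written as a single predicate. This is only a sketch: struct toy_commit, the TOY_ constants and toy_generation_is_computed() are hypothetical names, and the two numeric values are assumptions chosen to mirror the _INFINITY and _ZERO constants named in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel values; the real constants live in Git's headers. */
#define TOY_GENERATION_INFINITY 0xFFFFFFFFu /* commit is not in the commit-graph */
#define TOY_GENERATION_ZERO 0u              /* in the graph, but number not computed */

/* Hypothetical stand-in for struct commit; only the field needed here. */
struct toy_commit {
	uint32_t generation;
};

/* Returns 1 only when the commit carries a usable generation number. */
int toy_generation_is_computed(const struct toy_commit *c)
{
	assert(c != NULL);
	return c->generation != TOY_GENERATION_INFINITY &&
	       c->generation != TOY_GENERATION_ZERO;
}
```

With such a helper, both the skip test in the outer loop and the per-parent test would read as a negation of the same condition.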
>
> If a computed generation number would exceed GENERATION_NUMBER_MAX, then
> use GENERATION_NUMBER_MAX instead.
All right, though I guess this will remain theoretical for a long
while.
We don't have any way of testing this, at least not without recompiling
Git with a lower value of GENERATION_NUMBER_MAX, which means it cannot be
tested automatically, can it?
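One possible way to exercise the clamp without a huge history would be to pass the cap as a parameter in a test-only helper, or to make it overridable at compile time. The sketch below is hypothetical: TOY_GENERATION_MAX and toy_next_generation() are illustrative names, not part of the patch, and the default value is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: the build may override the cap, for example with
 * -DTOY_GENERATION_MAX=3, so that a tiny history reaches it.
 */
#ifndef TOY_GENERATION_MAX
#define TOY_GENERATION_MAX 0x3FFFFFFFu
#endif

/*
 * Generation number for a commit whose largest parent generation is
 * max_parent_generation, saturating at cap instead of exceeding it.
 */
uint32_t toy_next_generation(uint32_t max_parent_generation, uint32_t cap)
{
	assert(cap >= 1);
	if (max_parent_generation >= cap)
		return cap;
	return max_parent_generation + 1;
}
```

Calling toy_next_generation() with a small cap shows the saturation directly; a linear history of four commits would already hit a cap of 3.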
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
> commit-graph.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 45 insertions(+)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 9ad21c3ffb..047fa9fca5 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -439,6 +439,9 @@ static void write_graph_chunk_data(struct hashfile *f, int hash_len,
>  		else
>  			packedDate[0] = 0;
>
> +		if ((*list)->generation != GENERATION_NUMBER_INFINITY)
> +			packedDate[0] |= htonl((*list)->generation << 2);
> +
If we stumble upon a commit marked as "not in commit-graph" while writing
the commit graph, it is a BUG(), isn't it?
(Problem noticed by Junio.)
It is a bit strange to me that the code uses get_be32 for reading, but
htonl for writing. Is Git tested on non-little-endian machines, like
big-endian ppc64 or s390x, or on mixed-endian machines (or
selectable-endian machines with data endianness set to something other
than little-endian, like ia64)? If not, could we use, for example, the
openSUSE Build Service (https://build.opensuse.org/) for this?
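On the htonl-versus-get_be32 point: htonl() converts to network byte order, which is big-endian, so bytes written that way should be readable by any byte-wise big-endian decoder regardless of host endianness. The sketch below illustrates the round trip; toy_read_be32() is a hypothetical stand-in written for this example, and Git's own get_be32() may be implemented differently.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical byte-wise big-endian reader, independent of host byte order. */
uint32_t toy_read_be32(const unsigned char *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Write value the way the patch does (htonl), then read it back byte-wise. */
uint32_t toy_htonl_roundtrip(uint32_t value)
{
	unsigned char buf[4];
	uint32_t wire = htonl(value);

	memcpy(buf, &wire, sizeof(wire));
	return toy_read_be32(buf);
}
```

The round trip returns the input on both little-endian and big-endian hosts, which is why mixing the two styles is consistent, if stylistically uneven.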
>  		packedDate[1] = htonl((*list)->date);
>  		hashwrite(f, packedDate, 8);
>
> @@ -571,6 +574,46 @@ static void close_reachable(struct packed_oid_list *oids)
>  	}
>  }
>
> +static void compute_generation_numbers(struct commit** commits,
> +				       int nr_commits)
> +{
> +	int i;
> +	struct commit_list *list = NULL;
All right, commit_list will work as a stack.
> +
> +	for (i = 0; i < nr_commits; i++) {
> +		if (commits[i]->generation != GENERATION_NUMBER_INFINITY &&
> +		    commits[i]->generation != GENERATION_NUMBER_ZERO)
> +			continue;
All right, we consider _INFINITY and _ZERO as not computed. If the
generation number is computed (by 'recursion' or from the commit graph), we
(re)use it. This means that generation number calculation is
incremental, as intended -- good.
> +
> +		commit_list_insert(commits[i], &list);
Start the depth-first walks from the commits given.
> +		while (list) {
> +			struct commit *current = list->item;
> +			struct commit_list *parent;
> +			int all_parents_computed = 1;
Here all_parents_computed is a boolean flag. I see that it is easier to
start with the assumption that all parents will have computed generation
numbers.
> +			uint32_t max_generation = 0;
The generation number value of 0 functions as a sentinel; generation
numbers start from 1. Not that it matters much, as the lowest possible
generation number is 1, and we could have started from that value.
> +
> +			for (parent = current->parents; parent; parent = parent->next) {
> +				if (parent->item->generation == GENERATION_NUMBER_INFINITY ||
> +				    parent->item->generation == GENERATION_NUMBER_ZERO) {
> +					all_parents_computed = 0;
> +					commit_list_insert(parent->item, &list);
> +					break;
If some parent doesn't have its generation number calculated, we add it to
the stack (and break out of the loop because it is a depth-first walk), and
mark this situation. All right.
> +				} else if (parent->item->generation > max_generation) {
> +					max_generation = parent->item->generation;
Otherwise, update max_generation. All right.
> +				}
> +			}
> +
> +			if (all_parents_computed) {
> +				current->generation = max_generation + 1;
> +				pop_commit(&list);
> +			}
> +
> +			if (current->generation > GENERATION_NUMBER_MAX)
> +				current->generation = GENERATION_NUMBER_MAX;
This conditional should be inside the all_parents_computed test, for example
like this:
+			if (all_parents_computed) {
+				current->generation = max_generation + 1;
+				if (current->generation > GENERATION_NUMBER_MAX)
+					current->generation = GENERATION_NUMBER_MAX;
+
+				pop_commit(&list);
+			}
(Noticed by Junio.)
Sidenote: when we revisit the commit, returning from the depth-first walk of
one of its parents, we calculate max_generation from scratch again.
This does not matter for performance, as it's just data access and
calculating a maximum; any workaround to avoid restarting those calculations
would take more time and memory. And it's simple.
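To make the walk easier to follow outside of Git, here is a self-contained sketch of the same iterative depth-first computation, with the clamp moved inside the "all parents computed" branch as suggested. Everything named dag_* is hypothetical: struct dag_commit replaces struct commit, a growable array replaces commit_list as the stack, the cap is a parameter rather than GENERATION_NUMBER_MAX, and the sentinel values are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed sentinel values; the names mirror the constants in the patch. */
#define DAG_GENERATION_INFINITY 0xFFFFFFFFu
#define DAG_GENERATION_ZERO 0u

/* Hypothetical stand-in for struct commit: parents held in a plain array. */
struct dag_commit {
	uint32_t generation;
	size_t nr_parents;
	struct dag_commit **parents;
};

static int dag_is_computed(const struct dag_commit *c)
{
	return c->generation != DAG_GENERATION_INFINITY &&
	       c->generation != DAG_GENERATION_ZERO;
}

/* Push onto the array used as the DFS stack; returns -1 if allocation fails. */
static int dag_push(struct dag_commit ***stack, size_t *depth, size_t *alloc,
		    struct dag_commit *c)
{
	if (*depth == *alloc) {
		size_t grown_alloc = *alloc ? *alloc * 2 : 16;
		struct dag_commit **grown =
			realloc(*stack, grown_alloc * sizeof(*grown));

		if (!grown)
			return -1;
		*stack = grown;
		*alloc = grown_alloc;
	}
	(*stack)[(*depth)++] = c;
	return 0;
}

/* Returns 0 on success, -1 on allocation failure. */
int dag_compute_generations(struct dag_commit **commits, size_t nr, uint32_t cap)
{
	struct dag_commit **stack = NULL;
	size_t depth = 0, alloc = 0, i, p;

	/* A cap equal to a sentinel would make clamped commits look uncomputed. */
	assert(cap > DAG_GENERATION_ZERO && cap < DAG_GENERATION_INFINITY);

	for (i = 0; i < nr; i++) {
		if (dag_is_computed(commits[i]))
			continue;
		if (dag_push(&stack, &depth, &alloc, commits[i]) < 0)
			goto fail;

		while (depth) {
			struct dag_commit *current = stack[depth - 1];
			struct dag_commit *pending = NULL;
			uint32_t max_generation = 0;

			for (p = 0; p < current->nr_parents; p++) {
				struct dag_commit *parent = current->parents[p];

				assert(parent != NULL);
				if (!dag_is_computed(parent)) {
					pending = parent;
					break;
				}
				if (parent->generation > max_generation)
					max_generation = parent->generation;
			}

			if (pending) {
				/* Visit the parent first; current stays on the stack. */
				if (dag_push(&stack, &depth, &alloc, pending) < 0)
					goto fail;
				continue;
			}

			/* All parents are done: assign, clamp, then pop. */
			current->generation = max_generation + 1;
			if (current->generation > cap)
				current->generation = cap;
			depth--;
		}
	}
	free(stack);
	return 0;
fail:
	free(stack);
	return -1;
}
```

For a diamond history (a root, two children, and a merge of the two), starting from the merge alone would be expected to assign 1 to the root, 2 to each child and 3 to the merge; with a cap of 2 the merge saturates at 2.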
> +		}
> +	}
> +}
> +
>  void write_commit_graph(const char *obj_dir,
>  			const char **pack_indexes,
>  			int nr_packs,
> @@ -694,6 +737,8 @@ void write_commit_graph(const char *obj_dir,
>  	if (commits.nr >= GRAPH_PARENT_MISSING)
>  		die(_("too many commits to write graph"));
>
> +	compute_generation_numbers(commits.list, commits.nr);
> +
Nice and simple. All right.
I guess that we pass "struct commit** commits.list" and "int commits.nr"
to compute_generation_numbers(), instead of "struct packed_commit_list
commits", to keep the function nice and generic?
>  	graph_name = get_commit_graph_filename(obj_dir);
>  	fd = hold_lock_file_for_update(&lk, graph_name, 0);
Best,
--
Jakub Narębski