From: Jonathan Tan <jonathantanmy@google.com>
To: gitgitgadget@gmail.com
Cc: git@vger.kernel.org, gitster@pobox.com, dstolee@microsoft.com,
	Jonathan Tan <jonathantanmy@google.com>
Subject: Re: [PATCH 15/16] commit-reach: make can_all_from_reach... linear
Date: Mon, 16 Jul 2018 18:16:00 -0700
Message-ID: <20180717011600.22362-1-jonathantanmy@google.com>
In-Reply-To: <816821eec9ba476ccdfbfdf6e3cdd3619743ea2e.1531746012.git.gitgitgadget@gmail.com>

> The first step includes using a depth-first-search (DFS) from each from
> commit, sorted by ascending generation number. We do not walk beyond the
> minimum generation number or the minimum commit date. This DFS is likely
> to be faster than the existing reachable() method because we expect
> previous ref values to be along the first-parent history.
> 
> If we find a target commit, then we mark everything in the DFS stack as
> a RESULT. This expands the set of targets for the other from commits. We
> also mark the visited commits using 'assign_flag' to prevent re-walking
> the same code.

Thanks for this - it was very helpful in understanding the code.

The function uses a DFS stack that holds only the trail from the
starting commit to the node currently being processed. That differs
from the variant I'm more familiar with, whose stack also holds the
not-yet-visited siblings of processed nodes. I'll annotate the function
with my thought process in the hope that it helps future reviewers.
(The diff as shown in the e-mail is hard to follow, so I'm reproducing
the resulting function itself, without the + and - markers.)

> int can_all_from_reach_with_flag(struct object_array *from,
> 				 int with_flag, int assign_flag,
> 				 time_t min_commit_date,
> 				 uint32_t min_generation)
> {
> 	struct commit **list = NULL;
> 	int i;
> 	int result = 1;
> 
> 	ALLOC_ARRAY(list, from->nr);
> 	for (i = 0; i < from->nr; i++) {
> 		list[i] = (struct commit *)from->objects[i].item;
> 
> 		parse_commit(list[i]);
> 
> 		if (list[i]->generation < min_generation)
> 			return 0;
> 	}
> 
> 	QSORT(list, from->nr, compare_commits_by_gen);
> 
> 	for (i = 0; i < from->nr; i++) {
> 		/* DFS from list[i] */
> 		struct commit_list *stack = NULL;
> 
> 		list[i]->object.flags |= assign_flag;
> 		commit_list_insert(list[i], &stack);
> 
> 		while (stack) {
> 			struct commit_list *parent;
> 
> 			if (stack->item->object.flags & with_flag) {
> 				pop_commit(&stack);
> 				continue;
> 			}

I wish that the code would refrain from pushing such an object instead
of popping it at the first opportunity, but I guess that doing so would
require the equivalent of a labeled break/continue. I have no qualms
with using "goto" in this case, but I know that some people don't like
it :-P

> 			for (parent = stack->item->parents; parent; parent = parent->next) {
> 				if (parent->item->object.flags & (with_flag | RESULT))
> 					stack->item->object.flags |= RESULT;

Straightforward, and also produces the bubbling up that we want. An
object is never popped unless it has the "with_flag" flag (see above) or
all its parents have been processed. The "if" statement can run several
times for the same object; the final run happens once all its parents
have been processed (and thus have the RESULT flag set if necessary),
so the object ends up with RESULT whenever any parent deserves it.

> 				if (!(parent->item->object.flags & assign_flag)) {
> 					parent->item->object.flags |= assign_flag;
> 
> 					parse_commit(parent->item);
> 
> 					if (parent->item->date < min_commit_date ||
> 					    parent->item->generation < min_generation)
> 						continue;
> 
> 					commit_list_insert(parent->item, &stack);
> 					break;
> 				}

If not yet processed, push it onto the stack and break. The child commit
is still left on the stack. The next time the child commit is processed
(in an iteration of the "while" loop), the "for" loop will iterate until
the next unprocessed parent.

In the DFS that I'm used to, all parents would be pushed here, but
perhaps the fact that the iteration is postorder confuses things.
Anyway, if someone comes up with a better algorithm, replacing it
shouldn't be too difficult - the algorithm is contained within this
function, and there are tests to check the correctness of the algorithm
update.
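
To make the difference between the two stack disciplines concrete, here
is a minimal sketch on a toy graph. Everything in it (toy_node,
toy_diamond, trail_snapshot(), sibling_snapshot(), the fixed limit of two
parents) is made up for illustration and is not part of the patch or of
git. Each function copies out the contents of its stack at the moment a
chosen target node is visited.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 16      /* hypothetical capacity of the toy graph */
#define TOY_NONE (-1)   /* marks an unused parent slot */

struct toy_node {
	int parents[2];     /* indices into the graph array, or TOY_NONE */
};

/* Diamond: 0 has parents 1 and 2; both 1 and 2 have parent 3. */
static const struct toy_node toy_diamond[4] = {
	{{1, 2}}, {{3, TOY_NONE}}, {{3, TOY_NONE}}, {{TOY_NONE, TOY_NONE}},
};

/*
 * Trail style, shaped like the loop under review: push one unseen parent
 * and break; pop only when no unseen parent is left. Copies the stack to
 * snap[] when `target` is on top and returns its size, or -1 if the
 * target is never reached.
 */
static int trail_snapshot(const struct toy_node *g, int start, int target,
			  int *snap)
{
	int stack[TOY_MAX], seen[TOY_MAX] = {0};
	int top = 0, k;

	seen[start] = 1;
	stack[top++] = start;
	while (top > 0) {
		int cur = stack[top - 1];

		if (cur == target) {
			memcpy(snap, stack, (size_t)top * sizeof(*stack));
			return top;
		}
		for (k = 0; k < 2; k++) {
			int p = g[cur].parents[k];
			if (p != TOY_NONE && !seen[p]) {
				seen[p] = 1;
				assert(top < TOY_MAX);
				stack[top++] = p;
				break;
			}
		}
		if (k == 2)     /* ran out of parents: done with cur */
			top--;
	}
	return -1;
}

/*
 * Sibling style: pop the node first, then push all of its unseen parents.
 * Copies the pending stack to snap[] when `target` is popped and returns
 * its size, or -1 if the target is never reached.
 */
static int sibling_snapshot(const struct toy_node *g, int start, int target,
			    int *snap)
{
	int stack[TOY_MAX], seen[TOY_MAX] = {0};
	int top = 0, k;

	seen[start] = 1;
	stack[top++] = start;
	while (top > 0) {
		int cur = stack[--top];

		if (cur == target) {
			memcpy(snap, stack, (size_t)top * sizeof(*stack));
			return top;
		}
		for (k = 1; k >= 0; k--) {  /* reverse, so parent 0 ends on top */
			int p = g[cur].parents[k];
			if (p != TOY_NONE && !seen[p]) {
				seen[p] = 1;
				assert(top < TOY_MAX);
				stack[top++] = p;
			}
		}
	}
	return -1;
}
```

On toy_diamond with start 0 and target 3, the trail version leaves the
path 0, 1, 3 on the stack, while the sibling version leaves only the
pending node 2. Only the first shape lets "everything on the stack" stand
for "everything that can reach the node on top".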

> 			}
> 
> 			if (!parent)
> 				pop_commit(&stack);

Only when we have no parents left are we completely done with the
current object.

> 		}
> 
> 		if (!(list[i]->object.flags & (with_flag | RESULT))) {
> 			result = 0;
> 			goto cleanup;
> 		}

And after the DFS, if the starting object has neither "with_flag" nor
RESULT set, we stop immediately and do not bother with the remaining
"want" objects.

> 	}
> 
> cleanup:
> 	for (i = 0; i < from->nr; i++) {
> 		clear_commit_marks(list[i], RESULT);
> 		clear_commit_marks(list[i], assign_flag);
> 	}
> 	return result;
> }
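
For readers who want to experiment with the flag handling in isolation,
the walk above can be modelled on a toy graph roughly as follows. This
is only a sketch under simplifying assumptions: toy_commit, the TOY_*
flags and toy_can_reach() are hypothetical stand-ins, each commit has at
most two parents, and the date and generation cutoffs and the sort by
generation are left out.

```c
#include <assert.h>

#define TOY_MAX 16              /* hypothetical capacity of the toy graph */
#define TOY_NONE (-1)           /* marks an unused parent slot */
#define TOY_WITH   (1u << 0)    /* stands in for with_flag */
#define TOY_ASSIGN (1u << 1)    /* stands in for assign_flag */
#define TOY_RESULT (1u << 2)    /* stands in for RESULT */

struct toy_commit {
	int parents[2];             /* indices into the graph, or TOY_NONE */
	unsigned flags;
};

/*
 * Returns 1 if `start` carries TOY_WITH or can reach a commit that does.
 * Flags are left set on purpose so that a later call on the same graph
 * reuses the work, as the "from" loop in the reviewed function does.
 */
static int toy_can_reach(struct toy_commit *g, int start)
{
	int stack[TOY_MAX];
	int top = 0, k;

	g[start].flags |= TOY_ASSIGN;
	stack[top++] = start;
	while (top > 0) {
		struct toy_commit *cur = &g[stack[top - 1]];

		if (cur->flags & TOY_WITH) {    /* a target: nothing to walk */
			top--;
			continue;
		}
		for (k = 0; k < 2; k++) {
			int p = cur->parents[k];

			if (p == TOY_NONE)
				continue;
			/* bubble up: a reaching parent makes cur reach too */
			if (g[p].flags & (TOY_WITH | TOY_RESULT))
				cur->flags |= TOY_RESULT;
			if (!(g[p].flags & TOY_ASSIGN)) {
				g[p].flags |= TOY_ASSIGN;
				assert(top < TOY_MAX);
				stack[top++] = p;       /* cur stays below p */
				break;
			}
		}
		if (k == 2)     /* every parent handled: done with cur */
			top--;
	}
	return !!(g[start].flags & (TOY_WITH | TOY_RESULT));
}
```

Calling it first on one tip and then on another tip that shares history
shows the reuse: the second call sees RESULT on the shared parent and
returns without walking that parent's ancestry again.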
