git@vger.kernel.org mailing list mirror (one of many)
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: peff@peff.net, newren@gmail.com,
	Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH v2 1/3] commit-reach: implement get_reachable_subset
Date: Fri, 02 Nov 2018 06:14:45 -0700 (PDT)
Message-ID: <9e570603bda528986f425cae8a9c32ba700bc29b.1541164482.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.60.v2.git.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

The existing reachability algorithms in commit-reach.c focus on
finding merge-bases or determining if all commits in a set X can
reach at least one commit in a set Y. However, for two commit sets
X and Y, we may also care about which commits in Y are reachable
from at least one commit in X.

Implement get_reachable_subset() which answers this question. Given
two arrays of commits, 'from' and 'to', return a commit_list with
every commit from the 'to' array that is reachable from at least
one commit in the 'from' array.

The algorithm is a simple walk starting at the 'from' commits, using
the PARENT2 flag to indicate "this commit has already been added to
the walk queue". By marking the 'to' commits with the PARENT1 flag,
we can determine when we see a commit from the 'to' array. We remove
the PARENT1 flag as we add that commit to the result list to avoid
duplicates.
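
To illustrate the two-flag bookkeeping described above, here is a
minimal sketch on a toy graph. It is not the code in this patch: the
names toy_commit, TO_MARK, SEEN and toy_reachable_subset are
hypothetical, commits are array indices, and a plain stack replaces
the priority queue (the visiting order differs, the set of found
commits does not). It assumes the graph has at most TOY_MAX commits.

```c
#define TOY_MAX 16
#define TO_MARK (1u << 0) /* stands in for PARENT1: a 'to' commit not yet reported */
#define SEEN    (1u << 1) /* stands in for PARENT2: already added to the walk */

struct toy_commit {
	int parents[2];  /* indices into the graph array, -1 means "no parent" */
	unsigned flags;
};

/*
 * Write the indices of the 'to' commits reachable from 'from' into
 * 'out' (in discovery order) and return how many were found.
 */
static int toy_reachable_subset(struct toy_commit *g, int nr,
				const int *from, int nr_from,
				const int *to, int nr_to,
				int *out)
{
	int stack[TOY_MAX]; /* each commit is pushed at most once */
	int top = 0, found = 0, i, p;

	for (i = 0; i < nr_to; i++)
		g[to[i]].flags |= TO_MARK;

	for (i = 0; i < nr_from; i++) {
		if (!(g[from[i]].flags & SEEN)) {
			g[from[i]].flags |= SEEN;
			stack[top++] = from[i];
		}
	}

	while (top > 0) {
		int cur = stack[--top];

		if (g[cur].flags & TO_MARK) {
			/* drop the mark so a duplicate entry is not reported twice */
			g[cur].flags &= ~TO_MARK;
			out[found++] = cur;
		}

		for (p = 0; p < 2; p++) {
			int parent = g[cur].parents[p];

			if (parent < 0 || (g[parent].flags & SEEN))
				continue;
			g[parent].flags |= SEEN;
			stack[top++] = parent;
		}
	}

	/* toy cleanup: reset every flag in the graph */
	for (i = 0; i < nr; i++)
		g[i].flags = 0;

	return found;
}
```

Listing the same commit twice in 'to' yields it once in 'out',
because the TO_MARK bit is cleared the first time the walk sees it.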

The order of the resulting list is the reverse of the order in which
the commits are discovered in the walk.
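
The reversal follows from inserting each found commit at the head of
the list. A small sketch of that effect, using a hypothetical
toy_list type and helpers rather than git's struct commit_list:

```c
#include <stdlib.h>

struct toy_list {
	int item;
	struct toy_list *next;
};

/* Prepend 'item' to '*list', so the newest entry becomes the head. */
static struct toy_list *toy_list_insert(int item, struct toy_list **list)
{
	struct toy_list *node = malloc(sizeof(*node));

	if (!node)
		abort();
	node->item = item;
	node->next = *list;
	*list = node;
	return node;
}

/* Release every node of the list. */
static void toy_list_free(struct toy_list *list)
{
	while (list) {
		struct toy_list *next = list->next;

		free(list);
		list = next;
	}
}
```

Inserting 10, 20, 30 in that order gives a list that reads 30, 20,
10 from the head.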

There are a couple of shortcuts to avoid walking more than we need:

1. We determine the minimum generation number of commits in the
   'to' array. We do not walk commits with generation number
   below this minimum.

2. We count how many distinct commits are in the 'to' array, and
   decrement this count when we discover a 'to' commit during the
   walk. If this number reaches zero, then we can terminate the
   walk.
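
Both shortcuts can be sketched on a toy graph as follows. This is an
illustration under assumptions, not the patch itself: toy_commit,
TO_MARK, SEEN and toy_walk_with_shortcuts are hypothetical names, the
generation numbers are supplied by hand, a stack replaces the
priority queue, and the graph is assumed to hold at most TOY_MAX
commits. The return value is the number of commits taken off the
stack, which makes the savings visible.

```c
#include <limits.h>

#define TOY_MAX 16
#define TO_MARK (1u << 0) /* a 'to' commit not yet found */
#define SEEN    (1u << 1) /* already added to the walk */

struct toy_commit {
	int parents[2];      /* indices into the graph array, -1 means none */
	unsigned generation; /* larger than the generation of every parent */
	unsigned flags;
};

static int toy_walk_with_shortcuts(struct toy_commit *g, int nr,
				   const int *from, int nr_from,
				   const int *to, int nr_to,
				   int *nr_found)
{
	int stack[TOY_MAX];
	int top = 0, popped = 0, num_to_find = 0, i, p;
	unsigned min_generation = UINT_MAX;

	*nr_found = 0;

	for (i = 0; i < nr_to; i++) {
		struct toy_commit *c = &g[to[i]];

		if (c->generation < min_generation)
			min_generation = c->generation;
		if (!(c->flags & TO_MARK)) {
			c->flags |= TO_MARK;
			num_to_find++; /* counts distinct 'to' commits only */
		}
	}

	for (i = 0; i < nr_from; i++) {
		if (!(g[from[i]].flags & SEEN)) {
			g[from[i]].flags |= SEEN;
			stack[top++] = from[i];
		}
	}

	/* Shortcut 2: stop as soon as every 'to' commit has been found. */
	while (num_to_find && top > 0) {
		struct toy_commit *cur = &g[stack[--top]];

		popped++;
		if (cur->flags & TO_MARK) {
			cur->flags &= ~TO_MARK;
			(*nr_found)++;
			num_to_find--;
		}

		for (p = 0; p < 2; p++) {
			int parent = cur->parents[p];

			if (parent < 0)
				continue;
			/* Shortcut 1: too old to be, or to reach, a 'to' commit. */
			if (g[parent].generation < min_generation)
				continue;
			if (g[parent].flags & SEEN)
				continue;
			g[parent].flags |= SEEN;
			stack[top++] = parent;
		}
	}

	for (i = 0; i < nr; i++)
		g[i].flags = 0;

	return popped;
}
```

The generation cutoff is safe because every ancestor of a commit has
a smaller generation number, so a commit below the minimum can
neither be a 'to' commit nor lead to one.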

Tests will be added using the 'test-tool reach' helper in a
subsequent commit.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 commit-reach.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++
 commit-reach.h | 13 ++++++++++
 2 files changed, 83 insertions(+)

diff --git a/commit-reach.c b/commit-reach.c
index 9f79ce0a22..8ad5352752 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -688,3 +688,73 @@ int can_all_from_reach(struct commit_list *from, struct commit_list *to,
 	object_array_clear(&from_objs);
 	return result;
 }
+
+struct commit_list *get_reachable_subset(struct commit **from, int nr_from,
+					 struct commit **to, int nr_to,
+					 unsigned int reachable_flag)
+{
+	struct commit **item;
+	struct commit *current;
+	struct commit_list *found_commits = NULL;
+	struct commit **to_last = to + nr_to;
+	struct commit **from_last = from + nr_from;
+	uint32_t min_generation = GENERATION_NUMBER_INFINITY;
+	int num_to_find = 0;
+
+	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
+
+	for (item = to; item < to_last; item++) {
+		struct commit *c = *item;
+		
+		parse_commit(c);
+		if (c->generation < min_generation)
+			min_generation = c->generation;
+
+		if (!(c->object.flags & PARENT1)) {
+			c->object.flags |= PARENT1;
+			num_to_find++;
+		}
+	}
+
+	for (item = from; item < from_last; item++) {
+		struct commit *c = *item;
+		if (!(c->object.flags & PARENT2)) {
+			c->object.flags |= PARENT2;
+			parse_commit(c);
+
+			prio_queue_put(&queue, *item);
+		}
+	}
+
+	while (num_to_find && (current = prio_queue_get(&queue)) != NULL) {
+		struct commit_list *parents;
+
+		if (current->object.flags & PARENT1) {
+			current->object.flags &= ~PARENT1;
+			current->object.flags |= reachable_flag;
+			commit_list_insert(current, &found_commits);
+			num_to_find--;
+		}
+
+		for (parents = current->parents; parents; parents = parents->next) {
+			struct commit *p = parents->item;
+
+			parse_commit(p);
+
+			if (p->generation < min_generation)
+				continue;
+
+			if (p->object.flags & PARENT2)
+				continue;
+
+			p->object.flags |= PARENT2;
+			prio_queue_put(&queue, p);
+		}
+	}
+
+	clear_commit_marks_many(nr_to, to, PARENT1);
+	clear_commit_marks_many(nr_from, from, PARENT2);
+
+	return found_commits;
+}
+
diff --git a/commit-reach.h b/commit-reach.h
index 7d313e2975..bb34af0269 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -74,4 +74,17 @@ int can_all_from_reach_with_flag(struct object_array *from,
 int can_all_from_reach(struct commit_list *from, struct commit_list *to,
 		       int commit_date_cutoff);
 
+
+/*
+ * Return a list of commits containing the commits in the 'to' array
+ * that are reachable from at least one commit in the 'from' array.
+ * Also add the given 'reachable_flag' to each commit in the returned list.
+ *
+ * This method uses the PARENT1 and PARENT2 flags during its operation,
+ * so be sure these flags are not set before calling the method.
+ */
+struct commit_list *get_reachable_subset(struct commit **from, int nr_from,
+					 struct commit **to, int nr_to,
+					 unsigned int reachable_flag);
+
 #endif
-- 
gitgitgadget

