git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Derrick Stolee <dstolee@microsoft.com>
To: "git@vger.kernel.org" <git@vger.kernel.org>
Cc: "peff@peff.net" <peff@peff.net>,
	"sbeller@google.com" <sbeller@google.com>,
	"jnareb@gmail.com" <jnareb@gmail.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [RFC PATCH 13/13] commit-reach: use can_all_from_reach
Date: Fri, 29 Jun 2018 16:13:03 +0000	[thread overview]
Message-ID: <20180629161223.229661-14-dstolee@microsoft.com> (raw)
In-Reply-To: <20180629161223.229661-1-dstolee@microsoft.com>

The is_descendant_of method previously used in_merge_bases() to check if
the commit can reach any of the commits in the provided list. This had
two performance problems:

1. The performance is quadratic in worst-case.

2. A single in_merge_bases() call requires walking beyond the target
   commit in order to find the full set of boundary commits that may be
   merge-bases.

The can_all_from_reach method avoids this quadratic behavior and can
limit the search beyond the target commits using generation numbers. It
requires a small prototype adjustment to stop using commit-date as a
cutoff, as that optimization is no longer appropriate here.

Performance was meausured on a copy of the Linux repository using the
'test-tool reach is_descendant_of' command using this input:

A:v4.9
X:v4.10
X:v4.11
X:v4.12
X:v4.13
X:v4.14
X:v4.15
X:v4.16
X:v4.17
X.v3.0

Note that this input is tailored to demonstrate the quadratic nature of
the previous method, as it will compute merge-bases for v4.9 versus all
of the later versions before checking against v4.1.

Before: 0.31 s
 After: 0.27 s

Since we previously used the is_descendant_of method in the ref_newer
method, we also measured performance there using
'test-tool reach ref_newer':

Before: 0.12 s
 After: 0.11 s

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---

One thing I know is missing from this commit is a special-case to use
the old logic when there is no commit-graph present. The
can_all_from_reach() algorithm can be worse when we do not have good
generation number cutoffs. In the previous case of
can_all_from_reach_with_flags(), we already had an established pattern
of using commit date as a cutoff, so the generation number is only a
second cutoff and the algorithm cannot walk more commits than before.

 commit-reach.c        | 34 +++++++++++++++++-----------------
 commit-reach.h        |  3 ++-
 t/helper/test-reach.c |  2 +-
 3 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 249e9a4fac..a823d6965c 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -273,17 +273,15 @@ struct commit_list *get_merge_bases(struct commit *one, struct commit *two)
  */
 int is_descendant_of(struct commit *commit, struct commit_list *with_commit)
 {
+	struct commit_list *from_list = NULL;
+	int result;
 	if (!with_commit)
 		return 1;
-	while (with_commit) {
-		struct commit *other;
 
-		other = with_commit->item;
-		with_commit = with_commit->next;
-		if (in_merge_bases(other, commit))
-			return 1;
-	}
-	return 0;
+	commit_list_insert(commit, &from_list);
+	result = can_all_from_reach(from_list, with_commit, 0);
+	free_commit_list(from_list);
+	return result;
 }
 
 /*
@@ -605,10 +603,11 @@ int can_all_from_reach_with_flag(struct object_array *from,
 	return result;
 }
 
-int can_all_from_reach(struct commit_list *from, struct commit_list *to)
+int can_all_from_reach(struct commit_list *from, struct commit_list *to,
+		       int cutoff_by_min_date)
 {
 	struct object_array from_objs = OBJECT_ARRAY_INIT;
-	time_t min_commit_date = from->item->date;
+	time_t min_commit_date = cutoff_by_min_date ? from->item->date : 0;
 	struct commit_list *from_iter = from;
 	struct commit_list *to_iter = to;
 	int result;
@@ -617,20 +616,21 @@ int can_all_from_reach(struct commit_list *from, struct commit_list *to)
 	while (from_iter) {
 		add_object_array(&from_iter->item->object, NULL, &from_objs);
 
-		if (from_iter->item->date < min_commit_date)
+		if (!parse_commit(from_iter->item) &&
+		    from_iter->item->date < min_commit_date)
 			min_commit_date = from_iter->item->date;
 
 		from_iter = from_iter->next;
 	}
 
 	while (to_iter) {
-		parse_commit(to_iter->item);
-
-		if (to_iter->item->date < min_commit_date)
-			min_commit_date = to_iter->item->date;
+		if (!parse_commit(to_iter->item)) {
+			if (to_iter->item->date < min_commit_date)
+				min_commit_date = to_iter->item->date;
 
-		if (to_iter->item->generation < min_generation)
-			min_generation = to_iter->item->generation;
+			if (to_iter->item->generation < min_generation)
+				min_generation = to_iter->item->generation;
+		}
 
 		to_iter->item->object.flags |= PARENT2;
 
diff --git a/commit-reach.h b/commit-reach.h
index 3eb4c057e6..180f865d7d 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -31,6 +31,7 @@ int can_all_from_reach_with_flag(struct object_array *from, int with_flag,
 				 int assign_flag, time_t min_commit_date,
 				 uint32_t min_generation);
 
-int can_all_from_reach(struct commit_list *from, struct commit_list *to);
+int can_all_from_reach(struct commit_list *from, struct commit_list *to,
+		       int cutoff_by_min_date);
 
 #endif
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index 14aaef5bff..c05137c9f3 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -120,7 +120,7 @@ int cmd__reach(int ac, const char **av)
 			list = list->next;
 		}
 	} else if (!strcmp(av[1], "can_all_from_reach")) {
-		int result = can_all_from_reach(list_X, list_Y);
+		int result = can_all_from_reach(list_X, list_Y, 1);
 		printf("%s(X,Y):%d\n", av[1], result);
 	}
 
-- 
2.18.0.118.gd4f65b8d14


  parent reply	other threads:[~2018-06-29 16:13 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-29 16:12 [RFC PATCH 00/13] Consolidate reachability logic Derrick Stolee
2018-06-29 16:12 ` [RFC PATCH 01/13] commit-reach: move walk methods from commit.c Derrick Stolee
2018-06-29 21:35   ` Stefan Beller
2018-06-29 21:52   ` Junio C Hamano
2018-06-29 16:12 ` [RFC PATCH 02/13] commit-reach: move ref_newer from remote.c Derrick Stolee
2018-06-29 16:12 ` [RFC PATCH 03/13] commit-reach: move commit_contains from ref-filter Derrick Stolee
2018-06-29 21:38   ` Stefan Beller
2018-06-30  1:32     ` Derrick Stolee
2018-06-29 22:00   ` Junio C Hamano
2018-06-29 16:12 ` [RFC PATCH 04/13] upload-pack: make reachable() more generic Derrick Stolee
2018-06-29 22:05   ` Junio C Hamano
2018-06-29 16:12 ` [RFC PATCH 05/13] upload-pack: refactor ok_to_give_up() Derrick Stolee
2018-06-29 21:44   ` Stefan Beller
2018-06-29 16:12 ` [RFC PATCH 06/13] commit-reach: move can_all_from_reach_with_flag() Derrick Stolee
2018-06-29 21:47   ` Stefan Beller
2018-06-30  1:35     ` Derrick Stolee
2018-06-29 16:12 ` [RFC PATCH 07/13] test-reach Derrick Stolee
2018-06-29 21:54   ` Stefan Beller
2018-06-30  1:40     ` Derrick Stolee
2018-06-29 16:12 ` [RFC PATCH 08/13] test-reach: test reduce_heads() Derrick Stolee
2018-06-29 22:06   ` Stefan Beller
2018-06-29 16:12 ` [RFC PATCH 09/13] commit-reach: test can_all_from_reach Derrick Stolee
2018-06-29 16:12 ` [RFC PATCH 10/13] commit-reach: test is_descendant_of Derrick Stolee
2018-06-29 16:13 ` [RFC PATCH 11/13] commit-reach: make can_all_from_reach... linear Derrick Stolee
2018-06-29 23:18   ` Stefan Beller
2018-06-29 16:13 ` [RFC PATCH 12/13] commit-reach: use is_descendant_of for ref_newer Derrick Stolee
2018-06-29 16:13 ` Derrick Stolee [this message]
2018-06-29 23:21   ` [RFC PATCH 13/13] commit-reach: use can_all_from_reach Stefan Beller
2018-06-29 17:33 ` [RFC PATCH 00/13] Consolidate reachability logic Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180629161223.229661-14-dstolee@microsoft.com \
    --to=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    --cc=peff@peff.net \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).