git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] Teach git-rev-list --simplify-forks
@ 2020-04-27  5:07 Antonio Russo
  2020-04-27 10:55 ` Derrick Stolee
  2020-04-29  8:01 ` [PATCH v2] " Antonio Russo
  0 siblings, 2 replies; 17+ messages in thread
From: Antonio Russo @ 2020-04-27  5:07 UTC (permalink / raw)
  To: git-ml; +Cc: Junio C Hamano, Michael Haggerty, Derrick Stolee

When used with --graph, instead of displaying the full graph, display a
spanning subgraph produced by a depth-first search of the graph visiting
parents from left to right.  Edges to already visited commits are
discarded.  This process is repeated for every commit to be displayed.

This is valuable to reduce visual clutter when there are many merges
that were not rebased onto each other and the user is primarily
interested in the state of the branch being merged into.

Also adds documentation and tests of the above.

Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
---
 Documentation/rev-list-options.txt         |  8 +++
 revision.c                                 | 67 +++++++++++++++++++++-
 revision.h                                 |  6 ++
 t/t6016-rev-list-graph-simplify-history.sh | 50 ++++++++++++++++
 4 files changed, 130 insertions(+), 1 deletion(-)

Hello,

I'm trying to address a long-standing problem I've had visualizing complex
git repositories with git log --graph.  I think that a picture is worth a
thousand words, so I'll just show what the patch does for git master:

% git log --graph --abbrev-commit --pretty=oneline --simplify-forks e870325ee8

> * e870325ee8 (HEAD -> master, origin/master, origin/HEAD) The third batch
> *   a397e9c236 Merge branch 'jk/credential-parsing-end-of-host-in-URL'
> |\
> | * 4c5971e18a credential: treat "?" and "#" in URLs as end of host
> *   d6d561db1c Merge branch 'jt/rebase-allow-duplicate'
> |\
> | * 0fcb4f6b62 rebase --merge: optionally skip upstreamed commits
> *   c7d8f69da5 Merge branch 'en/rebase-no-keep-empty'
> |\
> | * 50ed76148a rebase: fix an incompatible-options error message
> | * b9cbd2958f rebase: reinstate --no-keep-empty
> | * 1b5735f75c rebase -i: mark commits that begin empty in todo editor
> *   8b39dfdf47 Merge branch 'js/mingw-is-hidden-test-fix'
> |\
> | * 176a66a748 t: restrict `is_hidden` to be called only on Windows
> | * 9814d0a4ad mingw: make test_path_is_hidden more robust
> | * 7c2dfca7e8 t: consolidate the `is_hidden` functions
> *   a41b41ca74 Merge branch 'js/mingw-isilon-nfs'
> |\
> | * 23eafd924a mingw: cope with the Isilon network file system
> *   33feaca6bf Merge branch 'js/flush-prompt-before-interative-input'
> |\
> | * 1f09aed834 interactive: explicitly `fflush` stdout before expecting input
> | * 08d383f23e interactive: refactor code asking the user for interactive input
> *   9af3a7cb4d Merge branch 'ds/revision-show-pulls'
> |\
> | * 8d049e182e revision: --show-pulls adds helpful merges
> *   82fa169d55 Merge branch 'ma/simplify-merge-config-parsing'
> |\
> | * 9881b451f3 merge: use skip_prefix to parse config key
> *   b3eb70e0f8 Merge branch 'js/mingw-fixes'

Compare this to before:

% git log --graph --abbrev-commit --pretty=oneline e870325ee8

> * e870325ee8 (HEAD -> master, origin/master, origin/HEAD) The third batch
> *   a397e9c236 Merge branch 'jk/credential-parsing-end-of-host-in-URL'
> |\
> | * 4c5971e18a credential: treat "?" and "#" in URLs as end of host
> * |   d6d561db1c Merge branch 'jt/rebase-allow-duplicate'
> |\ \
> | * | 0fcb4f6b62 rebase --merge: optionally skip upstreamed commits
> * | | c7d8f69da5 Merge branch 'en/rebase-no-keep-empty'
> |\| |
> | * | 50ed76148a rebase: fix an incompatible-options error message
> | * | b9cbd2958f rebase: reinstate --no-keep-empty
> | * | 1b5735f75c rebase -i: mark commits that begin empty in todo editor
> * | |   8b39dfdf47 Merge branch 'js/mingw-is-hidden-test-fix'
> |\ \ \
> | * | | 176a66a748 t: restrict `is_hidden` to be called only on Windows
> | * | | 9814d0a4ad mingw: make test_path_is_hidden more robust
> | * | | 7c2dfca7e8 t: consolidate the `is_hidden` functions
> * | | |   a41b41ca74 Merge branch 'js/mingw-isilon-nfs'
> |\ \ \ \
> | * | | | 23eafd924a mingw: cope with the Isilon network file system
> | |/ / /
> * | | |   33feaca6bf Merge branch 'js/flush-prompt-before-interative-input'
> |\ \ \ \
> | * | | | 1f09aed834 interactive: explicitly `fflush` stdout before expecting input
> | * | | | 08d383f23e interactive: refactor code asking the user for interactive input
> | |/ / /
> * | | |   9af3a7cb4d Merge branch 'ds/revision-show-pulls'
> |\ \ \ \
> | * | | | 8d049e182e revision: --show-pulls adds helpful merges
> | |/ / /
> * | | |   82fa169d55 Merge branch 'ma/simplify-merge-config-parsing'
> |\ \ \ \
> | * | | | 9881b451f3 merge: use skip_prefix to parse config key
> | |/ / /
> * | | |   b3eb70e0f8 Merge branch 'js/mingw-fixes'
> |\ \ \ \

Roughly, the tool produces a spanning tree by traversing the commit graph,
leftmost parents first. It forgets information about where forks occurred,
removing the "tails" that connect branches back to their original fork
point.  These "tails" can be very long and often do not contain information
that is very important for someone glancing through commits.

=== Details

It's about 40% faster.  Tested this on current master (of git):

git log --graph --pretty=oneline --simplify-forks e870325ee8 > /dev/null  0.83s user 0.06s system 99% cpu 0.886 total

git log --graph --pretty=oneline e870325ee8 > /dev/null  1.41s user 0.03s system 99% cpu 1.443 total

This is because the produced graph is much simpler---it is a tree in the
above case (though not in general).

The exact approach uses two flags (reusing bits 27 and 28 that are used
by the topological sort walk).  One says whether the commit has been ever
visited, and the other says whether the commit was visited from the current
root.  The walk does not continue down any commit that was visited, and
leaves the connection to the parent only if has not been visited from the
current root.  Each commit that is to be shown is treated as its own root
(most of these root commits are immediately skipped, because they are
likely visited in an earlier traversal).

=== Problems

Travis CI is showing issues with the tests I've added [1].  I cannot
reproduce them locally, neither with gcc 4.8.5 (RHEL) nor 9.3.0 (Debian).
They're also only failing on one of the runners.


Thank you for taking the time to look at this,
Antonio Russo


[1] https://travis-ci.org/github/aerusso/git
[2] https://github.com/aerusso/git/tree/mrs/simplify-history-forks

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 04ad7dd36e..cbac09028c 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -363,6 +363,14 @@ Default mode::
 	merges from the resulting history, as there are no selected
 	commits contributing to this merge.

+--simplify-forks::
+	Convert the commit graph into a spanning subgraph produced by a
+	depth-first-search of the history graph, searching the leftmost
+	parent first, and discarding edges to commits already visited.
+	Useful with `--graph` to visualize repositories with many merges
+	when you are interested in was added to master, and not when the
+	branch was last rebased.
+
 --ancestry-path::
 	When given a range of commits to display (e.g. 'commit1..commit2'
 	or 'commit2 {caret}commit1'), only display commits that exist
diff --git a/revision.c b/revision.c
index 5bc96444b6..06d9697306 100644
--- a/revision.c
+++ b/revision.c
@@ -2082,6 +2082,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->simplify_by_decoration = 1;
 		revs->limited = 1;
 		revs->prune = 1;
+	} else if (!strcmp(arg, "--simplify-forks")) {
+		revs->simplify_forks = 1;
 	} else if (!strcmp(arg, "--date-order")) {
 		revs->sort_order = REV_SORT_BY_COMMIT_DATE;
 		revs->topo_order = 1;
@@ -3095,6 +3097,63 @@ static void simplify_merges(struct rev_info *revs)
 	}
 }

+static void simplify_forks(struct rev_info *revs)
+{
+	struct commit_list *stack, *list_lr, *iter_list;
+	struct commit_list **parents;
+	struct commit *commit, *parent;
+
+	stack = NULL;
+	list_lr = NULL;
+
+	clear_object_flags(SIMP_FORK_VISITED);
+
+	for(iter_list = revs->commits; iter_list; iter_list = iter_list->next) {
+		/* process every commit to be displayed exactly once */
+		if(iter_list->item->object.flags & SIMP_FORK_VISITED)
+			continue;
+		clear_object_flags(SIMP_FORK_VISITING);
+		commit_list_insert(iter_list->item, &stack);
+		iter_list->item->object.flags |= SIMP_FORK_VISITED | SIMP_FORK_VISITING;
+		while(stack) {
+			commit = pop_commit(&stack);
+			/* process the parent nodes: removing links to
+			 * commits already visited (creating a spanning tree)
+			 */
+			parents = &(commit->parents);
+			while(*parents) {
+				parent = (*parents)->item;
+				if(parent->object.flags & SIMP_FORK_VISITING) {
+					/* We have already visited this commit, from the same root.
+					 * We do not explore it at all.
+					 */
+					pop_commit(parents);
+				} else if(parent->object.flags & SIMP_FORK_VISITED) {
+					/* We visited this commit before, but from a different root.
+					 * Leave it attached, but do not explore it further.
+					 */
+					parents = &((*parents)->next);
+				} else {
+					/* We have not visited this commit yet. Explore it, as usual.
+					 */
+					parent->object.flags |= SIMP_FORK_VISITED | SIMP_FORK_VISITING;
+					commit_list_insert(parent, &list_lr);
+					parents = &((*parents)->next);
+				}
+			}
+
+			/* feed the parents, right to left (reversed) onto the
+			 * stack to do a depth-first traversal of the commit graph
+			 */
+			while(list_lr) {
+				commit_list_insert(pop_commit(&list_lr), &stack);
+			}
+		}
+	}
+
+	clear_object_flags(SIMP_FORK_VISITED | SIMP_FORK_VISITING);
+}
+
 static void set_children(struct rev_info *revs)
 {
 	struct commit_list *l;
@@ -3392,10 +3451,16 @@ int prepare_revision_walk(struct rev_info *revs)
 	if (revs->limited) {
 		if (limit_list(revs) < 0)
 			return -1;
+		if (revs->simplify_forks)
+			simplify_forks(revs);
 		if (revs->topo_order)
 			sort_in_topological_order(&revs->commits, revs->sort_order);
-	} else if (revs->topo_order)
+	} else if (revs->topo_order) {
+		if (revs->simplify_forks)
+			simplify_forks(revs);
 		init_topo_walk(revs);
+	} else if (revs->simplify_forks)
+		simplify_forks(revs);
 	if (revs->line_level_traverse)
 		line_log_filter(revs);
 	if (revs->simplify_merges)
diff --git a/revision.h b/revision.h
index c1af164b30..f1abdb26b0 100644
--- a/revision.h
+++ b/revision.h
@@ -51,6 +51,11 @@
 #define TOPO_WALK_EXPLORED	(1u<<27)
 #define TOPO_WALK_INDEGREE	(1u<<28)

+/* Re-use the TOPO_WALK flagspace for simplify_forks
+ */
+#define SIMP_FORK_VISITED	(1u<<27)
+#define SIMP_FORK_VISITING	(1u<<28)
+
 #define DECORATE_SHORT_REFS	1
 #define DECORATE_FULL_REFS	2

@@ -132,6 +137,7 @@ struct rev_info {
 			no_walk:2,
 			remove_empty_trees:1,
 			simplify_history:1,
+			simplify_forks:1,
 			show_pulls:1,
 			topo_order:1,
 			simplify_merges:1,
diff --git a/t/t6016-rev-list-graph-simplify-history.sh b/t/t6016-rev-list-graph-simplify-history.sh
index f5e6e92f5b..d99214b6df 100755
--- a/t/t6016-rev-list-graph-simplify-history.sh
+++ b/t/t6016-rev-list-graph-simplify-history.sh
@@ -85,6 +85,28 @@ test_expect_success '--graph --all' '
 	test_cmp expected actual
 	'

+# Make sure that simplify_histpry_forks produces a spanning tree
+test_expect_success '--graph --simplify-forks --all' '
+	rm -f expected &&
+	echo "* $A7" >> expected &&
+	echo "*   $A6" >> expected &&
+	echo "|\  " >> expected &&
+	echo "| * $C4" >> expected &&
+	echo "| * $C3" >> expected &&
+	echo "* $A5" >> expected &&
+	echo "*-.   $A4" >> expected &&
+	echo "|\ \  " >> expected &&
+	echo "| | * $C2" >> expected &&
+	echo "| | * $C1" >> expected &&
+	echo "| * $B2" >> expected &&
+	echo "| * $B1" >> expected &&
+	echo "* $A3" >> expected &&
+	echo "* $A2" >> expected &&
+	echo "* $A1" >> expected &&
+	git rev-list --graph --simplify-forks --all > actual &&
+	test_cmp expected actual
+	'
+
 # Make sure the graph_is_interesting() code still realizes
 # that undecorated merges are interesting, even with --simplify-by-decoration
 test_expect_success '--graph --simplify-by-decoration' '
@@ -157,6 +179,20 @@ test_expect_success '--graph --full-history -- bar.txt' '
 	test_cmp expected actual
 	'

+test_expect_success '--graph --simplify-forks --full-history -- bar.txt' '
+	rm -f expected &&
+	echo "* $A7" >> expected &&
+	echo "*   $A6" >> expected &&
+	echo "|\\  " >> expected &&
+	echo "| * $C4" >> expected &&
+	echo "* $A5" >> expected &&
+	echo "* $A4" >> expected &&
+	echo "* $A3" >> expected &&
+	echo "* $A2" >> expected &&
+	git rev-list --graph --simplify-forks --full-history --all -- bar.txt > actual &&
+	test_cmp expected actual
+	'
+
 test_expect_success '--graph --full-history --simplify-merges -- bar.txt' '
 	rm -f expected &&
 	echo "* $A7" >> expected &&
@@ -172,6 +208,20 @@ test_expect_success '--graph --full-history --simplify-merges -- bar.txt' '
 	test_cmp expected actual
 	'

+test_expect_success '--graph --simplify-forks --full-history --simplify-merges -- bar.txt' '
+	rm -f expected &&
+	echo "* $A7" >> expected &&
+	echo "*   $A6" >> expected &&
+	echo "|\\  " >> expected &&
+	echo "| * $C4" >> expected &&
+	echo "* $A5" >> expected &&
+	echo "* $A3" >> expected &&
+	echo "* $A2" >> expected &&
+	git rev-list --graph --simplify-forks --full-history --simplify-merges --all \
+		-- bar.txt > actual &&
+	test_cmp expected actual
+	'
+
 test_expect_success '--graph -- bar.txt' '
 	rm -f expected &&
 	echo "* $A7" >> expected &&
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] Teach git-rev-list --simplify-forks
  2020-04-27  5:07 [PATCH] Teach git-rev-list --simplify-forks Antonio Russo
@ 2020-04-27 10:55 ` Derrick Stolee
  2020-04-28 12:49   ` Antonio Russo
  2020-04-29  8:01 ` [PATCH v2] " Antonio Russo
  1 sibling, 1 reply; 17+ messages in thread
From: Derrick Stolee @ 2020-04-27 10:55 UTC (permalink / raw)
  To: Antonio Russo, git-ml; +Cc: Junio C Hamano, Michael Haggerty, Derrick Stolee

On 4/27/2020 1:07 AM, Antonio Russo wrote:
> === Problems
> 
> Travis CI is showing issues with the tests I've added [1].  I cannot
> reproduce them locally, neither with gcc 4.8.5 (RHEL) nor 9.3.0 (Debian).
> They're also only failing on one of the runners.

This is probably because the tests run a second round with GIT_TEST_COMMIT_GRAPH=1, which enables the commit-graph feature. This triggers a different set of logic for the topo-order, which ignores the logic the way you inserted it here.

Thanks,
-Stolee



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Teach git-rev-list --simplify-forks
  2020-04-27 10:55 ` Derrick Stolee
@ 2020-04-28 12:49   ` Antonio Russo
  2020-04-28 14:11     ` Derrick Stolee
  0 siblings, 1 reply; 17+ messages in thread
From: Antonio Russo @ 2020-04-28 12:49 UTC (permalink / raw)
  To: Derrick Stolee, git-ml

On 4/27/20 4:55 AM, Derrick Stolee wrote:
> 
> This is probably because the tests run a second round with GIT_TEST_COMMIT_GRAPH=1, which enables the commit-graph feature. This triggers a different set of logic for the topo-order, which ignores the logic the way you inserted it here.
> 

Thank you for pointing me at this.  If I now understand correctly, the commit information
is not yet necessarily loaded for all commits at this point, and therefore the logic here
will need to be called later on (and makes it more complicated).

Am I correct that this loading of parents happens during traverse_commit_list_filtered
(for the case of rev-list)?  Also, am I correct that there are not yet any hooks to
filter out edges (of the graph of commits)?

Thank you,
Antonio

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Teach git-rev-list --simplify-forks
  2020-04-28 12:49   ` Antonio Russo
@ 2020-04-28 14:11     ` Derrick Stolee
  0 siblings, 0 replies; 17+ messages in thread
From: Derrick Stolee @ 2020-04-28 14:11 UTC (permalink / raw)
  To: Antonio Russo, git-ml

On 4/28/2020 8:49 AM, Antonio Russo wrote:
> On 4/27/20 4:55 AM, Derrick Stolee wrote:
>>
>> This is probably because the tests run a second round with GIT_TEST_COMMIT_GRAPH=1, which enables the commit-graph feature. This triggers a different set of logic for the topo-order, which ignores the logic the way you inserted it here.
>>
> 
> Thank you for pointing me at this.  If I now understand correctly, the commit information
> is not yet necessarily loaded for all commits at this point, and therefore the logic here
> will need to be called later on (and makes it more complicated).
> 
> Am I correct that this loading of parents happens during traverse_commit_list_filtered
> (for the case of rev-list)?  Also, am I correct that there are not yet any hooks to
> filter out edges (of the graph of commits)?

I mean that if you have a commit-graph, then a completely disjoint set of code
is run instead of the code you modified. See [1] for the blocks of code that
are run instead.

[1] https://github.com/git/git/commit/b45424181e9e8b2284a48c6db7b8db635bbfccc8

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2] Teach git-rev-list --simplify-forks
  2020-04-27  5:07 [PATCH] Teach git-rev-list --simplify-forks Antonio Russo
  2020-04-27 10:55 ` Derrick Stolee
@ 2020-04-29  8:01 ` Antonio Russo
  2020-04-29 10:08   ` Philip Oakley
                     ` (4 more replies)
  1 sibling, 5 replies; 17+ messages in thread
From: Antonio Russo @ 2020-04-29  8:01 UTC (permalink / raw)
  To: git-ml; +Cc: Junio C Hamano, Michael Haggerty, Derrick Stolee

When used with --graph, instead of displaying the full graph, display a
spanning subgraph produced by a depth-first search of the graph visiting
parents from left to right.  Edges to already visited commits are
discarded.  This process is repeated for every commit to be displayed.

This is valuable to reduce visual clutter when there are many merges
that were not rebased onto each other and the user is primarily
interested in the state of the branch being merged into.

Also adds documentation and tests of the above.

Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
---
 Documentation/rev-list-options.txt         |  8 +++
 revision.c                                 | 62 ++++++++++++++++++++++
 revision.h                                 |  6 +++
 t/t6016-rev-list-graph-simplify-history.sh | 50 +++++++++++++++++
 4 files changed, 126 insertions(+)

Hello,

This second revision of the patch sets revs->limited.  This forces the
graph of commits to be loaded, and simplfiy_forks() therefore reliably
traverses it.  This addresses the test failures mentioned before (see [1]).

Antonio

[1] https://travis-ci.org/github/aerusso/git/builds/680894920

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 04ad7dd36e..cbac09028c 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -363,6 +363,14 @@ Default mode::
 	merges from the resulting history, as there are no selected
 	commits contributing to this merge.

+--simplify-forks::
+	Convert the commit graph into a spanning subgraph produced by a
+	depth-first-search of the history graph, searching the leftmost
+	parent first, and discarding edges to commits already visited.
+	Useful with `--graph` to visualize repositories with many merges
+	when you are interested in was added to master, and not when the
+	branch was last rebased.
+
 --ancestry-path::
 	When given a range of commits to display (e.g. 'commit1..commit2'
 	or 'commit2 {caret}commit1'), only display commits that exist
diff --git a/revision.c b/revision.c
index 5bc96444b6..51dbe21847 100644
--- a/revision.c
+++ b/revision.c
@@ -2082,6 +2082,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->simplify_by_decoration = 1;
 		revs->limited = 1;
 		revs->prune = 1;
+	} else if (!strcmp(arg, "--simplify-forks")) {
+		revs->limited = 1;
+		revs->simplify_forks = 1;
 	} else if (!strcmp(arg, "--date-order")) {
 		revs->sort_order = REV_SORT_BY_COMMIT_DATE;
 		revs->topo_order = 1;
@@ -3095,6 +3098,63 @@ static void simplify_merges(struct rev_info *revs)
 	}
 }

+static void simplify_forks(struct rev_info *revs)
+{
+	struct commit_list *stack, *list_lr, *iter_list;
+	struct commit_list **parents;
+	struct commit *commit, *parent;
+
+	stack = NULL;
+	list_lr = NULL;
+
+	clear_object_flags(SIMP_FORK_VISITED);
+
+	for(iter_list = revs->commits; iter_list; iter_list = iter_list->next) {
+		/* process every commit to be displayed exactly once */
+		if(iter_list->item->object.flags & SIMP_FORK_VISITED)
+			continue;
+		clear_object_flags(SIMP_FORK_VISITING);
+		commit_list_insert(iter_list->item, &stack);
+		iter_list->item->object.flags |= SIMP_FORK_VISITED | SIMP_FORK_VISITING;
+		while(stack) {
+			commit = pop_commit(&stack);
+			/* process the parent nodes: removing links to
+			 * commits already visited (creating a spanning tree)
+			 */
+			parents = &(commit->parents);
+			while(*parents) {
+				parent = (*parents)->item;
+				if(parent->object.flags & SIMP_FORK_VISITING) {
+					/* We have already visited this commit, from the same root.
+					 * We do not explore it at all.
+					 */
+					pop_commit(parents);
+				} else if(parent->object.flags & SIMP_FORK_VISITED) {
+					/* We visited this commit before, but from a different root.
+					 * Leave it attached, but do not explore it further.
+					 */
+					parents = &((*parents)->next);
+				} else {
+					/* We have not visited this commit yet. Explore it, as usual.
+					 */
+					parent->object.flags |= SIMP_FORK_VISITED | SIMP_FORK_VISITING;
+					commit_list_insert(parent, &list_lr);
+					parents = &((*parents)->next);
+				}
+			}
+
+			/* feed the parents, right to left (reversed) onto the
+			 * stack to do a depth-first traversal of the commit graph
+			 */
+			while(list_lr) {
+				commit_list_insert(pop_commit(&list_lr), &stack);
+			}
+		}
+	}
+
+	clear_object_flags(SIMP_FORK_VISITED | SIMP_FORK_VISITING);
+}
+
 static void set_children(struct rev_info *revs)
 {
 	struct commit_list *l;
@@ -3392,6 +3452,8 @@ int prepare_revision_walk(struct rev_info *revs)
 	if (revs->limited) {
 		if (limit_list(revs) < 0)
 			return -1;
+		if (revs->simplify_forks)
+			simplify_forks(revs);
 		if (revs->topo_order)
 			sort_in_topological_order(&revs->commits, revs->sort_order);
 	} else if (revs->topo_order)
diff --git a/revision.h b/revision.h
index c1af164b30..f1abdb26b0 100644
--- a/revision.h
+++ b/revision.h
@@ -51,6 +51,11 @@
 #define TOPO_WALK_EXPLORED	(1u<<27)
 #define TOPO_WALK_INDEGREE	(1u<<28)

+/* Re-use the TOPO_WALK flagspace for simplify_forks
+ */
+#define SIMP_FORK_VISITED	(1u<<27)
+#define SIMP_FORK_VISITING	(1u<<28)
+
 #define DECORATE_SHORT_REFS	1
 #define DECORATE_FULL_REFS	2

@@ -132,6 +137,7 @@ struct rev_info {
 			no_walk:2,
 			remove_empty_trees:1,
 			simplify_history:1,
+			simplify_forks:1,
 			show_pulls:1,
 			topo_order:1,
 			simplify_merges:1,
diff --git a/t/t6016-rev-list-graph-simplify-history.sh b/t/t6016-rev-list-graph-simplify-history.sh
index f5e6e92f5b..d99214b6df 100755
--- a/t/t6016-rev-list-graph-simplify-history.sh
+++ b/t/t6016-rev-list-graph-simplify-history.sh
@@ -85,6 +85,28 @@ test_expect_success '--graph --all' '
 	test_cmp expected actual
 	'

+# Make sure that simplify_histpry_forks produces a spanning tree
+test_expect_success '--graph --simplify-forks --all' '
+	rm -f expected &&
+	echo "* $A7" >> expected &&
+	echo "*   $A6" >> expected &&
+	echo "|\  " >> expected &&
+	echo "| * $C4" >> expected &&
+	echo "| * $C3" >> expected &&
+	echo "* $A5" >> expected &&
+	echo "*-.   $A4" >> expected &&
+	echo "|\ \  " >> expected &&
+	echo "| | * $C2" >> expected &&
+	echo "| | * $C1" >> expected &&
+	echo "| * $B2" >> expected &&
+	echo "| * $B1" >> expected &&
+	echo "* $A3" >> expected &&
+	echo "* $A2" >> expected &&
+	echo "* $A1" >> expected &&
+	git rev-list --graph --simplify-forks --all > actual &&
+	test_cmp expected actual
+	'
+
 # Make sure the graph_is_interesting() code still realizes
 # that undecorated merges are interesting, even with --simplify-by-decoration
 test_expect_success '--graph --simplify-by-decoration' '
@@ -157,6 +179,20 @@ test_expect_success '--graph --full-history -- bar.txt' '
 	test_cmp expected actual
 	'

+test_expect_success '--graph --simplify-forks --full-history -- bar.txt' '
+	rm -f expected &&
+	echo "* $A7" >> expected &&
+	echo "*   $A6" >> expected &&
+	echo "|\\  " >> expected &&
+	echo "| * $C4" >> expected &&
+	echo "* $A5" >> expected &&
+	echo "* $A4" >> expected &&
+	echo "* $A3" >> expected &&
+	echo "* $A2" >> expected &&
+	git rev-list --graph --simplify-forks --full-history --all -- bar.txt > actual &&
+	test_cmp expected actual
+	'
+
 test_expect_success '--graph --full-history --simplify-merges -- bar.txt' '
 	rm -f expected &&
 	echo "* $A7" >> expected &&
@@ -172,6 +208,20 @@ test_expect_success '--graph --full-history --simplify-merges -- bar.txt' '
 	test_cmp expected actual
 	'

+test_expect_success '--graph --simplify-forks --full-history --simplify-merges -- bar.txt' '
+	rm -f expected &&
+	echo "* $A7" >> expected &&
+	echo "*   $A6" >> expected &&
+	echo "|\\  " >> expected &&
+	echo "| * $C4" >> expected &&
+	echo "* $A5" >> expected &&
+	echo "* $A3" >> expected &&
+	echo "* $A2" >> expected &&
+	git rev-list --graph --simplify-forks --full-history --simplify-merges --all \
+		-- bar.txt > actual &&
+	test_cmp expected actual
+	'
+
 test_expect_success '--graph -- bar.txt' '
 	rm -f expected &&
 	echo "* $A7" >> expected &&
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] Teach git-rev-list --simplify-forks
  2020-04-29  8:01 ` [PATCH v2] " Antonio Russo
@ 2020-04-29 10:08   ` Philip Oakley
  2020-04-29 13:04     ` Antonio Russo
  2020-04-29 13:12   ` Derrick Stolee
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Philip Oakley @ 2020-04-29 10:08 UTC (permalink / raw)
  To: Antonio Russo, git-ml; +Cc: Junio C Hamano, Michael Haggerty, Derrick Stolee

Hi Antonio,

On 29/04/2020 09:01, Antonio Russo wrote:
> When used with --graph, instead of displaying the full graph, display a
> spanning subgraph produced by a depth-first search of the graph visiting
> parents from left to right.  Edges to already visited commits are
> discarded.  This process is repeated for every commit to be displayed.
>
> This is valuable to reduce visual clutter when there are many merges
> that were not rebased onto each other and the user is primarily
> interested in the state of the branch being merged into.
>
> Also adds documentation and tests of the above.
>
> Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
> ---
>  Documentation/rev-list-options.txt         |  8 +++
>  revision.c                                 | 62 ++++++++++++++++++++++
>  revision.h                                 |  6 +++
>  t/t6016-rev-list-graph-simplify-history.sh | 50 +++++++++++++++++
>  4 files changed, 126 insertions(+)
>
> Hello,
>
> This second revision of the patch sets revs->limited.  This forces the
> graph of commits to be loaded, and simplfiy_forks() therefore reliably
> traverses it.  This addresses the test failures mentioned before (see [1]).
>
> Antonio
>
> [1] https://travis-ci.org/github/aerusso/git/builds/680894920
>
> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index 04ad7dd36e..cbac09028c 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -363,6 +363,14 @@ Default mode::
>  	merges from the resulting history, as there are no selected
>  	commits contributing to this merge.
>
> +--simplify-forks::
> +	Convert the commit graph into a spanning subgraph produced by a
> +	depth-first-search of the history graph, searching the leftmost
> +	parent first, and discarding edges to commits already visited.
> +	Useful with `--graph` to visualize repositories with many merges
> +	when you are interested in was added to master, and not when the
> +	branch was last rebased.

Does this also need to actually mention that it effectively discard
edges to fork points, as per the option name?. No rebasing required for
it to be useful.
     s/when/where/
    "...and not where the branch was last rebased or forked from." - not
great but actually mentions the option.

--
Philip
> +
>  --ancestry-path::
>  	When given a range of commits to display (e.g. 'commit1..commit2'
>  	or 'commit2 {caret}commit1'), only display commits that exist
> diff --git a/revision.c b/revision.c
> index 5bc96444b6..51dbe21847 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -2082,6 +2082,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>  		revs->simplify_by_decoration = 1;
>  		revs->limited = 1;
>  		revs->prune = 1;
> +	} else if (!strcmp(arg, "--simplify-forks")) {
> +		revs->limited = 1;
> +		revs->simplify_forks = 1;
>  	} else if (!strcmp(arg, "--date-order")) {
>  		revs->sort_order = REV_SORT_BY_COMMIT_DATE;
>  		revs->topo_order = 1;
> @@ -3095,6 +3098,63 @@ static void simplify_merges(struct rev_info *revs)
>  	}
>  }
>
> +static void simplify_forks(struct rev_info *revs)
> +{
> +	struct commit_list *stack, *list_lr, *iter_list;
> +	struct commit_list **parents;
> +	struct commit *commit, *parent;
> +
> +	stack = NULL;
> +	list_lr = NULL;
> +
> +	clear_object_flags(SIMP_FORK_VISITED);
> +
> +	for(iter_list = revs->commits; iter_list; iter_list = iter_list->next) {
> +		/* process every commit to be displayed exactly once */
> +		if(iter_list->item->object.flags & SIMP_FORK_VISITED)
> +			continue;
> +		clear_object_flags(SIMP_FORK_VISITING);
> +		commit_list_insert(iter_list->item, &stack);
> +		iter_list->item->object.flags |= SIMP_FORK_VISITED | SIMP_FORK_VISITING;
> +		while(stack) {
> +			commit = pop_commit(&stack);
> +			/* process the parent nodes: removing links to
> +			 * commits already visited (creating a spanning tree)
> +			 */
> +			parents = &(commit->parents);
> +			while(*parents) {
> +				parent = (*parents)->item;
> +				if(parent->object.flags & SIMP_FORK_VISITING) {
> +					/* We have already visited this commit, from the same root.
> +					 * We do not explore it at all.
> +					 */
> +					pop_commit(parents);
> +				} else if(parent->object.flags & SIMP_FORK_VISITED) {
> +					/* We visited this commit before, but from a different root.
> +					 * Leave it attached, but do not explore it further.
> +					 */
> +					parents = &((*parents)->next);
> +				} else {
> +					/* We have not visited this commit yet. Explore it, as usual.
> +					 */
> +					parent->object.flags |= SIMP_FORK_VISITED | SIMP_FORK_VISITING;
> +					commit_list_insert(parent, &list_lr);
> +					parents = &((*parents)->next);
> +				}
> +			}
> +
> +			/* feed the parents, right to left (reversed) onto the
> +			 * stack to do a depth-first traversal of the commit graph
> +			 */
> +			while(list_lr) {
> +				commit_list_insert(pop_commit(&list_lr), &stack);
> +			}
> +		}
> +	}
> +
> +	clear_object_flags(SIMP_FORK_VISITED | SIMP_FORK_VISITING);
> +}
> +
>  static void set_children(struct rev_info *revs)
>  {
>  	struct commit_list *l;
> @@ -3392,6 +3452,8 @@ int prepare_revision_walk(struct rev_info *revs)
>  	if (revs->limited) {
>  		if (limit_list(revs) < 0)
>  			return -1;
> +		if (revs->simplify_forks)
> +			simplify_forks(revs);
>  		if (revs->topo_order)
>  			sort_in_topological_order(&revs->commits, revs->sort_order);
>  	} else if (revs->topo_order)
> diff --git a/revision.h b/revision.h
> index c1af164b30..f1abdb26b0 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -51,6 +51,11 @@
>  #define TOPO_WALK_EXPLORED	(1u<<27)
>  #define TOPO_WALK_INDEGREE	(1u<<28)
>
> +/* Re-use the TOPO_WALK flagspace for simplify_forks
> + */
> +#define SIMP_FORK_VISITED	(1u<<27)
> +#define SIMP_FORK_VISITING	(1u<<28)
> +
>  #define DECORATE_SHORT_REFS	1
>  #define DECORATE_FULL_REFS	2
>
> @@ -132,6 +137,7 @@ struct rev_info {
>  			no_walk:2,
>  			remove_empty_trees:1,
>  			simplify_history:1,
> +			simplify_forks:1,
>  			show_pulls:1,
>  			topo_order:1,
>  			simplify_merges:1,
> diff --git a/t/t6016-rev-list-graph-simplify-history.sh b/t/t6016-rev-list-graph-simplify-history.sh
> index f5e6e92f5b..d99214b6df 100755
> --- a/t/t6016-rev-list-graph-simplify-history.sh
> +++ b/t/t6016-rev-list-graph-simplify-history.sh
> @@ -85,6 +85,28 @@ test_expect_success '--graph --all' '
>  	test_cmp expected actual
>  	'
>
> +# Make sure that simplify_histpry_forks produces a spanning tree
> +test_expect_success '--graph --simplify-forks --all' '
> +	rm -f expected &&
> +	echo "* $A7" >> expected &&
> +	echo "*   $A6" >> expected &&
> +	echo "|\  " >> expected &&
> +	echo "| * $C4" >> expected &&
> +	echo "| * $C3" >> expected &&
> +	echo "* $A5" >> expected &&
> +	echo "*-.   $A4" >> expected &&
> +	echo "|\ \  " >> expected &&
> +	echo "| | * $C2" >> expected &&
> +	echo "| | * $C1" >> expected &&
> +	echo "| * $B2" >> expected &&
> +	echo "| * $B1" >> expected &&
> +	echo "* $A3" >> expected &&
> +	echo "* $A2" >> expected &&
> +	echo "* $A1" >> expected &&
> +	git rev-list --graph --simplify-forks --all > actual &&
> +	test_cmp expected actual
> +	'
> +
>  # Make sure the graph_is_interesting() code still realizes
>  # that undecorated merges are interesting, even with --simplify-by-decoration
>  test_expect_success '--graph --simplify-by-decoration' '
> @@ -157,6 +179,20 @@ test_expect_success '--graph --full-history -- bar.txt' '
>  	test_cmp expected actual
>  	'
>
> +test_expect_success '--graph --simplify-forks --full-history -- bar.txt' '
> +	rm -f expected &&
> +	echo "* $A7" >> expected &&
> +	echo "*   $A6" >> expected &&
> +	echo "|\\  " >> expected &&
> +	echo "| * $C4" >> expected &&
> +	echo "* $A5" >> expected &&
> +	echo "* $A4" >> expected &&
> +	echo "* $A3" >> expected &&
> +	echo "* $A2" >> expected &&
> +	git rev-list --graph --simplify-forks --full-history --all -- bar.txt > actual &&
> +	test_cmp expected actual
> +	'
> +
>  test_expect_success '--graph --full-history --simplify-merges -- bar.txt' '
>  	rm -f expected &&
>  	echo "* $A7" >> expected &&
> @@ -172,6 +208,20 @@ test_expect_success '--graph --full-history --simplify-merges -- bar.txt' '
>  	test_cmp expected actual
>  	'
>
> +test_expect_success '--graph --simplify-forks --full-history --simplify-merges -- bar.txt' '
> +	rm -f expected &&
> +	echo "* $A7" >> expected &&
> +	echo "*   $A6" >> expected &&
> +	echo "|\\  " >> expected &&
> +	echo "| * $C4" >> expected &&
> +	echo "* $A5" >> expected &&
> +	echo "* $A3" >> expected &&
> +	echo "* $A2" >> expected &&
> +	git rev-list --graph --simplify-forks --full-history --simplify-merges --all \
> +		-- bar.txt > actual &&
> +	test_cmp expected actual
> +	'
> +
>  test_expect_success '--graph -- bar.txt' '
>  	rm -f expected &&
>  	echo "* $A7" >> expected &&


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] Teach git-rev-list --simplify-forks
  2020-04-29 10:08   ` Philip Oakley
@ 2020-04-29 13:04     ` Antonio Russo
  0 siblings, 0 replies; 17+ messages in thread
From: Antonio Russo @ 2020-04-29 13:04 UTC (permalink / raw)
  To: Philip Oakley, git-ml; +Cc: Junio C Hamano, Michael Haggerty, Derrick Stolee

On 4/29/20 4:08 AM, Philip Oakley wrote:
>> +--simplify-forks::
>> +	Convert the commit graph into a spanning subgraph produced by a
>> +	depth-first-search of the history graph, searching the leftmost
>> +	parent first, and discarding edges to commits already visited.
>> +	Useful with `--graph` to visualize repositories with many merges
>> +	when you are interested in was added to master, and not when the
>> +	branch was last rebased.
> 
> Does this also need to actually mention that it effectively discard
> edges to fork points, as per the option name?. No rebasing required for
> it to be useful.
>      s/when/where/
>     "...and not where the branch was last rebased or forked from." - not
> great but actually mentions the option.
> 
> --
> Philip

Good point.  More generally, my description of the option is not very
useful unless you already basically understand what it's doing.

How about:

> --simplify-forks::
> 	Omits the parent relationship of the first commit of merged branches.
> 	Effectively discards all information about the fork point of merged
> 	branches.  It does so by converting the commit graph into a
> 	spanning subgraph produced by a depth-first-search of the history
> 	graph, searching the leftmost parent first, and discarding edges
> 	to commits already visited.  Useful with `--graph` to visualize
> 	repositories with many merges when you are most interested in
> 	when branches were merged.


Antonio

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] Teach git-rev-list --simplify-forks
  2020-04-29  8:01 ` [PATCH v2] " Antonio Russo
  2020-04-29 10:08   ` Philip Oakley
@ 2020-04-29 13:12   ` Derrick Stolee
  2020-05-01 14:13     ` Antonio Russo
  2020-05-01 14:21   ` [PATCH 1/3 v3] Clean up t6016-rev-list-graph-simplify-history Antonio Russo
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Derrick Stolee @ 2020-04-29 13:12 UTC (permalink / raw)
  To: Antonio Russo, git-ml; +Cc: Junio C Hamano, Michael Haggerty, Derrick Stolee

On 4/29/2020 4:01 AM, Antonio Russo wrote:
> When used with --graph, instead of displaying the full graph, display a
> spanning subgraph produced by a depth-first search of the graph visiting
> parents from left to right.  Edges to already visited commits are
> discarded.  This process is repeated for every commit to be displayed.
> 
> This is valuable to reduce visual clutter when there are many merges
> that were not rebased onto each other and the user is primarily
> interested in the state of the branch being merged into.

My earlier comment to recommend how to fix the test failures intended
to demonstrate that this area of code requires a bit of precision. I
took some time to review this patch, but I find it needs some significant
updates.

tl;dr: The way you manipulate the commit parents seems incorrect to me,
but perhaps there is enough prior art in the way we simplify parents to
make that acceptable. Someone else will need to chime in on that part.

It may be possible to do this "drop edges" entirely inside of graph.c
(the graph rendering code) instead of revision.c, which would automatically
work with the performance gains from the newer topo-walk algorithm.

There are enough deviations from code and test style that this will
need a significant revision regardless.
 
> This second revision of the patch sets revs->limited.  This forces the
> graph of commits to be loaded, and simplfiy_forks() therefore reliably
> traverses it.  This addresses the test failures mentioned before (see [1]).

This will have a significant performance impact on the option, as you will
not see even the first result until the DFS has completed.

> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
> index 04ad7dd36e..cbac09028c 100644
> --- a/Documentation/rev-list-options.txt
> +++ b/Documentation/rev-list-options.txt
> @@ -363,6 +363,14 @@ Default mode::
>  	merges from the resulting history, as there are no selected
>  	commits contributing to this merge.
> 
> +--simplify-forks::
> +	Convert the commit graph into a spanning subgraph produced by a
> +	depth-first-search of the history graph, searching the leftmost
> +	parent first, and discarding edges to commits already visited.
> +	Useful with `--graph` to visualize repositories with many merges
> +	when you are interested in was added to master, and not when the
> +	branch was last rebased.

Describing the option via the algorithm may not be the best way to
inform the user of what they will see. It also will not be informative
if the implementation changes to not perform the algorithm described
here, because that algorithm is not incremental (it is O(N) where N is
the total number of reachable commits, not ~O(n) where n is the number
of commits that will actually show up).

An easy test is to time your command with "-n 1".

I'm also not crazy about "when the branch was last rebased". You
probably mean "..when you are not interested in which merge base
was used for each merge."

Also, this does nothing without "--graph" right? Perhaps it should
enable it?

Here is a suggested alternative:

--simplify-forks::
	Show commits that are introduced by each merge before showing
	the first parent of the merge, as in `--graph`, but remove
	edges from those commits to commits reachable from the first
	parent. This helps to visualize repositories with many merges
	when you are not interested in which merge base was used for
	each merge. It also reduces the width of the graph visualization
	compared to `--graph`.

With this description, perhaps it is worth renaming the option? Perhaps
"--ignore-merge-bases"? The word "simplify" implies something like
dropping commits from history, but you are instead dropping _edges_ from
the graph.

>  --ancestry-path::
>  	When given a range of commits to display (e.g. 'commit1..commit2'
>  	or 'commit2 {caret}commit1'), only display commits that exist
> diff --git a/revision.c b/revision.c
> index 5bc96444b6..51dbe21847 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -2082,6 +2082,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>  		revs->simplify_by_decoration = 1;
>  		revs->limited = 1;
>  		revs->prune = 1;
> +	} else if (!strcmp(arg, "--simplify-forks")) {
> +		revs->limited = 1;
> +		revs->simplify_forks = 1;

I recommend dropping the revs->limited setting here and instead fixing the
performance instead. But maybe that should be a second patch on top of this
one.

>  	} else if (!strcmp(arg, "--date-order")) {
>  		revs->sort_order = REV_SORT_BY_COMMIT_DATE;
>  		revs->topo_order = 1;
> @@ -3095,6 +3098,63 @@ static void simplify_merges(struct rev_info *revs)
>  	}
>  }
> 
> +static void simplify_forks(struct rev_info *revs)
> +{
> +	struct commit_list *stack, *list_lr, *iter_list;
> +	struct commit_list **parents;

You want "struct commit_list *parents" here for simpler use.

> +	struct commit *commit, *parent;
> +
> +	stack = NULL;
> +	list_lr = NULL;
> +
> +	clear_object_flags(SIMP_FORK_VISITED);
> +
> +	for(iter_list = revs->commits; iter_list; iter_list = iter_list->next) {

This method could use a split into at least two. I count three levels of nested
loops here, so please break them up into smaller methods.

> +		/* process every commit to be displayed exactly once */
> +		if(iter_list->item->object.flags & SIMP_FORK_VISITED)
> +			continue;
> +		clear_object_flags(SIMP_FORK_VISITING);

This drops the flag from all commits. This will cause quadratic performance
if combined with "--all". Drop this and instead clear the flag yourself as you
visit commits.

> +		commit_list_insert(iter_list->item, &stack);
> +		iter_list->item->object.flags |= SIMP_FORK_VISITED | SIMP_FORK_VISITING;
> +		while(stack) {
> +			commit = pop_commit(&stack);
> +			/* process the parent nodes: removing links to
> +			 * commits already visited (creating a spanning tree)
> +			 */
> +			parents = &(commit->parents);
> +			while(*parents) {

You have some whitespace issues. Put a space between "while" and "(". Same with "if"s below.

> +				parent = (*parents)->item;
> +				if(parent->object.flags & SIMP_FORK_VISITING) {
> +					/* We have already visited this commit, from the same root.
> +					 * We do not explore it at all.
> +					 */
> +					pop_commit(parents);

I don't think you want pop_commit() here. You want to have "parents = parents->next" at the end.

Now, if this is how you are "simplifying" the edges in the final output, then I think this is
destructive and unsafe! You are literally modifying "commit->parents" so if another operation in
Git tries to read those parents it will see the wrong data.

Think about a different way to achieve your goal here, perhaps in the rendering portion of
graph.c instead of here.

> +				} else if(parent->object.flags & SIMP_FORK_VISITED) {
> +					/* We visited this commit before, but from a different root.
> +					 * Leave it attached, but do not explore it further.
> +					 */
> +					parents = &((*parents)->next);
> +				} else {
> +					/* We have not visited this commit yet. Explore it, as usual.
> +					 */
> +					parent->object.flags |= SIMP_FORK_VISITED | SIMP_FORK_VISITING;
> +					commit_list_insert(parent, &list_lr);
> +					parents = &((*parents)->next);
> +				}

It is very unclear that your loop terminates. Instead, use
"parents = parents->next" at the end of the loop block to
make is extremely clear that the loop behaves as expected.

Of course, this assumes that you are not being destructive
to the commit parents as you explore.

> +			}
> +
> +			/* feed the parents, right to left (reversed) onto the
> +			 * stack to do a depth-first traversal of the commit graph
> +			 */

nit: multi-line comment style [1]

[1] https://github.com/git/git/blob/86ab15cb154862b6fa5cc646dac27532f881e1fb/Documentation/CodingGuidelines#L285-L291

> +			while(list_lr) {
> +				commit_list_insert(pop_commit(&list_lr), &stack);
> +			}

nit: No curly braces around single-line blocks. [2]

[2] https://github.com/git/git/blob/86ab15cb154862b6fa5cc646dac27532f881e1fb/Documentation/CodingGuidelines#L239-L243

> +		}
> +	}
> +
> +	clear_object_flags(SIMP_FORK_VISITED | SIMP_FORK_VISITING);
> +}
> +
>  static void set_children(struct rev_info *revs)
>  {
>  	struct commit_list *l;
> @@ -3392,6 +3452,8 @@ int prepare_revision_walk(struct rev_info *revs)
>  	if (revs->limited) {
>  		if (limit_list(revs) < 0)
>  			return -1;
> +		if (revs->simplify_forks)
> +			simplify_forks(revs);
>  		if (revs->topo_order)
>  			sort_in_topological_order(&revs->commits, revs->sort_order);
>  	} else if (revs->topo_order)
> diff --git a/revision.h b/revision.h
> index c1af164b30..f1abdb26b0 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -51,6 +51,11 @@
>  #define TOPO_WALK_EXPLORED	(1u<<27)
>  #define TOPO_WALK_INDEGREE	(1u<<28)
> 
> +/* Re-use the TOPO_WALK flagspace for simplify_forks
> + */
> +#define SIMP_FORK_VISITED	(1u<<27)
> +#define SIMP_FORK_VISITING	(1u<<28)

I don't like that you are re-using these, as it is dangerous for later
collisions. At the moment, these flags are not used in the same code paths
because you specify "limited = 1" and that code path does not use TOPO_WALK_*
macros.

>  #define DECORATE_SHORT_REFS	1
>  #define DECORATE_FULL_REFS	2
> 
> @@ -132,6 +137,7 @@ struct rev_info {
>  			no_walk:2,
>  			remove_empty_trees:1,
>  			simplify_history:1,
> +			simplify_forks:1,
>  			show_pulls:1,
>  			topo_order:1,
>  			simplify_merges:1,
> diff --git a/t/t6016-rev-list-graph-simplify-history.sh b/t/t6016-rev-list-graph-simplify-history.sh
> index f5e6e92f5b..d99214b6df 100755
> --- a/t/t6016-rev-list-graph-simplify-history.sh
> +++ b/t/t6016-rev-list-graph-simplify-history.sh
> @@ -85,6 +85,28 @@ test_expect_success '--graph --all' '
>  	test_cmp expected actual
>  	'
> 
> +# Make sure that simplify_histpry_forks produces a spanning tree
> +test_expect_success '--graph --simplify-forks --all' '
> +	rm -f expected &&
> +	echo "* $A7" >> expected &&
> +	echo "*   $A6" >> expected &&
> +	echo "|\  " >> expected &&
> +	echo "| * $C4" >> expected &&
> +	echo "| * $C3" >> expected &&
> +	echo "* $A5" >> expected &&
> +	echo "*-.   $A4" >> expected &&
> +	echo "|\ \  " >> expected &&
> +	echo "| | * $C2" >> expected &&
> +	echo "| | * $C1" >> expected &&
> +	echo "| * $B2" >> expected &&
> +	echo "| * $B1" >> expected &&
> +	echo "* $A3" >> expected &&
> +	echo "* $A2" >> expected &&
> +	echo "* $A1" >> expected &&

Do not use too many echos like this. Instead use t4215-log-skewed-merges.sh
as an example [3].

[3] https://github.com/git/git/blob/86ab15cb154862b6fa5cc646dac27532f881e1fb/t/t4215-log-skewed-merges.sh#L24-L37

I also hope that you can find more complicated cases to
test, including:

 1. A merge that brings in a merge (think a feature branch)
 2. A merge that brings in a twisted merge (think a user using "git pull").

Here are some example --graph and --simplify-forks outputs to try. I
have not tested these myself, but I believe they are interesting test
cases that can trip up other algorithms.

Merge bringing a merge (non-twisted):

 --graph	--simplify-forks
 *		*
 |\		|\
 | *		| *
 | |\		| |\
 | | *		| | *
 | * |		| *
 * | |		*
 |/ /		*
 * /		*
 |/
 *

Twisted merge:

 --graph	--simplify-forks
 *		*
 |\		|\
 | *		| *
 | |\		| *
 | * |		*
 | | |		*
 * |/		*
 |/|
 * |
 |/
 *

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] Teach git-rev-list --simplify-forks
  2020-04-29 13:12   ` Derrick Stolee
@ 2020-05-01 14:13     ` Antonio Russo
  2020-05-01 15:44       ` Derrick Stolee
  0 siblings, 1 reply; 17+ messages in thread
From: Antonio Russo @ 2020-05-01 14:13 UTC (permalink / raw)
  To: Derrick Stolee, git-ml; +Cc: Junio C Hamano, Michael Haggerty, Derrick Stolee

On 4/29/20 7:12 AM, Derrick Stolee wrote:
> On 4/29/2020 4:01 AM, Antonio Russo wrote:
>> When used with --graph, instead of displaying the full graph, display a
>> spanning subgraph produced by a depth-first search of the graph visiting
>> parents from left to right.  Edges to already visited commits are
>> discarded.  This process is repeated for every commit to be displayed.
>>
>> This is valuable to reduce visual clutter when there are many merges
>> that were not rebased onto each other and the user is primarily
>> interested in the state of the branch being merged into.
>
> My earlier comment to recommend how to fix the test failures intended
> to demonstrate that this area of code requires a bit of precision. I
> took some time to review this patch, but I find it needs some significant
> updates.
>
> tl;dr: The way you manipulate the commit parents seems incorrect to me,
> but perhaps there is enough prior art in the way we simplify parents to
> make that acceptable. Someone else will need to chime in on that part.

First, thank you for taking the time look at this.  I understand your
hesitation about the "amputation" of the history, but in some sense
that's the point of this option.  I really want to be ignorant of the
details of when the fork branched off.  I would like the reported
history to be appear nearly equivalent to a rebase-and-fastforward only
merge style, which results in a much simpler git log --graph.

> It may be possible to do this "drop edges" entirely inside of graph.c
> (the graph rendering code) instead of revision.c, which would automatically
> work with the performance gains from the newer topo-walk algorithm.

Non-local information about the commit graph needs to be used to do this
amputation of the history.  We cannot know how many parents we want to
display until we've completely explored all the parents of that node.
Unfortunately, that means that the whole graph needs to be loaded, and I
cannot really see how there would be any gain by doing this in graph.c.

Caveat: there are semi-streaming DFS implementations (i.e., O(n log n)
space) that we might be able to use to get the first line out the door
quicker. I would, however, like to leave that to another patch.

I will also add that, for the tests I've done, all performance penalties
have been insignificant (less than ~5% for showing the first commit),
while there are significant performance _improvements_, e.g., ~40% for
displaying the full tree.

A notable exception is --all, which can be ~50x faster for the full
output, but is often dramatically slower to show anything (i.e., the
first line).

> There are enough deviations from code and test style that this will
> need a significant revision regardless.

(Please see forthcoming revision 3).

>> This second revision of the patch sets revs->limited.  This forces the
>> graph of commits to be loaded, and simplfiy_forks() therefore reliably
>> traverses it.  This addresses the test failures mentioned before (see [1]).
>
> This will have a significant performance impact on the option, as you will
> not see even the first result until the DFS has completed.

First of all, short of using some other, more sophisticated streaming
version of this algorithm, the full DFS must finish before the first
commit having two (or more) parents can be shown.

That said, the performance is not significantly affected:

I ran the following test (2.26.2, with my patch on top of it):
(git lg = git log --graph --pretty=oneline)

 % time git lg -n1 --ignore-merge-bases e896a286df > /dev/null
 0.73s user 0.02s system 99% cpu 0.746 total

 % time git lg -n1 e896a286df > /dev/null
 0.72s user 0.01s system 99% cpu 0.731 total

 For the linux git repo:

 % time git lg -n1 --ignore-merge-bases v5.7-rc3 >/dev/null
 9.25s user 0.39s system 99% cpu 9.646 total

 % time git lg -n1 v5.7-rc3 >/dev/null
 9.02s user 0.35s system 99% cpu 9.378 total

So the performance seems basically unaffected for very limited graphs.
It's also about 40% faster for complicated ones (as mentioned in my
first email):

 % time git lg --ignore-merge-bases e870325ee8 > /dev/null
 0.83s user 0.06s system 99% cpu 0.886 total

 % time git lg e870325ee8 > /dev/null
 1.41s user 0.03s system 99% cpu 1.443 total

 For the linux git repo:

 % time git lg --ignore-merge-bases v5.7-rc3 >/dev/null
 11.86s user 0.62s system 99% cpu 12.489 total

 % time git lg v5.7-rc3 >/dev/null
 21.56s user 0.55s system 99% cpu 22.108 total

This is because the amputated graph is much simpler, and the rest of the
code needs to do much less work.

Passing --all is another beast, and does indeed suffer:

 % time git lg --ignore-merge-bases --all >/dev/null
 4.06s user 0.04s system 99% cpu 4.105 total

 % time git lg --all >/dev/null
 189.59s user 0.04s system 99% cpu 3:09.65 total

 (and for the first line)

 % time git lg -n1 --ignore-merge-bases --all >/dev/null
 3.86s user 0.02s system 99% cpu 3.874 total

 % time git lg -n1 --all >/dev/null
 0.83s user 0.02s system 99% cpu 0.848 total

(If you need to use --all for the Linux git repo, you should not use
--ignore-merge-bases).

There's certainly performance improvements to be desired, but they do
not appear except for very large repositories, and only when combined
with --all.

>> diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
>> index 04ad7dd36e..cbac09028c 100644
>> --- a/Documentation/rev-list-options.txt
>> +++ b/Documentation/rev-list-options.txt
>> @@ -363,6 +363,14 @@ Default mode::
>>  	merges from the resulting history, as there are no selected
>>  	commits contributing to this merge.
>>
>> +--simplify-forks::
>> +	Convert the commit graph into a spanning subgraph produced by a
>> +	depth-first-search of the history graph, searching the leftmost
>> +	parent first, and discarding edges to commits already visited.
>> +	Useful with `--graph` to visualize repositories with many merges
>> +	when you are interested in was added to master, and not when the
>> +	branch was last rebased.
>
> Describing the option via the algorithm may not be the best way to
> inform the user of what they will see. It also will not be informative
> if the implementation changes to not perform the algorithm described
> here, because that algorithm is not incremental (it is O(N) where N is
> the total number of reachable commits, not ~O(n) where n is the number
> of commits that will actually show up).
>
> An easy test is to time your command with "-n 1".

While this is true, it is not the dominant effect (see above).

> I'm also not crazy about "when the branch was last rebased". You
> probably mean "..when you are not interested in which merge base
> was used for each merge."
>
> Also, this does nothing without "--graph" right? Perhaps it should
> enable it?

Not quite.  The patch interacts very nicely with gitk!

> Here is a suggested alternative:
>
> --simplify-forks::
> 	Show commits that are introduced by each merge before showing
> 	the first parent of the merge, as in `--graph`, but remove
> 	edges from those commits to commits reachable from the first
> 	parent. This helps to visualize repositories with many merges
> 	when you are not interested in which merge base was used for
> 	each merge. It also reduces the width of the graph visualization
> 	compared to `--graph`.
>
> With this description, perhaps it is worth renaming the option? Perhaps
> "--ignore-merge-bases"? The word "simplify" implies something like
> dropping commits from history, but you are instead dropping _edges_ from
> the graph.

(see revision 3).

>>  --ancestry-path::
>>  	When given a range of commits to display (e.g. 'commit1..commit2'
>>  	or 'commit2 {caret}commit1'), only display commits that exist
>> diff --git a/revision.c b/revision.c
>> index 5bc96444b6..51dbe21847 100644
>> --- a/revision.c
>> +++ b/revision.c
>> @@ -2082,6 +2082,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>>  		revs->simplify_by_decoration = 1;
>>  		revs->limited = 1;
>>  		revs->prune = 1;
>> +	} else if (!strcmp(arg, "--simplify-forks")) {
>> +		revs->limited = 1;
>> +		revs->simplify_forks = 1;
>
> I recommend dropping the revs->limited setting here and instead fixing the
> performance instead. But maybe that should be a second patch on top of this
> one.

(See above.)

>>  	} else if (!strcmp(arg, "--date-order")) {
>>  		revs->sort_order = REV_SORT_BY_COMMIT_DATE;
>>  		revs->topo_order = 1;
>> @@ -3095,6 +3098,63 @@ static void simplify_merges(struct rev_info *revs)
>>  	}
>>  }
>>
>> +static void simplify_forks(struct rev_info *revs)
>> +{
>> +	struct commit_list *stack, *list_lr, *iter_list;
>> +	struct commit_list **parents;
>
> You want "struct commit_list *parents" here for simpler use.

Using a pointer to the pointer makes the amputation of the parents much
more uniform: we can just pop_commit(parents), and it handles the cases
of the first parent (which is addressed by commit->parents) and later
parents (which are addressed by list_element->next).

Of course, I'm insistent here that we should still be amputating the
graph.

>> +	struct commit *commit, *parent;
>> +
>> +	stack = NULL;
>> +	list_lr = NULL;
>> +
>> +	clear_object_flags(SIMP_FORK_VISITED);
>> +
>> +	for(iter_list = revs->commits; iter_list; iter_list = iter_list->next) {
>
> This method could use a split into at least two. I count three levels of nested
> loops here, so please break them up into smaller methods.

(see revision 3).

>> +		/* process every commit to be displayed exactly once */
>> +		if(iter_list->item->object.flags & SIMP_FORK_VISITED)
>> +			continue;
>> +		clear_object_flags(SIMP_FORK_VISITING);
>
> This drops the flag from all commits. This will cause quadratic performance
> if combined with "--all". Drop this and instead clear the flag yourself as you
> visit commits.

See revision 3.  I now re-walk the tree as needed.  Performance under
--all is now extremely good (but does not stream).

>> +		commit_list_insert(iter_list->item, &stack);
>> +		iter_list->item->object.flags |= SIMP_FORK_VISITED | SIMP_FORK_VISITING;
>> +		while(stack) {
>> +			commit = pop_commit(&stack);
>> +			/* process the parent nodes: removing links to
>> +			 * commits already visited (creating a spanning tree)
>> +			 */
>> +			parents = &(commit->parents);
>> +			while(*parents) {
>
> You have some whitespace issues. Put a space between "while" and "(". Same with "if"s below.

(See revision 3).  I think I got them all.

>> +				parent = (*parents)->item;
>> +				if(parent->object.flags & SIMP_FORK_VISITING) {
>> +					/* We have already visited this commit, from the same root.
>> +					 * We do not explore it at all.
>> +					 */
>> +					pop_commit(parents);
>
> I don't think you want pop_commit() here. You want to have "parents = parents->next" at the end.
>
> Now, if this is how you are "simplifying" the edges in the final output, then I think this is
> destructive and unsafe! You are literally modifying "commit->parents" so if another operation in
> Git tries to read those parents it will see the wrong data.
>
> Think about a different way to achieve your goal here, perhaps in the rendering portion of
> graph.c instead of here.

I would really like to be blissfully ignorant of branches' merge bases.

I understand that this feature interacts nontrivially with things like
--boundary (and in fact does retain significant information in the
ordering---see the last test case in t6016).  That said, it's not clear
to me the use case for this when combined with other sophisticated graph
manipulations.

At the end of the day, I'm trying to make a feature that aids in
visualizing the repository.  Do you had an example case where someone
would like to use this feature, but removing these parents makes these
other options less useful?

>> +				} else if(parent->object.flags & SIMP_FORK_VISITED) {
>> +					/* We visited this commit before, but from a different root.
>> +					 * Leave it attached, but do not explore it further.
>> +					 */
>> +					parents = &((*parents)->next);
>> +				} else {
>> +					/* We have not visited this commit yet. Explore it, as usual.
>> +					 */
>> +					parent->object.flags |= SIMP_FORK_VISITED | SIMP_FORK_VISITING;
>> +					commit_list_insert(parent, &list_lr);
>> +					parents = &((*parents)->next);
>> +				}
>
> It is very unclear that your loop terminates. Instead, use
> "parents = parents->next" at the end of the loop block to
> make is extremely clear that the loop behaves as expected.
>
> Of course, this assumes that you are not being destructive
> to the commit parents as you explore.

(*parents) is a commit_list.  If, at the start of an iteration of the
loop, it has n items in it, by the end of that iteration, we will have
either removed the head commit (via pop_commit) or advanced one item
down the list.  In either case, there are n-1 items in the list at the
end of the iteration.  Therefore, the loop must terminate.

I've slightly adjusted the layout to make this more clear.

>> +			}
>> +
>> +			/* feed the parents, right to left (reversed) onto the
>> +			 * stack to do a depth-first traversal of the commit graph
>> +			 */
>
> nit: multi-line comment style [1]
>
> [1] https://github.com/git/git/blob/86ab15cb154862b6fa5cc646dac27532f881e1fb/Documentation/CodingGuidelines#L285-L291
>
>> +			while(list_lr) {
>> +				commit_list_insert(pop_commit(&list_lr), &stack);
>> +			}
>
> nit: No curly braces around single-line blocks. [2]
>
> [2] https://github.com/git/git/blob/86ab15cb154862b6fa5cc646dac27532f881e1fb/Documentation/CodingGuidelines#L239-L243

Thank you (and see revision 3).

>> +		}
>> +	}
>> +
>> +	clear_object_flags(SIMP_FORK_VISITED | SIMP_FORK_VISITING);
>> +}
>> +
>>  static void set_children(struct rev_info *revs)
>>  {
>>  	struct commit_list *l;
>> @@ -3392,6 +3452,8 @@ int prepare_revision_walk(struct rev_info *revs)
>>  	if (revs->limited) {
>>  		if (limit_list(revs) < 0)
>>  			return -1;
>> +		if (revs->simplify_forks)
>> +			simplify_forks(revs);
>>  		if (revs->topo_order)
>>  			sort_in_topological_order(&revs->commits, revs->sort_order);
>>  	} else if (revs->topo_order)
>> diff --git a/revision.h b/revision.h
>> index c1af164b30..f1abdb26b0 100644
>> --- a/revision.h
>> +++ b/revision.h
>> @@ -51,6 +51,11 @@
>>  #define TOPO_WALK_EXPLORED	(1u<<27)
>>  #define TOPO_WALK_INDEGREE	(1u<<28)
>>
>> +/* Re-use the TOPO_WALK flagspace for simplify_forks
>> + */
>> +#define SIMP_FORK_VISITED	(1u<<27)
>> +#define SIMP_FORK_VISITING	(1u<<28)
>
> I don't like that you are re-using these, as it is dangerous for later
> collisions. At the moment, these flags are not used in the same code paths
> because you specify "limited = 1" and that code path does not use TOPO_WALK_*
> macros.

I'm happy to use a different bit, and it would make me sleep easier
knowing that this code is portable without worrying about this flag.
I'm now using bits 23 and 24.

>>  #define DECORATE_SHORT_REFS	1
>>  #define DECORATE_FULL_REFS	2
>>
>> @@ -132,6 +137,7 @@ struct rev_info {
>>  			no_walk:2,
>>  			remove_empty_trees:1,
>>  			simplify_history:1,
>> +			simplify_forks:1,
>>  			show_pulls:1,
>>  			topo_order:1,
>>  			simplify_merges:1,
>> diff --git a/t/t6016-rev-list-graph-simplify-history.sh b/t/t6016-rev-list-graph-simplify-history.sh
>> index f5e6e92f5b..d99214b6df 100755
>> --- a/t/t6016-rev-list-graph-simplify-history.sh
>> +++ b/t/t6016-rev-list-graph-simplify-history.sh
>> @@ -85,6 +85,28 @@ test_expect_success '--graph --all' '
>>  	test_cmp expected actual
>>  	'
>>
>> +# Make sure that simplify_histpry_forks produces a spanning tree
>> +test_expect_success '--graph --simplify-forks --all' '
>> +	rm -f expected &&
>> +	echo "* $A7" >> expected &&
>> +	echo "*   $A6" >> expected &&
>> +	echo "|\  " >> expected &&
>> +	echo "| * $C4" >> expected &&
>> +	echo "| * $C3" >> expected &&
>> +	echo "* $A5" >> expected &&
>> +	echo "*-.   $A4" >> expected &&
>> +	echo "|\ \  " >> expected &&
>> +	echo "| | * $C2" >> expected &&
>> +	echo "| | * $C1" >> expected &&
>> +	echo "| * $B2" >> expected &&
>> +	echo "| * $B1" >> expected &&
>> +	echo "* $A3" >> expected &&
>> +	echo "* $A2" >> expected &&
>> +	echo "* $A1" >> expected &&
>
> Do not use too many echos like this. Instead use t4215-log-skewed-merges.sh
> as an example [3].
>
> [3] https://github.com/git/git/blob/86ab15cb154862b6fa5cc646dac27532f881e1fb/t/t4215-log-skewed-merges.sh#L24-L37
>
> I also hope that you can find more complicated cases to
> test, including:
>
>  1. A merge that brings in a merge (think a feature branch)
>  2. A merge that brings in a twisted merge (think a user using "git pull").
>
> Here are some example --graph and --simplify-forks outputs to try. I
> have not tested these myself, but I believe they are interesting test
> cases that can trip up other algorithms.
>
> Merge bringing a merge (non-twisted):
>
>  --graph	--simplify-forks
>  *		*
>  |\		|\
>  | *		| *
>  | |\		| |\
>  | | *		| | *
>  | * |		| *
>  * | |		*
>  |/ /		*
>  * /		*
>  |/
>  *
>
> Twisted merge:
>
>  --graph	--simplify-forks
>  *		*
>  |\		|\
>  | *		| *
>  | |\		| *
>  | * |		*
>  | | |		*
>  * |/		*
>  |/|
>  * |
>  |/
>  *

That is much nicer.  I've cleaned up t6106 in a similar way, and test
--ignore-merge-bases there, t4215, and in the new t4216.

Thanks,
Antonio

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/3 v3] Clean up t6016-rev-list-graph-simplify-history
  2020-04-29  8:01 ` [PATCH v2] " Antonio Russo
  2020-04-29 10:08   ` Philip Oakley
  2020-04-29 13:12   ` Derrick Stolee
@ 2020-05-01 14:21   ` Antonio Russo
  2020-05-01 17:10     ` Junio C Hamano
  2020-05-01 14:22   ` [PATCH 2/3 v3] Teach git-rev-list --ignore-merge-bases Antonio Russo
  2020-05-01 14:23   ` [PATCH 3/3 v3] Add new tests of ignore-merge-bases Antonio Russo
  4 siblings, 1 reply; 17+ messages in thread
From: Antonio Russo @ 2020-05-01 14:21 UTC (permalink / raw)
  To: git-ml; +Cc: Junio C Hamano, Michael Haggerty, Derrick Stolee

Simplifies the logic used to test rev-list, making adding new tests
easier.  Uses a heredoc and postprocessing of the rev-list output
instead of shell substitutions and manually escaped echo's.

Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
---
 t/t6016-rev-list-graph-simplify-history.sh | 349 ++++++++++-----------
 1 file changed, 166 insertions(+), 183 deletions(-)

diff --git a/t/t6016-rev-list-graph-simplify-history.sh b/t/t6016-rev-list-graph-simplify-history.sh
index f5e6e92f5b..eac271a4fa 100755
--- a/t/t6016-rev-list-graph-simplify-history.sh
+++ b/t/t6016-rev-list-graph-simplify-history.sh
@@ -9,6 +9,12 @@ test_description='--graph and simplified history'

 . ./test-lib.sh

+check_graph () {
+	cat >expected &&
+	git rev-list --graph "$@" | sed "$(cat sedscript)" > actual &&
+	test_cmp expected actual
+}
+
 test_expect_success 'set up rev-list --graph test' '
 	# 3 commits on branch A
 	test_commit A1 foo.txt &&
@@ -43,76 +49,62 @@ test_expect_success 'set up rev-list --graph test' '

 	test_commit A7 bar.txt &&

-	# Store commit names in variables for later use
-	A1=$(git rev-parse --verify A1) &&
-	A2=$(git rev-parse --verify A2) &&
-	A3=$(git rev-parse --verify A3) &&
-	A4=$(git rev-parse --verify A4) &&
-	A5=$(git rev-parse --verify A5) &&
-	A6=$(git rev-parse --verify A6) &&
-	A7=$(git rev-parse --verify A7) &&
-	B1=$(git rev-parse --verify B1) &&
-	B2=$(git rev-parse --verify B2) &&
-	C1=$(git rev-parse --verify C1) &&
-	C2=$(git rev-parse --verify C2) &&
-	C3=$(git rev-parse --verify C3) &&
-	C4=$(git rev-parse --verify C4)
-	'
+	echo "s/ *$//;" > sedscript &&
+	( for tag in $(git tag -l) ; do echo "s/$(git rev-parse --verify $tag)/$tag/;" >> sedscript ; done )
+'

 test_expect_success '--graph --all' '
-	rm -f expected &&
-	echo "* $A7" >> expected &&
-	echo "*   $A6" >> expected &&
-	echo "|\\  " >> expected &&
-	echo "| * $C4" >> expected &&
-	echo "| * $C3" >> expected &&
-	echo "* | $A5" >> expected &&
-	echo "| |   " >> expected &&
-	echo "|  \\  " >> expected &&
-	echo "*-. | $A4" >> expected &&
-	echo "|\\ \\| " >> expected &&
-	echo "| | * $C2" >> expected &&
-	echo "| | * $C1" >> expected &&
-	echo "| * | $B2" >> expected &&
-	echo "| * | $B1" >> expected &&
-	echo "* | | $A3" >> expected &&
-	echo "| |/  " >> expected &&
-	echo "|/|   " >> expected &&
-	echo "* | $A2" >> expected &&
-	echo "|/  " >> expected &&
-	echo "* $A1" >> expected &&
-	git rev-list --graph --all > actual &&
-	test_cmp expected actual
-	'
+	check_graph --all <<-\EOF
+	* A7
+	*   A6
+	|\
+	| * C4
+	| * C3
+	* | A5
+	| |
+	|  \
+	*-. | A4
+	|\ \|
+	| | * C2
+	| | * C1
+	| * | B2
+	| * | B1
+	* | | A3
+	| |/
+	|/|
+	* | A2
+	|/
+	* A1
+	EOF
+'

 # Make sure the graph_is_interesting() code still realizes
 # that undecorated merges are interesting, even with --simplify-by-decoration
 test_expect_success '--graph --simplify-by-decoration' '
-	rm -f expected &&
 	git tag -d A4 &&
-	echo "* $A7" >> expected &&
-	echo "*   $A6" >> expected &&
-	echo "|\\  " >> expected &&
-	echo "| * $C4" >> expected &&
-	echo "| * $C3" >> expected &&
-	echo "* | $A5" >> expected &&
-	echo "| |   " >> expected &&
-	echo "|  \\  " >> expected &&
-	echo "*-. | $A4" >> expected &&
-	echo "|\\ \\| " >> expected &&
-	echo "| | * $C2" >> expected &&
-	echo "| | * $C1" >> expected &&
-	echo "| * | $B2" >> expected &&
-	echo "| * | $B1" >> expected &&
-	echo "* | | $A3" >> expected &&
-	echo "| |/  " >> expected &&
-	echo "|/|   " >> expected &&
-	echo "* | $A2" >> expected &&
-	echo "|/  " >> expected &&
-	echo "* $A1" >> expected &&
-	git rev-list --graph --all --simplify-by-decoration > actual &&
-	test_cmp expected actual
-	'
+	check_graph --all --simplify-by-decoration <<-\EOF
+	* A7
+	*   A6
+	|\
+	| * C4
+	| * C3
+	* | A5
+	| |
+	|  \
+	*-. | A4
+	|\ \|
+	| | * C2
+	| | * C1
+	| * | B2
+	| * | B1
+	* | | A3
+	| |/
+	|/|
+	* | A2
+	|/
+	* A1
+	EOF
+'

 test_expect_success 'setup: get rid of decorations on B' '
 	git tag -d B2 &&
@@ -122,142 +114,133 @@ test_expect_success 'setup: get rid of decorations on B' '

 # Graph with branch B simplified away
 test_expect_success '--graph --simplify-by-decoration prune branch B' '
-	rm -f expected &&
-	echo "* $A7" >> expected &&
-	echo "*   $A6" >> expected &&
-	echo "|\\  " >> expected &&
-	echo "| * $C4" >> expected &&
-	echo "| * $C3" >> expected &&
-	echo "* | $A5" >> expected &&
-	echo "* | $A4" >> expected &&
-	echo "|\\| " >> expected &&
-	echo "| * $C2" >> expected &&
-	echo "| * $C1" >> expected &&
-	echo "* | $A3" >> expected &&
-	echo "|/  " >> expected &&
-	echo "* $A2" >> expected &&
-	echo "* $A1" >> expected &&
-	git rev-list --graph --simplify-by-decoration --all > actual &&
-	test_cmp expected actual
-	'
+	check_graph --simplify-by-decoration --all <<-\EOF
+	* A7
+	*   A6
+	|\
+	| * C4
+	| * C3
+	* | A5
+	* | A4
+	|\|
+	| * C2
+	| * C1
+	* | A3
+	|/
+	* A2
+	* A1
+	EOF
+'

 test_expect_success '--graph --full-history -- bar.txt' '
-	rm -f expected &&
-	echo "* $A7" >> expected &&
-	echo "*   $A6" >> expected &&
-	echo "|\\  " >> expected &&
-	echo "| * $C4" >> expected &&
-	echo "* | $A5" >> expected &&
-	echo "* | $A4" >> expected &&
-	echo "|\\| " >> expected &&
-	echo "* | $A3" >> expected &&
-	echo "|/  " >> expected &&
-	echo "* $A2" >> expected &&
-	git rev-list --graph --full-history --all -- bar.txt > actual &&
-	test_cmp expected actual
-	'
+	check_graph --full-history --all -- bar.txt <<-\EOF
+	* A7
+	*   A6
+	|\
+	| * C4
+	* | A5
+	* | A4
+	|\|
+	* | A3
+	|/
+	* A2
+	EOF
+'

 test_expect_success '--graph --full-history --simplify-merges -- bar.txt' '
-	rm -f expected &&
-	echo "* $A7" >> expected &&
-	echo "*   $A6" >> expected &&
-	echo "|\\  " >> expected &&
-	echo "| * $C4" >> expected &&
-	echo "* | $A5" >> expected &&
-	echo "* | $A3" >> expected &&
-	echo "|/  " >> expected &&
-	echo "* $A2" >> expected &&
-	git rev-list --graph --full-history --simplify-merges --all \
-		-- bar.txt > actual &&
-	test_cmp expected actual
-	'
+	check_graph --full-history --simplify-merges --all -- bar.txt <<-\EOF
+	* A7
+	*   A6
+	|\
+	| * C4
+	* | A5
+	* | A3
+	|/
+	* A2
+	EOF
+'

 test_expect_success '--graph -- bar.txt' '
-	rm -f expected &&
-	echo "* $A7" >> expected &&
-	echo "* $A5" >> expected &&
-	echo "* $A3" >> expected &&
-	echo "| * $C4" >> expected &&
-	echo "|/  " >> expected &&
-	echo "* $A2" >> expected &&
-	git rev-list --graph --all -- bar.txt > actual &&
-	test_cmp expected actual
-	'
+	check_graph --all -- bar.txt <<-\EOF
+	* A7
+	* A5
+	* A3
+	| * C4
+	|/
+	* A2
+	EOF
+'

 test_expect_success '--graph --sparse -- bar.txt' '
-	rm -f expected &&
-	echo "* $A7" >> expected &&
-	echo "* $A6" >> expected &&
-	echo "* $A5" >> expected &&
-	echo "* $A4" >> expected &&
-	echo "* $A3" >> expected &&
-	echo "| * $C4" >> expected &&
-	echo "| * $C3" >> expected &&
-	echo "| * $C2" >> expected &&
-	echo "| * $C1" >> expected &&
-	echo "|/  " >> expected &&
-	echo "* $A2" >> expected &&
-	echo "* $A1" >> expected &&
-	git rev-list --graph --sparse --all -- bar.txt > actual &&
-	test_cmp expected actual
-	'
+	check_graph --sparse --all -- bar.txt <<-\EOF
+	* A7
+	* A6
+	* A5
+	* A4
+	* A3
+	| * C4
+	| * C3
+	| * C2
+	| * C1
+	|/
+	* A2
+	* A1
+	EOF
+'

 test_expect_success '--graph ^C4' '
-	rm -f expected &&
-	echo "* $A7" >> expected &&
-	echo "* $A6" >> expected &&
-	echo "* $A5" >> expected &&
-	echo "*   $A4" >> expected &&
-	echo "|\\  " >> expected &&
-	echo "| * $B2" >> expected &&
-	echo "| * $B1" >> expected &&
-	echo "* $A3" >> expected &&
-	git rev-list --graph --all ^C4 > actual &&
-	test_cmp expected actual
-	'
+	check_graph --all ^C4 <<-\EOF
+	* A7
+	* A6
+	* A5
+	*   A4
+	|\
+	| * B2
+	| * B1
+	* A3
+	EOF
+'

 test_expect_success '--graph ^C3' '
-	rm -f expected &&
-	echo "* $A7" >> expected &&
-	echo "*   $A6" >> expected &&
-	echo "|\\  " >> expected &&
-	echo "| * $C4" >> expected &&
-	echo "* $A5" >> expected &&
-	echo "*   $A4" >> expected &&
-	echo "|\\  " >> expected &&
-	echo "| * $B2" >> expected &&
-	echo "| * $B1" >> expected &&
-	echo "* $A3" >> expected &&
-	git rev-list --graph --all ^C3 > actual &&
-	test_cmp expected actual
-	'
+	check_graph --all ^C3 <<-\EOF
+	* A7
+	*   A6
+	|\
+	| * C4
+	* A5
+	*   A4
+	|\
+	| * B2
+	| * B1
+	* A3
+	EOF
+'

 # I don't think the ordering of the boundary commits is really
 # that important, but this test depends on it.  If the ordering ever changes
 # in the code, we'll need to update this test.
 test_expect_success '--graph --boundary ^C3' '
-	rm -f expected &&
-	echo "* $A7" >> expected &&
-	echo "*   $A6" >> expected &&
-	echo "|\\  " >> expected &&
-	echo "| * $C4" >> expected &&
-	echo "* | $A5" >> expected &&
-	echo "| |     " >> expected &&
-	echo "|  \\    " >> expected &&
-	echo "*-. \\   $A4" >> expected &&
-	echo "|\\ \\ \\  " >> expected &&
-	echo "| * | | $B2" >> expected &&
-	echo "| * | | $B1" >> expected &&
-	echo "* | | | $A3" >> expected &&
-	echo "o | | | $A2" >> expected &&
-	echo "|/ / /  " >> expected &&
-	echo "o / / $A1" >> expected &&
-	echo " / /  " >> expected &&
-	echo "| o $C3" >> expected &&
-	echo "|/  " >> expected &&
-	echo "o $C2" >> expected &&
-	git rev-list --graph --boundary --all ^C3 > actual &&
-	test_cmp expected actual
-	'
+	check_graph --boundary --all ^C3 <<-\EOF
+	* A7
+	*   A6
+	|\
+	| * C4
+	* | A5
+	| |
+	|  \
+	*-. \   A4
+	|\ \ \
+	| * | | B2
+	| * | | B1
+	* | | | A3
+	o | | | A2
+	|/ / /
+	o / / A1
+	 / /
+	| o C3
+	|/
+	o C2
+	EOF
+'

 test_done
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/3 v3] Teach git-rev-list --ignore-merge-bases
  2020-04-29  8:01 ` [PATCH v2] " Antonio Russo
                     ` (2 preceding siblings ...)
  2020-05-01 14:21   ` [PATCH 1/3 v3] Clean up t6016-rev-list-graph-simplify-history Antonio Russo
@ 2020-05-01 14:22   ` Antonio Russo
  2020-05-02 13:25     ` Johannes Sixt
  2020-05-01 14:23   ` [PATCH 3/3 v3] Add new tests of ignore-merge-bases Antonio Russo
  4 siblings, 1 reply; 17+ messages in thread
From: Antonio Russo @ 2020-05-01 14:22 UTC (permalink / raw)
  To: git-ml; +Cc: Junio C Hamano, Michael Haggerty, Derrick Stolee

This new option simplifies the graph of commits for merges.  Commits are
shown before their parents, but edges from those commits to commits
reachable from the first parent are removed.  Being disconnected from
the branch from which they were forked, these amputated merges
effectively lose all information about their merge base.

When used with --graph, this amputation can dramatically reduce the
width of the displayed graph and the total time taken to draw all
output.

The implementation traverses all reachable commits with a
depth-first-search to produce a spanning tree.

Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
---
 Documentation/rev-list-options.txt |  9 ++++
 object.h                           |  2 +-
 revision.c                         | 85 ++++++++++++++++++++++++++++++
 revision.h                         |  5 ++
 4 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 04ad7dd36e..3f5d196985 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -363,6 +363,15 @@ Default mode::
 	merges from the resulting history, as there are no selected
 	commits contributing to this merge.

+--ignore-merge-bases::
+	Show commits that are introduced by each merge before showing
+	the first parent of the merge but remove edges from those commits
+	to commits reachable from the first parent. When used with
+	`--graph`, this can help visualize repositories with many merges
+	when you are not interested in the merge base used for each
+	merge. It also reduces the width of the graph visualization
+	when used with `--graph`.
+
 --ancestry-path::
 	When given a range of commits to display (e.g. 'commit1..commit2'
 	or 'commit2 {caret}commit1'), only display commits that exist
diff --git a/object.h b/object.h
index b22328b838..0bf6fb0d55 100644
--- a/object.h
+++ b/object.h
@@ -59,7 +59,7 @@ struct object_array {

 /*
  * object flag allocation:
- * revision.h:               0---------10         15                   25----28
+ * revision.h:               0---------10         15                   23----28
  * fetch-pack.c:             01
  * negotiator/default.c:       2--5
  * walker.c:                 0-2
diff --git a/revision.c b/revision.c
index 5bc96444b6..ae81175e34 100644
--- a/revision.c
+++ b/revision.c
@@ -2082,6 +2082,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->simplify_by_decoration = 1;
 		revs->limited = 1;
 		revs->prune = 1;
+	} else if (!strcmp(arg, "--ignore-merge-bases")) {
+		revs->limited = 1;
+		revs->ignore_merge_bases = 1;
 	} else if (!strcmp(arg, "--date-order")) {
 		revs->sort_order = REV_SORT_BY_COMMIT_DATE;
 		revs->topo_order = 1;
@@ -3095,6 +3098,86 @@ static void simplify_merges(struct rev_info *revs)
 	}
 }

+static void cleanup_ignore_merge_bases(struct commit *commit) {
+	struct commit_list *stack;
+	struct commit_list *parents;
+	stack = NULL;
+
+	commit_list_insert(commit, &stack);
+	while (stack) {
+		commit = pop_commit(&stack);
+		commit->object.flags &= !IGN_MRG_BASES_VISITING;
+
+		parents = commit->parents;
+		for (parents = commit->parents; parents; parents = parents->next)
+			commit_list_insert(parents->item, &stack);
+	}
+}
+
+static void do_ignore_merge_bases(struct commit *commit) {
+	struct commit_list *stack, *list_lr;
+	struct commit_list **parents;
+	struct commit *parent;
+	stack = NULL;
+	list_lr = NULL;
+
+	/* process every commit to be displayed exactly once */
+	commit_list_insert(commit, &stack);
+	commit->object.flags |= IGN_MRG_BASES_VISITED | IGN_MRG_BASES_VISITING;
+	while (stack) {
+		commit = pop_commit(&stack);
+		/*
+		 * process the parent nodes: removing links to
+		 * commits already visited (creating a spanning tree)
+		 */
+		parents = &(commit->parents);
+		while (*parents) {
+			parent = (*parents)->item;
+			if (parent->object.flags & IGN_MRG_BASES_VISITING) {
+				/*
+				 * We have already visited this commit, from the same root.
+				 * We do not explore it at all.
+				 */
+				pop_commit(parents);
+				continue;
+			}
+			if (!(parent->object.flags & IGN_MRG_BASES_VISITED)) {
+				/*
+				 * If we visited this commit before, but from a different root,
+				 * leave it attached, but do not explore it further.
+				 *
+				 * If we have not visited this commit yet, explore it, as usual.
+				 */
+				parent->object.flags |= IGN_MRG_BASES_VISITED | IGN_MRG_BASES_VISITING;
+				commit_list_insert(parent, &list_lr);
+			}
+			parents = &((*parents)->next);
+		}
+
+		/*
+		 * Feed the parents, right to left (reversed) onto the
+		 * stack to do a depth-first traversal of the commit graph.
+		 */
+		while (list_lr)
+			commit_list_insert(pop_commit(&list_lr), &stack);
+	}
+}
+
+static void ignore_merge_bases(struct rev_info *revs)
+{
+	struct commit_list *iter_list;
+	struct commit *prev = NULL;
+
+	for (iter_list = revs->commits; iter_list; iter_list = iter_list->next) {
+		if (iter_list->item->object.flags & IGN_MRG_BASES_VISITED)
+			continue;
+		if (prev)
+			cleanup_ignore_merge_bases(prev);
+		prev = iter_list->item;
+		do_ignore_merge_bases(prev);
+	}
+}
+
 static void set_children(struct rev_info *revs)
 {
 	struct commit_list *l;
@@ -3392,6 +3475,8 @@ int prepare_revision_walk(struct rev_info *revs)
 	if (revs->limited) {
 		if (limit_list(revs) < 0)
 			return -1;
+		if (revs->ignore_merge_bases)
+			ignore_merge_bases(revs);
 		if (revs->topo_order)
 			sort_in_topological_order(&revs->commits, revs->sort_order);
 	} else if (revs->topo_order)
diff --git a/revision.h b/revision.h
index c1af164b30..93b318b304 100644
--- a/revision.h
+++ b/revision.h
@@ -37,6 +37,10 @@

 /* WARNING: This is also used as REACHABLE in commit-graph.c. */
 #define PULL_MERGE	(1u<<15)
+
+#define IGN_MRG_BASES_VISITED	(1u<<23)
+#define IGN_MRG_BASES_VISITING	(1u<<24)
+
 /*
  * Indicates object was reached by traversal. i.e. not given by user on
  * command-line or stdin.
@@ -132,6 +136,7 @@ struct rev_info {
 			no_walk:2,
 			remove_empty_trees:1,
 			simplify_history:1,
+			ignore_merge_bases:1,
 			show_pulls:1,
 			topo_order:1,
 			simplify_merges:1,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/3 v3] Add new tests of ignore-merge-bases
  2020-04-29  8:01 ` [PATCH v2] " Antonio Russo
                     ` (3 preceding siblings ...)
  2020-05-01 14:22   ` [PATCH 2/3 v3] Teach git-rev-list --ignore-merge-bases Antonio Russo
@ 2020-05-01 14:23   ` Antonio Russo
  4 siblings, 0 replies; 17+ messages in thread
From: Antonio Russo @ 2020-05-01 14:23 UTC (permalink / raw)
  To: git-ml; +Cc: Junio C Hamano, Michael Haggerty, Derrick Stolee

Extend t4215 and t6016 to also use --ignore-merge-bases on their test
cases.

Add the new test case t4216-log-merges, with three tests: a standard
feature merge, a "twisted" feature merge, and the motivating case for
ignore-merge-bases: a "mountain" of merges.

Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
---
 t/t4215-log-skewed-merges.sh               | 165 +++++++++++++
 t/t4216-log-merges.sh                      | 273 +++++++++++++++++++++
 t/t6016-rev-list-graph-simplify-history.sh |  69 ++++++
 3 files changed, 507 insertions(+)
 create mode 100755 t/t4216-log-merges.sh

diff --git a/t/t4215-log-skewed-merges.sh b/t/t4215-log-skewed-merges.sh
index 28d0779a8c..fb5df1a323 100755
--- a/t/t4215-log-skewed-merges.sh
+++ b/t/t4215-log-skewed-merges.sh
@@ -38,6 +38,22 @@ test_expect_success 'log --graph with merge fusing with its left and right neigh
 	EOF
 '

+test_expect_success 'log --graph with merge fusing with its left and right neighbors (ignore-merge-bases)' '
+	check_graph --ignore-merge-bases <<-\EOF
+	*   H
+	|\
+	| *   G
+	| |\
+	| | * F
+	| *   E
+	| |\
+	| | * D
+	| * C
+	* B
+	* A
+	EOF
+'
+
 test_expect_success 'log --graph with left-skewed merge' '
 	git checkout --orphan 0_p && test_commit 0_A &&
 	git checkout -b 0_q 0_p && test_commit 0_B &&
@@ -72,6 +88,20 @@ test_expect_success 'log --graph with left-skewed merge' '
 	EOF
 '

+test_expect_success 'log --graph with left-skewed merge (ignore-merge-bases)' '
+	check_graph --ignore-merge-bases <<-\EOF
+	*-----.   0_H
+	|\ \ \ \
+	| | | | * 0_G
+	| | | * 0_F
+	| | | * 0_E
+	| | * 0_D
+	| | * 0_C
+	| * 0_B
+	* 0_A
+	EOF
+'
+
 test_expect_success 'log --graph with nested left-skewed merge' '
 	git checkout --orphan 1_p &&
 	test_commit 1_A &&
@@ -100,6 +130,21 @@ test_expect_success 'log --graph with nested left-skewed merge' '
 	EOF
 '

+test_expect_success 'log --graph with nested left-skewed merge (ignore-merge-bases)' '
+	check_graph --ignore-merge-bases<<-\EOF
+	*   1_H
+	|\
+	| *   1_G
+	| |\
+	| | * 1_F
+	| * 1_E
+	| * 1_D
+	* 1_C
+	* 1_B
+	* 1_A
+	EOF
+'
+
 test_expect_success 'log --graph with nested left-skewed merge following normal merge' '
 	git checkout --orphan 2_p &&
 	test_commit 2_A &&
@@ -137,6 +182,23 @@ test_expect_success 'log --graph with nested left-skewed merge following normal
 	EOF
 '

+test_expect_success 'log --graph with nested left-skewed merge following normal merge (ignore-merge-bases)' '
+	check_graph --ignore-merge-bases<<-\EOF
+	*   2_K
+	|\
+	| *   2_J
+	| |\
+	| | * 2_H
+	| | * 2_G
+	| | * 2_F
+	| * 2_E
+	| * 2_D
+	* 2_C
+	* 2_B
+	* 2_A
+	EOF
+'
+
 test_expect_success 'log --graph with nested right-skewed merge following left-skewed merge' '
 	git checkout --orphan 3_p &&
 	test_commit 3_A &&
@@ -170,6 +232,23 @@ test_expect_success 'log --graph with nested right-skewed merge following left-s
 	EOF
 '

+test_expect_success 'log --graph with nested right-skewed merge following left-skewed merge (ignore-merge-bases)' '
+	check_graph --ignore-merge-bases<<-\EOF
+	*   3_J
+	|\
+	| *   3_H
+	| |\
+	| | * 3_G
+	| * 3_F
+	| *   3_E
+	| |\
+	| | * 3_D
+	| * 3_C
+	| * 3_B
+	* 3_A
+	EOF
+'
+
 test_expect_success 'log --graph with right-skewed merge following a left-skewed one' '
 	git checkout --orphan 4_p &&
 	test_commit 4_A &&
@@ -202,6 +281,23 @@ test_expect_success 'log --graph with right-skewed merge following a left-skewed
 	EOF
 '

+test_expect_success 'log --graph with right-skewed merge following a left-skewed one (ignore-merge-bases)' '
+	check_graph --date-order --ignore-merge-bases<<-\EOF
+	*   4_H
+	|\
+	| *   4_G
+	| |\
+	| * | 4_F
+	| * |   4_E
+	| |\ \
+	| | * | 4_D
+	| |  /
+	| | * 4_C
+	| * 4_B
+	* 4_A
+	EOF
+'
+
 test_expect_success 'log --graph with octopus merge with column joining its penultimate parent' '
 	git checkout --orphan 5_p &&
 	test_commit 5_A &&
@@ -239,6 +335,21 @@ test_expect_success 'log --graph with octopus merge with column joining its penu
 	EOF
 '

+test_expect_success 'log --graph with octopus merge with column joining its penultimate parent (ignore-merge-bases)' '
+	check_graph --ignore-merge-bases<<-\EOF
+	*   5_H
+	|\
+	| *-.   5_G
+	| |\ \
+	| | | * 5_F
+	| | * 5_E
+	| | * 5_C
+	| * 5_B
+	* 5_D
+	* 5_A
+	EOF
+'
+
 test_expect_success 'log --graph with multiple tips' '
 	git checkout --orphan 6_1 &&
 	test_commit 6_A &&
@@ -281,6 +392,27 @@ test_expect_success 'log --graph with multiple tips' '
 	EOF
 '

+test_expect_success 'log --graph with multiple tips (ignore-merge-bases)' '
+	check_graph --ignore-merge-bases 6_1 6_3 6_5 <<-\EOF
+	*   6_I
+	|\
+	| | *   6_H
+	| | |\
+	| | | * 6_G
+	| | * 6_E
+	| | | * 6_F
+	| |_|/|
+	|/| |/
+	| |/|
+	| * | 6_D
+	|  /
+	* / 6_C
+	|/
+	* 6_B
+	* 6_A
+	EOF
+'
+
 test_expect_success 'log --graph with multiple tips and colors' '
 	test_config log.graphColors red,green,yellow,blue,magenta,cyan &&
 	cat >expect.colors <<-\EOF &&
@@ -370,4 +502,37 @@ test_expect_success 'log --graph with multiple tips' '
 	EOF
 '

+test_expect_success 'log --graph with multiple tips (ignore-merge-bases)' '
+	check_graph --ignore-merge-bases M_1 M_3 M_5 M_7 <<-\EOF
+	*   7_M1
+	|\
+	| | *   7_M2
+	| | |\
+	| | | * 7_H
+	| | | | *   7_M3
+	| | | | |\
+	| | | | | * 7_J
+	| | | | * | 7_I
+	| | | | * | 7_A
+	| | | |  /
+	| | | | | *   7_M4
+	| |_|_|_|/|\
+	|/| | | |/ /
+	| | |_|/| /
+	| |/| | |/
+	| | | |/|
+	| | |/| |
+	| | * | | 7_G
+	| | | |/
+	| | |/|
+	| | * | 7_F
+	| |  /
+	| * / 7_E
+	| |/
+	| * 7_D
+	* 7_C
+	* 7_B
+	EOF
+'
+
 test_done
diff --git a/t/t4216-log-merges.sh b/t/t4216-log-merges.sh
new file mode 100755
index 0000000000..84b1973131
--- /dev/null
+++ b/t/t4216-log-merges.sh
@@ -0,0 +1,273 @@
+#!/bin/sh
+
+test_description='git log --graph of merges'
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-log-graph.sh
+
+check_graph () {
+	cat >expect &&
+	lib_test_cmp_graph --format=%s "$@"
+}
+
+test_expect_success 'log --graph with merge pulling in a feature' '
+	git checkout --orphan _p && test_commit A &&
+	git checkout -b _q &&
+	git checkout _p && test_commit B &&
+	git checkout -b _r &&
+	git checkout _p && test_commit C &&
+	git checkout _r && test_commit F_1 &&
+	git checkout _q && test_commit F_2 &&
+	git checkout _r && git merge --no-ff _q -m M &&
+	git checkout _p && git merge --no-ff _r -m D &&
+
+	check_graph <<-\EOF
+	*   D
+	|\
+	| *   M
+	| |\
+	| | * F_2
+	| * | F_1
+	* | | C
+	|/ /
+	* / B
+	|/
+	* A
+	EOF
+'
+
+test_expect_success 'log --graph with merge pulling in a feature (ignore-merge-bases)' '
+	check_graph --ignore-merge-bases <<-\EOF
+	*   D
+	|\
+	| *   M
+	| |\
+	| | * F_2
+	| * F_1
+	* C
+	* B
+	* A
+	EOF
+'
+
+test_expect_success 'log --graph with twisted merge pulling in a feature from master' '
+	git checkout --orphan 0_p && test_commit 0_A &&
+	git checkout -b 0_q &&
+	git checkout 0_p && test_commit 0_B &&
+	git checkout -b 0_r &&
+	git checkout 0_p && test_commit 0_C &&
+	git checkout 0_q && test_commit 0_F1 && git merge --no-ff 0_r -m 0_M1 &&
+	git checkout 0_p && git merge --no-ff 0_q -m 0_M2 &&
+
+	check_graph <<-\EOF
+	*   0_M2
+	|\
+	| *   0_M1
+	| |\
+	| * | 0_F1
+	* | | 0_C
+	| |/
+	|/|
+	* | 0_B
+	|/
+	* 0_A
+	EOF
+'
+
+test_expect_success 'log --graph with twisted merge pulling in a feature from master (ignore-merge-bases)' '
+	check_graph --ignore-merge-bases <<-\EOF
+	*   0_M2
+	|\
+	| * 0_M1
+	| * 0_F1
+	* 0_C
+	* 0_B
+	* 0_A
+	EOF
+'
+
+test_expect_success 'log --graph with several merges' '
+	git checkout --orphan 1_p &&
+	test_commit 1_root &&
+	for m in $(test_seq 1 10) ;
+	do
+		git checkout -b 1_f${m} 1_root ;
+		test_commit 1_A${m} ;
+	done &&
+	for m in $(test_seq 1 10) ;
+	do
+		i=$((11 - $m)) ;
+		git merge --no-ff 1_f${i} -m 1_M${m}A${i} ;
+	done &&
+	for mp in $(test_seq 1 10) ;
+	do
+		m=$((11 - mp))
+		git checkout 1_f${m} ;
+		test_commit 1_B${m} ;
+		git checkout 1_p ;
+		git merge --no-ff 1_f${m} -m 1_M${m} ;
+	done &&
+
+	check_graph <<-\EOF
+	*   1_M1
+	|\
+	| * 1_B1
+	* |   1_M2
+	|\ \
+	| * | 1_B2
+	* | |   1_M3
+	|\ \ \
+	| * | | 1_B3
+	* | | |   1_M4
+	|\ \ \ \
+	| * | | | 1_B4
+	* | | | |   1_M5
+	|\ \ \ \ \
+	| * | | | | 1_B5
+	* | | | | |   1_M6
+	|\ \ \ \ \ \
+	| * | | | | | 1_B6
+	* | | | | | |   1_M7
+	|\ \ \ \ \ \ \
+	| * | | | | | | 1_B7
+	* | | | | | | |   1_M8
+	|\ \ \ \ \ \ \ \
+	| * | | | | | | | 1_B8
+	* | | | | | | | |   1_M9
+	|\ \ \ \ \ \ \ \ \
+	| * | | | | | | | | 1_B9
+	* | | | | | | | | |   1_M10
+	|\ \ \ \ \ \ \ \ \ \
+	| * | | | | | | | | | 1_B10
+	| * | | | | | | | | |   1_M10A1
+	| |\ \ \ \ \ \ \ \ \ \
+	| | | |_|_|_|_|_|_|_|/
+	| | |/| | | | | | | |
+	| | * | | | | | | | | 1_A1
+	| |/ / / / / / / / /
+	|/| | | | | | | | |
+	| * | | | | | | | |   1_M9A2
+	| |\ \ \ \ \ \ \ \ \
+	| | | |_|_|_|_|_|_|/
+	| | |/| | | | | | |
+	| | * | | | | | | | 1_A2
+	| |/ / / / / / / /
+	|/| | | | | | | |
+	| * | | | | | | |   1_M8A3
+	| |\ \ \ \ \ \ \ \
+	| | | |_|_|_|_|_|/
+	| | |/| | | | | |
+	| | * | | | | | | 1_A3
+	| |/ / / / / / /
+	|/| | | | | | |
+	| * | | | | | |   1_M7A4
+	| |\ \ \ \ \ \ \
+	| | | |_|_|_|_|/
+	| | |/| | | | |
+	| | * | | | | | 1_A4
+	| |/ / / / / /
+	|/| | | | | |
+	| * | | | | |   1_M6A5
+	| |\ \ \ \ \ \
+	| | | |_|_|_|/
+	| | |/| | | |
+	| | * | | | | 1_A5
+	| |/ / / / /
+	|/| | | | |
+	| * | | | |   1_M5A6
+	| |\ \ \ \ \
+	| | | |_|_|/
+	| | |/| | |
+	| | * | | | 1_A6
+	| |/ / / /
+	|/| | | |
+	| * | | |   1_M4A7
+	| |\ \ \ \
+	| | | |_|/
+	| | |/| |
+	| | * | | 1_A7
+	| |/ / /
+	|/| | |
+	| * | |   1_M3A8
+	| |\ \ \
+	| | | |/
+	| | |/|
+	| | * | 1_A8
+	| |/ /
+	|/| |
+	| * | 1_M2A9
+	| |\|
+	| | * 1_A9
+	| |/
+	|/|
+	| * 1_A10
+	|/
+	* 1_root
+	EOF
+'
+
+test_expect_success 'log --graph with several merges (ignore-merge-bases)' '
+	check_graph --ignore-merge-bases <<-\EOF
+	*   1_M1
+	|\
+	| * 1_B1
+	*   1_M2
+	|\
+	| * 1_B2
+	*   1_M3
+	|\
+	| * 1_B3
+	*   1_M4
+	|\
+	| * 1_B4
+	*   1_M5
+	|\
+	| * 1_B5
+	*   1_M6
+	|\
+	| * 1_B6
+	*   1_M7
+	|\
+	| * 1_B7
+	*   1_M8
+	|\
+	| * 1_B8
+	*   1_M9
+	|\
+	| * 1_B9
+	*   1_M10
+	|\
+	| * 1_B10
+	| *   1_M10A1
+	| |\
+	| | * 1_A1
+	| *   1_M9A2
+	| |\
+	| | * 1_A2
+	| *   1_M8A3
+	| |\
+	| | * 1_A3
+	| *   1_M7A4
+	| |\
+	| | * 1_A4
+	| *   1_M6A5
+	| |\
+	| | * 1_A5
+	| *   1_M5A6
+	| |\
+	| | * 1_A6
+	| *   1_M4A7
+	| |\
+	| | * 1_A7
+	| *   1_M3A8
+	| |\
+	| | * 1_A8
+	| *   1_M2A9
+	| |\
+	| | * 1_A9
+	| * 1_A10
+	* 1_root
+	EOF
+'
+
+test_done
diff --git a/t/t6016-rev-list-graph-simplify-history.sh b/t/t6016-rev-list-graph-simplify-history.sh
index eac271a4fa..a097be5ce5 100755
--- a/t/t6016-rev-list-graph-simplify-history.sh
+++ b/t/t6016-rev-list-graph-simplify-history.sh
@@ -78,6 +78,27 @@ test_expect_success '--graph --all' '
 	EOF
 '

+# Make sure that ignore_merge_bases produces a spanning tree
+test_expect_success '--graph --ignore-merge-bases --all' '
+	check_graph --ignore-merge-bases --all <<-\EOF
+	* A7
+	*   A6
+	|\
+	| * C4
+	| * C3
+	* A5
+	*-.   A4
+	|\ \
+	| | * C2
+	| | * C1
+	| * B2
+	| * B1
+	* A3
+	* A2
+	* A1
+	EOF
+'
+
 # Make sure the graph_is_interesting() code still realizes
 # that undecorated merges are interesting, even with --simplify-by-decoration
 test_expect_success '--graph --simplify-by-decoration' '
@@ -147,6 +168,19 @@ test_expect_success '--graph --full-history -- bar.txt' '
 	EOF
 '

+test_expect_success '--graph --ignore-merge-bases --full-history -- bar.txt' '
+	check_graph --ignore-merge-bases --full-history --all -- bar.txt <<-\EOF
+	* A7
+	*   A6
+	|\
+	| * C4
+	* A5
+	* A4
+	* A3
+	* A2
+	EOF
+'
+
 test_expect_success '--graph --full-history --simplify-merges -- bar.txt' '
 	check_graph --full-history --simplify-merges --all -- bar.txt <<-\EOF
 	* A7
@@ -160,6 +194,18 @@ test_expect_success '--graph --full-history --simplify-merges -- bar.txt' '
 	EOF
 '

+test_expect_success '--graph --ignore-merge-bases --full-history --simplify-merges -- bar.txt' '
+	check_graph --ignore-merge-bases --full-history --simplify-merges --all -- bar.txt <<-\EOF
+	* A7
+	*   A6
+	|\
+	| * C4
+	* A5
+	* A3
+	* A2
+	EOF
+'
+
 test_expect_success '--graph -- bar.txt' '
 	check_graph --all -- bar.txt <<-\EOF
 	* A7
@@ -243,4 +289,27 @@ test_expect_success '--graph --boundary ^C3' '
 	EOF
 '

+test_expect_success '--graph --ignore-merge-bases --boundary ^C3' '
+	check_graph --ignore-merge-bases --boundary --all ^C3 <<-\EOF
+	* A7
+	*   A6
+	|\
+	| * C4
+	* | A5
+	| |
+	|  \
+	*-. \   A4
+	|\ \ \
+	| * | | B2
+	| * | | B1
+	|  / /
+	* | | A3
+	o | | A2
+	 / /
+	o / C2
+	 /
+	o C3
+	EOF
+'
+
 test_done
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] Teach git-rev-list --simplify-forks
  2020-05-01 14:13     ` Antonio Russo
@ 2020-05-01 15:44       ` Derrick Stolee
  0 siblings, 0 replies; 17+ messages in thread
From: Derrick Stolee @ 2020-05-01 15:44 UTC (permalink / raw)
  To: Antonio Russo, git-ml; +Cc: Junio C Hamano, Michael Haggerty, Derrick Stolee

On 5/1/2020 10:13 AM, Antonio Russo wrote:
> On 4/29/20 7:12 AM, Derrick Stolee wrote:
>> On 4/29/2020 4:01 AM, Antonio Russo wrote:
>>> When used with --graph, instead of displaying the full graph, display a
>>> spanning subgraph produced by a depth-first search of the graph visiting
>>> parents from left to right.  Edges to already visited commits are
>>> discarded.  This process is repeated for every commit to be displayed.
>>>
>>> This is valuable to reduce visual clutter when there are many merges
>>> that were not rebased onto each other and the user is primarily
>>> interested in the state of the branch being merged into.
>>
>> My earlier comment to recommend how to fix the test failures intended
>> to demonstrate that this area of code requires a bit of precision. I
>> took some time to review this patch, but I find it needs some significant
>> updates.
>>
>> tl;dr: The way you manipulate the commit parents seems incorrect to me,
>> but perhaps there is enough prior art in the way we simplify parents to
>> make that acceptable. Someone else will need to chime in on that part.
> 
> First, thank you for taking the time look at this.  I understand your
> hesitation about the "amputation" of the history, but in some sense
> that's the point of this option.  I really want to be ignorant of the
> details of when the fork branched off.  I would like the reported
> history to be appear nearly equivalent to a rebase-and-fastforward only
> merge style, which results in a much simpler git log --graph.
> 
>> It may be possible to do this "drop edges" entirely inside of graph.c
>> (the graph rendering code) instead of revision.c, which would automatically
>> work with the performance gains from the newer topo-walk algorithm.
> 
> Non-local information about the commit graph needs to be used to do this
> amputation of the history.  We cannot know how many parents we want to
> display until we've completely explored all the parents of that node.
> Unfortunately, that means that the whole graph needs to be loaded, and I
> cannot really see how there would be any gain by doing this in graph.c.
> 
> Caveat: there are semi-streaming DFS implementations (i.e., O(n log n)
> space) that we might be able to use to get the first line out the door
> quicker. I would, however, like to leave that to another patch.
> 
> I will also add that, for the tests I've done, all performance penalties
> have been insignificant (less than ~5% for showing the first commit),
> while there are significant performance _improvements_, e.g., ~40% for
> displaying the full tree.
> 
> A notable exception is --all, which can be ~50x faster for the full
> output, but is often dramatically slower to show anything (i.e., the
> first line).
> 
>> There are enough deviations from code and test style that this will
>> need a significant revision regardless.
> 
> (Please see forthcoming revision 3).
> 
>>> This second revision of the patch sets revs->limited.  This forces the
>>> graph of commits to be loaded, and simplfiy_forks() therefore reliably
>>> traverses it.  This addresses the test failures mentioned before (see [1]).
>>
>> This will have a significant performance impact on the option, as you will
>> not see even the first result until the DFS has completed.
> 
> First of all, short of using some other, more sophisticated streaming
> version of this algorithm, the full DFS must finish before the first
> commit having two (or more) parents can be shown.
> 
> That said, the performance is not significantly affected:
> 
> I ran the following test (2.26.2, with my patch on top of it):
> (git lg = git log --graph --pretty=oneline)
> 
>  % time git lg -n1 --ignore-merge-bases e896a286df > /dev/null
>  0.73s user 0.02s system 99% cpu 0.746 total
> 
>  % time git lg -n1 e896a286df > /dev/null
>  0.72s user 0.01s system 99% cpu 0.731 total
> 
>  For the linux git repo:
> 
>  % time git lg -n1 --ignore-merge-bases v5.7-rc3 >/dev/null
>  9.25s user 0.39s system 99% cpu 9.646 total
> 
>  % time git lg -n1 v5.7-rc3 >/dev/null
>  9.02s user 0.35s system 99% cpu 9.378 total
> 
> So the performance seems basically unaffected for very limited graphs.
> It's also about 40% faster for complicated ones (as mentioned in my
> first email):
> 
>  % time git lg --ignore-merge-bases e870325ee8 > /dev/null
>  0.83s user 0.06s system 99% cpu 0.886 total
> 
>  % time git lg e870325ee8 > /dev/null
>  1.41s user 0.03s system 99% cpu 1.443 total
> 
>  For the linux git repo:
> 
>  % time git lg --ignore-merge-bases v5.7-rc3 >/dev/null
>  11.86s user 0.62s system 99% cpu 12.489 total
> 
>  % time git lg v5.7-rc3 >/dev/null
>  21.56s user 0.55s system 99% cpu 22.108 total

First, run `git commit-graph write --reachable` to create a commit-graph
file which has generation numbers. In this case, I get the following:

$ time git log --oneline --graph v5.7-rc3 >/dev/null

real    0m13.548s
user    0m13.348s
sys     0m0.200s

$ time git log --oneline --graph -n 1 v5.7-rc3 >/dev/null

real    0m0.007s
user    0m0.004s
sys     0m0.004s

Notice exactly how much better this gets for the first result with that
file.

> This is because the amputated graph is much simpler, and the rest of the
> code needs to do much less work.
> 
> Passing --all is another beast, and does indeed suffer:
> 
>  % time git lg --ignore-merge-bases --all >/dev/null
>  4.06s user 0.04s system 99% cpu 4.105 total
> 
>  % time git lg --all >/dev/null
>  189.59s user 0.04s system 99% cpu 3:09.65 total
> 
>  (and for the first line)
> 
>  % time git lg -n1 --ignore-merge-bases --all >/dev/null
>  3.86s user 0.02s system 99% cpu 3.874 total
> 
>  % time git lg -n1 --all >/dev/null
>  0.83s user 0.02s system 99% cpu 0.848 total
> 
> (If you need to use --all for the Linux git repo, you should not use
> --ignore-merge-bases).

I think this is a deficiency in your implementation, not a hard rule
about how these options would need to interact.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3 v3] Clean up t6016-rev-list-graph-simplify-history
  2020-05-01 14:21   ` [PATCH 1/3 v3] Clean up t6016-rev-list-graph-simplify-history Antonio Russo
@ 2020-05-01 17:10     ` Junio C Hamano
  2020-05-02 15:27       ` Antonio Russo
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2020-05-01 17:10 UTC (permalink / raw)
  To: Antonio Russo; +Cc: git-ml, Michael Haggerty, Derrick Stolee

Antonio Russo <antonio.e.russo@gmail.com> writes:

> +check_graph () {
> +	cat >expected &&

Not a new issue, but we may want to fix this to align to majority of
tests by calling it "expect".

> +	git rev-list --graph "$@" | sed "$(cat sedscript)" > actual &&

Style. No SP between > (or < for that matter) and the filename.

The "sed" utility can be told to read its script from a file with
its "-f" option.

Correctness.  Never run "git" command that is the target being
tested on the left side of a pipe.  It will hide the exit status.

> +	test_cmp expected actual
> +}

>  test_expect_success 'set up rev-list --graph test' '
>  	# 3 commits on branch A
>  	test_commit A1 foo.txt &&
> @@ -43,76 +49,62 @@ test_expect_success 'set up rev-list --graph test' '
>
>  	test_commit A7 bar.txt &&
>
> -	# Store commit names in variables for later use
> -	A1=$(git rev-parse --verify A1) &&
> -	A2=$(git rev-parse --verify A2) &&
> -	A3=$(git rev-parse --verify A3) &&
> -	A4=$(git rev-parse --verify A4) &&
> -	A5=$(git rev-parse --verify A5) &&
> -	A6=$(git rev-parse --verify A6) &&
> -	A7=$(git rev-parse --verify A7) &&
> -	B1=$(git rev-parse --verify B1) &&
> -	B2=$(git rev-parse --verify B2) &&
> -	C1=$(git rev-parse --verify C1) &&
> -	C2=$(git rev-parse --verify C2) &&
> -	C3=$(git rev-parse --verify C3) &&
> -	C4=$(git rev-parse --verify C4)
> -	'
> +	echo "s/ *$//;" > sedscript &&
> +	( for tag in $(git tag -l) ; do echo "s/$(git rev-parse --verify $tag)/$tag/;" >> sedscript ; done )

Avoid unreadable one-liner with needless subshell.

I suspect that this is a task for-each-ref was designed for,
something along the lines of...

	git for-each-ref --format='s|%(objectname)|%(refname:short)|' \
		refs/tags/ >>sedScript

> +'
>
>  test_expect_success '--graph --all' '
> -	rm -f expected &&
> -	echo "* $A7" >> expected &&
> -	echo "*   $A6" >> expected &&
> -	echo "|\\  " >> expected &&
> -	echo "| * $C4" >> expected &&
> -	echo "| * $C3" >> expected &&
> -	echo "* | $A5" >> expected &&
> -	echo "| |   " >> expected &&
> -	echo "|  \\  " >> expected &&
> -	echo "*-. | $A4" >> expected &&
> -	echo "|\\ \\| " >> expected &&
> -	echo "| | * $C2" >> expected &&
> -	echo "| | * $C1" >> expected &&
> -	echo "| * | $B2" >> expected &&
> -	echo "| * | $B1" >> expected &&
> -	echo "* | | $A3" >> expected &&
> -	echo "| |/  " >> expected &&
> -	echo "|/|   " >> expected &&
> -	echo "* | $A2" >> expected &&
> -	echo "|/  " >> expected &&
> -	echo "* $A1" >> expected &&
> -	git rev-list --graph --all > actual &&
> -	test_cmp expected actual
> -	'
> +	check_graph --all <<-\EOF
> +	* A7
> +	*   A6
> +	|\
> +	| * C4
> +	| * C3
> +	* | A5
> +	| |
> +	|  \
> +	*-. | A4
> +	|\ \|
> +	| | * C2
> +	| | * C1
> +	| * | B2
> +	| * | B1
> +	* | | A3
> +	| |/
> +	|/|
> +	* | A2
> +	|/
> +	* A1
> +	EOF

Much nicer to see.

Having said all that, I am not sure if this change of design is
sound.

The original approach would have worked even if two or more of these
tags pointed at the same object.  Your version will pick one of
them.  If two tags, say A5 and C8, pointed at the same commit, and
the illustration given to check_graph helper from its standard
output labeled a commit as C8, wouldn't the actual output converted
to show A5 with your sedScript approach?

I think it is salvageable by changing the direction you munge.
Instead of munging the rev-list output, you can store it as-is in
"actual", and instead pass the illustration that comes from the
standard input of the check_graph helper through sed to expand the
symbolic names to actual object names before comparing. i.e.

	check_graph () {
		sed -f expand_tag_to_objects.sed >expect &&
		git rev-list --graph "$@" >actual &&
		test_cmp expect actual
	}

Note that I renamed the overly generic "sedscript" to a name that
reflects the purpose of the file (and the contents being of a
certain type is conveyed by .sed suffix, just like you'd use
suffixes like .c, .txt).  A good discipline to learn, I would say.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3 v3] Teach git-rev-list --ignore-merge-bases
  2020-05-01 14:22   ` [PATCH 2/3 v3] Teach git-rev-list --ignore-merge-bases Antonio Russo
@ 2020-05-02 13:25     ` Johannes Sixt
  2020-05-02 14:21       ` Antonio Russo
  0 siblings, 1 reply; 17+ messages in thread
From: Johannes Sixt @ 2020-05-02 13:25 UTC (permalink / raw)
  To: Antonio Russo
  Cc: Git Mailing List, Junio C Hamano, Michael Haggerty,
	Derrick Stolee

Am 01.05.20 um 16:22 schrieb Antonio Russo:
> +--ignore-merge-bases::
> +	Show commits that are introduced by each merge before showing
> +	the first parent of the merge but remove edges from those commits
> +	to commits reachable from the first parent. When used with
> +	`--graph`, this can help visualize repositories with many merges
> +	when you are not interested in the merge base used for each
> +	merge. It also reduces the width of the graph visualization
> +	when used with `--graph`.

I wonder whether there is some potential to prettify another aspect of
--graph once this feature is in place: In particular, when I *am*
interested in merge-bases, I use --boundary, and I get this chart:

$ git log --graph --boundary --oneline --no-decorate 7d28d69174~4..7d28d69174

*   7d28d69174 Merge branch 'jc/allow-strlen-substitution-in-shell-scripts'
|\  
| * 78725ebda9 CodingGuidelines: allow ${#posix} == strlen($posix)
* |   dfdce31ce6 Merge branch 'en/pull-do-not-rebase-after-fast-forwarding'
|\ \  
| * | fbae70ddc6 pull: avoid running both merge and rebase
| |/  
* |   b660a76d0f Merge branch 'dl/wrapper-fix-indentation'
|\ \  
| * | 7cd54d37dc wrapper: indent with tabs
* | |   3d6c56dd66 Merge branch 'ag/sequencer-i18n-messages'
|\ \ \  
| * | | 4d55d63bde sequencer: mark messages for translation
| o | | 3bab5d5625 The second batch post 2.26 cycle
|  / /  
o / / 9f471e4b95 Merge branch 'rs/pull-options-sync-code-and-doc'
 / /  
o / 0b6806b9e4 xread, xwrite: limit size of IO to 8MB
 /  
o b6d4d82bd5 msvc: accommodate for vcpkg's upgrade to OpenSSL v1.1.x

but I would prefer to see something like this:

*   7d28d69174 Merge branch 'jc/allow-strlen-substitution-in-shell-scripts'
|\  
| * 78725ebda9 CodingGuidelines: allow ${#posix} == strlen($posix)
* |   dfdce31ce6 Merge branch 'en/pull-do-not-rebase-after-fast-forwarding'
|\ \  
| * | fbae70ddc6 pull: avoid running both merge and rebase
| |/
| o b6d4d82bd5 msvc: accommodate for vcpkg's upgrade to OpenSSL v1.1.x  
*   b660a76d0f Merge branch 'dl/wrapper-fix-indentation'
|\  
| * 7cd54d37dc wrapper: indent with tabs
| o 0b6806b9e4 xread, xwrite: limit size of IO to 8MB
*   3d6c56dd66 Merge branch 'ag/sequencer-i18n-messages'
|\  
| * 4d55d63bde sequencer: mark messages for translation
| o 3bab5d5625 The second batch post 2.26 cycle
o 9f471e4b95 Merge branch 'rs/pull-options-sync-code-and-doc'

Maybe that is just an easy fall-out?

> diff --git a/object.h b/object.h
> index b22328b838..0bf6fb0d55 100644
> --- a/object.h
> +++ b/object.h
> @@ -59,7 +59,7 @@ struct object_array {
> 
>  /*
>   * object flag allocation:
> - * revision.h:               0---------10         15                   25----28
> + * revision.h:               0---------10         15                   23----28

I think the intent is that the line is like a ruler with marks: you
should move "23" sufficiently far to the left, because the "23" mark
cannot be at the same spot where the "25" mark was.

(I'm not reviewing this patch as I am illiterate when it comes to the
revision walker, I just happened to notice this when I skimmed the
patch.)

>   * fetch-pack.c:             01
>   * negotiator/default.c:       2--5
>   * walker.c:                 0-2

-- Hannes

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3 v3] Teach git-rev-list --ignore-merge-bases
  2020-05-02 13:25     ` Johannes Sixt
@ 2020-05-02 14:21       ` Antonio Russo
  0 siblings, 0 replies; 17+ messages in thread
From: Antonio Russo @ 2020-05-02 14:21 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Git Mailing List, Junio C Hamano, Michael Haggerty,
	Derrick Stolee

On 5/2/20 7:25 AM, Johannes Sixt wrote:
> Am 01.05.20 um 16:22 schrieb Antonio Russo:
>> +--ignore-merge-bases::
>> +	Show commits that are introduced by each merge before showing
>> +	the first parent of the merge but remove edges from those commits
>> +	to commits reachable from the first parent. When used with
>> +	`--graph`, this can help visualize repositories with many merges
>> +	when you are not interested in the merge base used for each
>> +	merge. It also reduces the width of the graph visualization
>> +	when used with `--graph`.
> 
> I wonder whether there is some potential to prettify another aspect of
> --graph once this feature is in place: In particular, when I *am*
> interested in merge-bases, I use --boundary, and I get this chart:
> 
> $ git log --graph --boundary --oneline --no-decorate 7d28d69174~4..7d28d69174
> 
> *   7d28d69174 Merge branch 'jc/allow-strlen-substitution-in-shell-scripts'
> |\  
> | * 78725ebda9 CodingGuidelines: allow ${#posix} == strlen($posix)
> * |   dfdce31ce6 Merge branch 'en/pull-do-not-rebase-after-fast-forwarding'
> |\ \  
> | * | fbae70ddc6 pull: avoid running both merge and rebase
> | |/  
> * |   b660a76d0f Merge branch 'dl/wrapper-fix-indentation'
> |\ \  
> | * | 7cd54d37dc wrapper: indent with tabs
> * | |   3d6c56dd66 Merge branch 'ag/sequencer-i18n-messages'
> |\ \ \  
> | * | | 4d55d63bde sequencer: mark messages for translation
> | o | | 3bab5d5625 The second batch post 2.26 cycle
> |  / /  
> o / / 9f471e4b95 Merge branch 'rs/pull-options-sync-code-and-doc'
>  / /  
> o / 0b6806b9e4 xread, xwrite: limit size of IO to 8MB
>  /  
> o b6d4d82bd5 msvc: accommodate for vcpkg's upgrade to OpenSSL v1.1.x
> 
> but I would prefer to see something like this:
> 
> *   7d28d69174 Merge branch 'jc/allow-strlen-substitution-in-shell-scripts'
> |\  
> | * 78725ebda9 CodingGuidelines: allow ${#posix} == strlen($posix)
> * |   dfdce31ce6 Merge branch 'en/pull-do-not-rebase-after-fast-forwarding'
> |\ \  
> | * | fbae70ddc6 pull: avoid running both merge and rebase
> | |/
> | o b6d4d82bd5 msvc: accommodate for vcpkg's upgrade to OpenSSL v1.1.x  
> *   b660a76d0f Merge branch 'dl/wrapper-fix-indentation'
> |\  
> | * 7cd54d37dc wrapper: indent with tabs
> | o 0b6806b9e4 xread, xwrite: limit size of IO to 8MB
> *   3d6c56dd66 Merge branch 'ag/sequencer-i18n-messages'
> |\  
> | * 4d55d63bde sequencer: mark messages for translation
> | o 3bab5d5625 The second batch post 2.26 cycle
> o 9f471e4b95 Merge branch 'rs/pull-options-sync-code-and-doc'
> 
> Maybe that is just an easy fall-out?

Yeah, I noticed this too.  What we want is to create a spanning tree by left-first
DFS (amputating edges connecting to already visited vertices), and then walk the
amputated tree in a right-first DFS (and then not sort the commits any further).

The tricky part is getting the results of the right-first DFS before the left-first
(amputation) DFS finishes.  I'm working on it.

>> diff --git a/object.h b/object.h
>> index b22328b838..0bf6fb0d55 100644
>> --- a/object.h
>> +++ b/object.h
>> @@ -59,7 +59,7 @@ struct object_array {
>>
>>  /*
>>   * object flag allocation:
>> - * revision.h:               0---------10         15                   25----28
>> + * revision.h:               0---------10         15                   23----28
> 
> I think the intent is that the line is like a ruler with marks: you
> should move "23" sufficiently far to the left, because the "23" mark
> cannot be at the same spot where the "25" mark was.

I though this was only a schematic ruler, but, counting it now, it is exact.
I'll fix it.

> (I'm not reviewing this patch as I am illiterate when it comes to the
> revision walker, I just happened to notice this when I skimmed the
> patch.)

Thank you,
Antonio

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3 v3] Clean up t6016-rev-list-graph-simplify-history
  2020-05-01 17:10     ` Junio C Hamano
@ 2020-05-02 15:27       ` Antonio Russo
  0 siblings, 0 replies; 17+ messages in thread
From: Antonio Russo @ 2020-05-02 15:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git-ml, Michael Haggerty, Derrick Stolee

On 5/1/20 11:10 AM, Junio C Hamano wrote:
> Antonio Russo <antonio.e.russo@gmail.com> writes:
> 
>> +check_graph () {
>> +	cat >expected &&
> 
> Not a new issue, but we may want to fix this to align to majority of
> tests by calling it "expect".
> 
>> +	git rev-list --graph "$@" | sed "$(cat sedscript)" > actual &&
> 
> Style. No SP between > (or < for that matter) and the filename.
> 
> The "sed" utility can be told to read its script from a file with
> its "-f" option.
> 
> Correctness.  Never run "git" command that is the target being
> tested on the left side of a pipe.  It will hide the exit status.

>> -	'
>> +	echo "s/ *$//;" > sedscript &&
>> +	( for tag in $(git tag -l) ; do echo "s/$(git rev-parse --verify $tag)/$tag/;" >> sedscript ; done )
> 
> Avoid unreadable one-liner with needless subshell.
> 
> I suspect that this is a task for-each-ref was designed for,
> something along the lines of...
> 
> 	git for-each-ref --format='s|%(objectname)|%(refname:short)|' \
> 		refs/tags/ >>sedScript
> 

Thanks, I've fixed these.

> 
> Much nicer to see.
> 
> Having said all that, I am not sure if this change of design is
> sound.
> 
> The original approach would have worked even if two or more of these
> tags pointed at the same object.  Your version will pick one of
> them.  If two tags, say A5 and C8, pointed at the same commit, and
> the illustration given to check_graph helper from its standard
> output labeled a commit as C8, wouldn't the actual output converted
> to show A5 with your sedScript approach?
> 
> I think it is salvageable by changing the direction you munge.
> Instead of munging the rev-list output, you can store it as-is in
> "actual", and instead pass the illustration that comes from the
> standard input of the check_graph helper through sed to expand the
> symbolic names to actual object names before comparing. i.e.
> 
> 	check_graph () {
> 		sed -f expand_tag_to_objects.sed >expect &&
> 		git rev-list --graph "$@" >actual &&
> 		test_cmp expect actual
> 	}
> 
> Note that I renamed the overly generic "sedscript" to a name that
> reflects the purpose of the file (and the contents being of a
> certain type is conveyed by .sed suffix, just like you'd use
> suffixes like .c, .txt).  A good discipline to learn, I would say.

I also fixed these.  I originally chose the other direction
(oid->refname) to make the output more human readable.

If you think that's a worthwhile goal, I can figure something out.

That said, the patchset as it stands scratches my immediate need,
and addressing D. Stolee's comments (in particular, making it work
with revs->limited==0) unfortunately has to be a lower priority
for me.

I am not abandoning the patchset, but may be more delayed
responding to review.

Thanks,
Antonio

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-05-02 15:32 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-27  5:07 [PATCH] Teach git-rev-list --simplify-forks Antonio Russo
2020-04-27 10:55 ` Derrick Stolee
2020-04-28 12:49   ` Antonio Russo
2020-04-28 14:11     ` Derrick Stolee
2020-04-29  8:01 ` [PATCH v2] " Antonio Russo
2020-04-29 10:08   ` Philip Oakley
2020-04-29 13:04     ` Antonio Russo
2020-04-29 13:12   ` Derrick Stolee
2020-05-01 14:13     ` Antonio Russo
2020-05-01 15:44       ` Derrick Stolee
2020-05-01 14:21   ` [PATCH 1/3 v3] Clean up t6016-rev-list-graph-simplify-history Antonio Russo
2020-05-01 17:10     ` Junio C Hamano
2020-05-02 15:27       ` Antonio Russo
2020-05-01 14:22   ` [PATCH 2/3 v3] Teach git-rev-list --ignore-merge-bases Antonio Russo
2020-05-02 13:25     ` Johannes Sixt
2020-05-02 14:21       ` Antonio Russo
2020-05-01 14:23   ` [PATCH 3/3 v3] Add new tests of ignore-merge-bases Antonio Russo

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).