git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/3] merge-ort: implement recursive merges
@ 2020-12-15 17:53 Elijah Newren via GitGitGadget
  2020-12-15 17:53 ` [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
                   ` (3 more replies)
  0 siblings, 4 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-15 17:53 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

This series depends on en/merge-ort-2 (it does NOT depend on en/merge-ort-3
and can thus be reviewed/merged independently of it).

This short series adds handling of recursive merges (merging of multiple
merge-bases to create a virtual merge base) to merge-ort. With this short
series the number of test failures under GIT_TEST_MERGE_ALGORITHM=ort drops
by 801 (from 1448 to 647).

Elijah Newren (3):
  merge-ort: copy a few small helper functions from merge-recursive.c
  merge-ort: make clear_internal_opts() aware of partial clearing
  merge-ort: implement merge_incore_recursive()

 merge-ort.c | 133 +++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 126 insertions(+), 7 deletions(-)


base-commit: c5a6f65527aa3b6f5d7cf25437a88d8727ab0646
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-814%2Fnewren%2Fort-recursive-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-814/newren/ort-recursive-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/814
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-15 17:53 [PATCH 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
@ 2020-12-15 17:53 ` Elijah Newren via GitGitGadget
  2020-12-16  1:16   ` Junio C Hamano
  2020-12-16 13:30   ` Derrick Stolee
  2020-12-15 17:53 ` [PATCH 2/3] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-15 17:53 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

In a subsequent commit, we will implement the traditional recursiveness
that gave merge-recursive its name, namely merging non-unique
merge-bases to come up with a single virtual merge base.  Copy a few
helper functions from merge-recursive.c that we will use in the
implementation.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index 414e7b7eeac..05ba92c91a6 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -17,8 +17,10 @@
 #include "cache.h"
 #include "merge-ort.h"
 
+#include "alloc.h"
 #include "blob.h"
 #include "cache-tree.h"
+#include "commit.h"
 #include "commit-reach.h"
 #include "diff.h"
 #include "diffcore.h"
@@ -1348,6 +1350,34 @@ void merge_finalize(struct merge_options *opt,
 
 /*** Function Grouping: helper functions for merge_incore_*() ***/
 
+static inline void set_commit_tree(struct commit *c, struct tree *t)
+{
+	c->maybe_tree = t;
+}
+
+static struct commit *make_virtual_commit(struct repository *repo,
+					  struct tree *tree,
+					  const char *comment)
+{
+	struct commit *commit = alloc_commit_node(repo);
+
+	set_merge_remote_desc(commit, comment, (struct object *)commit);
+	set_commit_tree(commit, tree);
+	commit->object.parsed = 1;
+	return commit;
+}
+
+static struct commit_list *reverse_commit_list(struct commit_list *list)
+{
+	struct commit_list *next = NULL, *current, *backup;
+	for (current = list; current; current = backup) {
+		backup = current->next;
+		current->next = next;
+		next = current;
+	}
+	return next;
+}
+
 static void merge_start(struct merge_options *opt, struct merge_result *result)
 {
 	/* Sanity checks on opt */
@@ -1462,5 +1492,7 @@ void merge_incore_recursive(struct merge_options *opt,
 			    struct commit *side2,
 			    struct merge_result *result)
 {
+	(void)reverse_commit_list;
+	(void)make_virtual_commit;
 	die("Not yet implemented");
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 2/3] merge-ort: make clear_internal_opts() aware of partial clearing
  2020-12-15 17:53 [PATCH 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  2020-12-15 17:53 ` [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
@ 2020-12-15 17:53 ` Elijah Newren via GitGitGadget
  2020-12-15 17:53 ` [PATCH 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
  2020-12-16  5:52 ` [PATCH v2 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-15 17:53 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

In order to handle recursive merges, after merging merge-bases we need
to clear away most of the data we had built up but some of it needs to
be kept -- in particular the "output" field.  Rename the function to
reflect its future change in use.

Further, since "reinitialize" means we'll be reusing the fields
immediately, take advantage of this to only partially clear maps,
leaving the hashtable allocated and pre-sized.  (This may be slightly
out-of-order since the speedups aren't realized until there are far
more strmaps in use, but the patch submission process already went out
of order because of various questions and requests for strmap.  Anyway,
see commit 6ccdfc2a20 ("strmap: enable faster clearing and reusing of
strmaps", 2020-11-05), for performance details about the use of
strmap_partial_clear().)

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 05ba92c91a6..10a97e944c4 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -253,10 +253,11 @@ static void free_strmap_strings(struct strmap *map)
 	}
 }
 
-static void clear_internal_opts(struct merge_options_internal *opti,
-				int reinitialize)
+static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
+					  int reinitialize)
 {
-	assert(!reinitialize);
+	void (*strmap_func)(struct strmap *, int) =
+		reinitialize ? strmap_partial_clear : strmap_clear;
 
 	/*
 	 * We marked opti->paths with strdup_strings = 0, so that we
@@ -266,14 +267,14 @@ static void clear_internal_opts(struct merge_options_internal *opti,
 	 * to deallocate them.
 	 */
 	free_strmap_strings(&opti->paths);
-	strmap_clear(&opti->paths, 1);
+	strmap_func(&opti->paths, 1);
 
 	/*
 	 * All keys and values in opti->conflicted are a subset of those in
 	 * opti->paths.  We don't want to deallocate anything twice, so we
 	 * don't free the keys and we pass 0 for free_values.
 	 */
-	strmap_clear(&opti->conflicted, 0);
+	strmap_func(&opti->conflicted, 0);
 
 	/*
 	 * opti->paths_to_free is similar to opti->paths; we created it with
@@ -1344,7 +1345,7 @@ void merge_finalize(struct merge_options *opt,
 
 	assert(opt->priv == NULL);
 
-	clear_internal_opts(opti, 0);
+	clear_or_reinit_internal_opts(opti, 0);
 	FREE_AND_NULL(opti);
 }
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 3/3] merge-ort: implement merge_incore_recursive()
  2020-12-15 17:53 [PATCH 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  2020-12-15 17:53 ` [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
  2020-12-15 17:53 ` [PATCH 2/3] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
@ 2020-12-15 17:53 ` Elijah Newren via GitGitGadget
  2020-12-16  2:07   ` Junio C Hamano
  2020-12-16  5:52 ` [PATCH v2 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  3 siblings, 1 reply; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-15 17:53 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Implement merge_incore_recursive(), mostly through the use of a new
helper function, merge_ort_internal(), which itself is based off
merge_recursive_internal() from merge-recursive.c.

This drops the number of failures in the testsuite when run under
GIT_TEST_MERGE_ALGORITHM=ort from around 1500 to 647.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 89 insertions(+), 3 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 10a97e944c4..65f7ce2b223 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1476,6 +1476,90 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt,
 	}
 }
 
+/*
+ * Originally from merge_recursive_internal(); somewhat adapted, though.
+ */
+static void merge_ort_internal(struct merge_options *opt,
+			       struct commit_list *merge_bases,
+			       struct commit *h1,
+			       struct commit *h2,
+			       struct merge_result *result)
+{
+	struct commit_list *iter;
+	struct commit *merged_merge_bases;
+	const char *ancestor_name;
+	struct strbuf merge_base_abbrev = STRBUF_INIT;
+
+	if (!merge_bases) {
+		merge_bases = get_merge_bases(h1, h2);
+		merge_bases = reverse_commit_list(merge_bases);
+	}
+
+	merged_merge_bases = pop_commit(&merge_bases);
+	if (merged_merge_bases == NULL) {
+		/* if there is no common ancestor, use an empty tree */
+		struct tree *tree;
+
+		tree = lookup_tree(opt->repo, opt->repo->hash_algo->empty_tree);
+		merged_merge_bases = make_virtual_commit(opt->repo, tree,
+							 "ancestor");
+		ancestor_name = "empty tree";
+	} else if (opt->ancestor && !opt->priv->call_depth) {
+		ancestor_name = opt->ancestor;
+	} else if (merge_bases) {
+		ancestor_name = "merged common ancestors";
+	} else {
+		strbuf_add_unique_abbrev(&merge_base_abbrev,
+					 &merged_merge_bases->object.oid,
+					 DEFAULT_ABBREV);
+		ancestor_name = merge_base_abbrev.buf;
+	}
+
+	for (iter = merge_bases; iter; iter = iter->next) {
+		const char *saved_b1, *saved_b2;
+		struct commit *prev = merged_merge_bases;
+
+		opt->priv->call_depth++;
+		/*
+		 * When the merge fails, the result contains files
+		 * with conflict markers. The cleanness flag is
+		 * ignored (unless indicating an error), it was never
+		 * actually used, as result of merge_trees has always
+		 * overwritten it: the committed "conflicts" were
+		 * already resolved.
+		 */
+		saved_b1 = opt->branch1;
+		saved_b2 = opt->branch2;
+		opt->branch1 = "Temporary merge branch 1";
+		opt->branch2 = "Temporary merge branch 2";
+		merge_ort_internal(opt, NULL, prev, iter->item, result);
+		if (result->clean < 0)
+			return;
+		opt->branch1 = saved_b1;
+		opt->branch2 = saved_b2;
+		opt->priv->call_depth--;
+
+		merged_merge_bases = make_virtual_commit(opt->repo,
+							 result->tree,
+							 "merged tree");
+		commit_list_insert(prev, &merged_merge_bases->parents);
+		commit_list_insert(iter->item,
+				   &merged_merge_bases->parents->next);
+
+		clear_or_reinit_internal_opts(opt->priv, 1);
+	}
+
+	opt->ancestor = ancestor_name;
+	merge_ort_nonrecursive_internal(opt,
+					repo_get_commit_tree(opt->repo,
+							     merged_merge_bases),
+					repo_get_commit_tree(opt->repo, h1),
+					repo_get_commit_tree(opt->repo, h2),
+					result);
+	strbuf_release(&merge_base_abbrev);
+	opt->ancestor = NULL;  /* avoid accidental re-use of opt->ancestor */
+}
+
 void merge_incore_nonrecursive(struct merge_options *opt,
 			       struct tree *merge_base,
 			       struct tree *side1,
@@ -1493,7 +1577,9 @@ void merge_incore_recursive(struct merge_options *opt,
 			    struct commit *side2,
 			    struct merge_result *result)
 {
-	(void)reverse_commit_list;
-	(void)make_virtual_commit;
-	die("Not yet implemented");
+	assert(opt->ancestor == NULL ||
+	       !strcmp(opt->ancestor, "constructed merge base"));
+
+	merge_start(opt, result);
+	merge_ort_internal(opt, merge_bases, side1, side2, result);
 }
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-15 17:53 ` [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
@ 2020-12-16  1:16   ` Junio C Hamano
  2020-12-16 16:12     ` Johannes Schindelin
  2020-12-16 13:30   ` Derrick Stolee
  1 sibling, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2020-12-16  1:16 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> In a subsequent commit, we will implement the traditional recursiveness
> that gave merge-recursive its name, namely merging non-unique
> merge-bases to come up with a single virtual merge base.  Copy a few
> helper functions from merge-recursive.c that we will use in the
> implementation.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ...
> @@ -1462,5 +1492,7 @@ void merge_incore_recursive(struct merge_options *opt,
>  			    struct commit *side2,
>  			    struct merge_result *result)
>  {
> +	(void)reverse_commit_list;
> +	(void)make_virtual_commit;

To keep these symbols referenced?  Cute ;-)

>  	die("Not yet implemented");
>  }

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/3] merge-ort: implement merge_incore_recursive()
  2020-12-15 17:53 ` [PATCH 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
@ 2020-12-16  2:07   ` Junio C Hamano
  2020-12-16  4:09     ` Elijah Newren
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2020-12-16  2:07 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +/*
> + * Originally from merge_recursive_internal(); somewhat adapted, though.
> + */
> +static void merge_ort_internal(struct merge_options *opt,
> +			       struct commit_list *merge_bases,
> +			       struct commit *h1,
> +			       struct commit *h2,
> +			       struct merge_result *result)
> +{
> +	struct commit_list *iter;
> +	struct commit *merged_merge_bases;
> +	const char *ancestor_name;
> +	struct strbuf merge_base_abbrev = STRBUF_INIT;
> +
> +	if (!merge_bases) {
> +		merge_bases = get_merge_bases(h1, h2);
> +		merge_bases = reverse_commit_list(merge_bases);

Do we want to say why we reverse here, or is the reason so well
known why in the original merge-recursive case?

> +	}
> +
> +	merged_merge_bases = pop_commit(&merge_bases);
> +	if (merged_merge_bases == NULL) {
> +		/* if there is no common ancestor, use an empty tree */
> +		struct tree *tree;
> +
> +		tree = lookup_tree(opt->repo, opt->repo->hash_algo->empty_tree);
> +		merged_merge_bases = make_virtual_commit(opt->repo, tree,
> +							 "ancestor");
> +		ancestor_name = "empty tree";
> +	} else if (opt->ancestor && !opt->priv->call_depth) {
> +		ancestor_name = opt->ancestor;
> +	} else if (merge_bases) {
> +		ancestor_name = "merged common ancestors";
> +	} else {
> +		strbuf_add_unique_abbrev(&merge_base_abbrev,
> +					 &merged_merge_bases->object.oid,
> +					 DEFAULT_ABBREV);
> +		ancestor_name = merge_base_abbrev.buf;
> +	}

So, up to this point we learned:

 - merge bases either given by the caller, or computed from h1 and
   h2 when the caller just told us to figure it out ourselves.

 - if we have

   - 0 merge base between h1 and h2, in which case we would use an
     empty tree as an imaginary common

   - 1 merge base between h1 and h2, in which case the common
     ancestor of the resuting merge between h1 and h2 is that single
     merge base

   - 2 or more bases, in which case we'd use that would eventually
     come back when we merged recursively all bases.

and the primary products of the above procedure are

 - ancestor_name (the string used when presenting conflicts while
   merging h1 and h2)

 - merged_merge_bases (one starting commit among the merge bases)

And then the loop will iterate over the remaining merge bases,
merging one popped from it in the current merged_merge_bases,
until we run out.  At that point when we leave the loop, we'd
have merged_merge_bases that is a virtual commit to be used as
a single merge base to use while merging trees of h1 and h2.

> +	for (iter = merge_bases; iter; iter = iter->next) {
> +		const char *saved_b1, *saved_b2;
> +		struct commit *prev = merged_merge_bases;
> +
> +		opt->priv->call_depth++;
> +		/*
> +		 * When the merge fails, the result contains files
> +		 * with conflict markers. The cleanness flag is
> +		 * ignored (unless indicating an error), it was never
> +		 * actually used, as result of merge_trees has always
> +		 * overwritten it: the committed "conflicts" were
> +		 * already resolved.
> +		 */
> +		saved_b1 = opt->branch1;
> +		saved_b2 = opt->branch2;
> +		opt->branch1 = "Temporary merge branch 1";
> +		opt->branch2 = "Temporary merge branch 2";
> +		merge_ort_internal(opt, NULL, prev, iter->item, result);
> +		if (result->clean < 0)
> +			return;
> +		opt->branch1 = saved_b1;
> +		opt->branch2 = saved_b2;
> +		opt->priv->call_depth--;
> +
> +		merged_merge_bases = make_virtual_commit(opt->repo,
> +							 result->tree,
> +							 "merged tree");
> +		commit_list_insert(prev, &merged_merge_bases->parents);
> +		commit_list_insert(iter->item,
> +				   &merged_merge_bases->parents->next);

We need to record these parents because...?  When merged_merge_bases
we just created is used as one side of a merge in the next iteration,
we'd need to compute the merge base between it and the one we'd pop
out of merge_bases, and that is why.

> +		clear_or_reinit_internal_opts(opt->priv, 1);
> +	}

OK.  I think I understood this loop.  It looks mostly straight-forward.

> +	opt->ancestor = ancestor_name;

And the label to be used, that was computed before the above loop,
is used here...

> +	merge_ort_nonrecursive_internal(opt,
> +					repo_get_commit_tree(opt->repo,
> +							     merged_merge_bases),
> +					repo_get_commit_tree(opt->repo, h1),
> +					repo_get_commit_tree(opt->repo, h2),
> +					result);

... to finally compute the 3-way merge between h1 and h2.

> +	strbuf_release(&merge_base_abbrev);

And the storage that may have been holding the .ancestor name is
cleared, as we no longer need it.

> +	opt->ancestor = NULL;  /* avoid accidental re-use of opt->ancestor */
> +}
> +
>  void merge_incore_nonrecursive(struct merge_options *opt,
>  			       struct tree *merge_base,
>  			       struct tree *side1,
> @@ -1493,7 +1577,9 @@ void merge_incore_recursive(struct merge_options *opt,
>  			    struct commit *side2,
>  			    struct merge_result *result)
>  {
> -	(void)reverse_commit_list;
> -	(void)make_virtual_commit;
> -	die("Not yet implemented");
> +	assert(opt->ancestor == NULL ||
> +	       !strcmp(opt->ancestor, "constructed merge base"));

Where does this string come from?  The recursive backend does
asssign a fixed string with that value to opt->ancestor, but we
don't expect that string to come here, no?

> +	merge_start(opt, result);
> +	merge_ort_internal(opt, merge_bases, side1, side2, result);
>  }

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/3] merge-ort: implement merge_incore_recursive()
  2020-12-16  2:07   ` Junio C Hamano
@ 2020-12-16  4:09     ` Elijah Newren
  2020-12-16  4:44       ` Elijah Newren
  0 siblings, 1 reply; 37+ messages in thread
From: Elijah Newren @ 2020-12-16  4:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Elijah Newren via GitGitGadget, Git Mailing List

On Tue, Dec 15, 2020 at 6:07 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > +/*
> > + * Originally from merge_recursive_internal(); somewhat adapted, though.
> > + */
> > +static void merge_ort_internal(struct merge_options *opt,
> > +                            struct commit_list *merge_bases,
> > +                            struct commit *h1,
> > +                            struct commit *h2,
> > +                            struct merge_result *result)
> > +{
> > +     struct commit_list *iter;
> > +     struct commit *merged_merge_bases;
> > +     const char *ancestor_name;
> > +     struct strbuf merge_base_abbrev = STRBUF_INIT;
> > +
> > +     if (!merge_bases) {
> > +             merge_bases = get_merge_bases(h1, h2);
> > +             merge_bases = reverse_commit_list(merge_bases);
>
> Do we want to say why we reverse here, or is the reason so well
> known why in the original merge-recursive case?

Oh, good point.  After starting on merge-ort, I shifted back and forth
between it and cleaning up merge-recursive for a while...and it looks
like this is one of the things I forgot to copy over.  The reason was
totally opaque to me until Johannes spelled it out over here:
https://lore.kernel.org/git/nycvar.QRO.7.76.6.1907252055500.21907@tvgsbejvaqbjf.bet/

Note that the same reversing of merge bases is done in builtin/merge.c
and sequencer.c as well.  It resulted in me adding the following note
to the declaration of merge_recursive() in merge-recursive.h:

 * NOTE: empirically, about a decade ago it was determined that with more
 *       than two merge bases, optimal behavior was found when the
 *       merge_bases were passed in the order of oldest commit to newest
 *       commit.  Also, merge_bases will be consumed (emptied) so make a
 *       copy if you need it.

but I never copied that comment over to merge_incore_recursive().  I
should do that, and perhaps reference that comment at this point in
the code.

> > +     }
> > +
> > +     merged_merge_bases = pop_commit(&merge_bases);
> > +     if (merged_merge_bases == NULL) {
> > +             /* if there is no common ancestor, use an empty tree */
> > +             struct tree *tree;
> > +
> > +             tree = lookup_tree(opt->repo, opt->repo->hash_algo->empty_tree);
> > +             merged_merge_bases = make_virtual_commit(opt->repo, tree,
> > +                                                      "ancestor");
> > +             ancestor_name = "empty tree";
> > +     } else if (opt->ancestor && !opt->priv->call_depth) {
> > +             ancestor_name = opt->ancestor;
> > +     } else if (merge_bases) {
> > +             ancestor_name = "merged common ancestors";
> > +     } else {
> > +             strbuf_add_unique_abbrev(&merge_base_abbrev,
> > +                                      &merged_merge_bases->object.oid,
> > +                                      DEFAULT_ABBREV);
> > +             ancestor_name = merge_base_abbrev.buf;
> > +     }
>
> So, up to this point we learned:
>
>  - merge bases either given by the caller, or computed from h1 and
>    h2 when the caller just told us to figure it out ourselves.
>
>  - if we have
>
>    - 0 merge base between h1 and h2, in which case we would use an
>      empty tree as an imaginary common
>
>    - 1 merge base between h1 and h2, in which case the common
>      ancestor of the resuting merge between h1 and h2 is that single
>      merge base
>
>    - 2 or more bases, in which case we'd use that would eventually
>      come back when we merged recursively all bases.
>
> and the primary products of the above procedure are
>
>  - ancestor_name (the string used when presenting conflicts while
>    merging h1 and h2)
>
>  - merged_merge_bases (one starting commit among the merge bases)
>
> And then the loop will iterate over the remaining merge bases,
> merging one popped from it in the current merged_merge_bases,
> until we run out.  At that point when we leave the loop, we'd
> have merged_merge_bases that is a virtual commit to be used as
> a single merge base to use while merging trees of h1 and h2.
>
> > +     for (iter = merge_bases; iter; iter = iter->next) {
> > +             const char *saved_b1, *saved_b2;
> > +             struct commit *prev = merged_merge_bases;
> > +
> > +             opt->priv->call_depth++;
> > +             /*
> > +              * When the merge fails, the result contains files
> > +              * with conflict markers. The cleanness flag is
> > +              * ignored (unless indicating an error), it was never
> > +              * actually used, as result of merge_trees has always
> > +              * overwritten it: the committed "conflicts" were
> > +              * already resolved.
> > +              */
> > +             saved_b1 = opt->branch1;
> > +             saved_b2 = opt->branch2;
> > +             opt->branch1 = "Temporary merge branch 1";
> > +             opt->branch2 = "Temporary merge branch 2";
> > +             merge_ort_internal(opt, NULL, prev, iter->item, result);
> > +             if (result->clean < 0)
> > +                     return;
> > +             opt->branch1 = saved_b1;
> > +             opt->branch2 = saved_b2;
> > +             opt->priv->call_depth--;
> > +
> > +             merged_merge_bases = make_virtual_commit(opt->repo,
> > +                                                      result->tree,
> > +                                                      "merged tree");
> > +             commit_list_insert(prev, &merged_merge_bases->parents);
> > +             commit_list_insert(iter->item,
> > +                                &merged_merge_bases->parents->next);
>
> We need to record these parents because...?  When merged_merge_bases
> we just created is used as one side of a merge in the next iteration,
> we'd need to compute the merge base between it and the one we'd pop
> out of merge_bases, and that is why.
>
> > +             clear_or_reinit_internal_opts(opt->priv, 1);
> > +     }
>
> OK.  I think I understood this loop.  It looks mostly straight-forward.
>
> > +     opt->ancestor = ancestor_name;
>
> And the label to be used, that was computed before the above loop,
> is used here...
>
> > +     merge_ort_nonrecursive_internal(opt,
> > +                                     repo_get_commit_tree(opt->repo,
> > +                                                          merged_merge_bases),
> > +                                     repo_get_commit_tree(opt->repo, h1),
> > +                                     repo_get_commit_tree(opt->repo, h2),
> > +                                     result);
>
> ... to finally compute the 3-way merge between h1 and h2.
>
> > +     strbuf_release(&merge_base_abbrev);
>
> And the storage that may have been holding the .ancestor name is
> cleared, as we no longer need it.
>
> > +     opt->ancestor = NULL;  /* avoid accidental re-use of opt->ancestor */
> > +}
> > +
> >  void merge_incore_nonrecursive(struct merge_options *opt,
> >                              struct tree *merge_base,
> >                              struct tree *side1,
> > @@ -1493,7 +1577,9 @@ void merge_incore_recursive(struct merge_options *opt,
> >                           struct commit *side2,
> >                           struct merge_result *result)
> >  {
> > -     (void)reverse_commit_list;
> > -     (void)make_virtual_commit;
> > -     die("Not yet implemented");
> > +     assert(opt->ancestor == NULL ||
> > +            !strcmp(opt->ancestor, "constructed merge base"));
>
> Where does this string come from?  The recursive backend does
> asssign a fixed string with that value to opt->ancestor, but we
> don't expect that string to come here, no?

It's specifically the merge_recursive_generic() function from
merge-recursive.c that sets this, which was part of the
merge-recursive API.  merge-ort does not (yet?) have an equivalent
function (anything calling merge_recursive_generic() just can't use
merge-ort right now -- a list that includes 'am', 'stash', and
'merge-recursive').  For now, I am just letting those callers continue
to use merge-recursive.c.  I never figured out if I wanted to make
that function part of merge-ort's API, whether I just wanted to add a
wrapper to merge-ort-wrappers.[ch] for it, or if I should rewrite the
callers to do something else.

However, looking a little closer, the name for opt->ancestor is
slightly phony -- I think it only makes sense for "am", not for either
of "stash" or "merge-recursive".  Perhaps I should instead count the
number of merge_bases, and assert that either opt->ancestor == NULL
(exclusive)OR  num_merge_bases == 1.  merge_recursive_generic() should
also be made to stop setting opt->ancestor, and then callers like
"am", "stash", and "merge-recursive" should be responsible to provide
a reasonable ancestor name for merge.conflictStyle=diff3 to use when
it's clear they are providing the sole ancestor.

Then at some point I can decide what to do with
merge_recursive_generic().  I'll probably just make it a wrapper at
some point; since that lets me kick the can down the road even
further.  :-)

> > +     merge_start(opt, result);
> > +     merge_ort_internal(opt, merge_bases, side1, side2, result);
> >  }

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/3] merge-ort: implement merge_incore_recursive()
  2020-12-16  4:09     ` Elijah Newren
@ 2020-12-16  4:44       ` Elijah Newren
  0 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren @ 2020-12-16  4:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Elijah Newren via GitGitGadget, Git Mailing List

On Tue, Dec 15, 2020 at 8:09 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Tue, Dec 15, 2020 at 6:07 PM Junio C Hamano <gitster@pobox.com> wrote:
> >
> > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >
> > > +/*
> > > + * Originally from merge_recursive_internal(); somewhat adapted, though.
> > > + */
> > > +static void merge_ort_internal(struct merge_options *opt,
> > > +                            struct commit_list *merge_bases,
> > > +                            struct commit *h1,
> > > +                            struct commit *h2,
> > > +                            struct merge_result *result)
> > > +{
> > > +     struct commit_list *iter;
> > > +     struct commit *merged_merge_bases;
> > > +     const char *ancestor_name;
> > > +     struct strbuf merge_base_abbrev = STRBUF_INIT;
> > > +
> > > +     if (!merge_bases) {
> > > +             merge_bases = get_merge_bases(h1, h2);
> > > +             merge_bases = reverse_commit_list(merge_bases);
> >
> > Do we want to say why we reverse here, or is the reason so well
> > known why in the original merge-recursive case?
>
> Oh, good point.  After starting on merge-ort, I shifted back and forth
> between it and cleaning up merge-recursive for a while...and it looks
> like this is one of the things I forgot to copy over.  The reason was
> totally opaque to me until Johannes spelled it out over here:
> https://lore.kernel.org/git/nycvar.QRO.7.76.6.1907252055500.21907@tvgsbejvaqbjf.bet/
>
> Note that the same reversing of merge bases is done in builtin/merge.c
> and sequencer.c as well.  It resulted in me adding the following note
> to the declaration of merge_recursive() in merge-recursive.h:
>
>  * NOTE: empirically, about a decade ago it was determined that with more
>  *       than two merge bases, optimal behavior was found when the
>  *       merge_bases were passed in the order of oldest commit to newest
>  *       commit.  Also, merge_bases will be consumed (emptied) so make a
>  *       copy if you need it.
>
> but I never copied that comment over to merge_incore_recursive().  I
> should do that, and perhaps reference that comment at this point in
> the code.
>
> > > +     }
> > > +
> > > +     merged_merge_bases = pop_commit(&merge_bases);
> > > +     if (merged_merge_bases == NULL) {
> > > +             /* if there is no common ancestor, use an empty tree */
> > > +             struct tree *tree;
> > > +
> > > +             tree = lookup_tree(opt->repo, opt->repo->hash_algo->empty_tree);
> > > +             merged_merge_bases = make_virtual_commit(opt->repo, tree,
> > > +                                                      "ancestor");
> > > +             ancestor_name = "empty tree";
> > > +     } else if (opt->ancestor && !opt->priv->call_depth) {
> > > +             ancestor_name = opt->ancestor;
> > > +     } else if (merge_bases) {
> > > +             ancestor_name = "merged common ancestors";
> > > +     } else {
> > > +             strbuf_add_unique_abbrev(&merge_base_abbrev,
> > > +                                      &merged_merge_bases->object.oid,
> > > +                                      DEFAULT_ABBREV);
> > > +             ancestor_name = merge_base_abbrev.buf;
> > > +     }
> >
> > So, up to this point we learned:
> >
> >  - merge bases either given by the caller, or computed from h1 and
> >    h2 when the caller just told us to figure it out ourselves.
> >
> >  - if we have
> >
> >    - 0 merge base between h1 and h2, in which case we would use an
> >      empty tree as an imaginary common
> >
> >    - 1 merge base between h1 and h2, in which case the common
> >      ancestor of the resuting merge between h1 and h2 is that single
> >      merge base
> >
> >    - 2 or more bases, in which case we'd use that would eventually
> >      come back when we merged recursively all bases.
> >
> > and the primary products of the above procedure are
> >
> >  - ancestor_name (the string used when presenting conflicts while
> >    merging h1 and h2)
> >
> >  - merged_merge_bases (one starting commit among the merge bases)
> >
> > And then the loop will iterate over the remaining merge bases,
> > merging one popped from it in the current merged_merge_bases,
> > until we run out.  At that point when we leave the loop, we'd
> > have merged_merge_bases that is a virtual commit to be used as
> > a single merge base to use while merging trees of h1 and h2.
> >
> > > +     for (iter = merge_bases; iter; iter = iter->next) {
> > > +             const char *saved_b1, *saved_b2;
> > > +             struct commit *prev = merged_merge_bases;
> > > +
> > > +             opt->priv->call_depth++;
> > > +             /*
> > > +              * When the merge fails, the result contains files
> > > +              * with conflict markers. The cleanness flag is
> > > +              * ignored (unless indicating an error), it was never
> > > +              * actually used, as result of merge_trees has always
> > > +              * overwritten it: the committed "conflicts" were
> > > +              * already resolved.
> > > +              */
> > > +             saved_b1 = opt->branch1;
> > > +             saved_b2 = opt->branch2;
> > > +             opt->branch1 = "Temporary merge branch 1";
> > > +             opt->branch2 = "Temporary merge branch 2";
> > > +             merge_ort_internal(opt, NULL, prev, iter->item, result);
> > > +             if (result->clean < 0)
> > > +                     return;
> > > +             opt->branch1 = saved_b1;
> > > +             opt->branch2 = saved_b2;
> > > +             opt->priv->call_depth--;
> > > +
> > > +             merged_merge_bases = make_virtual_commit(opt->repo,
> > > +                                                      result->tree,
> > > +                                                      "merged tree");
> > > +             commit_list_insert(prev, &merged_merge_bases->parents);
> > > +             commit_list_insert(iter->item,
> > > +                                &merged_merge_bases->parents->next);
> >
> > We need to record these parents because...?  When merged_merge_bases
> > we just created is used as one side of a merge in the next iteration,
> > we'd need to compute the merge base between it and the one we'd pop
> > out of merge_bases, and that is why.
> >
> > > +             clear_or_reinit_internal_opts(opt->priv, 1);
> > > +     }
> >
> > OK.  I think I understood this loop.  It looks mostly straight-forward.
> >
> > > +     opt->ancestor = ancestor_name;
> >
> > And the label to be used, that was computed before the above loop,
> > is used here...
> >
> > > +     merge_ort_nonrecursive_internal(opt,
> > > +                                     repo_get_commit_tree(opt->repo,
> > > +                                                          merged_merge_bases),
> > > +                                     repo_get_commit_tree(opt->repo, h1),
> > > +                                     repo_get_commit_tree(opt->repo, h2),
> > > +                                     result);
> >
> > ... to finally compute the 3-way merge between h1 and h2.
> >
> > > +     strbuf_release(&merge_base_abbrev);
> >
> > And the storage that may have been holding the .ancestor name is
> > cleared, as we no longer need it.
> >
> > > +     opt->ancestor = NULL;  /* avoid accidental re-use of opt->ancestor */
> > > +}
> > > +
> > >  void merge_incore_nonrecursive(struct merge_options *opt,
> > >                              struct tree *merge_base,
> > >                              struct tree *side1,
> > > @@ -1493,7 +1577,9 @@ void merge_incore_recursive(struct merge_options *opt,
> > >                           struct commit *side2,
> > >                           struct merge_result *result)
> > >  {
> > > -     (void)reverse_commit_list;
> > > -     (void)make_virtual_commit;
> > > -     die("Not yet implemented");
> > > +     assert(opt->ancestor == NULL ||
> > > +            !strcmp(opt->ancestor, "constructed merge base"));
> >
> > Where does this string come from?  The recursive backend does
> > asssign a fixed string with that value to opt->ancestor, but we
> > don't expect that string to come here, no?
>
> It's specifically the merge_recursive_generic() function from
> merge-recursive.c that sets this, which was part of the
> merge-recursive API.  merge-ort does not (yet?) have an equivalent
> function (anything calling merge_recursive_generic() just can't use
> merge-ort right now -- a list that includes 'am', 'stash', and
> 'merge-recursive').  For now, I am just letting those callers continue
> to use merge-recursive.c.  I never figured out if I wanted to make
> that function part of merge-ort's API, whether I just wanted to add a
> wrapper to merge-ort-wrappers.[ch] for it, or if I should rewrite the
> callers to do something else.
>
> However, looking a little closer, the name for opt->ancestor is
> slightly phony -- I think it only makes sense for "am", not for either
> of "stash" or "merge-recursive".  Perhaps I should instead count the
> number of merge_bases, and assert that either opt->ancestor == NULL
> (exclusive)OR  num_merge_bases == 1.  merge_recursive_generic() should

No, this isn't right.  Most callers just grab all the merge bases and
often there happens to be only one.  In such a case, the caller
shouldn't have to set opt->ancestor; the fallback of referencing the
commit id is perfectly good there.  It's just that when there is
exactly one merge base, a special caller like
merge_recursive_generic() might want to override the value for
opt->ancestor.

Anyway, I'll post a new patch soon.

> also be made to stop setting opt->ancestor, and then callers like
> "am", "stash", and "merge-recursive" should be responsible to provide
> a reasonable ancestor name for merge.conflictStyle=diff3 to use when
> it's clear they are providing the sole ancestor.
>
> Then at some point I can decide what to do with
> merge_recursive_generic().  I'll probably just make it a wrapper at
> some point; since that lets me kick the can down the road even
> further.  :-)
>
> > > +     merge_start(opt, result);
> > > +     merge_ort_internal(opt, merge_bases, side1, side2, result);
> > >  }

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v2 0/3] merge-ort: implement recursive merges
  2020-12-15 17:53 [PATCH 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
                   ` (2 preceding siblings ...)
  2020-12-15 17:53 ` [PATCH 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
@ 2020-12-16  5:52 ` Elijah Newren via GitGitGadget
  2020-12-16  5:52   ` [PATCH v2 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
                     ` (3 more replies)
  3 siblings, 4 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16  5:52 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren

This series depends on en/merge-ort-2 (it does NOT depend on en/merge-ort-3
and can thus be reviewed/merged independently of it).

This short series adds handling of recursive merges (merging of multiple
merge-bases to create a virtual merge base) to merge-ort. With this short
series the number of test failures under GIT_TEST_MERGE_ALGORITHM=ort drops
by 801 (from 1448 to 647).

Changes since v1 (based on review comments from Junio):

 * add documentation on why we reverse the commit_list of merge bases
 * modify the special override for allowing opt->ancestor to be set before
   merge_incore_recursive() and document its rationale

Elijah Newren (3):
  merge-ort: copy a few small helper functions from merge-recursive.c
  merge-ort: make clear_internal_opts() aware of partial clearing
  merge-ort: implement merge_incore_recursive()

 merge-ort.c | 149 +++++++++++++++++++++++++++++++++++++++++++++++++---
 merge-ort.h |  10 ++++
 2 files changed, 152 insertions(+), 7 deletions(-)


base-commit: c5a6f65527aa3b6f5d7cf25437a88d8727ab0646
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-814%2Fnewren%2Fort-recursive-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-814/newren/ort-recursive-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/814

Range-diff vs v1:

 1:  0b455bd6fe7 = 1:  0b455bd6fe7 merge-ort: copy a few small helper functions from merge-recursive.c
 2:  fc26c1a11db = 2:  fc26c1a11db merge-ort: make clear_internal_opts() aware of partial clearing
 3:  82a773d8972 ! 3:  d8f79450a40 merge-ort: implement merge_incore_recursive()
     @@ merge-ort.c: static void merge_ort_nonrecursive_internal(struct merge_options *o
      +
      +	if (!merge_bases) {
      +		merge_bases = get_merge_bases(h1, h2);
     ++		/* See merge-ort.h:merge_incore_recursive() declaration NOTE */
      +		merge_bases = reverse_commit_list(merge_bases);
      +	}
      +
     @@ merge-ort.c: void merge_incore_recursive(struct merge_options *opt,
      -	(void)reverse_commit_list;
      -	(void)make_virtual_commit;
      -	die("Not yet implemented");
     -+	assert(opt->ancestor == NULL ||
     -+	       !strcmp(opt->ancestor, "constructed merge base"));
     ++	/*
     ++	 * merge_incore_nonrecursive() exists for cases where we always
     ++	 * know there is a well-defined single merge base.  However,
     ++	 * despite a similar structure, merge-recursive.c noted that some
     ++	 * callers preferred to call the recursive logic anyway and still
     ++	 * set a special name for opt->ancestor that would appear in
     ++	 * merge.conflictStyle=diff3 output.
     ++	 *
     ++	 * git-am was one such example (it wanted to set opt->ancestor to
     ++	 * "constructed merge base", since it created a fake merge base);
     ++	 * it called the recursive merge logic through a special
     ++	 * merge_recursive_generic() wrapper.
     ++	 *
     ++	 * Allow the same kind of special case here.
     ++	 */
     ++	int num_merge_bases_is_1 = (merge_bases && !merge_bases->next);
     ++	assert(opt->ancestor == NULL || num_merge_bases_is_1);
      +
      +	merge_start(opt, result);
      +	merge_ort_internal(opt, merge_bases, side1, side2, result);
       }
     +
     + ## merge-ort.h ##
     +@@ merge-ort.h: struct merge_result {
     + /*
     +  * rename-detecting three-way merge with recursive ancestor consolidation.
     +  * working tree and index are untouched.
     ++ *
     ++ * merge_bases will be consumed (emptied) so make a copy if you need it.
     ++ *
     ++ * NOTE: empirically, the recursive algorithm will perform better if you
     ++ *       pass the merge_bases in the order of oldest commit to the
     ++ *       newest[1][2].
     ++ *
     ++ *       [1] https://lore.kernel.org/git/nycvar.QRO.7.76.6.1907252055500.21907@tvgsbejvaqbjf.bet/
     ++ *       [2] commit 8918b0c9c2 ("merge-recur: try to merge older merge bases
     ++ *           first", 2006-08-09)
     +  */
     + void merge_incore_recursive(struct merge_options *opt,
     + 			    struct commit_list *merge_bases,

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v2 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-16  5:52 ` [PATCH v2 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
@ 2020-12-16  5:52   ` Elijah Newren via GitGitGadget
  2020-12-16  5:52   ` [PATCH v2 2/3] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16  5:52 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

In a subsequent commit, we will implement the traditional recursiveness
that gave merge-recursive its name, namely merging non-unique
merge-bases to come up with a single virtual merge base.  Copy a few
helper functions from merge-recursive.c that we will use in the
implementation.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index 414e7b7eeac..05ba92c91a6 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -17,8 +17,10 @@
 #include "cache.h"
 #include "merge-ort.h"
 
+#include "alloc.h"
 #include "blob.h"
 #include "cache-tree.h"
+#include "commit.h"
 #include "commit-reach.h"
 #include "diff.h"
 #include "diffcore.h"
@@ -1348,6 +1350,34 @@ void merge_finalize(struct merge_options *opt,
 
 /*** Function Grouping: helper functions for merge_incore_*() ***/
 
+static inline void set_commit_tree(struct commit *c, struct tree *t)
+{
+	c->maybe_tree = t;
+}
+
+static struct commit *make_virtual_commit(struct repository *repo,
+					  struct tree *tree,
+					  const char *comment)
+{
+	struct commit *commit = alloc_commit_node(repo);
+
+	set_merge_remote_desc(commit, comment, (struct object *)commit);
+	set_commit_tree(commit, tree);
+	commit->object.parsed = 1;
+	return commit;
+}
+
+static struct commit_list *reverse_commit_list(struct commit_list *list)
+{
+	struct commit_list *next = NULL, *current, *backup;
+	for (current = list; current; current = backup) {
+		backup = current->next;
+		current->next = next;
+		next = current;
+	}
+	return next;
+}
+
 static void merge_start(struct merge_options *opt, struct merge_result *result)
 {
 	/* Sanity checks on opt */
@@ -1462,5 +1492,7 @@ void merge_incore_recursive(struct merge_options *opt,
 			    struct commit *side2,
 			    struct merge_result *result)
 {
+	(void)reverse_commit_list;
+	(void)make_virtual_commit;
 	die("Not yet implemented");
 }
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 2/3] merge-ort: make clear_internal_opts() aware of partial clearing
  2020-12-16  5:52 ` [PATCH v2 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  2020-12-16  5:52   ` [PATCH v2 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
@ 2020-12-16  5:52   ` Elijah Newren via GitGitGadget
  2020-12-16  5:52   ` [PATCH v2 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
  2020-12-16 17:17   ` [PATCH v3 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16  5:52 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

In order to handle recursive merges, after merging merge-bases we need
to clear away most of the data we had built up but some of it needs to
be kept -- in particular the "output" field.  Rename the function to
reflect its future change in use.

Further, since "reinitialize" means we'll be reusing the fields
immediately, take advantage of this to only partially clear maps,
leaving the hashtable allocated and pre-sized.  (This may be slightly
out-of-order since the speedups aren't realized until there are far
more strmaps in use, but the patch submission process already went out
of order because of various questions and requests for strmap.  Anyway,
see commit 6ccdfc2a20 ("strmap: enable faster clearing and reusing of
strmaps", 2020-11-05), for performance details about the use of
strmap_partial_clear().)

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 05ba92c91a6..10a97e944c4 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -253,10 +253,11 @@ static void free_strmap_strings(struct strmap *map)
 	}
 }
 
-static void clear_internal_opts(struct merge_options_internal *opti,
-				int reinitialize)
+static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
+					  int reinitialize)
 {
-	assert(!reinitialize);
+	void (*strmap_func)(struct strmap *, int) =
+		reinitialize ? strmap_partial_clear : strmap_clear;
 
 	/*
 	 * We marked opti->paths with strdup_strings = 0, so that we
@@ -266,14 +267,14 @@ static void clear_internal_opts(struct merge_options_internal *opti,
 	 * to deallocate them.
 	 */
 	free_strmap_strings(&opti->paths);
-	strmap_clear(&opti->paths, 1);
+	strmap_func(&opti->paths, 1);
 
 	/*
 	 * All keys and values in opti->conflicted are a subset of those in
 	 * opti->paths.  We don't want to deallocate anything twice, so we
 	 * don't free the keys and we pass 0 for free_values.
 	 */
-	strmap_clear(&opti->conflicted, 0);
+	strmap_func(&opti->conflicted, 0);
 
 	/*
 	 * opti->paths_to_free is similar to opti->paths; we created it with
@@ -1344,7 +1345,7 @@ void merge_finalize(struct merge_options *opt,
 
 	assert(opt->priv == NULL);
 
-	clear_internal_opts(opti, 0);
+	clear_or_reinit_internal_opts(opti, 0);
 	FREE_AND_NULL(opti);
 }
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 3/3] merge-ort: implement merge_incore_recursive()
  2020-12-16  5:52 ` [PATCH v2 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  2020-12-16  5:52   ` [PATCH v2 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
  2020-12-16  5:52   ` [PATCH v2 2/3] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
@ 2020-12-16  5:52   ` Elijah Newren via GitGitGadget
  2020-12-16 18:09     ` Junio C Hamano
  2020-12-16 17:17   ` [PATCH v3 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  3 siblings, 1 reply; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16  5:52 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Implement merge_incore_recursive(), mostly through the use of a new
helper function, merge_ort_internal(), which itself is based off
merge_recursive_internal() from merge-recursive.c.

This drops the number of failures in the testsuite when run under
GIT_TEST_MERGE_ALGORITHM=ort from around 1500 to 647.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 108 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 merge-ort.h |  10 +++++
 2 files changed, 115 insertions(+), 3 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 10a97e944c4..de9585b3179 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1476,6 +1476,91 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt,
 	}
 }
 
+/*
+ * Originally from merge_recursive_internal(); somewhat adapted, though.
+ */
+static void merge_ort_internal(struct merge_options *opt,
+			       struct commit_list *merge_bases,
+			       struct commit *h1,
+			       struct commit *h2,
+			       struct merge_result *result)
+{
+	struct commit_list *iter;
+	struct commit *merged_merge_bases;
+	const char *ancestor_name;
+	struct strbuf merge_base_abbrev = STRBUF_INIT;
+
+	if (!merge_bases) {
+		merge_bases = get_merge_bases(h1, h2);
+		/* See merge-ort.h:merge_incore_recursive() declaration NOTE */
+		merge_bases = reverse_commit_list(merge_bases);
+	}
+
+	merged_merge_bases = pop_commit(&merge_bases);
+	if (merged_merge_bases == NULL) {
+		/* if there is no common ancestor, use an empty tree */
+		struct tree *tree;
+
+		tree = lookup_tree(opt->repo, opt->repo->hash_algo->empty_tree);
+		merged_merge_bases = make_virtual_commit(opt->repo, tree,
+							 "ancestor");
+		ancestor_name = "empty tree";
+	} else if (opt->ancestor && !opt->priv->call_depth) {
+		ancestor_name = opt->ancestor;
+	} else if (merge_bases) {
+		ancestor_name = "merged common ancestors";
+	} else {
+		strbuf_add_unique_abbrev(&merge_base_abbrev,
+					 &merged_merge_bases->object.oid,
+					 DEFAULT_ABBREV);
+		ancestor_name = merge_base_abbrev.buf;
+	}
+
+	for (iter = merge_bases; iter; iter = iter->next) {
+		const char *saved_b1, *saved_b2;
+		struct commit *prev = merged_merge_bases;
+
+		opt->priv->call_depth++;
+		/*
+		 * When the merge fails, the result contains files
+		 * with conflict markers. The cleanness flag is
+		 * ignored (unless indicating an error), it was never
+		 * actually used, as result of merge_trees has always
+		 * overwritten it: the committed "conflicts" were
+		 * already resolved.
+		 */
+		saved_b1 = opt->branch1;
+		saved_b2 = opt->branch2;
+		opt->branch1 = "Temporary merge branch 1";
+		opt->branch2 = "Temporary merge branch 2";
+		merge_ort_internal(opt, NULL, prev, iter->item, result);
+		if (result->clean < 0)
+			return;
+		opt->branch1 = saved_b1;
+		opt->branch2 = saved_b2;
+		opt->priv->call_depth--;
+
+		merged_merge_bases = make_virtual_commit(opt->repo,
+							 result->tree,
+							 "merged tree");
+		commit_list_insert(prev, &merged_merge_bases->parents);
+		commit_list_insert(iter->item,
+				   &merged_merge_bases->parents->next);
+
+		clear_or_reinit_internal_opts(opt->priv, 1);
+	}
+
+	opt->ancestor = ancestor_name;
+	merge_ort_nonrecursive_internal(opt,
+					repo_get_commit_tree(opt->repo,
+							     merged_merge_bases),
+					repo_get_commit_tree(opt->repo, h1),
+					repo_get_commit_tree(opt->repo, h2),
+					result);
+	strbuf_release(&merge_base_abbrev);
+	opt->ancestor = NULL;  /* avoid accidental re-use of opt->ancestor */
+}
+
 void merge_incore_nonrecursive(struct merge_options *opt,
 			       struct tree *merge_base,
 			       struct tree *side1,
@@ -1493,7 +1578,24 @@ void merge_incore_recursive(struct merge_options *opt,
 			    struct commit *side2,
 			    struct merge_result *result)
 {
-	(void)reverse_commit_list;
-	(void)make_virtual_commit;
-	die("Not yet implemented");
+	/*
+	 * merge_incore_nonrecursive() exists for cases where we always
+	 * know there is a well-defined single merge base.  However,
+	 * despite a similar structure, merge-recursive.c noted that some
+	 * callers preferred to call the recursive logic anyway and still
+	 * set a special name for opt->ancestor that would appear in
+	 * merge.conflictStyle=diff3 output.
+	 *
+	 * git-am was one such example (it wanted to set opt->ancestor to
+	 * "constructed merge base", since it created a fake merge base);
+	 * it called the recursive merge logic through a special
+	 * merge_recursive_generic() wrapper.
+	 *
+	 * Allow the same kind of special case here.
+	 */
+	int num_merge_bases_is_1 = (merge_bases && !merge_bases->next);
+	assert(opt->ancestor == NULL || num_merge_bases_is_1);
+
+	merge_start(opt, result);
+	merge_ort_internal(opt, merge_bases, side1, side2, result);
 }
diff --git a/merge-ort.h b/merge-ort.h
index 55ae7ee865d..d53a0a339f3 100644
--- a/merge-ort.h
+++ b/merge-ort.h
@@ -34,6 +34,16 @@ struct merge_result {
 /*
  * rename-detecting three-way merge with recursive ancestor consolidation.
  * working tree and index are untouched.
+ *
+ * merge_bases will be consumed (emptied) so make a copy if you need it.
+ *
+ * NOTE: empirically, the recursive algorithm will perform better if you
+ *       pass the merge_bases in the order of oldest commit to the
+ *       newest[1][2].
+ *
+ *       [1] https://lore.kernel.org/git/nycvar.QRO.7.76.6.1907252055500.21907@tvgsbejvaqbjf.bet/
+ *       [2] commit 8918b0c9c2 ("merge-recur: try to merge older merge bases
+ *           first", 2006-08-09)
  */
 void merge_incore_recursive(struct merge_options *opt,
 			    struct commit_list *merge_bases,
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-15 17:53 ` [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
  2020-12-16  1:16   ` Junio C Hamano
@ 2020-12-16 13:30   ` Derrick Stolee
  2020-12-16 17:43     ` Junio C Hamano
  1 sibling, 1 reply; 37+ messages in thread
From: Derrick Stolee @ 2020-12-16 13:30 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget, git; +Cc: Elijah Newren

On 12/15/2020 12:53 PM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> In a subsequent commit, we will implement the traditional recursiveness
> that gave merge-recursive its name, namely merging non-unique
> merge-bases to come up with a single virtual merge base.  Copy a few
> helper functions from merge-recursive.c that we will use in the
> implementation.

I'm sure these are copies, but...

> +static struct commit_list *reverse_commit_list(struct commit_list *list)
> +{
> +	struct commit_list *next = NULL, *current, *backup;
> +	for (current = list; current; current = backup) {
> +		backup = current->next;
> +		current->next = next;
> +		next = current;
> +	}

The naming of 'next' seems backwards to me, since it is really
the "previous" node. Using something like 'previous' makes it
clear that you are reversing when you say

	current->next = previous;

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-16  1:16   ` Junio C Hamano
@ 2020-12-16 16:12     ` Johannes Schindelin
  2020-12-16 16:24       ` Elijah Newren
  0 siblings, 1 reply; 37+ messages in thread
From: Johannes Schindelin @ 2020-12-16 16:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Elijah Newren via GitGitGadget, git, Elijah Newren

Hi Junio,

On Tue, 15 Dec 2020, Junio C Hamano wrote:

> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > In a subsequent commit, we will implement the traditional recursiveness
> > that gave merge-recursive its name, namely merging non-unique
> > merge-bases to come up with a single virtual merge base.  Copy a few
> > helper functions from merge-recursive.c that we will use in the
> > implementation.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ...
> > @@ -1462,5 +1492,7 @@ void merge_incore_recursive(struct merge_options *opt,
> >  			    struct commit *side2,
> >  			    struct merge_result *result)
> >  {
> > +	(void)reverse_commit_list;
> > +	(void)make_virtual_commit;
>
> To keep these symbols referenced?  Cute ;-)

I saw this technique used extensively in cURL. Note that we ourselves
introduced the first such thing in 2fb330ca723 (packed_delete_refs():
implement method, 2017-09-08).

However, we seem to have the `MAYBE_UNUSED` macro specifically for such
use cases (and use it in four instances as of time of writing). I wonder
whether we want to settle one one strategy instead of keeping both?

Ciao,
Dscho

>
> >  	die("Not yet implemented");
> >  }
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-16 16:12     ` Johannes Schindelin
@ 2020-12-16 16:24       ` Elijah Newren
  0 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren @ 2020-12-16 16:24 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Elijah Newren via GitGitGadget, Git Mailing List

On Wed, Dec 16, 2020 at 8:12 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Junio,
>
> On Tue, 15 Dec 2020, Junio C Hamano wrote:
>
> > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >
> > > From: Elijah Newren <newren@gmail.com>
> > >
> > > In a subsequent commit, we will implement the traditional recursiveness
> > > that gave merge-recursive its name, namely merging non-unique
> > > merge-bases to come up with a single virtual merge base.  Copy a few
> > > helper functions from merge-recursive.c that we will use in the
> > > implementation.
> > >
> > > Signed-off-by: Elijah Newren <newren@gmail.com>
> > > ...
> > > @@ -1462,5 +1492,7 @@ void merge_incore_recursive(struct merge_options *opt,
> > >                         struct commit *side2,
> > >                         struct merge_result *result)
> > >  {
> > > +   (void)reverse_commit_list;
> > > +   (void)make_virtual_commit;
> >
> > To keep these symbols referenced?  Cute ;-)
>
> I saw this technique used extensively in cURL. Note that we ourselves
> introduced the first such thing in 2fb330ca723 (packed_delete_refs():
> implement method, 2017-09-08).
>
> However, we seem to have the `MAYBE_UNUSED` macro specifically for such
> use cases (and use it in four instances as of time of writing). I wonder
> whether we want to settle one one strategy instead of keeping both?

Ooh, MAYBE_UNUSED, I like it.  Much more self-documenting.  I'm
sending another round to address Derrick's comment, so I'll include
this change while at it.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v3 0/3] merge-ort: implement recursive merges
  2020-12-16  5:52 ` [PATCH v2 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
                     ` (2 preceding siblings ...)
  2020-12-16  5:52   ` [PATCH v2 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
@ 2020-12-16 17:17   ` Elijah Newren via GitGitGadget
  2020-12-16 17:17     ` [PATCH v3 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
                       ` (3 more replies)
  3 siblings, 4 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 17:17 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin, Elijah Newren

This series depends on en/merge-ort-2 (it does NOT depend on en/merge-ort-3
and can thus be reviewed/merged independently of it).

This short series adds handling of recursive merges (merging of multiple
merge-bases to create a virtual merge base) to merge-ort. With this short
series the number of test failures under GIT_TEST_MERGE_ALGORITHM=ort drops
by 801 (from 1448 to 647).

Changes since v2:

 * rename local var from 'next' to 'previous' to make reversing logic
   clearer (as suggested by Stolee)
 * employ MAYBE_UNUSED modifier on new functions as better way to avoid
   unused-function errors (as suggested by Johannes)

Elijah Newren (3):
  merge-ort: copy a few small helper functions from merge-recursive.c
  merge-ort: make clear_internal_opts() aware of partial clearing
  merge-ort: implement merge_incore_recursive()

 merge-ort.c | 149 +++++++++++++++++++++++++++++++++++++++++++++++++---
 merge-ort.h |  10 ++++
 2 files changed, 152 insertions(+), 7 deletions(-)


base-commit: c5a6f65527aa3b6f5d7cf25437a88d8727ab0646
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-814%2Fnewren%2Fort-recursive-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-814/newren/ort-recursive-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/814

Range-diff vs v2:

 1:  0b455bd6fe7 ! 1:  dcf28565ad3 merge-ort: copy a few small helper functions from merge-recursive.c
     @@ merge-ort.c: void merge_finalize(struct merge_options *opt,
      +	c->maybe_tree = t;
      +}
      +
     ++MAYBE_UNUSED
      +static struct commit *make_virtual_commit(struct repository *repo,
      +					  struct tree *tree,
      +					  const char *comment)
     @@ merge-ort.c: void merge_finalize(struct merge_options *opt,
      +	return commit;
      +}
      +
     ++MAYBE_UNUSED
      +static struct commit_list *reverse_commit_list(struct commit_list *list)
      +{
     -+	struct commit_list *next = NULL, *current, *backup;
     ++	struct commit_list *previous = NULL, *current, *backup;
      +	for (current = list; current; current = backup) {
      +		backup = current->next;
     -+		current->next = next;
     -+		next = current;
     ++		current->next = previous;
     ++		previous = current;
      +	}
     -+	return next;
     ++	return previous;
      +}
      +
       static void merge_start(struct merge_options *opt, struct merge_result *result)
       {
       	/* Sanity checks on opt */
     -@@ merge-ort.c: void merge_incore_recursive(struct merge_options *opt,
     - 			    struct commit *side2,
     - 			    struct merge_result *result)
     - {
     -+	(void)reverse_commit_list;
     -+	(void)make_virtual_commit;
     - 	die("Not yet implemented");
     - }
 2:  fc26c1a11db = 2:  bffc45c6570 merge-ort: make clear_internal_opts() aware of partial clearing
 3:  d8f79450a40 ! 3:  59216a155ae merge-ort: implement merge_incore_recursive()
     @@ Commit message
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
       ## merge-ort.c ##
     +@@ merge-ort.c: static inline void set_commit_tree(struct commit *c, struct tree *t)
     + 	c->maybe_tree = t;
     + }
     + 
     +-MAYBE_UNUSED
     + static struct commit *make_virtual_commit(struct repository *repo,
     + 					  struct tree *tree,
     + 					  const char *comment)
     +@@ merge-ort.c: static struct commit *make_virtual_commit(struct repository *repo,
     + 	return commit;
     + }
     + 
     +-MAYBE_UNUSED
     + static struct commit_list *reverse_commit_list(struct commit_list *list)
     + {
     + 	struct commit_list *previous = NULL, *current, *backup;
      @@ merge-ort.c: static void merge_ort_nonrecursive_internal(struct merge_options *opt,
       	}
       }
     @@ merge-ort.c: void merge_incore_recursive(struct merge_options *opt,
       			    struct commit *side2,
       			    struct merge_result *result)
       {
     --	(void)reverse_commit_list;
     --	(void)make_virtual_commit;
      -	die("Not yet implemented");
      +	/*
      +	 * merge_incore_nonrecursive() exists for cases where we always

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v3 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-16 17:17   ` [PATCH v3 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
@ 2020-12-16 17:17     ` Elijah Newren via GitGitGadget
  2020-12-16 17:17     ` [PATCH v3 2/3] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 17:17 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

In a subsequent commit, we will implement the traditional recursiveness
that gave merge-recursive its name, namely merging non-unique
merge-bases to come up with a single virtual merge base.  Copy a few
helper functions from merge-recursive.c that we will use in the
implementation.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index 414e7b7eeac..b66f8a6f100 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -17,8 +17,10 @@
 #include "cache.h"
 #include "merge-ort.h"
 
+#include "alloc.h"
 #include "blob.h"
 #include "cache-tree.h"
+#include "commit.h"
 #include "commit-reach.h"
 #include "diff.h"
 #include "diffcore.h"
@@ -1348,6 +1350,36 @@ void merge_finalize(struct merge_options *opt,
 
 /*** Function Grouping: helper functions for merge_incore_*() ***/
 
+static inline void set_commit_tree(struct commit *c, struct tree *t)
+{
+	c->maybe_tree = t;
+}
+
+MAYBE_UNUSED
+static struct commit *make_virtual_commit(struct repository *repo,
+					  struct tree *tree,
+					  const char *comment)
+{
+	struct commit *commit = alloc_commit_node(repo);
+
+	set_merge_remote_desc(commit, comment, (struct object *)commit);
+	set_commit_tree(commit, tree);
+	commit->object.parsed = 1;
+	return commit;
+}
+
+MAYBE_UNUSED
+static struct commit_list *reverse_commit_list(struct commit_list *list)
+{
+	struct commit_list *previous = NULL, *current, *backup;
+	for (current = list; current; current = backup) {
+		backup = current->next;
+		current->next = previous;
+		previous = current;
+	}
+	return previous;
+}
+
 static void merge_start(struct merge_options *opt, struct merge_result *result)
 {
 	/* Sanity checks on opt */
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 2/3] merge-ort: make clear_internal_opts() aware of partial clearing
  2020-12-16 17:17   ` [PATCH v3 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  2020-12-16 17:17     ` [PATCH v3 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
@ 2020-12-16 17:17     ` Elijah Newren via GitGitGadget
  2020-12-16 17:17     ` [PATCH v3 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
  2020-12-16 20:35     ` [PATCH v4 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 17:17 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

In order to handle recursive merges, after merging merge-bases we need
to clear away most of the data we had built up but some of it needs to
be kept -- in particular the "output" field.  Rename the function to
reflect its future change in use.

Further, since "reinitialize" means we'll be reusing the fields
immediately, take advantage of this to only partially clear maps,
leaving the hashtable allocated and pre-sized.  (This may be slightly
out-of-order since the speedups aren't realized until there are far
more strmaps in use, but the patch submission process already went out
of order because of various questions and requests for strmap.  Anyway,
see commit 6ccdfc2a20 ("strmap: enable faster clearing and reusing of
strmaps", 2020-11-05), for performance details about the use of
strmap_partial_clear().)

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index b66f8a6f100..7679c1d08e2 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -253,10 +253,11 @@ static void free_strmap_strings(struct strmap *map)
 	}
 }
 
-static void clear_internal_opts(struct merge_options_internal *opti,
-				int reinitialize)
+static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
+					  int reinitialize)
 {
-	assert(!reinitialize);
+	void (*strmap_func)(struct strmap *, int) =
+		reinitialize ? strmap_partial_clear : strmap_clear;
 
 	/*
 	 * We marked opti->paths with strdup_strings = 0, so that we
@@ -266,14 +267,14 @@ static void clear_internal_opts(struct merge_options_internal *opti,
 	 * to deallocate them.
 	 */
 	free_strmap_strings(&opti->paths);
-	strmap_clear(&opti->paths, 1);
+	strmap_func(&opti->paths, 1);
 
 	/*
 	 * All keys and values in opti->conflicted are a subset of those in
 	 * opti->paths.  We don't want to deallocate anything twice, so we
 	 * don't free the keys and we pass 0 for free_values.
 	 */
-	strmap_clear(&opti->conflicted, 0);
+	strmap_func(&opti->conflicted, 0);
 
 	/*
 	 * opti->paths_to_free is similar to opti->paths; we created it with
@@ -1344,7 +1345,7 @@ void merge_finalize(struct merge_options *opt,
 
 	assert(opt->priv == NULL);
 
-	clear_internal_opts(opti, 0);
+	clear_or_reinit_internal_opts(opti, 0);
 	FREE_AND_NULL(opti);
 }
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v3 3/3] merge-ort: implement merge_incore_recursive()
  2020-12-16 17:17   ` [PATCH v3 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  2020-12-16 17:17     ` [PATCH v3 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
  2020-12-16 17:17     ` [PATCH v3 2/3] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
@ 2020-12-16 17:17     ` Elijah Newren via GitGitGadget
  2020-12-16 20:35     ` [PATCH v4 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 17:17 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin, Elijah Newren,
	Elijah Newren

From: Elijah Newren <newren@gmail.com>

Implement merge_incore_recursive(), mostly through the use of a new
helper function, merge_ort_internal(), which itself is based off
merge_recursive_internal() from merge-recursive.c.

This drops the number of failures in the testsuite when run under
GIT_TEST_MERGE_ALGORITHM=ort from around 1500 to 647.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 108 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 merge-ort.h |  10 +++++
 2 files changed, 115 insertions(+), 3 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 7679c1d08e2..f01307f2ee0 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1356,7 +1356,6 @@ static inline void set_commit_tree(struct commit *c, struct tree *t)
 	c->maybe_tree = t;
 }
 
-MAYBE_UNUSED
 static struct commit *make_virtual_commit(struct repository *repo,
 					  struct tree *tree,
 					  const char *comment)
@@ -1369,7 +1368,6 @@ static struct commit *make_virtual_commit(struct repository *repo,
 	return commit;
 }
 
-MAYBE_UNUSED
 static struct commit_list *reverse_commit_list(struct commit_list *list)
 {
 	struct commit_list *previous = NULL, *current, *backup;
@@ -1478,6 +1476,91 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt,
 	}
 }
 
+/*
+ * Originally from merge_recursive_internal(); somewhat adapted, though.
+ */
+static void merge_ort_internal(struct merge_options *opt,
+			       struct commit_list *merge_bases,
+			       struct commit *h1,
+			       struct commit *h2,
+			       struct merge_result *result)
+{
+	struct commit_list *iter;
+	struct commit *merged_merge_bases;
+	const char *ancestor_name;
+	struct strbuf merge_base_abbrev = STRBUF_INIT;
+
+	if (!merge_bases) {
+		merge_bases = get_merge_bases(h1, h2);
+		/* See merge-ort.h:merge_incore_recursive() declaration NOTE */
+		merge_bases = reverse_commit_list(merge_bases);
+	}
+
+	merged_merge_bases = pop_commit(&merge_bases);
+	if (merged_merge_bases == NULL) {
+		/* if there is no common ancestor, use an empty tree */
+		struct tree *tree;
+
+		tree = lookup_tree(opt->repo, opt->repo->hash_algo->empty_tree);
+		merged_merge_bases = make_virtual_commit(opt->repo, tree,
+							 "ancestor");
+		ancestor_name = "empty tree";
+	} else if (opt->ancestor && !opt->priv->call_depth) {
+		ancestor_name = opt->ancestor;
+	} else if (merge_bases) {
+		ancestor_name = "merged common ancestors";
+	} else {
+		strbuf_add_unique_abbrev(&merge_base_abbrev,
+					 &merged_merge_bases->object.oid,
+					 DEFAULT_ABBREV);
+		ancestor_name = merge_base_abbrev.buf;
+	}
+
+	for (iter = merge_bases; iter; iter = iter->next) {
+		const char *saved_b1, *saved_b2;
+		struct commit *prev = merged_merge_bases;
+
+		opt->priv->call_depth++;
+		/*
+		 * When the merge fails, the result contains files
+		 * with conflict markers. The cleanness flag is
+		 * ignored (unless indicating an error), it was never
+		 * actually used, as result of merge_trees has always
+		 * overwritten it: the committed "conflicts" were
+		 * already resolved.
+		 */
+		saved_b1 = opt->branch1;
+		saved_b2 = opt->branch2;
+		opt->branch1 = "Temporary merge branch 1";
+		opt->branch2 = "Temporary merge branch 2";
+		merge_ort_internal(opt, NULL, prev, iter->item, result);
+		if (result->clean < 0)
+			return;
+		opt->branch1 = saved_b1;
+		opt->branch2 = saved_b2;
+		opt->priv->call_depth--;
+
+		merged_merge_bases = make_virtual_commit(opt->repo,
+							 result->tree,
+							 "merged tree");
+		commit_list_insert(prev, &merged_merge_bases->parents);
+		commit_list_insert(iter->item,
+				   &merged_merge_bases->parents->next);
+
+		clear_or_reinit_internal_opts(opt->priv, 1);
+	}
+
+	opt->ancestor = ancestor_name;
+	merge_ort_nonrecursive_internal(opt,
+					repo_get_commit_tree(opt->repo,
+							     merged_merge_bases),
+					repo_get_commit_tree(opt->repo, h1),
+					repo_get_commit_tree(opt->repo, h2),
+					result);
+	strbuf_release(&merge_base_abbrev);
+	opt->ancestor = NULL;  /* avoid accidental re-use of opt->ancestor */
+}
+
 void merge_incore_nonrecursive(struct merge_options *opt,
 			       struct tree *merge_base,
 			       struct tree *side1,
@@ -1495,5 +1578,24 @@ void merge_incore_recursive(struct merge_options *opt,
 			    struct commit *side2,
 			    struct merge_result *result)
 {
-	die("Not yet implemented");
+	/*
+	 * merge_incore_nonrecursive() exists for cases where we always
+	 * know there is a well-defined single merge base.  However,
+	 * despite a similar structure, merge-recursive.c noted that some
+	 * callers preferred to call the recursive logic anyway and still
+	 * set a special name for opt->ancestor that would appear in
+	 * merge.conflictStyle=diff3 output.
+	 *
+	 * git-am was one such example (it wanted to set opt->ancestor to
+	 * "constructed merge base", since it created a fake merge base);
+	 * it called the recursive merge logic through a special
+	 * merge_recursive_generic() wrapper.
+	 *
+	 * Allow the same kind of special case here.
+	 */
+	int num_merge_bases_is_1 = (merge_bases && !merge_bases->next);
+	assert(opt->ancestor == NULL || num_merge_bases_is_1);
+
+	merge_start(opt, result);
+	merge_ort_internal(opt, merge_bases, side1, side2, result);
 }
diff --git a/merge-ort.h b/merge-ort.h
index 55ae7ee865d..d53a0a339f3 100644
--- a/merge-ort.h
+++ b/merge-ort.h
@@ -34,6 +34,16 @@ struct merge_result {
 /*
  * rename-detecting three-way merge with recursive ancestor consolidation.
  * working tree and index are untouched.
+ *
+ * merge_bases will be consumed (emptied) so make a copy if you need it.
+ *
+ * NOTE: empirically, the recursive algorithm will perform better if you
+ *       pass the merge_bases in the order of oldest commit to the
+ *       newest[1][2].
+ *
+ *       [1] https://lore.kernel.org/git/nycvar.QRO.7.76.6.1907252055500.21907@tvgsbejvaqbjf.bet/
+ *       [2] commit 8918b0c9c2 ("merge-recur: try to merge older merge bases
+ *           first", 2006-08-09)
  */
 void merge_incore_recursive(struct merge_options *opt,
 			    struct commit_list *merge_bases,
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-16 13:30   ` Derrick Stolee
@ 2020-12-16 17:43     ` Junio C Hamano
  2020-12-16 18:54       ` Felipe Contreras
  2020-12-16 19:20       ` Elijah Newren
  0 siblings, 2 replies; 37+ messages in thread
From: Junio C Hamano @ 2020-12-16 17:43 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Elijah Newren via GitGitGadget, git, Elijah Newren

Derrick Stolee <stolee@gmail.com> writes:

> On 12/15/2020 12:53 PM, Elijah Newren via GitGitGadget wrote:
>> From: Elijah Newren <newren@gmail.com>
>> 
>> In a subsequent commit, we will implement the traditional recursiveness
>> that gave merge-recursive its name, namely merging non-unique
>> merge-bases to come up with a single virtual merge base.  Copy a few
>> helper functions from merge-recursive.c that we will use in the
>> implementation.
>
> I'm sure these are copies, but...
>
>> +static struct commit_list *reverse_commit_list(struct commit_list *list)
>> +{
>> +	struct commit_list *next = NULL, *current, *backup;
>> +	for (current = list; current; current = backup) {
>> +		backup = current->next;
>> +		current->next = next;
>> +		next = current;
>> +	}
>
> The naming of 'next' seems backwards to me, since it is really
> the "previous" node. Using something like 'previous' makes it
> clear that you are reversing when you say
>
> 	current->next = previous;

Hmph.  I took "next" commit_list as "list is the original one, and
next is the reversed list, the next generation of what we received".

Calling it "previous" feels even more backwards when you view among
three "struct commit_list *" pointers, one (the one that holds the
eventual return value) is primarily used to denote the resulting
list itself, and the other two are used to point individual elements
on the original list.

I wonder if a slightly different codeflow may be easier to follow

	struct commit_list *result = NULL;
	while (list) {
        	struct commit_list *next = list->next;
		list->next = result;
		result = list;
		list = next;
	}
	return result;

if we were to try improving this for readability?  

I dunno if it matters too much, though.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 3/3] merge-ort: implement merge_incore_recursive()
  2020-12-16  5:52   ` [PATCH v2 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
@ 2020-12-16 18:09     ` Junio C Hamano
  2020-12-16 18:37       ` Elijah Newren
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2020-12-16 18:09 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, Elijah Newren

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +	/*
> +	 * merge_incore_nonrecursive() exists for cases where we always
> +	 * know there is a well-defined single merge base.  However,
> +	 * despite a similar structure, merge-recursive.c noted that some
> +	 * callers preferred to call the recursive logic anyway and still
> +	 * set a special name for opt->ancestor that would appear in
> +	 * merge.conflictStyle=diff3 output.

Sorry, I do not understand the comment, especially the "anyway"
part.  There is no such thing as nonrecursive variant of
merge-recursive, is it?  If somebody wanted to perform a merge of
two trees with a designated single common ancestor ("am -3" would
want to do so using a fabricated tree, "cherry-pick" would want to
do so using the parent commit of what gets picked), it would be
natural to call "git merge-recursive O -- A B" if it is a scripted
Porcelain, or would call merge_recursive() with the single merge
base on the merge_bases commit_list parameter if it is written in C,
I would think.

> +	 * git-am was one such example (it wanted to set opt->ancestor to
> +	 * "constructed merge base", since it created a fake merge base);
> +	 * it called the recursive merge logic through a special
> +	 * merge_recursive_generic() wrapper.
> +	 *
> +	 * Allow the same kind of special case here.
> +	 */

In any case, the mention of "constructed merge base" may explain
very well why the assert in the previous iteration checked for the
string, but it does not seem relevant after the condition changed.

> +	int num_merge_bases_is_1 = (merge_bases && !merge_bases->next);
> +	assert(opt->ancestor == NULL || num_merge_bases_is_1);

The above comment may have explained why some callers that call the
machinery with a single merge base want to use its own diff3 label,
but the assert the comment applies to is stricter than that.

In other words, it is unclear why the caller is forbidden from
giving the diff3 label, when the recursive merge needs to synthesize
the virtual ancestor using all the given merge bases?

The answer could be a simple "opt->ancestor field is managed by the
recursive machinery itself when it needs to create virtual ancestor,
and must start as NULL, but when we do not create virtual ancestor,
it is allowed to start with any value, as the machinery itself does
not assign any new value to the field", but I cannot read if that is
the case from the comments in the patch.

> +
> +	merge_start(opt, result);
> +	merge_ort_internal(opt, merge_bases, side1, side2, result);
>  }
> diff --git a/merge-ort.h b/merge-ort.h
> index 55ae7ee865d..d53a0a339f3 100644
> --- a/merge-ort.h
> +++ b/merge-ort.h
> @@ -34,6 +34,16 @@ struct merge_result {
>  /*
>   * rename-detecting three-way merge with recursive ancestor consolidation.
>   * working tree and index are untouched.
> + *
> + * merge_bases will be consumed (emptied) so make a copy if you need it.
> + *
> + * NOTE: empirically, the recursive algorithm will perform better if you
> + *       pass the merge_bases in the order of oldest commit to the
> + *       newest[1][2].
> + *
> + *       [1] https://lore.kernel.org/git/nycvar.QRO.7.76.6.1907252055500.21907@tvgsbejvaqbjf.bet/
> + *       [2] commit 8918b0c9c2 ("merge-recur: try to merge older merge bases
> + *           first", 2006-08-09)
>   */

I initially thought that this was a bit out-of-place for the comment
that explains why the merge bases list gets reversed in the code, but
it is actually the right place---it guides the callers that hand a
list of merge_bases to the function.

>  void merge_incore_recursive(struct merge_options *opt,
>  			    struct commit_list *merge_bases,

Thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 3/3] merge-ort: implement merge_incore_recursive()
  2020-12-16 18:09     ` Junio C Hamano
@ 2020-12-16 18:37       ` Elijah Newren
  0 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren @ 2020-12-16 18:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Elijah Newren via GitGitGadget, Git Mailing List

On Wed, Dec 16, 2020 at 10:09 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > +     /*
> > +      * merge_incore_nonrecursive() exists for cases where we always
> > +      * know there is a well-defined single merge base.  However,
> > +      * despite a similar structure, merge-recursive.c noted that some
> > +      * callers preferred to call the recursive logic anyway and still
> > +      * set a special name for opt->ancestor that would appear in
> > +      * merge.conflictStyle=diff3 output.
>
> Sorry, I do not understand the comment, especially the "anyway"
> part.  There is no such thing as nonrecursive variant of
> merge-recursive, is it?  If somebody wanted to perform a merge of

There is absolutely a nonrecursive variant; it's called merge_trees()
in merge-recursive.[ch] and merge_incore_nonrecursive() in
merge-ort.[ch].

Note that I called out the nonrecursive variant at the beginning of
the comment.  Perhaps I should have said "recursive variant" rather
than "recursive logic"?

> two trees with a designated single common ancestor ("am -3" would
> want to do so using a fabricated tree, "cherry-pick" would want to
> do so using the parent commit of what gets picked), it would be
> natural to call "git merge-recursive O -- A B" if it is a scripted
> Porcelain, or would call merge_recursive() with the single merge
> base on the merge_bases commit_list parameter if it is written in C,
> I would think.

cherry-pick/sequencer uses merge_trees() or
merge_incore_nonrecursive().  In other words, the non-recursive API.

git-am uses merge_recursive_generic(), a weird wrapper that calls
merge_recursive() [whose analogue in merge-ort would be
merge_incore_recursive()].  It always passes in exactly one merge
base, though.  And the label for the "ancestor" commit in diff3 output
would default to showing some fake commit ID, if it weren't for the
ability to override the ancestor label in that case.

> > +      * git-am was one such example (it wanted to set opt->ancestor to
> > +      * "constructed merge base", since it created a fake merge base);
> > +      * it called the recursive merge logic through a special
> > +      * merge_recursive_generic() wrapper.
> > +      *
> > +      * Allow the same kind of special case here.
> > +      */
>
> In any case, the mention of "constructed merge base" may explain
> very well why the assert in the previous iteration checked for the
> string, but it does not seem relevant after the condition changed.

Sure, the question in the last iteration was "why check for this
string", but there's a new question with this iteration: who would
pass a special opt->ancestor and with what kind of meaning.  Providing
an example thus provides an answer to that question and gives people
an easy way to search for more information.

> > +     int num_merge_bases_is_1 = (merge_bases && !merge_bases->next);
> > +     assert(opt->ancestor == NULL || num_merge_bases_is_1);
>
> The above comment may have explained why some callers that call the
> machinery with a single merge base want to use its own diff3 label,
> but the assert the comment applies to is stricter than that.
>
> In other words, it is unclear why the caller is forbidden from
> giving the diff3 label, when the recursive merge needs to synthesize
> the virtual ancestor using all the given merge bases?
>
> The answer could be a simple "opt->ancestor field is managed by the
> recursive machinery itself when it needs to create virtual ancestor,
> and must start as NULL, but when we do not create virtual ancestor,
> it is allowed to start with any value, as the machinery itself does
> not assign any new value to the field", but I cannot read if that is
> the case from the comments in the patch.

Yeah, you're making me lean towards thinking that
merge_recursive_generic() is just a broken API that I shouldn't port
over (even as a wrapper), and that I further shouldn't support using
the merge_ort_recursive() API in a fashion wanted by that function.

I'll just toss the single-merge-base check along with the laborious comment.

To clarify, though, your comment above is mostly correct but is off in
the case where there is no need to create a virtual ancestor.
merge_incore_recursive() does assign a label in that case (if one is
not already assigned) -- and that label is just a unique abbreviation
of the relevant commit.  That's a great label when the merge base is a
"real" commit, less so when it's fake/constructed.

But we already have a merge_incore_nonrecursive() that allows (in
fact, requires) setting the ancestor label, so we should probably just
shift merge_recursive_generic() callers who have a special ancestor to
go through that API instead.

> > +
> > +     merge_start(opt, result);
> > +     merge_ort_internal(opt, merge_bases, side1, side2, result);
> >  }
> > diff --git a/merge-ort.h b/merge-ort.h
> > index 55ae7ee865d..d53a0a339f3 100644
> > --- a/merge-ort.h
> > +++ b/merge-ort.h
> > @@ -34,6 +34,16 @@ struct merge_result {
> >  /*
> >   * rename-detecting three-way merge with recursive ancestor consolidation.
> >   * working tree and index are untouched.
> > + *
> > + * merge_bases will be consumed (emptied) so make a copy if you need it.
> > + *
> > + * NOTE: empirically, the recursive algorithm will perform better if you
> > + *       pass the merge_bases in the order of oldest commit to the
> > + *       newest[1][2].
> > + *
> > + *       [1] https://lore.kernel.org/git/nycvar.QRO.7.76.6.1907252055500.21907@tvgsbejvaqbjf.bet/
> > + *       [2] commit 8918b0c9c2 ("merge-recur: try to merge older merge bases
> > + *           first", 2006-08-09)
> >   */
>
> I initially thought that this was a bit out-of-place for the comment
> that explains why the merge bases list gets reversed in the code, but
> it is actually the right place---it guides the callers that hand a
> list of merge_bases to the function.
>
> >  void merge_incore_recursive(struct merge_options *opt,
> >                           struct commit_list *merge_bases,
>
> Thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-16 17:43     ` Junio C Hamano
@ 2020-12-16 18:54       ` Felipe Contreras
  2020-12-16 19:20       ` Elijah Newren
  1 sibling, 0 replies; 37+ messages in thread
From: Felipe Contreras @ 2020-12-16 18:54 UTC (permalink / raw)
  To: Junio C Hamano, Derrick Stolee
  Cc: Elijah Newren via GitGitGadget, git, Elijah Newren

Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
> > On 12/15/2020 12:53 PM, Elijah Newren via GitGitGadget wrote:
> >> From: Elijah Newren <newren@gmail.com>
> >> 
> >> In a subsequent commit, we will implement the traditional recursiveness
> >> that gave merge-recursive its name, namely merging non-unique
> >> merge-bases to come up with a single virtual merge base.  Copy a few
> >> helper functions from merge-recursive.c that we will use in the
> >> implementation.
> >
> > I'm sure these are copies, but...
> >
> >> +static struct commit_list *reverse_commit_list(struct commit_list *list)
> >> +{
> >> +	struct commit_list *next = NULL, *current, *backup;
> >> +	for (current = list; current; current = backup) {
> >> +		backup = current->next;
> >> +		current->next = next;
> >> +		next = current;
> >> +	}
> >
> > The naming of 'next' seems backwards to me, since it is really
> > the "previous" node. Using something like 'previous' makes it
> > clear that you are reversing when you say
> >
> > 	current->next = previous;
> 
> Hmph.  I took "next" commit_list as "list is the original one, and
> next is the reversed list, the next generation of what we received".
> 
> Calling it "previous" feels even more backwards when you view among
> three "struct commit_list *" pointers, one (the one that holds the
> eventual return value) is primarily used to denote the resulting
> list itself, and the other two are used to point individual elements
> on the original list.

Both feel awkward to me because to me previous/next are actually current
in my mind, and we should return current, not previous/next. Plus
there's no need for current, we can use list, as your example.

This is what I came up with:

  struct commit_list *current = NULL;
  while (list) {
          struct commit_list *tmp = list->next;
          list->next = current;
          current = list;
          list = tmp;
  }
  return current;

For completeness I checked how Linux does it, and surprisingly
llist_reverse_order() is very close to what I came up with:

  struct commit_list *new_list = NULL;
  while (list) {
          struct commit_list *tmp = list;
          list = list->next;
          tmp->next = new_list;
          new_list = tmp;
  }
  return new_list;

(obviously translated)

Maybe it's just me, but I find my version easier to read.

Cheers.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-16 17:43     ` Junio C Hamano
  2020-12-16 18:54       ` Felipe Contreras
@ 2020-12-16 19:20       ` Elijah Newren
  2020-12-16 20:41         ` Junio C Hamano
  1 sibling, 1 reply; 37+ messages in thread
From: Elijah Newren @ 2020-12-16 19:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee, Elijah Newren via GitGitGadget, Git Mailing List

On Wed, Dec 16, 2020 at 9:43 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Derrick Stolee <stolee@gmail.com> writes:
>
> > On 12/15/2020 12:53 PM, Elijah Newren via GitGitGadget wrote:
> >> From: Elijah Newren <newren@gmail.com>
> >>
> >> In a subsequent commit, we will implement the traditional recursiveness
> >> that gave merge-recursive its name, namely merging non-unique
> >> merge-bases to come up with a single virtual merge base.  Copy a few
> >> helper functions from merge-recursive.c that we will use in the
> >> implementation.
> >
> > I'm sure these are copies, but...
> >
> >> +static struct commit_list *reverse_commit_list(struct commit_list *list)
> >> +{
> >> +    struct commit_list *next = NULL, *current, *backup;
> >> +    for (current = list; current; current = backup) {
> >> +            backup = current->next;
> >> +            current->next = next;
> >> +            next = current;
> >> +    }
> >
> > The naming of 'next' seems backwards to me, since it is really
> > the "previous" node. Using something like 'previous' makes it
> > clear that you are reversing when you say
> >
> >       current->next = previous;
>
> Hmph.  I took "next" commit_list as "list is the original one, and
> next is the reversed list, the next generation of what we received".
>
> Calling it "previous" feels even more backwards when you view among
> three "struct commit_list *" pointers, one (the one that holds the
> eventual return value) is primarily used to denote the resulting
> list itself, and the other two are used to point individual elements
> on the original list.
>
> I wonder if a slightly different codeflow may be easier to follow
>
>         struct commit_list *result = NULL;
>         while (list) {
>                 struct commit_list *next = list->next;
>                 list->next = result;
>                 result = list;
>                 list = next;
>         }
>         return result;
>
> if we were to try improving this for readability?

Looks like Felipe also came up with the same version you did (modulo
the temporary variable name).

> I dunno if it matters too much, though.

Yeah, reverse_commit_list() has been unchanged in merge-recursive.c
since its introduction in August of 2006.  The function's purpose
seems obvious from its name, so no one ever needs to look at or modify
the implementation.  I'm certain I'll never touch it again.  So, I
personally don't care what the particular implementation is, and I'm
happy to take whatever reviewers prefer here.

Since we have three people weighing in with opinions though -- and on
a point that's irrelevant to me -- do you want to make the call here
Junio?

Otherwise I'll just leave it as-is and not update further.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v4 0/3] merge-ort: implement recursive merges
  2020-12-16 17:17   ` [PATCH v3 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
                       ` (2 preceding siblings ...)
  2020-12-16 17:17     ` [PATCH v3 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
@ 2020-12-16 20:35     ` Elijah Newren via GitGitGadget
  2020-12-16 20:35       ` [PATCH v4 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
                         ` (3 more replies)
  3 siblings, 4 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 20:35 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin,
	Felipe Contreras, Elijah Newren

This series depends on en/merge-ort-2 (it does NOT depend on en/merge-ort-3
and can thus be reviewed/merged independently of it).

This short series adds handling of recursive merges (merging of multiple
merge-bases to create a virtual merge base) to merge-ort. With this short
series the number of test failures under GIT_TEST_MERGE_ALGORITHM=ort drops
by 801 (from 1448 to 647).

Changes since v3:

 * remove the confusing portions of the merge_incore_recursive() API around
   opt->ancestor that were designed to accommodate merge_recursive_generic()
   or some function like it. If that stuff is really needed, we can add it
   later, but it may be better to simply adjust merge_recursive_generic()
   and/or its callers when we get to that point in the porting process.

Elijah Newren (3):
  merge-ort: copy a few small helper functions from merge-recursive.c
  merge-ort: make clear_internal_opts() aware of partial clearing
  merge-ort: implement merge_incore_recursive()

 merge-ort.c | 132 +++++++++++++++++++++++++++++++++++++++++++++++++---
 merge-ort.h |  10 ++++
 2 files changed, 135 insertions(+), 7 deletions(-)


base-commit: c5a6f65527aa3b6f5d7cf25437a88d8727ab0646
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-814%2Fnewren%2Fort-recursive-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-814/newren/ort-recursive-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/814

Range-diff vs v3:

 1:  dcf28565ad3 = 1:  dcf28565ad3 merge-ort: copy a few small helper functions from merge-recursive.c
 2:  bffc45c6570 = 2:  bffc45c6570 merge-ort: make clear_internal_opts() aware of partial clearing
 3:  59216a155ae ! 3:  f622d6905d0 merge-ort: implement merge_incore_recursive()
     @@ merge-ort.c: static void merge_ort_nonrecursive_internal(struct merge_options *o
      +		merged_merge_bases = make_virtual_commit(opt->repo, tree,
      +							 "ancestor");
      +		ancestor_name = "empty tree";
     -+	} else if (opt->ancestor && !opt->priv->call_depth) {
     -+		ancestor_name = opt->ancestor;
      +	} else if (merge_bases) {
      +		ancestor_name = "merged common ancestors";
      +	} else {
     @@ merge-ort.c: void merge_incore_recursive(struct merge_options *opt,
       			    struct merge_result *result)
       {
      -	die("Not yet implemented");
     -+	/*
     -+	 * merge_incore_nonrecursive() exists for cases where we always
     -+	 * know there is a well-defined single merge base.  However,
     -+	 * despite a similar structure, merge-recursive.c noted that some
     -+	 * callers preferred to call the recursive logic anyway and still
     -+	 * set a special name for opt->ancestor that would appear in
     -+	 * merge.conflictStyle=diff3 output.
     -+	 *
     -+	 * git-am was one such example (it wanted to set opt->ancestor to
     -+	 * "constructed merge base", since it created a fake merge base);
     -+	 * it called the recursive merge logic through a special
     -+	 * merge_recursive_generic() wrapper.
     -+	 *
     -+	 * Allow the same kind of special case here.
     -+	 */
     -+	int num_merge_bases_is_1 = (merge_bases && !merge_bases->next);
     -+	assert(opt->ancestor == NULL || num_merge_bases_is_1);
     ++	/* We set the ancestor label based on the merge_bases */
     ++	assert(opt->ancestor == NULL);
      +
      +	merge_start(opt, result);
      +	merge_ort_internal(opt, merge_bases, side1, side2, result);

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v4 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-16 20:35     ` [PATCH v4 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
@ 2020-12-16 20:35       ` Elijah Newren via GitGitGadget
  2020-12-16 20:35       ` [PATCH v4 2/3] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 20:35 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin,
	Felipe Contreras, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

In a subsequent commit, we will implement the traditional recursiveness
that gave merge-recursive its name, namely merging non-unique
merge-bases to come up with a single virtual merge base.  Copy a few
helper functions from merge-recursive.c that we will use in the
implementation.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index 414e7b7eeac..b66f8a6f100 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -17,8 +17,10 @@
 #include "cache.h"
 #include "merge-ort.h"
 
+#include "alloc.h"
 #include "blob.h"
 #include "cache-tree.h"
+#include "commit.h"
 #include "commit-reach.h"
 #include "diff.h"
 #include "diffcore.h"
@@ -1348,6 +1350,36 @@ void merge_finalize(struct merge_options *opt,
 
 /*** Function Grouping: helper functions for merge_incore_*() ***/
 
+static inline void set_commit_tree(struct commit *c, struct tree *t)
+{
+	c->maybe_tree = t;
+}
+
+MAYBE_UNUSED
+static struct commit *make_virtual_commit(struct repository *repo,
+					  struct tree *tree,
+					  const char *comment)
+{
+	struct commit *commit = alloc_commit_node(repo);
+
+	set_merge_remote_desc(commit, comment, (struct object *)commit);
+	set_commit_tree(commit, tree);
+	commit->object.parsed = 1;
+	return commit;
+}
+
+MAYBE_UNUSED
+static struct commit_list *reverse_commit_list(struct commit_list *list)
+{
+	struct commit_list *previous = NULL, *current, *backup;
+	for (current = list; current; current = backup) {
+		backup = current->next;
+		current->next = previous;
+		previous = current;
+	}
+	return previous;
+}
+
 static void merge_start(struct merge_options *opt, struct merge_result *result)
 {
 	/* Sanity checks on opt */
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v4 2/3] merge-ort: make clear_internal_opts() aware of partial clearing
  2020-12-16 20:35     ` [PATCH v4 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  2020-12-16 20:35       ` [PATCH v4 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
@ 2020-12-16 20:35       ` Elijah Newren via GitGitGadget
  2020-12-16 20:35       ` [PATCH v4 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
  2020-12-16 22:27       ` [PATCH v5 0/4] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 20:35 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin,
	Felipe Contreras, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

In order to handle recursive merges, after merging merge-bases we need
to clear away most of the data we had built up but some of it needs to
be kept -- in particular the "output" field.  Rename the function to
reflect its future change in use.

Further, since "reinitialize" means we'll be reusing the fields
immediately, take advantage of this to only partially clear maps,
leaving the hashtable allocated and pre-sized.  (This may be slightly
out-of-order since the speedups aren't realized until there are far
more strmaps in use, but the patch submission process already went out
of order because of various questions and requests for strmap.  Anyway,
see commit 6ccdfc2a20 ("strmap: enable faster clearing and reusing of
strmaps", 2020-11-05), for performance details about the use of
strmap_partial_clear().)

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index b66f8a6f100..7679c1d08e2 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -253,10 +253,11 @@ static void free_strmap_strings(struct strmap *map)
 	}
 }
 
-static void clear_internal_opts(struct merge_options_internal *opti,
-				int reinitialize)
+static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
+					  int reinitialize)
 {
-	assert(!reinitialize);
+	void (*strmap_func)(struct strmap *, int) =
+		reinitialize ? strmap_partial_clear : strmap_clear;
 
 	/*
 	 * We marked opti->paths with strdup_strings = 0, so that we
@@ -266,14 +267,14 @@ static void clear_internal_opts(struct merge_options_internal *opti,
 	 * to deallocate them.
 	 */
 	free_strmap_strings(&opti->paths);
-	strmap_clear(&opti->paths, 1);
+	strmap_func(&opti->paths, 1);
 
 	/*
 	 * All keys and values in opti->conflicted are a subset of those in
 	 * opti->paths.  We don't want to deallocate anything twice, so we
 	 * don't free the keys and we pass 0 for free_values.
 	 */
-	strmap_clear(&opti->conflicted, 0);
+	strmap_func(&opti->conflicted, 0);
 
 	/*
 	 * opti->paths_to_free is similar to opti->paths; we created it with
@@ -1344,7 +1345,7 @@ void merge_finalize(struct merge_options *opt,
 
 	assert(opt->priv == NULL);
 
-	clear_internal_opts(opti, 0);
+	clear_or_reinit_internal_opts(opti, 0);
 	FREE_AND_NULL(opti);
 }
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v4 3/3] merge-ort: implement merge_incore_recursive()
  2020-12-16 20:35     ` [PATCH v4 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  2020-12-16 20:35       ` [PATCH v4 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
  2020-12-16 20:35       ` [PATCH v4 2/3] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
@ 2020-12-16 20:35       ` Elijah Newren via GitGitGadget
  2020-12-16 22:27       ` [PATCH v5 0/4] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 20:35 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin,
	Felipe Contreras, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Implement merge_incore_recursive(), mostly through the use of a new
helper function, merge_ort_internal(), which itself is based off
merge_recursive_internal() from merge-recursive.c.

This drops the number of failures in the testsuite when run under
GIT_TEST_MERGE_ALGORITHM=ort from around 1500 to 647.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 merge-ort.h | 10 ++++++
 2 files changed, 98 insertions(+), 3 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 7679c1d08e2..d6c0cd36f4c 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1356,7 +1356,6 @@ static inline void set_commit_tree(struct commit *c, struct tree *t)
 	c->maybe_tree = t;
 }
 
-MAYBE_UNUSED
 static struct commit *make_virtual_commit(struct repository *repo,
 					  struct tree *tree,
 					  const char *comment)
@@ -1369,7 +1368,6 @@ static struct commit *make_virtual_commit(struct repository *repo,
 	return commit;
 }
 
-MAYBE_UNUSED
 static struct commit_list *reverse_commit_list(struct commit_list *list)
 {
 	struct commit_list *previous = NULL, *current, *backup;
@@ -1478,6 +1476,89 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt,
 	}
 }
 
+/*
+ * Originally from merge_recursive_internal(); somewhat adapted, though.
+ */
+static void merge_ort_internal(struct merge_options *opt,
+			       struct commit_list *merge_bases,
+			       struct commit *h1,
+			       struct commit *h2,
+			       struct merge_result *result)
+{
+	struct commit_list *iter;
+	struct commit *merged_merge_bases;
+	const char *ancestor_name;
+	struct strbuf merge_base_abbrev = STRBUF_INIT;
+
+	if (!merge_bases) {
+		merge_bases = get_merge_bases(h1, h2);
+		/* See merge-ort.h:merge_incore_recursive() declaration NOTE */
+		merge_bases = reverse_commit_list(merge_bases);
+	}
+
+	merged_merge_bases = pop_commit(&merge_bases);
+	if (merged_merge_bases == NULL) {
+		/* if there is no common ancestor, use an empty tree */
+		struct tree *tree;
+
+		tree = lookup_tree(opt->repo, opt->repo->hash_algo->empty_tree);
+		merged_merge_bases = make_virtual_commit(opt->repo, tree,
+							 "ancestor");
+		ancestor_name = "empty tree";
+	} else if (merge_bases) {
+		ancestor_name = "merged common ancestors";
+	} else {
+		strbuf_add_unique_abbrev(&merge_base_abbrev,
+					 &merged_merge_bases->object.oid,
+					 DEFAULT_ABBREV);
+		ancestor_name = merge_base_abbrev.buf;
+	}
+
+	for (iter = merge_bases; iter; iter = iter->next) {
+		const char *saved_b1, *saved_b2;
+		struct commit *prev = merged_merge_bases;
+
+		opt->priv->call_depth++;
+		/*
+		 * When the merge fails, the result contains files
+		 * with conflict markers. The cleanness flag is
+		 * ignored (unless indicating an error), it was never
+		 * actually used, as result of merge_trees has always
+		 * overwritten it: the committed "conflicts" were
+		 * already resolved.
+		 */
+		saved_b1 = opt->branch1;
+		saved_b2 = opt->branch2;
+		opt->branch1 = "Temporary merge branch 1";
+		opt->branch2 = "Temporary merge branch 2";
+		merge_ort_internal(opt, NULL, prev, iter->item, result);
+		if (result->clean < 0)
+			return;
+		opt->branch1 = saved_b1;
+		opt->branch2 = saved_b2;
+		opt->priv->call_depth--;
+
+		merged_merge_bases = make_virtual_commit(opt->repo,
+							 result->tree,
+							 "merged tree");
+		commit_list_insert(prev, &merged_merge_bases->parents);
+		commit_list_insert(iter->item,
+				   &merged_merge_bases->parents->next);
+
+		clear_or_reinit_internal_opts(opt->priv, 1);
+	}
+
+	opt->ancestor = ancestor_name;
+	merge_ort_nonrecursive_internal(opt,
+					repo_get_commit_tree(opt->repo,
+							     merged_merge_bases),
+					repo_get_commit_tree(opt->repo, h1),
+					repo_get_commit_tree(opt->repo, h2),
+					result);
+	strbuf_release(&merge_base_abbrev);
+	opt->ancestor = NULL;  /* avoid accidental re-use of opt->ancestor */
+}
+
 void merge_incore_nonrecursive(struct merge_options *opt,
 			       struct tree *merge_base,
 			       struct tree *side1,
@@ -1495,5 +1576,9 @@ void merge_incore_recursive(struct merge_options *opt,
 			    struct commit *side2,
 			    struct merge_result *result)
 {
-	die("Not yet implemented");
+	/* We set the ancestor label based on the merge_bases */
+	assert(opt->ancestor == NULL);
+
+	merge_start(opt, result);
+	merge_ort_internal(opt, merge_bases, side1, side2, result);
 }
diff --git a/merge-ort.h b/merge-ort.h
index 55ae7ee865d..d53a0a339f3 100644
--- a/merge-ort.h
+++ b/merge-ort.h
@@ -34,6 +34,16 @@ struct merge_result {
 /*
  * rename-detecting three-way merge with recursive ancestor consolidation.
  * working tree and index are untouched.
+ *
+ * merge_bases will be consumed (emptied) so make a copy if you need it.
+ *
+ * NOTE: empirically, the recursive algorithm will perform better if you
+ *       pass the merge_bases in the order of oldest commit to the
+ *       newest[1][2].
+ *
+ *       [1] https://lore.kernel.org/git/nycvar.QRO.7.76.6.1907252055500.21907@tvgsbejvaqbjf.bet/
+ *       [2] commit 8918b0c9c2 ("merge-recur: try to merge older merge bases
+ *           first", 2006-08-09)
  */
 void merge_incore_recursive(struct merge_options *opt,
 			    struct commit_list *merge_bases,
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-16 19:20       ` Elijah Newren
@ 2020-12-16 20:41         ` Junio C Hamano
  2020-12-16 21:25           ` Felipe Contreras
  2020-12-16 21:34           ` Elijah Newren
  0 siblings, 2 replies; 37+ messages in thread
From: Junio C Hamano @ 2020-12-16 20:41 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Derrick Stolee, Elijah Newren via GitGitGadget, Git Mailing List

Elijah Newren <newren@gmail.com> writes:

> I wonder if a slightly different codeflow may be easier to follow
>>
>>         struct commit_list *result = NULL;
>>         while (list) {
>>                 struct commit_list *next = list->next;
>>                 list->next = result;
>>                 result = list;
>>                 list = next;
>>         }
>>         return result;
>>
>> if we were to try improving this for readability?
>
> Looks like Felipe also came up with the same version you did (modulo
> the temporary variable name).

Funny if it was sent as a response to the message that already had
the same answer...

>> I dunno if it matters too much, though.
>
> Yeah, reverse_commit_list() has been unchanged in merge-recursive.c
> since its introduction in August of 2006.  The function's purpose
> seems obvious from its name, so no one ever needs to look at or modify
> the implementation.  I'm certain I'll never touch it again.  So, I
> personally don't care what the particular implementation is, and I'm
> happy to take whatever reviewers prefer here.
>
> Since we have three people weighing in with opinions though -- and on
> a point that's irrelevant to me -- do you want to make the call here
> Junio?

If I were pressed to give a suggestion, I have two ;-)

I would prefer to see us first find out if all other callers of
get_merge_bases() _care_ about a particular order of the resulting
list.  If they do not care [*1*] and if it seems feasible to teach
get_merge_bases() build its return value reversed already without
too much extra effort, then the commit list reverser can
disappear and get_merge_bases() can be fixed to return its commit
list in older-to-newer order.

If the above does not happen, then I'd prefer to see a single commit
list reverser in commit.c and have it *shared* between the two merge
backends, instead of adding another one in merge-ort.c while leaving
the one in merge-recursive.c behind.  And the single implementation
can be either "copied from merge-recursive as that may be
unintuitive or harder to follow but at least we know it is battle
tested", or the one we see above.  If we were to take the latter, we
really need to avoid making stupid mistakes while attempting to
replace an existing awkward one with "simplified" version, though.

Sorry for listing even more work, but since you asked ... ;-)


[Footnote]

*1* if those who care actually would benefit if we used
older-to-newer order, it would be even better.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-16 20:41         ` Junio C Hamano
@ 2020-12-16 21:25           ` Felipe Contreras
  2020-12-16 21:34           ` Elijah Newren
  1 sibling, 0 replies; 37+ messages in thread
From: Felipe Contreras @ 2020-12-16 21:25 UTC (permalink / raw)
  To: Junio C Hamano, Elijah Newren
  Cc: Derrick Stolee, Elijah Newren via GitGitGadget, Git Mailing List

Junio C Hamano wrote:
> Elijah Newren <newren@gmail.com> writes:
> 
> > I wonder if a slightly different codeflow may be easier to follow
> >>
> >>         struct commit_list *result = NULL;
> >>         while (list) {
> >>                 struct commit_list *next = list->next;
> >>                 list->next = result;
> >>                 result = list;
> >>                 list = next;
> >>         }
> >>         return result;
> >>
> >> if we were to try improving this for readability?
> >
> > Looks like Felipe also came up with the same version you did (modulo
> > the temporary variable name).
> 
> Funny if it was sent as a response to the message that already had
> the same answer...

It's two changes, actually: s/result/current/ and s/next/tmp/.

But yeah, it's interesting that I used Elijahs's version as a basis and
arrived at virtually the same thing.

Cheers.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-16 20:41         ` Junio C Hamano
  2020-12-16 21:25           ` Felipe Contreras
@ 2020-12-16 21:34           ` Elijah Newren
  1 sibling, 0 replies; 37+ messages in thread
From: Elijah Newren @ 2020-12-16 21:34 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee, Elijah Newren via GitGitGadget, Git Mailing List

On Wed, Dec 16, 2020 at 12:42 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > I wonder if a slightly different codeflow may be easier to follow
> >>
> >>         struct commit_list *result = NULL;
> >>         while (list) {
> >>                 struct commit_list *next = list->next;
> >>                 list->next = result;
> >>                 result = list;
> >>                 list = next;
> >>         }
> >>         return result;
> >>
> >> if we were to try improving this for readability?
> >
> > Looks like Felipe also came up with the same version you did (modulo
> > the temporary variable name).
>
> Funny if it was sent as a response to the message that already had
> the same answer...


>
> >> I dunno if it matters too much, though.
> >
> > Yeah, reverse_commit_list() has been unchanged in merge-recursive.c
> > since its introduction in August of 2006.  The function's purpose
> > seems obvious from its name, so no one ever needs to look at or modify
> > the implementation.  I'm certain I'll never touch it again.  So, I
> > personally don't care what the particular implementation is, and I'm
> > happy to take whatever reviewers prefer here.
> >
> > Since we have three people weighing in with opinions though -- and on
> > a point that's irrelevant to me -- do you want to make the call here
> > Junio?
>
> If I were pressed to give a suggestion, I have two ;-)
>
> I would prefer to see us first find out if all other callers of
> get_merge_bases() _care_ about a particular order of the resulting
> list.  If they do not care [*1*] and if it seems feasible to teach
> get_merge_bases() build its return value reversed already without
> too much extra effort, then the commit list reverser can
> disappear and get_merge_bases() can be fixed to return its commit
> list in older-to-newer order.

The ones in sequencer.c and builtin/merge.c care about the order, but
they manually reverse it (because they are going to call the merge
machinery).  So, these two would benefit from it being reversed.
However...

notes-merge.c and submodule.c both have subtle dependencies on the
order (in ways that might be buggy).  They could perhaps be taught to
depend on the reversed order, but I'm leery of touching something that
looks possibly buggy (one even has a TODO highlighting it) and
becoming responsible.

get_octopus_merge_bases() depends on the order from get_merge_bases(),
and builtin/merge-base.c depends on that function.  So, the output
order of a command depends on it, which might thus affect user
scripts.

builtin/pull.c also depends on the order returned by get_octopus_merge_bases().

That's enough dependencies that I'm inclined to just leave this side
of things as they are.


> If the above does not happen, then I'd prefer to see a single commit
> list reverser in commit.c and have it *shared* between the two merge
> backends, instead of adding another one in merge-ort.c while leaving
> the one in merge-recursive.c behind.  And the single implementation
> can be either "copied from merge-recursive as that may be
> unintuitive or harder to follow but at least we know it is battle
> tested", or the one we see above.  If we were to take the latter, we
> really need to avoid making stupid mistakes while attempting to
> replace an existing awkward one with "simplified" version, though.

commit.c seems like a natural location.  I'll insert a patch at the
beginning of the series to move the function there, and just use
Johannes' original version from 2006 as-is -- that'll make it easier
to verify that my patch is simply moving things, anyway.

> Sorry for listing even more work, but since you asked ... ;-)
>
>
> [Footnote]
>
> *1* if those who care actually would benefit if we used
> older-to-newer order, it would be even better.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v5 0/4] merge-ort: implement recursive merges
  2020-12-16 20:35     ` [PATCH v4 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
                         ` (2 preceding siblings ...)
  2020-12-16 20:35       ` [PATCH v4 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
@ 2020-12-16 22:27       ` Elijah Newren via GitGitGadget
  2020-12-16 22:27         ` [PATCH v5 1/4] commit: move reverse_commit_list() from merge-recursive Elijah Newren via GitGitGadget
                           ` (3 more replies)
  3 siblings, 4 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 22:27 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin,
	Felipe Contreras, Elijah Newren

This series depends on en/merge-ort-2 (it does NOT depend on en/merge-ort-3
and can thus be reviewed/merged independently of it).

This short series adds handling of recursive merges (merging of multiple
merge-bases to create a virtual merge base) to merge-ort. With this short
series the number of test failures under GIT_TEST_MERGE_ALGORITHM=ort drops
by 801 (from 1448 to 647).

Changes since v4:

 * add an earlier patch in the series that moves reverse_commit_list(),
   as-is, to commit.c. This also shrinks what is now the second patch.

Elijah Newren (4):
  commit: move reverse_commit_list() from merge-recursive
  merge-ort: copy a few small helper functions from merge-recursive.c
  merge-ort: make clear_internal_opts() aware of partial clearing
  merge-ort: implement merge_incore_recursive()

 commit.c          |  11 +++++
 commit.h          |   3 ++
 merge-ort.c       | 121 +++++++++++++++++++++++++++++++++++++++++++---
 merge-ort.h       |  10 ++++
 merge-recursive.c |  11 -----
 5 files changed, 138 insertions(+), 18 deletions(-)


base-commit: c5a6f65527aa3b6f5d7cf25437a88d8727ab0646
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-814%2Fnewren%2Fort-recursive-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-814/newren/ort-recursive-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/814

Range-diff vs v4:

 -:  ----------- > 1:  9052faeabe6 commit: move reverse_commit_list() from merge-recursive
 1:  dcf28565ad3 ! 2:  949741932e5 merge-ort: copy a few small helper functions from merge-recursive.c
     @@ merge-ort.c: void merge_finalize(struct merge_options *opt,
      +	commit->object.parsed = 1;
      +	return commit;
      +}
     -+
     -+MAYBE_UNUSED
     -+static struct commit_list *reverse_commit_list(struct commit_list *list)
     -+{
     -+	struct commit_list *previous = NULL, *current, *backup;
     -+	for (current = list; current; current = backup) {
     -+		backup = current->next;
     -+		current->next = previous;
     -+		previous = current;
     -+	}
     -+	return previous;
     -+}
      +
       static void merge_start(struct merge_options *opt, struct merge_result *result)
       {
 2:  bffc45c6570 = 3:  3852125c70b merge-ort: make clear_internal_opts() aware of partial clearing
 3:  f622d6905d0 ! 4:  63e30492ccb merge-ort: implement merge_incore_recursive()
     @@ merge-ort.c: static inline void set_commit_tree(struct commit *c, struct tree *t
       static struct commit *make_virtual_commit(struct repository *repo,
       					  struct tree *tree,
       					  const char *comment)
     -@@ merge-ort.c: static struct commit *make_virtual_commit(struct repository *repo,
     - 	return commit;
     - }
     - 
     --MAYBE_UNUSED
     - static struct commit_list *reverse_commit_list(struct commit_list *list)
     - {
     - 	struct commit_list *previous = NULL, *current, *backup;
      @@ merge-ort.c: static void merge_ort_nonrecursive_internal(struct merge_options *opt,
       	}
       }

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v5 1/4] commit: move reverse_commit_list() from merge-recursive
  2020-12-16 22:27       ` [PATCH v5 0/4] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
@ 2020-12-16 22:27         ` Elijah Newren via GitGitGadget
  2020-12-17 14:03           ` Derrick Stolee
  2020-12-16 22:28         ` [PATCH v5 2/4] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 22:27 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin,
	Felipe Contreras, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 commit.c          | 11 +++++++++++
 commit.h          |  3 +++
 merge-recursive.c | 11 -----------
 3 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/commit.c b/commit.c
index f53429c0ac3..dc08a47b073 100644
--- a/commit.c
+++ b/commit.c
@@ -563,6 +563,17 @@ struct commit_list *copy_commit_list(struct commit_list *list)
 	return head;
 }
 
+struct commit_list *reverse_commit_list(struct commit_list *list)
+{
+	struct commit_list *next = NULL, *current, *backup;
+	for (current = list; current; current = backup) {
+		backup = current->next;
+		current->next = next;
+		next = current;
+	}
+	return next;
+}
+
 void free_commit_list(struct commit_list *list)
 {
 	while (list)
diff --git a/commit.h b/commit.h
index 5467786c7be..3e9139a0004 100644
--- a/commit.h
+++ b/commit.h
@@ -177,6 +177,9 @@ void commit_list_sort_by_date(struct commit_list **list);
 /* Shallow copy of the input list */
 struct commit_list *copy_commit_list(struct commit_list *list);
 
+/* Modify list in-place to reverse it, returning new head; list will be tail */
+struct commit_list *reverse_commit_list(struct commit_list *list);
+
 void free_commit_list(struct commit_list *list);
 
 struct rev_info; /* in revision.h, it circularly uses enum cmit_fmt */
diff --git a/merge-recursive.c b/merge-recursive.c
index f736a0f6323..b052974f191 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3517,17 +3517,6 @@ static int merge_trees_internal(struct merge_options *opt,
 	return clean;
 }
 
-static struct commit_list *reverse_commit_list(struct commit_list *list)
-{
-	struct commit_list *next = NULL, *current, *backup;
-	for (current = list; current; current = backup) {
-		backup = current->next;
-		current->next = next;
-		next = current;
-	}
-	return next;
-}
-
 /*
  * Merge the commits h1 and h2, returning a flag (int) indicating the
  * cleanness of the merge.  Also, if opt->priv->call_depth, create a
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v5 2/4] merge-ort: copy a few small helper functions from merge-recursive.c
  2020-12-16 22:27       ` [PATCH v5 0/4] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  2020-12-16 22:27         ` [PATCH v5 1/4] commit: move reverse_commit_list() from merge-recursive Elijah Newren via GitGitGadget
@ 2020-12-16 22:28         ` Elijah Newren via GitGitGadget
  2020-12-16 22:28         ` [PATCH v5 3/4] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
  2020-12-16 22:28         ` [PATCH v5 4/4] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 22:28 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin,
	Felipe Contreras, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

In a subsequent commit, we will implement the traditional recursiveness
that gave merge-recursive its name, namely merging non-unique
merge-bases to come up with a single virtual merge base.  Copy a few
helper functions from merge-recursive.c that we will use in the
implementation.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index 414e7b7eeac..6eac0cef491 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -17,8 +17,10 @@
 #include "cache.h"
 #include "merge-ort.h"
 
+#include "alloc.h"
 #include "blob.h"
 #include "cache-tree.h"
+#include "commit.h"
 #include "commit-reach.h"
 #include "diff.h"
 #include "diffcore.h"
@@ -1348,6 +1350,24 @@ void merge_finalize(struct merge_options *opt,
 
 /*** Function Grouping: helper functions for merge_incore_*() ***/
 
+static inline void set_commit_tree(struct commit *c, struct tree *t)
+{
+	c->maybe_tree = t;
+}
+
+MAYBE_UNUSED
+static struct commit *make_virtual_commit(struct repository *repo,
+					  struct tree *tree,
+					  const char *comment)
+{
+	struct commit *commit = alloc_commit_node(repo);
+
+	set_merge_remote_desc(commit, comment, (struct object *)commit);
+	set_commit_tree(commit, tree);
+	commit->object.parsed = 1;
+	return commit;
+}
+
 static void merge_start(struct merge_options *opt, struct merge_result *result)
 {
 	/* Sanity checks on opt */
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v5 3/4] merge-ort: make clear_internal_opts() aware of partial clearing
  2020-12-16 22:27       ` [PATCH v5 0/4] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
  2020-12-16 22:27         ` [PATCH v5 1/4] commit: move reverse_commit_list() from merge-recursive Elijah Newren via GitGitGadget
  2020-12-16 22:28         ` [PATCH v5 2/4] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
@ 2020-12-16 22:28         ` Elijah Newren via GitGitGadget
  2020-12-16 22:28         ` [PATCH v5 4/4] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 22:28 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin,
	Felipe Contreras, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

In order to handle recursive merges, after merging merge-bases we need
to clear away most of the data we had built up but some of it needs to
be kept -- in particular the "output" field.  Rename the function to
reflect its future change in use.

Further, since "reinitialize" means we'll be reusing the fields
immediately, take advantage of this to only partially clear maps,
leaving the hashtable allocated and pre-sized.  (This may be slightly
out-of-order since the speedups aren't realized until there are far
more strmaps in use, but the patch submission process already went out
of order because of various questions and requests for strmap.  Anyway,
see commit 6ccdfc2a20 ("strmap: enable faster clearing and reusing of
strmaps", 2020-11-05), for performance details about the use of
strmap_partial_clear().)

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 6eac0cef491..bd237f3472e 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -253,10 +253,11 @@ static void free_strmap_strings(struct strmap *map)
 	}
 }
 
-static void clear_internal_opts(struct merge_options_internal *opti,
-				int reinitialize)
+static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
+					  int reinitialize)
 {
-	assert(!reinitialize);
+	void (*strmap_func)(struct strmap *, int) =
+		reinitialize ? strmap_partial_clear : strmap_clear;
 
 	/*
 	 * We marked opti->paths with strdup_strings = 0, so that we
@@ -266,14 +267,14 @@ static void clear_internal_opts(struct merge_options_internal *opti,
 	 * to deallocate them.
 	 */
 	free_strmap_strings(&opti->paths);
-	strmap_clear(&opti->paths, 1);
+	strmap_func(&opti->paths, 1);
 
 	/*
 	 * All keys and values in opti->conflicted are a subset of those in
 	 * opti->paths.  We don't want to deallocate anything twice, so we
 	 * don't free the keys and we pass 0 for free_values.
 	 */
-	strmap_clear(&opti->conflicted, 0);
+	strmap_func(&opti->conflicted, 0);
 
 	/*
 	 * opti->paths_to_free is similar to opti->paths; we created it with
@@ -1344,7 +1345,7 @@ void merge_finalize(struct merge_options *opt,
 
 	assert(opt->priv == NULL);
 
-	clear_internal_opts(opti, 0);
+	clear_or_reinit_internal_opts(opti, 0);
 	FREE_AND_NULL(opti);
 }
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v5 4/4] merge-ort: implement merge_incore_recursive()
  2020-12-16 22:27       ` [PATCH v5 0/4] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
                           ` (2 preceding siblings ...)
  2020-12-16 22:28         ` [PATCH v5 3/4] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
@ 2020-12-16 22:28         ` Elijah Newren via GitGitGadget
  3 siblings, 0 replies; 37+ messages in thread
From: Elijah Newren via GitGitGadget @ 2020-12-16 22:28 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Derrick Stolee, Johannes Schindelin,
	Felipe Contreras, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Implement merge_incore_recursive(), mostly through the use of a new
helper function, merge_ort_internal(), which itself is based off
merge_recursive_internal() from merge-recursive.c.

This drops the number of failures in the testsuite when run under
GIT_TEST_MERGE_ALGORITHM=ort from around 1500 to 647.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 merge-ort.h | 10 ++++++
 2 files changed, 98 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index bd237f3472e..31103d21407 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1356,7 +1356,6 @@ static inline void set_commit_tree(struct commit *c, struct tree *t)
 	c->maybe_tree = t;
 }
 
-MAYBE_UNUSED
 static struct commit *make_virtual_commit(struct repository *repo,
 					  struct tree *tree,
 					  const char *comment)
@@ -1466,6 +1465,89 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt,
 	}
 }
 
+/*
+ * Originally from merge_recursive_internal(); somewhat adapted, though.
+ */
+static void merge_ort_internal(struct merge_options *opt,
+			       struct commit_list *merge_bases,
+			       struct commit *h1,
+			       struct commit *h2,
+			       struct merge_result *result)
+{
+	struct commit_list *iter;
+	struct commit *merged_merge_bases;
+	const char *ancestor_name;
+	struct strbuf merge_base_abbrev = STRBUF_INIT;
+
+	if (!merge_bases) {
+		merge_bases = get_merge_bases(h1, h2);
+		/* See merge-ort.h:merge_incore_recursive() declaration NOTE */
+		merge_bases = reverse_commit_list(merge_bases);
+	}
+
+	merged_merge_bases = pop_commit(&merge_bases);
+	if (merged_merge_bases == NULL) {
+		/* if there is no common ancestor, use an empty tree */
+		struct tree *tree;
+
+		tree = lookup_tree(opt->repo, opt->repo->hash_algo->empty_tree);
+		merged_merge_bases = make_virtual_commit(opt->repo, tree,
+							 "ancestor");
+		ancestor_name = "empty tree";
+	} else if (merge_bases) {
+		ancestor_name = "merged common ancestors";
+	} else {
+		strbuf_add_unique_abbrev(&merge_base_abbrev,
+					 &merged_merge_bases->object.oid,
+					 DEFAULT_ABBREV);
+		ancestor_name = merge_base_abbrev.buf;
+	}
+
+	for (iter = merge_bases; iter; iter = iter->next) {
+		const char *saved_b1, *saved_b2;
+		struct commit *prev = merged_merge_bases;
+
+		opt->priv->call_depth++;
+		/*
+		 * When the merge fails, the result contains files
+		 * with conflict markers. The cleanness flag is
+		 * ignored (unless indicating an error), it was never
+		 * actually used, as result of merge_trees has always
+		 * overwritten it: the committed "conflicts" were
+		 * already resolved.
+		 */
+		saved_b1 = opt->branch1;
+		saved_b2 = opt->branch2;
+		opt->branch1 = "Temporary merge branch 1";
+		opt->branch2 = "Temporary merge branch 2";
+		merge_ort_internal(opt, NULL, prev, iter->item, result);
+		if (result->clean < 0)
+			return;
+		opt->branch1 = saved_b1;
+		opt->branch2 = saved_b2;
+		opt->priv->call_depth--;
+
+		merged_merge_bases = make_virtual_commit(opt->repo,
+							 result->tree,
+							 "merged tree");
+		commit_list_insert(prev, &merged_merge_bases->parents);
+		commit_list_insert(iter->item,
+				   &merged_merge_bases->parents->next);
+
+		clear_or_reinit_internal_opts(opt->priv, 1);
+	}
+
+	opt->ancestor = ancestor_name;
+	merge_ort_nonrecursive_internal(opt,
+					repo_get_commit_tree(opt->repo,
+							     merged_merge_bases),
+					repo_get_commit_tree(opt->repo, h1),
+					repo_get_commit_tree(opt->repo, h2),
+					result);
+	strbuf_release(&merge_base_abbrev);
+	opt->ancestor = NULL;  /* avoid accidental re-use of opt->ancestor */
+}
+
 void merge_incore_nonrecursive(struct merge_options *opt,
 			       struct tree *merge_base,
 			       struct tree *side1,
@@ -1483,5 +1565,9 @@ void merge_incore_recursive(struct merge_options *opt,
 			    struct commit *side2,
 			    struct merge_result *result)
 {
-	die("Not yet implemented");
+	/* We set the ancestor label based on the merge_bases */
+	assert(opt->ancestor == NULL);
+
+	merge_start(opt, result);
+	merge_ort_internal(opt, merge_bases, side1, side2, result);
 }
diff --git a/merge-ort.h b/merge-ort.h
index 55ae7ee865d..d53a0a339f3 100644
--- a/merge-ort.h
+++ b/merge-ort.h
@@ -34,6 +34,16 @@ struct merge_result {
 /*
  * rename-detecting three-way merge with recursive ancestor consolidation.
  * working tree and index are untouched.
+ *
+ * merge_bases will be consumed (emptied) so make a copy if you need it.
+ *
+ * NOTE: empirically, the recursive algorithm will perform better if you
+ *       pass the merge_bases in the order of oldest commit to the
+ *       newest[1][2].
+ *
+ *       [1] https://lore.kernel.org/git/nycvar.QRO.7.76.6.1907252055500.21907@tvgsbejvaqbjf.bet/
+ *       [2] commit 8918b0c9c2 ("merge-recur: try to merge older merge bases
+ *           first", 2006-08-09)
  */
 void merge_incore_recursive(struct merge_options *opt,
 			    struct commit_list *merge_bases,
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v5 1/4] commit: move reverse_commit_list() from merge-recursive
  2020-12-16 22:27         ` [PATCH v5 1/4] commit: move reverse_commit_list() from merge-recursive Elijah Newren via GitGitGadget
@ 2020-12-17 14:03           ` Derrick Stolee
  0 siblings, 0 replies; 37+ messages in thread
From: Derrick Stolee @ 2020-12-17 14:03 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget, git
  Cc: Elijah Newren, Johannes Schindelin, Felipe Contreras

On 12/16/2020 5:27 PM, Elijah Newren via GitGitGadget wrote:
> +struct commit_list *reverse_commit_list(struct commit_list *list)
> +{
> +	struct commit_list *next = NULL, *current, *backup;
> +	for (current = list; current; current = backup) {
> +		backup = current->next;
> +		current->next = next;
> +		next = current;
> +	}
> +	return next;
> +}
> +

I followed the discussion about the variable names in
the earlier thread, and I certainly did not meant to
stir a controversy. I think this approach of deduplicating
but keeping an exact copy is the best solution.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2020-12-17 14:06 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-15 17:53 [PATCH 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
2020-12-15 17:53 ` [PATCH 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
2020-12-16  1:16   ` Junio C Hamano
2020-12-16 16:12     ` Johannes Schindelin
2020-12-16 16:24       ` Elijah Newren
2020-12-16 13:30   ` Derrick Stolee
2020-12-16 17:43     ` Junio C Hamano
2020-12-16 18:54       ` Felipe Contreras
2020-12-16 19:20       ` Elijah Newren
2020-12-16 20:41         ` Junio C Hamano
2020-12-16 21:25           ` Felipe Contreras
2020-12-16 21:34           ` Elijah Newren
2020-12-15 17:53 ` [PATCH 2/3] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
2020-12-15 17:53 ` [PATCH 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
2020-12-16  2:07   ` Junio C Hamano
2020-12-16  4:09     ` Elijah Newren
2020-12-16  4:44       ` Elijah Newren
2020-12-16  5:52 ` [PATCH v2 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
2020-12-16  5:52   ` [PATCH v2 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
2020-12-16  5:52   ` [PATCH v2 2/3] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
2020-12-16  5:52   ` [PATCH v2 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
2020-12-16 18:09     ` Junio C Hamano
2020-12-16 18:37       ` Elijah Newren
2020-12-16 17:17   ` [PATCH v3 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
2020-12-16 17:17     ` [PATCH v3 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
2020-12-16 17:17     ` [PATCH v3 2/3] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
2020-12-16 17:17     ` [PATCH v3 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
2020-12-16 20:35     ` [PATCH v4 0/3] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
2020-12-16 20:35       ` [PATCH v4 1/3] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
2020-12-16 20:35       ` [PATCH v4 2/3] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
2020-12-16 20:35       ` [PATCH v4 3/3] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget
2020-12-16 22:27       ` [PATCH v5 0/4] merge-ort: implement recursive merges Elijah Newren via GitGitGadget
2020-12-16 22:27         ` [PATCH v5 1/4] commit: move reverse_commit_list() from merge-recursive Elijah Newren via GitGitGadget
2020-12-17 14:03           ` Derrick Stolee
2020-12-16 22:28         ` [PATCH v5 2/4] merge-ort: copy a few small helper functions from merge-recursive.c Elijah Newren via GitGitGadget
2020-12-16 22:28         ` [PATCH v5 3/4] merge-ort: make clear_internal_opts() aware of partial clearing Elijah Newren via GitGitGadget
2020-12-16 22:28         ` [PATCH v5 4/4] merge-ort: implement merge_incore_recursive() Elijah Newren via GitGitGadget

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).