git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Christian Couder <chriscool@tuxfamily.org>,
	Taylor Blau <me@ttaylorr.com>,
	Johannes Altmanninger <aclopte@gmail.com>,
	Elijah Newren <newren@gmail.com>,
	Elijah Newren <newren@gmail.com>,
	Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH v2 4/8] merge-tree: implement real merges
Date: Fri, 7 Jan 2022 16:30:07 +0100 (CET)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.2201071602110.339@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <1710ba4a9e432e2a854579c4c929e7f2cfc92211.1641403655.git.gitgitgadget@gmail.com>

Hi Elijah,

On Wed, 5 Jan 2022, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
>
> This adds the ability to perform real merges rather than just trivial
> merges (meaning handling three way content merges, recursive ancestor
> consolidation, renames, proper directory/file conflict handling, and so
> forth).  However, unlike `git merge`, the working tree and index are
> left alone and no branch is updated.
>
> The only output is:
>   - the toplevel resulting tree printed on stdout
>   - exit status of 0 (clean) or 1 (conflicts present)
>
> This output is mean to be used by some higher level script, perhaps in a
                 ^^^^

My apologies for pointing out a grammar issue: This probably intended to
say "meant", as the word "mean" changes the sense of the sentence.

In my defense, I have more substantial suggestions below.

> sequence of steps like this:
>
>    NEWTREE=$(git merge-tree --real $BRANCH1 $BRANCH2)
>    test $? -eq 0 || die "There were conflicts..."
>    NEWCOMMIT=$(git commit-tree $NEWTREE -p $BRANCH1 -p $BRANCH2)
>    git update-ref $BRANCH1 $NEWCOMMIT
>
> Note that higher level scripts may also want to access the
> conflict/warning messages normally output during a merge, or have quick
> access to a list of files with conflicts.  That is not available in this
> preliminary implementation, but subsequent commits will add that
> ability.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  Documentation/git-merge-tree.txt | 28 +++++++----
>  builtin/merge-tree.c             | 55 +++++++++++++++++++++-
>  t/t4301-merge-tree-real.sh       | 81 ++++++++++++++++++++++++++++++++
>  3 files changed, 153 insertions(+), 11 deletions(-)
>  create mode 100755 t/t4301-merge-tree-real.sh
>
> diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
> index 58731c19422..5823938937f 100644
> --- a/Documentation/git-merge-tree.txt
> +++ b/Documentation/git-merge-tree.txt
> @@ -3,26 +3,34 @@ git-merge-tree(1)
>
>  NAME
>  ----
> -git-merge-tree - Show three-way merge without touching index
> +git-merge-tree - Perform merge without touching index or working tree
>
>
>  SYNOPSIS
>  --------
>  [verse]
> +'git merge-tree' --real <branch1> <branch2>
>  'git merge-tree' <base-tree> <branch1> <branch2>

Here is an idea: How about aiming for this synopsis instead, exploiting
the fact that the "real" mode takes a different amount of arguments?

   'git merge-tree' [--write-tree] <branch1> <branch2>
   'git merge-tree' [--demo-trivial-merge] <base-tree> <branch1> <branch2>

That way, the old mode can still function, and can even at some stage be
deprecated and eventually removed.

>
>  DESCRIPTION
>  -----------
> -Reads three tree-ish, and output trivial merge results and
> -conflicting stages to the standard output.  This is similar to
> -what three-way 'git read-tree -m' does, but instead of storing the
> -results in the index, the command outputs the entries to the
> -standard output.
> +Performs a merge, but does not make any new commits and does not read
> +from or write to either the working tree or index.
>
> -This is meant to be used by higher level scripts to compute
> -merge results outside of the index, and stuff the results back into the
> -index.  For this reason, the output from the command omits
> -entries that match the <branch1> tree.
> +The first form will merge the two branches, doing a full recursive
> +merge with rename detection.  If the merge is clean, the exit status
> +will be `0`, and if the merge has conflicts, the exit status will be
> +`1`.  The output will consist solely of the resulting toplevel tree
> +(which may have files including conflict markers).
> +
> +The second form is meant for backward compatibility and will only do a
> +trival merge.  It reads three tree-ish, and outputs trivial merge
> +results and conflicting stages to the standard output in a semi-diff
> +format.  Since this was designed for higher level scripts to consume
> +and merge the results back into the index, it omits entries that match
> +<branch1>.  The result of this second form is is similar to what
> +three-way 'git read-tree -m' does, but instead of storing the results
> +in the index, the command outputs the entries to the standard output.
>
>  GIT
>  ---
> diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> index e1d2832c809..ac50f3d108b 100644
> --- a/builtin/merge-tree.c
> +++ b/builtin/merge-tree.c
> @@ -2,6 +2,9 @@
>  #include "builtin.h"
>  #include "tree-walk.h"
>  #include "xdiff-interface.h"
> +#include "help.h"
> +#include "commit-reach.h"
> +#include "merge-ort.h"
>  #include "object-store.h"
>  #include "parse-options.h"
>  #include "repository.h"
> @@ -392,7 +395,57 @@ struct merge_tree_options {
>  static int real_merge(struct merge_tree_options *o,
>  		      const char *branch1, const char *branch2)
>  {
> -	die(_("real merges are not yet implemented"));
> +	struct commit *parent1, *parent2;
> +	struct commit_list *common;
> +	struct commit_list *merge_bases = NULL;
> +	struct commit_list *j;
> +	struct merge_options opt;
> +	struct merge_result result = { 0 };
> +
> +	parent1 = get_merge_parent(branch1);
> +	if (!parent1)
> +		help_unknown_ref(branch1, "merge",
> +				 _("not something we can merge"));
> +
> +	parent2 = get_merge_parent(branch2);
> +	if (!parent2)
> +		help_unknown_ref(branch2, "merge",
> +				 _("not something we can merge"));
> +
> +	init_merge_options(&opt, the_repository);
> +	/*
> +	 * TODO: Support subtree and other -X options?
> +	if (use_strategies_nr == 1 &&
> +	    !strcmp(use_strategies[0]->name, "subtree"))
> +		opt.subtree_shift = "";
> +	for (x = 0; x < xopts_nr; x++)
> +		if (parse_merge_opt(&opt, xopts[x]))
> +			die(_("Unknown strategy option: -X%s"), xopts[x]);
> +	*/
> +
> +	opt.show_rename_progress = 0;
> +
> +	opt.branch1 = merge_remote_util(parent1)->name; /* or just branch1? */
> +	opt.branch2 = merge_remote_util(parent2)->name; /* or just branch2? */
> +
> +	/*
> +	 * Get the merge bases, in reverse order; see comment above
> +	 * merge_incore_recursive in merge-ort.h
> +	 */
> +	common = get_merge_bases(parent1, parent2);
> +	for (j = common; j; j = j->next)
> +		commit_list_insert(j->item, &merge_bases);
> +
> +	/*
> +	 * TODO: notify if merging unrelated histories?

I guess that it would make most sense to add a flag whether this is
allowed or not, and I would suggest the default to be `off`.

> +	if (!common)
> +		fprintf(stderr, _("merging unrelated histories"));
> +	 */
> +
> +	merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
> +	printf("%s\n", oid_to_hex(&result.tree->object.oid));
> +	merge_switch_to_result(&opt, NULL, &result, 0, 0);

This looks to be idempotent to `merge_finalize(&opt, &result)`, so maybe
use that instead?

> +	return result.clean ? 0 : 1;
>  }
>
>  int cmd_merge_tree(int argc, const char **argv, const char *prefix)
> diff --git a/t/t4301-merge-tree-real.sh b/t/t4301-merge-tree-real.sh
> new file mode 100755
> index 00000000000..f7aa310f8c1
> --- /dev/null
> +++ b/t/t4301-merge-tree-real.sh
> @@ -0,0 +1,81 @@
> +#!/bin/sh
> +
> +test_description='git merge-tree --real'
> +
> +. ./test-lib.sh
> +
> +# This test is ort-specific
> +GIT_TEST_MERGE_ALGORITHM=ort
> +export GIT_TEST_MERGE_ALGORITHM

It might make sense to skip the entire test if the user asked for
`recursive` to be tested:

	test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort ||
		skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
		test_done
	}

> +
> +test_expect_success setup '
> +	test_write_lines 1 2 3 4 5 >numbers &&
> +	echo hello >greeting &&
> +	echo foo >whatever &&
> +	git add numbers greeting whatever &&
> +	git commit -m initial &&

I would really like to encourage the use of `test_tick`. It makes the
commit consistent, just in case you run into an issue that depends on some
hash order.

> +
> +	git branch side1 &&
> +	git branch side2 &&
> +
> +	git checkout side1 &&

Please use `git switch -c side1` or `git checkout -b side1`: it is more
compact than `git branch ... && git checkout ...`.

> +	test_write_lines 1 2 3 4 5 6 >numbers &&
> +	echo hi >greeting &&
> +	echo bar >whatever &&
> +	git add numbers greeting whatever &&
> +	git commit -m modify-stuff &&
> +
> +	git checkout side2 &&

This could be written as `git checkout -b side2 HEAD^`, to make the setup
more succinct.

> +	test_write_lines 0 1 2 3 4 5 >numbers &&
> +	echo yo >greeting &&
> +	git rm whatever &&
> +	mkdir whatever &&
> +	>whatever/empty &&
> +	git add numbers greeting whatever/empty &&
> +	git commit -m other-modifications
> +'
> +
> +test_expect_success 'Content merge and a few conflicts' '
> +	git checkout side1^0 &&
> +	test_must_fail git merge side2 &&
> +	cp .git/AUTO_MERGE EXPECT &&
> +	E_TREE=$(cat EXPECT) &&

The file `EXPECT` is not used below. And can we use a more obvious name?
SOmething like:

	expected_tree=$(cat .git/AUTO_MERGE)

> +	git reset --hard &&

For an extra bonus, we could delay this via `test_when_finished`, to prove
that `git merge-tree --real` works even in a dirty worktree _with
conflicts_.

> +	test_must_fail git merge-tree --real side1 side2 >RESULT &&
> +	R_TREE=$(cat RESULT) &&

How about `actual_tree` instead?

> +
> +	# Due to differences of e.g. "HEAD" vs "side1", the results will not
> +	# exactly match.  Dig into individual files.
> +
> +	# Numbers should have three-way merged cleanly
> +	test_write_lines 0 1 2 3 4 5 6 >expect &&
> +	git show ${R_TREE}:numbers >actual &&
> +	test_cmp expect actual &&
> +
> +	# whatever and whatever~<branch> should have same HASHES
> +	git rev-parse ${E_TREE}:whatever ${E_TREE}:whatever~HEAD >expect &&
> +	git rev-parse ${R_TREE}:whatever ${R_TREE}:whatever~side1 >actual &&
> +	test_cmp expect actual &&
> +
> +	# greeting should have a merge conflict
> +	git show ${E_TREE}:greeting >tmp &&
> +	cat tmp | sed -e s/HEAD/side1/ >expect &&
> +	git show ${R_TREE}:greeting >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'Barf on misspelled option' '
> +	# Mis-spell with single "s" instead of double "s"
> +	test_expect_code 129 git merge-tree --real --mesages FOOBAR side1 side2 2>expect &&
> +
> +	grep "error: unknown option.*mesages" expect
> +'

I do not think that this test case adds much, and we already test the
`parse_options()` machinery elsewhere.

> +
> +test_expect_success 'Barf on too many arguments' '
> +	test_expect_code 129 git merge-tree --real side1 side2 side3 2>expect &&
> +
> +	grep "^usage: git merge-tree" expect
> +'
> +
> +test_done

The rest looks awesome. Thank you for working on it! I will definitely
come back to review the rest (have to take a break now), and then probably
add quite a bit of food for thought based on my experience _actually_
using `merge-ort` on the server-side. Stay tuned.

Thank you,
Dscho

  reply	other threads:[~2022-01-07 15:30 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-31  5:03 [PATCH 0/8] RFC: Server side merges (no ref updating, no commit creating, no touching worktree or index) Elijah Newren via GitGitGadget
2021-12-31  5:03 ` [PATCH 1/8] merge-tree: rename merge_trees() to trivial_merge_trees() Elijah Newren via GitGitGadget
2021-12-31  5:03 ` [PATCH 2/8] merge-tree: move logic for existing merge into new function Elijah Newren via GitGitGadget
2022-01-01 20:11   ` Johannes Altmanninger
2022-01-01 20:17     ` Elijah Newren
2021-12-31  5:03 ` [PATCH 3/8] merge-tree: add option parsing and initial shell for real merge function Elijah Newren via GitGitGadget
2021-12-31  5:04 ` [PATCH 4/8] merge-tree: implement real merges Elijah Newren via GitGitGadget
2022-01-01 20:08   ` Johannes Altmanninger
2022-01-01 21:11     ` Elijah Newren
2022-01-03 12:23   ` Fabian Stelzer
2022-01-03 16:37     ` Elijah Newren
2021-12-31  5:04 ` [PATCH 5/8] merge-ort: split out a separate display_update_messages() function Elijah Newren via GitGitGadget
2022-01-03 12:15   ` Fabian Stelzer
2022-01-03 12:25     ` Fabian Stelzer
2021-12-31  5:04 ` [PATCH 6/8] merge-ort: allow update messages to be written to different file stream Elijah Newren via GitGitGadget
2022-01-01 20:08   ` Johannes Altmanninger
2022-01-01 20:19     ` Elijah Newren
2021-12-31  5:04 ` [PATCH 7/8] merge-tree: support saving merge messages to a separate file Elijah Newren via GitGitGadget
2022-01-03 12:31   ` Fabian Stelzer
2022-01-03 16:51     ` Elijah Newren
2022-01-03 17:22       ` Fabian Stelzer
2022-01-03 19:46         ` Elijah Newren
2022-01-04 13:05           ` Fabian Stelzer
2022-01-03 12:35   ` Fabian Stelzer
2022-01-03 16:55     ` Elijah Newren
2021-12-31  5:04 ` [PATCH 8/8] merge-tree: provide an easy way to access which files have conflicts Elijah Newren via GitGitGadget
2022-01-05 17:27 ` [PATCH v2 0/8] RFC: Server side merges (no ref updating, no commit creating, no touching worktree or index) Elijah Newren via GitGitGadget
2022-01-05 17:27   ` [PATCH v2 1/8] merge-tree: rename merge_trees() to trivial_merge_trees() Elijah Newren via GitGitGadget
2022-01-05 17:27   ` [PATCH v2 2/8] merge-tree: move logic for existing merge into new function Elijah Newren via GitGitGadget
2022-01-05 17:27   ` [PATCH v2 3/8] merge-tree: add option parsing and initial shell for real merge function Elijah Newren via GitGitGadget
2022-01-05 17:27   ` [PATCH v2 4/8] merge-tree: implement real merges Elijah Newren via GitGitGadget
2022-01-07 15:30     ` Johannes Schindelin [this message]
2022-01-07 17:26       ` Elijah Newren
2022-01-07 18:22         ` Johannes Schindelin
2022-01-07 19:15           ` Elijah Newren
2022-01-07 20:56         ` Junio C Hamano
2022-01-11 13:39           ` Johannes Schindelin
2022-01-07 18:12     ` Christian Couder
2022-01-07 19:09       ` Elijah Newren
2022-01-05 17:27   ` [PATCH v2 5/8] merge-ort: split out a separate display_update_messages() function Elijah Newren via GitGitGadget
2022-01-05 17:27   ` [PATCH v2 6/8] merge-ort: allow update messages to be written to different file stream Elijah Newren via GitGitGadget
2022-01-05 17:27   ` [PATCH v2 7/8] merge-tree: support saving merge messages to a separate file Elijah Newren via GitGitGadget
2022-01-07 18:07     ` Johannes Schindelin
2022-01-08  1:02       ` Elijah Newren
2022-01-05 17:27   ` [PATCH v2 8/8] merge-tree: provide an easy way to access which files have conflicts Elijah Newren via GitGitGadget
2022-01-05 19:09     ` Ramsay Jones
2022-01-05 19:17       ` Elijah Newren
2022-01-07 19:36     ` Johannes Schindelin
2022-01-07 22:12       ` Elijah Newren
2022-02-22 13:03         ` Johannes Schindelin
2022-01-08  1:28       ` Elijah Newren
2022-02-22 13:05         ` Johannes Schindelin
2022-01-05 20:18   ` [PATCH v2 0/8] RFC: Server side merges (no ref updating, no commit creating, no touching worktree or index) Junio C Hamano
2022-01-05 22:35     ` Elijah Newren
2022-01-07 18:46   ` Christian Couder
2022-01-07 19:59     ` Elijah Newren
2022-01-07 21:26       ` René Scharfe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.2201071602110.339@tvgsbejvaqbjf.bet \
    --to=johannes.schindelin@gmx.de \
    --cc=aclopte@gmail.com \
    --cc=chriscool@tuxfamily.org \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).