From: Elijah Newren <newren@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>,
Git Mailing List <git@vger.kernel.org>,
Christian Couder <chriscool@tuxfamily.org>,
Taylor Blau <me@ttaylorr.com>,
Johannes Altmanninger <aclopte@gmail.com>
Subject: Re: [PATCH v2 4/8] merge-tree: implement real merges
Date: Fri, 7 Jan 2022 09:26:31 -0800
Message-ID: <CABPp-BFUJ6pU_CKM7ccnFvi0nkeeGfd2GETdksKLaz=B_=BZAQ@mail.gmail.com>
In-Reply-To: <nycvar.QRO.7.76.6.2201071602110.339@tvgsbejvaqbjf.bet>
On Fri, Jan 7, 2022 at 7:30 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Elijah,
>
> On Wed, 5 Jan 2022, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > This adds the ability to perform real merges rather than just trivial
> > merges (meaning handling three way content merges, recursive ancestor
> > consolidation, renames, proper directory/file conflict handling, and so
> > forth). However, unlike `git merge`, the working tree and index are
> > left alone and no branch is updated.
> >
> > The only output is:
> > - the toplevel resulting tree printed on stdout
> > - exit status of 0 (clean) or 1 (conflicts present)
> >
> > This output is mean to be used by some higher level script, perhaps in a
> ^^^^
>
> My apologies for pointing out a grammar issue: This was probably intended
> to say "meant", as the word "mean" changes the sense of the sentence.
Oops. Yeah, I'll correct that; thanks for pointing it out.
> In my defense, I have more substantial suggestions below.
>
> > sequence of steps like this:
> >
> > NEWTREE=$(git merge-tree --real $BRANCH1 $BRANCH2)
> > test $? -eq 0 || die "There were conflicts..."
> > NEWCOMMIT=$(git commit-tree $NEWTREE -p $BRANCH1 -p $BRANCH2)
> > git update-ref $BRANCH1 $NEWCOMMIT
> >
> > Note that higher level scripts may also want to access the
> > conflict/warning messages normally output during a merge, or have quick
> > access to a list of files with conflicts. That is not available in this
> > preliminary implementation, but subsequent commits will add that
> > ability.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> > Documentation/git-merge-tree.txt | 28 +++++++----
> > builtin/merge-tree.c | 55 +++++++++++++++++++++-
> > t/t4301-merge-tree-real.sh | 81 ++++++++++++++++++++++++++++++++
> > 3 files changed, 153 insertions(+), 11 deletions(-)
> > create mode 100755 t/t4301-merge-tree-real.sh
> >
> > diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
> > index 58731c19422..5823938937f 100644
> > --- a/Documentation/git-merge-tree.txt
> > +++ b/Documentation/git-merge-tree.txt
> > @@ -3,26 +3,34 @@ git-merge-tree(1)
> >
> > NAME
> > ----
> > -git-merge-tree - Show three-way merge without touching index
> > +git-merge-tree - Perform merge without touching index or working tree
> >
> >
> > SYNOPSIS
> > --------
> > [verse]
> > +'git merge-tree' --real <branch1> <branch2>
> > 'git merge-tree' <base-tree> <branch1> <branch2>
>
> Here is an idea: How about aiming for this synopsis instead, exploiting
> the fact that the "real" mode takes a different amount of arguments?
My turn on the grammar thing: s/amount/number/. :-)
>
> 'git merge-tree' [--write-tree] <branch1> <branch2>
> 'git merge-tree' [--demo-trivial-merge] <base-tree> <branch1> <branch2>
>
> That way, the old mode can still function, and can even at some stage be
> deprecated and eventually removed.
Ooh, interesting.
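
A rough sketch of how the suggested synopsis could dispatch on the number of
positional arguments. `pick_mode` is a hypothetical helper written for
illustration only; it is not part of the patch, and the flag names are simply
the ones proposed above. The 129 return value mirrors the usage-error exit
code that the tests in this patch check for.

```shell
#!/bin/sh
# Hypothetical helper: choose a merge-tree mode from the argument list.
# An explicit flag must agree with the positional-argument count;
# without a flag, the count alone decides.
pick_mode () {
	mode=
	case "$1" in
	--write-tree) mode=write-tree; shift ;;
	--demo-trivial-merge) mode=trivial; shift ;;
	esac
	case "${mode}:$#" in
	write-tree:2 | :2) echo write-tree ;;
	trivial:3 | :3) echo trivial ;;
	*) echo usage; return 129 ;;
	esac
}

pick_mode side1 side2
pick_mode base side1 side2
pick_mode --write-tree base side1 side2 || echo "exit code: $?"
```

Keying on the argument count keeps old three-argument callers working
unchanged, which is what would allow the trivial mode to be deprecated later.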
> >
> > DESCRIPTION
> > -----------
> > -Reads three tree-ish, and output trivial merge results and
> > -conflicting stages to the standard output. This is similar to
> > -what three-way 'git read-tree -m' does, but instead of storing the
> > -results in the index, the command outputs the entries to the
> > -standard output.
> > +Performs a merge, but does not make any new commits and does not read
> > +from or write to either the working tree or index.
> >
> > -This is meant to be used by higher level scripts to compute
> > -merge results outside of the index, and stuff the results back into the
> > -index. For this reason, the output from the command omits
> > -entries that match the <branch1> tree.
> > +The first form will merge the two branches, doing a full recursive
> > +merge with rename detection. If the merge is clean, the exit status
> > +will be `0`, and if the merge has conflicts, the exit status will be
> > +`1`. The output will consist solely of the resulting toplevel tree
> > +(which may have files including conflict markers).
> > +
> > +The second form is meant for backward compatibility and will only do a
> > +trivial merge. It reads three tree-ish, and outputs trivial merge
> > +results and conflicting stages to the standard output in a semi-diff
> > +format. Since this was designed for higher level scripts to consume
> > +and merge the results back into the index, it omits entries that match
> > +<branch1>. The result of this second form is similar to what
> > +three-way 'git read-tree -m' does, but instead of storing the results
> > +in the index, the command outputs the entries to the standard output.
> >
> > GIT
> > ---
> > diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> > index e1d2832c809..ac50f3d108b 100644
> > --- a/builtin/merge-tree.c
> > +++ b/builtin/merge-tree.c
> > @@ -2,6 +2,9 @@
> > #include "builtin.h"
> > #include "tree-walk.h"
> > #include "xdiff-interface.h"
> > +#include "help.h"
> > +#include "commit-reach.h"
> > +#include "merge-ort.h"
> > #include "object-store.h"
> > #include "parse-options.h"
> > #include "repository.h"
> > @@ -392,7 +395,57 @@ struct merge_tree_options {
> > static int real_merge(struct merge_tree_options *o,
> > const char *branch1, const char *branch2)
> > {
> > - die(_("real merges are not yet implemented"));
> > + struct commit *parent1, *parent2;
> > + struct commit_list *common;
> > + struct commit_list *merge_bases = NULL;
> > + struct commit_list *j;
> > + struct merge_options opt;
> > + struct merge_result result = { 0 };
> > +
> > + parent1 = get_merge_parent(branch1);
> > + if (!parent1)
> > + help_unknown_ref(branch1, "merge",
> > + _("not something we can merge"));
> > +
> > + parent2 = get_merge_parent(branch2);
> > + if (!parent2)
> > + help_unknown_ref(branch2, "merge",
> > + _("not something we can merge"));
> > +
> > + init_merge_options(&opt, the_repository);
> > + /*
> > + * TODO: Support subtree and other -X options?
> > + if (use_strategies_nr == 1 &&
> > + !strcmp(use_strategies[0]->name, "subtree"))
> > + opt.subtree_shift = "";
> > + for (x = 0; x < xopts_nr; x++)
> > + if (parse_merge_opt(&opt, xopts[x]))
> > + die(_("Unknown strategy option: -X%s"), xopts[x]);
> > + */
> > +
> > + opt.show_rename_progress = 0;
> > +
> > + opt.branch1 = merge_remote_util(parent1)->name; /* or just branch1? */
> > + opt.branch2 = merge_remote_util(parent2)->name; /* or just branch2? */
> > +
> > + /*
> > + * Get the merge bases, in reverse order; see comment above
> > + * merge_incore_recursive in merge-ort.h
> > + */
> > + common = get_merge_bases(parent1, parent2);
> > + for (j = common; j; j = j->next)
> > + commit_list_insert(j->item, &merge_bases);
> > +
> > + /*
> > + * TODO: notify if merging unrelated histories?
>
> I guess that it would make most sense to add a flag whether this is
> allowed or not, and I would suggest the default to be `off`.
Sounds fair. Thanks for commenting on one of the TODOs that I was unsure about.
> > + if (!common)
> > + fprintf(stderr, _("merging unrelated histories"));
> > + */
> > +
> > + merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
> > + printf("%s\n", oid_to_hex(&result.tree->object.oid));
> > + merge_switch_to_result(&opt, NULL, &result, 0, 0);
>
> This looks to be equivalent to `merge_finalize(&opt, &result)`, so maybe
> use that instead?
Yeah, and add a TODO about the display messages (that'll be addressed
in a later patch, unlike the above TODOs).
>
> > + return result.clean ? 0 : 1;
> > }
> >
> > int cmd_merge_tree(int argc, const char **argv, const char *prefix)
> > diff --git a/t/t4301-merge-tree-real.sh b/t/t4301-merge-tree-real.sh
> > new file mode 100755
> > index 00000000000..f7aa310f8c1
> > --- /dev/null
> > +++ b/t/t4301-merge-tree-real.sh
> > @@ -0,0 +1,81 @@
> > +#!/bin/sh
> > +
> > +test_description='git merge-tree --real'
> > +
> > +. ./test-lib.sh
> > +
> > +# This test is ort-specific
> > +GIT_TEST_MERGE_ALGORITHM=ort
> > +export GIT_TEST_MERGE_ALGORITHM
>
> It might make sense to skip the entire test if the user asked for
> `recursive` to be tested:
>
> test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort ||
> skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
> test_done
> }
The idea makes sense, but it took me a bit to understand this code
block. I think you're just missing an opening left curly brace right
after the '||'?
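
For reference, a sketch of the guard with the opening brace in place. So that
it runs outside the test suite, `test_done` is replaced by a small stub and
the guard is wrapped in a hypothetical `run_guard` function; in a real test
script only the four guard lines would appear, right after sourcing
test-lib.sh.

```shell
#!/bin/sh
# Stub for test-lib.sh's test_done so the sketch runs on its own; the real
# one reports results and exits the test script.
test_done () {
	if test -n "$skip_all"
	then
		echo "skipped: $skip_all"
	else
		echo "ran"
	fi
	exit 0
}

# Hypothetical wrapper: run the guard in a subshell with a given setting.
run_guard () (
	GIT_TEST_MERGE_ALGORITHM=$1
	skip_all=
	test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort || {
		skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
		test_done
	}
	# ... test cases would go here ...
	test_done
)

result_ort=$(run_guard ort)
result_recursive=$(run_guard recursive)
result_unset=$(run_guard "")
echo "$result_ort / $result_recursive / $result_unset"
```

The `${GIT_TEST_MERGE_ALGORITHM:-ort}` expansion treats an unset or empty
variable as `ort`, so the tests are skipped only when another algorithm was
explicitly requested.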
> > +
> > +test_expect_success setup '
> > + test_write_lines 1 2 3 4 5 >numbers &&
> > + echo hello >greeting &&
> > + echo foo >whatever &&
> > + git add numbers greeting whatever &&
> > + git commit -m initial &&
>
> I would really like to encourage the use of `test_tick`. It makes the
> commit consistent, just in case you run into an issue that depends on some
> hash order.
I've used test_tick before, but I already know this test can't depend
on hash order. Further, the hashes in the output are also replaced
before comparing in order to make the tests also work as-is under
sha256. So the tests are explicitly ignoring precise hashes. As
such, I'm not sure I see the value of test_tick here.
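
For readers unfamiliar with the helper being discussed, here is a simplified
stand-in for what `test_tick` does: it pins the author and committer dates to
a counter that advances on each call, so commit hashes come out the same on
every run. The name `fake_tick` marks it as hypothetical, and the starting
timestamp and 60-second step are assumptions modeled on test-lib, not copied
from it.

```shell
#!/bin/sh
# Simplified stand-in for test-lib's test_tick. Starting value and step
# size are assumptions for illustration.
fake_tick () {
	if test -z "${fake_tick_now+set}"
	then
		fake_tick_now=1112911993
	else
		fake_tick_now=$((fake_tick_now + 60))
	fi
	GIT_COMMITTER_DATE="$fake_tick_now -0700"
	GIT_AUTHOR_DATE="$fake_tick_now -0700"
	export GIT_COMMITTER_DATE GIT_AUTHOR_DATE
}

fake_tick
first=$GIT_COMMITTER_DATE
fake_tick
second=$GIT_COMMITTER_DATE
echo "first commit date:  $first"
echo "second commit date: $second"
```

Since the tests here compare trees with the hashes normalized away,
deterministic dates only matter if a later test starts depending on them.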
> > +
> > + git branch side1 &&
> > + git branch side2 &&
> > +
> > + git checkout side1 &&
>
> Please use `git switch -c side1` or `git checkout -b side1`: it is more
> compact than `git branch ... && git checkout ...`.
Yes, but less forgiving to later modification where I go and add
additional commits on one of the sides, because...
>
> > + test_write_lines 1 2 3 4 5 6 >numbers &&
> > + echo hi >greeting &&
> > + echo bar >whatever &&
> > + git add numbers greeting whatever &&
> > + git commit -m modify-stuff &&
> > +
> > + git checkout side2 &&
>
> This could be written as `git checkout -b side2 HEAD^`, to make the setup
> more succinct.
...the presumption of HEAD^ is hardcoded and has to be parsed by
readers to understand the test. It felt like more cognitive overhead
to me, in addition to being less malleable.
> > + test_write_lines 0 1 2 3 4 5 >numbers &&
> > + echo yo >greeting &&
> > + git rm whatever &&
> > + mkdir whatever &&
> > + >whatever/empty &&
> > + git add numbers greeting whatever/empty &&
> > + git commit -m other-modifications
> > +'
> > +
> > +test_expect_success 'Content merge and a few conflicts' '
> > + git checkout side1^0 &&
> > + test_must_fail git merge side2 &&
> > + cp .git/AUTO_MERGE EXPECT &&
> > + E_TREE=$(cat EXPECT) &&
>
> The file `EXPECT` is not used below. And can we use a more obvious name?
> Something like:
>
> expected_tree=$(cat .git/AUTO_MERGE)
There go my beautiful <80 character lines below. :-(
But on a more serious note, yeah this is probably better. I'll change it. :-)
>
> > + git reset --hard &&
>
> For an extra bonus, we could delay this via `test_when_finished`, to prove
> that `git merge-tree --real` works even in a dirty worktree _with
> conflicts_.
Ooh, good thought. I like that.
>
> > + test_must_fail git merge-tree --real side1 side2 >RESULT &&
> > + R_TREE=$(cat RESULT) &&
>
> How about `actual_tree` instead?
But my 80-characters rev-parse lines....waaah. Just kidding, yeah
this would be better.
> > +
> > + # Due to differences of e.g. "HEAD" vs "side1", the results will not
> > + # exactly match. Dig into individual files.
> > +
> > + # Numbers should have three-way merged cleanly
> > + test_write_lines 0 1 2 3 4 5 6 >expect &&
> > + git show ${R_TREE}:numbers >actual &&
> > + test_cmp expect actual &&
> > +
> > + # whatever and whatever~<branch> should have same HASHES
> > + git rev-parse ${E_TREE}:whatever ${E_TREE}:whatever~HEAD >expect &&
> > + git rev-parse ${R_TREE}:whatever ${R_TREE}:whatever~side1 >actual &&
> > + test_cmp expect actual &&
> > +
> > + # greeting should have a merge conflict
> > + git show ${E_TREE}:greeting >tmp &&
> > + cat tmp | sed -e s/HEAD/side1/ >expect &&
> > + git show ${R_TREE}:greeting >actual &&
> > + test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'Barf on misspelled option' '
> > + # Mis-spell with single "s" instead of double "s"
> > + test_expect_code 129 git merge-tree --real --mesages FOOBAR side1 side2 2>expect &&
> > +
> > + grep "error: unknown option.*mesages" expect
> > +'
>
> I do not think that this test case adds much, and we already test the
> `parse_options()` machinery elsewhere.
It's more about verifying that exit codes of 0 & 1 are reserved for
"completed with no conflicts" and "completed with conflicts". The 129
bit in this test is the important bit (and perhaps is well-known to
lots of other folks, but I thought it was worth highlighting). That
said, I did a bad job mentioning that in the test description; I'll
fix it up.
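
To illustrate why the reserved exit codes matter to callers, here is a sketch
of how a higher-level script might branch on them. `classify_merge_status` is
a hypothetical helper, and treating everything other than 0 and 1 as an error
is an assumption about how a caller would want to behave.

```shell
#!/bin/sh
# Hypothetical helper: map a merge-tree exit status to an outcome.
#   0     -> merge completed with no conflicts
#   1     -> merge completed with conflicts
#   other -> the merge did not run (e.g. 129 for a usage error)
classify_merge_status () {
	case "$1" in
	0) echo clean ;;
	1) echo conflicts ;;
	*) echo error ;;
	esac
}

# A caller would capture the status right after the command, e.g.:
#   NEWTREE=$(git merge-tree --real "$BRANCH1" "$BRANCH2")
#   outcome=$(classify_merge_status $?)
for status in 0 1 129
do
	echo "$status -> $(classify_merge_status "$status")"
done
```

A script that only tests for non-zero would mistake a misspelled option for a
conflicted merge, which is the confusion the 129 check guards against.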
> > +
> > +test_expect_success 'Barf on too many arguments' '
> > + test_expect_code 129 git merge-tree --real side1 side2 side3 2>expect &&
> > +
> > + grep "^usage: git merge-tree" expect
> > +'
> > +
> > +test_done
>
> The rest looks awesome. Thank you for working on it! I will definitely
> come back to review the rest (have to take a break now), and then probably
> add quite a bit of food for thought based on my experience _actually_
> using `merge-ort` on the server-side. Stay tuned.
Ooh, I'm intrigued. And thanks for reviewing!