git@vger.kernel.org mailing list mirror (one of many)
From: Felipe Contreras <felipe.contreras@gmail.com>
To: Phillip Wood <phillip.wood123@gmail.com>,
	Felipe Contreras <felipe.contreras@gmail.com>,
	git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	David Aguilar <davvid@gmail.com>, Johannes Sixt <j6t@kdbg.org>,
	Seth House <seth@eseth.com>
Subject: Re: [PATCH v4 1/1] mergetool: add automerge configuration
Date: Sun, 20 Dec 2020 21:04:33 -0600
Message-ID: <5fe010c1d4988_89f1208dc@natae.notmuch>
In-Reply-To: <c0e0e600-7adf-966f-620e-ea29c3f76916@gmail.com>

Phillip Wood wrote:
> On 19/12/2020 12:53, Felipe Contreras wrote:
> > Phillip Wood wrote:
> >> Hi Felipe
> >>
> >> On 18/12/2020 12:49, Felipe Contreras wrote:
> >>> It doesn't make sense to display lines without conflicts in the
> >>> different views of all mergetools.
> >>>
> >>> Only the lines that warrant conflict markers should be displayed.
> >>>
> >>> Most people would want this behavior on, but in case some don't, add a
> >>> new configuration: mergetool.autoMerge.
> >>>
> >>> See Seth House's blog post [1] for the idea, and the rationale.
> >>>
> >>> [1] https://www.eseth.org/2020/mergetools.html
> >>
> >> It would be good to have a summary of the idea in this commit message so
> >> people do not have to go and find a blog post which may well disappear
> >> in the future.
> > 
> > I thought I did in the paragraphs above. How about adding this
> > explanation:
> > 
> > When merging, not all lines with changes are considered conflicts, for
> > example:
> > 
> >    cat >BASE <<EOF
> >    Patagraph 1
> > 
> >    Paragraph 2
> >    EOF
> > 
> >    cat >LOCAL <<EOF
> >    Paragraph 1
> > 
> >    Paragraph 2
> >    EOF
> > 
> >    cat >REMOTE <<EOF
> >    Patagraph 1.
> > 
> >    Paragraph 2.
> >    EOF
> > 
> > In this case the first paragraph does have a conflict because there are
> > two changes (in LOCAL and REMOTE), that the user must resolve.
> > 
> > However, the second paragraph doesn't have a conflict; it's
> > straightforward to decide that we want the only change present (in
> > REMOTE).
> > 
> > In fact, if it were not for the first paragraph with a conflict, git
> > wouldn't have bothered the user since the automatic merge would have
> > succeeded.
> > 
> > So it doesn't make sense to display these unconflicted lines to the user
> > inside the mergetool; it only creates noise.
> > 
> > We can fix that by propagating the final version of the file with the
> > automatic merge to all the panes of the mergetool (BASE, LOCAL, and
> > REMOTE), and only make them differ on the places where there are actual
> > conflicts (and they are demarcated with conflict markers).
> > 
> > (this is mostly my explanation though, not Seth's, who used visual
> > examples)
> 
> I'm not sure we need that much detail, it just needs to explain that the 
> merge tools display non-conflicting changes. Maybe something along the 
> lines of
> 
> Most merge tools ask the user to merge all the changes in the merge 
> including changes to just one side which do not create conflicts rather 
> than just the conflicting changes. This is inconvenient and a waste of 
> the user's time. We can avoid this by passing the tool two files which 
> resolve the conflicts in favor of the LOCAL and REMOTE side of the merge 
> as the LOCAL and REMOTE merge heads respectively rather than the real 
> merge heads.

It's not just two files. And I, at least, find the above a little
confusing. How about:

The purpose of mergetools is to resolve conflicts when git cannot
automatically do so. For that git has added markers in the specific
areas that need resolving, which the user must manually fix. The tool is
supposed to help with that.

However, by passing the original BASE, LOCAL, and REMOTE files, many
changes without conflict are presented to the user when in fact nothing
needs to be done for them.

We can fix that by propagating the final version of the file with the
automatic merge to all the panes of the mergetool (BASE, LOCAL, and
REMOTE), and only make them differ on the places where there are actual
conflicts.
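
As an illustration, the approach can be tried by hand on the three-paragraph
example quoted earlier. This is only a sketch: the temporary directory is
illustrative, and the `=======` pattern assumes LF line endings (the patch
itself also allows a trailing CR).

```shell
# Sketch only: recreate the example files in a scratch directory.
demo_dir=$(mktemp -d)

printf 'Patagraph 1\n\nParagraph 2\n'   >"$demo_dir/BASE"
printf 'Paragraph 1\n\nParagraph 2\n'   >"$demo_dir/LOCAL"
printf 'Patagraph 1.\n\nParagraph 2.\n' >"$demo_dir/REMOTE"

(
	cd "$demo_dir" || exit 1

	# The exit status is the number of conflicts, so non-zero is expected.
	git merge-file --diff3 --marker-size=7 -q -p LOCAL BASE REMOTE >DIFF3 || :

	# Keep one side of each conflict region per file. Everything outside
	# the markers is the automatic merge result and lands in all three.
	sed -e '/^<<<<<<< /,/^||||||| /d' -e '/^=======$/,/^>>>>>>> /d' DIFF3 >BASE
	sed -e '/^||||||| /,/^>>>>>>> /d' -e '/^<<<<<<< /d' DIFF3 >LOCAL
	sed -e '/^<<<<<<< /,/^=======$/d' -e '/^>>>>>>> /d' DIFF3 >REMOTE

	for pane in BASE LOCAL REMOTE
	do
		echo "== $pane =="
		cat "$pane"
	done
)
```

After this, the three files still differ in the first paragraph, while the
second paragraph reads "Paragraph 2." in all of them.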

> >>> Original-idea-by: Seth House <seth@eseth.com>
> >>> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
> >>> ---
> >>>    Documentation/config/mergetool.txt |  3 +++
> >>>    git-mergetool.sh                   | 17 +++++++++++++++++
> >>>    t/t7610-mergetool.sh               | 18 ++++++++++++++++++
> >>>    3 files changed, 38 insertions(+)
> >>>
> >>> diff --git a/Documentation/config/mergetool.txt b/Documentation/config/mergetool.txt
> >>> index 16a27443a3..7ce6d0d3ac 100644
> >>> --- a/Documentation/config/mergetool.txt
> >>> +++ b/Documentation/config/mergetool.txt
> >>> @@ -61,3 +61,6 @@ mergetool.writeToTemp::
> >>>    
> >>>    mergetool.prompt::
> >>>    	Prompt before each invocation of the merge resolution program.
> >>> +
> >>> +mergetool.autoMerge::
> >>> +	Remove lines without conflicts from all the files. Defaults to `true`.
> >>> diff --git a/git-mergetool.sh b/git-mergetool.sh
> >>> index e3f6d543fb..f4db0cac8d 100755
> >>> --- a/git-mergetool.sh
> >>> +++ b/git-mergetool.sh
> >>> @@ -239,6 +239,17 @@ checkout_staged_file () {
> >>>    	fi
> >>>    }
> >>>    
> >>> +auto_merge () {
> >>> +	git merge-file --diff3 --marker-size=7 -q -p "$LOCAL" "$BASE" "$REMOTE" >"$DIFF3"
> >>
> >> I've been wondering if we want to recreate the merge or just get the
> >> merged BASE LOCAL and REMOTE from the merged file in the working tree.
> >> If the user wants to resolve the conflicts in stages, or opens the file
> >> in an editor and fixes some conflicts and then realizes they want to use
> >> a merge tool that work is thrown away if we recreate the merge. They can
> >> always use `checkout --merge` to throw away their changes and start
> >> again with a mergetool. It would mean checking the size of the conflict
> >> markers and using
> >> '/^<{$conflict_marker_size}/,/^|{$conflict_marker_size}/' for sed.
> >> Getting the merged BASE would be tricky if the user does not have diff3
> >> conflicts enabled, I'm not sure if we can safely get BASE from `git
> >> merge-file ...` and LOCAL and REMOTE from the working tree.
> > 
> > That's a good point.
> > 
> > However, their work is not thrown away; MERGED is not touched by this.
> 
> I wasn't sure whether the tools would overwrite MERGED with a new file 
> or if they started with that and just edited it. If it is the latter 
> then I agree the user's changes are safe.

All mergetools are passed MERGED and are supposed to edit it.

I suppose some mergetools would want to recreate MERGED, but the user
may have already resolved some of the conflicts.
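
Phillip's alternative above, deriving the panes from the already conflicted
file rather than re-running the merge, depends on building the sed addresses
from the marker size. A minimal sketch of that idea follows. It assumes
diff3-style conflicts, a label after each marker, and LF line endings;
`local_side` and its argument are hypothetical names, not part of the patch.

```shell
# Hypothetical helper: read a diff3-style conflicted file on stdin and print
# it with every conflict region reduced to its LOCAL ("ours") side.
# $1 is the conflict marker size (7 unless configured otherwise).
local_side () {
	size=${1:-7}
	pad=$(printf "%${size}s" '')            # $size spaces
	open=$(printf '%s' "$pad" | tr ' ' '<')
	base=$(printf '%s' "$pad" | tr ' ' '|')
	close=$(printf '%s' "$pad" | tr ' ' '>')

	# Drop everything from the base marker to the closing marker,
	# then drop the opening marker line itself.
	sed -e "/^$base /,/^$close /d" -e "/^$open /d"
}

# Example with a marker size of 10; prints the lines a, x and b.
printf '%s\n' \
	'a' \
	'<<<<<<<<<< ours' \
	'x' \
	'|||||||||| base' \
	'y' \
	'==========' \
	'z' \
	'>>>>>>>>>> theirs' \
	'b' |
local_side 10
```

Without diff3-style markers there is no base section to anchor on, which is
the difficulty Phillip raises about recovering BASE.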

> > It's only for visualization purposes that some already-fixed conflicts
> > would be shown in the mergetool, which yeah; it's not ideal.
> > 
> > That's an improvement that can be done later, on top of this patch. The
> > bulk of improvements are already enabled by this, and the marginal
> > gains can be added later.
> 
> There's also the issue of what happens when the user has set a merge 
> driver for a file. If we use the file from the working tree we are using 
> the result of that driver, if we re-merge with `git merge-file` then the 
> files passed to the mergetool will not match the output of the merge 
> driver set for that file.

I don't know what that situation would look like, but presumably the
conflicts would be around the same areas anyway, no?

> >>> +	if test -s "$DIFF3"
> >>> +	then
> >>> +		sed -e '/^<<<<<<< /,/^||||||| /d' -e '/^=======\r\?$/,/^>>>>>>> /d' "$DIFF3" >"$BASE"
> >>> +		sed -e '/^||||||| /,/^>>>>>>> /d' -e '/^<<<<<<< /d' "$DIFF3" >"$LOCAL"
> >>> +		sed -e '/^<<<<<<< /,/^=======\r\?$/d' -e '/^>>>>>>> /d' "$DIFF3" >"$REMOTE"
> >>> +	fi
> >>> +	rm -- "$DIFF3"
> >>> +}
> >>> +
> >>>    merge_file () {
> >>>    	MERGED="$1"
> >>>    
> >>> @@ -274,6 +285,7 @@ merge_file () {
> >>>    		BASE=${BASE##*/}
> >>>    	fi
> >>>    
> >>> +	DIFF3="$MERGETOOL_TMPDIR/${BASE}_DIFF3_$$$ext"
> >>>    	BACKUP="$MERGETOOL_TMPDIR/${BASE}_BACKUP_$$$ext"
> >>>    	LOCAL="$MERGETOOL_TMPDIR/${BASE}_LOCAL_$$$ext"
> >>>    	REMOTE="$MERGETOOL_TMPDIR/${BASE}_REMOTE_$$$ext"
> >>> @@ -322,6 +334,11 @@ merge_file () {
> >>>    	checkout_staged_file 2 "$MERGED" "$LOCAL"
> >>>    	checkout_staged_file 3 "$MERGED" "$REMOTE"
> >>>    
> >>> +	if test "$(git config --bool mergetool.autoMerge)" != "false"
> >>
> >> If I run `git config --bool mergetool.autoMerge` it returns an empty
> >> string so I think you need to test it is actually equal to "true".
> > 
> > Yeah, this would evaluate to true:
> > 
> >    test "" != "false"
> > 
> > It's enabled by default since I heard Junio mention it would make sense.
> 
> I think it probably does make sense in which case it would be good to 
> make that explicit in the commit message. Maybe

Right, I thought I did.

> As most people will want the new behavior we enable it by default. 
> Users that do not want the new behavior can set mergetool.autoMerge to 
> false.

Sounds good.
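
To make the default-on behaviour concrete, here is a small sketch of the
comparison discussed above. `automerge_enabled` is a hypothetical helper that
takes the string printed by `git config --bool mergetool.autoMerge`; that
command prints nothing when the key is unset.

```shell
# Hypothetical helper: $1 is the output of
# `git config --bool mergetool.autoMerge` ("true", "false", or empty).
# Anything other than an explicit "false" counts as enabled.
automerge_enabled () {
	test "$1" != "false"
}

for value in "" true false
do
	if automerge_enabled "$value"
	then
		echo "'$value' -> enabled"
	else
		echo "'$value' -> disabled"
	fi
done
```

Testing for equality with "true" instead would turn the feature off for
everyone who has not set the option, which is the opposite default.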

> >> I also share the view that this should be per tool. Your demand that
> >> someone comes up with an example that breaks assumes that we have access
> >> to all the tools that users are using.
> > 
> > It's not a demand. It's a fact that unless we have an example (even if
> > hypothetical), the burden of proof has not been met.
> > 
> > The default position is that we don't know if such configuration would
> > make sense or not.
> > 
> >> Seth has done a great job of
> >> surveying the popular tools but given the size of git's user-base and
> >> the diversity of uses it is very likely that there will be people using
> >> in-house or proprietary tools that no one on the list has access to.
> > 
> > Yes, they can just turn off the flag.
> > 
> >> I would much prefer to avoid breaking them rather than waiting for a
> >> bug report before implementing a per-tool setting.
> > 
> > Even with a per-tool configuration they would be broken (until the user
> > configures otherwise).
> > 
> >> It is quite possible people are using different tools for different
> >> files in the same way as they use different merge drivers for
> >> different files and want the setting disabled for a tool that does
> >> semantic merging but enabled for textual merges.
> > 
> > I think your definition of what's possible and mine are very different.
> 
> All I'm saying is that if a user has different tools for different 
> file-types they may want this on for one tool but not another.

Possible yeah, I just don't find it very likely.

What different mergetools would you use for different file-types?

> > But this is actually what I was asking: an example. You are bringing a
> > hypothetical "semantic mergetool" that would somehow benefit from having
> > unconflicted lines. Can you explain how it would benefit?
> 
> Because the result of the merge depends on the diff and a semantic tool 
> (there was a talk about one for C# a few years ago at git merge I think) 
> will diff the file based on its semantics rather than matching lines.

But you would want this tool to run on every merge, regardless of whether
there are conflicts or not.

It's this pre-mergetool tool that would determine if there are
conflicts to be resolved by the user or not.
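
One existing mechanism that runs on every merge of a matching file is a
custom merge driver, which Phillip mentions earlier in the thread. The sketch
below shows how one could be wired up in a scratch repository; the
`semantic-merge` command and the `semantic` driver name are made up for
illustration and do not refer to a real tool.

```shell
# Sketch: `semantic-merge` is a made-up command name, not a real tool.
repo=$(mktemp -d)
git init -q "$repo"

# Register a merge driver. %O, %A and %B stand for the ancestor's, the
# current and the other version; the driver writes its result to %A and
# exits non-zero if conflicts remain.
git -C "$repo" config merge.semantic.name "hypothetical semantic merge"
git -C "$repo" config merge.semantic.driver "semantic-merge %O %A %B"

# Route C# files through that driver.
echo '*.cs merge=semantic' >"$repo/.gitattributes"

# Ask git which merge driver applies to a given path.
git -C "$repo" check-attr merge -- Program.cs
```

With such a driver in place, mergetool only comes into play for the files the
driver could not merge cleanly.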

> > Also, neither Seth nor Junio responded to my example, can you?
> > 
> > Do you agree there is no conflict here?
> > 
> >    echo Hello > BASE
> >    echo Hello > LOCAL
> >    echo Hello. > REMOTE
> >    git merge-file -p LOCAL BASE REMOTE
> 
> There is no conflict but I don't see what point you're making by that. 

If there's no conflict there's no opportunity to run "git mergetool".

If there's a conflict some lines below, that doesn't make the above
magically become a conflict. So why would the mergetool show it to the
user?
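
The quoted example can be checked directly, since `git merge-file` reports
the number of conflicts in its exit status. A short sketch, using a scratch
directory chosen only for illustration:

```shell
demo=$(mktemp -d)
echo Hello  >"$demo/BASE"
echo Hello  >"$demo/LOCAL"
echo Hello. >"$demo/REMOTE"

# -p prints the merge result instead of overwriting LOCAL.
merged=$(cd "$demo" && git merge-file -p LOCAL BASE REMOTE)
status=$?

printf 'result: %s\nconflicts: %s\n' "$merged" "$status"
```

The result is the single line "Hello." with zero conflicts, so git would
never have invoked a mergetool for this change alone.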

> I've been thinking about a different example
> 
> BASE    LOCAL   REMOTE
> A	A	A
> A	A	A
> A	A	A
> 	B	A
> 
> Is there a conflict or not? I think it depends on the diff algorithm. 
> These are both valid diffs of BASE and LOCAL but only the first one will 
> lead to conflicts
> 
>   A	+A
>   A	 A
>   A	 A
> +A	 A

Isn't there a B in LOCAL?

> If a tool implements a different diff algorithm to git then it may want 
> to do the whole merge itself.

Yes, in which case it would want to take ownership of the whole "are there
conflicts" decision, instead of letting git decide there are no
conflicts.

> I'm going to be off the list for the next couple of weeks

All right. Thanks for the input anyway.

Cheers.

-- 
Felipe Contreras


Thread overview: 6+ messages
2020-12-18 12:49 [PATCH v4 0/1] mergetool: add configuration to remove unconflicted lines Felipe Contreras
2020-12-18 12:49 ` [PATCH v4 1/1] mergetool: add automerge configuration Felipe Contreras
2020-12-19 11:14   ` Phillip Wood
2020-12-19 12:53     ` Felipe Contreras
2020-12-20 19:21       ` Phillip Wood
2020-12-21  3:04         ` Felipe Contreras [this message]
