git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* feature suggestion: optimize common parts for checkout --conflict=diff3
@ 2013-03-06 15:05 Uwe Kleine-König
  2013-03-06 18:27 ` Antoine Pelisse
  2013-03-06 19:26 ` Antoine Pelisse
  0 siblings, 2 replies; 38+ messages in thread
From: Uwe Kleine-König @ 2013-03-06 15:05 UTC (permalink / raw)
  To: git; +Cc: kernel

Hello,

Here comes another recipe for a different suggestion:

	git init
	echo 1 > file
	git add file
	git commit -m 'base'
	git branch branch
	seq 1 30 | grep -v 15 > file
	git commit -m 'add 2-30 without 15' file
	git checkout branch
	seq 1 30 | grep -v 16 > file
	git commit -m 'add 2-30 without 16' file
	git merge master
	git diff

This yields:

	diff --cc file
	index a07e697,5080129..0000000
	--- a/file
	+++ b/file
	@@@ -12,7 -12,7 +12,11 @@@
	  12
	  13
	  14
	++<<<<<<< HEAD
	 +15
	++=======
	+ 16
	++>>>>>>> master
	  17
	  18
	  19

as expected; nice and sweet. After

	git checkout --conflict=diff3 file

however the difference isn't that easy to spot any more. I expected

	diff --cc file
	index a07e697,5080129..0000000
	--- a/file
	+++ b/file
	@@@ -12,7 -12,7 +12,12 @@@
	  12
	  13
	  14
	++<<<<<<< ours
	 +15
	++||||||| base
	++=======
	+ 16
	++>>>>>>> theirs
	  17
	  18
	  19

But instead I get

	diff --cc file
	index a07e697,5080129..0000000
	--- a/file
	+++ b/file
	@@@ -1,29 -1,29 +1,61 @@@
	  1
	++<<<<<<< ours
	 +2
	 +3
	 +4
	 +5
	 +6
	 +7
	 +8
	 +9
	 +10
	 +11
	 +12
	 +13
	 +14
	 +15
	 +17
	 +18
	 +19
	 +20
	 +21
	 +22
	 +23
	 +24
	 +25
	 +26
	 +27
	 +28
	 +29
	 +30
	++||||||| base
	++=======
	+ 2
	+ 3
	+ 4
	+ 5
	+ 6
	+ 7
	+ 8
	+ 9
	+ 10
	+ 11
	+ 12
	+ 13
	+ 14
	+ 16
	+ 17
	+ 18
	+ 19
	+ 20
	+ 21
	+ 22
	+ 23
	+ 24
	+ 25
	+ 26
	+ 27
	+ 28
	+ 29
	+ 30
	++>>>>>>> theirs

Of course this is technically correct, just not maximally helpful.

Is this a missing optimisation for the diff3 case or did I miss a detail
that makes my expectation wrong?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-06 15:05 feature suggestion: optimize common parts for checkout --conflict=diff3 Uwe Kleine-König
@ 2013-03-06 18:27 ` Antoine Pelisse
  2013-03-06 19:26 ` Antoine Pelisse
  1 sibling, 0 replies; 38+ messages in thread
From: Antoine Pelisse @ 2013-03-06 18:27 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: git, kernel

>         git checkout --conflict=diff3 file

That's somehow unrelated, but shouldn't we have a "conflict" option to
git-merge as we have for git-checkout ?

With something like this (pasted into gmail):

diff --git a/builtin/merge.c b/builtin/merge.c
index 7c8922c..edad742 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -65,6 +65,7 @@ static int abort_current_merge;
 static int show_progress = -1;
 static int default_to_upstream;
 static const char *sign_commit;
+static char *conflict_style;

 static struct strategy all_strategy[] = {
  { "recursive",  DEFAULT_TWOHEAD | NO_TRIVIAL },
@@ -213,6 +214,7 @@ static struct option builtin_merge_options[] = {
  { OPTION_STRING, 'S', "gpg-sign", &sign_commit, N_("key id"),
   N_("GPG sign commit"), PARSE_OPT_OPTARG, NULL, (intptr_t) "" },
  OPT_BOOLEAN(0, "overwrite-ignore", &overwrite_ignore, N_("update
ignored files (default)")),
+ OPT_STRING(0, "conflict", &conflict_style, N_("style"), N_("conflict
style (merge or diff3)")),
  OPT_END()
 };

@@ -1102,6 +1104,9 @@ int cmd_merge(int argc, const char **argv, const
char *prefix)
  if (verbosity < 0 && show_progress == -1)
  show_progress = 0;

+ if (conflict_style)
+ git_xmerge_config("merge.conflictstyle", conflict_style, NULL);
+
  if (abort_current_merge) {
  int nargc = 2;
  const char *nargv[] = {"reset", "--merge", NULL};

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-06 15:05 feature suggestion: optimize common parts for checkout --conflict=diff3 Uwe Kleine-König
  2013-03-06 18:27 ` Antoine Pelisse
@ 2013-03-06 19:26 ` Antoine Pelisse
  2013-03-06 20:03   ` Jeff King
  1 sibling, 1 reply; 38+ messages in thread
From: Antoine Pelisse @ 2013-03-06 19:26 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: git, kernel

> however the difference isn't that easy to spot any more. I expected
>
>         diff --cc file
>         index a07e697,5080129..0000000
>         --- a/file
>         +++ b/file
>         @@@ -12,7 -12,7 +12,12 @@@
>           12
>           13
>           14
>         ++<<<<<<< ours
>          +15
>         ++||||||| base
>         ++=======
>         + 16
>         ++>>>>>>> theirs
>           17
>           18
>           19

This is not correct, it would mean that 12, 13, 14, 17, 18, 19 are in
base, while they are not.

Cheers,
Antoine

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-06 19:26 ` Antoine Pelisse
@ 2013-03-06 20:03   ` Jeff King
  2013-03-06 20:36     ` [PATCH] xdiff: implement a zealous diff3 Uwe Kleine-König
  2013-03-06 20:40     ` feature suggestion: optimize common parts for checkout --conflict=diff3 Junio C Hamano
  0 siblings, 2 replies; 38+ messages in thread
From: Jeff King @ 2013-03-06 20:03 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: Uwe Kleine-König, git, kernel

On Wed, Mar 06, 2013 at 08:26:57PM +0100, Antoine Pelisse wrote:

> > however the difference isn't that easy to spot any more. I expected
> >
> >         diff --cc file
> >         index a07e697,5080129..0000000
> >         --- a/file
> >         +++ b/file
> >         @@@ -12,7 -12,7 +12,12 @@@
> >           12
> >           13
> >           14
> >         ++<<<<<<< ours
> >          +15
> >         ++||||||| base
> >         ++=======
> >         + 16
> >         ++>>>>>>> theirs
> >           17
> >           18
> >           19
> 
> This is not correct, it would mean that 12, 13, 14, 17, 18, 19 are in
> base, while they are not.

Yeah, I agree it is a bit of a lie, as you are not seeing the full
picture of what was in the base. That is why we intentionally dial down
the conflict simplification level when using diff3. If you apply this
patch to git:

diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index 9e13b25..f381e0c 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -420,15 +420,6 @@ static int xdl_do_merge(xdfenv_t *xe1, xdchange_t *xscr1,
 	int style = xmp->style;
 	int favor = xmp->favor;
 
-	if (style == XDL_MERGE_DIFF3) {
-		/*
-		 * "diff3 -m" output does not make sense for anything
-		 * more aggressive than XDL_MERGE_EAGER.
-		 */
-		if (XDL_MERGE_EAGER < level)
-			level = XDL_MERGE_EAGER;
-	}
-
 	c = changes = NULL;
 
 	while (xscr1 && xscr2) {

then it will produce the output that Uwe expects. While it can be
misleading, I also think it can make some conflicts (like this one) much
easier to understand. I don't see any reason we can't have a "zealous
diff3" mode to let people view this output, as long as it is not the
default.

-Peff

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH] xdiff: implement a zealous diff3
  2013-03-06 20:03   ` Jeff King
@ 2013-03-06 20:36     ` Uwe Kleine-König
  2013-03-06 20:46       ` Jeff King
  2013-03-06 20:40     ` feature suggestion: optimize common parts for checkout --conflict=diff3 Junio C Hamano
  1 sibling, 1 reply; 38+ messages in thread
From: Uwe Kleine-König @ 2013-03-06 20:36 UTC (permalink / raw)
  To: git; +Cc: kernel, Antoine Pelisse, Jeff King

"zdiff3" is identical to ordinary diff3, only it allows more aggressive
compaction than diff3. This way the displayed base isn't necessary
technically correct, but still this mode might help resolving merge
conflicts between two near identical additions.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
Hello,

this patch implements what I want. Thanks to Jeff for pointing me to the
right code to modify.

Best regards
Uwe

 builtin/merge-file.c                   | 2 ++
 contrib/completion/git-completion.bash | 2 +-
 xdiff-interface.c                      | 2 ++
 xdiff/xdiff.h                          | 1 +
 xdiff/xmerge.c                         | 8 +++++++-
 5 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/builtin/merge-file.c b/builtin/merge-file.c
index c0570f2..4ef86aa 100644
--- a/builtin/merge-file.c
+++ b/builtin/merge-file.c
@@ -32,6 +32,8 @@ int cmd_merge_file(int argc, const char **argv, const char *prefix)
 	struct option options[] = {
 		OPT_BOOLEAN('p', "stdout", &to_stdout, N_("send results to standard output")),
 		OPT_SET_INT(0, "diff3", &xmp.style, N_("use a diff3 based merge"), XDL_MERGE_DIFF3),
+		OPT_SET_INT(0, "zdiff3", &xmp.style, N_("use a zealous diff3 based merge"),
+				XDL_MERGE_ZEALOUS_DIFF3),
 		OPT_SET_INT(0, "ours", &xmp.favor, N_("for conflicts, use our version"),
 			    XDL_MERGE_FAVOR_OURS),
 		OPT_SET_INT(0, "theirs", &xmp.favor, N_("for conflicts, use their version"),
diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index b62bec0..a0d887e 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1091,7 +1091,7 @@ _git_checkout ()
 
 	case "$cur" in
 	--conflict=*)
-		__gitcomp "diff3 merge" "" "${cur##--conflict=}"
+		__gitcomp "diff3 merge zdiff3" "" "${cur##--conflict=}"
 		;;
 	--*)
 		__gitcomp "
diff --git a/xdiff-interface.c b/xdiff-interface.c
index ecfa05f..a911c25 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -308,6 +308,8 @@ int git_xmerge_config(const char *var, const char *value, void *cb)
 			die("'%s' is not a boolean", var);
 		if (!strcmp(value, "diff3"))
 			git_xmerge_style = XDL_MERGE_DIFF3;
+		else if (!strcmp(value, "zdiff3"))
+			git_xmerge_style = XDL_MERGE_ZEALOUS_DIFF3;
 		else if (!strcmp(value, "merge"))
 			git_xmerge_style = 0;
 		else
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 219a3bb..9730c63 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -64,6 +64,7 @@ extern "C" {
 
 /* merge output styles */
 #define XDL_MERGE_DIFF3 1
+#define XDL_MERGE_ZEALOUS_DIFF3 2
 
 typedef struct s_mmfile {
 	char *ptr;
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index 9e13b25..4772707 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -177,7 +177,7 @@ static int fill_conflict_hunk(xdfenv_t *xe1, const char *name1,
 	size += xdl_recs_copy(xe1, m->i1, m->chg1, 1,
 			      dest ? dest + size : NULL);
 
-	if (style == XDL_MERGE_DIFF3) {
+	if (style == XDL_MERGE_DIFF3 || style == XDL_MERGE_ZEALOUS_DIFF3) {
 		/* Shared preimage */
 		if (!dest) {
 			size += marker_size + 1 + marker3_size;
@@ -420,6 +420,12 @@ static int xdl_do_merge(xdfenv_t *xe1, xdchange_t *xscr1,
 	int style = xmp->style;
 	int favor = xmp->favor;
 
+	/*
+	 * This is the only change between XDL_MERGE_DIFF3 and
+	 * XDL_MERGE_ZEALOUS_DIFF3. "zdiff3" isn't 100% technically correct (as
+	 * the base might be considerably simplified), but still it might help
+	 * interpreting conflicts between two big and near identical additions.
+	 */
 	if (style == XDL_MERGE_DIFF3) {
 		/*
 		 * "diff3 -m" output does not make sense for anything
-- 
1.8.2.rc2

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-06 20:03   ` Jeff King
  2013-03-06 20:36     ` [PATCH] xdiff: implement a zealous diff3 Uwe Kleine-König
@ 2013-03-06 20:40     ` Junio C Hamano
  2013-03-06 20:54       ` Jeff King
  1 sibling, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2013-03-06 20:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

Jeff King <peff@peff.net> writes:

> then it will produce the output that Uwe expects. While it can be
> misleading,...

Misleading is one thing but in this case isn't it outright wrong?

If you remove <<< ours ||| portion from the combined diff output,
I would expect that the hunk will apply to the base, but that is no
longer true, no?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2013-03-06 20:36     ` [PATCH] xdiff: implement a zealous diff3 Uwe Kleine-König
@ 2013-03-06 20:46       ` Jeff King
  0 siblings, 0 replies; 38+ messages in thread
From: Jeff King @ 2013-03-06 20:46 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: git, kernel, Antoine Pelisse

On Wed, Mar 06, 2013 at 09:36:42PM +0100, Uwe Kleine-König wrote:

> "zdiff3" is identical to ordinary diff3, only it allows more aggressive
> compaction than diff3. This way the displayed base isn't necessary
> technically correct, but still this mode might help resolving merge
> conflicts between two near identical additions.
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

I think the patch is correct, assuming this is the interface we want.

It would be more flexible instead to have:

  1. user can configure zealous-level of merge via command-line or
     config (they cannot control it at all right now)

  2. when diff3 is used and no level is explicitly given, do not go
     above EAGER

  3. otherwise, respect the level given by the user, even if it is
     ZEALOUS

But that would involve a lot of refactoring. I don't know if it is worth
the effort (the bonus is that people can then set the level
independently, but I do not know that anyone ever wants to do that).

>  builtin/merge-file.c                   | 2 ++
>  contrib/completion/git-completion.bash | 2 +-
>  xdiff-interface.c                      | 2 ++
>  xdiff/xdiff.h                          | 1 +
>  xdiff/xmerge.c                         | 8 +++++++-
>  5 files changed, 13 insertions(+), 2 deletions(-)

I think this would need documentation not only to let users know about
the feature, but also to warn them of the caveats.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-06 20:40     ` feature suggestion: optimize common parts for checkout --conflict=diff3 Junio C Hamano
@ 2013-03-06 20:54       ` Jeff King
  2013-03-06 21:09         ` Junio C Hamano
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2013-03-06 20:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

On Wed, Mar 06, 2013 at 12:40:48PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > then it will produce the output that Uwe expects. While it can be
> > misleading,...
> 
> Misleading is one thing but in this case isn't it outright wrong?
> 
> If you remove <<< ours ||| portion from the combined diff output,
> I would expect that the hunk will apply to the base, but that is no
> longer true, no?

It shifts the concept of what is the "base" and what is the "conflict".
In Uwe's example, no, it would not apply to the single-line file that is
the true 3-way base. But it would apply to the content that is outside
of the hunk marker; we have changed the concept of what is in the base
and what is in the conflict by shrinking the conflict to its smallest
size. The same is true of the conflict markers produced in the non-diff3
case. It is a property of XDL_MERGE_ZEALOUS, not of the conflict style.

If your argument is "diff3 means something different than regular
conflict markers; it should have the property of being
machine-convertible into a patch, but regular markers do not", I'm not
sure I agree. It may be used that way, but I think it is mostly used in
git to give the reader more context when making a resolution. And
anyway, I think the proposed change would not be to change diff3, but to
introduce a new diff3-like format that also shrinks the hunk size, so it
would not hurt existing users of diff3.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-06 20:54       ` Jeff King
@ 2013-03-06 21:09         ` Junio C Hamano
  2013-03-06 21:21           ` Jeff King
                             ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Junio C Hamano @ 2013-03-06 21:09 UTC (permalink / raw)
  To: Jeff King; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

Jeff King <peff@peff.net> writes:

> But it would apply to the content that is outside
> of the hunk marker; we have changed the concept of what is in the base
> and what is in the conflict by shrinking the conflict to its smallest
> size.

Hmm, unless you mean by "base" something entirely different from
"what was in the common ancestor version", I do not think I can
agree.  The point of diff3 mode is to show how it looked line in the
common ancestor and what the conflicting sides want to change that
common version into; letting the user view three versions to help
him decide what to do by only looking at the part inside conflict
markers.

We show "both sides added, either identically or differently" as
noteworthy events, but the patched code pushes "both sides added
identically" case outside the conflicting hunk, as if what was added
relative to the common ancestor version (in Uwe's case, is it 1-14
that is common, or just 10-14?) is not worth looking at when
considering what the right resolution is.  If it is not worth
looking at what was in the original for the conflicting part, why
would we be even using diff3 mode in the first place?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-06 21:09         ` Junio C Hamano
@ 2013-03-06 21:21           ` Jeff King
  2013-03-06 21:50             ` Junio C Hamano
  2013-03-06 21:31           ` Uwe Kleine-König
  2013-03-06 21:32           ` Junio C Hamano
  2 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2013-03-06 21:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

On Wed, Mar 06, 2013 at 01:09:41PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > But it would apply to the content that is outside
> > of the hunk marker; we have changed the concept of what is in the base
> > and what is in the conflict by shrinking the conflict to its smallest
> > size.
> 
> Hmm, unless you mean by "base" something entirely different from
> "what was in the common ancestor version", I do not think I can
> agree.

I don't know. I didn't use the word "base" in the first place. I was
trying to figure out what you meant. :)

My point is that the hunk (everything from "<<<" to ">>>") is
self-consistent. It's just misleading in that the hunk has been shrunk
not to include identical bits from each side. IMHO, this is not much
different than a nearby change being auto-resolved. The conflict hunks
the user sees do not represent the original files, but rather the
remains after a first pass at resolving.

> The point of diff3 mode is to show how it looked line in the
> common ancestor and what the conflicting sides want to change that
> common version into; letting the user view three versions to help
> him decide what to do by only looking at the part inside conflict
> markers.

Right, I agree.

> We show "both sides added, either identically or differently" as
> noteworthy events, but the patched code pushes "both sides added
> identically" case outside the conflicting hunk, as if what was added
> relative to the common ancestor version (in Uwe's case, is it 1-14
> that is common, or just 10-14?) is not worth looking at when
> considering what the right resolution is.  If it is not worth
> looking at what was in the original for the conflicting part, why
> would we be even using diff3 mode in the first place?

I think Uwe's example shows that it _is_ useful. Yes, you no longer have
the information about what happened through 1-14 (whether it was really
there in the ancestor file, or whether it was simply added identically).
But that information might or might not be relevant. In Uwe's example,
it is just noise that detracts from the interesting part of the change
(or does it? I think the answer is in the eye of the reader).  I think
it can be helpful to have both types available, and they can pick which
one they want; it's just another tool.

Another argument is that some people (including me) set
merge.conflictstyle to diff3, because they like seeing the extra context
when resolving (I find it helps a lot with rebasing, when it is
sometimes hard to remember which side is which in the merge). I'd
consider setting it to zdiff3 to get the benefits of XDL_MERGE_ZEALOUS,
and using "git checkout --conflict-style=diff3" if I need to get more
information about a specific case.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-06 21:09         ` Junio C Hamano
  2013-03-06 21:21           ` Jeff King
@ 2013-03-06 21:31           ` Uwe Kleine-König
  2013-03-06 21:32           ` Junio C Hamano
  2 siblings, 0 replies; 38+ messages in thread
From: Uwe Kleine-König @ 2013-03-06 21:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Antoine Pelisse, git, kernel

Hello Junio,

On Wed, Mar 06, 2013 at 01:09:41PM -0800, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
> > But it would apply to the content that is outside
> > of the hunk marker; we have changed the concept of what is in the base
> > and what is in the conflict by shrinking the conflict to its smallest
> > size.
> 
> Hmm, unless you mean by "base" something entirely different from
> "what was in the common ancestor version", I do not think I can
> agree.  The point of diff3 mode is to show how it looked line in the
> common ancestor and what the conflicting sides want to change that
> common version into; letting the user view three versions to help
> him decide what to do by only looking at the part inside conflict
> markers.
> 
> We show "both sides added, either identically or differently" as
> noteworthy events, but the patched code pushes "both sides added
> identically" case outside the conflicting hunk, as if what was added
I didn't test, but "both sides removed identically" should be moved out,
too, shouldn't it?

> relative to the common ancestor version (in Uwe's case, is it 1-14
> that is common, or just 10-14?) is not worth looking at when
> considering what the right resolution is.  If it is not worth
> looking at what was in the original for the conflicting part, why
> would we be even using diff3 mode in the first place?
because even zdiff3 contains more information than merge. And compared
to diff3 it's smaller sometimes and so easier to understand.

Other than that I agree fully to the things Jeff said so far.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-06 21:09         ` Junio C Hamano
  2013-03-06 21:21           ` Jeff King
  2013-03-06 21:31           ` Uwe Kleine-König
@ 2013-03-06 21:32           ` Junio C Hamano
  2013-03-07  8:04             ` Jeff King
  2 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2013-03-06 21:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
>
>> But it would apply to the content that is outside
>> of the hunk marker; we have changed the concept of what is in the base
>> and what is in the conflict by shrinking the conflict to its smallest
>> size.
>
> Hmm, unless you mean by "base" something entirely different from
> "what was in the common ancestor version", I do not think I can
> agree.  The point of diff3 mode is to show how it looked line in the
> common ancestor and what the conflicting sides want to change that
> common version into; letting the user view three versions to help
> him decide what to do by only looking at the part inside conflict
> markers.
>
> We show "both sides added, either identically or differently" as
> noteworthy events, but the patched code pushes "both sides added
> identically" case outside the conflicting hunk, as if what was added
> relative to the common ancestor version (in Uwe's case, is it 1-14
> that is common, or just 10-14?) is not worth looking at when
> considering what the right resolution is.  If it is not worth
> looking at what was in the original for the conflicting part, why
> would we be even using diff3 mode in the first place?

I vaguely recall we did this "clip to eager" as an explicit bugfix
at 83133740d9c8 (xmerge.c: "diff3 -m" style clips merge reduction
level to EAGER or less, 2008-08-29).  The list archive around that
time may give us more contexts.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-06 21:21           ` Jeff King
@ 2013-03-06 21:50             ` Junio C Hamano
  2013-03-07  1:02               ` Jeff King
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2013-03-06 21:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

Jeff King <peff@peff.net> writes:

> I think Uwe's example shows that it _is_ useful. Yes, you no longer have
> the information about what happened through 1-14 (whether it was really
> there in the ancestor file, or whether it was simply added identically).
> But that information might or might not be relevant.

I think it is more like "I added bread and my wife added bread to
our common shopping list" and our two-way "RCS merge" default is to
collapse that case to "one loaf of bread on the shopping list".  My
impression has always been that people who use "diff3" mode care
about this case and want to know that the original did not have
"bread" on the list in order to decide if one or two loaves of bread
should remain in the result.

> In Uwe's example,
> it is just noise that detracts from the interesting part of the change
> (or does it? I think the answer is in the eye of the reader).

In other words, you would use the "RCS merge" style because most of
the time you would resolve to "one loaf of bread" and the fact that
it was missing in the original is not needed to decide that.  So, it
feels strange to use "diff3" and still want to discard that
information---if it is not relevant, why are you using diff3 mode in
the first place?  That is the question that is still not answered.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-06 21:50             ` Junio C Hamano
@ 2013-03-07  1:02               ` Jeff King
  0 siblings, 0 replies; 38+ messages in thread
From: Jeff King @ 2013-03-07  1:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

On Wed, Mar 06, 2013 at 01:50:46PM -0800, Junio C Hamano wrote:

> I think it is more like "I added bread and my wife added bread to
> our common shopping list" and our two-way "RCS merge" default is to
> collapse that case to "one loaf of bread on the shopping list".  My
> impression has always been that people who use "diff3" mode care
> about this case and want to know that the original did not have
> "bread" on the list in order to decide if one or two loaves of bread
> should remain in the result.

I think that is only the case sometimes. It depends on what is in the
conflict, and what your data is. I think you are conflating two things,
though: zealousness of merge, and having the original content handy when
resolving. To me, diff3 is about the latter. It can also be a hint that
the user cares about the former, but not necessarily.

> > In Uwe's example,
> > it is just noise that detracts from the interesting part of the change
> > (or does it? I think the answer is in the eye of the reader).
> 
> In other words, you would use the "RCS merge" style because most of
> the time you would resolve to "one loaf of bread" and the fact that
> it was missing in the original is not needed to decide that.  So, it
> feels strange to use "diff3" and still want to discard that
> information---if it is not relevant, why are you using diff3 mode in
> the first place?  That is the question that is still not answered.

Because for the lines that _are_ changed, you may want to see what the
original looked like. Here's a more realistic example:

	git init repo
	cd repo

	# Some baseline C code.
	cat >foo.c <<\EOF
	int foo(int bar)
	{
	  return bar + 5;
	}
	EOF
	git add foo.c
	git commit -m base
	git tag base

	# Simulate a modification to the function.
	sed -i '2a\
	  if (bar < 3)\
	    bar *= 2;
	' foo.c
	git commit -am multiply
	git tag multiply

	# And another modification.
	sed -i 's/bar + 5/bar + 7/' foo.c
	git commit -am plus7

	# Now on a side branch...
	git checkout -b side base

	# let's cherry pick the first change. Obviously
	# we could just fast-forward in this toy example,
	# but let's try to simulate a real history.
	#
	# We insert a sleep so that the cherry-pick does not
	# accidentally end up with the exact same commit-id (again,
	# because this is a toy example).
	sleep 1
	git cherry-pick multiply

	# and now let's make a change that conflicts with later
	# changes on master
	sed -i 's/bar + 5/bar + 8/' foo.c
	git commit -am plus8

	# and now merge, getting a conflict
	git merge master

	# show the result with various marker styles
	for i in merge diff3 zdiff3; do
	  echo
	  echo "==> $i"
	  git.compile checkout --conflict=$i foo.c
	  cat foo.c
	done

which produces:

	==> merge
	int foo(int bar)
	{
	  if (bar < 3)
	    bar *= 2;
	<<<<<<< ours
	  return bar + 8;
	=======
	  return bar + 7;
	>>>>>>> theirs
	}

The ZEALOUS level has helpfully cut out the shared cherry-picked bits,
and let us focus on the real change.

	==> diff3
	int foo(int bar)
	{
	<<<<<<< ours
	  if (bar < 3)
	    bar *= 2;
	  return bar + 8;
	||||||| base
	  return bar + 5;
	=======
	  if (bar < 3)
	    bar *= 2;
	  return bar + 7;
	>>>>>>> theirs
	}

Here we get to see all of the change, but the interesting difference is
overwhelmed by the shared cherry-picked bits. It's only 2 lines here,
but of course it could be much larger in a real example, and the reader
is forced to manually verify that the early parts are byte-for-byte
identical.

	==> zdiff3
	int foo(int bar)
	{
	  if (bar < 3)
	    bar *= 2;
	<<<<<<< ours
	  return bar + 8;
	||||||| base
	  return bar + 5;
	=======
	  return bar + 7;
	>>>>>>> theirs
	}

Here we see the hunk cut-down again, removing the cherry-picked parts.
But the presence of the base is still interesting, because we see
something that was not in the "merge" marker: that we were originally
at "5", and moved to "7" on one side and "8" on the other.

I see conflicts like this when I rebase my topics forward; you may pick
up part of my series, or even make a tweak to a patch in the middle. I
prefer diff3 markers because they carry more information (and use them
automatically via merge.conflictstyle). But in some cases, the lack of
zealous reduction means that I end having to figure out whether and if
anything changed in the seemingly identical bits.  Sometimes it is
nothing, and sometimes you tweaked whitespace or fixed a typo, and it
takes a lot of manual looking to figure it out. I hadn't realized it was
related to the use of diff3 until the discussion today.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-06 21:32           ` Junio C Hamano
@ 2013-03-07  8:04             ` Jeff King
  2013-03-07 17:26               ` Junio C Hamano
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2013-03-07  8:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

On Wed, Mar 06, 2013 at 01:32:28PM -0800, Junio C Hamano wrote:

> > We show "both sides added, either identically or differently" as
> > noteworthy events, but the patched code pushes "both sides added
> > identically" case outside the conflicting hunk, as if what was added
> > relative to the common ancestor version (in Uwe's case, is it 1-14
> > that is common, or just 10-14?) is not worth looking at when
> > considering what the right resolution is.  If it is not worth
> > looking at what was in the original for the conflicting part, why
> > would we be even using diff3 mode in the first place?
> 
> I vaguely recall we did this "clip to eager" as an explicit bugfix
> at 83133740d9c8 (xmerge.c: "diff3 -m" style clips merge reduction
> level to EAGER or less, 2008-08-29).  The list archive around that
> time may give us more contexts.

Thanks for the pointer. The relevant threads are:

  http://article.gmane.org/gmane.comp.version-control.git/94228

and

  http://thread.gmane.org/gmane.comp.version-control.git/94339

There is not much discussion beyond what ended up in 8313374; both Linus
and Dscho question whether level and output format are orthogonal, but
seem to accept the explanation you give in the commit message.

Having read that commit and the surrounding thread, I think I stand by
my argument that "zdiff3" is a useful tool to have, as long as the user
understands what the hunks mean. It should never replace diff3, but I
think it makes sense as a separate format.

I was also curious whether it would the diff3/zealous combination would
trigger any weird corner cases. In particular, I wanted to know how the
example you gave in that commit of:

  postimage#1: 1234ABCDE789
                  |    /
                  |   /
  preimage:    123456789
                  |   \
                  |    \
  postimage#2: 1234AXCYE789

would react with diff3 (this is not the original example, but one with
an extra "C" in the middle of postimage#2, which could in theory be
presented as split hunks). However, it seems that we do not do such hunk
splitting at all, neither for diff3 nor for the "merge" representation.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-07  8:04             ` Jeff King
@ 2013-03-07 17:26               ` Junio C Hamano
  2013-03-07 18:01                 ` Jeff King
  2013-03-07 18:21                 ` Junio C Hamano
  0 siblings, 2 replies; 38+ messages in thread
From: Junio C Hamano @ 2013-03-07 17:26 UTC (permalink / raw)
  To: Jeff King; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

Jeff King <peff@peff.net> writes:

> I was also curious whether it would the diff3/zealous combination would
> trigger any weird corner cases. In particular, I wanted to know how the
> example you gave in that commit of:
>
>   postimage#1: 1234ABCDE789
>                   |    /
>                   |   /
>   preimage:    123456789
>                   |   \
>                   |    \
>   postimage#2: 1234AXCYE789
>
> would react with diff3 (this is not the original example, but one with
> an extra "C" in the middle of postimage#2, which could in theory be
> presented as split hunks). However, it seems that we do not do such hunk
> splitting at all, neither for diff3 nor for the "merge" representation.

Without thinking about it too deeply,...

I think the "RCS merge" _could_ show it as "1234A<B=X>C<D=Y>E789"
without losing any information (as it is already discarding what was
in the original in the part that is affected by the conflict,
i.e. "56 was there").

Let's think aloud how "diff3 -m" _should_ split this. The most
straight-forward representation would be "1234<ABCDE|56=AXCYE>789",
that is, where "56" was originally there, one side made it to
"ABCDE" and the other "AXCYE".

You could make it "1234<AB|5=AX><C|=C><DE|6=YE>789", and that is
technically correct (what there were in the shared original for the
conflicted part is 5 and then 6), but the representation pretends
that it knows more than there actually is information, which may be
somewhat misleading.  All these three are equally plausible split of
the original "56":

	1234<AB|=AX><C|=C><DE|56=YE>789
	1234<AB|5=AX><C|=C><DE|6=YE>789
	1234<AB|56=AX><C|=C><DE|=YE>789

and picking one over others would be a mere heuristic.  All three
are technically correct representations and it is just the matter of
which one is the easiest to understand.  So, this is the kind of
"misleading but not incorrect".

In all these cases, the middle part would look like this:

	<<<<<<< ours
        C
        ||||||| base
        =======
	C
        >>>>>>> theirs

in order to honor the explicit "I want to view all three versions to
examine the situation" aka "--conflict=diff3" option.  We cannot
reduce it to just "C".  That will make it "not just misleading but
is actively wrong".

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-07 17:26               ` Junio C Hamano
@ 2013-03-07 18:01                 ` Jeff King
  2013-03-07 18:40                   ` Junio C Hamano
  2013-03-07 18:21                 ` Junio C Hamano
  1 sibling, 1 reply; 38+ messages in thread
From: Jeff King @ 2013-03-07 18:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

On Thu, Mar 07, 2013 at 09:26:05AM -0800, Junio C Hamano wrote:

> Without thinking about it too deeply,...
> 
> I think the "RCS merge" _could_ show it as "1234A<B=X>C<D=Y>E789"
> without losing any information (as it is already discarding what was
> in the original in the part that is affected by the conflict,
> i.e. "56 was there").

Right, I think that is sane, though we do not do that at this point.

> Let's think aloud how "diff3 -m" _should_ split this. The most
> straight-forward representation would be "1234<ABCDE|56=AXCYE>789",
> that is, where "56" was originally there, one side made it to
> "ABCDE" and the other "AXCYE".

Yes, that is what diff3 would do now (because it does not do any hunk
refinement at all), and should continue doing.

> You could make it "1234<AB|5=AX><C|=C><DE|6=YE>789", and that is
> technically correct (what there were in the shared original for the
> conflicted part is 5 and then 6), but the representation pretends
> that it knows more than there actually is information, which may be
> somewhat misleading.  All these three are equally plausible split of
> the original "56":
> 
> 	1234<AB|=AX><C|=C><DE|56=YE>789
> 	1234<AB|5=AX><C|=C><DE|6=YE>789
> 	1234<AB|56=AX><C|=C><DE|=YE>789
> 
> and picking one over others would be a mere heuristic.  All three
> are technically correct representations and it is just the matter of
> which one is the easiest to understand.  So, this is the kind of
> "misleading but not incorrect".

Yes, I agree it is a heuristic about which part of a split hunk to place
deleted preimage lines in. Conceptually, I'm OK with that; the point of
zdiff3 is to try to make the conflict easier to read by eliminating
possibly uninteresting parts. It doesn't have to be right all the time;
it just has to be useful most of the time. But it's not clear how true
that would be in real life.

I think this is somewhat a moot point, though. We do not do this
splitting now. If we later learn to do it, there is nothing to say that
zdiff3 would have to adopt it also; it could stop at a lower
zealous-level than the regular merge markers. I think I'd want to
experiment with it and see some real-world examples before making a
decision on that.

> In all these cases, the middle part would look like this:
> 
> 	<<<<<<< ours
>         C
>         ||||||| base
>         =======
> 	C
>         >>>>>>> theirs
> 
> in order to honor the explicit "I want to view all three versions to
> examine the situation" aka "--conflict=diff3" option.  We cannot
> reduce it to just "C".  That will make it "not just misleading but
> is actively wrong".

I'm not sure I agree. In this output (which does the zealous
simplification, the splitting, and arbitrarily assigns deleted preimage
to the first of the split hunks):

  1234A<B|56=X>C<D|Y>E789

I do not see the promotion of C to "already resolved, you cannot tell if
it was really in the preimage or not" as any more or less misleading or
wrong than that of A or E.  It is no more misleading than what the
merge-marker case would do, which would be:

  1234A<B=X>C<D=Y>E789

The wrong thing to me is the arbitrary choice about how to distribute
the preimage lines. In this example, it is not a big deal for the
heuristic to be wrong; you can see both of the hunks. But if C is long,
and you do not even see D=Y while resolving B=X, seeing the preimage
there may become nonsensical.

But again, we don't do this splitting now. So I don't think it's
something that should make or break a decision to have zdiff3. Without
the splitting, I can see it being quite useful. I'm going to carry the
patch in my tree for a while and try using it in practice for a while.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-07 17:26               ` Junio C Hamano
  2013-03-07 18:01                 ` Jeff King
@ 2013-03-07 18:21                 ` Junio C Hamano
  1 sibling, 0 replies; 38+ messages in thread
From: Junio C Hamano @ 2013-03-07 18:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

Junio C Hamano <gitster@pobox.com> writes:

> You could make it "1234<AB|5=AX><C|=C><DE|6=YE>789", and that is
> technically correct (what there were in the shared original for the
> conflicted part is 5 and then 6), but the representation pretends
> that it knows more than there actually is information, which may be
> somewhat misleading.  All these three are equally plausible split of
> the original "56":
>
> 	1234<AB|=AX><C|=C><DE|56=YE>789
> 	1234<AB|5=AX><C|=C><DE|6=YE>789
> 	1234<AB|56=AX><C|=C><DE|=YE>789
>
> and picking one over others would be a mere heuristic.  All three
> are technically correct representations and it is just the matter of
> which one is the easiest to understand.  So, this is the kind of
> "misleading but not incorrect".

I forgot to say that youu could even do something silly like:

	1234<AB|=AX><C|56=C><DE|=YE>789

;-)

> In all these cases, the middle part would look like this:
>
>       <<<<<<< ours
>       C
>       ||||||| base
>       =======
>       C
>       >>>>>>> theirs
>
> in order to honor the explicit "I want to view all three versions to
> examine the situation" aka "--conflict=diff3" option.  We cannot
> reduce it to just "C".  That will make it "not just misleading but
> is actively wrong".

I also forgot to say that the issue is the same to reduce

 	1234<AB|=AX><C|=C><DE|56=YE>789

to

 	1234<A|=A><B|=X><C|=C><D|56=Y><E|=E>789

which is unconditionally correct and then for all x reduce <x|=x> to
x, yielding

 	1234A<B|=X>C<D|56=Y>E789

which your zealous-diff3 would do.  So squashing that <C|=C> in the
middle would be consistent if you take the zealous-diff3 route.

But again, that is discarding the information of the original, which
the user explicitly asked from "diff3 -m", i.e. show all three to
examine the situation. If the user wants to operate _without_ the
original, the user would have asked for "RCS merge" style output, so
I am still not sure if that is a sensible mode of operation for diff3
to begin with.



	

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-07 18:01                 ` Jeff King
@ 2013-03-07 18:40                   ` Junio C Hamano
  2013-03-07 18:50                     ` Jeff King
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2013-03-07 18:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

Jeff King <peff@peff.net> writes:

> I'm not sure I agree. In this output (which does the zealous
> simplification, the splitting, and arbitrarily assigns deleted preimage
> to the first of the split hunks):
>
>   1234A<B|56=X>C<D|Y>E789
>
> I do not see the promotion of C to "already resolved, you cannot tell if
> it was really in the preimage or not" as any more or less misleading or
> wrong than that of A or E.  It is no more misleading than what the
> merge-marker case would do, which would be:
>
>   1234A<B=X>C<D=Y>E789

That is exactly my point and I think we are in complete agreement.
While the intended difference between RCS merge and diff3 -m is for
the latter not to lose information on the original, zealous-diff3
chooses to lose information in "both sides added, identically" case.

Where we differ is if such information loss is a good thing to have.

We could say "both sides added, identically" is auto-resolved when
you use the zealous option, and do so regardless of how the merge
conflicts are presented.  Then it becomes perfectly fine to eject
"A" and "E" out of the conflicted block and merge them to be part of
pre/post contexts.  The same goes for reducing "<C|=C>" to "C".  As
long as we clearly present the users what the option does and what
its implications are, it is not bad to have such an option, I think.

> The wrong thing to me is the arbitrary choice about how to distribute
> the preimage lines.

Yeah, but that is not "diff3 -m" vs "zealous-diff3" issue, is it?
If you value the original and want to show it somewhere, you cannot
avoid making the choice whether you are zealous or not if you split
such a hunk.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-07 18:40                   ` Junio C Hamano
@ 2013-03-07 18:50                     ` Jeff King
  2013-04-04 20:33                       ` Jeff King
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2013-03-07 18:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

On Thu, Mar 07, 2013 at 10:40:46AM -0800, Junio C Hamano wrote:

> Where we differ is if such information loss is a good thing to have.
>
> We could say "both sides added, identically" is auto-resolved when
> you use the zealous option, and do so regardless of how the merge
> conflicts are presented.  Then it becomes perfectly fine to eject
> "A" and "E" out of the conflicted block and merge them to be part of
> pre/post contexts.  The same goes for reducing "<C|=C>" to "C".  As
> long as we clearly present the users what the option does and what
> its implications are, it is not bad to have such an option, I think.

Exactly. I do think it has real-world uses (see the example script I
posted yesterday), but it would never replace diff3. I'm going to try it
out for a bit. As I mentioned yesterday, I see those sorts of
cherry-pick-with-something-on-top conflicts when I am rebasing onto or
merging my topics into what you have picked up from the same topic on
the list.

I think the code in Uwe's patch looked fine, but it definitely needs a
documentation change to explain the new mode and its caveats. I'd also
be happy with a different name, if you think it implies that it is too
related to zdiff3, but I cannot think of anything better at the moment.

> > The wrong thing to me is the arbitrary choice about how to distribute
> > the preimage lines.
> 
> Yeah, but that is not "diff3 -m" vs "zealous-diff3" issue, is it?
> If you value the original and want to show it somewhere, you cannot
> avoid making the choice whether you are zealous or not if you split
> such a hunk.

Right, but I meant that we would never split a hunk like that with
diff3, because we would not do any hunk refinement at all.  Splitting a
hunk with "merge" is OK, because the "where does the preimage go"
problem does not exist there. zdiff3 is the only problematic case,
because it would be the only one that (potentially) splits and cares
about how the preimage maps to each hunk. But we can deal with that if
and when we ever do such splitting.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-03-07 18:50                     ` Jeff King
@ 2013-04-04 20:33                       ` Jeff King
  2013-04-04 20:49                         ` Uwe Kleine-König
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2013-04-04 20:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Antoine Pelisse, Uwe Kleine-König, git, kernel

On Thu, Mar 07, 2013 at 01:50:46PM -0500, Jeff King wrote:

> On Thu, Mar 07, 2013 at 10:40:46AM -0800, Junio C Hamano wrote:
> 
> > Where we differ is if such information loss is a good thing to have.
> >
> > We could say "both sides added, identically" is auto-resolved when
> > you use the zealous option, and do so regardless of how the merge
> > conflicts are presented.  Then it becomes perfectly fine to eject
> > "A" and "E" out of the conflicted block and merge them to be part of
> > pre/post contexts.  The same goes for reducing "<C|=C>" to "C".  As
> > long as we clearly present the users what the option does and what
> > its implications are, it is not bad to have such an option, I think.
> 
> Exactly. I do think it has real-world uses (see the example script I
> posted yesterday), but it would never replace diff3. I'm going to try it
> out for a bit. As I mentioned yesterday, I see those sorts of
> cherry-pick-with-something-on-top conflicts when I am rebasing onto or
> merging my topics into what you have picked up from the same topic on
> the list.

I wanted to give an update on how this has been going. I've been running
with zdiff3 for almost a month. I keep my merge.conflictstyle set to
diff3, and when I see something that I think might benefit from the
"both sides added" zealousness, I do a "git checkout --conflict=zdiff3"
and examine the result.

I have seen it help, and always when rebasing patches that were accepted
upstream. For example, imagine I added a big block of text in one patch
(e.g., an entire test script). Then I added more tests in a follow-on
patch. Or I change some of the lines from expect_failure to
expect_success. You can see this in t1060 of the
jk/check-corrupt-objects-carefully topic (I didn't try, but you could
probably reproduce by just rebasing it on top of the current master).

When I rebase my version of the patches on your master with the new
content, the conflict for the first patch is useless in diff3. I see
that the base had nothing, upstream added a hundred lines, and my patch
added ninety lines. But it's hard to see which lines are missing or
modified because of the size of the conflict. It looks like:

       <<<<<<< ours
       #!/bin/sh
       test_description=whatever
       ...
          end of some test
       '
       test_done
       ||||||| base
       =======
       #!/bin/sh
       test_description=whatever
       ...
          end of another test
       '
       test_done
       >>>>>>> theirs

The interesting part is in the "...", which contains different lines in
each version, but it may be hundreds of lines long. Using zdiff3, I get:

       #!/bin/sh
       test_description=whatever
       ...
       <<<<<<< ours
       test_expect_success 'some_new_test' '
       ...
       ||||||| base
       =======
       >>>>>>> theirs
       '
       test_done

I can see that nothing was tweaked; I just didn't add any content there,
and upstream did. Contrast this with zealous "merge" conflicts, which
would look like:

      #!/bin/sh
      test_description=whatever
      ...
      <<<<<<< ours
      test_expect_success 'some_new_test' '
      ...
      =======
      >>>>>>> theirs
      '
      test_done

which similarly condenses, but is missing a piece of information: that
there was nothing in the base. I don't know whether the conflict is
there because my patch removed some content that got changed upstream,
or whether upstream added some content that I did not have in my patch.

So I think it is useful when rebasing on top of what upstream took,
specifically when:

  1. You have a series that updates the same hunk repeatedly (because
     from your perspective, you see only the tip of what upstream took).

  2. Upstream takes your patch but tweaks it (either as a fixup, to deal
     with a merge conflict, or whatever). You get to see the minimal
     tweak, not the fact that you have a giant hunk which differs from
     the upstream only by a few characters or a few lines.

So I do think zdiff3 is useful, and I plan to continue using it.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-04-04 20:33                       ` Jeff King
@ 2013-04-04 20:49                         ` Uwe Kleine-König
  2013-04-04 20:54                           ` Jeff King
  0 siblings, 1 reply; 38+ messages in thread
From: Uwe Kleine-König @ 2013-04-04 20:49 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Antoine Pelisse, git, kernel

Hi Jeff,

On Thu, Apr 04, 2013 at 04:33:44PM -0400, Jeff King wrote:
> [...]
> So I do think zdiff3 is useful, and I plan to continue using it.
Thanks for your description. I'm using and liking zdiff3, too. So I'd
really like seeing it in vanilla git.

Thanks
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-04-04 20:49                         ` Uwe Kleine-König
@ 2013-04-04 20:54                           ` Jeff King
  2013-04-04 21:19                             ` Junio C Hamano
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2013-04-04 20:54 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: Junio C Hamano, Antoine Pelisse, git, kernel

On Thu, Apr 04, 2013 at 10:49:44PM +0200, Uwe Kleine-König wrote:

> On Thu, Apr 04, 2013 at 04:33:44PM -0400, Jeff King wrote:
> > [...]
> > So I do think zdiff3 is useful, and I plan to continue using it.
> Thanks for your description. I'm using and liking zdiff3, too. So I'd
> really like seeing it in vanilla git.

I don't know if Junio is interested in taking a patch with the concept
or not, but as I recall, your patch needed at least a documentation
update before it could be picked up. Unless Junio wants to say "no, I am
not interested at all", I think the next step would be to repost an
updated version.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: feature suggestion: optimize common parts for checkout --conflict=diff3
  2013-04-04 20:54                           ` Jeff King
@ 2013-04-04 21:19                             ` Junio C Hamano
  0 siblings, 0 replies; 38+ messages in thread
From: Junio C Hamano @ 2013-04-04 21:19 UTC (permalink / raw)
  To: Jeff King; +Cc: Uwe Kleine-König, Antoine Pelisse, git, kernel

Jeff King <peff@peff.net> writes:

> On Thu, Apr 04, 2013 at 10:49:44PM +0200, Uwe Kleine-König wrote:
>
>> On Thu, Apr 04, 2013 at 04:33:44PM -0400, Jeff King wrote:
>> > [...]
>> > So I do think zdiff3 is useful, and I plan to continue using it.
>> Thanks for your description. I'm using and liking zdiff3, too. So I'd
>> really like seeing it in vanilla git.
>
> I don't know if Junio is interested in taking a patch with the concept
> or not,...

In two messages upthread before you restarted this discussion, I
said:

    As long as we clearly present the users what the option does and
    what its implications are, it is not bad to have such an option,
    I think.

> but as I recall, your patch needed at least a documentation update
> before it could be picked up. Unless Junio wants to say "no, I am
> not interested at all", I think the next step would be to repost
> an updated version.

Yup.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH] xdiff: implement a zealous diff3
@ 2021-06-13 14:31 Felipe Contreras
  2021-06-13 15:42 ` Jeff King
  0 siblings, 1 reply; 38+ messages in thread
From: Felipe Contreras @ 2021-06-13 14:31 UTC (permalink / raw)
  To: git
  Cc: Elijah Newren, Johannes Sixt, Sergey Organov, Jeff King,
	Junio C Hamano, Uwe Kleine-König

From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

"zdiff3" is identical to ordinary diff3, only it allows more aggressive
compaction than diff3. This way the displayed base isn't necessary
technically correct, but still this mode might help resolving merge
conflicts between two near identical additions.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---

I'm re-sending this patch from 2013 because I do think it provides value
and we might want to make it the default.

I hardcoded diff3 to be zdiff3 and all the tests passed, so if our tests
don't care about the simplification level of diff3 perhaps many (or even
most) our users don't either.

FTR: this patch applied cleanly.

 builtin/merge-file.c                   | 2 ++
 contrib/completion/git-completion.bash | 2 +-
 xdiff-interface.c                      | 2 ++
 xdiff/xdiff.h                          | 1 +
 xdiff/xmerge.c                         | 8 +++++++-
 5 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/builtin/merge-file.c b/builtin/merge-file.c
index 06a2f90c48..e695867ee5 100644
--- a/builtin/merge-file.c
+++ b/builtin/merge-file.c
@@ -34,6 +34,8 @@ int cmd_merge_file(int argc, const char **argv, const char *prefix)
 	struct option options[] = {
 		OPT_BOOL('p', "stdout", &to_stdout, N_("send results to standard output")),
 		OPT_SET_INT(0, "diff3", &xmp.style, N_("use a diff3 based merge"), XDL_MERGE_DIFF3),
+		OPT_SET_INT(0, "zdiff3", &xmp.style, N_("use a zealous diff3 based merge"),
+				XDL_MERGE_ZEALOUS_DIFF3),
 		OPT_SET_INT(0, "ours", &xmp.favor, N_("for conflicts, use our version"),
 			    XDL_MERGE_FAVOR_OURS),
 		OPT_SET_INT(0, "theirs", &xmp.favor, N_("for conflicts, use their version"),
diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index b50c5d0ea3..8594559298 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1566,7 +1566,7 @@ _git_checkout ()
 
 	case "$cur" in
 	--conflict=*)
-		__gitcomp "diff3 merge" "" "${cur##--conflict=}"
+		__gitcomp "diff3 merge zdiff3" "" "${cur##--conflict=}"
 		;;
 	--*)
 		__gitcomp_builtin checkout
diff --git a/xdiff-interface.c b/xdiff-interface.c
index 609615db2c..9977813a9d 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -308,6 +308,8 @@ int git_xmerge_config(const char *var, const char *value, void *cb)
 			die("'%s' is not a boolean", var);
 		if (!strcmp(value, "diff3"))
 			git_xmerge_style = XDL_MERGE_DIFF3;
+		else if (!strcmp(value, "zdiff3"))
+			git_xmerge_style = XDL_MERGE_ZEALOUS_DIFF3;
 		else if (!strcmp(value, "merge"))
 			git_xmerge_style = 0;
 		/*
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 7a04605146..8629ae287c 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -65,6 +65,7 @@ extern "C" {
 
 /* merge output styles */
 #define XDL_MERGE_DIFF3 1
+#define XDL_MERGE_ZEALOUS_DIFF3 2
 
 typedef struct s_mmfile {
 	char *ptr;
diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c
index 1659edb453..95871a0b6e 100644
--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -230,7 +230,7 @@ static int fill_conflict_hunk(xdfenv_t *xe1, const char *name1,
 	size += xdl_recs_copy(xe1, m->i1, m->chg1, needs_cr, 1,
 			      dest ? dest + size : NULL);
 
-	if (style == XDL_MERGE_DIFF3) {
+	if (style == XDL_MERGE_DIFF3 || style == XDL_MERGE_ZEALOUS_DIFF3) {
 		/* Shared preimage */
 		if (!dest) {
 			size += marker_size + 1 + needs_cr + marker3_size;
@@ -482,6 +482,12 @@ static int xdl_do_merge(xdfenv_t *xe1, xdchange_t *xscr1,
 	int style = xmp->style;
 	int favor = xmp->favor;
 
+	/*
+	 * This is the only change between XDL_MERGE_DIFF3 and
+	 * XDL_MERGE_ZEALOUS_DIFF3. "zdiff3" isn't 100% technically correct (as
+	 * the base might be considerably simplified), but still it might help
+	 * interpreting conflicts between two big and near identical additions.
+	 */
 	if (style == XDL_MERGE_DIFF3) {
 		/*
 		 * "diff3 -m" output does not make sense for anything
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-13 14:31 [PATCH] xdiff: implement a zealous diff3 Felipe Contreras
@ 2021-06-13 15:42 ` Jeff King
  2021-06-13 18:00   ` Felipe Contreras
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2021-06-13 15:42 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: git, Elijah Newren, Johannes Sixt, Sergey Organov, Junio C Hamano,
	Uwe Kleine-König

On Sun, Jun 13, 2021 at 09:31:55AM -0500, Felipe Contreras wrote:

> From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> 
> "zdiff3" is identical to ordinary diff3, only it allows more aggressive
> compaction than diff3. This way the displayed base isn't necessary
> technically correct, but still this mode might help resolving merge
> conflicts between two near identical additions.
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> ---
> 
> I'm re-sending this patch from 2013 because I do think it provides value
> and we might want to make it the default.

I take it you didn't investigate the segfault I mentioned.

Try this:

   commit=a5170794372cf1325710a3419473c91ec4af53bf
   for style in merge diff3 zdiff3; do
     git reset --hard
     git checkout $commit^1
     git -c merge.conflictstyle=$style merge $commit^2
   done

The first two are fine; the zdiff3 one segfaults within the xmerge.c
code.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-13 15:42 ` Jeff King
@ 2021-06-13 18:00   ` Felipe Contreras
  2021-06-13 21:24     ` Felipe Contreras
  2021-06-14  4:44     ` Jeff King
  0 siblings, 2 replies; 38+ messages in thread
From: Felipe Contreras @ 2021-06-13 18:00 UTC (permalink / raw)
  To: Jeff King, Felipe Contreras
  Cc: git, Elijah Newren, Johannes Sixt, Sergey Organov, Junio C Hamano,
	Uwe Kleine-König

Jeff King wrote:
> On Sun, Jun 13, 2021 at 09:31:55AM -0500, Felipe Contreras wrote:
> 
> > From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > 
> > "zdiff3" is identical to ordinary diff3, only it allows more aggressive
> > compaction than diff3. This way the displayed base isn't necessary
> > technically correct, but still this mode might help resolving merge
> > conflicts between two near identical additions.
> > 
> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > ---
> > 
> > I'm re-sending this patch from 2013 because I do think it provides value
> > and we might want to make it the default.
> 
> I take it you didn't investigate the segfault I mentioned.

I don't know how I was supposed to investigate the few segfaults you
mentioned. All you said is that you never tracked the bug.

> Try this:
> 
>    commit=a5170794372cf1325710a3419473c91ec4af53bf
>    for style in merge diff3 zdiff3; do
>      git reset --hard
>      git checkout $commit^1
>      git -c merge.conflictstyle=$style merge $commit^2
>    done
> 
> The first two are fine; the zdiff3 one segfaults within the xmerge.c
> code.

I can reproduct the segfault, and here is a simpler way to reproduce it:

(I have a hacked version of diff3 until merge-file learns how to use
merge.conflictstyle)

  cat >b <<EOF
  A
  EOF

  cat >l <<EOF
  A

  B
  C
  D
  E
  F
  GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
  H
  I
  EOF

  cat >r <<EOF
  A

  b
  C
  D
  E
  F
  GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
  H
  i
  EOF

  $git merge-file --diff3 -p l b r

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-13 18:00   ` Felipe Contreras
@ 2021-06-13 21:24     ` Felipe Contreras
  2021-06-15  2:07       ` Junio C Hamano
  2021-06-14  4:44     ` Jeff King
  1 sibling, 1 reply; 38+ messages in thread
From: Felipe Contreras @ 2021-06-13 21:24 UTC (permalink / raw)
  To: Felipe Contreras, Jeff King, Felipe Contreras
  Cc: git, Elijah Newren, Johannes Sixt, Sergey Organov, Junio C Hamano,
	Uwe Kleine-König

Felipe Contreras wrote:
> Jeff King wrote:
> > Try this:
> > 
> >    commit=a5170794372cf1325710a3419473c91ec4af53bf
> >    for style in merge diff3 zdiff3; do
> >      git reset --hard
> >      git checkout $commit^1
> >      git -c merge.conflictstyle=$style merge $commit^2
> >    done
> > 
> > The first two are fine; the zdiff3 one segfaults within the xmerge.c
> > code.
> 
> I can reproduct the segfault, and here is a simpler way to reproduce it:

I found the problem, m->chg0 was not initialized in xdl_refine_conflicts.

I'm not familiar with the area so I don't know if the following makes
sense, but it fixes the crash:

--- a/xdiff/xmerge.c
+++ b/xdiff/xmerge.c
@@ -333,7 +333,7 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
                mmfile_t t1, t2;
                xdfenv_t xe;
                xdchange_t *xscr, *x;
-               int i1 = m->i1, i2 = m->i2;
+               int i0 = m->i0, i1 = m->i1, i2 = m->i2;
 
                /* let's handle just the conflicts */
                if (m->mode)
@@ -384,6 +384,8 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
                        m->next = m2;
                        m = m2;
                        m->mode = 0;
+                       m->i0 = i0;
+                       m->chg0 = 0;
                        m->i1 = xscr->i1 + i1;
                        m->chg1 = xscr->chg1;
                        m->i2 = xscr->i2 + i2;


-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-13 18:00   ` Felipe Contreras
  2021-06-13 21:24     ` Felipe Contreras
@ 2021-06-14  4:44     ` Jeff King
  2021-06-15  4:19       ` Felipe Contreras
  1 sibling, 1 reply; 38+ messages in thread
From: Jeff King @ 2021-06-14  4:44 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: git, Elijah Newren, Johannes Sixt, Sergey Organov, Junio C Hamano,
	Uwe Kleine-König

On Sun, Jun 13, 2021 at 01:00:33PM -0500, Felipe Contreras wrote:

> > > I'm re-sending this patch from 2013 because I do think it provides value
> > > and we might want to make it the default.
> > 
> > I take it you didn't investigate the segfault I mentioned.
> 
> I don't know how I was supposed to investigate the few segfaults you
> mentioned. All you said is that you never tracked the bug.

My point is that if you are going to repost a patch that has known
problems, it is worth saying so to give reviewers and the maintainer a
realistic idea of how stable it is.

I also didn't have a reproduction recipe. I found the commit I sent by
just re-running every merge in git.git in a loop.

It sounds like you have a smaller reproduction and maybe a fix, which is
good.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-13 21:24     ` Felipe Contreras
@ 2021-06-15  2:07       ` Junio C Hamano
  2021-06-15  3:43         ` Elijah Newren
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2021-06-15  2:07 UTC (permalink / raw)
  To: Felipe Contreras, Johannes Schindelin
  Cc: Jeff King, git, Elijah Newren, Johannes Sixt, Sergey Organov,
	Uwe Kleine-König

Felipe Contreras <felipe.contreras@gmail.com> writes:

> I found the problem, m->chg0 was not initialized in xdl_refine_conflicts.
>
> I'm not familiar with the area so I don't know if the following makes
> sense, but it fixes the crash:

Unlike the remainder of the xdiff/ directory, xdiff/xmerge.c was
Dscho's brainchild if I am not mistaken, so I'm CCing him for
input.



> --- a/xdiff/xmerge.c
> +++ b/xdiff/xmerge.c
> @@ -333,7 +333,7 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
>                 mmfile_t t1, t2;
>                 xdfenv_t xe;
>                 xdchange_t *xscr, *x;
> -               int i1 = m->i1, i2 = m->i2;
> +               int i0 = m->i0, i1 = m->i1, i2 = m->i2;
>  
>                 /* let's handle just the conflicts */
>                 if (m->mode)
> @@ -384,6 +384,8 @@ static int xdl_refine_conflicts(xdfenv_t *xe1, xdfenv_t *xe2, xdmerge_t *m,
>                         m->next = m2;
>                         m = m2;
>                         m->mode = 0;
> +                       m->i0 = i0;
> +                       m->chg0 = 0;
>                         m->i1 = xscr->i1 + i1;
>                         m->chg1 = xscr->chg1;
>                         m->i2 = xscr->i2 + i2;

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-15  2:07       ` Junio C Hamano
@ 2021-06-15  3:43         ` Elijah Newren
  2021-06-15  4:03           ` Junio C Hamano
  2021-06-15  9:16           ` Felipe Contreras
  0 siblings, 2 replies; 38+ messages in thread
From: Elijah Newren @ 2021-06-15  3:43 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Felipe Contreras, Johannes Schindelin, Jeff King,
	Git Mailing List, Johannes Sixt, Sergey Organov,
	Uwe Kleine-König

On Mon, Jun 14, 2021 at 7:07 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Felipe Contreras <felipe.contreras@gmail.com> writes:
>
> > I found the problem, m->chg0 was not initialized in xdl_refine_conflicts.
> >
> > I'm not familiar with the area so I don't know if the following makes
> > sense, but it fixes the crash:
>
> Unlike the remainder of the xdiff/ directory, xdiff/xmerge.c was
> Dscho's brainchild if I am not mistaken, so I'm CCing him for
> input.

This is going to sound harsh, but people shouldn't waste (any more)
time reviewing the patches in this thread or the "merge: cleanups and
fix" series submitted elsewhere.  They should all just be rejected.

I do not think it is reasonable to expect reviewers to spend time
responding to re-posted patches when:
  * no attempt was made to make sure they were up-to-date with current
code beyond compiling (see below)
  * no attempt was made to address missing items pointed out in
response to the original submission[1]
  * no attempt was made to handle or even test particular cases
pointed out in response to the original submission (see [1] and below)
  * the patches were posted despite knowing they caused segfaults, and
without even stating as much![2]
  * the segfault "fixes" are submitted as a separate series from the
patch introducing the segfault[3], raising the risk that one gets
picked up without the other.

In my opinion, these submissions were egregiously cavalier.  I'll
submit a patch (or perhaps a few) soon that has a functioning zdiff3.

However, since I've already put in the time to understand it, let me
explain what is wrong with this patch.  This particular change is in
the area of the code that splits conflict regions when there are
portions of the sides (not the base) that match.  Doing such splitting
makes sense with "merge" conflictStyle since the base is never shown;
this splitting can allow pulling the common lines out of the conflict
region.  However, with diff3 or zdiff3, the original text does not
match the sides and by splitting the conflict region, we are forced to
decide how or where to split the original text among the various
conflict (and non-conflict?) regions.  This is pretty haphazard, and
the effect of this patch is to assign all of the original text to the
first conflict region in the split, and make all other regions have
empty base text.

This exact scenario was discussed by you and Peff back when zdiff3 was
originally introduced in the thread where Felipe got the patch that he
started this thread with.  In that thread, Peff explained how zdiff3
should only try to move common lines at the beginning or end of the
conflict hunk outside the conflict region, without doing any splitting
of the conflict region (this particular issue took about 1/3 to 1/2 of
the original thread, but I think [4] has a good hilight).
Additionally, a quick grep through the code showed that there are
additional places in bash/zsh completion that need to be fixed to use
the new option besides the locations modified in the original zdiff3
patch.  See [1] and [2] for various other things overlooked.


[1] https://lore.kernel.org/git/CABPp-BGZ2H1MVgw9RvSdogLMdqsX3n89NkkDYDa2VM3TRHn7tg@mail.gmail.com/
[2] https://lore.kernel.org/git/YMbexfeUG78yBix4@coredump.intra.peff.net/
[3] https://lore.kernel.org/git/20210613225836.1009569-5-felipe.contreras@gmail.com/
[4] https://lore.kernel.org/git/20130307180157.GA6604@sigill.intra.peff.net/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-15  3:43         ` Elijah Newren
@ 2021-06-15  4:03           ` Junio C Hamano
  2021-06-15  9:20             ` Felipe Contreras
  2021-06-15  9:16           ` Felipe Contreras
  1 sibling, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2021-06-15  4:03 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Felipe Contreras, Johannes Schindelin, Jeff King,
	Git Mailing List, Johannes Sixt, Sergey Organov,
	Uwe Kleine-König

Elijah Newren <newren@gmail.com> writes:

> This is going to sound harsh, but people shouldn't waste (any more)
> time reviewing the patches in this thread or the "merge: cleanups and
> fix" series submitted elsewhere.  They should all just be rejected.
>
> I do not think it is reasonable to expect reviewers to spend time
> responding to re-posted patches when:
>   * no attempt was made to make sure they were up-to-date with current
> code beyond compiling (see below)
>   * no attempt was made to address missing items pointed out in
> response to the original submission[1]
>   * no attempt was made to handle or even test particular cases
> pointed out in response to the original submission (see [1] and below)
>   * the patches were posted despite knowing they caused segfaults, and
> without even stating as much![2]
>   * the segfault "fixes" are submitted as a separate series from the
> patch introducing the segfault[3], raising the risk that one gets
> picked up without the other.

Fair enough.  Thanks.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-14  4:44     ` Jeff King
@ 2021-06-15  4:19       ` Felipe Contreras
  2021-06-15  9:24         ` Jeff King
  0 siblings, 1 reply; 38+ messages in thread
From: Felipe Contreras @ 2021-06-15  4:19 UTC (permalink / raw)
  To: Jeff King, Felipe Contreras
  Cc: git, Elijah Newren, Johannes Sixt, Sergey Organov, Junio C Hamano,
	Uwe Kleine-König

Jeff King wrote:
> On Sun, Jun 13, 2021 at 01:00:33PM -0500, Felipe Contreras wrote:
> 
> > > > I'm re-sending this patch from 2013 because I do think it provides value
> > > > and we might want to make it the default.
> > > 
> > > I take it you didn't investigate the segfault I mentioned.
> > 
> > I don't know how I was supposed to investigate the few segfaults you
> > mentioned. All you said is that you never tracked the bug.
> 
> My point is that if you are going to repost a patch that has known
> problems,

It was not known that it had problems.

That fact that person X said patch Y had a problem doesn't necessarily
mean that patch Y has a problem.

  1. The problem in the past might not apply in the present
  2. The problem X person had might be specific to his/her setup
  3. The problem might be due a combination of patches, not the patch
     itself

Plus many others.

A logical person sees evidence for what it is, and the only thing that
person X saying patch Y had a problem means, is that person X said patch
Y had a problem.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-15  3:43         ` Elijah Newren
  2021-06-15  4:03           ` Junio C Hamano
@ 2021-06-15  9:16           ` Felipe Contreras
  2021-06-15 16:59             ` Elijah Newren
  1 sibling, 1 reply; 38+ messages in thread
From: Felipe Contreras @ 2021-06-15  9:16 UTC (permalink / raw)
  To: Elijah Newren, Junio C Hamano
  Cc: Felipe Contreras, Johannes Schindelin, Jeff King,
	Git Mailing List, Johannes Sixt, Sergey Organov,
	Uwe Kleine-König

Elijah Newren wrote:
> On Mon, Jun 14, 2021 at 7:07 PM Junio C Hamano <gitster@pobox.com> wrote:
> >
> > Felipe Contreras <felipe.contreras@gmail.com> writes:
> >
> > > I found the problem, m->chg0 was not initialized in xdl_refine_conflicts.
> > >
> > > I'm not familiar with the area so I don't know if the following makes
> > > sense, but it fixes the crash:
> >
> > Unlike the remainder of the xdiff/ directory, xdiff/xmerge.c was
> > Dscho's brainchild if I am not mistaken, so I'm CCing him for
> > input.
> 
> This is going to sound harsh, but people shouldn't waste (any more)
> time reviewing the patches in this thread or the "merge: cleanups and
> fix" series submitted elsewhere.  They should all just be rejected.
> 
> I do not think it is reasonable to expect reviewers to spend time
> responding to re-posted patches when:
>   * no attempt was made to make sure they were up-to-date with current
> code beyond compiling (see below)

What makes you think so?

>   * no attempt was made to address missing items pointed out in
> response to the original submission[1]

The original submission caused a discussion with no resolution, and
edned with Jeff saying he wanted to try real use-cases and that that he
wanted to use it in practice for a while.

The purpose v1 of this series was to respark the discussion and see if
any of the original parties had changed their minds.

People do change their minds after 8 years.

>   * no attempt was made to handle or even test particular cases
> pointed out in response to the original submission (see [1] and below)

Those were sent *after* the series, except [4], which clearly states the
*opposite* of there being a deal-breaker:

  But again, we don't do this splitting now. So I don't think it's
  something that should make or break a decision to have zdiff3. Without
  the splitting, I can see it being quite useful.

>   * the patches were posted despite knowing they caused segfaults, and
> without even stating as much![2]

Whomever *knew* that, it wasn't me.

>   * the segfault "fixes" are submitted as a separate series from the
> patch introducing the segfault[3], raising the risk that one gets
> picked up without the other.

My v2 includes the patch.

Just because a patch is in one series that doesn't preclude it from
being in another series. `git merge` and `git rebase` are smart enough
to handle such cases.

There is no risk of that happening (unless there's plans of merging v1
as-is).

> In my opinion, these submissions were egregiously cavalier.

If you make unwarranted assumptions everything is possible.

> I'll submit a patch (or perhaps a few) soon that has a functioning
> zdiff3.

I already have a functioning zdiff3.

> However, since I've already put in the time to understand it, let me
> explain what is wrong with this patch.  This particular change is in
> the area of the code that splits conflict regions when there are
> portions of the sides (not the base) that match.  Doing such splitting
> makes sense with "merge" conflictStyle since the base is never shown;
> this splitting can allow pulling the common lines out of the conflict
> region.  However, with diff3 or zdiff3, the original text does not
> match the sides and by splitting the conflict region, we are forced to
> decide how or where to split the original text among the various
> conflict (and non-conflict?) regions.  This is pretty haphazard, and
> the effect of this patch is to assign all of the original text to the
> first conflict region in the split, and make all other regions have
> empty base text.

Yes, that is *one* opinion. The jury is still out on what is the best
approach. Junio and Jeff did not agree on that.


The whole point of "zdiff3" was to have something closer to "merge",
even if it wasn't 100% correct. Your approach maybe more correct, but
correctness was never the point.

Either way, I have more rewarding things to focus on, so good luck with
that.

> [1] https://lore.kernel.org/git/CABPp-BGZ2H1MVgw9RvSdogLMdqsX3n89NkkDYDa2VM3TRHn7tg@mail.gmail.com/
> [2] https://lore.kernel.org/git/YMbexfeUG78yBix4@coredump.intra.peff.net/
> [3] https://lore.kernel.org/git/20210613225836.1009569-5-felipe.contreras@gmail.com/
> [4] https://lore.kernel.org/git/20130307180157.GA6604@sigill.intra.peff.net/

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-15  4:03           ` Junio C Hamano
@ 2021-06-15  9:20             ` Felipe Contreras
  0 siblings, 0 replies; 38+ messages in thread
From: Felipe Contreras @ 2021-06-15  9:20 UTC (permalink / raw)
  To: Junio C Hamano, Elijah Newren
  Cc: Felipe Contreras, Johannes Schindelin, Jeff King,
	Git Mailing List, Johannes Sixt, Sergey Organov,
	Uwe Kleine-König

Junio C Hamano wrote:
> Elijah Newren <newren@gmail.com> writes:
> 
> > This is going to sound harsh, but people shouldn't waste (any more)
> > time reviewing the patches in this thread or the "merge: cleanups and
> > fix" series submitted elsewhere.  They should all just be rejected.
> >
> > I do not think it is reasonable to expect reviewers to spend time
> > responding to re-posted patches when:
> >   * no attempt was made to make sure they were up-to-date with current
> > code beyond compiling (see below)
> >   * no attempt was made to address missing items pointed out in
> > response to the original submission[1]
> >   * no attempt was made to handle or even test particular cases
> > pointed out in response to the original submission (see [1] and below)
> >   * the patches were posted despite knowing they caused segfaults, and
> > without even stating as much![2]
> >   * the segfault "fixes" are submitted as a separate series from the
> > patch introducing the segfault[3], raising the risk that one gets
> > picked up without the other.
> 
> Fair enough.  Thanks.

I didn't know some people's opinions on this mailing list were
automatically promoted to facts, but FWIW the vast majority of the
points stated above are simply not true.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-15  4:19       ` Felipe Contreras
@ 2021-06-15  9:24         ` Jeff King
  2021-06-15 10:24           ` Felipe Contreras
  0 siblings, 1 reply; 38+ messages in thread
From: Jeff King @ 2021-06-15  9:24 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: git, Elijah Newren, Johannes Sixt, Sergey Organov, Junio C Hamano,
	Uwe Kleine-König

On Mon, Jun 14, 2021 at 11:19:46PM -0500, Felipe Contreras wrote:

> > My point is that if you are going to repost a patch that has known
> > problems,
> 
> It was not known that it had problems.
> 
> That fact that person X said patch Y had a problem doesn't necessarily
> mean that patch Y has a problem.
> 
>   1. The problem in the past might not apply in the present
>   2. The problem X person had might be specific to his/her setup
>   3. The problem might be due a combination of patches, not the patch
>      itself
> 
> Plus many others.
> 
> A logical person sees evidence for what it is, and the only thing that
> person X saying patch Y had a problem means, is that person X said patch
> Y had a problem.

Wow.

For one thing, you could still relay the _report_ of a problem along
with the patch, which would be valuable information for reviewers.

But much more important, in my opinion: that you would dismiss without
further investigation a report of a bug from the one person who actually
had experience running with the patch implies a level of carelessness
that I'm not comfortable with for the project.

I had already given up on having substantive discussion with you, but I
had hoped I could help the project by pointing out relevant facts in
areas that you were working in. But if a simple statement like "this
segfaulted for me" is not even useful, then I don't see much point in
communicating with you at all.

-Peff

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-15  9:24         ` Jeff King
@ 2021-06-15 10:24           ` Felipe Contreras
  0 siblings, 0 replies; 38+ messages in thread
From: Felipe Contreras @ 2021-06-15 10:24 UTC (permalink / raw)
  To: Jeff King, Felipe Contreras
  Cc: git, Elijah Newren, Johannes Sixt, Sergey Organov, Junio C Hamano,
	Uwe Kleine-König

Jeff King wrote:
> On Mon, Jun 14, 2021 at 11:19:46PM -0500, Felipe Contreras wrote:
> 
> > > My point is that if you are going to repost a patch that has known
> > > problems,
> > 
> > It was not known that it had problems.
> > 
> > That fact that person X said patch Y had a problem doesn't necessarily
> > mean that patch Y has a problem.
> > 
> >   1. The problem in the past might not apply in the present
> >   2. The problem X person had might be specific to his/her setup
> >   3. The problem might be due a combination of patches, not the patch
> >      itself
> > 
> > Plus many others.
> > 
> > A logical person sees evidence for what it is, and the only thing that
> > person X saying patch Y had a problem means, is that person X said patch
> > Y had a problem.
> 
> Wow.
> 
> For one thing, you could still relay the _report_ of a problem along
> with the patch, which would be valuable information for reviewers.

Yes I could have, and knowing what I know now I wouldn't even have even
posted the patch (not without a proposed fix). Woulda, coulda, shoulda.

But that's not the point. The point is that I did not repost a patch with
known problems *today*. Nor did I know what kind of problems, or
how pervasive the issue was.

Presumably you had to try at least 2,500 merges to find *one* issue.

I ran all the tests for diff3 with zdiff3 and they passed without
problems.


Merging this patch would have:

 1. Not broken any tests
 2. Not changed any behavior for any user
 3. Not have caused any problem for the vast majority (> 99%) of
    people trying out zdiff3

So there was no carelessness here.

Moreover, I provied the patch at 9:30, at 10:42 you commented about the
segfault, and 16:24 I had the fix. On a Sunday.

If this is not caring, I don't know what is.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] xdiff: implement a zealous diff3
  2021-06-15  9:16           ` Felipe Contreras
@ 2021-06-15 16:59             ` Elijah Newren
  0 siblings, 0 replies; 38+ messages in thread
From: Elijah Newren @ 2021-06-15 16:59 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: Junio C Hamano, Johannes Schindelin, Jeff King, Git Mailing List,
	Johannes Sixt, Sergey Organov, Uwe Kleine-König

On Tue, Jun 15, 2021 at 2:16 AM Felipe Contreras
<felipe.contreras@gmail.com> wrote:
>
> Elijah Newren wrote:
> > On Mon, Jun 14, 2021 at 7:07 PM Junio C Hamano <gitster@pobox.com> wrote:
> > >
> > > Felipe Contreras <felipe.contreras@gmail.com> writes:
> > >
> > > > I found the problem, m->chg0 was not initialized in xdl_refine_conflicts.
> > > >
> > > > I'm not familiar with the area so I don't know if the following makes
> > > > sense, but it fixes the crash:
> > >
> > > Unlike the remainder of the xdiff/ directory, xdiff/xmerge.c was
> > > Dscho's brainchild if I am not mistaken, so I'm CCing him for
> > > input.
> >
> > This is going to sound harsh, but people shouldn't waste (any more)
> > time reviewing the patches in this thread or the "merge: cleanups and
> > fix" series submitted elsewhere.  They should all just be rejected.
> >
> > I do not think it is reasonable to expect reviewers to spend time
> > responding to re-posted patches when:
> >   * no attempt was made to make sure they were up-to-date with current
> > code beyond compiling (see below)
>
> What makes you think so?

I did a simple grep to see where "diff3" was mentioned in the codebase
to see if any of those needed a "zdiff3".  Among the things I found
was that although the original patch updated git-completion.bash,
there were additional locations within a current git-completion.bash
that referred to "diff3" that should also have a "zdiff3".  I know you
understand that part of the code.

> >   * no attempt was made to address missing items pointed out in
> > response to the original submission[1]
>
> The original submission caused a discussion with no resolution

The discussion ended with no resolution in part because there were
multiple items discussed that would need to be addressed.  Including
the one reiterated at the end of the discussion.

>, and
> edned with Jeff saying he wanted to try real use-cases and that that he
> wanted to use it in practice for a while.

That wasn't the end of the discussion.  The email you are referencing
occurred here: https://lore.kernel.org/git/20130307185046.GA11622@sigill.intra.peff.net/.
The end of the discussion was Junio quoting himself in order to
reiterate that "As long as we clearly present the users what the
option does and what its implications are, it is not bad to have such
an option, I think."  See
https://lore.kernel.org/git/7vip42gfjc.fsf@alter.siamese.dyndns.org/
and check the timestamps in the threadlist.

> >   * no attempt was made to handle or even test particular cases
> > pointed out in response to the original submission (see [1] and below)
>
> Those were sent *after* the series, except [4], which clearly states the
> *opposite* of there being a deal-breaker:
>
>   But again, we don't do this splitting now. So I don't think it's
>   something that should make or break a decision to have zdiff3. Without
>   the splitting, I can see it being quite useful.

This statement from Peff was incorrect; the zdiff3 patches made the
code do splitting of conflict hunks.  I would normally understand if
perhaps you didn't know his statement was incorrect and wouldn't have
had a way to know, *except* for the fact that this exact patch we are
commenting on that you posted is modifying the code that does conflict
hunk splitting.

Further, you stated at
https://lore.kernel.org/git/60c8758c80e13_e633208f7@natae.notmuch/
that you wanted to see conflict hunk splitting in a zdiff3 mode and
expected it.  So clearly conflict hunk splitting is relevant to you
even if it wasn't to Peff.

Peff and Junio spent several emails discussing conflict hunk splitting
in the original thread (with Junio raising the question multiple times
showing it was a concern of his), and Peff spent several emails
discussing that topic even assuming that code was never triggered.  In
contrast to Peff, you know that conflict hunk splitting is relevant
since you wanted it to occur, you saw the old thread where they
discussed that topic and length, and yet you made no attempt to
include a testcase (perhaps even using the one they discussed) to show
how the splitting works?  I find that negligent.

> >   * the patches were posted despite knowing they caused segfaults, and
> > without even stating as much![2]
>
> Whomever *knew* that, it wasn't me.

You knew that Peff had reported they caused segfaults.  He pointed it
out after making you aware of the zdiff3 patch; see
https://lore.kernel.org/git/YMI+R5LFTj7ezlZE@coredump.intra.peff.net/.
You also acknowledged having known of Peff's reports before reposting
the patches at https://lore.kernel.org/git/60c82a622ae66_e5292087f@natae.notmuch/

You may be correct to point out that you only knew Peff had reported
segfaults, rather than having verified for yourself that there were
segfaults.  But the fact that you took no action on the knowledge you
did have, neither trying to verify, nor asking if the segfaults still
occurred, nor even relaying those reports when reposting the patch, is
exactly the problem at stake here.  I find the lack of action with
respect to the segfault report to be reckless.

> >   * the segfault "fixes" are submitted as a separate series from the
> > patch introducing the segfault[3], raising the risk that one gets
> > picked up without the other.
>
> My v2 includes the patch.

Ah, so your plan was to post a v2 with the fix as well as *also* post
that fix elsewhere?  Okay, that makes me feel better about this item,
so I retract it.

> > In my opinion, these submissions were egregiously cavalier.
>
> If you make unwarranted assumptions everything is possible.

Which assumptions?  That you were splitting the segfault fixes into a
separate series and not also including them with the patch that
introduces the segfault?  That does seem unusual and would have been
nice if you had communicated your plans somewhere so others wouldn't
have to worry about that particular issue, but I agree that your
explanation does invalidate the fifth item from my list as a concern.

Unfortunately, that still leaves the other four.  It's unfair to
reviewers to post patches if you have not done due diligence.  I've
read other patches of yours and commented that I thought they looked
good, so I'm not just trying to pick on you.  You clearly have talent.
With regards to the zdiff3 patches, I've stated above why I think you
haven't done your due diligence.  Sometimes people make mistakes;
that's something that can be corrected.  What I find egregious here is
that even when Peff and I have pointed out how more due diligence is
expected and needed, you've dug in to explain why you think your
course of action was reasonable (both here and in
https://lore.kernel.org/git/60c82a622ae66_e5292087f@natae.notmuch/).
That in my mind raises your submissions from careless to glaringly
cavalier.  Further, it makes me suspect we may continue to see you
repeat such behavior.  That worries me.

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2021-06-15 16:59 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-06 15:05 feature suggestion: optimize common parts for checkout --conflict=diff3 Uwe Kleine-König
2013-03-06 18:27 ` Antoine Pelisse
2013-03-06 19:26 ` Antoine Pelisse
2013-03-06 20:03   ` Jeff King
2013-03-06 20:36     ` [PATCH] xdiff: implement a zealous diff3 Uwe Kleine-König
2013-03-06 20:46       ` Jeff King
2013-03-06 20:40     ` feature suggestion: optimize common parts for checkout --conflict=diff3 Junio C Hamano
2013-03-06 20:54       ` Jeff King
2013-03-06 21:09         ` Junio C Hamano
2013-03-06 21:21           ` Jeff King
2013-03-06 21:50             ` Junio C Hamano
2013-03-07  1:02               ` Jeff King
2013-03-06 21:31           ` Uwe Kleine-König
2013-03-06 21:32           ` Junio C Hamano
2013-03-07  8:04             ` Jeff King
2013-03-07 17:26               ` Junio C Hamano
2013-03-07 18:01                 ` Jeff King
2013-03-07 18:40                   ` Junio C Hamano
2013-03-07 18:50                     ` Jeff King
2013-04-04 20:33                       ` Jeff King
2013-04-04 20:49                         ` Uwe Kleine-König
2013-04-04 20:54                           ` Jeff King
2013-04-04 21:19                             ` Junio C Hamano
2013-03-07 18:21                 ` Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2021-06-13 14:31 [PATCH] xdiff: implement a zealous diff3 Felipe Contreras
2021-06-13 15:42 ` Jeff King
2021-06-13 18:00   ` Felipe Contreras
2021-06-13 21:24     ` Felipe Contreras
2021-06-15  2:07       ` Junio C Hamano
2021-06-15  3:43         ` Elijah Newren
2021-06-15  4:03           ` Junio C Hamano
2021-06-15  9:20             ` Felipe Contreras
2021-06-15  9:16           ` Felipe Contreras
2021-06-15 16:59             ` Elijah Newren
2021-06-14  4:44     ` Jeff King
2021-06-15  4:19       ` Felipe Contreras
2021-06-15  9:24         ` Jeff King
2021-06-15 10:24           ` Felipe Contreras

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).