git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/9] Add a new --remerge-diff capability to show & log
@ 2021-12-21 18:05 Elijah Newren via GitGitGadget
  2021-12-21 18:05 ` [PATCH 1/9] tmp_objdir: add a helper function for discarding all contained objects Elijah Newren via GitGitGadget
                   ` (11 more replies)
  0 siblings, 12 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-21 18:05 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren

Here are some patches to add a --remerge-diff capability to show & log,
which works by comparing merge commits to an automatic remerge (note that
the automatic remerge tree can contain files with conflict markers).

Changes since original submission[1]:

 * Rebased on top of the version of ns/tmp-objdir that Neeraj submitted
   (Neeraj's patches were based on v2.34, but ns/tmp-objdir got applied on
   an old commit and does not even build because of that).
 * Modify ll-merge API to return a status, instead of printing "Cannot merge
   binary files" on stdout[2] (as suggested by Peff)
 * Make conflict messages and other such warnings into diff headers of the
   subsequent remerge-diff rather than appearing in the diff as file content
   of some funny looking filenames (as suggested by Peff[3] and Junio[4])
 * Sergey ack'ed the diff-merges.c portion of the patches, but that wasn't
   limited to one patch so not sure where to record that ack.

[1] https://lore.kernel.org/git/pull.1080.git.git.1630376800.gitgitgadget@gmail.com/
    GitHub wouldn't let me change the target branch for the PR, so I had to
    create a new one with the new base; that is why this is not being sent
    as v2 even though it is one.
[2] https://lore.kernel.org/git/YVOZRhWttzF18Xql@coredump.intra.peff.net/,
    https://lore.kernel.org/git/YVOZty9D7NRbzhE5@coredump.intra.peff.net/
[3] https://lore.kernel.org/git/YVOXPTjsp9lrxmS6@coredump.intra.peff.net/
[4] https://lore.kernel.org/git/xmqqr1d7e4ug.fsf@gitster.g/

=== FURTHER BACKGROUND (original cover letter material) ==

Here are some example commits you can try this out on (with git show
--remerge-diff $COMMIT):

 * git.git conflicted merge: 07601b5b36
 * git.git non-conflicted change: bf04590ecd
 * linux.git conflicted merge: eab3540562fb
 * linux.git non-conflicted change: 223cea6a4f05

Many more can be found by just running git log --merges --remerge-diff in
your repository of choice and searching for diffs (most merges tend to be
clean and unmodified and thus produce no diff but a search of '^diff' in the
log output tends to find the examples nicely).

Some basic high level details about this new option:

 * This option is most naturally compared to --cc, though the output seems
   to be much more understandable to most users than --cc output.
 * Since merges are often clean and unmodified, this new option results in
   an empty diff for most merges.
 * This new option shows things like the removal of conflict markers, which
   hunks users picked from the various conflicted sides to keep or remove,
   and shows changes made outside of conflict markers (which might reflect
   changes needed to resolve semantic conflicts or cleanups of e.g.
   compilation warnings or other additional changes an integrator felt
   belonged in the merged result).
 * This new option does not (currently) work for octopus merges, since
   merge-ort is specific to two-parent merges[1].
 * This option will not work on a read-only or full filesystem[2].
 * We discussed this capability at Git Merge 2020, and one of the
   suggestions was doing a periodic git gc --auto during the operation (due
   to potential new blobs and trees created during the operation). I found a
   way to avoid that; see [2].
 * This option is faster than you'd probably expect; it handles 33.5 merge
   commits per second in linux.git on my computer; see below.

Regarding the performance point above, the timing for running the
following command:

time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l


in linux.git (with v5.4 checked out, since my copy of linux is very out of
date) is as follows:

DIFF_FLAG=--cc:            71m 31.536s
DIFF_FLAG=--remerge-diff:  31m  3.170s


Note that there are 62476 merges in this history. Also, output size is:

DIFF_FLAG=--cc:            2169111 lines
DIFF_FLAG=--remerge-diff:  2458020 lines


So roughly the same amount of output as --cc, as you'd expect.
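
The figures above can be cross-checked with a little arithmetic. The
following is only a sketch, assuming the reported times are wall-clock
times for the whole run; the helper names are made up for illustration
and are not part of git.

```c
#include <assert.h>

/* Convert a "time" style reading such as "31m 3.170s" into seconds. */
double to_seconds(int minutes, double seconds)
{
	assert(minutes >= 0 && seconds >= 0);
	return minutes * 60.0 + seconds;
}

/* Throughput: how many merge commits were handled per second. */
double merges_per_second(long merges, double elapsed_seconds)
{
	assert(elapsed_seconds > 0);
	return (double)merges / elapsed_seconds;
}

/* Plain ratio, used for both the speedup and the output-size comparison. */
double ratio(double numerator, double denominator)
{
	assert(denominator > 0);
	return numerator / denominator;
}
```

With the numbers quoted in this message, merges_per_second(62476,
to_seconds(31, 3.170)) comes out at roughly 33.5 for --remerge-diff and
merges_per_second(62476, to_seconds(71, 31.536)) at roughly 14.6 for --cc,
i.e. a speedup of about 2.3x, while ratio(2458020, 2169111) shows about 13%
more output lines.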

As a side note: git log --remerge-diff, when run in various repositories and
allowed to run all the way back to the beginning(s) of history, is a nice
stress test of sorts for merge-ort, especially when users run it for you on
the repositories they are working on, whether intentionally or via a bug in
a tool that triggers the command unexpectedly. Long story short, such a bug
in an internal tool existed last December; the command was run on an
internal repository and found a platform-specific bug in merge-ort on some
really old merge commit from that repo. I fixed that bug (a STABLE_QSORT
thing) while upstreaming all the merge-ort patches in the meantime, but it
was nice getting extra testing. Having more folks run this on their
repositories might be useful extra testing of the new merge strategy.

Also, I previously mentioned --remerge-diff-only (a flag to show how
cherry-picks or reverts differ from an automatic cherry-pick or revert, in
addition to showing how merges differ from an automatic merge). This series
does not include the patches to introduce that option; I'll submit them
later.

Two other things that might be interesting but are not included and which I
haven't investigated:

 * some mechanism for passing extra merge options through (e.g.
   -Xignore-space-change)
 * a capability to compare the automatic merge to a second automatic merge
   done with different merge options. (Not sure if this would be of interest
   to end users, but it might be interesting while developing a new
   --strategy-option, or maybe for checking how changing some default in the
   merge algorithm would affect historical merges in various repositories.)

[1] I have nebulous ideas of how an Octopus-centric ORT strategy could be
written -- basically, just repeatedly invoking ort and trying to make sure
nested conflicts can be differentiated. For now, though, a simple warning is
printed that octopus merges are not handled and no diff will be shown.

[2] New blobs/trees can be written by the three-way merging step. These are
written to a temporary area (via tmp-objdir.c) under the git object store
that is cleaned up at the end of the operation, with the new loose objects
from the remerge being cleaned up after each individual merge.

Elijah Newren (9):
  tmp_objdir: add a helper function for discarding all contained objects
  ll-merge: make callers responsible for showing warnings
  merge-ort: capture and print ll-merge warnings in our preferred
    fashion
  merge-ort: mark a few more conflict messages as omittable
  merge-ort: make path_messages available to external callers
  diff: add ability to insert additional headers for paths
  merge-ort: format messages slightly different for use in headers
  show, log: provide a --remerge-diff capability
  doc/diff-options: explain the new --remerge-diff option

 Documentation/diff-options.txt |  8 ++++
 apply.c                        |  5 ++-
 builtin/checkout.c             | 12 ++++--
 builtin/log.c                  | 16 ++++++++
 diff-merges.c                  | 12 ++++++
 diff.c                         | 34 ++++++++++++++++-
 diff.h                         |  1 +
 ll-merge.c                     | 40 ++++++++++---------
 ll-merge.h                     |  9 ++++-
 log-tree.c                     | 70 ++++++++++++++++++++++++++++++++++
 merge-blobs.c                  |  5 ++-
 merge-ort.c                    | 49 +++++++++++++++++++++---
 merge-ort.h                    | 10 +++++
 merge-recursive.c              |  8 +++-
 merge-recursive.h              |  1 +
 notes-merge.c                  |  5 ++-
 rerere.c                       | 10 +++--
 revision.h                     |  6 ++-
 t/t6404-recursive-merge.sh     |  9 ++++-
 t/t6406-merge-attr.sh          |  9 ++++-
 tmp-objdir.c                   |  5 +++
 tmp-objdir.h                   |  6 +++
 22 files changed, 288 insertions(+), 42 deletions(-)


base-commit: 4e44121c2d7bced65e25eb7ec5156290132bec94
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1103%2Fnewren%2Fremerge-diff-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1103/newren/remerge-diff-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1103
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH 1/9] tmp_objdir: add a helper function for discarding all contained objects
  2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
@ 2021-12-21 18:05 ` Elijah Newren via GitGitGadget
  2021-12-21 23:26   ` Junio C Hamano
  2021-12-21 18:05 ` [PATCH 2/9] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-21 18:05 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 tmp-objdir.c | 5 +++++
 tmp-objdir.h | 6 ++++++
 2 files changed, 11 insertions(+)

diff --git a/tmp-objdir.c b/tmp-objdir.c
index 3d38eeab66b..adf6033549e 100644
--- a/tmp-objdir.c
+++ b/tmp-objdir.c
@@ -79,6 +79,11 @@ static void remove_tmp_objdir_on_signal(int signo)
 	raise(signo);
 }
 
+void tmp_objdir_discard_objects(struct tmp_objdir *t)
+{
+	remove_dir_recursively(&t->path, REMOVE_DIR_KEEP_TOPLEVEL);
+}
+
 /*
  * These env_* functions are for setting up the child environment; the
  * "replace" variant overrides the value of any existing variable with that
diff --git a/tmp-objdir.h b/tmp-objdir.h
index cda5ec76778..76efc7edee5 100644
--- a/tmp-objdir.h
+++ b/tmp-objdir.h
@@ -46,6 +46,12 @@ int tmp_objdir_migrate(struct tmp_objdir *);
  */
 int tmp_objdir_destroy(struct tmp_objdir *);
 
+/*
+ * Remove all objects from the temporary object directory, while leaving it
+ * around so more objects can be added.
+ */
+void tmp_objdir_discard_objects(struct tmp_objdir *);
+
 /*
  * Add the temporary object directory as an alternate object store in the
  * current process.
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 113+ messages in thread

* [PATCH 2/9] ll-merge: make callers responsible for showing warnings
  2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
  2021-12-21 18:05 ` [PATCH 1/9] tmp_objdir: add a helper function for discarding all contained objects Elijah Newren via GitGitGadget
@ 2021-12-21 18:05 ` Elijah Newren via GitGitGadget
  2021-12-21 21:19   ` Ævar Arnfjörð Bjarmason
  2021-12-21 23:44   ` Junio C Hamano
  2021-12-21 18:05 ` [PATCH 3/9] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-21 18:05 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Since some callers may want to send warning messages to somewhere other
than stdout/stderr, stop printing "warning: Cannot merge binary files"
from ll-merge and instead modify the return status of ll_merge() to
indicate when a merge of binary files has occurred.

Note that my methodology included first modifying ll_merge() to return
a struct, so that the compiler would catch all the callers for me and
ensure I had modified all of them.  After modifying all of them, I then
changed the struct to an enum.
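
The struct-then-enum methodology can be illustrated in isolation. This is
a sketch with hypothetical names (merge_step1, merge_step2, and so on),
not code from the patch: while the function temporarily returns a struct,
a stale caller such as `int status = merge_step1(...)` no longer compiles,
so every call site gets visited before the final enum is introduced.

```c
#include <assert.h>

/* Step 1 (temporary): an int wrapped in a struct cannot be silently
 * assigned to, or compared against, a plain int by old callers. */
struct merge_status_wrapper {
	int value;
};

struct merge_status_wrapper merge_step1(int conflicts)
{
	struct merge_status_wrapper ret = { conflicts };
	return ret;
}

/* An updated caller has to spell out the member access explicitly. */
int updated_caller(int conflicts)
{
	struct merge_status_wrapper status = merge_step1(conflicts);
	assert(status.value == conflicts);
	return status.value;
}

/* Step 2 (final): once all callers are converted, switch to an enum. */
enum merge_status {
	MERGE_STATUS_ERROR = -1,
	MERGE_STATUS_OK = 0,
	MERGE_STATUS_CONFLICT,
};

enum merge_status merge_step2(int conflicts)
{
	if (conflicts < 0)
		return MERGE_STATUS_ERROR;
	return conflicts ? MERGE_STATUS_CONFLICT : MERGE_STATUS_OK;
}
```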

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 apply.c            |  5 ++++-
 builtin/checkout.c | 12 ++++++++----
 ll-merge.c         | 40 ++++++++++++++++++++++------------------
 ll-merge.h         |  9 ++++++++-
 merge-blobs.c      |  5 ++++-
 merge-ort.c        |  5 ++++-
 merge-recursive.c  |  5 ++++-
 notes-merge.c      |  5 ++++-
 rerere.c           | 10 +++++++---
 9 files changed, 65 insertions(+), 31 deletions(-)

diff --git a/apply.c b/apply.c
index 43a0aebf4ee..12ea9c72a6b 100644
--- a/apply.c
+++ b/apply.c
@@ -3492,7 +3492,7 @@ static int three_way_merge(struct apply_state *state,
 {
 	mmfile_t base_file, our_file, their_file;
 	mmbuffer_t result = { NULL };
-	int status;
+	enum ll_merge_result status;
 
 	/* resolve trivial cases first */
 	if (oideq(base, ours))
@@ -3509,6 +3509,9 @@ static int three_way_merge(struct apply_state *state,
 			  &their_file, "theirs",
 			  state->repo->index,
 			  NULL);
+	if (status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			"base", "ours", "theirs");
 	free(base_file.ptr);
 	free(our_file.ptr);
 	free(their_file.ptr);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index cbf73b8c9f6..3a559d69303 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -237,6 +237,7 @@ static int checkout_merged(int pos, const struct checkout *state,
 	struct cache_entry *ce = active_cache[pos];
 	const char *path = ce->name;
 	mmfile_t ancestor, ours, theirs;
+	enum ll_merge_result merge_status;
 	int status;
 	struct object_id oid;
 	mmbuffer_t result_buf;
@@ -267,13 +268,16 @@ static int checkout_merged(int pos, const struct checkout *state,
 	memset(&ll_opts, 0, sizeof(ll_opts));
 	git_config_get_bool("merge.renormalize", &renormalize);
 	ll_opts.renormalize = renormalize;
-	status = ll_merge(&result_buf, path, &ancestor, "base",
-			  &ours, "ours", &theirs, "theirs",
-			  state->istate, &ll_opts);
+	merge_status = ll_merge(&result_buf, path, &ancestor, "base",
+				&ours, "ours", &theirs, "theirs",
+				state->istate, &ll_opts);
 	free(ancestor.ptr);
 	free(ours.ptr);
 	free(theirs.ptr);
-	if (status < 0 || !result_buf.ptr) {
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, "ours", "theirs");
+	if (merge_status < 0 || !result_buf.ptr) {
 		free(result_buf.ptr);
 		return error(_("path '%s': cannot merge"), path);
 	}
diff --git a/ll-merge.c b/ll-merge.c
index 261657578c7..669c09eed6c 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -14,7 +14,7 @@
 
 struct ll_merge_driver;
 
-typedef int (*ll_merge_fn)(const struct ll_merge_driver *,
+typedef enum ll_merge_result (*ll_merge_fn)(const struct ll_merge_driver *,
 			   mmbuffer_t *result,
 			   const char *path,
 			   mmfile_t *orig, const char *orig_name,
@@ -49,7 +49,7 @@ void reset_merge_attributes(void)
 /*
  * Built-in low-levels
  */
-static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_binary_merge(const struct ll_merge_driver *drv_unused,
 			   mmbuffer_t *result,
 			   const char *path,
 			   mmfile_t *orig, const char *orig_name,
@@ -58,6 +58,7 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 			   const struct ll_merge_options *opts,
 			   int marker_size)
 {
+	enum ll_merge_result ret;
 	mmfile_t *stolen;
 	assert(opts);
 
@@ -68,16 +69,19 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 	 */
 	if (opts->virtual_ancestor) {
 		stolen = orig;
+		ret = LL_MERGE_OK;
 	} else {
 		switch (opts->variant) {
 		default:
-			warning("Cannot merge binary files: %s (%s vs. %s)",
-				path, name1, name2);
-			/* fallthru */
+			ret = LL_MERGE_BINARY_CONFLICT;
+			stolen = src1;
+			break;
 		case XDL_MERGE_FAVOR_OURS:
+			ret = LL_MERGE_OK;
 			stolen = src1;
 			break;
 		case XDL_MERGE_FAVOR_THEIRS:
+			ret = LL_MERGE_OK;
 			stolen = src2;
 			break;
 		}
@@ -87,16 +91,10 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 	result->size = stolen->size;
 	stolen->ptr = NULL;
 
-	/*
-	 * With -Xtheirs or -Xours, we have cleanly merged;
-	 * otherwise we got a conflict.
-	 */
-	return opts->variant == XDL_MERGE_FAVOR_OURS ||
-	       opts->variant == XDL_MERGE_FAVOR_THEIRS ?
-	       0 : 1;
+	return ret;
 }
 
-static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 			mmbuffer_t *result,
 			const char *path,
 			mmfile_t *orig, const char *orig_name,
@@ -105,7 +103,9 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 			const struct ll_merge_options *opts,
 			int marker_size)
 {
+	enum ll_merge_result ret;
 	xmparam_t xmp;
+	int status;
 	assert(opts);
 
 	if (orig->size > MAX_XDIFF_SIZE ||
@@ -133,10 +133,12 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 	xmp.ancestor = orig_name;
 	xmp.file1 = name1;
 	xmp.file2 = name2;
-	return xdl_merge(orig, src1, src2, &xmp, result);
+	status = xdl_merge(orig, src1, src2, &xmp, result);
+	ret = (status > 1 ) ? LL_MERGE_CONFLICT : status;
+	return ret;
 }
 
-static int ll_union_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_union_merge(const struct ll_merge_driver *drv_unused,
 			  mmbuffer_t *result,
 			  const char *path,
 			  mmfile_t *orig, const char *orig_name,
@@ -178,7 +180,7 @@ static void create_temp(mmfile_t *src, char *path, size_t len)
 /*
  * User defined low-level merge driver support.
  */
-static int ll_ext_merge(const struct ll_merge_driver *fn,
+static enum ll_merge_result ll_ext_merge(const struct ll_merge_driver *fn,
 			mmbuffer_t *result,
 			const char *path,
 			mmfile_t *orig, const char *orig_name,
@@ -194,6 +196,7 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 	const char *args[] = { NULL, NULL };
 	int status, fd, i;
 	struct stat st;
+	enum ll_merge_result ret;
 	assert(opts);
 
 	sq_quote_buf(&path_sq, path);
@@ -236,7 +239,8 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 		unlink_or_warn(temp[i]);
 	strbuf_release(&cmd);
 	strbuf_release(&path_sq);
-	return status;
+	ret = (status > 1) ? LL_MERGE_CONFLICT : status;
+	return ret;
 }
 
 /*
@@ -362,7 +366,7 @@ static void normalize_file(mmfile_t *mm, const char *path, struct index_state *i
 	}
 }
 
-int ll_merge(mmbuffer_t *result_buf,
+enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
 	     const char *path,
 	     mmfile_t *ancestor, const char *ancestor_label,
 	     mmfile_t *ours, const char *our_label,
diff --git a/ll-merge.h b/ll-merge.h
index aceb1b24132..e4a20e81a3a 100644
--- a/ll-merge.h
+++ b/ll-merge.h
@@ -82,13 +82,20 @@ struct ll_merge_options {
 	long xdl_opts;
 };
 
+enum ll_merge_result {
+	LL_MERGE_ERROR = -1,
+	LL_MERGE_OK = 0,
+	LL_MERGE_CONFLICT,
+	LL_MERGE_BINARY_CONFLICT,
+};
+
 /**
  * Perform a three-way single-file merge in core.  This is a thin wrapper
  * around `xdl_merge` that takes the path and any merge backend specified in
  * `.gitattributes` or `.git/info/attributes` into account.
  * Returns 0 for a clean merge.
  */
-int ll_merge(mmbuffer_t *result_buf,
+enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
 	     const char *path,
 	     mmfile_t *ancestor, const char *ancestor_label,
 	     mmfile_t *ours, const char *our_label,
diff --git a/merge-blobs.c b/merge-blobs.c
index ee0a0e90c94..8138090f81c 100644
--- a/merge-blobs.c
+++ b/merge-blobs.c
@@ -36,7 +36,7 @@ static void *three_way_filemerge(struct index_state *istate,
 				 mmfile_t *their,
 				 unsigned long *size)
 {
-	int merge_status;
+	enum ll_merge_result merge_status;
 	mmbuffer_t res;
 
 	/*
@@ -50,6 +50,9 @@ static void *three_way_filemerge(struct index_state *istate,
 				istate, NULL);
 	if (merge_status < 0)
 		return NULL;
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, ".our", ".their");
 
 	*size = res.size;
 	return res.ptr;
diff --git a/merge-ort.c b/merge-ort.c
index 0342f104836..c24da2ba3cb 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1743,7 +1743,7 @@ static int merge_3way(struct merge_options *opt,
 	mmfile_t orig, src1, src2;
 	struct ll_merge_options ll_opts = {0};
 	char *base, *name1, *name2;
-	int merge_status;
+	enum ll_merge_result merge_status;
 
 	if (!opt->priv->attr_index.initialized)
 		initialize_attr_index(opt);
@@ -1787,6 +1787,9 @@ static int merge_3way(struct merge_options *opt,
 	merge_status = ll_merge(result_buf, path, &orig, base,
 				&src1, name1, &src2, name2,
 				&opt->priv->attr_index, &ll_opts);
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/merge-recursive.c b/merge-recursive.c
index d9457797dbb..bc73c52dd84 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1044,7 +1044,7 @@ static int merge_3way(struct merge_options *opt,
 	mmfile_t orig, src1, src2;
 	struct ll_merge_options ll_opts = {0};
 	char *base, *name1, *name2;
-	int merge_status;
+	enum ll_merge_result merge_status;
 
 	ll_opts.renormalize = opt->renormalize;
 	ll_opts.extra_marker_size = extra_marker_size;
@@ -1090,6 +1090,9 @@ static int merge_3way(struct merge_options *opt,
 	merge_status = ll_merge(result_buf, a->path, &orig, base,
 				&src1, name1, &src2, name2,
 				opt->repo->index, &ll_opts);
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			a->path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/notes-merge.c b/notes-merge.c
index b4a3a903e86..01d596920ea 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -344,7 +344,7 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
 {
 	mmbuffer_t result_buf;
 	mmfile_t base, local, remote;
-	int status;
+	enum ll_merge_result status;
 
 	read_mmblob(&base, &p->base);
 	read_mmblob(&local, &p->local);
@@ -358,6 +358,9 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
 	free(local.ptr);
 	free(remote.ptr);
 
+	if (status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			oid_to_hex(&p->obj), o->local_ref, o->remote_ref);
 	if ((status < 0) || !result_buf.ptr)
 		die("Failed to execute internal merge");
 
diff --git a/rerere.c b/rerere.c
index d83d58df4fb..46fd01819b8 100644
--- a/rerere.c
+++ b/rerere.c
@@ -609,19 +609,23 @@ static int try_merge(struct index_state *istate,
 		     const struct rerere_id *id, const char *path,
 		     mmfile_t *cur, mmbuffer_t *result)
 {
-	int ret;
+	enum ll_merge_result ret;
 	mmfile_t base = {NULL, 0}, other = {NULL, 0};
 
 	if (read_mmfile(&base, rerere_path(id, "preimage")) ||
 	    read_mmfile(&other, rerere_path(id, "postimage")))
-		ret = 1;
-	else
+		ret = LL_MERGE_CONFLICT;
+	else {
 		/*
 		 * A three-way merge. Note that this honors user-customizable
 		 * low-level merge driver settings.
 		 */
 		ret = ll_merge(result, path, &base, NULL, cur, "", &other, "",
 			       istate, NULL);
+		if (ret == LL_MERGE_BINARY_CONFLICT)
+			warning("Cannot merge binary files: %s (%s vs. %s)",
+				path, "", "");
+	}
 
 	free(base.ptr);
 	free(other.ptr);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 113+ messages in thread
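
The caller-side pattern this patch introduces can be tried outside git
with a small stand-alone sketch. The enum is copied from the ll-merge.h
hunk above; classify_status() and report_merge() are hypothetical helpers
written for illustration. Note one deliberate difference: the patch passes
a negative status through unchanged, whereas this sketch folds every
negative value into LL_MERGE_ERROR.

```c
#include <assert.h>
#include <stdio.h>

enum ll_merge_result {
	LL_MERGE_ERROR = -1,
	LL_MERGE_OK = 0,
	LL_MERGE_CONFLICT,
	LL_MERGE_BINARY_CONFLICT,
};

/* Collapse a raw backend status (which may be larger than 1) into the
 * enum, in the spirit of "(status > 1) ? LL_MERGE_CONFLICT : status". */
enum ll_merge_result classify_status(int status)
{
	if (status < 0)
		return LL_MERGE_ERROR;
	return status > 1 ? LL_MERGE_CONFLICT : (enum ll_merge_result)status;
}

/*
 * The caller, not the merge backend, decides what to do with a binary
 * conflict.  Here the warning is formatted into `buf` so that it could
 * be printed, logged, or attached to a diff header.
 * Returns 1 when the merge was clean, 0 otherwise.
 */
int report_merge(enum ll_merge_result res, const char *path,
		 const char *ours, const char *theirs,
		 char *buf, size_t len)
{
	assert(buf && len > 0);
	buf[0] = '\0';
	if (res == LL_MERGE_BINARY_CONFLICT)
		snprintf(buf, len,
			 "warning: Cannot merge binary files: %s (%s vs. %s)",
			 path, ours, theirs);
	return res == LL_MERGE_OK;
}
```

Returning the condition rather than printing it is what lets a caller
such as merge-ort route the message somewhere other than stderr.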

* [PATCH 3/9] merge-ort: capture and print ll-merge warnings in our preferred fashion
  2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
  2021-12-21 18:05 ` [PATCH 1/9] tmp_objdir: add a helper function for discarding all contained objects Elijah Newren via GitGitGadget
  2021-12-21 18:05 ` [PATCH 2/9] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
@ 2021-12-21 18:05 ` Elijah Newren via GitGitGadget
  2021-12-22  0:00   ` Junio C Hamano
  2021-12-21 18:05 ` [PATCH 4/9] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-21 18:05 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Instead of immediately printing ll-merge warnings to stderr, we save
them in our output strbuf.  Besides allowing us to move these warnings
to a special file for --remerge-diff, this has two other benefits for
regular merges done by merge-ort:

  * The deferral of messages ensures we can print all messages about
    any given path together (merge-recursive was known to sometimes
    intersperse messages about other paths, particularly when renames
    were involved).

  * The deferral of messages means we can avoid printing spurious
    conflict messages when we just end up aborting due to local user
    modifications in the way.  (In contrast to merge-recursive.c which
    prematurely checks for local modifications in the way via
    unpack_trees() and gets the check wrong both in terms of false
    positives and false negatives relative to renames, merge-ort does
    not perform the local modifications in the way check until the
    checkout() step after the full merge has been computed.)

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c                | 5 +++--
 t/t6404-recursive-merge.sh | 9 +++++++--
 t/t6406-merge-attr.sh      | 9 +++++++--
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index c24da2ba3cb..a18f47e23c5 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1788,8 +1788,9 @@ static int merge_3way(struct merge_options *opt,
 				&src1, name1, &src2, name2,
 				&opt->priv->attr_index, &ll_opts);
 	if (merge_status == LL_MERGE_BINARY_CONFLICT)
-		warning("Cannot merge binary files: %s (%s vs. %s)",
-			path, name1, name2);
+		path_msg(opt, path, 0,
+			 "warning: Cannot merge binary files: %s (%s vs. %s)",
+			 path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/t/t6404-recursive-merge.sh b/t/t6404-recursive-merge.sh
index eaf48e941e2..b8735c6db4d 100755
--- a/t/t6404-recursive-merge.sh
+++ b/t/t6404-recursive-merge.sh
@@ -108,8 +108,13 @@ test_expect_success 'refuse to merge binary files' '
 	printf "\0\0" >binary-file &&
 	git add binary-file &&
 	git commit -m binary2 &&
-	test_must_fail git merge F >merge.out 2>merge.err &&
-	grep "Cannot merge binary files: binary-file (HEAD vs. F)" merge.err
+	if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+	then
+		test_must_fail git merge F >merge_output
+	else
+		test_must_fail git merge F 2>merge_output
+	fi &&
+	grep "Cannot merge binary files: binary-file (HEAD vs. F)" merge_output
 '
 
 test_expect_success 'mark rename/delete as unmerged' '
diff --git a/t/t6406-merge-attr.sh b/t/t6406-merge-attr.sh
index 84946458371..c41584eb33e 100755
--- a/t/t6406-merge-attr.sh
+++ b/t/t6406-merge-attr.sh
@@ -221,8 +221,13 @@ test_expect_success 'binary files with union attribute' '
 	printf "two\0" >bin.txt &&
 	git commit -am two &&
 
-	test_must_fail git merge bin-main 2>stderr &&
-	grep -i "warning.*cannot merge.*HEAD vs. bin-main" stderr
+	if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+	then
+		test_must_fail git merge bin-main >output
+	else
+		test_must_fail git merge bin-main 2>output
+	fi &&
+	grep -i "warning.*cannot merge.*HEAD vs. bin-main" output
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 113+ messages in thread
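
The deferral described in this commit message amounts to appending each
message to a per-path buffer and printing the buffers later. Below is a
minimal sketch of that idea, assuming fixed-size arrays in place of git's
strmap and strbuf; message_store, store_get(), defer_msg() and
flush_msgs() are hypothetical names, not git APIs.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MAX_PATHS 16
#define MSG_BUF 512

struct path_output {
	char path[128];
	char buf[MSG_BUF];	/* all messages for this path, one per line */
};

struct message_store {
	struct path_output entries[MAX_PATHS];
	int nr;
};

/* Find the buffer for `path`, creating it on first use.
 * Returns NULL if the store is full or the path is too long. */
struct path_output *store_get(struct message_store *s, const char *path)
{
	int i;

	assert(s && path);
	for (i = 0; i < s->nr; i++)
		if (!strcmp(s->entries[i].path, path))
			return &s->entries[i];
	if (s->nr == MAX_PATHS || strlen(path) >= sizeof(s->entries[0].path))
		return NULL;
	strcpy(s->entries[s->nr].path, path);
	s->entries[s->nr].buf[0] = '\0';
	return &s->entries[s->nr++];
}

/* Record a message instead of printing it.  Returns 0 on success,
 * -1 if the message did not fit (the buffer is left unchanged). */
int defer_msg(struct message_store *s, const char *path, const char *fmt, ...)
{
	struct path_output *po = store_get(s, path);
	size_t used, avail;
	va_list ap;
	int n;

	if (!po)
		return -1;
	used = strlen(po->buf);
	avail = sizeof(po->buf) - used;
	va_start(ap, fmt);
	n = vsnprintf(po->buf + used, avail, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n + 1 >= avail) {
		po->buf[used] = '\0';	/* drop the partial message */
		return -1;
	}
	po->buf[used + n] = '\n';
	po->buf[used + n + 1] = '\0';
	return 0;
}

/* Emit the messages path by path, so each path's output is contiguous. */
void flush_msgs(const struct message_store *s, FILE *out)
{
	int i;

	for (i = 0; i < s->nr; i++)
		fputs(s->entries[i].buf, out);
}
```

Because nothing is printed until flush_msgs() runs, a caller that ends up
aborting the merge can simply discard the store, and messages recorded for
one path at different times still come out together.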

* [PATCH 4/9] merge-ort: mark a few more conflict messages as omittable
  2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                   ` (2 preceding siblings ...)
  2021-12-21 18:05 ` [PATCH 3/9] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
@ 2021-12-21 18:05 ` Elijah Newren via GitGitGadget
  2021-12-22  0:06   ` Junio C Hamano
  2021-12-21 18:05 ` [PATCH 5/9] merge-ort: make path_messages available to external callers Elijah Newren via GitGitGadget
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-21 18:05 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

path_msg() has the ability to mark messages as omittable, designed for
remerge-diff where we'll instead be showing conflict messages as diff
headers for a subsequent diff.  While all these messages are very useful
when trying to create a merge initially, early use with the
--remerge-diff feature (the only user of this omittable conflict message
capability) suggests that the particular messages marked in this commit
are just noise when trying to see what changes users made to create a
merge commit.  Mark them as omittable.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index a18f47e23c5..fe27870e73e 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -2420,7 +2420,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
 		 */
 		ci->path_conflict = 1;
 		if (pair->status == 'A')
-			path_msg(opt, new_path, 0,
+			path_msg(opt, new_path, 1,
 				 _("CONFLICT (file location): %s added in %s "
 				   "inside a directory that was renamed in %s, "
 				   "suggesting it should perhaps be moved to "
@@ -2428,7 +2428,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
 				 old_path, branch_with_new_path,
 				 branch_with_dir_rename, new_path);
 		else
-			path_msg(opt, new_path, 0,
+			path_msg(opt, new_path, 1,
 				 _("CONFLICT (file location): %s renamed to %s "
 				   "in %s, inside a directory that was renamed "
 				   "in %s, suggesting it should perhaps be "
@@ -3825,7 +3825,7 @@ static void process_entry(struct merge_options *opt,
 				reason = _("add/add");
 			if (S_ISGITLINK(merged_file.mode))
 				reason = _("submodule");
-			path_msg(opt, path, 0,
+			path_msg(opt, path, 1,
 				 _("CONFLICT (%s): Merge conflict in %s"),
 				 reason, path);
 		}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 113+ messages in thread
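
The effect of the third path_msg() argument can be pictured as a simple
filter. This is a sketch under the assumption that omittable messages are
skipped only when the caller wants messages recorded as diff headers;
struct msg_policy, its record_as_headers field and should_record() are
hypothetical names for illustration.

```c
#include <assert.h>

/* Hypothetical per-merge setting: are messages destined for diff headers? */
struct msg_policy {
	int record_as_headers;
};

/*
 * Decide whether a message should be kept.  An ordinary merge keeps
 * everything; in header mode, messages flagged as omittable are dropped.
 */
int should_record(const struct msg_policy *policy, int omittable_hint)
{
	assert(policy);
	return !(policy->record_as_headers && omittable_hint);
}
```

Under this reading, changing a call from path_msg(opt, path, 0, ...) to
path_msg(opt, path, 1, ...) leaves regular merges untouched and only trims
the remerge-diff output.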

* [PATCH 5/9] merge-ort: make path_messages available to external callers
  2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                   ` (3 preceding siblings ...)
  2021-12-21 18:05 ` [PATCH 4/9] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
@ 2021-12-21 18:05 ` Elijah Newren via GitGitGadget
  2021-12-21 18:05 ` [PATCH 6/9] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-21 18:05 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

merge-ort is designed to be more flexible so that it could be called as
more of a library function.  Part of that design is not writing to the
working tree or index unless and until requested.  Part of it is
returning tree objects (rather than creating commits and making them
part of HEAD), and allowing callers to do their own special thing with
that merged tree.  Along the same lines, we want to enable callers to do
something special with output messages (conflicts and other warnings)
besides just automatically displaying on stdout/stderr.  Do so by making
the output path messages accessible via a new member of struct
merge_result named path_messages.
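
As a rough illustration of what such a map gives a caller, here is a
minimal standalone sketch.  It does not use git's strmap or strbuf; the
names path_msgs, record_msg() and msgs_for_path() are hypothetical and
only model the idea that several messages for one path accumulate in a
single newline-separated buffer that can be looked up by pathname.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a pathname -> message-buffer map. */
#define MAX_PATHS 8
#define MSG_CAP 256

struct path_msgs {
	const char *path[MAX_PATHS];   /* borrowed pointers, not copied */
	char msg[MAX_PATHS][MSG_CAP];  /* newline-separated messages */
	int nr;
};

/* Return the message buffer for path, or NULL if none was recorded. */
static const char *msgs_for_path(const struct path_msgs *m, const char *path)
{
	assert(m && path);
	for (int i = 0; i < m->nr; i++)
		if (!strcmp(m->path[i], path))
			return m->msg[i];
	return NULL;
}

/* Append one newline-terminated message for path; returns 0 on success. */
static int record_msg(struct path_msgs *m, const char *path, const char *text)
{
	int i;
	size_t used;

	for (i = 0; i < m->nr; i++)
		if (!strcmp(m->path[i], path))
			break;
	if (i == m->nr) {
		if (m->nr == MAX_PATHS)
			return -1;
		m->path[i] = path;
		m->msg[i][0] = '\0';
		m->nr++;
	}
	used = strlen(m->msg[i]);
	/* Need room for the text, the trailing newline, and the NUL. */
	if (used + strlen(text) + 2 > MSG_CAP)
		return -1;
	snprintf(m->msg[i] + used, MSG_CAP - used, "%s\n", text);
	return 0;
}
```

A caller in this model would record messages while merging and later
call msgs_for_path() for each path it is about to display, which is the
kind of "do your own special thing" use described above.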

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c |  1 +
 merge-ort.h | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index fe27870e73e..c4d6c5c81cc 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4547,6 +4547,7 @@ redo:
 	trace2_region_leave("merge", "process_entries", opt->repo);
 
 	/* Set return values */
+	result->path_messages = &opt->priv->output;
 	result->tree = parse_tree_indirect(&working_tree_oid);
 	/* existence of conflicted entries implies unclean */
 	result->clean &= strmap_empty(&opt->priv->conflicted);
diff --git a/merge-ort.h b/merge-ort.h
index c011864ffeb..fe599b87868 100644
--- a/merge-ort.h
+++ b/merge-ort.h
@@ -5,6 +5,7 @@
 
 struct commit;
 struct tree;
+struct strmap;
 
 struct merge_result {
 	/*
@@ -23,6 +24,15 @@ struct merge_result {
 	 */
 	struct tree *tree;
 
+	/*
+	 * Special messages and conflict notices for various paths
+	 *
+	 * This is a map of pathnames to strbufs.  It contains various
+	 * warning/conflict/notice messages (possibly multiple per path)
+	 * that callers may want to use.
+	 */
+	struct strmap *path_messages;
+
 	/*
 	 * Additional metadata used by merge_switch_to_result() or future calls
 	 * to merge_incore_*().  Includes data needed to update the index (if
-- 
gitgitgadget



* [PATCH 6/9] diff: add ability to insert additional headers for paths
  2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                   ` (4 preceding siblings ...)
  2021-12-21 18:05 ` [PATCH 5/9] merge-ort: make path_messages available to external callers Elijah Newren via GitGitGadget
@ 2021-12-21 18:05 ` Elijah Newren via GitGitGadget
  2021-12-22  0:24   ` Junio C Hamano
  2021-12-21 18:05 ` [PATCH 7/9] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-21 18:05 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

To support the remerge-diff capability that will be added in a few
commits, we want to be able to provide additional headers to show along
with a diff.  Add the plumbing necessary to enable this.
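
The per-line formatting this plumbing relies on can be sketched in
standalone C as follows.  This is only an approximation under stated
assumptions: format_headers() is a hypothetical helper writing into a
fixed buffer rather than a strbuf, and the prefix, meta and reset
strings are whatever the caller passes in.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write each line of headers to out as
 *     line_prefix + meta + line + reset + "\n".
 * A final line without a trailing newline is emitted as well.
 * Returns the number of bytes written, or -1 if out is too small.
 */
static int format_headers(char *out, size_t cap, const char *headers,
			  const char *line_prefix, const char *meta,
			  const char *reset)
{
	size_t used = 0;
	const char *next = headers;

	assert(cap > 0);
	out[0] = '\0';
	while (*next) {
		const char *newline = strchr(next, '\n');
		size_t len = newline ? (size_t)(newline - next) : strlen(next);
		int n = snprintf(out + used, cap - used, "%s%s%.*s%s\n",
				 line_prefix, meta, (int)len, next, reset);

		if (n < 0 || (size_t)n >= cap - used)
			return -1;
		used += (size_t)n;
		/* Skip the line and, if present, its newline. */
		next += len + (newline ? 1 : 0);
	}
	return (int)used;
}
```

Unlike the patch, this sketch never writes to the input string; it uses
a "%.*s" precision to print each line in place.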

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 diff.c | 34 +++++++++++++++++++++++++++++++++-
 diff.h |  1 +
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/diff.c b/diff.c
index 861282db1c3..a9490b9b2ba 100644
--- a/diff.c
+++ b/diff.c
@@ -27,6 +27,7 @@
 #include "help.h"
 #include "promisor-remote.h"
 #include "dir.h"
+#include "strmap.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
@@ -3406,6 +3407,33 @@ struct userdiff_driver *get_textconv(struct repository *r,
 	return userdiff_get_textconv(r, one->driver);
 }
 
+static struct strbuf* additional_headers(struct diff_options *o,
+					 const char *path)
+{
+	if (!o->additional_path_headers)
+		return NULL;
+	return strmap_get(o->additional_path_headers, path);
+}
+
+static void add_formatted_headers(struct strbuf *msg,
+				  struct strbuf *more_headers,
+				  const char *line_prefix,
+				  const char *meta,
+				  const char *reset)
+{
+	char *next, *newline;
+
+	next = more_headers->buf;
+	while ((newline = strchr(next, '\n'))) {
+		*newline = '\0';
+		strbuf_addf(msg, "%s%s%s%s\n", line_prefix, meta, next, reset);
+		*newline = '\n';
+		next = newline + 1;
+	}
+	if (*next)
+		strbuf_addf(msg, "%s%s%s%s\n", line_prefix, meta, next, reset);
+}
+
 static void builtin_diff(const char *name_a,
 			 const char *name_b,
 			 struct diff_filespec *one,
@@ -4328,9 +4356,13 @@ static void fill_metainfo(struct strbuf *msg,
 	const char *set = diff_get_color(use_color, DIFF_METAINFO);
 	const char *reset = diff_get_color(use_color, DIFF_RESET);
 	const char *line_prefix = diff_line_prefix(o);
+	struct strbuf *more_headers = NULL;
 
 	*must_show_header = 1;
 	strbuf_init(msg, PATH_MAX * 2 + 300);
+	if ((more_headers = additional_headers(o, name)))
+		add_formatted_headers(msg, more_headers,
+				      line_prefix, set, reset);
 	switch (p->status) {
 	case DIFF_STATUS_COPIED:
 		strbuf_addf(msg, "%s%ssimilarity index %d%%",
@@ -5852,7 +5884,7 @@ int diff_unmodified_pair(struct diff_filepair *p)
 
 static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o)
 {
-	if (diff_unmodified_pair(p))
+	if (diff_unmodified_pair(p) && !additional_headers(o, p->one->path))
 		return;
 
 	if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) ||
diff --git a/diff.h b/diff.h
index 8ba85c5e605..289badf5643 100644
--- a/diff.h
+++ b/diff.h
@@ -395,6 +395,7 @@ struct diff_options {
 
 	struct repository *repo;
 	struct option *parseopts;
+	struct strmap *additional_path_headers;
 
 	int no_free;
 };
-- 
gitgitgadget



* [PATCH 7/9] merge-ort: format messages slightly different for use in headers
  2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                   ` (5 preceding siblings ...)
  2021-12-21 18:05 ` [PATCH 6/9] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
@ 2021-12-21 18:05 ` Elijah Newren via GitGitGadget
  2021-12-21 18:05 ` [PATCH 8/9] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-21 18:05 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

We want to add an ability for users to run
    git show --remerge-diff $MERGE_COMMIT
or even
    git log -p --remerge-diff ...
and have git show the differences between where the merge machinery
would stop and what is recorded in merge commits.  However, in such
cases, stdout is not an appropriate location to dump conflict messages.
We instead want these messages to appear as headers in the subsequent
diff.  For them to work as headers, though, every newline within a
multiline message must be followed by a space, so that continuation
lines are indented.  Add a new flag to signal when we want messages
modified this way, and use it in path_msg() to do so.
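
The newline handling described here can be sketched on its own, outside
of git.  The helper name indent_continuation_lines() is hypothetical,
and the sketch allocates a fresh string with malloc() instead of
appending to an existing strbuf.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of msg in which every '\n' is followed
 * by a space, so that continuation lines are indented.  The caller
 * frees the result.  Returns NULL if allocation fails.
 */
static char *indent_continuation_lines(const char *msg)
{
	size_t len, o = 0;
	char *out;

	assert(msg);
	len = strlen(msg);
	/* Worst case: every byte is a newline, so the output doubles. */
	out = malloc(2 * len + 1);
	if (!out)
		return NULL;
	for (size_t i = 0; i < len; i++) {
		out[o++] = msg[i];
		if (msg[i] == '\n')
			out[o++] = ' ';
	}
	out[o] = '\0';
	return out;
}
```

Keeping a single output index that advances for both the copied byte
and the inserted space avoids having to track two offsets separately.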

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c       | 36 ++++++++++++++++++++++++++++++++++--
 merge-recursive.c |  3 +++
 merge-recursive.h |  1 +
 3 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index c4d6c5c81cc..0ae3e4ffa75 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -634,17 +634,46 @@ static void path_msg(struct merge_options *opt,
 		     const char *fmt, ...)
 {
 	va_list ap;
-	struct strbuf *sb = strmap_get(&opt->priv->output, path);
+	struct strbuf *sb, *dest;
+	struct strbuf tmp = STRBUF_INIT;
+
+	if (opt->record_conflict_msgs_as_headers && omittable_hint)
+		return; /* Do not record mere hints in tree */
+	sb = strmap_get(&opt->priv->output, path);
 	if (!sb) {
 		sb = xmalloc(sizeof(*sb));
 		strbuf_init(sb, 0);
 		strmap_put(&opt->priv->output, path, sb);
 	}
 
+	dest = (opt->record_conflict_msgs_as_headers ? &tmp : sb);
+
 	va_start(ap, fmt);
-	strbuf_vaddf(sb, fmt, ap);
+	strbuf_vaddf(dest, fmt, ap);
 	va_end(ap);
 
+	if (opt->record_conflict_msgs_as_headers) {
+		int i_sb = 0, i_tmp = 0;
+
+		/* Copy tmp to sb, adding spaces after newlines */
+		strbuf_grow(sb, 2*tmp.len); /* more than sufficient */
+		for (; i_tmp < tmp.len; i_tmp++, i_sb++) {
+			/* Copy next character from tmp to sb */
+			sb->buf[sb->len + i_sb] = tmp.buf[i_tmp];
+
+			/* If we copied a newline, add a space */
+			if (tmp.buf[i_tmp] == '\n')
+				sb->buf[sb->len + ++i_sb] = ' ';
+		}
+		/* Update length and ensure it's NUL-terminated */
+		sb->len += i_sb;
+		sb->buf[sb->len] = '\0';
+
+		/* Clean up tmp */
+		strbuf_release(&tmp);
+	}
+
+	/* Add final newline character to sb */
 	strbuf_addch(sb, '\n');
 }
 
@@ -4246,6 +4275,9 @@ void merge_switch_to_result(struct merge_options *opt,
 		struct string_list olist = STRING_LIST_INIT_NODUP;
 		int i;
 
+		if (opt->record_conflict_msgs_as_headers)
+			BUG("Either display conflict messages or record them as headers, not both");
+
 		trace2_region_enter("merge", "display messages", opt->repo);
 
 		/* Hack to pre-allocate olist to the desired size */
diff --git a/merge-recursive.c b/merge-recursive.c
index bc73c52dd84..c9ba7e904a6 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3714,6 +3714,9 @@ static int merge_start(struct merge_options *opt, struct tree *head)
 
 	assert(opt->priv == NULL);
 
+	/* Not supported; option specific to merge-ort */
+	assert(!opt->record_conflict_msgs_as_headers);
+
 	/* Sanity check on repo state; index must match head */
 	if (repo_index_has_changes(opt->repo, head, &sb)) {
 		err(opt, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
diff --git a/merge-recursive.h b/merge-recursive.h
index 0795a1d3ec1..ebfdb7f994e 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -46,6 +46,7 @@ struct merge_options {
 	/* miscellaneous control options */
 	const char *subtree_shift;
 	unsigned renormalize : 1;
+	unsigned record_conflict_msgs_as_headers : 1;
 
 	/* internal fields used by the implementation */
 	struct merge_options_internal *priv;
-- 
gitgitgadget



* [PATCH 8/9] show, log: provide a --remerge-diff capability
  2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                   ` (6 preceding siblings ...)
  2021-12-21 18:05 ` [PATCH 7/9] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
@ 2021-12-21 18:05 ` Elijah Newren via GitGitGadget
  2021-12-21 21:23   ` Ævar Arnfjörð Bjarmason
  2021-12-21 18:05 ` [PATCH 9/9] doc/diff-options: explain the new --remerge-diff option Elijah Newren via GitGitGadget
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-21 18:05 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When this option is specified, we remerge all two-parent merge commits
and diff the actual merge commit against the automatically created
version, in order to show how users removed conflict markers, resolved
the different conflict versions, and potentially added new changes
outside of conflict regions in order to resolve semantic merge problems
(or, possibly, just to hide other random changes).

This capability works by creating a temporary object directory and
marking it as the primary object store, so that any blobs or trees
created during the automatic merge can be easily removed afterwards by
just deleting all objects from the temporary object directory.  We can
do this after handling each merge commit, in order to avoid the need to
worry about doing `git gc --auto` runs while running `git log
--remerge-diff`.
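
The effect of discarding after each merge commit can be shown with a
toy model.  Nothing below is git's tmp-objdir API; scratch_store and
walk_merges() are hypothetical and merely count how many temporary
objects are alive at once, with and without the per-commit discard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scratch store: counts objects written since the last discard. */
struct scratch_store {
	size_t live;   /* objects currently held */
	size_t peak;   /* largest value live has reached */
};

static void scratch_write(struct scratch_store *s, size_t nr_objects)
{
	s->live += nr_objects;
	if (s->live > s->peak)
		s->peak = s->live;
}

static void scratch_discard(struct scratch_store *s)
{
	s->live = 0;
}

/*
 * Simulate a walk over merge commits; objects_per_merge[i] is how many
 * temporary objects the i-th remerge writes.  Returns the peak number
 * of objects that were alive at the same time.
 */
static size_t walk_merges(const size_t *objects_per_merge, size_t nr,
			  int discard_after_each)
{
	struct scratch_store s = { 0 };

	assert(objects_per_merge || !nr);
	for (size_t i = 0; i < nr; i++) {
		scratch_write(&s, objects_per_merge[i]);
		/* The diff against the recorded merge would be shown here. */
		if (discard_after_each)
			scratch_discard(&s);
	}
	return s.peak;
}
```

With the discard, the peak is bounded by the largest single remerge;
without it, the temporary objects pile up over the whole walk.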

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/log.c | 16 ++++++++++++
 diff-merges.c | 12 +++++++++
 log-tree.c    | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++
 revision.h    |  6 ++++-
 4 files changed, 103 insertions(+), 1 deletion(-)

diff --git a/builtin/log.c b/builtin/log.c
index f75d87e8d7f..2b51d8b6aae 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -35,6 +35,8 @@
 #include "repository.h"
 #include "commit-reach.h"
 #include "range-diff.h"
+#include "dir.h"
+#include "tmp-objdir.h"
 
 #define MAIL_DEFAULT_WRAP 72
 #define COVER_FROM_AUTO_MAX_SUBJECT_LEN 100
@@ -407,6 +409,13 @@ static int cmd_log_walk(struct rev_info *rev)
 	int saved_nrl = 0;
 	int saved_dcctc = 0;
 
+	if (rev->remerge_diff) {
+		rev->remerge_objdir = tmp_objdir_create("remerge-diff");
+		if (!rev->remerge_objdir)
+			die(_("unable to create temporary object directory"));
+		tmp_objdir_replace_primary_odb(rev->remerge_objdir, 1);
+	}
+
 	if (rev->early_output)
 		setup_early_output();
 
@@ -449,6 +458,11 @@ static int cmd_log_walk(struct rev_info *rev)
 	rev->diffopt.no_free = 0;
 	diff_free(&rev->diffopt);
 
+	if (rev->remerge_diff) {
+		tmp_objdir_destroy(rev->remerge_objdir);
+		rev->remerge_objdir = NULL;
+	}
+
 	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
 	    rev->diffopt.flags.check_failed) {
 		return 02;
@@ -1943,6 +1957,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		die(_("--name-status does not make sense"));
 	if (rev.diffopt.output_format & DIFF_FORMAT_CHECKDIFF)
 		die(_("--check does not make sense"));
+	if (rev.remerge_diff)
+		die(_("--remerge_diff does not make sense"));
 
 	if (!use_patch_format &&
 		(!rev.diffopt.output_format ||
diff --git a/diff-merges.c b/diff-merges.c
index 5060ccd890b..0af4b3f9191 100644
--- a/diff-merges.c
+++ b/diff-merges.c
@@ -17,6 +17,7 @@ static void suppress(struct rev_info *revs)
 	revs->combined_all_paths = 0;
 	revs->merges_imply_patch = 0;
 	revs->merges_need_diff = 0;
+	revs->remerge_diff = 0;
 }
 
 static void set_separate(struct rev_info *revs)
@@ -45,6 +46,12 @@ static void set_dense_combined(struct rev_info *revs)
 	revs->dense_combined_merges = 1;
 }
 
+static void set_remerge_diff(struct rev_info *revs)
+{
+	suppress(revs);
+	revs->remerge_diff = 1;
+}
+
 static diff_merges_setup_func_t func_by_opt(const char *optarg)
 {
 	if (!strcmp(optarg, "off") || !strcmp(optarg, "none"))
@@ -57,6 +64,8 @@ static diff_merges_setup_func_t func_by_opt(const char *optarg)
 		return set_combined;
 	else if (!strcmp(optarg, "cc") || !strcmp(optarg, "dense-combined"))
 		return set_dense_combined;
+	else if (!strcmp(optarg, "r") || !strcmp(optarg, "remerge"))
+		return set_remerge_diff;
 	else if (!strcmp(optarg, "m") || !strcmp(optarg, "on"))
 		return set_to_default;
 	return NULL;
@@ -110,6 +119,9 @@ int diff_merges_parse_opts(struct rev_info *revs, const char **argv)
 	} else if (!strcmp(arg, "--cc")) {
 		set_dense_combined(revs);
 		revs->merges_imply_patch = 1;
+	} else if (!strcmp(arg, "--remerge-diff")) {
+		set_remerge_diff(revs);
+		revs->merges_imply_patch = 1;
 	} else if (!strcmp(arg, "--no-diff-merges")) {
 		suppress(revs);
 	} else if (!strcmp(arg, "--combined-all-paths")) {
diff --git a/log-tree.c b/log-tree.c
index 644893fd8cf..8fef9822a1e 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -1,12 +1,15 @@
 #include "cache.h"
+#include "commit-reach.h"
 #include "config.h"
 #include "diff.h"
 #include "object-store.h"
 #include "repository.h"
+#include "tmp-objdir.h"
 #include "commit.h"
 #include "tag.h"
 #include "graph.h"
 #include "log-tree.h"
+#include "merge-ort.h"
 #include "reflog-walk.h"
 #include "refs.h"
 #include "string-list.h"
@@ -16,6 +19,7 @@
 #include "line-log.h"
 #include "help.h"
 #include "range-diff.h"
+#include "dir.h"
 
 static struct decoration name_decoration = { "object names" };
 static int decoration_loaded;
@@ -902,6 +906,60 @@ static int do_diff_combined(struct rev_info *opt, struct commit *commit)
 	return !opt->loginfo;
 }
 
+static int do_remerge_diff(struct rev_info *opt,
+			   struct commit_list *parents,
+			   struct object_id *oid,
+			   struct commit *commit)
+{
+	struct merge_options o;
+	struct commit_list *bases;
+	struct merge_result res;
+	struct pretty_print_context ctx = {0};
+	struct strbuf commit1 = STRBUF_INIT;
+	struct strbuf commit2 = STRBUF_INIT;
+
+	/* Setup merge options */
+	init_merge_options(&o, the_repository);
+	memset(&res, 0, sizeof(res));
+	o.show_rename_progress = 0;
+
+	ctx.abbrev = DEFAULT_ABBREV;
+	format_commit_message(parents->item,       "%h (%s)", &commit1, &ctx);
+	format_commit_message(parents->next->item, "%h (%s)", &commit2, &ctx);
+	o.branch1 = commit1.buf;
+	o.branch2 = commit2.buf;
+	o.record_conflict_msgs_as_headers = 1;
+
+	/* Parse the relevant commits and get the merge bases */
+	parse_commit_or_die(parents->item);
+	parse_commit_or_die(parents->next->item);
+	bases = get_merge_bases(parents->item, parents->next->item);
+
+	/* Re-merge the parents */
+	merge_incore_recursive(&o,
+			       bases, parents->item, parents->next->item,
+			       &res);
+
+	/* Show the diff */
+	opt->diffopt.additional_path_headers = res.path_messages;
+	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
+	log_tree_diff_flush(opt);
+
+	/* Cleanup */
+	opt->diffopt.additional_path_headers = NULL;
+	strbuf_release(&commit1);
+	strbuf_release(&commit2);
+	merge_finalize(&o, &res);
+
+	/* Clean up the temporary object directory */
+	if (opt->remerge_objdir != NULL)
+		tmp_objdir_discard_objects(opt->remerge_objdir);
+	else
+		BUG("unable to remove temporary object directory");
+
+	return !opt->loginfo;
+}
+
 /*
  * Show the diff of a commit.
  *
@@ -936,6 +994,18 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log
 	}
 
 	if (is_merge) {
+		int octopus = (parents->next->next != NULL);
+
+		if (opt->remerge_diff) {
+			if (octopus) {
+				show_log(opt);
+				fprintf(opt->diffopt.file,
+					"diff: warning: Skipping remerge-diff "
+					"for octopus merges.\n");
+				return 1;
+			}
+			return do_remerge_diff(opt, parents, oid, commit);
+		}
 		if (opt->combine_merges)
 			return do_diff_combined(opt, commit);
 		if (opt->separate_merges) {
diff --git a/revision.h b/revision.h
index 5578bb4720a..44efce3f410 100644
--- a/revision.h
+++ b/revision.h
@@ -195,7 +195,8 @@ struct rev_info {
 			combine_merges:1,
 			combined_all_paths:1,
 			dense_combined_merges:1,
-			first_parent_merges:1;
+			first_parent_merges:1,
+			remerge_diff:1;
 
 	/* Format info */
 	int		show_notes;
@@ -317,6 +318,9 @@ struct rev_info {
 
 	/* misc. flags related to '--no-kept-objects' */
 	unsigned keep_pack_cache_flags;
+
+	/* Location where temporary objects for remerge-diff are written. */
+	struct tmp_objdir *remerge_objdir;
 };
 
 int ref_excluded(struct string_list *, const char *path);
-- 
gitgitgadget



* [PATCH 9/9] doc/diff-options: explain the new --remerge-diff option
  2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                   ` (7 preceding siblings ...)
  2021-12-21 18:05 ` [PATCH 8/9] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
@ 2021-12-21 18:05 ` Elijah Newren via GitGitGadget
  2021-12-21 21:28   ` Ævar Arnfjörð Bjarmason
  2021-12-21 23:20 ` [PATCH 0/9] Add a new --remerge-diff capability to show & log Junio C Hamano
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-21 18:05 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/diff-options.txt | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index c89d530d3d1..b05f1c9f1c9 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -64,6 +64,14 @@ ifdef::git-log[]
 	each of the parents. Separate log entry and diff is generated
 	for each parent.
 +
+--diff-merges=remerge:::
+--diff-merges=r:::
+--remerge-diff:::
+	With this option, two-parent merge commits are remerged to
+	create a temporary tree object -- potentially containing files
+	with conflict markers and such.  A diff is then shown between
+	that temporary tree and the actual merge commit.
++
 --diff-merges=combined:::
 --diff-merges=c:::
 -c:::
-- 
gitgitgadget


* Re: [PATCH 2/9] ll-merge: make callers responsible for showing warnings
  2021-12-21 18:05 ` [PATCH 2/9] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
@ 2021-12-21 21:19   ` Ævar Arnfjörð Bjarmason
  2021-12-21 21:57     ` Elijah Newren
  2021-12-21 23:44   ` Junio C Hamano
  1 sibling, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-21 21:19 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Neeraj Singh, Elijah Newren


On Tue, Dec 21 2021, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>

> +	if (status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			"base", "ours", "theirs");

This & other messages in the series have warning/BUG etc. starting with
upper-case.


* Re: [PATCH 8/9] show, log: provide a --remerge-diff capability
  2021-12-21 18:05 ` [PATCH 8/9] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
@ 2021-12-21 21:23   ` Ævar Arnfjörð Bjarmason
  2021-12-21 22:18     ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-21 21:23 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Neeraj Singh, Elijah Newren


On Tue, Dec 21 2021, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>

> +	if (rev->remerge_diff) {
> +		rev->remerge_objdir = tmp_objdir_create("remerge-diff");
> +		if (!rev->remerge_objdir)
> +			die(_("unable to create temporary object directory"));

It looks like the tmp_objdir_create() API is rather bad about mixing
errors that would come with an errno with others, but shouldn't this be
die_errno() in the case where it would fail due to a syscall? Even
better would be passing a "gentle" to it and have it emit the
appropriate errors.
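
For readers unfamiliar with the distinction: an errno-aware report
appends the system's description of the failure to the message.  The
standalone sketch below illustrates that idea only; describe_failure()
is a hypothetical helper, not git's die_errno(), and it takes the error
number as a parameter instead of reading errno itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "<what>: <strerror(err)>" into buf when err is nonzero, or
 * just "<what>" when there is no error number to report.  Returns buf.
 */
static char *describe_failure(char *buf, size_t cap, const char *what, int err)
{
	assert(buf && cap > 0);
	if (err)
		snprintf(buf, cap, "%s: %s", what, strerror(err));
	else
		snprintf(buf, cap, "%s", what);
	return buf;
}
```

The catch raised above still applies to this model: the extra text is
only meaningful when the failure really came from a call that set
errno, so the caller has to know which kind of failure it is reporting.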

> +	if (rev.remerge_diff)
> +		die(_("--remerge_diff does not make sense"));

s/_/-/


> +	struct merge_options o;
> +	struct commit_list *bases;
> +	struct merge_result res;

nit: could use "= { 0 }" instead of memset below.
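
A small standalone comparison of the two spellings, using a
hypothetical struct demo_result in place of struct merge_result, and
assuming the usual platforms where all-bits-zero is a null pointer:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct with pointer and integer members. */
struct demo_result {
	void *tree;
	int clean;
	void *priv;
};

/* Zero the struct after declaring it, as the patch does. */
static struct demo_result init_with_memset(void)
{
	struct demo_result res;

	memset(&res, 0, sizeof(res));
	return res;
}

/* Zero every member at the point of declaration. */
static struct demo_result init_with_initializer(void)
{
	struct demo_result res = { 0 };

	return res;
}
```

Both leave every member zeroed; the initializer form just keeps the
zeroing next to the declaration.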

> +	/* Re-merge the parents */
> +	merge_incore_recursive(&o,
> +			       bases, parents->item, parents->next->item,
> +			       &res);

style: odd not to have arguments that fit on the line on
the line, i.e. "&o, bases, ...".

> +	/* Clean up the temporary object directory */
> +	if (opt->remerge_objdir != NULL)

style: if (!x) not if (x != NULL)


* Re: [PATCH 9/9] doc/diff-options: explain the new --remerge-diff option
  2021-12-21 18:05 ` [PATCH 9/9] doc/diff-options: explain the new --remerge-diff option Elijah Newren via GitGitGadget
@ 2021-12-21 21:28   ` Ævar Arnfjörð Bjarmason
  2021-12-21 22:24     ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-21 21:28 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Neeraj Singh, Elijah Newren


On Tue, Dec 21 2021, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  Documentation/diff-options.txt | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index c89d530d3d1..b05f1c9f1c9 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -64,6 +64,14 @@ ifdef::git-log[]
>  	each of the parents. Separate log entry and diff is generated
>  	for each parent.
>  +
> +--diff-merges=remerge:::
> +--diff-merges=r:::
> +--remerge-diff:::
> +	With this option, two-parent merge commits are remerged to
> +	create a temporary tree object -- potentially containing files
> +	with conflict markers and such.  A diff is then shown between
> +	that temporary tree and the actual merge commit.
> ++
>  --diff-merges=combined:::
>  --diff-merges=c:::
>  -c:::

This & 5/9 would I think be better squashed into their respective "main"
patches.


* Re: [PATCH 2/9] ll-merge: make callers responsible for showing warnings
  2021-12-21 21:19   ` Ævar Arnfjörð Bjarmason
@ 2021-12-21 21:57     ` Elijah Newren
  2021-12-21 23:02       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren @ 2021-12-21 21:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh

On Tue, Dec 21, 2021 at 1:21 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Tue, Dec 21 2021, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
>
> > +     if (status == LL_MERGE_BINARY_CONFLICT)
> > +             warning("Cannot merge binary files: %s (%s vs. %s)",
> > +                     "base", "ours", "theirs");
>
> This & other messages in the series have warning/BUG etc. starting with
> upper-case.

Yes, but I'm not introducing a new message here; I'm merely moving an
existing one.  It's important to me that readers of this patch be able
to verify that I have made no functional changes in this patch, so
fixing the case should definitely be a different patch from this one.
I kind of think that fixing the case distracts a bit from the point of
the series, and the series is already kind of long, but do you feel
strongly that I should fix the case with a new patch inserted into the
series?


* Re: [PATCH 8/9] show, log: provide a --remerge-diff capability
  2021-12-21 21:23   ` Ævar Arnfjörð Bjarmason
@ 2021-12-21 22:18     ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2021-12-21 22:18 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh

On Tue, Dec 21, 2021 at 1:28 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Tue, Dec 21 2021, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
>
> > +     if (rev->remerge_diff) {
> > +             rev->remerge_objdir = tmp_objdir_create("remerge-diff");
> > +             if (!rev->remerge_objdir)
> > +                     die(_("unable to create temporary object directory"));
>
> It looks like the tmp_objdir_create() API is rather bad about mixing
> errors that would come with an errno with others, but shouldn't this be
> die_errno() in the case where it would fail due to a syscall? Even
> better would be passing a "gentle" to it and have it emit the
> appropriate errors.

I can switch to die_errno().

>
> > +     if (rev.remerge_diff)
> > +             die(_("--remerge_diff does not make sense"));
>
> s/_/-/

Indeed, thanks.

> > +     struct merge_options o;
> > +     struct commit_list *bases;
> > +     struct merge_result res;
>
> nit: could use "= { 0 }" instead of memset below.

Sure, I can make that change.

> > +     /* Re-merge the parents */
> > +     merge_incore_recursive(&o,
> > +                            bases, parents->item, parents->next->item,
> > +                            &res);
>
> style: odd not to have arguments that fit on the line on
> the line, i.e. "&o, bases, ...".

Yes, but this groups all the ancestors so nicely as opposed to the
typical happenstance of whatever fits on a line.  ;-)

> > +     /* Clean up the temporary object directory */
> > +     if (opt->remerge_objdir != NULL)
>
> style: if (!x) not if (x != NULL)

Ok, will change.


* Re: [PATCH 9/9] doc/diff-options: explain the new --remerge-diff option
  2021-12-21 21:28   ` Ævar Arnfjörð Bjarmason
@ 2021-12-21 22:24     ` Elijah Newren
  2021-12-21 23:47       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren @ 2021-12-21 22:24 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh

On Tue, Dec 21, 2021 at 1:29 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Tue, Dec 21 2021, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  Documentation/diff-options.txt | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> > index c89d530d3d1..b05f1c9f1c9 100644
> > --- a/Documentation/diff-options.txt
> > +++ b/Documentation/diff-options.txt
> > @@ -64,6 +64,14 @@ ifdef::git-log[]
> >       each of the parents. Separate log entry and diff is generated
> >       for each parent.
> >  +
> > +--diff-merges=remerge:::
> > +--diff-merges=r:::
> > +--remerge-diff:::
> > +     With this option, two-parent merge commits are remerged to
> > +     create a temporary tree object -- potentially containing files
> > +     with conflict markers and such.  A diff is then shown between
> > +     that temporary tree and the actual merge commit.
> > ++
> >  --diff-merges=combined:::
> >  --diff-merges=c:::
> >  -c:::
>
> This & 5/9 would I think be better squashed into their respective "main"
> patches.

I presume you mean the "main" patch for this one is 8/9.  I was trying
to find a way to break up that large patch, but this is pretty small
so...sure I'll squash it in.

What are you referring to as the "main" patch for 5/9, though?  It
only seems related to 6/9 and 7/9 to me, but I very deliberately split
those patches off and don't want to confuse them with unrelated
changes.  I disagree with combining 5/9 with either of those.


* Re: [PATCH 2/9] ll-merge: make callers responsible for showing warnings
  2021-12-21 21:57     ` Elijah Newren
@ 2021-12-21 23:02       ` Ævar Arnfjörð Bjarmason
  2021-12-21 23:15         ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-21 23:02 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh


On Tue, Dec 21 2021, Elijah Newren wrote:

> On Tue, Dec 21, 2021 at 1:21 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> On Tue, Dec 21 2021, Elijah Newren via GitGitGadget wrote:
>>
>> > From: Elijah Newren <newren@gmail.com>
>>
>> > +     if (status == LL_MERGE_BINARY_CONFLICT)
>> > +             warning("Cannot merge binary files: %s (%s vs. %s)",
>> > +                     "base", "ours", "theirs");
>>
>> This & other messages in the series have warning/BUG etc. starting with
>> upper-case.
>
> Yes, but I'm not introducing a new message here; I'm merely moving an
> existing one.  It's important to me that readers of this patch be able
> to verify that I have made no functional changes in this patch, so
> fixing the case should definitely be a different patch from this one.
> I kind of think that fixing the case distracts a bit from the point of
> the series, and the series is already kind of long, but do you feel
> strongly that I should fix the case with a new patch inserted into the
> series?

I just missed the bit where it was moved from below in the diff. Sorry
about the noise.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 2/9] ll-merge: make callers responsible for showing warnings
  2021-12-21 23:02       ` Ævar Arnfjörð Bjarmason
@ 2021-12-21 23:15         ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2021-12-21 23:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh

On Tue, Dec 21, 2021 at 3:03 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Tue, Dec 21 2021, Elijah Newren wrote:
>
> > On Tue, Dec 21, 2021 at 1:21 PM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >>
> >> On Tue, Dec 21 2021, Elijah Newren via GitGitGadget wrote:
> >>
> >> > From: Elijah Newren <newren@gmail.com>
> >>
> >> > +     if (status == LL_MERGE_BINARY_CONFLICT)
> >> > +             warning("Cannot merge binary files: %s (%s vs. %s)",
> >> > +                     "base", "ours", "theirs");
> >>
> >> This & other messages in the series have warning/BUG etc. starting with
> >> upper-case.
> >
> > Yes, but I'm not introducing a new message here; I'm merely moving an
> > existing one.  It's important to me that readers of this patch be able
> > to verify that I have made no functional changes in this patch, so
> > fixing the case should definitely be a different patch from this one.
> > I kind of think that fixing the case distracts a bit from the point of
> > the series, and the series is already kind of long, but do you feel
> > strongly that I should fix the case with a new patch inserted into the
> > series?
>
> I just missed the bit where it was moved from below in the diff. Sorry
> about the noise.

Nah, no worries; thanks for taking a look at the patches!

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 0/9] Add a new --remerge-diff capability to show & log
  2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                   ` (8 preceding siblings ...)
  2021-12-21 18:05 ` [PATCH 9/9] doc/diff-options: explain the new --remerge-diff option Elijah Newren via GitGitGadget
@ 2021-12-21 23:20 ` Junio C Hamano
  2021-12-21 23:43   ` Elijah Newren
  2021-12-22  0:33 ` Junio C Hamano
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
  11 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2021-12-21 23:20 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Here are some patches to add a --remerge-diff capability to show & log,
> which works by comparing merge commits to an automatic remerge (note that
> the automatic remerge tree can contain files with conflict markers).
>
> Changes since original submission[1]:
>
>  * Rebased on top of the version of ns/tmp-objdir that Neeraj submitted
>    (Neeraj's patches were based on v2.34, but ns/tmp-objdir got applied on
>    an old commit and does not even build because of that).

Oh, that's bad.  I wish people would not rebase their updates on top
of newer 'master' only for the sake of it, once an older version is
queued.

>  * Modify ll-merge API to return a status, instead of printing "Cannot merge
>    binary files" on stdout[2] (as suggested by Peff)

I wondered if we want to do the same for other error messages to
give callers greater control, but this change by itself already
looks quite good.

>  * Make conflict messages and other such warnings into diff headers of the
>    subsequent remerge-diff rather than appearing in the diff as file content
>    of some funny looking filenames (as suggested by Peff[3] and Junio[4])

OK.

>  * Sergey ack'ed the diff-merges.c portion of the patches, but that wasn't
>    limited to one patch so not sure where to record that ack.

On that single patch?


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 1/9] tmp_objdir: add a helper function for discarding all contained objects
  2021-12-21 18:05 ` [PATCH 1/9] tmp_objdir: add a helper function for discarding all contained objects Elijah Newren via GitGitGadget
@ 2021-12-21 23:26   ` Junio C Hamano
  2021-12-21 23:51     ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2021-12-21 23:26 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  tmp-objdir.c | 5 +++++
>  tmp-objdir.h | 6 ++++++
>  2 files changed, 11 insertions(+)
>
> diff --git a/tmp-objdir.c b/tmp-objdir.c
> index 3d38eeab66b..adf6033549e 100644
> --- a/tmp-objdir.c
> +++ b/tmp-objdir.c
> @@ -79,6 +79,11 @@ static void remove_tmp_objdir_on_signal(int signo)
>  	raise(signo);
>  }
>  
> +void tmp_objdir_discard_objects(struct tmp_objdir *t)
> +{
> +	remove_dir_recursively(&t->path, REMOVE_DIR_KEEP_TOPLEVEL);
> +}
> +

OK.

Without a caller, it is a bit hard to judge if a separate helper
makes the caller easier to read and understand, or becomes an extra
layer of abstraction that obscures the logic.  Hopefully, having a
more specific function name with "tmp" and "discard" in it makes the
intent at callers more clear than the function that is named after
the detail of the operation.

> +/*
> + * Remove all objects from the temporary object directory, while leaving it
> + * around so more objects can be added.
> + */
> +void tmp_objdir_discard_objects(struct tmp_objdir *);
> +
>  /*
>   * Add the temporary object directory as an alternate object store in the
>   * current process.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 0/9] Add a new --remerge-diff capability to show & log
  2021-12-21 23:20 ` [PATCH 0/9] Add a new --remerge-diff capability to show & log Junio C Hamano
@ 2021-12-21 23:43   ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2021-12-21 23:43 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 21, 2021 at 3:20 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > Here are some patches to add a --remerge-diff capability to show & log,
> > which works by comparing merge commits to an automatic remerge (note that
> > the automatic remerge tree can contain files with conflict markers).
> >
> > Changes since original submission[1]:
> >
> >  * Rebased on top of the version of ns/tmp-objdir that Neeraj submitted
> >    (Neeraj's patches were based on v2.34, but ns/tmp-objdir got applied on
> >    an old commit and does not even build because of that).
>
> Oh, that's bad.  I wish people would not rebase their updates on top
> of newer 'master' only for the sake of it, once an older version is
> queued.
>
> >  * Modify ll-merge API to return a status, instead of printing "Cannot merge
> >    binary files" on stdout[2] (as suggested by Peff)
>
> I wondered if we want to do the same for other error messages to
> give callers greater control, but this change by itself already
> looks quite good.
>
> >  * Make conflict messages and other such warnings into diff headers of the
> >    subsequent remerge-diff rather than appearing in the diff as file content
> >    of some funny looking filenames (as suggested by Peff[3] and Junio[4])
>
> OK.
>
> >  * Sergey ack'ed the diff-merges.c portion of the patches, but that wasn't
> >    limited to one patch so not sure where to record that ack.
>
> On that single patch?

Yeah, the main patch near the end that changed 4 files; he acked the
changes to just one of those files in that patch.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 2/9] ll-merge: make callers responsible for showing warnings
  2021-12-21 18:05 ` [PATCH 2/9] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
  2021-12-21 21:19   ` Ævar Arnfjörð Bjarmason
@ 2021-12-21 23:44   ` Junio C Hamano
  2021-12-23 18:26     ` Elijah Newren
  1 sibling, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2021-12-21 23:44 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> Since some callers may want to send warning messages to somewhere other
> than stdout/stderr, stop printing "warning: Cannot merge binary files"
> from ll-merge and instead modify the return status of ll_merge() to
> indicate when a merge of binary files has occurred.
>
> Note that my methodology included first modifying ll_merge() to return
> a struct, so that the compiler would catch all the callers for me and
> ensure I had modified all of them.  After modifying all of them, I then
> changed the struct to an enum.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  apply.c            |  5 ++++-
>  builtin/checkout.c | 12 ++++++++----
>  ll-merge.c         | 40 ++++++++++++++++++++++------------------
>  ll-merge.h         |  9 ++++++++-
>  merge-blobs.c      |  5 ++++-
>  merge-ort.c        |  5 ++++-
>  merge-recursive.c  |  5 ++++-
>  notes-merge.c      |  5 ++++-
>  rerere.c           | 10 +++++++---
>  9 files changed, 65 insertions(+), 31 deletions(-)
>
> diff --git a/apply.c b/apply.c
> index 43a0aebf4ee..12ea9c72a6b 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -3492,7 +3492,7 @@ static int three_way_merge(struct apply_state *state,
>  {
>  	mmfile_t base_file, our_file, their_file;
>  	mmbuffer_t result = { NULL };
> -	int status;
> +	enum ll_merge_result status;
>  
>  	/* resolve trivial cases first */
>  	if (oideq(base, ours))
> @@ -3509,6 +3509,9 @@ static int three_way_merge(struct apply_state *state,
>  			  &their_file, "theirs",
>  			  state->repo->index,
>  			  NULL);
> +	if (status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			"base", "ours", "theirs");

This used to come from ll_merge()

> -			warning("Cannot merge binary files: %s (%s vs. %s)",
> -				path, name1, name2);
> -			/* fallthru */

And our call to ll_merge() above (half of it invisible in the
pre-context of the hunk) gave "ours" and "theirs" to our_label and
their_label, which in turn are called name1 and name2, respectively,
in ll_merge_binary() driver.

I am not sure about the "base" string, though.  I suspect that your
"base" should be a reference to the parameter 'path' of three_way_merge()
function.
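
To illustrate the point outside of git, here is a minimal sketch of a
caller-side warning that puts the path in the first "%s" slot.  The
names merge_result, toy_merge() and describe_binary_conflict() are
made up for this example and are not part of ll-merge; only the
message format is taken from the patch.

```c
#include <stdio.h>

/* Hypothetical status codes, loosely modelled on the ones in the patch. */
enum merge_result {
	MERGE_ERROR = -1,
	MERGE_OK = 0,
	MERGE_CONFLICT,
	MERGE_BINARY_CONFLICT
};

/* Hypothetical low-level merge: reports via its status, prints nothing. */
static enum merge_result toy_merge(int is_binary, int has_conflict)
{
	if (is_binary)
		return MERGE_BINARY_CONFLICT;
	return has_conflict ? MERGE_CONFLICT : MERGE_OK;
}

/*
 * The caller owns the message.  The first %s is the path being merged,
 * not the label of the common ancestor.  Returns the message length,
 * or 0 when there is nothing to report.
 */
static int describe_binary_conflict(char *out, size_t outsz,
				    enum merge_result status,
				    const char *path,
				    const char *ours, const char *theirs)
{
	if (status != MERGE_BINARY_CONFLICT) {
		if (outsz)
			out[0] = '\0';
		return 0;
	}
	return snprintf(out, outsz,
			"warning: Cannot merge binary files: %s (%s vs. %s)",
			path, ours, theirs);
}
```

Because the low-level function only returns a status, each caller
can pick both the destination and the arguments of the message.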

> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index cbf73b8c9f6..3a559d69303 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -237,6 +237,7 @@ static int checkout_merged(int pos, const struct checkout *state,
>  	struct cache_entry *ce = active_cache[pos];
>  	const char *path = ce->name;
>  	mmfile_t ancestor, ours, theirs;
> +	enum ll_merge_result merge_status;
>  	int status;
>  	struct object_id oid;
>  	mmbuffer_t result_buf;
> @@ -267,13 +268,16 @@ static int checkout_merged(int pos, const struct checkout *state,
>  	memset(&ll_opts, 0, sizeof(ll_opts));
>  	git_config_get_bool("merge.renormalize", &renormalize);
>  	ll_opts.renormalize = renormalize;
> -	status = ll_merge(&result_buf, path, &ancestor, "base",
> -			  &ours, "ours", &theirs, "theirs",
> -			  state->istate, &ll_opts);
> +	merge_status = ll_merge(&result_buf, path, &ancestor, "base",
> +				&ours, "ours", &theirs, "theirs",
> +				state->istate, &ll_opts);
>  	free(ancestor.ptr);
>  	free(ours.ptr);
>  	free(theirs.ptr);
> -	if (status < 0 || !result_buf.ptr) {
> +	if (merge_status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			path, "ours", "theirs");

This one looks correct.

> +	if (merge_status < 0 || !result_buf.ptr) {
>  		free(result_buf.ptr);
>  		return error(_("path '%s': cannot merge"), path);
>  	}

> diff --git a/merge-blobs.c b/merge-blobs.c
> index ee0a0e90c94..8138090f81c 100644
> --- a/merge-blobs.c
> +++ b/merge-blobs.c
> @@ -36,7 +36,7 @@ static void *three_way_filemerge(struct index_state *istate,
>  				 mmfile_t *their,
>  				 unsigned long *size)
>  {
> -	int merge_status;
> +	enum ll_merge_result merge_status;
>  	mmbuffer_t res;
>  
>  	/*
> @@ -50,6 +50,9 @@ static void *three_way_filemerge(struct index_state *istate,
>  				istate, NULL);
>  	if (merge_status < 0)
>  		return NULL;
> +	if (merge_status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			path, ".our", ".their");

OK.

> diff --git a/merge-ort.c b/merge-ort.c
> index 0342f104836..c24da2ba3cb 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -1743,7 +1743,7 @@ static int merge_3way(struct merge_options *opt,
>  	mmfile_t orig, src1, src2;
>  	struct ll_merge_options ll_opts = {0};
>  	char *base, *name1, *name2;
> -	int merge_status;
> +	enum ll_merge_result merge_status;
>  
>  	if (!opt->priv->attr_index.initialized)
>  		initialize_attr_index(opt);
> @@ -1787,6 +1787,9 @@ static int merge_3way(struct merge_options *opt,
>  	merge_status = ll_merge(result_buf, path, &orig, base,
>  				&src1, name1, &src2, name2,
>  				&opt->priv->attr_index, &ll_opts);
> +	if (merge_status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			path, name1, name2);

OK; this is your code and I do not have to read it too carefully,
but all we need is conveniently in the pre-context of the hunk ;-).

> diff --git a/merge-recursive.c b/merge-recursive.c
> index d9457797dbb..bc73c52dd84 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -1044,7 +1044,7 @@ static int merge_3way(struct merge_options *opt,
>  	mmfile_t orig, src1, src2;
>  	struct ll_merge_options ll_opts = {0};
>  	char *base, *name1, *name2;
> -	int merge_status;
> +	enum ll_merge_result merge_status;
>  
>  	ll_opts.renormalize = opt->renormalize;
>  	ll_opts.extra_marker_size = extra_marker_size;
> @@ -1090,6 +1090,9 @@ static int merge_3way(struct merge_options *opt,
>  	merge_status = ll_merge(result_buf, a->path, &orig, base,
>  				&src1, name1, &src2, name2,
>  				opt->repo->index, &ll_opts);
> +	if (merge_status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			a->path, name1, name2);

OK.

> diff --git a/notes-merge.c b/notes-merge.c
> index b4a3a903e86..01d596920ea 100644
> --- a/notes-merge.c
> +++ b/notes-merge.c
> @@ -344,7 +344,7 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
>  {
>  	mmbuffer_t result_buf;
>  	mmfile_t base, local, remote;
> -	int status;
> +	enum ll_merge_result status;
>  
>  	read_mmblob(&base, &p->base);
>  	read_mmblob(&local, &p->local);
> @@ -358,6 +358,9 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
>  	free(local.ptr);
>  	free(remote.ptr);
>  
> +	if (status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			oid_to_hex(&p->obj), o->local_ref, o->remote_ref);

This uses another slot in the rotating buffer used by oid_to_hex(),
but I do not think anybody grabbed a pointer to one of them and held
onto it before we got here, so it would be OK.
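
For readers unfamiliar with the pattern, here is a rough standalone
sketch of a rotating static buffer.  It is an analogue, not the real
oid_to_hex(): int_to_str() and the ring size NBUF are assumptions made
up for this illustration.

```c
#include <stdio.h>

#define NBUF 4	/* hypothetical ring size for this sketch */

/*
 * Format "v" into one of NBUF static buffers, used in rotation.
 * A returned pointer stays valid only until NBUF further calls
 * have been made; after that the slot is reused.
 */
static const char *int_to_str(int v)
{
	static char bufs[NBUF][16];
	static unsigned int next;
	char *buf = bufs[next];

	next = (next + 1) % NBUF;
	snprintf(buf, sizeof(bufs[0]), "%d", v);
	return buf;
}
```

Using a slot is harmless as long as nobody is still holding a pointer
obtained NBUF or more calls ago, which is the condition checked above.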

> diff --git a/rerere.c b/rerere.c
> index d83d58df4fb..46fd01819b8 100644
> --- a/rerere.c
> +++ b/rerere.c
> @@ -609,19 +609,23 @@ static int try_merge(struct index_state *istate,
>  		     const struct rerere_id *id, const char *path,
>  		     mmfile_t *cur, mmbuffer_t *result)
>  {
> -	int ret;
> +	enum ll_merge_result ret;
>  	mmfile_t base = {NULL, 0}, other = {NULL, 0};
>  
>  	if (read_mmfile(&base, rerere_path(id, "preimage")) ||
>  	    read_mmfile(&other, rerere_path(id, "postimage")))
> -		ret = 1;
> -	else
> +		ret = LL_MERGE_CONFLICT;
> +	else {

Let's have {} around the if clause now that the corresponding else
clause needs it.

>  		/*
>  		 * A three-way merge. Note that this honors user-customizable
>  		 * low-level merge driver settings.
>  		 */
>  		ret = ll_merge(result, path, &base, NULL, cur, "", &other, "",
>  			       istate, NULL);
> +		if (ret == LL_MERGE_BINARY_CONFLICT)
> +			warning("Cannot merge binary files: %s (%s vs. %s)",
> +				path, "", "");
> +	}

This is a faithful conversion of what should not happen in practice,
as the rerere logic would not be able to reach here.  In a binary
file, we won't be able to identify <<< === >>> blocks, hash the text
in the conflicted block to come up with the conflict ID to find the
preimage and postimage files.  These files are the input to the low
level merge driver call we are making here.

Looking almost good except for a warning message bug I spotted
earlier.

Thanks.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 9/9] doc/diff-options: explain the new --remerge-diff option
  2021-12-21 22:24     ` Elijah Newren
@ 2021-12-21 23:47       ` Ævar Arnfjörð Bjarmason
  2021-12-22 19:05         ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-21 23:47 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh


On Tue, Dec 21 2021, Elijah Newren wrote:

> On Tue, Dec 21, 2021 at 1:29 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>>
>> On Tue, Dec 21 2021, Elijah Newren via GitGitGadget wrote:
>>
>> > From: Elijah Newren <newren@gmail.com>
>> >
>> > Signed-off-by: Elijah Newren <newren@gmail.com>
>> > ---
>> >  Documentation/diff-options.txt | 8 ++++++++
>> >  1 file changed, 8 insertions(+)
>> >
>> > diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
>> > index c89d530d3d1..b05f1c9f1c9 100644
>> > --- a/Documentation/diff-options.txt
>> > +++ b/Documentation/diff-options.txt
>> > @@ -64,6 +64,14 @@ ifdef::git-log[]
>> >       each of the parents. Separate log entry and diff is generated
>> >       for each parent.
>> >  +
>> > +--diff-merges=remerge:::
>> > +--diff-merges=r:::
>> > +--remerge-diff:::
>> > +     With this option, two-parent merge commits are remerged to
>> > +     create a temporary tree object -- potentially containing files
>> > +     with conflict markers and such.  A diff is then shown between
>> > +     that temporary tree and the actual merge commit.
>> > ++
>> >  --diff-merges=combined:::
>> >  --diff-merges=c:::
>> >  -c:::
>>
>> This & 5/9 would I think be better squashed into their respective "main"
>> patches.
>
> I presume you mean the "main" patch for this one is 8/9.  I was trying
> to find a way to break up that large patch, but this is pretty small
> so...sure I'll squash it in.
>
> What are you referring to as the "main" patch for 5/9, though?  It
> only seems related to 6/9 and 7/9 to me, but I very deliberately split
> those patches off and don't want to confuse them with unrelated
> changes.  I disagree with combining 5/9 with either of those.

I just gave it a quick initial skim.

I have sometimes found it a bit harder to review your patches due to
over-splitting.

E.g. (went back and looked) here tmp_objdir_discard_objects() is
introduced in 1/9 but used in 8/9. "path_messages" is then introduced in
5/9 and used in 8/9, no?

Anyway, just a bit of feedback. FWIW not just bikeshedding. I do find
myself stopping at 1/9, paging to 2/9, searching for the function, not
there, checking 3/9 etc.

I realize this is a bit of a stones & glass houses comment, but I find
it a bit easier to review things when a patch is larger vs. having it
split up in a way where preceding steps don't do anything yet except
wait for use by a subsequent patch.

0.02 etc.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 1/9] tmp_objdir: add a helper function for discarding all contained objects
  2021-12-21 23:26   ` Junio C Hamano
@ 2021-12-21 23:51     ` Elijah Newren
  2021-12-22  6:23       ` Junio C Hamano
  0 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren @ 2021-12-21 23:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 21, 2021 at 3:26 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  tmp-objdir.c | 5 +++++
> >  tmp-objdir.h | 6 ++++++
> >  2 files changed, 11 insertions(+)
> >
> > diff --git a/tmp-objdir.c b/tmp-objdir.c
> > index 3d38eeab66b..adf6033549e 100644
> > --- a/tmp-objdir.c
> > +++ b/tmp-objdir.c
> > @@ -79,6 +79,11 @@ static void remove_tmp_objdir_on_signal(int signo)
> >       raise(signo);
> >  }
> >
> > +void tmp_objdir_discard_objects(struct tmp_objdir *t)
> > +{
> > +     remove_dir_recursively(&t->path, REMOVE_DIR_KEEP_TOPLEVEL);
> > +}
> > +
>
> OK.
>
> Without a caller, it is a bit hard to judge if a separate helper
> makes the caller easier to read and understand, or becomes an extra
> layer of abstraction that obscures the logic.  Hopefully, having a
> more specific function name with "tmp" and "discard" in it makes the
> intent at callers more clear than the function that is named after
> the detail of the operation.

This isn't just a convenience; since tmp_objdir is defined in
tmp-objdir.c rather than tmp-objdir.h, t->path is not accessible
outside of tmp-objdir.c.  Because of this, some kind of helper
function is necessary.
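
A minimal sketch of the opaque-struct pattern in question, squeezed
into one file.  The "scratch" type and all of its functions are
hypothetical stand-ins for this illustration, not the tmp-objdir API.

```c
#include <stdlib.h>

/* --- what the header would expose (hypothetical "scratch" API) --- */
struct scratch;				/* incomplete type: fields hidden */
struct scratch *scratch_create(void);
void scratch_add(struct scratch *s);
size_t scratch_count(const struct scratch *s);
void scratch_discard(struct scratch *s);	/* empty it, keep it usable */
void scratch_destroy(struct scratch *s);

/* --- a caller, which sees only the declarations above --- */
static size_t caller_reuses_scratch(struct scratch *s)
{
	scratch_add(s);
	scratch_add(s);
	scratch_discard(s);	/* "s->count = 0" would not compile here */
	scratch_add(s);
	return scratch_count(s);
}

/* --- what the .c file would contain --- */
struct scratch {
	size_t count;
};

struct scratch *scratch_create(void)
{
	return calloc(1, sizeof(struct scratch));
}

void scratch_add(struct scratch *s) { s->count++; }
size_t scratch_count(const struct scratch *s) { return s->count; }
void scratch_discard(struct scratch *s) { s->count = 0; }
void scratch_destroy(struct scratch *s) { free(s); }
```

The caller can only empty the container through scratch_discard(),
which is the same reason a helper is needed here.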

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 3/9] merge-ort: capture and print ll-merge warnings in our preferred fashion
  2021-12-21 18:05 ` [PATCH 3/9] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
@ 2021-12-22  0:00   ` Junio C Hamano
  2021-12-23 18:36     ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2021-12-22  0:00 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> Instead of immediately printing ll-merge warnings to stderr, we save
> them in our output strbuf.  Besides allowing us to move these warnings
> to a special file for --remerge-diff, this has two other benefits for
> regular merges done by merge-ort:
>
>   * The deferral of messages ensures we can print all messages about
>     any given path together (merge-recursive was known to sometimes
>     intersperse messages about other paths, particularly when renames
>     were involved).

I would imagine that with something like this, we can show such a
warning message differently when it happens during an inner
"synthesizing a virtual common ancestor" merge (the most likely
value for "show differently" would be to "squelch"), which may be a
good thing.

>  	if (merge_status == LL_MERGE_BINARY_CONFLICT)
> -		warning("Cannot merge binary files: %s (%s vs. %s)",
> -			path, name1, name2);
> +		path_msg(opt, path, 0,
> +			 "warning: Cannot merge binary files: %s (%s vs. %s)",
> +			 path, name1, name2);
>  

Nice.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 4/9] merge-ort: mark a few more conflict messages as omittable
  2021-12-21 18:05 ` [PATCH 4/9] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
@ 2021-12-22  0:06   ` Junio C Hamano
  2021-12-23 18:38     ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2021-12-22  0:06 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> path_msg() has the ability to mark messages as omittable, designed for
> remerge-diff where we'll instead be showing conflict messages as diff
> headers for a subsequent diff.  While all these messages are very useful
> when trying to create a merge initially, early use with the
> --remerge-diff feature (the only user of this omittable conflict message
> capability), suggests that the particular messages marked in this commit
> are just noise when trying to see what changes users made to create a
> merge commit.

It is likely because when somebody is looking at the output of
remerge-diff, they are mostly concentrating on the _content_ level
merges and they are not keenly looking for a merge whose result is
deposited at a wrong path.  Since what is shown is something that
has already been recorded in the history, we can safely assume that it
is no longer relevant (or "it is way too late to matter"), I would
say, to show these messages about "file location".

> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-ort.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index a18f47e23c5..fe27870e73e 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -2420,7 +2420,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
>  		 */
>  		ci->path_conflict = 1;
>  		if (pair->status == 'A')
> -			path_msg(opt, new_path, 0,
> +			path_msg(opt, new_path, 1,
>  				 _("CONFLICT (file location): %s added in %s "
>  				   "inside a directory that was renamed in %s, "
>  				   "suggesting it should perhaps be moved to "
> @@ -2428,7 +2428,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
>  				 old_path, branch_with_new_path,
>  				 branch_with_dir_rename, new_path);
>  		else
> -			path_msg(opt, new_path, 0,
> +			path_msg(opt, new_path, 1,
>  				 _("CONFLICT (file location): %s renamed to %s "
>  				   "in %s, inside a directory that was renamed "
>  				   "in %s, suggesting it should perhaps be "
> @@ -3825,7 +3825,7 @@ static void process_entry(struct merge_options *opt,
>  				reason = _("add/add");
>  			if (S_ISGITLINK(merged_file.mode))
>  				reason = _("submodule");
> -			path_msg(opt, path, 0,
> +			path_msg(opt, path, 1,
>  				 _("CONFLICT (%s): Merge conflict in %s"),
>  				 reason, path);

I am not as sure about this one as the other two, though.  I guess
in the context of remerge-diff, resolving the add/add conflict into
the same file is also something that happened a long time ago and
these messages are too late to matter the same way as the other two.

OK.



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH 6/9] diff: add ability to insert additional headers for paths
  2021-12-21 18:05 ` [PATCH 6/9] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
@ 2021-12-22  0:24   ` Junio C Hamano
  2021-12-25  2:35     ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2021-12-22  0:24 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> In support of a remerge-diff ability we will add in a few commits, we
> want to be able to provide additional headers to show along with a diff.
> Add the plumbing necessary to enable this.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  diff.c | 34 +++++++++++++++++++++++++++++++++-
>  diff.h |  1 +
>  2 files changed, 34 insertions(+), 1 deletion(-)
>
> diff --git a/diff.c b/diff.c
> index 861282db1c3..a9490b9b2ba 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -27,6 +27,7 @@
>  #include "help.h"
>  #include "promisor-remote.h"
>  #include "dir.h"
> +#include "strmap.h"
>  
>  #ifdef NO_FAST_WORKING_DIRECTORY
>  #define FAST_WORKING_DIRECTORY 0
> @@ -3406,6 +3407,33 @@ struct userdiff_driver *get_textconv(struct repository *r,
>  	return userdiff_get_textconv(r, one->driver);
>  }
>  
> +static struct strbuf* additional_headers(struct diff_options *o,

Style.

> +					 const char *path)
> +{
> +	if (!o->additional_path_headers)
> +		return NULL;
> +	return strmap_get(o->additional_path_headers, path);
> +}
> +
> +static void add_formatted_headers(struct strbuf *msg,
> +				  struct strbuf *more_headers,
> +				  const char *line_prefix,
> +				  const char *meta,
> +				  const char *reset)
> +{
> +	char *next, *newline;
> +
> +	next = more_headers->buf;
> +	while ((newline = strchr(next, '\n'))) {
> +		*newline = '\0';
> +		strbuf_addf(msg, "%s%s%s%s\n", line_prefix, meta, next, reset);
> +		*newline = '\n';
> +		next = newline + 1;
> +	}

The above is not wrong per-se, but we do not need to do the
"temporarily terminate and then recover" dance, and avoiding it
would make the code cleaner.

Once you learn the value of "newline" [*], you know the number of
bytes between "next" and "newline" so you can safely use the "%.*s"
format specifier without temporarily terminating the subsection of
the string.

	Side note. I would actually use strchrnul() instead, so that
        we do not have to special case the end of the buffer.  For a
        readily available example, see advice.c::vadvise().
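
A rough standalone sketch of that idea, assuming a fixed output buffer
and snprintf() in place of strbuf_addf().  Both format_headers() and
find_or_end() are hypothetical names for this illustration; the latter
stands in for strchrnul(), which is not part of ISO C.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for strchrnul(): return a pointer to the first
 * occurrence of c in s, or to the terminating NUL if c is absent.
 */
static const char *find_or_end(const char *s, int c)
{
	while (*s && *s != (char)c)
		s++;
	return s;
}

/*
 * Write each line of "headers" to "out", wrapped in prefix/meta/reset,
 * without ever modifying "headers".  Returns the number of bytes
 * written (excluding the NUL), or -1 if "out" is too small.
 */
static int format_headers(char *out, size_t outsz, const char *headers,
			  const char *prefix, const char *meta,
			  const char *reset)
{
	size_t used = 0;
	const char *next = headers;

	if (outsz)
		out[0] = '\0';
	while (*next) {
		const char *eol = find_or_end(next, '\n');
		/* "%.*s" prints exactly (eol - next) bytes of the line */
		int n = snprintf(out + used, outsz - used, "%s%s%.*s%s\n",
				 prefix, meta, (int)(eol - next), next, reset);

		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
		next = *eol ? eol + 1 : eol;	/* skip the newline, if any */
	}
	return (int)used;
}
```

With the "find newline or end" helper, a final line that lacks a
trailing newline goes through the same loop body, so no separate
"if (*next)" case is needed after the loop.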

> +	if (*next)
> +		strbuf_addf(msg, "%s%s%s%s\n", line_prefix, meta, next, reset);
> +}

> @@ -4328,9 +4356,13 @@ static void fill_metainfo(struct strbuf *msg,
>  	const char *set = diff_get_color(use_color, DIFF_METAINFO);
>  	const char *reset = diff_get_color(use_color, DIFF_RESET);
>  	const char *line_prefix = diff_line_prefix(o);
> +	struct strbuf *more_headers = NULL;
>  
>  	*must_show_header = 1;
>  	strbuf_init(msg, PATH_MAX * 2 + 300);
> +	if ((more_headers = additional_headers(o, name)))
> +		add_formatted_headers(msg, more_headers,
> +				      line_prefix, set, reset);

So, we stuff what came via path_msg() without anything that allows
readers to identify them to the header part?  Just like we have
fixed and known string taken from a bounded vocabulary such as
"index", "copy from", "old mode", etc., don't we want to prefix the
hints that came from the merge machinery with some identifiable
string?

> @@ -5852,7 +5884,7 @@ int diff_unmodified_pair(struct diff_filepair *p)
>  
>  static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o)
>  {
> -	if (diff_unmodified_pair(p))
> +	if (diff_unmodified_pair(p) && !additional_headers(o, p->one->path))
>  		return;

This does not feel quite right.  At least there needs to be a comment
that says the _current_ callers that add additional_headers() would do
so only for paths that the end users care about, even when there is no
change in the contents.  It is quite plausible that future callers
may want to add additional information to only paths that have some
changes that need to be shown, no?  And at that point, they want to
tweak this condition we place here, but without explanation they
wouldn't know what they would be breaking if they did so.



* Re: [PATCH 0/9] Add a new --remerge-diff capability to show & log
  2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                   ` (9 preceding siblings ...)
  2021-12-21 23:20 ` [PATCH 0/9] Add a new --remerge-diff capability to show & log Junio C Hamano
@ 2021-12-22  0:33 ` Junio C Hamano
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
  11 siblings, 0 replies; 113+ messages in thread
From: Junio C Hamano @ 2021-12-22  0:33 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  Documentation/diff-options.txt |  8 ++++
>  apply.c                        |  5 ++-
>  builtin/checkout.c             | 12 ++++--
>  builtin/log.c                  | 16 ++++++++
>  diff-merges.c                  | 12 ++++++
>  diff.c                         | 34 ++++++++++++++++-
>  diff.h                         |  1 +
>  ll-merge.c                     | 40 ++++++++++---------
>  ll-merge.h                     |  9 ++++-
>  log-tree.c                     | 70 ++++++++++++++++++++++++++++++++++
>  merge-blobs.c                  |  5 ++-
>  merge-ort.c                    | 49 +++++++++++++++++++++---
>  merge-ort.h                    | 10 +++++
>  merge-recursive.c              |  8 +++-
>  merge-recursive.h              |  1 +
>  notes-merge.c                  |  5 ++-
>  rerere.c                       | 10 +++--
>  revision.h                     |  6 ++-
>  t/t6404-recursive-merge.sh     |  9 ++++-
>  t/t6406-merge-attr.sh          |  9 ++++-
>  tmp-objdir.c                   |  5 +++
>  tmp-objdir.h                   |  6 +++
>  22 files changed, 288 insertions(+), 42 deletions(-)

It is somewhat disappointing that there is no test or documentation
update that shows what typical remerge-diff output should look like.
I was specifically interested in finding out how the "conflict hint
messages from merge backend in the diff header" output would look.

I left some messages here and there on the patches I read carefully,
and they looked mostly good.  I only skimmed 7 and 8 and did not
find anything glaringly wrong, but that wouldn't count as a review.

Thanks.


* Re: [PATCH 1/9] tmp_objdir: add a helper function for discarding all contained objects
  2021-12-21 23:51     ` Elijah Newren
@ 2021-12-22  6:23       ` Junio C Hamano
  2021-12-25  2:29         ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2021-12-22  6:23 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

Elijah Newren <newren@gmail.com> writes:

>> > +void tmp_objdir_discard_objects(struct tmp_objdir *t)
>> > +{
>> > +     remove_dir_recursively(&t->path, REMOVE_DIR_KEEP_TOPLEVEL);
>> > +}
>> > +
>>
>> OK.
>>
>> Without a caller, it is a bit hard to judge if a separate helper
>> makes the caller easier to read and understand, or becomes an extra
>> layer of abstraction that obscures the logic.  Hopefully, having a
>> more specific function name with "tmp" and "discard" in it makes the
>> intent at callers more clear than the function that is named after
>> the detail of the operation.
>
> This isn't just a convenience; since tmp_objdir is defined in
> tmp-objdir.c rather than tmp-objdir.h, t->path is not accessible
> outside of tmp-objdir.c.  Because of this, some kind of helper
> function is necessary.

But adding this function as an extra level of abstraction is *not*
the only way to expose the feature.  Instead, the internals of "struct
tmp_objdir" could be exposed to the caller that wants to discard the
files inside the path.

I think we now have enough material to fill between these two lines
to help readers ;-)

>> > From: Elijah Newren <newren@gmail.com>
>> >
>> > Signed-off-by: Elijah Newren <newren@gmail.com>

Thanks.


* Re: [PATCH 9/9] doc/diff-options: explain the new --remerge-diff option
  2021-12-21 23:47       ` Ævar Arnfjörð Bjarmason
@ 2021-12-22 19:05         ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2021-12-22 19:05 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh

On Tue, Dec 21, 2021 at 5:06 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Tue, Dec 21 2021, Elijah Newren wrote:
>
> > On Tue, Dec 21, 2021 at 1:29 PM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >>
> >>
> >> On Tue, Dec 21 2021, Elijah Newren via GitGitGadget wrote:
> >>
> >> > From: Elijah Newren <newren@gmail.com>
> >> >
> >> > Signed-off-by: Elijah Newren <newren@gmail.com>
> >> > ---
> >> >  Documentation/diff-options.txt | 8 ++++++++
> >> >  1 file changed, 8 insertions(+)
> >> >
> >> > diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> >> > index c89d530d3d1..b05f1c9f1c9 100644
> >> > --- a/Documentation/diff-options.txt
> >> > +++ b/Documentation/diff-options.txt
> >> > @@ -64,6 +64,14 @@ ifdef::git-log[]
> >> >       each of the parents. Separate log entry and diff is generated
> >> >       for each parent.
> >> >  +
> >> > +--diff-merges=remerge:::
> >> > +--diff-merges=r:::
> >> > +--remerge-diff:::
> >> > +     With this option, two-parent merge commits are remerged to
> >> > +     create a temporary tree object -- potentially containing files
> >> > +     with conflict markers and such.  A diff is then shown between
> >> > +     that temporary tree and the actual merge commit.
> >> > ++
> >> >  --diff-merges=combined:::
> >> >  --diff-merges=c:::
> >> >  -c:::
> >>
> >> This & 5/9 would I think be better squashed into their respective "main"
> >> patches.
> >
> > I presume you mean the "main" patch for this one is 8/9.  I was trying
> > to find a way to break up that large patch, but this is pretty small
> > so...sure I'll squash it in.
> >
> > What are you referring to as the "main" patch for 5/9, though?  It
> > only seems related to 6/9 and 7/9 to me, but I very deliberately split
> > those patches off and don't want to confuse them with unrelated
> > changes.  I disagree with combining 5/9 with either of those.
>
> I just gave it a quick initial skim.
>
> I have sometimes found it a bit harder to review your patches due to
> over-splitting.
>
> E.g. (went back and looked) here tmp_objdir_discard_objects() is
> introduced in 1/9 but used in 8/9. "path_messages" is then introduced in
> 5/9 and used in 8/9, no?
>
> Anyway, just a bit of feedback. FWIW not just bikeshedding. I do find
> myself stopping at 1/9, paging to 2/9, searching for the function, not
> there, checking 3/9 etc.
>
> I realize this is a bit of a stones & glass houses comment, but I find
> it a bit easier to review things when a patch is larger vs. having it
> split up in a way where preceding steps don't do anything yet except
> wait for use by a subsequent patch.
>
> 0.02 etc.

Oh, 8/9.  That one could make sense.

And thanks for the feedback.  Perhaps I could restructure this series
with a top-down design instead of bottom-up.  Doing that would mean
either adding functions with an instant-die() implementation in the
first step or just leaving a placeholder comment, and then filling
those things in for later steps.


* Re: [PATCH 2/9] ll-merge: make callers responsible for showing warnings
  2021-12-21 23:44   ` Junio C Hamano
@ 2021-12-23 18:26     ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2021-12-23 18:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 21, 2021 at 3:44 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > Since some callers may want to send warning messages to somewhere other
> > than stdout/stderr, stop printing "warning: Cannot merge binary files"
> > from ll-merge and instead modify the return status of ll_merge() to
> > indicate when a merge of binary files has occurred.
> >
> > Note that my methodology included first modifying ll_merge() to return
> > a struct, so that the compiler would catch all the callers for me and
> > ensure I had modified all of them.  After modifying all of them, I then
> > changed the struct to an enum.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  apply.c            |  5 ++++-
> >  builtin/checkout.c | 12 ++++++++----
> >  ll-merge.c         | 40 ++++++++++++++++++++++------------------
> >  ll-merge.h         |  9 ++++++++-
> >  merge-blobs.c      |  5 ++++-
> >  merge-ort.c        |  5 ++++-
> >  merge-recursive.c  |  5 ++++-
> >  notes-merge.c      |  5 ++++-
> >  rerere.c           | 10 +++++++---
> >  9 files changed, 65 insertions(+), 31 deletions(-)
> >
> > diff --git a/apply.c b/apply.c
> > index 43a0aebf4ee..12ea9c72a6b 100644
> > --- a/apply.c
> > +++ b/apply.c
> > @@ -3492,7 +3492,7 @@ static int three_way_merge(struct apply_state *state,
> >  {
> >       mmfile_t base_file, our_file, their_file;
> >       mmbuffer_t result = { NULL };
> > -     int status;
> > +     enum ll_merge_result status;
> >
> >       /* resolve trivial cases first */
> >       if (oideq(base, ours))
> > @@ -3509,6 +3509,9 @@ static int three_way_merge(struct apply_state *state,
> >                         &their_file, "theirs",
> >                         state->repo->index,
> >                         NULL);
> > +     if (status == LL_MERGE_BINARY_CONFLICT)
> > +             warning("Cannot merge binary files: %s (%s vs. %s)",
> > +                     "base", "ours", "theirs");
>
> This used to come from ll_merge()
>
> > -                     warning("Cannot merge binary files: %s (%s vs. %s)",
> > -                             path, name1, name2);
> > -                     /* fallthru */
>
> And our call to ll_merge() above (half of it invisible in the
> pre-context of the hunk) gave "ours" and "theirs" to our_label and
> their_label, which in turn are called name1 and name2, respectively,
> in ll_merge_binary() driver.
>
> I am not sure about the "base" string, though.  I suspect that your
> "base" should be a reference to the parameter 'path' of three_way_merge()
> function.

Ah, indeed; thanks for reading carefully.
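
For illustration, a minimal sketch of the caller-side pattern under
discussion, with the path (not the literal "base") as the first
argument of the warning.  Everything named sketch_* is hypothetical;
the enum values are an assumption, since the thread only shows the
names LL_MERGE_CONFLICT and LL_MERGE_BINARY_CONFLICT and a
"status < 0" error check.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed layout: negative means error, zero means a clean merge. */
enum sketch_merge_result {
	SKETCH_MERGE_ERROR = -1,
	SKETCH_MERGE_OK = 0,
	SKETCH_MERGE_CONFLICT,
	SKETCH_MERGE_BINARY_CONFLICT
};

/*
 * Hypothetical caller-side helper: the low-level merge no longer
 * prints anything itself, so each caller decides where the warning
 * goes.  Here it is formatted into "out"; returns 1 if a warning
 * was produced and 0 otherwise.
 */
static int sketch_binary_warning(char *out, size_t outsz,
				 enum sketch_merge_result status,
				 const char *path,
				 const char *ours_label,
				 const char *theirs_label)
{
	assert(out && outsz && path && ours_label && theirs_label);
	out[0] = '\0';
	if (status != SKETCH_MERGE_BINARY_CONFLICT)
		return 0;
	snprintf(out, outsz,
		 "warning: Cannot merge binary files: %s (%s vs. %s)",
		 path, ours_label, theirs_label);
	return 1;
}
```

Returning an enum rather than printing keeps the message text in one
place per caller, which is what lets a later patch route it somewhere
other than stderr.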

> > diff --git a/builtin/checkout.c b/builtin/checkout.c
> > index cbf73b8c9f6..3a559d69303 100644
> > --- a/builtin/checkout.c
> > +++ b/builtin/checkout.c
> > @@ -237,6 +237,7 @@ static int checkout_merged(int pos, const struct checkout *state,
> >       struct cache_entry *ce = active_cache[pos];
> >       const char *path = ce->name;
> >       mmfile_t ancestor, ours, theirs;
> > +     enum ll_merge_result merge_status;
> >       int status;
> >       struct object_id oid;
> >       mmbuffer_t result_buf;
> > @@ -267,13 +268,16 @@ static int checkout_merged(int pos, const struct checkout *state,
> >       memset(&ll_opts, 0, sizeof(ll_opts));
> >       git_config_get_bool("merge.renormalize", &renormalize);
> >       ll_opts.renormalize = renormalize;
> > -     status = ll_merge(&result_buf, path, &ancestor, "base",
> > -                       &ours, "ours", &theirs, "theirs",
> > -                       state->istate, &ll_opts);
> > +     merge_status = ll_merge(&result_buf, path, &ancestor, "base",
> > +                             &ours, "ours", &theirs, "theirs",
> > +                             state->istate, &ll_opts);
> >       free(ancestor.ptr);
> >       free(ours.ptr);
> >       free(theirs.ptr);
> > -     if (status < 0 || !result_buf.ptr) {
> > +     if (merge_status == LL_MERGE_BINARY_CONFLICT)
> > +             warning("Cannot merge binary files: %s (%s vs. %s)",
> > +                     path, "ours", "theirs");
>
> This one looks correct.
>
> > +     if (merge_status < 0 || !result_buf.ptr) {
> >               free(result_buf.ptr);
> >               return error(_("path '%s': cannot merge"), path);
> >       }
>
> > diff --git a/merge-blobs.c b/merge-blobs.c
> > index ee0a0e90c94..8138090f81c 100644
> > --- a/merge-blobs.c
> > +++ b/merge-blobs.c
> > @@ -36,7 +36,7 @@ static void *three_way_filemerge(struct index_state *istate,
> >                                mmfile_t *their,
> >                                unsigned long *size)
> >  {
> > -     int merge_status;
> > +     enum ll_merge_result merge_status;
> >       mmbuffer_t res;
> >
> >       /*
> > @@ -50,6 +50,9 @@ static void *three_way_filemerge(struct index_state *istate,
> >                               istate, NULL);
> >       if (merge_status < 0)
> >               return NULL;
> > +     if (merge_status == LL_MERGE_BINARY_CONFLICT)
> > +             warning("Cannot merge binary files: %s (%s vs. %s)",
> > +                     path, ".our", ".their");
>
> OK.
>
> > diff --git a/merge-ort.c b/merge-ort.c
> > index 0342f104836..c24da2ba3cb 100644
> > --- a/merge-ort.c
> > +++ b/merge-ort.c
> > @@ -1743,7 +1743,7 @@ static int merge_3way(struct merge_options *opt,
> >       mmfile_t orig, src1, src2;
> >       struct ll_merge_options ll_opts = {0};
> >       char *base, *name1, *name2;
> > -     int merge_status;
> > +     enum ll_merge_result merge_status;
> >
> >       if (!opt->priv->attr_index.initialized)
> >               initialize_attr_index(opt);
> > @@ -1787,6 +1787,9 @@ static int merge_3way(struct merge_options *opt,
> >       merge_status = ll_merge(result_buf, path, &orig, base,
> >                               &src1, name1, &src2, name2,
> >                               &opt->priv->attr_index, &ll_opts);
> > +     if (merge_status == LL_MERGE_BINARY_CONFLICT)
> > +             warning("Cannot merge binary files: %s (%s vs. %s)",
> > +                     path, name1, name2);
>
> OK; this is your code and I do not have to read it too carefully,
> but all we need is conveniently in the pre-context of the hunk ;-).
>
> > diff --git a/merge-recursive.c b/merge-recursive.c
> > index d9457797dbb..bc73c52dd84 100644
> > --- a/merge-recursive.c
> > +++ b/merge-recursive.c
> > @@ -1044,7 +1044,7 @@ static int merge_3way(struct merge_options *opt,
> >       mmfile_t orig, src1, src2;
> >       struct ll_merge_options ll_opts = {0};
> >       char *base, *name1, *name2;
> > -     int merge_status;
> > +     enum ll_merge_result merge_status;
> >
> >       ll_opts.renormalize = opt->renormalize;
> >       ll_opts.extra_marker_size = extra_marker_size;
> > @@ -1090,6 +1090,9 @@ static int merge_3way(struct merge_options *opt,
> >       merge_status = ll_merge(result_buf, a->path, &orig, base,
> >                               &src1, name1, &src2, name2,
> >                               opt->repo->index, &ll_opts);
> > +     if (merge_status == LL_MERGE_BINARY_CONFLICT)
> > +             warning("Cannot merge binary files: %s (%s vs. %s)",
> > +                     a->path, name1, name2);
>
> OK.
>
> > diff --git a/notes-merge.c b/notes-merge.c
> > index b4a3a903e86..01d596920ea 100644
> > --- a/notes-merge.c
> > +++ b/notes-merge.c
> > @@ -344,7 +344,7 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
> >  {
> >       mmbuffer_t result_buf;
> >       mmfile_t base, local, remote;
> > -     int status;
> > +     enum ll_merge_result status;
> >
> >       read_mmblob(&base, &p->base);
> >       read_mmblob(&local, &p->local);
> > @@ -358,6 +358,9 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
> >       free(local.ptr);
> >       free(remote.ptr);
> >
> > +     if (status == LL_MERGE_BINARY_CONFLICT)
> > +             warning("Cannot merge binary files: %s (%s vs. %s)",
> > +                     oid_to_hex(&p->obj), o->local_ref, o->remote_ref);
>
> This uses another slot in the rotating buffer used by oid_to_hex(),
> but I do not think anybody grabbed a pointer to one of them and held
> onto it before we got here, so it would be OK.
>
> > diff --git a/rerere.c b/rerere.c
> > index d83d58df4fb..46fd01819b8 100644
> > --- a/rerere.c
> > +++ b/rerere.c
> > @@ -609,19 +609,23 @@ static int try_merge(struct index_state *istate,
> >                    const struct rerere_id *id, const char *path,
> >                    mmfile_t *cur, mmbuffer_t *result)
> >  {
> > -     int ret;
> > +     enum ll_merge_result ret;
> >       mmfile_t base = {NULL, 0}, other = {NULL, 0};
> >
> >       if (read_mmfile(&base, rerere_path(id, "preimage")) ||
> >           read_mmfile(&other, rerere_path(id, "postimage")))
> > -             ret = 1;
> > -     else
> > +             ret = LL_MERGE_CONFLICT;
> > +     else {
>
> Let's have {} around the if clause now that the corresponding else clause
> needs it.

Will fix.

> >               /*
> >                * A three-way merge. Note that this honors user-customizable
> >                * low-level merge driver settings.
> >                */
> >               ret = ll_merge(result, path, &base, NULL, cur, "", &other, "",
> >                              istate, NULL);
> > +             if (ret == LL_MERGE_BINARY_CONFLICT)
> > +                     warning("Cannot merge binary files: %s (%s vs. %s)",
> > +                             path, "", "");
> > +     }
>
> This is a faithful conversion of what should not happen in practice,
> as the rerere logic would not be able to reach here.  In a binary
> file, we won't be able to identify <<< === >>> blocks, hash the text
> in the conflicted block to come up with the conflict ID to find the
> preimage and postimage files.  These files are the input to the low
> level merge driver call we are making here.
>
> Looking almost good except for a warning message bug I spotted
> earlier.
>
> Thanks.


* Re: [PATCH 3/9] merge-ort: capture and print ll-merge warnings in our preferred fashion
  2021-12-22  0:00   ` Junio C Hamano
@ 2021-12-23 18:36     ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2021-12-23 18:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 21, 2021 at 4:00 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > Instead of immediately printing ll-merge warnings to stderr, we save
> > them in our output strbuf.  Besides allowing us to move these warnings
> > to a special file for --remerge-diff, this has two other benefits for
> > regular merges done by merge-ort:
> >
> >   * The deferral of messages ensures we can print all messages about
> >     any given path together (merge-recursive was known to sometimes
> >     intersperse messages about other paths, particularly when renames
> >     were involved).
>
> I would imagine that with something like this, we can show such a
> warning message differently when it happens during an inner
> "synthesizing a virtual common ancestor" merge (the most likely
> value for "show differently" would be to "squelch"), which may be a
> good thing.

Yes, that is a possibility that opens up after this.  Which reminds
me, merge-recursive nicely nested conflict/warnings messages from
inner merges by adding 2*call_depth space characters before messages.
I lost that in merge-ort (which becomes more problematic since
merge-ort tries to group messages about the same path together, thus
mixing inner merge messages with outer ones and providing no way to
differentiate the two).  I've got a patch to fix that up, but of
course it conflicts with this series, so I'll be submitting it after
this one settles.
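
To make the nesting idea concrete, here is a rough sketch of indenting
a message by 2*call_depth spaces.  It is not the merge-recursive code
and not the pending patch; sketch_indent_msg() is a hypothetical
helper, and it only shows the "%*s" trick of padding with an empty
string.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "msg" into "out", preceded by two spaces
 * per level of inner-merge nesting.  Returns what snprintf() returns,
 * i.e. the length the full result would have.
 */
static int sketch_indent_msg(char *out, size_t outsz,
			     int call_depth, const char *msg)
{
	assert(out && outsz && msg && call_depth >= 0);
	/* A field width of N applied to "" yields exactly N spaces. */
	return snprintf(out, outsz, "%*s%s", 2 * call_depth, "", msg);
}
```

With call_depth 0 the message is unchanged, so outer-merge output
looks the same as before while inner-merge messages stand apart.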

> >       if (merge_status == LL_MERGE_BINARY_CONFLICT)
> > -             warning("Cannot merge binary files: %s (%s vs. %s)",
> > -                     path, name1, name2);
> > +             path_msg(opt, path, 0,
> > +                      "warning: Cannot merge binary files: %s (%s vs. %s)",
> > +                      path, name1, name2);
> >
>
> Nice.


* Re: [PATCH 4/9] merge-ort: mark a few more conflict messages as omittable
  2021-12-22  0:06   ` Junio C Hamano
@ 2021-12-23 18:38     ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2021-12-23 18:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 21, 2021 at 4:06 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > path_msg() has the ability to mark messages as omittable, designed for
> > remerge-diff where we'll instead be showing conflict messages as diff
> > headers for a subsequent diff.  While all these messages are very useful
> > when trying to create a merge initially, early use with the
> > --remerge-diff feature (the only user of this omittable conflict message
> > capability), suggests that the particular messages marked in this commit
> > are just noise when trying to see what changes users made to create a
> > merge commit.
>
> It is likely because when somebody is looking at the output of
> remerge-diff, they are mostly concentrating on the _content_ level
> merges and they are not keenly looking for a merge whose result is
> deposited at a wrong path.  Since what is shown is something that
> has already been recorded in the history, we can safely assume that
> it is no longer relevant (or "it is way too late to matter"), I would
> say, to show these messages about "file location".
>
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  merge-ort.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/merge-ort.c b/merge-ort.c
> > index a18f47e23c5..fe27870e73e 100644
> > --- a/merge-ort.c
> > +++ b/merge-ort.c
> > @@ -2420,7 +2420,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
> >                */
> >               ci->path_conflict = 1;
> >               if (pair->status == 'A')
> > -                     path_msg(opt, new_path, 0,
> > +                     path_msg(opt, new_path, 1,
> >                                _("CONFLICT (file location): %s added in %s "
> >                                  "inside a directory that was renamed in %s, "
> >                                  "suggesting it should perhaps be moved to "
> > @@ -2428,7 +2428,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
> >                                old_path, branch_with_new_path,
> >                                branch_with_dir_rename, new_path);
> >               else
> > -                     path_msg(opt, new_path, 0,
> > +                     path_msg(opt, new_path, 1,
> >                                _("CONFLICT (file location): %s renamed to %s "
> >                                  "in %s, inside a directory that was renamed "
> >                                  "in %s, suggesting it should perhaps be "
> > @@ -3825,7 +3825,7 @@ static void process_entry(struct merge_options *opt,
> >                               reason = _("add/add");
> >                       if (S_ISGITLINK(merged_file.mode))
> >                               reason = _("submodule");
> > -                     path_msg(opt, path, 0,
> > +                     path_msg(opt, path, 1,
> >                                _("CONFLICT (%s): Merge conflict in %s"),
> >                                reason, path);
>
> I am not as sure about this one as the other two, though.  I guess
> in the context of remerge-diff, resolving the add/add conflict into
> the same file is also something that happened a long time ago and
> these messages are too late to matter the same way as the other two.

Yeah, I'm not so sure about it either, and my notes are long, long
gone.  I think I'll pull this one out, and we can always tweak it
later if needed.


* Re: [PATCH 1/9] tmp_objdir: add a helper function for discarding all contained objects
  2021-12-22  6:23       ` Junio C Hamano
@ 2021-12-25  2:29         ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2021-12-25  2:29 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 21, 2021 at 10:23 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> >> > +void tmp_objdir_discard_objects(struct tmp_objdir *t)
> >> > +{
> >> > +     remove_dir_recursively(&t->path, REMOVE_DIR_KEEP_TOPLEVEL);
> >> > +}
> >> > +
> >>
> >> OK.
> >>
> >> Without a caller, it is a bit hard to judge if a separate helper
> >> makes the caller easier to read and understand, or becomes an extra
> >> layer of abstraction that obscures the logic.  Hopefully, having a
> >> more specific function name with "tmp" and "discard" in it makes the
> >> intent at callers more clear than the function that is named after
> >> the detail of the operation.
> >
> > This isn't just a convenience; since tmp_objdir is defined in
> > tmp-objdir.c rather than tmp-objdir.h, t->path is not accessible
> > outside of tmp-objdir.c.  Because of this, some kind of helper
> > function is necessary.
>
> But adding this function as an extra level of abstraction is *not*
> the only way to expose the feature.  Instead, the internals of "struct
> tmp_objdir" could be exposed to the caller that wants to discard the
> files inside the path.

Ah, yes, we talked about that during the tmp-objdir discussion back in
September/October.  Peff didn't want struct tmp_objdir exposed, and I
was operating with that in mind.
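
The trade-off can be sketched as follows.  This is a simplified
illustration, not tmp-objdir.c: all sketch_* names are hypothetical,
and a counter stands in for the files that the real helper removes
with remove_dir_recursively().  The header side only declares an
incomplete type, so callers cannot reach t->path and must go through
a helper.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* What a header would expose: an incomplete type plus functions. */
struct sketch_objdir;
struct sketch_objdir *sketch_objdir_create(const char *path);
void sketch_objdir_add_object(struct sketch_objdir *t);
size_t sketch_objdir_count(const struct sketch_objdir *t);
void sketch_objdir_discard_objects(struct sketch_objdir *t);
void sketch_objdir_destroy(struct sketch_objdir *t);

/* What stays private to the .c file: the actual layout. */
struct sketch_objdir {
	char *path;
	size_t nr_objects;	/* stand-in for files under "path" */
};

struct sketch_objdir *sketch_objdir_create(const char *path)
{
	struct sketch_objdir *t;
	size_t len;

	assert(path);
	len = strlen(path) + 1;
	t = malloc(sizeof(*t));
	if (!t)
		return NULL;
	t->path = malloc(len);
	if (!t->path) {
		free(t);
		return NULL;
	}
	memcpy(t->path, path, len);
	t->nr_objects = 0;
	return t;
}

void sketch_objdir_add_object(struct sketch_objdir *t)
{
	t->nr_objects++;
}

size_t sketch_objdir_count(const struct sketch_objdir *t)
{
	return t->nr_objects;
}

/* Empty the directory but keep it (and "t") usable afterwards. */
void sketch_objdir_discard_objects(struct sketch_objdir *t)
{
	t->nr_objects = 0;
}

void sketch_objdir_destroy(struct sketch_objdir *t)
{
	if (!t)
		return;
	free(t->path);
	free(t);
}
```

Exposing the struct would remove the need for the discard helper, at
the cost of letting every caller depend on the layout.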

> I think we now have enough material to fill between these two lines
> to help readers ;-)

I've restructured the series a bit based on Ævar's feedback, and this
function is now only introduced along with its caller.  Hopefully that
makes it a bit clearer.


* Re: [PATCH 6/9] diff: add ability to insert additional headers for paths
  2021-12-22  0:24   ` Junio C Hamano
@ 2021-12-25  2:35     ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2021-12-25  2:35 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 21, 2021 at 4:24 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > In support of a remerge-diff ability we will add in a few commits, we
> > want to be able to provide additional headers to show along with a diff.
> > Add the plumbing necessary to enable this.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  diff.c | 34 +++++++++++++++++++++++++++++++++-
> >  diff.h |  1 +
> >  2 files changed, 34 insertions(+), 1 deletion(-)
> >
> > diff --git a/diff.c b/diff.c
> > index 861282db1c3..a9490b9b2ba 100644
> > --- a/diff.c
> > +++ b/diff.c
> > @@ -27,6 +27,7 @@
> >  #include "help.h"
> >  #include "promisor-remote.h"
> >  #include "dir.h"
> > +#include "strmap.h"
> >
> >  #ifdef NO_FAST_WORKING_DIRECTORY
> >  #define FAST_WORKING_DIRECTORY 0
> > @@ -3406,6 +3407,33 @@ struct userdiff_driver *get_textconv(struct repository *r,
> >       return userdiff_get_textconv(r, one->driver);
> >  }
> >
> > +static struct strbuf* additional_headers(struct diff_options *o,
>
> Style.
>
> > +                                      const char *path)
> > +{
> > +     if (!o->additional_path_headers)
> > +             return NULL;
> > +     return strmap_get(o->additional_path_headers, path);
> > +}
> > +
> > +static void add_formatted_headers(struct strbuf *msg,
> > +                               struct strbuf *more_headers,
> > +                               const char *line_prefix,
> > +                               const char *meta,
> > +                               const char *reset)
> > +{
> > +     char *next, *newline;
> > +
> > +     next = more_headers->buf;
> > +     while ((newline = strchr(next, '\n'))) {
> > +             *newline = '\0';
> > +             strbuf_addf(msg, "%s%s%s%s\n", line_prefix, meta, next, reset);
> > +             *newline = '\n';
> > +             next = newline + 1;
> > +     }
>
> The above is not wrong per-se, but we do not need to do the
> "temporarily terminate and then recover" dance, and avoiding it
> would make the code cleaner.
>
> Once you learn the value of "newline" [*], you know the number of
> bytes between "next" and "newline", so you can safely use the "%.*s"
> format specifier without temporarily terminating the subsection of
> the string.
>
>         Side note. I would actually use strchrnul() instead, so that
>         we do not have to special case the end of the buffer.  For a
>         readily available example, see advice.c::vadvise().
>
> > +     if (*next)
> > +             strbuf_addf(msg, "%s%s%s%s\n", line_prefix, meta, next, reset);
> > +}
>
> > @@ -4328,9 +4356,13 @@ static void fill_metainfo(struct strbuf *msg,
> >       const char *set = diff_get_color(use_color, DIFF_METAINFO);
> >       const char *reset = diff_get_color(use_color, DIFF_RESET);
> >       const char *line_prefix = diff_line_prefix(o);
> > +     struct strbuf *more_headers = NULL;
> >
> >       *must_show_header = 1;
> >       strbuf_init(msg, PATH_MAX * 2 + 300);
> > +     if ((more_headers = additional_headers(o, name)))
> > +             add_formatted_headers(msg, more_headers,
> > +                                   line_prefix, set, reset);
>
> So, we stuff what came via path_msg() into the header part without
> anything that allows readers to identify it?  Just like we have
> fixed and known strings taken from a bounded vocabulary such as
> "index", "copy from", "old mode", etc., don't we want to prefix the
> hints that came from the merge machinery with some identifiable
> string?

That's a fair question.  Most of the involved messages are of the form
    CONFLICT (<reason>): more details
and "CONFLICT" seems like a pretty identifiable string.  There are
some others, which made me wonder if we wanted some kind of additional
prefix, but I was having a hard time coming up with a meaningful
prefix; most that I thought of didn't seem like they'd help.

I've provided some testcases in the next re-roll so you can see some
examples; maybe that will help others judge if a prefix is needed and
spur creative juices for coming up with a good one, since I seem to be
unable to.

> > @@ -5852,7 +5884,7 @@ int diff_unmodified_pair(struct diff_filepair *p)
> >
> >  static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o)
> >  {
> > -     if (diff_unmodified_pair(p))
> > +     if (diff_unmodified_pair(p) && !additional_headers(o, p->one->path))
> >               return;
>
> This does not feel quite right.  At least there needs a comment that
> says the _current_ callers that add additional_headers() would do so
> only for paths that the end-users cares about, even when there is no
> change in the contents.  It is quite plausible that future callers
> may want to add additional information to only paths that have some
> changes that need to be shown, no?  And at that point, they want to
> tweak this condition we place here, but without explanation they
> wouldn't know what they would be breaking if they did so.

I've added a comment.


* [PATCH v2 0/8] Add a new --remerge-diff capability to show & log
  2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                   ` (10 preceding siblings ...)
  2021-12-22  0:33 ` Junio C Hamano
@ 2021-12-25  7:59 ` Elijah Newren via GitGitGadget
  2021-12-25  7:59   ` [PATCH v2 1/8] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
                     ` (10 more replies)
  11 siblings, 11 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-25  7:59 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren

Here are some patches to add a --remerge-diff capability to show & log,
which works by comparing merge commits to an automatic remerge (note that
the automatic remerge tree can contain files with conflict markers).

Changes since v1 (of the restarted submission, which technically was v2):

 * Restructured the series, so the first patch introduces the feature --
   with a bunch of caveats. Subsequent patches clean up those caveats. This
   avoids introducing not-yet-used functions, and hopefully makes review
   easier.
 * Added testcases
 * Numerous small improvements suggested by Ævar and Junio

Changes since original submission[1]:

 * Rebased on top of the version of ns/tmp-objdir that Neeraj submitted
   (Neeraj's patches were based on v2.34, but ns/tmp-objdir got applied on
   an old commit and does not even build because of that).
 * Modify ll-merge API to return a status, instead of printing "Cannot merge
   binary files" on stdout[2] (as suggested by Peff)
 * Make conflict messages and other such warnings into diff headers of the
   subsequent remerge-diff rather than appearing in the diff as file content
   of some funny looking filenames (as suggested by Peff[3] and Junio[4])
 * Sergey ack'ed the diff-merges.c portion of the patches, but that wasn't
   limited to one patch so not sure where to record that ack.

[1]
https://lore.kernel.org/git/pull.1080.git.git.1630376800.gitgitgadget@gmail.com/;
GitHub wouldn't let me change the target branch for the PR, so I had to
create a new one with the new base, which is why this was not sent as v2
even though it is.
[2]
https://lore.kernel.org/git/YVOZRhWttzF18Xql@coredump.intra.peff.net/,
https://lore.kernel.org/git/YVOZty9D7NRbzhE5@coredump.intra.peff.net/
[3]
https://lore.kernel.org/git/YVOXPTjsp9lrxmS6@coredump.intra.peff.net/
[4]
https://lore.kernel.org/git/xmqqr1d7e4ug.fsf@gitster.g/

=== FURTHER BACKGROUND (original cover letter material) ===

Here are some example commits you can try this out on (with git show
--remerge-diff $COMMIT):

 * git.git conflicted merge: 07601b5b36
 * git.git non-conflicted change: bf04590ecd
 * linux.git conflicted merge: eab3540562fb
 * linux.git non-conflicted change: 223cea6a4f05

Many more can be found by just running git log --merges --remerge-diff in
your repository of choice and searching for diffs (most merges tend to be
clean and unmodified and thus produce no diff but a search of '^diff' in the
log output tends to find the examples nicely).

Some basic high level details about this new option:

 * This option is most naturally compared to --cc, though the output seems
   to be much more understandable to most users than --cc output.
 * Since merges are often clean and unmodified, this new option results in
   an empty diff for most merges.
 * This new option shows things like the removal of conflict markers, which
   hunks users picked from the various conflicted sides to keep or remove,
   and shows changes made outside of conflict markers (which might reflect
   changes needed to resolve semantic conflicts or cleanups of e.g.
   compilation warnings or other additional changes an integrator felt
   belonged in the merged result).
 * This new option does not (currently) work for octopus merges, since
   merge-ort is specific to two-parent merges[1].
 * This option will not work on a read-only or full filesystem[2].
 * We discussed this capability at Git Merge 2020, and one of the
   suggestions was doing a periodic git gc --auto during the operation (due
   to potential new blobs and trees created during the operation). I found a
   way to avoid that; see [2].
 * This option is faster than you'd probably expect; it handles 33.5 merge
   commits per second in linux.git on my computer; see below.

Regarding the performance point above, the timing for running the
following command:

time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l


in linux.git (with v5.4 checked out, since my copy of linux is very out of
date) is as follows:

DIFF_FLAG=--cc:            71m 31.536s
DIFF_FLAG=--remerge-diff:  31m  3.170s


Note that there are 62476 merges in this history. Also, output size is:

DIFF_FLAG=--cc:            2169111 lines
DIFF_FLAG=--remerge-diff:  2458020 lines


So roughly the same amount of output as --cc, as you'd expect.
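
The 33.5 merges-per-second figure quoted earlier follows directly from
the merge count and the elapsed time above. A minimal sketch of that
arithmetic, with helper names that are purely illustrative:

```c
#include <assert.h>

/* Convert a `time`-style "Nm S.sss" reading into seconds. */
static double elapsed_seconds(int minutes, double seconds)
{
	assert(minutes >= 0 && seconds >= 0.0);
	return minutes * 60.0 + seconds;
}

/* Merge commits handled per second of elapsed time. */
static double merges_per_second(long merges, double seconds)
{
	assert(seconds > 0.0);
	return merges / seconds;
}
```

With 62476 merges, merges_per_second(62476, elapsed_seconds(31, 3.170))
comes to roughly 33.5 for --remerge-diff, and the same calculation with
71m 31.536s comes to roughly 14.6 for --cc.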

As a side note: git log --remerge-diff, when run in various repositories and
allowed to run all the way back to the beginning(s) of history, is a nice
stress test of sorts for merge-ort. Especially when users run it for you on
their repositories they are working on, whether intentionally or via a bug
in a tool triggering that command to be run unexpectedly. Long story short,
such a bug in an internal tool existed last December and this command was
run on an internal repository and found a platform-specific bug in merge-ort
on some really old merge commit from that repo. I fixed that bug (a
STABLE_QSORT thing) while upstreaming all the merge-ort patches in the
meantime, but it was nice getting extra testing. Having more folks run this on
their repositories might be useful extra testing of the new merge strategy.

Also, I previously mentioned --remerge-diff-only (a flag to show how
cherry-picks or reverts differ from an automatic cherry-pick or revert, in
addition to showing how merges differ from an automatic merge). This series
does not include the patches to introduce that option; I'll submit them
later.

Two other things that might be interesting but are not included and which I
haven't investigated:

 * some mechanism for passing extra merge options through (e.g.
   -Xignore-space-change)
 * a capability to compare the automatic merge to a second automatic merge
   done with different merge options. (Not sure if this would be of interest
   to end users, but might be interesting while developing a new
   --strategy-option, or maybe checking how changing some default in the
   merge algorithm would affect historical merges in various repositories).

[1] I have nebulous ideas of how an Octopus-centric ORT strategy could be
written -- basically, just repeatedly invoking ort and trying to make sure
nested conflicts can be differentiated. For now, though, a simple warning is
printed that octopus merges are not handled and no diff will be shown.

[2] New blobs/trees can be written by the three-way merging step. These are
written to a temporary area (via tmp-objdir.c) under the git object store
that is cleaned up at the end of the operation, with the new loose objects
from the remerge being cleaned up after each individual merge.
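
To illustrate why discarding after each individual merge matters, here
is a toy model of the lifecycle described in [2]. Everything in it is
hypothetical: struct toy_objdir and the toy_* functions only count
objects and are not git's tmp-objdir API. The point is just that the
peak number of loose objects is bounded by the largest single merge
rather than by the whole walk.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a temporary object directory; NOT struct tmp_objdir. */
struct toy_objdir {
	size_t loose_objects;	/* objects currently held */
	size_t peak;		/* largest count held at any one time */
};

static void toy_write_objects(struct toy_objdir *d, size_t n)
{
	assert(d);
	d->loose_objects += n;
	if (d->loose_objects > d->peak)
		d->peak = d->loose_objects;
}

static void toy_discard_objects(struct toy_objdir *d)
{
	assert(d);
	d->loose_objects = 0;
}

/*
 * Walk `nr` merges, where merge i writes per_merge[i] new objects.
 * If discard_each is nonzero, empty the directory after every merge.
 * Returns the peak number of objects held at once.
 */
static size_t toy_walk(const size_t *per_merge, size_t nr, int discard_each)
{
	struct toy_objdir d = {0, 0};
	size_t i;

	for (i = 0; i < nr; i++) {
		toy_write_objects(&d, per_merge[i]);
		/* ... the remerge diff would be shown here ... */
		if (discard_each)
			toy_discard_objects(&d);
	}
	toy_discard_objects(&d);	/* final cleanup at the end of the walk */
	return d.peak;
}
```

For merges writing 3, 0, 5 and 2 objects, the peak is 5 with per-merge
discarding and 10 without it.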

Elijah Newren (8):
  show, log: provide a --remerge-diff capability
  log: clean unneeded objects during `log --remerge-diff`
  ll-merge: make callers responsible for showing warnings
  merge-ort: capture and print ll-merge warnings in our preferred
    fashion
  merge-ort: mark a few more conflict messages as omittable
  merge-ort: format messages slightly different for use in headers
  diff: add ability to insert additional headers for paths
  show, log: include conflict/warning messages in --remerge-diff headers

 Documentation/diff-options.txt |   8 ++
 apply.c                        |   5 +-
 builtin/checkout.c             |  12 ++-
 builtin/log.c                  |  15 +++
 diff-merges.c                  |  12 +++
 diff.c                         | 116 +++++++++++++++++++++-
 diff.h                         |   3 +-
 ll-merge.c                     |  40 ++++----
 ll-merge.h                     |   9 +-
 log-tree.c                     |  70 +++++++++++++-
 merge-blobs.c                  |   5 +-
 merge-ort.c                    |  47 ++++++++-
 merge-ort.h                    |  10 ++
 merge-recursive.c              |   8 +-
 merge-recursive.h              |   1 +
 notes-merge.c                  |   5 +-
 rerere.c                       |  12 ++-
 revision.h                     |   6 +-
 t/t4069-remerge-diff.sh        | 172 +++++++++++++++++++++++++++++++++
 t/t6404-recursive-merge.sh     |   9 +-
 t/t6406-merge-attr.sh          |   9 +-
 tmp-objdir.c                   |   5 +
 tmp-objdir.h                   |   6 ++
 23 files changed, 538 insertions(+), 47 deletions(-)
 create mode 100755 t/t4069-remerge-diff.sh


base-commit: 4e44121c2d7bced65e25eb7ec5156290132bec94
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1103%2Fnewren%2Fremerge-diff-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1103/newren/remerge-diff-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1103

Range-diff vs v1:

  8:  5d5846be0bd !  1:  b3ae62083e1 show, log: provide a --remerge-diff capability
     @@ Commit message
          possibly, just to hide other random changes).
      
          This capability works by creating a temporary object directory and
     -    marking it as the primary object store, so that any blobs or trees
     -    created during the automatic merge, can be easily removed afterwards by
     -    just deleting all objects from the temporary object directory.  We can
     -    do this after handling each merge commit, in order to avoid the need to
     -    worry about doing `git gc --auto` runs while running `git log
     -    --remerge-diff`.
     +    marking it as the primary object store.  This makes any blobs or trees
     +    created during the automatic merge easily removable afterwards by just
     +    deleting all objects from the temporary object directory.
     +
     +    There are a few ways that this implementation is suboptimal:
     +      * `log --remerge-diff` becomes slow, because the temporary object
     +        directory can fill with many loose objects while running
     +      * the log output can be muddied with misplaced "warning: cannot merge
     +        binary files" messages, since ll-merge.c unconditionally writes those
     +        messages to stderr while running instead of allowing callers to
     +        manage them.
     +      * important conflict and warning messages are simply dropped; thus for
     +        conflicts like modify/delete or rename/rename or file/directory which
     +        are not representable with content conflict markers, there may be no
     +        way for a user of --remerge-diff to know that there had been a
     +        conflict which was resolved (and which possibly motivated other
     +        changes in the merge commit).
     +    Subsequent commits will address these issues.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     + ## Documentation/diff-options.txt ##
     +@@ Documentation/diff-options.txt: ifdef::git-log[]
     + 	each of the parents. Separate log entry and diff is generated
     + 	for each parent.
     + +
     ++--diff-merges=remerge:::
     ++--diff-merges=r:::
     ++--remerge-diff:::
     ++	With this option, two-parent merge commits are remerged to
     ++	create a temporary tree object -- potentially containing files
     ++	with conflict markers and such.  A diff is then shown between
     ++	that temporary tree and the actual merge commit.
     +++
     + --diff-merges=combined:::
     + --diff-merges=c:::
     + -c:::
     +
       ## builtin/log.c ##
      @@
       #include "repository.h"
       #include "commit-reach.h"
       #include "range-diff.h"
     -+#include "dir.h"
      +#include "tmp-objdir.h"
       
       #define MAIL_DEFAULT_WRAP 72
       #define COVER_FROM_AUTO_MAX_SUBJECT_LEN 100
      @@ builtin/log.c: static int cmd_log_walk(struct rev_info *rev)
     + 	struct commit *commit;
       	int saved_nrl = 0;
       	int saved_dcctc = 0;
     - 
     ++	struct tmp_objdir *remerge_objdir = NULL;
     ++
      +	if (rev->remerge_diff) {
     -+		rev->remerge_objdir = tmp_objdir_create("remerge-diff");
     -+		if (!rev->remerge_objdir)
     -+			die(_("unable to create temporary object directory"));
     -+		tmp_objdir_replace_primary_odb(rev->remerge_objdir, 1);
     ++		remerge_objdir = tmp_objdir_create("remerge-diff");
     ++		if (!remerge_objdir)
     ++			die_errno(_("unable to create temporary object directory"));
     ++		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
      +	}
     -+
     + 
       	if (rev->early_output)
       		setup_early_output();
     - 
      @@ builtin/log.c: static int cmd_log_walk(struct rev_info *rev)
       	rev->diffopt.no_free = 0;
       	diff_free(&rev->diffopt);
       
     -+	if (rev->remerge_diff) {
     -+		tmp_objdir_destroy(rev->remerge_objdir);
     -+		rev->remerge_objdir = NULL;
     -+	}
     ++	if (rev->remerge_diff)
     ++		tmp_objdir_destroy(remerge_objdir);
      +
       	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
       	    rev->diffopt.flags.check_failed) {
     @@ builtin/log.c: int cmd_format_patch(int argc, const char **argv, const char *pre
       	if (rev.diffopt.output_format & DIFF_FORMAT_CHECKDIFF)
       		die(_("--check does not make sense"));
      +	if (rev.remerge_diff)
     -+		die(_("--remerge_diff does not make sense"));
     ++		die(_("--remerge-diff does not make sense"));
       
       	if (!use_patch_format &&
       		(!rev.diffopt.output_format ||
     @@ log-tree.c
       #include "config.h"
       #include "diff.h"
       #include "object-store.h"
     - #include "repository.h"
     -+#include "tmp-objdir.h"
     - #include "commit.h"
     +@@
       #include "tag.h"
       #include "graph.h"
       #include "log-tree.h"
     @@ log-tree.c
       #include "reflog-walk.h"
       #include "refs.h"
       #include "string-list.h"
     -@@
     - #include "line-log.h"
     - #include "help.h"
     - #include "range-diff.h"
     -+#include "dir.h"
     - 
     - static struct decoration name_decoration = { "object names" };
     - static int decoration_loaded;
      @@ log-tree.c: static int do_diff_combined(struct rev_info *opt, struct commit *commit)
       	return !opt->loginfo;
       }
     @@ log-tree.c: static int do_diff_combined(struct rev_info *opt, struct commit *com
      +{
      +	struct merge_options o;
      +	struct commit_list *bases;
     -+	struct merge_result res;
     ++	struct merge_result res = {0};
      +	struct pretty_print_context ctx = {0};
     -+	struct strbuf commit1 = STRBUF_INIT;
     -+	struct strbuf commit2 = STRBUF_INIT;
     ++	struct commit *parent1 = parents->item;
     ++	struct commit *parent2 = parents->next->item;
     ++	struct strbuf parent1_desc = STRBUF_INIT;
     ++	struct strbuf parent2_desc = STRBUF_INIT;
      +
      +	/* Setup merge options */
      +	init_merge_options(&o, the_repository);
     -+	memset(&res, 0, sizeof(res));
      +	o.show_rename_progress = 0;
      +
      +	ctx.abbrev = DEFAULT_ABBREV;
     -+	format_commit_message(parents->item,       "%h (%s)", &commit1, &ctx);
     -+	format_commit_message(parents->next->item, "%h (%s)", &commit2, &ctx);
     -+	o.branch1 = commit1.buf;
     -+	o.branch2 = commit2.buf;
     -+	o.record_conflict_msgs_as_headers = 1;
     ++	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
     ++	format_commit_message(parent2, "%h (%s)", &parent2_desc, &ctx);
     ++	o.branch1 = parent1_desc.buf;
     ++	o.branch2 = parent2_desc.buf;
      +
      +	/* Parse the relevant commits and get the merge bases */
     -+	parse_commit_or_die(parents->item);
     -+	parse_commit_or_die(parents->next->item);
     -+	bases = get_merge_bases(parents->item, parents->next->item);
     ++	parse_commit_or_die(parent1);
     ++	parse_commit_or_die(parent2);
     ++	bases = get_merge_bases(parent1, parent2);
      +
      +	/* Re-merge the parents */
     -+	merge_incore_recursive(&o,
     -+			       bases, parents->item, parents->next->item,
     -+			       &res);
     ++	merge_incore_recursive(&o, bases, parent1, parent2, &res);
      +
      +	/* Show the diff */
     -+	opt->diffopt.additional_path_headers = res.path_messages;
      +	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
      +	log_tree_diff_flush(opt);
      +
      +	/* Cleanup */
     -+	opt->diffopt.additional_path_headers = NULL;
     -+	strbuf_release(&commit1);
     -+	strbuf_release(&commit2);
     ++	strbuf_release(&parent1_desc);
     ++	strbuf_release(&parent2_desc);
      +	merge_finalize(&o, &res);
     -+
     -+	/* Clean up the temporary object directory */
     -+	if (opt->remerge_objdir != NULL)
     -+		tmp_objdir_discard_objects(opt->remerge_objdir);
     -+	else
     -+		BUG("unable to remove temporary object directory");
     ++	/* TODO: clean up the temporary object directory */
      +
      +	return !opt->loginfo;
      +}
     @@ revision.h: struct rev_info {
       
       	/* Format info */
       	int		show_notes;
     -@@ revision.h: struct rev_info {
     - 
     - 	/* misc. flags related to '--no-kept-objects' */
     - 	unsigned keep_pack_cache_flags;
     +
     + ## t/t4069-remerge-diff.sh (new) ##
     +@@
     ++#!/bin/sh
      +
     -+	/* Location where temporary objects for remerge-diff are written. */
     -+	struct tmp_objdir *remerge_objdir;
     - };
     - 
     - int ref_excluded(struct string_list *, const char *path);
     ++test_description='remerge-diff handling'
     ++
     ++. ./test-lib.sh
     ++
     ++test_expect_success 'setup basic merges' '
     ++	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
     ++	git add numbers &&
     ++	git commit -m base &&
     ++
     ++	git branch feature_a &&
     ++	git branch feature_b &&
     ++	git branch feature_c &&
     ++
     ++	git branch ab_resolution &&
     ++	git branch bc_resolution &&
     ++
     ++	git checkout feature_a &&
     ++	test_write_lines 1 2 three 4 5 6 7 eight 9 >numbers &&
     ++	git commit -a -m change_a &&
     ++
     ++	git checkout feature_b &&
     ++	test_write_lines 1 2 tres 4 5 6 7 8 9 >numbers &&
     ++	git commit -a -m change_b &&
     ++
     ++	git checkout feature_c &&
     ++	test_write_lines 1 2 3 4 5 6 7 8 9 10 >numbers &&
     ++	git commit -a -m change_c &&
     ++
     ++	git checkout bc_resolution &&
     ++	# fast forward
     ++	git merge feature_b &&
     ++	# no conflict
     ++	git merge feature_c &&
     ++
     ++	git checkout ab_resolution &&
     ++	# fast forward
     ++	git merge feature_a &&
     ++	# conflicts!
     ++	test_must_fail git merge feature_b &&
     ++	# Resolve conflict...and make another change elsewhere
     ++	test_write_lines 1 2 drei 4 5 6 7 acht 9 >numbers &&
     ++	git add numbers &&
     ++	git merge --continue
     ++'
     ++
     ++test_expect_success 'remerge-diff on a clean merge' '
     ++	git log -1 --oneline bc_resolution >expect &&
     ++	git show --oneline --remerge-diff bc_resolution >actual &&
     ++	test_cmp expect actual
     ++'
     ++
     ++test_expect_success 'remerge-diff with both a resolved conflict and an unrelated change' '
     ++	git log -1 --oneline ab_resolution >tmp &&
     ++	cat <<-EOF >>tmp &&
     ++	diff --git a/numbers b/numbers
     ++	index a1fb731..6875544 100644
     ++	--- a/numbers
     ++	+++ b/numbers
     ++	@@ -1,13 +1,9 @@
     ++	 1
     ++	 2
     ++	-<<<<<<< b0ed5cb (change_a)
     ++	-three
     ++	-=======
     ++	-tres
     ++	->>>>>>> 6cd3f82 (change_b)
     ++	+drei
     ++	 4
     ++	 5
     ++	 6
     ++	 7
     ++	-eight
     ++	+acht
     ++	 9
     ++	EOF
     ++	# Hashes above are sha1; rip them out so test works with sha256
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
     ++
     ++	git show --oneline --remerge-diff ab_resolution >tmp &&
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
     ++	test_cmp expect actual
     ++'
     ++
     ++test_done
  1:  fab1b2c69ea !  2:  54f1fb31d04 tmp_objdir: add a helper function for discarding all contained objects
     @@ Metadata
      Author: Elijah Newren <newren@gmail.com>
      
       ## Commit message ##
     -    tmp_objdir: add a helper function for discarding all contained objects
     +    log: clean unneeded objects during `log --remerge-diff`
     +
     +    The --remerge-diff option will need to create new blobs and trees
     +    representing the "automatic merge" state.  If one is traversing a
     +    long project history, one can easily get hundreds of thousands of
     +    loose objects generated during `log --remerge-diff`.  However, none of
     +    those loose objects are needed after we have completed our diff
     +    operation; they can be summarily deleted.
     +
     +    Add a new helper function to tmp_objdir to discard all the contained
     +    objects, and call it after each merge is handled.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     + ## builtin/log.c ##
     +@@ builtin/log.c: static int cmd_log_walk(struct rev_info *rev)
     + 	struct commit *commit;
     + 	int saved_nrl = 0;
     + 	int saved_dcctc = 0;
     +-	struct tmp_objdir *remerge_objdir = NULL;
     + 
     + 	if (rev->remerge_diff) {
     +-		remerge_objdir = tmp_objdir_create("remerge-diff");
     +-		if (!remerge_objdir)
     ++		rev->remerge_objdir = tmp_objdir_create("remerge-diff");
     ++		if (!rev->remerge_objdir)
     + 			die_errno(_("unable to create temporary object directory"));
     +-		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
     ++		tmp_objdir_replace_primary_odb(rev->remerge_objdir, 1);
     + 	}
     + 
     + 	if (rev->early_output)
     +@@ builtin/log.c: static int cmd_log_walk(struct rev_info *rev)
     + 	rev->diffopt.no_free = 0;
     + 	diff_free(&rev->diffopt);
     + 
     +-	if (rev->remerge_diff)
     +-		tmp_objdir_destroy(remerge_objdir);
     ++	if (rev->remerge_diff) {
     ++		tmp_objdir_destroy(rev->remerge_objdir);
     ++		rev->remerge_objdir = NULL;
     ++	}
     + 
     + 	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
     + 	    rev->diffopt.flags.check_failed) {
     +
     + ## log-tree.c ##
     +@@
     + #include "diff.h"
     + #include "object-store.h"
     + #include "repository.h"
     ++#include "tmp-objdir.h"
     + #include "commit.h"
     + #include "tag.h"
     + #include "graph.h"
     +@@ log-tree.c: static int do_remerge_diff(struct rev_info *opt,
     + 	strbuf_release(&parent1_desc);
     + 	strbuf_release(&parent2_desc);
     + 	merge_finalize(&o, &res);
     +-	/* TODO: clean up the temporary object directory */
     ++
     ++	/* Clean up the contents of the temporary object directory */
     ++	if (opt->remerge_objdir)
     ++		tmp_objdir_discard_objects(opt->remerge_objdir);
     ++	else
     ++		BUG("unable to remove temporary object directory");
     + 
     + 	return !opt->loginfo;
     + }
     +
     + ## revision.h ##
     +@@ revision.h: struct rev_info {
     + 
     + 	/* misc. flags related to '--no-kept-objects' */
     + 	unsigned keep_pack_cache_flags;
     ++
     ++	/* Location where temporary objects for remerge-diff are written. */
     ++	struct tmp_objdir *remerge_objdir;
     + };
     + 
     + int ref_excluded(struct string_list *, const char *path);
     +
       ## tmp-objdir.c ##
      @@ tmp-objdir.c: static void remove_tmp_objdir_on_signal(int signo)
       	raise(signo);
  2:  d022176618d !  3:  d5566f5d136 ll-merge: make callers responsible for showing warnings
     @@ Commit message
          from ll-merge and instead modify the return status of ll_merge() to
          indicate when a merge of binary files has occurred.
      
     +    This commit continues printing the message as-is; future changes will
     +    start handling the new commit differently in the merge-ort codepath.
     +
          Note that my methodology included first modifying ll_merge() to return
          a struct, so that the compiler would catch all the callers for me and
          ensure I had modified all of them.  After modifying all of them, I then
     @@ apply.c: static int three_way_merge(struct apply_state *state,
       			  NULL);
      +	if (status == LL_MERGE_BINARY_CONFLICT)
      +		warning("Cannot merge binary files: %s (%s vs. %s)",
     -+			"base", "ours", "theirs");
     ++			path, "ours", "theirs");
       	free(base_file.ptr);
       	free(our_file.ptr);
       	free(their_file.ptr);
     @@ rerere.c: static int try_merge(struct index_state *istate,
       	mmfile_t base = {NULL, 0}, other = {NULL, 0};
       
       	if (read_mmfile(&base, rerere_path(id, "preimage")) ||
     - 	    read_mmfile(&other, rerere_path(id, "postimage")))
     +-	    read_mmfile(&other, rerere_path(id, "postimage")))
      -		ret = 1;
      -	else
     ++	    read_mmfile(&other, rerere_path(id, "postimage"))) {
      +		ret = LL_MERGE_CONFLICT;
     -+	else {
     ++	} else {
       		/*
       		 * A three-way merge. Note that this honors user-customizable
       		 * low-level merge driver settings.
  3:  f36395fdee0 =  4:  a02845f12db merge-ort: capture and print ll-merge warnings in our preferred fashion
  4:  1e7eef7705e !  5:  000933c5d7f merge-ort: mark a few more conflict messages as omittable
     @@ Commit message
          are just noise when trying to see what changes users made to create a
          merge commit.  Mark them as omittable.
      
     +    Note that there were already a few messages marked as omittable in
     +    merge-ort when doing a remerge-diff, because the development of
     +    --remerge-diff preceded the upstreaming of merge-ort and I was trying to
     +    ensure merge-ort could handle all the necessary requirements.  See
     +    commit c5a6f65527 ("merge-ort: add modify/delete handling and delayed
     +    output processing", 2020-12-03) for the initial details.  For some
     +    examples of already-marked-as-omittable messages, see either
     +    "Auto-merging <path>" or some of the submodule update hints.  This
     +    commit just adds two more messages that should also be omittable.
     +
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
       ## merge-ort.c ##
     @@ merge-ort.c: static void apply_directory_rename_modifications(struct merge_optio
       				 _("CONFLICT (file location): %s renamed to %s "
       				   "in %s, inside a directory that was renamed "
       				   "in %s, suggesting it should perhaps be "
     -@@ merge-ort.c: static void process_entry(struct merge_options *opt,
     - 				reason = _("add/add");
     - 			if (S_ISGITLINK(merged_file.mode))
     - 				reason = _("submodule");
     --			path_msg(opt, path, 0,
     -+			path_msg(opt, path, 1,
     - 				 _("CONFLICT (%s): Merge conflict in %s"),
     - 				 reason, path);
     - 		}
  7:  b307f63569f !  6:  887e46435c0 merge-ort: format messages slightly different for use in headers
     @@ Metadata
       ## Commit message ##
          merge-ort: format messages slightly different for use in headers
      
     -    We want to add an ability for users to run
     +    When users run
              git show --remerge-diff $MERGE_COMMIT
     -    or even
     +    or
              git log -p --remerge-diff ...
     -    and have git show the differences between where the merge machinery
     -    would stop and what is recorded in merge commits.  However, in such
     -    cases, stdout is not an appropriate location to dump conflict messages.
     -    We instead want these messages to appear as headers in the subsequent
     -    diff.  For them to work as headers, though, we need for any multiline
     +    stdout is not an appropriate location to dump conflict messages, but we
     +    do want to provide them to users.  We will include them in the diff
     +    headers instead...but for that to work, we need for any multiline
          messages to replace newlines with both a newline and a space.  Add a new
          flag to signal when we want these messages modified in such a fashion,
          and use it in path_msg() to modify these messages this way.
  6:  15600df925f !  7:  e9470651303 diff: add ability to insert additional headers for paths
     @@ Metadata
       ## Commit message ##
          diff: add ability to insert additional headers for paths
      
     -    In support of a remerge-diff ability we will add in a few commits, we
     -    want to be able to provide additional headers to show along with a diff.
     -    Add the plumbing necessary to enable this.
     +    When additional headers are provided, we need to
     +      * add diff_filepairs to diff_queued_diff for each path in the
     +        additional headers map, unless that path is part of
     +        another diff_filepair already found in diff_queued_diff
     +      * format the headers (colorization, line_prefix for --graph)
     +      * make sure the various codepaths that attempt to return early
     +        if there are "no changes" take into account the headers that
     +        need to be shown.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     @@ diff.c: struct userdiff_driver *get_textconv(struct repository *r,
       	return userdiff_get_textconv(r, one->driver);
       }
       
     -+static struct strbuf* additional_headers(struct diff_options *o,
     ++static struct strbuf *additional_headers(struct diff_options *o,
      +					 const char *path)
      +{
      +	if (!o->additional_path_headers)
     @@ diff.c: struct userdiff_driver *get_textconv(struct repository *r,
      +{
      +	char *next, *newline;
      +
     -+	next = more_headers->buf;
     -+	while ((newline = strchr(next, '\n'))) {
     -+		*newline = '\0';
     -+		strbuf_addf(msg, "%s%s%s%s\n", line_prefix, meta, next, reset);
     -+		*newline = '\n';
     -+		next = newline + 1;
     ++	for (next = more_headers->buf; *next; next = newline) {
     ++		newline = strchrnul(next, '\n');
     ++		strbuf_addf(msg, "%s%s%.*s%s\n", line_prefix, meta,
     ++			    (int)(newline - next), next, reset);
     ++		if (*newline)
     ++			newline++;
      +	}
     -+	if (*next)
     -+		strbuf_addf(msg, "%s%s%s%s\n", line_prefix, meta, next, reset);
      +}
      +
       static void builtin_diff(const char *name_a,
       			 const char *name_b,
       			 struct diff_filespec *one,
     +@@ diff.c: static void builtin_diff(const char *name_a,
     + 	b_two = quote_two(b_prefix, name_b + (*name_b == '/'));
     + 	lbl[0] = DIFF_FILE_VALID(one) ? a_one : "/dev/null";
     + 	lbl[1] = DIFF_FILE_VALID(two) ? b_two : "/dev/null";
     ++	if (!DIFF_FILE_VALID(one) && !DIFF_FILE_VALID(two)) {
     ++		/*
     ++		 * We should only reach this point for pairs from
     ++		 * create_filepairs_for_header_only_notifications().  For
     ++		 * these, we should avoid the "/dev/null" special casing
     ++		 * above, meaning we avoid showing such pairs as either
     ++		 * "new file" or "deleted file" below.
     ++		 */
     ++		lbl[0] = a_one;
     ++		lbl[1] = b_two;
     ++	}
     + 	strbuf_addf(&header, "%s%sdiff --git %s %s%s\n", line_prefix, meta, a_one, b_two, reset);
     + 	if (lbl[0][0] == '/') {
     + 		/* /dev/null */
      @@ diff.c: static void fill_metainfo(struct strbuf *msg,
       	const char *set = diff_get_color(use_color, DIFF_METAINFO);
       	const char *reset = diff_get_color(use_color, DIFF_RESET);
     @@ diff.c: static void fill_metainfo(struct strbuf *msg,
       
       	*must_show_header = 1;
       	strbuf_init(msg, PATH_MAX * 2 + 300);
     -+	if ((more_headers = additional_headers(o, name)))
     +@@ diff.c: static void fill_metainfo(struct strbuf *msg,
     + 	default:
     + 		*must_show_header = 0;
     + 	}
     ++	if ((more_headers = additional_headers(o, name))) {
      +		add_formatted_headers(msg, more_headers,
      +				      line_prefix, set, reset);
     - 	switch (p->status) {
     - 	case DIFF_STATUS_COPIED:
     - 		strbuf_addf(msg, "%s%ssimilarity index %d%%",
     ++		*must_show_header = 1;
     ++	}
     + 	if (one && two && !oideq(&one->oid, &two->oid)) {
     + 		const unsigned hexsz = the_hash_algo->hexsz;
     + 		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
      @@ diff.c: int diff_unmodified_pair(struct diff_filepair *p)
       
       static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o)
       {
      -	if (diff_unmodified_pair(p))
     ++	/*
     ++	 * Check if we can return early without showing a diff.  Note that
     ++	 * diff_filepair only stores {oid, path, mode, is_valid}
     ++	 * information for each path, and thus diff_unmodified_pair() only
     ++	 * considers those bits of info.  However, we do not want pairs
     ++	 * created by create_filepairs_for_header_only_notifications() to
     ++	 * be ignored, so return early if both p is unmodified AND
     ++	 * p->one->path is not in additional headers.
     ++	 */
      +	if (diff_unmodified_pair(p) && !additional_headers(o, p->one->path))
       		return;
       
     ++	/* Actually, we can also return early to avoid showing tree diffs */
       	if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) ||
     + 	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
     +-		return; /* no tree diffs in patch format */
     ++		return;
     + 
     + 	run_diff(p, o);
     + }
     +@@ diff.c: static void diff_flush_checkdiff(struct diff_filepair *p,
     + 	run_checkdiff(p, o);
     + }
     + 
     +-int diff_queue_is_empty(void)
     ++int diff_queue_is_empty(struct diff_options *o)
     + {
     + 	struct diff_queue_struct *q = &diff_queued_diff;
     + 	int i;
     ++
     ++	if (o->additional_path_headers &&
     ++	    !strmap_empty(o->additional_path_headers))
     ++		return 0;
     + 	for (i = 0; i < q->nr; i++)
     + 		if (!diff_unmodified_pair(q->queue[i]))
     + 			return 0;
     +@@ diff.c: void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc)
     + 		warning(_(rename_limit_advice), varname, needed);
     + }
     + 
     ++static void create_filepairs_for_header_only_notifications(struct diff_options *o)
     ++{
     ++	struct strset present;
     ++	struct diff_queue_struct *q = &diff_queued_diff;
     ++	struct hashmap_iter iter;
     ++	struct strmap_entry *e;
     ++	int i;
     ++
     ++	strset_init_with_options(&present, /*pool*/ NULL, /*strdup*/ 0);
     ++
     ++	/*
     ++	 * Find out which paths exist in diff_queued_diff, preferring
     ++	 * one->path for any pair that has multiple paths.
     ++	 */
     ++	for (i = 0; i < q->nr; i++) {
     ++		struct diff_filepair *p = q->queue[i];
     ++		char *path = p->one->path ? p->one->path : p->two->path;
     ++
     ++		if (strmap_contains(o->additional_path_headers, path))
     ++			strset_add(&present, path);
     ++	}
     ++
     ++	/*
     ++	 * Loop over paths in additional_path_headers; for each NOT already
     ++	 * in diff_queued_diff, create a synthetic filepair and insert that
     ++	 * into diff_queued_diff.
     ++	 */
     ++	strmap_for_each_entry(o->additional_path_headers, &iter, e) {
     ++		if (!strset_contains(&present, e->key)) {
     ++			struct diff_filespec *one, *two;
     ++			struct diff_filepair *p;
     ++
     ++			one = alloc_filespec(e->key);
     ++			two = alloc_filespec(e->key);
     ++			fill_filespec(one, null_oid(), 0, 0);
     ++			fill_filespec(two, null_oid(), 0, 0);
     ++			p = diff_queue(q, one, two);
     ++			p->status = DIFF_STATUS_MODIFIED;
     ++		}
     ++	}
     ++
     ++	/* Re-sort the filepairs */
     ++	diffcore_fix_diff_index();
     ++
     ++	/* Cleanup */
     ++	strset_clear(&present);
     ++}
     ++
     + static void diff_flush_patch_all_file_pairs(struct diff_options *o)
     + {
     + 	int i;
     +@@ diff.c: static void diff_flush_patch_all_file_pairs(struct diff_options *o)
     + 	if (o->color_moved)
     + 		o->emitted_symbols = &esm;
     + 
     ++	if (o->additional_path_headers)
     ++		create_filepairs_for_header_only_notifications(o);
     ++
     + 	for (i = 0; i < q->nr; i++) {
     + 		struct diff_filepair *p = q->queue[i];
     + 		if (check_pair_status(p))
     +@@ diff.c: void diff_flush(struct diff_options *options)
     + 	 * Order: raw, stat, summary, patch
     + 	 * or:    name/name-status/checkdiff (other bits clear)
     + 	 */
     +-	if (!q->nr)
     ++	if (!q->nr && !options->additional_path_headers)
     + 		goto free_queue;
     + 
     + 	if (output_format & (DIFF_FORMAT_RAW |
      
       ## diff.h ##
      @@ diff.h: struct diff_options {
     @@ diff.h: struct diff_options {
       
       	int no_free;
       };
     +@@ diff.h: void diffcore_fix_diff_index(void);
     + "                show all files diff when -S is used and hit is found.\n" \
     + "  -a  --text    treat all files as text.\n"
     + 
     +-int diff_queue_is_empty(void);
     ++int diff_queue_is_empty(struct diff_options*);
     + void diff_flush(struct diff_options*);
     + void diff_free(struct diff_options*);
     + void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc);
     +
     + ## log-tree.c ##
     +@@ log-tree.c: int log_tree_diff_flush(struct rev_info *opt)
     + 	opt->shown_dashes = 0;
     + 	diffcore_std(&opt->diffopt);
     + 
     +-	if (diff_queue_is_empty()) {
     ++	if (diff_queue_is_empty(&opt->diffopt)) {
     + 		int saved_fmt = opt->diffopt.output_format;
     + 		opt->diffopt.output_format = DIFF_FORMAT_NO_OUTPUT;
     + 		diff_flush(&opt->diffopt);
  5:  dd5461d45de !  8:  4cc53c55a6e merge-ort: make path_messages available to external callers
     @@ Metadata
      Author: Elijah Newren <newren@gmail.com>
      
       ## Commit message ##
     -    merge-ort: make path_messages available to external callers
     +    show, log: include conflict/warning messages in --remerge-diff headers
      
     -    merge-ort is designed to be more flexible so that it could be called as
     -    more of a library function.  Part of that design is not writing to the
     -    working tree or index unless and until requested.  Part of it is
     -    returning tree objects (rather than creating commits and making them
     -    part of HEAD), and allowing callers to do their own special thing with
     -    that merged tree.  Along the same lines, we want to enable callers to do
     -    something special with output messages (conflicts and other warnings)
     -    besides just automatically displaying on stdout/stderr.  Do so by making
     -    the output path messages accessible via a new member of struct
     -    merge_result named path_messages.
     +    Conflicts such as modify/delete, rename/rename, or file/directory are
     +    not representable via content conflict markers, and the normal output
     +    messages notifying users about these were dropped with --remerge-diff.
     +    While we don't want these messages randomly shown before the commit
     +    and diff headers, we do want them to still be shown; include them as
     +    part of the diff headers instead.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     + ## log-tree.c ##
     +@@ log-tree.c: static int do_remerge_diff(struct rev_info *opt,
     + 	/* Setup merge options */
     + 	init_merge_options(&o, the_repository);
     + 	o.show_rename_progress = 0;
     ++	o.record_conflict_msgs_as_headers = 1;
     + 
     + 	ctx.abbrev = DEFAULT_ABBREV;
     + 	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
     +@@ log-tree.c: static int do_remerge_diff(struct rev_info *opt,
     + 	merge_incore_recursive(&o, bases, parent1, parent2, &res);
     + 
     + 	/* Show the diff */
     ++	opt->diffopt.additional_path_headers = res.path_messages;
     + 	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
     + 	log_tree_diff_flush(opt);
     + 
     + 	/* Cleanup */
     ++	opt->diffopt.additional_path_headers = NULL;
     + 	strbuf_release(&parent1_desc);
     + 	strbuf_release(&parent2_desc);
     + 	merge_finalize(&o, &res);
     +
       ## merge-ort.c ##
      @@ merge-ort.c: redo:
       	trace2_region_leave("merge", "process_entries", opt->repo);
     @@ merge-ort.h: struct merge_result {
       	/*
       	 * Additional metadata used by merge_switch_to_result() or future calls
       	 * to merge_incore_*().  Includes data needed to update the index (if
     +
     + ## t/t4069-remerge-diff.sh ##
     +@@ t/t4069-remerge-diff.sh: test_description='remerge-diff handling'
     + 
     + . ./test-lib.sh
     + 
     ++# --remerge-diff uses ort under the hood regardless of setting.  However,
     ++# we set up a file/directory conflict beforehand, and the different backends
     ++# handle the conflict differently, which would require separate code paths
     ++# to resolve.  There's not much point in making the code uglier to do that,
     ++# though, when the real thing we are testing (--remerge-diff) will hardcode
     ++# calls directly into the merge-ort API anyway.  So just force the use of
     ++# ort on the setup too.
     ++GIT_TEST_MERGE_ALGORITHM=ort
     ++
     + test_expect_success 'setup basic merges' '
     + 	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
     + 	git add numbers &&
     +@@ t/t4069-remerge-diff.sh: test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
     + 	git log -1 --oneline ab_resolution >tmp &&
     + 	cat <<-EOF >>tmp &&
     + 	diff --git a/numbers b/numbers
     ++	CONFLICT (content): Merge conflict in numbers
     + 	index a1fb731..6875544 100644
     + 	--- a/numbers
     + 	+++ b/numbers
     +@@ t/t4069-remerge-diff.sh: test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
     + 	test_cmp expect actual
     + '
     + 
     ++test_expect_success 'setup non-content conflicts' '
     ++	git switch --orphan base &&
     ++
     ++	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
     ++	test_write_lines a b c d e f g h i >letters &&
     ++	test_write_lines in the way >content &&
     ++	git add numbers letters content &&
     ++	git commit -m base &&
     ++
     ++	git branch side1 &&
     ++	git branch side2 &&
     ++
     ++	git checkout side1 &&
     ++	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
     ++	git mv letters letters_side1 &&
     ++	git mv content file_or_directory &&
     ++	git add numbers &&
     ++	git commit -m side1 &&
     ++
     ++	git checkout side2 &&
     ++	git rm numbers &&
     ++	git mv letters letters_side2 &&
     ++	mkdir file_or_directory &&
     ++	echo hello >file_or_directory/world &&
     ++	git add file_or_directory/world &&
     ++	git commit -m side2 &&
     ++
     ++	git checkout -b resolution side1 &&
     ++	test_must_fail git merge side2 &&
     ++	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
     ++	git add numbers &&
     ++	git add letters_side1 &&
     ++	git rm letters &&
     ++	git rm letters_side2 &&
     ++	git add file_or_directory~HEAD &&
     ++	git mv file_or_directory~HEAD wanted_content &&
     ++	git commit -m resolved
     ++'
     ++
     ++test_expect_success 'remerge-diff with non-content conflicts' '
     ++	git log -1 --oneline resolution >tmp &&
     ++	cat <<-EOF >>tmp &&
     ++	diff --git a/file_or_directory~HASH (side1) b/wanted_content
     ++	similarity index 100%
     ++	rename from file_or_directory~HASH (side1)
     ++	rename to wanted_content
     ++	CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
     ++	diff --git a/letters b/letters
     ++	CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
     ++	diff --git a/letters_side2 b/letters_side2
     ++	deleted file mode 100644
     ++	index b236ae5..0000000
     ++	--- a/letters_side2
     ++	+++ /dev/null
     ++	@@ -1,9 +0,0 @@
     ++	-a
     ++	-b
     ++	-c
     ++	-d
     ++	-e
     ++	-f
     ++	-g
     ++	-h
     ++	-i
     ++	diff --git a/numbers b/numbers
     ++	CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
     ++	EOF
     ++	# We still have some sha1 hashes above; rip them out so test works
     ++	# with sha256
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
     ++
     ++	git show --oneline --remerge-diff resolution >tmp &&
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
     ++	test_cmp expect actual
     ++'
     ++
     + test_done
  9:  4f21969e357 <  -:  ----------- doc/diff-options: explain the new --remerge-diff option
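
The revised loop in add_formatted_headers() in the range-diff above walks a
newline-separated buffer without temporarily writing NUL bytes into it.  A
minimal standalone sketch of that technique follows.  The names
find_char_or_end() and format_header_lines() are hypothetical, and a plain
char buffer stands in for git's strbuf; find_char_or_end() is a portable
substitute for the GNU strchrnul() extension.

```c
#include <stdio.h>
#include <string.h>

/* Portable stand-in for GNU strchrnul(): the match, or the final NUL. */
static const char *find_char_or_end(const char *s, int c)
{
	const char *p = strchr(s, c);

	return p ? p : s + strlen(s);
}

/*
 * Write each line of `headers` to `out` as <prefix><line><suffix>\n.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * `out` is too small.  `headers` is never modified.
 */
static int format_header_lines(char *out, size_t out_len,
			       const char *headers,
			       const char *prefix, const char *suffix)
{
	size_t used = 0;
	const char *next, *newline;

	if (!out_len)
		return -1;
	out[0] = '\0';
	for (next = headers; *next; next = newline) {
		int n;

		newline = find_char_or_end(next, '\n');
		/* "%.*s" prints exactly the bytes of the current line */
		n = snprintf(out + used, out_len - used, "%s%.*s%s\n",
			     prefix, (int)(newline - next), next, suffix);
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;
		used += n;
		if (*newline)
			newline++;	/* step past the '\n' */
	}
	return (int)used;
}
```

Because the final line is handled by the same loop body whether or not it
ends in a newline, no separate trailing-line case is needed.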

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH v2 1/8] show, log: provide a --remerge-diff capability
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
@ 2021-12-25  7:59   ` Elijah Newren via GitGitGadget
  2021-12-28 10:56     ` Johannes Altmanninger
  2021-12-25  7:59   ` [PATCH v2 2/8] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
                     ` (9 subsequent siblings)
  10 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-25  7:59 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When this option is specified, we remerge all (two parent) merge commits
and diff the actual merge commit to the automatically created version,
in order to show how users removed conflict markers, resolved the
different conflict versions, and potentially added new changes outside
of conflict regions in order to resolve semantic merge problems (or,
possibly, just to hide other random changes).

This capability works by creating a temporary object directory and
marking it as the primary object store.  This makes any blobs or trees
created during the automatic merge easily removable afterwards by just
deleting all objects from the temporary object directory.

There are a few ways that this implementation is suboptimal:
  * `log --remerge-diff` becomes slow, because the temporary object
    directory can fill with many loose objects while running
  * the log output can be muddied with misplaced "warning: cannot merge
    binary files" messages, since ll-merge.c unconditionally writes those
    messages to stderr while running instead of allowing callers to
    manage them.
  * important conflict and warning messages are simply dropped; thus for
    conflicts like modify/delete or rename/rename or file/directory which
    are not representable with content conflict markers, there may be no
    way for a user of --remerge-diff to know that there had been a
    conflict which was resolved (and which possibly motivated other
    changes in the merge commit).
Subsequent commits will address these issues.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/diff-options.txt |  8 ++++
 builtin/log.c                  | 14 ++++++
 diff-merges.c                  | 12 +++++
 log-tree.c                     | 59 +++++++++++++++++++++++
 revision.h                     |  3 +-
 t/t4069-remerge-diff.sh        | 86 ++++++++++++++++++++++++++++++++++
 6 files changed, 181 insertions(+), 1 deletion(-)
 create mode 100755 t/t4069-remerge-diff.sh

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index c89d530d3d1..b05f1c9f1c9 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -64,6 +64,14 @@ ifdef::git-log[]
 	each of the parents. Separate log entry and diff is generated
 	for each parent.
 +
+--diff-merges=remerge:::
+--diff-merges=r:::
+--remerge-diff:::
+	With this option, two-parent merge commits are remerged to
+	create a temporary tree object -- potentially containing files
+	with conflict markers and such.  A diff is then shown between
+	that temporary tree and the actual merge commit.
++
 --diff-merges=combined:::
 --diff-merges=c:::
 -c:::
diff --git a/builtin/log.c b/builtin/log.c
index f75d87e8d7f..d053418fddd 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -35,6 +35,7 @@
 #include "repository.h"
 #include "commit-reach.h"
 #include "range-diff.h"
+#include "tmp-objdir.h"
 
 #define MAIL_DEFAULT_WRAP 72
 #define COVER_FROM_AUTO_MAX_SUBJECT_LEN 100
@@ -406,6 +407,14 @@ static int cmd_log_walk(struct rev_info *rev)
 	struct commit *commit;
 	int saved_nrl = 0;
 	int saved_dcctc = 0;
+	struct tmp_objdir *remerge_objdir = NULL;
+
+	if (rev->remerge_diff) {
+		remerge_objdir = tmp_objdir_create("remerge-diff");
+		if (!remerge_objdir)
+			die_errno(_("unable to create temporary object directory"));
+		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
+	}
 
 	if (rev->early_output)
 		setup_early_output();
@@ -449,6 +458,9 @@ static int cmd_log_walk(struct rev_info *rev)
 	rev->diffopt.no_free = 0;
 	diff_free(&rev->diffopt);
 
+	if (rev->remerge_diff)
+		tmp_objdir_destroy(remerge_objdir);
+
 	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
 	    rev->diffopt.flags.check_failed) {
 		return 02;
@@ -1943,6 +1955,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		die(_("--name-status does not make sense"));
 	if (rev.diffopt.output_format & DIFF_FORMAT_CHECKDIFF)
 		die(_("--check does not make sense"));
+	if (rev.remerge_diff)
+		die(_("--remerge-diff does not make sense"));
 
 	if (!use_patch_format &&
 		(!rev.diffopt.output_format ||
diff --git a/diff-merges.c b/diff-merges.c
index 5060ccd890b..0af4b3f9191 100644
--- a/diff-merges.c
+++ b/diff-merges.c
@@ -17,6 +17,7 @@ static void suppress(struct rev_info *revs)
 	revs->combined_all_paths = 0;
 	revs->merges_imply_patch = 0;
 	revs->merges_need_diff = 0;
+	revs->remerge_diff = 0;
 }
 
 static void set_separate(struct rev_info *revs)
@@ -45,6 +46,12 @@ static void set_dense_combined(struct rev_info *revs)
 	revs->dense_combined_merges = 1;
 }
 
+static void set_remerge_diff(struct rev_info *revs)
+{
+	suppress(revs);
+	revs->remerge_diff = 1;
+}
+
 static diff_merges_setup_func_t func_by_opt(const char *optarg)
 {
 	if (!strcmp(optarg, "off") || !strcmp(optarg, "none"))
@@ -57,6 +64,8 @@ static diff_merges_setup_func_t func_by_opt(const char *optarg)
 		return set_combined;
 	else if (!strcmp(optarg, "cc") || !strcmp(optarg, "dense-combined"))
 		return set_dense_combined;
+	else if (!strcmp(optarg, "r") || !strcmp(optarg, "remerge"))
+		return set_remerge_diff;
 	else if (!strcmp(optarg, "m") || !strcmp(optarg, "on"))
 		return set_to_default;
 	return NULL;
@@ -110,6 +119,9 @@ int diff_merges_parse_opts(struct rev_info *revs, const char **argv)
 	} else if (!strcmp(arg, "--cc")) {
 		set_dense_combined(revs);
 		revs->merges_imply_patch = 1;
+	} else if (!strcmp(arg, "--remerge-diff")) {
+		set_remerge_diff(revs);
+		revs->merges_imply_patch = 1;
 	} else if (!strcmp(arg, "--no-diff-merges")) {
 		suppress(revs);
 	} else if (!strcmp(arg, "--combined-all-paths")) {
diff --git a/log-tree.c b/log-tree.c
index 644893fd8cf..84ed864fc81 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "commit-reach.h"
 #include "config.h"
 #include "diff.h"
 #include "object-store.h"
@@ -7,6 +8,7 @@
 #include "tag.h"
 #include "graph.h"
 #include "log-tree.h"
+#include "merge-ort.h"
 #include "reflog-walk.h"
 #include "refs.h"
 #include "string-list.h"
@@ -902,6 +904,51 @@ static int do_diff_combined(struct rev_info *opt, struct commit *commit)
 	return !opt->loginfo;
 }
 
+static int do_remerge_diff(struct rev_info *opt,
+			   struct commit_list *parents,
+			   struct object_id *oid,
+			   struct commit *commit)
+{
+	struct merge_options o;
+	struct commit_list *bases;
+	struct merge_result res = {0};
+	struct pretty_print_context ctx = {0};
+	struct commit *parent1 = parents->item;
+	struct commit *parent2 = parents->next->item;
+	struct strbuf parent1_desc = STRBUF_INIT;
+	struct strbuf parent2_desc = STRBUF_INIT;
+
+	/* Setup merge options */
+	init_merge_options(&o, the_repository);
+	o.show_rename_progress = 0;
+
+	ctx.abbrev = DEFAULT_ABBREV;
+	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
+	format_commit_message(parent2, "%h (%s)", &parent2_desc, &ctx);
+	o.branch1 = parent1_desc.buf;
+	o.branch2 = parent2_desc.buf;
+
+	/* Parse the relevant commits and get the merge bases */
+	parse_commit_or_die(parent1);
+	parse_commit_or_die(parent2);
+	bases = get_merge_bases(parent1, parent2);
+
+	/* Re-merge the parents */
+	merge_incore_recursive(&o, bases, parent1, parent2, &res);
+
+	/* Show the diff */
+	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
+	log_tree_diff_flush(opt);
+
+	/* Cleanup */
+	strbuf_release(&parent1_desc);
+	strbuf_release(&parent2_desc);
+	merge_finalize(&o, &res);
+	/* TODO: clean up the temporary object directory */
+
+	return !opt->loginfo;
+}
+
 /*
  * Show the diff of a commit.
  *
@@ -936,6 +983,18 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log
 	}
 
 	if (is_merge) {
+		int octopus = (parents->next->next != NULL);
+
+		if (opt->remerge_diff) {
+			if (octopus) {
+				show_log(opt);
+				fprintf(opt->diffopt.file,
+					"diff: warning: Skipping remerge-diff "
+					"for octopus merges.\n");
+				return 1;
+			}
+			return do_remerge_diff(opt, parents, oid, commit);
+		}
 		if (opt->combine_merges)
 			return do_diff_combined(opt, commit);
 		if (opt->separate_merges) {
diff --git a/revision.h b/revision.h
index 5578bb4720a..13178e6b8f3 100644
--- a/revision.h
+++ b/revision.h
@@ -195,7 +195,8 @@ struct rev_info {
 			combine_merges:1,
 			combined_all_paths:1,
 			dense_combined_merges:1,
-			first_parent_merges:1;
+			first_parent_merges:1,
+			remerge_diff:1;
 
 	/* Format info */
 	int		show_notes;
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
new file mode 100755
index 00000000000..192dbce2bfe
--- /dev/null
+++ b/t/t4069-remerge-diff.sh
@@ -0,0 +1,86 @@
+#!/bin/sh
+
+test_description='remerge-diff handling'
+
+. ./test-lib.sh
+
+test_expect_success 'setup basic merges' '
+	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git commit -m base &&
+
+	git branch feature_a &&
+	git branch feature_b &&
+	git branch feature_c &&
+
+	git branch ab_resolution &&
+	git branch bc_resolution &&
+
+	git checkout feature_a &&
+	test_write_lines 1 2 three 4 5 6 7 eight 9 >numbers &&
+	git commit -a -m change_a &&
+
+	git checkout feature_b &&
+	test_write_lines 1 2 tres 4 5 6 7 8 9 >numbers &&
+	git commit -a -m change_b &&
+
+	git checkout feature_c &&
+	test_write_lines 1 2 3 4 5 6 7 8 9 10 >numbers &&
+	git commit -a -m change_c &&
+
+	git checkout bc_resolution &&
+	# fast forward
+	git merge feature_b &&
+	# no conflict
+	git merge feature_c &&
+
+	git checkout ab_resolution &&
+	# fast forward
+	git merge feature_a &&
+	# conflicts!
+	test_must_fail git merge feature_b &&
+	# Resolve conflict...and make another change elsewhere
+	test_write_lines 1 2 drei 4 5 6 7 acht 9 >numbers &&
+	git add numbers &&
+	git merge --continue
+'
+
+test_expect_success 'remerge-diff on a clean merge' '
+	git log -1 --oneline bc_resolution >expect &&
+	git show --oneline --remerge-diff bc_resolution >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'remerge-diff with both a resolved conflict and an unrelated change' '
+	git log -1 --oneline ab_resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/numbers b/numbers
+	index a1fb731..6875544 100644
+	--- a/numbers
+	+++ b/numbers
+	@@ -1,13 +1,9 @@
+	 1
+	 2
+	-<<<<<<< b0ed5cb (change_a)
+	-three
+	-=======
+	-tres
+	->>>>>>> 6cd3f82 (change_b)
+	+drei
+	 4
+	 5
+	 6
+	 7
+	-eight
+	+acht
+	 9
+	EOF
+	# Hashes above are sha1; rip them out so test works with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff ab_resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 113+ messages in thread

* [PATCH v2 2/8] log: clean unneeded objects during `log --remerge-diff`
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
  2021-12-25  7:59   ` [PATCH v2 1/8] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
@ 2021-12-25  7:59   ` Elijah Newren via GitGitGadget
  2021-12-25  7:59   ` [PATCH v2 3/8] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-25  7:59 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

The --remerge-diff option will need to create new blobs and trees
representing the "automatic merge" state.  If one is traversing a
long project history, one can easily get hundreds of thousands of
loose objects generated during `log --remerge-diff`.  However, none of
those loose objects are needed after we have completed our diff
operation; they can be summarily deleted.

Add a new helper function to tmp_objdir to discard all the contained
objects, and call it after each merge is handled.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/log.c | 13 +++++++------
 log-tree.c    |  8 +++++++-
 revision.h    |  3 +++
 tmp-objdir.c  |  5 +++++
 tmp-objdir.h  |  6 ++++++
 5 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index d053418fddd..e6a080df914 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -407,13 +407,12 @@ static int cmd_log_walk(struct rev_info *rev)
 	struct commit *commit;
 	int saved_nrl = 0;
 	int saved_dcctc = 0;
-	struct tmp_objdir *remerge_objdir = NULL;
 
 	if (rev->remerge_diff) {
-		remerge_objdir = tmp_objdir_create("remerge-diff");
-		if (!remerge_objdir)
+		rev->remerge_objdir = tmp_objdir_create("remerge-diff");
+		if (!rev->remerge_objdir)
 			die_errno(_("unable to create temporary object directory"));
-		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
+		tmp_objdir_replace_primary_odb(rev->remerge_objdir, 1);
 	}
 
 	if (rev->early_output)
@@ -458,8 +457,10 @@ static int cmd_log_walk(struct rev_info *rev)
 	rev->diffopt.no_free = 0;
 	diff_free(&rev->diffopt);
 
-	if (rev->remerge_diff)
-		tmp_objdir_destroy(remerge_objdir);
+	if (rev->remerge_diff) {
+		tmp_objdir_destroy(rev->remerge_objdir);
+		rev->remerge_objdir = NULL;
+	}
 
 	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
 	    rev->diffopt.flags.check_failed) {
diff --git a/log-tree.c b/log-tree.c
index 84ed864fc81..d4655b63d75 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -4,6 +4,7 @@
 #include "diff.h"
 #include "object-store.h"
 #include "repository.h"
+#include "tmp-objdir.h"
 #include "commit.h"
 #include "tag.h"
 #include "graph.h"
@@ -944,7 +945,12 @@ static int do_remerge_diff(struct rev_info *opt,
 	strbuf_release(&parent1_desc);
 	strbuf_release(&parent2_desc);
 	merge_finalize(&o, &res);
-	/* TODO: clean up the temporary object directory */
+
+	/* Clean up the contents of the temporary object directory */
+	if (opt->remerge_objdir)
+		tmp_objdir_discard_objects(opt->remerge_objdir);
+	else
+		BUG("unable to remove temporary object directory");
 
 	return !opt->loginfo;
 }
diff --git a/revision.h b/revision.h
index 13178e6b8f3..44efce3f410 100644
--- a/revision.h
+++ b/revision.h
@@ -318,6 +318,9 @@ struct rev_info {
 
 	/* misc. flags related to '--no-kept-objects' */
 	unsigned keep_pack_cache_flags;
+
+	/* Location where temporary objects for remerge-diff are written. */
+	struct tmp_objdir *remerge_objdir;
 };
 
 int ref_excluded(struct string_list *, const char *path);
diff --git a/tmp-objdir.c b/tmp-objdir.c
index 3d38eeab66b..adf6033549e 100644
--- a/tmp-objdir.c
+++ b/tmp-objdir.c
@@ -79,6 +79,11 @@ static void remove_tmp_objdir_on_signal(int signo)
 	raise(signo);
 }
 
+void tmp_objdir_discard_objects(struct tmp_objdir *t)
+{
+	remove_dir_recursively(&t->path, REMOVE_DIR_KEEP_TOPLEVEL);
+}
+
 /*
  * These env_* functions are for setting up the child environment; the
  * "replace" variant overrides the value of any existing variable with that
diff --git a/tmp-objdir.h b/tmp-objdir.h
index cda5ec76778..76efc7edee5 100644
--- a/tmp-objdir.h
+++ b/tmp-objdir.h
@@ -46,6 +46,12 @@ int tmp_objdir_migrate(struct tmp_objdir *);
  */
 int tmp_objdir_destroy(struct tmp_objdir *);
 
+/*
+ * Remove all objects from the temporary object directory, while leaving it
+ * around so more objects can be added.
+ */
+void tmp_objdir_discard_objects(struct tmp_objdir *);
+
 /*
  * Add the temporary object directory as an alternate object store in the
  * current process.
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 113+ messages in thread

* [PATCH v2 3/8] ll-merge: make callers responsible for showing warnings
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
  2021-12-25  7:59   ` [PATCH v2 1/8] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
  2021-12-25  7:59   ` [PATCH v2 2/8] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
@ 2021-12-25  7:59   ` Elijah Newren via GitGitGadget
  2021-12-28 10:56     ` Johannes Altmanninger
  2021-12-25  7:59   ` [PATCH v2 4/8] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
                     ` (7 subsequent siblings)
  10 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-25  7:59 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Since some callers may want to send warning messages to somewhere other
than stdout/stderr, stop printing "warning: Cannot merge binary files"
from ll-merge and instead modify the return status of ll_merge() to
indicate when a merge of binary files has occurred.

This commit continues printing the message as-is; future changes will
start handling the message differently in the merge-ort codepath.

Note that my methodology included first modifying ll_merge() to return
a struct, so that the compiler would catch all the callers for me and
ensure I had modified all of them.  After modifying all of them, I then
changed the struct to an enum.
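
The general pattern, where a low-level routine reports a status and leaves
it to the caller to decide whether and where to show a warning, might look
like the standalone sketch below.  Everything in it is hypothetical
(toy_merge(), merge_and_collect(), and the TOY_MERGE_* names); it only
mirrors the shape of the change and is not the ll_merge() API.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative status codes; these names are assumptions for the sketch. */
enum toy_merge_result {
	TOY_MERGE_ERROR = -1,
	TOY_MERGE_OK = 0,
	TOY_MERGE_CONFLICT,
	TOY_MERGE_BINARY_CONFLICT,
};

/* Crude stand-in for binary detection: any NUL byte means "binary". */
static int looks_binary(const char *buf, size_t len)
{
	return memchr(buf, '\0', len) != NULL;
}

/* Toy "low-level merge": prints nothing, only reports what happened. */
static enum toy_merge_result toy_merge(const char *ours, size_t ours_len,
				       const char *theirs, size_t theirs_len)
{
	if (!ours || !theirs)
		return TOY_MERGE_ERROR;
	if (looks_binary(ours, ours_len) || looks_binary(theirs, theirs_len))
		return TOY_MERGE_BINARY_CONFLICT;
	if (ours_len == theirs_len && !memcmp(ours, theirs, ours_len))
		return TOY_MERGE_OK;
	return TOY_MERGE_CONFLICT;
}

/*
 * One possible caller: rather than writing to stderr, collect the
 * warning in a caller-supplied buffer (msg_len must be nonzero) so it
 * can be shown wherever is appropriate, or dropped.
 */
static enum toy_merge_result merge_and_collect(const char *path,
					       const char *ours, size_t ours_len,
					       const char *theirs, size_t theirs_len,
					       char *msg, size_t msg_len)
{
	enum toy_merge_result status;

	status = toy_merge(ours, ours_len, theirs, theirs_len);
	msg[0] = '\0';
	if (status == TOY_MERGE_BINARY_CONFLICT)
		snprintf(msg, msg_len,
			 "warning: Cannot merge binary files: %s (%s vs. %s)",
			 path, "ours", "theirs");
	return status;
}
```

Using a dedicated enum rather than a bare int also gives each caller a
named value to test against, which is what the callers touched by this
patch do with LL_MERGE_BINARY_CONFLICT.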

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 apply.c            |  5 ++++-
 builtin/checkout.c | 12 ++++++++----
 ll-merge.c         | 40 ++++++++++++++++++++++------------------
 ll-merge.h         |  9 ++++++++-
 merge-blobs.c      |  5 ++++-
 merge-ort.c        |  5 ++++-
 merge-recursive.c  |  5 ++++-
 notes-merge.c      |  5 ++++-
 rerere.c           | 12 ++++++++----
 9 files changed, 66 insertions(+), 32 deletions(-)
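
[Editorial sketch, not part of the patch.]  The caller-side pattern this
commit introduces can be illustrated in isolation.  The enum below is
copied from the ll-merge.h hunk; status_to_result() and handle_result()
are hypothetical names made up for this illustration.  Unlike the patch,
status_to_result() also folds every negative status into LL_MERGE_ERROR,
which is an assumption of the sketch rather than something the series
does.

```c
#include <assert.h>
#include <stdio.h>

/* Copied from the ll-merge.h hunk so the sketch is self-contained. */
enum ll_merge_result {
	LL_MERGE_ERROR = -1,
	LL_MERGE_OK = 0,
	LL_MERGE_CONFLICT,
	LL_MERGE_BINARY_CONFLICT,
};

/*
 * Hypothetical helper: map a raw backend status (for example a
 * conflict count) onto the enum, similar to what ll_xdl_merge() and
 * ll_ext_merge() do with "(status > 1) ? LL_MERGE_CONFLICT : status".
 * Assumption of this sketch: any negative status becomes LL_MERGE_ERROR.
 */
static enum ll_merge_result status_to_result(int status)
{
	if (status < 0)
		return LL_MERGE_ERROR;
	return (status > 1) ? LL_MERGE_CONFLICT : (enum ll_merge_result)status;
}

/*
 * Hypothetical caller: the warning is now printed here, by the caller,
 * instead of inside the low-level merge.  Returns -1 on a hard error
 * and 0 otherwise (clean merge or any kind of conflict).
 */
static int handle_result(enum ll_merge_result res, const char *path,
			 const char *ours, const char *theirs)
{
	assert(path && ours && theirs);
	if (res == LL_MERGE_BINARY_CONFLICT)
		fprintf(stderr,
			"warning: Cannot merge binary files: %s (%s vs. %s)\n",
			path, ours, theirs);
	return res < 0 ? -1 : 0;
}
```

Because the decision to print lives in the caller, a caller that wants
the text somewhere other than stderr only has to replace the fprintf()
call; the low-level merge itself stays silent.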

diff --git a/apply.c b/apply.c
index 43a0aebf4ee..8079395755f 100644
--- a/apply.c
+++ b/apply.c
@@ -3492,7 +3492,7 @@ static int three_way_merge(struct apply_state *state,
 {
 	mmfile_t base_file, our_file, their_file;
 	mmbuffer_t result = { NULL };
-	int status;
+	enum ll_merge_result status;
 
 	/* resolve trivial cases first */
 	if (oideq(base, ours))
@@ -3509,6 +3509,9 @@ static int three_way_merge(struct apply_state *state,
 			  &their_file, "theirs",
 			  state->repo->index,
 			  NULL);
+	if (status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, "ours", "theirs");
 	free(base_file.ptr);
 	free(our_file.ptr);
 	free(their_file.ptr);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index cbf73b8c9f6..3a559d69303 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -237,6 +237,7 @@ static int checkout_merged(int pos, const struct checkout *state,
 	struct cache_entry *ce = active_cache[pos];
 	const char *path = ce->name;
 	mmfile_t ancestor, ours, theirs;
+	enum ll_merge_result merge_status;
 	int status;
 	struct object_id oid;
 	mmbuffer_t result_buf;
@@ -267,13 +268,16 @@ static int checkout_merged(int pos, const struct checkout *state,
 	memset(&ll_opts, 0, sizeof(ll_opts));
 	git_config_get_bool("merge.renormalize", &renormalize);
 	ll_opts.renormalize = renormalize;
-	status = ll_merge(&result_buf, path, &ancestor, "base",
-			  &ours, "ours", &theirs, "theirs",
-			  state->istate, &ll_opts);
+	merge_status = ll_merge(&result_buf, path, &ancestor, "base",
+				&ours, "ours", &theirs, "theirs",
+				state->istate, &ll_opts);
 	free(ancestor.ptr);
 	free(ours.ptr);
 	free(theirs.ptr);
-	if (status < 0 || !result_buf.ptr) {
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, "ours", "theirs");
+	if (merge_status < 0 || !result_buf.ptr) {
 		free(result_buf.ptr);
 		return error(_("path '%s': cannot merge"), path);
 	}
diff --git a/ll-merge.c b/ll-merge.c
index 261657578c7..669c09eed6c 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -14,7 +14,7 @@
 
 struct ll_merge_driver;
 
-typedef int (*ll_merge_fn)(const struct ll_merge_driver *,
+typedef enum ll_merge_result (*ll_merge_fn)(const struct ll_merge_driver *,
 			   mmbuffer_t *result,
 			   const char *path,
 			   mmfile_t *orig, const char *orig_name,
@@ -49,7 +49,7 @@ void reset_merge_attributes(void)
 /*
  * Built-in low-levels
  */
-static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_binary_merge(const struct ll_merge_driver *drv_unused,
 			   mmbuffer_t *result,
 			   const char *path,
 			   mmfile_t *orig, const char *orig_name,
@@ -58,6 +58,7 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 			   const struct ll_merge_options *opts,
 			   int marker_size)
 {
+	enum ll_merge_result ret;
 	mmfile_t *stolen;
 	assert(opts);
 
@@ -68,16 +69,19 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 	 */
 	if (opts->virtual_ancestor) {
 		stolen = orig;
+		ret = LL_MERGE_OK;
 	} else {
 		switch (opts->variant) {
 		default:
-			warning("Cannot merge binary files: %s (%s vs. %s)",
-				path, name1, name2);
-			/* fallthru */
+			ret = LL_MERGE_BINARY_CONFLICT;
+			stolen = src1;
+			break;
 		case XDL_MERGE_FAVOR_OURS:
+			ret = LL_MERGE_OK;
 			stolen = src1;
 			break;
 		case XDL_MERGE_FAVOR_THEIRS:
+			ret = LL_MERGE_OK;
 			stolen = src2;
 			break;
 		}
@@ -87,16 +91,10 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 	result->size = stolen->size;
 	stolen->ptr = NULL;
 
-	/*
-	 * With -Xtheirs or -Xours, we have cleanly merged;
-	 * otherwise we got a conflict.
-	 */
-	return opts->variant == XDL_MERGE_FAVOR_OURS ||
-	       opts->variant == XDL_MERGE_FAVOR_THEIRS ?
-	       0 : 1;
+	return ret;
 }
 
-static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 			mmbuffer_t *result,
 			const char *path,
 			mmfile_t *orig, const char *orig_name,
@@ -105,7 +103,9 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 			const struct ll_merge_options *opts,
 			int marker_size)
 {
+	enum ll_merge_result ret;
 	xmparam_t xmp;
+	int status;
 	assert(opts);
 
 	if (orig->size > MAX_XDIFF_SIZE ||
@@ -133,10 +133,12 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 	xmp.ancestor = orig_name;
 	xmp.file1 = name1;
 	xmp.file2 = name2;
-	return xdl_merge(orig, src1, src2, &xmp, result);
+	status = xdl_merge(orig, src1, src2, &xmp, result);
+	ret = (status > 1) ? LL_MERGE_CONFLICT : status;
+	return ret;
 }
 
-static int ll_union_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_union_merge(const struct ll_merge_driver *drv_unused,
 			  mmbuffer_t *result,
 			  const char *path,
 			  mmfile_t *orig, const char *orig_name,
@@ -178,7 +180,7 @@ static void create_temp(mmfile_t *src, char *path, size_t len)
 /*
  * User defined low-level merge driver support.
  */
-static int ll_ext_merge(const struct ll_merge_driver *fn,
+static enum ll_merge_result ll_ext_merge(const struct ll_merge_driver *fn,
 			mmbuffer_t *result,
 			const char *path,
 			mmfile_t *orig, const char *orig_name,
@@ -194,6 +196,7 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 	const char *args[] = { NULL, NULL };
 	int status, fd, i;
 	struct stat st;
+	enum ll_merge_result ret;
 	assert(opts);
 
 	sq_quote_buf(&path_sq, path);
@@ -236,7 +239,8 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 		unlink_or_warn(temp[i]);
 	strbuf_release(&cmd);
 	strbuf_release(&path_sq);
-	return status;
+	ret = (status > 1) ? LL_MERGE_CONFLICT : status;
+	return ret;
 }
 
 /*
@@ -362,7 +366,7 @@ static void normalize_file(mmfile_t *mm, const char *path, struct index_state *i
 	}
 }
 
-int ll_merge(mmbuffer_t *result_buf,
+enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
 	     const char *path,
 	     mmfile_t *ancestor, const char *ancestor_label,
 	     mmfile_t *ours, const char *our_label,
diff --git a/ll-merge.h b/ll-merge.h
index aceb1b24132..e4a20e81a3a 100644
--- a/ll-merge.h
+++ b/ll-merge.h
@@ -82,13 +82,20 @@ struct ll_merge_options {
 	long xdl_opts;
 };
 
+enum ll_merge_result {
+	LL_MERGE_ERROR = -1,
+	LL_MERGE_OK = 0,
+	LL_MERGE_CONFLICT,
+	LL_MERGE_BINARY_CONFLICT,
+};
+
 /**
  * Perform a three-way single-file merge in core.  This is a thin wrapper
  * around `xdl_merge` that takes the path and any merge backend specified in
  * `.gitattributes` or `.git/info/attributes` into account.
  * Returns 0 for a clean merge.
  */
-int ll_merge(mmbuffer_t *result_buf,
+enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
 	     const char *path,
 	     mmfile_t *ancestor, const char *ancestor_label,
 	     mmfile_t *ours, const char *our_label,
diff --git a/merge-blobs.c b/merge-blobs.c
index ee0a0e90c94..8138090f81c 100644
--- a/merge-blobs.c
+++ b/merge-blobs.c
@@ -36,7 +36,7 @@ static void *three_way_filemerge(struct index_state *istate,
 				 mmfile_t *their,
 				 unsigned long *size)
 {
-	int merge_status;
+	enum ll_merge_result merge_status;
 	mmbuffer_t res;
 
 	/*
@@ -50,6 +50,9 @@ static void *three_way_filemerge(struct index_state *istate,
 				istate, NULL);
 	if (merge_status < 0)
 		return NULL;
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, ".our", ".their");
 
 	*size = res.size;
 	return res.ptr;
diff --git a/merge-ort.c b/merge-ort.c
index 0342f104836..c24da2ba3cb 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1743,7 +1743,7 @@ static int merge_3way(struct merge_options *opt,
 	mmfile_t orig, src1, src2;
 	struct ll_merge_options ll_opts = {0};
 	char *base, *name1, *name2;
-	int merge_status;
+	enum ll_merge_result merge_status;
 
 	if (!opt->priv->attr_index.initialized)
 		initialize_attr_index(opt);
@@ -1787,6 +1787,9 @@ static int merge_3way(struct merge_options *opt,
 	merge_status = ll_merge(result_buf, path, &orig, base,
 				&src1, name1, &src2, name2,
 				&opt->priv->attr_index, &ll_opts);
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/merge-recursive.c b/merge-recursive.c
index d9457797dbb..bc73c52dd84 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1044,7 +1044,7 @@ static int merge_3way(struct merge_options *opt,
 	mmfile_t orig, src1, src2;
 	struct ll_merge_options ll_opts = {0};
 	char *base, *name1, *name2;
-	int merge_status;
+	enum ll_merge_result merge_status;
 
 	ll_opts.renormalize = opt->renormalize;
 	ll_opts.extra_marker_size = extra_marker_size;
@@ -1090,6 +1090,9 @@ static int merge_3way(struct merge_options *opt,
 	merge_status = ll_merge(result_buf, a->path, &orig, base,
 				&src1, name1, &src2, name2,
 				opt->repo->index, &ll_opts);
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			a->path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/notes-merge.c b/notes-merge.c
index b4a3a903e86..01d596920ea 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -344,7 +344,7 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
 {
 	mmbuffer_t result_buf;
 	mmfile_t base, local, remote;
-	int status;
+	enum ll_merge_result status;
 
 	read_mmblob(&base, &p->base);
 	read_mmblob(&local, &p->local);
@@ -358,6 +358,9 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
 	free(local.ptr);
 	free(remote.ptr);
 
+	if (status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			oid_to_hex(&p->obj), o->local_ref, o->remote_ref);
 	if ((status < 0) || !result_buf.ptr)
 		die("Failed to execute internal merge");
 
diff --git a/rerere.c b/rerere.c
index d83d58df4fb..b1f8961ed9e 100644
--- a/rerere.c
+++ b/rerere.c
@@ -609,19 +609,23 @@ static int try_merge(struct index_state *istate,
 		     const struct rerere_id *id, const char *path,
 		     mmfile_t *cur, mmbuffer_t *result)
 {
-	int ret;
+	enum ll_merge_result ret;
 	mmfile_t base = {NULL, 0}, other = {NULL, 0};
 
 	if (read_mmfile(&base, rerere_path(id, "preimage")) ||
-	    read_mmfile(&other, rerere_path(id, "postimage")))
-		ret = 1;
-	else
+	    read_mmfile(&other, rerere_path(id, "postimage"))) {
+		ret = LL_MERGE_CONFLICT;
+	} else {
 		/*
 		 * A three-way merge. Note that this honors user-customizable
 		 * low-level merge driver settings.
 		 */
 		ret = ll_merge(result, path, &base, NULL, cur, "", &other, "",
 			       istate, NULL);
+		if (ret == LL_MERGE_BINARY_CONFLICT)
+			warning("Cannot merge binary files: %s (%s vs. %s)",
+				path, "", "");
+	}
 
 	free(base.ptr);
 	free(other.ptr);
-- 
gitgitgadget



* [PATCH v2 4/8] merge-ort: capture and print ll-merge warnings in our preferred fashion
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
                     ` (2 preceding siblings ...)
  2021-12-25  7:59   ` [PATCH v2 3/8] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
@ 2021-12-25  7:59   ` Elijah Newren via GitGitGadget
  2021-12-25  7:59   ` [PATCH v2 5/8] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-25  7:59 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Instead of immediately printing ll-merge warnings to stderr, we save
them in our output strbuf.  Besides allowing us to move these warnings
to a special file for --remerge-diff, this has two other benefits for
regular merges done by merge-ort:

  * The deferral of messages ensures we can print all messages about
    any given path together (merge-recursive was known to sometimes
    intersperse messages about other paths, particularly when renames
    were involved).

  * The deferral of messages means we can avoid printing spurious
    conflict messages when we just end up aborting due to local user
    modifications in the way.  (In contrast to merge-recursive.c which
    prematurely checks for local modifications in the way via
    unpack_trees() and gets the check wrong both in terms of false
    positives and false negatives relative to renames, merge-ort does
    not perform the local modifications in the way check until the
    checkout() step after the full merge has been computed.)

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c                | 5 +++--
 t/t6404-recursive-merge.sh | 9 +++++++--
 t/t6406-merge-attr.sh      | 9 +++++++--
 3 files changed, 17 insertions(+), 6 deletions(-)
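
[Editorial sketch, not part of the patch.]  The benefit of deferring
messages, namely that everything said about one path comes out
together, can be shown with a tiny stand-alone store.  The types and
functions below (struct msg_store, store_msg(), render_msgs()) are
hypothetical stand-ins for merge-ort's per-path output strmap and
path_msg(); they use fixed-size buffers instead of strbufs, and
render in first-seen path order purely to keep the sketch short.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MAX_PATHS 16
#define MAX_MSG   512

/* Hypothetical stand-in for one entry of the per-path output map. */
struct path_msgs {
	const char *path;   /* not copied; the caller keeps it alive */
	char buf[MAX_MSG];  /* all messages recorded for this path */
};

struct msg_store {
	struct path_msgs entries[MAX_PATHS];
	int nr;
};

/*
 * Append one formatted message plus a newline to the buffer for `path`.
 * Returns 0 on success, -1 if the store or the path's buffer is full.
 */
static int store_msg(struct msg_store *s, const char *path,
		     const char *fmt, ...)
{
	struct path_msgs *e = NULL;
	size_t used, avail;
	va_list ap;
	int i, n;

	assert(s && path && fmt);
	for (i = 0; i < s->nr; i++)
		if (!strcmp(s->entries[i].path, path))
			e = &s->entries[i];
	if (!e) {
		if (s->nr == MAX_PATHS)
			return -1;
		e = &s->entries[s->nr++];
		e->path = path;
		e->buf[0] = '\0';
	}

	used = strlen(e->buf);
	avail = sizeof(e->buf) - used;
	va_start(ap, fmt);
	n = vsnprintf(e->buf + used, avail, fmt, ap);
	va_end(ap);
	/* Need room for the message, the newline and the NUL. */
	if (n < 0 || (size_t)n + 1 >= avail) {
		e->buf[used] = '\0'; /* drop the partial message */
		return -1;
	}
	e->buf[used + n] = '\n';
	e->buf[used + n + 1] = '\0';
	return 0;
}

/* Concatenate all stored messages into `out`, one path after another. */
static void render_msgs(const struct msg_store *s, char *out, size_t outlen)
{
	size_t pos = 0;
	int i;

	assert(s && out);
	if (!outlen)
		return;
	out[0] = '\0';
	for (i = 0; i < s->nr && pos < outlen; i++) {
		int n = snprintf(out + pos, outlen - pos, "%s",
				 s->entries[i].buf);
		if (n < 0)
			break;
		pos += (size_t)n;
	}
}
```

Recording messages for a.txt, then b.bin, then a.txt again and
rendering afterwards yields both a.txt messages first, followed by the
b.bin warning, regardless of the order in which they were generated.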

diff --git a/merge-ort.c b/merge-ort.c
index c24da2ba3cb..a18f47e23c5 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1788,8 +1788,9 @@ static int merge_3way(struct merge_options *opt,
 				&src1, name1, &src2, name2,
 				&opt->priv->attr_index, &ll_opts);
 	if (merge_status == LL_MERGE_BINARY_CONFLICT)
-		warning("Cannot merge binary files: %s (%s vs. %s)",
-			path, name1, name2);
+		path_msg(opt, path, 0,
+			 "warning: Cannot merge binary files: %s (%s vs. %s)",
+			 path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/t/t6404-recursive-merge.sh b/t/t6404-recursive-merge.sh
index eaf48e941e2..b8735c6db4d 100755
--- a/t/t6404-recursive-merge.sh
+++ b/t/t6404-recursive-merge.sh
@@ -108,8 +108,13 @@ test_expect_success 'refuse to merge binary files' '
 	printf "\0\0" >binary-file &&
 	git add binary-file &&
 	git commit -m binary2 &&
-	test_must_fail git merge F >merge.out 2>merge.err &&
-	grep "Cannot merge binary files: binary-file (HEAD vs. F)" merge.err
+	if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+	then
+		test_must_fail git merge F >merge_output
+	else
+		test_must_fail git merge F 2>merge_output
+	fi &&
+	grep "Cannot merge binary files: binary-file (HEAD vs. F)" merge_output
 '
 
 test_expect_success 'mark rename/delete as unmerged' '
diff --git a/t/t6406-merge-attr.sh b/t/t6406-merge-attr.sh
index 84946458371..c41584eb33e 100755
--- a/t/t6406-merge-attr.sh
+++ b/t/t6406-merge-attr.sh
@@ -221,8 +221,13 @@ test_expect_success 'binary files with union attribute' '
 	printf "two\0" >bin.txt &&
 	git commit -am two &&
 
-	test_must_fail git merge bin-main 2>stderr &&
-	grep -i "warning.*cannot merge.*HEAD vs. bin-main" stderr
+	if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+	then
+		test_must_fail git merge bin-main >output
+	else
+		test_must_fail git merge bin-main 2>output
+	fi &&
+	grep -i "warning.*cannot merge.*HEAD vs. bin-main" output
 '
 
 test_done
-- 
gitgitgadget



* [PATCH v2 5/8] merge-ort: mark a few more conflict messages as omittable
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
                     ` (3 preceding siblings ...)
  2021-12-25  7:59   ` [PATCH v2 4/8] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
@ 2021-12-25  7:59   ` Elijah Newren via GitGitGadget
  2021-12-25  7:59   ` [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-25  7:59 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

path_msg() has the ability to mark messages as omittable, designed for
remerge-diff where we'll instead be showing conflict messages as diff
headers for a subsequent diff.  While all these messages are very useful
when trying to create a merge initially, early use with the
--remerge-diff feature (the only user of this omittable conflict message
capability) suggests that the particular messages marked in this commit
are just noise when trying to see what changes users made to create a
merge commit.  Mark them as omittable.

Note that there were already a few messages marked as omittable in
merge-ort when doing a remerge-diff, because the development of
--remerge-diff preceded the upstreaming of merge-ort and I was trying to
ensure merge-ort could handle all the necessary requirements.  See
commit c5a6f65527 ("merge-ort: add modify/delete handling and delayed
output processing", 2020-12-03) for the initial details.  For some
examples of already-marked-as-omittable messages, see either
"Auto-merging <path>" or some of the submodule update hints.  This
commit just adds two more messages that should also be omittable.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index a18f47e23c5..998e92ec593 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -2420,7 +2420,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
 		 */
 		ci->path_conflict = 1;
 		if (pair->status == 'A')
-			path_msg(opt, new_path, 0,
+			path_msg(opt, new_path, 1,
 				 _("CONFLICT (file location): %s added in %s "
 				   "inside a directory that was renamed in %s, "
 				   "suggesting it should perhaps be moved to "
@@ -2428,7 +2428,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
 				 old_path, branch_with_new_path,
 				 branch_with_dir_rename, new_path);
 		else
-			path_msg(opt, new_path, 0,
+			path_msg(opt, new_path, 1,
 				 _("CONFLICT (file location): %s renamed to %s "
 				   "in %s, inside a directory that was renamed "
 				   "in %s, suggesting it should perhaps be "
-- 
gitgitgadget



* [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
                     ` (4 preceding siblings ...)
  2021-12-25  7:59   ` [PATCH v2 5/8] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
@ 2021-12-25  7:59   ` Elijah Newren via GitGitGadget
  2021-12-26 18:30     ` In-tree strbuf "in-place" search/replace (was: [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers) Ævar Arnfjörð Bjarmason
  2021-12-28 10:56     ` [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers Johannes Altmanninger
  2021-12-25  7:59   ` [PATCH v2 7/8] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
                     ` (4 subsequent siblings)
  10 siblings, 2 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-25  7:59 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When users run
    git show --remerge-diff $MERGE_COMMIT
or
    git log -p --remerge-diff ...
stdout is not an appropriate location to dump conflict messages, but we
do want to provide them to users.  We will include them in the diff
headers instead, but for that to work, every newline within a multiline
message must be followed by a space.  Add a new flag to signal when we
want messages modified in this fashion, and use it in path_msg() to
apply the modification.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c       | 36 ++++++++++++++++++++++++++++++++++--
 merge-recursive.c |  3 +++
 merge-recursive.h |  1 +
 3 files changed, 38 insertions(+), 2 deletions(-)
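
[Editorial sketch, not part of the patch.]  The transformation
described above (follow every embedded newline with a space) can be
written as a small stand-alone function.  indent_after_newlines() is a
hypothetical name; it uses malloc() rather than git's strbuf API, so it
only illustrates the copy loop, not the actual path_msg() code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated copy of `msg` in which
 * each '\n' is followed by a single space.  The worst case doubles the
 * length (a message consisting only of newlines), hence 2 * len + 1.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static char *indent_after_newlines(const char *msg)
{
	size_t len, i, j = 0;
	char *out;

	assert(msg);
	len = strlen(msg);
	out = malloc(2 * len + 1);
	if (!out)
		return NULL;

	for (i = 0; i < len; i++) {
		/* Copy the next character */
		out[j++] = msg[i];

		/* If we copied a newline, add a space right after it */
		if (msg[i] == '\n')
			out[j++] = ' ';
	}
	out[j] = '\0';
	return out;
}
```

Using one output index that advances on every write keeps the source
and destination positions from drifting apart, which is the property
the in-place strbuf loop has to maintain by hand.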

diff --git a/merge-ort.c b/merge-ort.c
index 998e92ec593..9142d56e0ad 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -634,17 +634,46 @@ static void path_msg(struct merge_options *opt,
 		     const char *fmt, ...)
 {
 	va_list ap;
-	struct strbuf *sb = strmap_get(&opt->priv->output, path);
+	struct strbuf *sb, *dest;
+	struct strbuf tmp = STRBUF_INIT;
+
+	if (opt->record_conflict_msgs_as_headers && omittable_hint)
+		return; /* Do not record mere hints in tree */
+	sb = strmap_get(&opt->priv->output, path);
 	if (!sb) {
 		sb = xmalloc(sizeof(*sb));
 		strbuf_init(sb, 0);
 		strmap_put(&opt->priv->output, path, sb);
 	}
 
+	dest = (opt->record_conflict_msgs_as_headers ? &tmp : sb);
+
 	va_start(ap, fmt);
-	strbuf_vaddf(sb, fmt, ap);
+	strbuf_vaddf(dest, fmt, ap);
 	va_end(ap);
 
+	if (opt->record_conflict_msgs_as_headers) {
+		int i_sb = 0, i_tmp = 0;
+
+		/* Copy tmp to sb, adding spaces after newlines */
+		strbuf_grow(sb, 2*tmp.len); /* more than sufficient */
+		for (; i_tmp < tmp.len; i_tmp++, i_sb++) {
+			/* Copy next character from tmp to sb */
+			sb->buf[sb->len + i_sb] = tmp.buf[i_tmp];
+
+			/* If we copied a newline, add a space */
+			if (tmp.buf[i_tmp] == '\n')
+				sb->buf[sb->len + ++i_sb] = ' ';
+		}
+		/* Update length and ensure it's NUL-terminated */
+		sb->len += i_sb;
+		sb->buf[sb->len] = '\0';
+
+		/* Clean up tmp */
+		strbuf_release(&tmp);
+	}
+
+	/* Add final newline character to sb */
 	strbuf_addch(sb, '\n');
 }
 
@@ -4246,6 +4275,9 @@ void merge_switch_to_result(struct merge_options *opt,
 		struct string_list olist = STRING_LIST_INIT_NODUP;
 		int i;
 
+		if (opt->record_conflict_msgs_as_headers)
+			BUG("Either display conflict messages or record them as headers, not both");
+
 		trace2_region_enter("merge", "display messages", opt->repo);
 
 		/* Hack to pre-allocate olist to the desired size */
diff --git a/merge-recursive.c b/merge-recursive.c
index bc73c52dd84..c9ba7e904a6 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3714,6 +3714,9 @@ static int merge_start(struct merge_options *opt, struct tree *head)
 
 	assert(opt->priv == NULL);
 
+	/* Not supported; option specific to merge-ort */
+	assert(!opt->record_conflict_msgs_as_headers);
+
 	/* Sanity check on repo state; index must match head */
 	if (repo_index_has_changes(opt->repo, head, &sb)) {
 		err(opt, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
diff --git a/merge-recursive.h b/merge-recursive.h
index 0795a1d3ec1..ebfdb7f994e 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -46,6 +46,7 @@ struct merge_options {
 	/* miscellaneous control options */
 	const char *subtree_shift;
 	unsigned renormalize : 1;
+	unsigned record_conflict_msgs_as_headers : 1;
 
 	/* internal fields used by the implementation */
 	struct merge_options_internal *priv;
-- 
gitgitgadget



* [PATCH v2 7/8] diff: add ability to insert additional headers for paths
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
                     ` (5 preceding siblings ...)
  2021-12-25  7:59   ` [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
@ 2021-12-25  7:59   ` Elijah Newren via GitGitGadget
  2021-12-28 10:57     ` Johannes Altmanninger
  2021-12-25  7:59   ` [PATCH v2 8/8] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
                     ` (3 subsequent siblings)
  10 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-25  7:59 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When additional headers are provided, we need to
  * add diff_filepairs to diff_queued_diff for each path in the
    additional headers map, unless that path is part of another
    diff_filepair already found in diff_queued_diff
  * format the headers (colorization, line_prefix for --graph)
  * make sure the various codepaths that attempt to return early
    if there are "no changes" take into account the headers that
    need to be shown.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 diff.c     | 116 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 diff.h     |   3 +-
 log-tree.c |   2 +-
 3 files changed, 115 insertions(+), 6 deletions(-)
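
[Editorial sketch, not part of the patch.]  The header formatting step
(wrap each line of the extra headers in the line prefix and the
meta/reset color sequences) can be tried out without git's strbuf.
format_headers() is a hypothetical stand-in for add_formatted_headers();
it writes into a caller-supplied buffer and uses strchr() instead of
strchrnul(), which is a GNU extension outside of git's compat layer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for add_formatted_headers(): emit each line of
 * `headers` as "<line_prefix><meta><line><reset>\n" into `out`.
 * Returns the number of lines written, or -1 if `out` is too small.
 */
static int format_headers(char *out, size_t outlen, const char *headers,
			  const char *line_prefix, const char *meta,
			  const char *reset)
{
	const char *next = headers;
	size_t pos = 0;
	int lines = 0;

	assert(out && headers && line_prefix && meta && reset);
	if (!outlen)
		return -1;
	out[0] = '\0';

	while (*next) {
		const char *newline = strchr(next, '\n');
		size_t linelen = newline ? (size_t)(newline - next)
					 : strlen(next);
		int n = snprintf(out + pos, outlen - pos, "%s%s%.*s%s\n",
				 line_prefix, meta, (int)linelen, next, reset);

		if (n < 0 || (size_t)n >= outlen - pos)
			return -1; /* output would have been truncated */
		pos += (size_t)n;
		lines++;

		/* Step over the line and its terminating newline, if any */
		next += linelen;
		if (*next == '\n')
			next++;
	}
	return lines;
}
```

With "| " as the line prefix (as --graph might supply), a two-line
message whose second line already starts with a space comes out as two
prefixed, colorized header lines, with the continuation line keeping
its leading space.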

diff --git a/diff.c b/diff.c
index 861282db1c3..aaa6a19f158 100644
--- a/diff.c
+++ b/diff.c
@@ -27,6 +27,7 @@
 #include "help.h"
 #include "promisor-remote.h"
 #include "dir.h"
+#include "strmap.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
@@ -3406,6 +3407,31 @@ struct userdiff_driver *get_textconv(struct repository *r,
 	return userdiff_get_textconv(r, one->driver);
 }
 
+static struct strbuf *additional_headers(struct diff_options *o,
+					 const char *path)
+{
+	if (!o->additional_path_headers)
+		return NULL;
+	return strmap_get(o->additional_path_headers, path);
+}
+
+static void add_formatted_headers(struct strbuf *msg,
+				  struct strbuf *more_headers,
+				  const char *line_prefix,
+				  const char *meta,
+				  const char *reset)
+{
+	char *next, *newline;
+
+	for (next = more_headers->buf; *next; next = newline) {
+		newline = strchrnul(next, '\n');
+		strbuf_addf(msg, "%s%s%.*s%s\n", line_prefix, meta,
+			    (int)(newline - next), next, reset);
+		if (*newline)
+			newline++;
+	}
+}
+
 static void builtin_diff(const char *name_a,
 			 const char *name_b,
 			 struct diff_filespec *one,
@@ -3464,6 +3490,17 @@ static void builtin_diff(const char *name_a,
 	b_two = quote_two(b_prefix, name_b + (*name_b == '/'));
 	lbl[0] = DIFF_FILE_VALID(one) ? a_one : "/dev/null";
 	lbl[1] = DIFF_FILE_VALID(two) ? b_two : "/dev/null";
+	if (!DIFF_FILE_VALID(one) && !DIFF_FILE_VALID(two)) {
+		/*
+		 * We should only reach this point for pairs from
+		 * create_filepairs_for_header_only_notifications().  For
+		 * these, we should avoid the "/dev/null" special casing
+		 * above, meaning we avoid showing such pairs as either
+		 * "new file" or "deleted file" below.
+		 */
+		lbl[0] = a_one;
+		lbl[1] = b_two;
+	}
 	strbuf_addf(&header, "%s%sdiff --git %s %s%s\n", line_prefix, meta, a_one, b_two, reset);
 	if (lbl[0][0] == '/') {
 		/* /dev/null */
@@ -4328,6 +4365,7 @@ static void fill_metainfo(struct strbuf *msg,
 	const char *set = diff_get_color(use_color, DIFF_METAINFO);
 	const char *reset = diff_get_color(use_color, DIFF_RESET);
 	const char *line_prefix = diff_line_prefix(o);
+	struct strbuf *more_headers = NULL;
 
 	*must_show_header = 1;
 	strbuf_init(msg, PATH_MAX * 2 + 300);
@@ -4364,6 +4402,11 @@ static void fill_metainfo(struct strbuf *msg,
 	default:
 		*must_show_header = 0;
 	}
+	if ((more_headers = additional_headers(o, name))) {
+		add_formatted_headers(msg, more_headers,
+				      line_prefix, set, reset);
+		*must_show_header = 1;
+	}
 	if (one && two && !oideq(&one->oid, &two->oid)) {
 		const unsigned hexsz = the_hash_algo->hexsz;
 		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
@@ -5852,12 +5895,22 @@ int diff_unmodified_pair(struct diff_filepair *p)
 
 static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o)
 {
-	if (diff_unmodified_pair(p))
+	/*
+	 * Check if we can return early without showing a diff.  Note that
+	 * diff_filepair only stores {oid, path, mode, is_valid}
+	 * information for each path, and thus diff_unmodified_pair() only
+	 * considers those bits of info.  However, we do not want pairs
+	 * created by create_filepairs_for_header_only_notifications() to
+	 * be ignored, so return early if both p is unmodified AND
+	 * p->one->path is not in additional headers.
+	 */
+	if (diff_unmodified_pair(p) && !additional_headers(o, p->one->path))
 		return;
 
+	/* Actually, we can also return early to avoid showing tree diffs */
 	if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) ||
 	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
-		return; /* no tree diffs in patch format */
+		return;
 
 	run_diff(p, o);
 }
@@ -5888,10 +5941,14 @@ static void diff_flush_checkdiff(struct diff_filepair *p,
 	run_checkdiff(p, o);
 }
 
-int diff_queue_is_empty(void)
+int diff_queue_is_empty(struct diff_options *o)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	int i;
+
+	if (o->additional_path_headers &&
+	    !strmap_empty(o->additional_path_headers))
+		return 0;
 	for (i = 0; i < q->nr; i++)
 		if (!diff_unmodified_pair(q->queue[i]))
 			return 0;
@@ -6325,6 +6382,54 @@ void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc)
 		warning(_(rename_limit_advice), varname, needed);
 }
 
+static void create_filepairs_for_header_only_notifications(struct diff_options *o)
+{
+	struct strset present;
+	struct diff_queue_struct *q = &diff_queued_diff;
+	struct hashmap_iter iter;
+	struct strmap_entry *e;
+	int i;
+
+	strset_init_with_options(&present, /*pool*/ NULL, /*strdup*/ 0);
+
+	/*
+	 * Find out which paths exist in diff_queued_diff, preferring
+	 * one->path for any pair that has multiple paths.
+	 */
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p = q->queue[i];
+		char *path = p->one->path ? p->one->path : p->two->path;
+
+		if (strmap_contains(o->additional_path_headers, path))
+			strset_add(&present, path);
+	}
+
+	/*
+	 * Loop over paths in additional_path_headers; for each NOT already
+	 * in diff_queued_diff, create a synthetic filepair and insert that
+	 * into diff_queued_diff.
+	 */
+	strmap_for_each_entry(o->additional_path_headers, &iter, e) {
+		if (!strset_contains(&present, e->key)) {
+			struct diff_filespec *one, *two;
+			struct diff_filepair *p;
+
+			one = alloc_filespec(e->key);
+			two = alloc_filespec(e->key);
+			fill_filespec(one, null_oid(), 0, 0);
+			fill_filespec(two, null_oid(), 0, 0);
+			p = diff_queue(q, one, two);
+			p->status = DIFF_STATUS_MODIFIED;
+		}
+	}
+
+	/* Re-sort the filepairs */
+	diffcore_fix_diff_index();
+
+	/* Cleanup */
+	strset_clear(&present);
+}
+
 static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 {
 	int i;
@@ -6337,6 +6442,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	if (o->color_moved)
 		o->emitted_symbols = &esm;
 
+	if (o->additional_path_headers)
+		create_filepairs_for_header_only_notifications(o);
+
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 		if (check_pair_status(p))
@@ -6413,7 +6521,7 @@ void diff_flush(struct diff_options *options)
 	 * Order: raw, stat, summary, patch
 	 * or:    name/name-status/checkdiff (other bits clear)
 	 */
-	if (!q->nr)
+	if (!q->nr && !options->additional_path_headers)
 		goto free_queue;
 
 	if (output_format & (DIFF_FORMAT_RAW |
diff --git a/diff.h b/diff.h
index 8ba85c5e605..06a0a67afda 100644
--- a/diff.h
+++ b/diff.h
@@ -395,6 +395,7 @@ struct diff_options {
 
 	struct repository *repo;
 	struct option *parseopts;
+	struct strmap *additional_path_headers;
 
 	int no_free;
 };
@@ -593,7 +594,7 @@ void diffcore_fix_diff_index(void);
 "                show all files diff when -S is used and hit is found.\n" \
 "  -a  --text    treat all files as text.\n"
 
-int diff_queue_is_empty(void);
+int diff_queue_is_empty(struct diff_options*);
 void diff_flush(struct diff_options*);
 void diff_free(struct diff_options*);
 void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc);
diff --git a/log-tree.c b/log-tree.c
index d4655b63d75..33c28f537a6 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -850,7 +850,7 @@ int log_tree_diff_flush(struct rev_info *opt)
 	opt->shown_dashes = 0;
 	diffcore_std(&opt->diffopt);
 
-	if (diff_queue_is_empty()) {
+	if (diff_queue_is_empty(&opt->diffopt)) {
 		int saved_fmt = opt->diffopt.output_format;
 		opt->diffopt.output_format = DIFF_FORMAT_NO_OUTPUT;
 		diff_flush(&opt->diffopt);
-- 
gitgitgadget



* [PATCH v2 8/8] show, log: include conflict/warning messages in --remerge-diff headers
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
                     ` (6 preceding siblings ...)
  2021-12-25  7:59   ` [PATCH v2 7/8] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
@ 2021-12-25  7:59   ` Elijah Newren via GitGitGadget
  2021-12-28 10:57     ` Johannes Altmanninger
  2021-12-26 21:52   ` [PATCH v2 0/8] Add a new --remerge-diff capability to show & log Ævar Arnfjörð Bjarmason
                     ` (2 subsequent siblings)
  10 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-25  7:59 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Conflicts such as modify/delete, rename/rename, or file/directory are
not representable via content conflict markers, and the normal output
messages notifying users about these were dropped with --remerge-diff.
While we don't want these messages randomly shown before the commit
and diff headers, we do want them to still be shown; include them as
part of the diff headers instead.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 log-tree.c              |  3 ++
 merge-ort.c             |  1 +
 merge-ort.h             | 10 +++++
 t/t4069-remerge-diff.sh | 86 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 100 insertions(+)

diff --git a/log-tree.c b/log-tree.c
index 33c28f537a6..97fbb756d21 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -922,6 +922,7 @@ static int do_remerge_diff(struct rev_info *opt,
 	/* Setup merge options */
 	init_merge_options(&o, the_repository);
 	o.show_rename_progress = 0;
+	o.record_conflict_msgs_as_headers = 1;
 
 	ctx.abbrev = DEFAULT_ABBREV;
 	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
@@ -938,10 +939,12 @@ static int do_remerge_diff(struct rev_info *opt,
 	merge_incore_recursive(&o, bases, parent1, parent2, &res);
 
 	/* Show the diff */
+	opt->diffopt.additional_path_headers = res.path_messages;
 	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
 	log_tree_diff_flush(opt);
 
 	/* Cleanup */
+	opt->diffopt.additional_path_headers = NULL;
 	strbuf_release(&parent1_desc);
 	strbuf_release(&parent2_desc);
 	merge_finalize(&o, &res);
diff --git a/merge-ort.c b/merge-ort.c
index 9142d56e0ad..07e53083cbd 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4579,6 +4579,7 @@ redo:
 	trace2_region_leave("merge", "process_entries", opt->repo);
 
 	/* Set return values */
+	result->path_messages = &opt->priv->output;
 	result->tree = parse_tree_indirect(&working_tree_oid);
 	/* existence of conflicted entries implies unclean */
 	result->clean &= strmap_empty(&opt->priv->conflicted);
diff --git a/merge-ort.h b/merge-ort.h
index c011864ffeb..fe599b87868 100644
--- a/merge-ort.h
+++ b/merge-ort.h
@@ -5,6 +5,7 @@
 
 struct commit;
 struct tree;
+struct strmap;
 
 struct merge_result {
 	/*
@@ -23,6 +24,15 @@ struct merge_result {
 	 */
 	struct tree *tree;
 
+	/*
+	 * Special messages and conflict notices for various paths
+	 *
+	 * This is a map of pathnames to strbufs.  It contains various
+	 * warning/conflict/notice messages (possibly multiple per path)
+	 * that callers may want to use.
+	 */
+	struct strmap *path_messages;
+
 	/*
 	 * Additional metadata used by merge_switch_to_result() or future calls
 	 * to merge_incore_*().  Includes data needed to update the index (if
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index 192dbce2bfe..a040d3bcd91 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -4,6 +4,15 @@ test_description='remerge-diff handling'
 
 . ./test-lib.sh
 
+# --remerge-diff uses ort under the hood regardless of setting.  However,
+# we set up a file/directory conflict beforehand, and the different backends
+# handle the conflict differently, which would require separate code paths
+# to resolve.  There's not much point in making the code uglier to do that,
+# though, when the real thing we are testing (--remerge-diff) will hardcode
+# calls directly into the merge-ort API anyway.  So just force the use of
+# ort on the setup too.
+GIT_TEST_MERGE_ALGORITHM=ort
+
 test_expect_success 'setup basic merges' '
 	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
 	git add numbers &&
@@ -55,6 +64,7 @@ test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
 	git log -1 --oneline ab_resolution >tmp &&
 	cat <<-EOF >>tmp &&
 	diff --git a/numbers b/numbers
+	CONFLICT (content): Merge conflict in numbers
 	index a1fb731..6875544 100644
 	--- a/numbers
 	+++ b/numbers
@@ -83,4 +93,80 @@ test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
 	test_cmp expect actual
 '
 
+test_expect_success 'setup non-content conflicts' '
+	git switch --orphan base &&
+
+	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
+	test_write_lines a b c d e f g h i >letters &&
+	test_write_lines in the way >content &&
+	git add numbers letters content &&
+	git commit -m base &&
+
+	git branch side1 &&
+	git branch side2 &&
+
+	git checkout side1 &&
+	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
+	git mv letters letters_side1 &&
+	git mv content file_or_directory &&
+	git add numbers &&
+	git commit -m side1 &&
+
+	git checkout side2 &&
+	git rm numbers &&
+	git mv letters letters_side2 &&
+	mkdir file_or_directory &&
+	echo hello >file_or_directory/world &&
+	git add file_or_directory/world &&
+	git commit -m side2 &&
+
+	git checkout -b resolution side1 &&
+	test_must_fail git merge side2 &&
+	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git add letters_side1 &&
+	git rm letters &&
+	git rm letters_side2 &&
+	git add file_or_directory~HEAD &&
+	git mv file_or_directory~HEAD wanted_content &&
+	git commit -m resolved
+'
+
+test_expect_success 'remerge-diff with non-content conflicts' '
+	git log -1 --oneline resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/file_or_directory~HASH (side1) b/wanted_content
+	similarity index 100%
+	rename from file_or_directory~HASH (side1)
+	rename to wanted_content
+	CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
+	diff --git a/letters b/letters
+	CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
+	diff --git a/letters_side2 b/letters_side2
+	deleted file mode 100644
+	index b236ae5..0000000
+	--- a/letters_side2
+	+++ /dev/null
+	@@ -1,9 +0,0 @@
+	-a
+	-b
+	-c
+	-d
+	-e
+	-f
+	-g
+	-h
+	-i
+	diff --git a/numbers b/numbers
+	CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
+	EOF
+	# We still have some sha1 hashes above; rip them out so test works
+	# with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
gitgitgadget


* In-tree strbuf "in-place" search/replace (was: [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers)
  2021-12-25  7:59   ` [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
@ 2021-12-26 18:30     ` Ævar Arnfjörð Bjarmason
  2021-12-28 10:56     ` [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers Johannes Altmanninger
  1 sibling, 0 replies; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-26 18:30 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Neeraj Singh, Elijah Newren


On Sat, Dec 25 2021, Elijah Newren via GitGitGadget wrote:

> @@ -634,17 +634,46 @@ static void path_msg(struct merge_options *opt,
>  		     const char *fmt, ...)
>  {
>  	va_list ap;
> -	struct strbuf *sb = strmap_get(&opt->priv->output, path);
> +	struct strbuf *sb, *dest;
> +	struct strbuf tmp = STRBUF_INIT;
> +
> +	if (opt->record_conflict_msgs_as_headers && omittable_hint)
> +		return; /* Do not record mere hints in tree */
> +	sb = strmap_get(&opt->priv->output, path);
>  	if (!sb) {
>  		sb = xmalloc(sizeof(*sb));
>  		strbuf_init(sb, 0);
>  		strmap_put(&opt->priv->output, path, sb);
>  	}
>  
> +	dest = (opt->record_conflict_msgs_as_headers ? &tmp : sb);
> +
>  	va_start(ap, fmt);
> -	strbuf_vaddf(sb, fmt, ap);
> +	strbuf_vaddf(dest, fmt, ap);
>  	va_end(ap);
>  
> +	if (opt->record_conflict_msgs_as_headers) {
> +		int i_sb = 0, i_tmp = 0;
> +
> +		/* Copy tmp to sb, adding spaces after newlines */
> +		strbuf_grow(sb, 2*tmp.len); /* more than sufficient */
> +		for (; i_tmp < tmp.len; i_tmp++, i_sb++) {
> +			/* Copy next character from tmp to sb */
> +			sb->buf[sb->len + i_sb] = tmp.buf[i_tmp];
> +
> +			/* If we copied a newline, add a space */
> +			if (tmp.buf[i_tmp] == '\n')
> +				sb->buf[sb->len + ++i_sb] = ' ';
> +		}
> +		/* Update length and ensure it's NUL-terminated */
> +		sb->len += i_sb;
> +		sb->buf[sb->len] = '\0';
> +
> +		/* Clean up tmp */
> +		strbuf_release(&tmp);
> +	}
> +
> +	/* Add final newline character to sb */
>  	strbuf_addch(sb, '\n');
>  }
>  

I'm not saying this is wrong or needs to change. Just a reader's note
that this sent me on an interesting journey of looking at various
in-tree callers of strbufs that want to do the equivalent of
s/$from/$to/ on a strbuf, with and without the equivalent of /g.

I figured I'd change the $subject since this is more of a general
musing...

In trailer.c we've got strbuf_replace(), which looks like it could be
made to be general enough to serve most callers if it did a memmem()
instead of a strstr(), and knew to take an "all" flag to implement a /g.

We then have e.g. lf_to_crlf() in imap-send.c, which uses a newly
alloc'd buffer followed by a strbuf_attach(), which is a common pattern.

Then strbuf_reencode() in strbuf.c basically solves this problem by
calling reencode_string_len(). Both it and the underlying function are
*almost* general enough to take some "from/to" string/length
pair, i.e. to not be bound to "reencoding" only with iconv().

Then there's strbuf_add_percentencode() and strbuf_add_urlencode() whose
API users might be happy with in-place replacing, but do a
read-and-copy-maybe-expand.

It might be an interesting follow-up project for someone to come up with
a generic in-place search-replace function with a signature like:

	int strbuf_replace(struct strbuf *sb, const char *from,
			   size_t from_len, const char *to,
			   size_t to_len, int max);

To do e.g. in this case:

	if (opt->record_conflict_msgs_as_headers)
		strbuf_replace(sb, "\n", strlen("\n"), "\n ", strlen("\n "), -1);

The various in-tree implementations do some variant of over-mallocing to
save work in the loop (as is being done here), copying where a small
realloc/memmove might do, scanning the string to figure out how much to
malloc, then copying in a second pass etc.
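
To make the proposed signature concrete, here is a minimal, self-contained
sketch of such a function. It does not use git's real strbuf API: `struct
sbuf`, `sbuf_init_from()` and `sbuf_replace()` are hypothetical stand-ins
invented for illustration, and the search is a plain memcmp() scan rather
than memmem(). It grows the buffer only when a replacement is longer than
the match, and shifts the tail with memmove().

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for git's struct strbuf (not the real API):
 * buf is always NUL-terminated, len excludes the NUL.
 */
struct sbuf {
	char *buf;
	size_t len;
	size_t alloc;
};

/* Initialize from a NUL-terminated string; returns 0 on success, -1 on OOM. */
static int sbuf_init_from(struct sbuf *sb, const char *s)
{
	sb->len = strlen(s);
	sb->alloc = sb->len + 1;
	sb->buf = malloc(sb->alloc);
	if (!sb->buf)
		return -1;
	memcpy(sb->buf, s, sb->len + 1);
	return 0;
}

/*
 * Replace up to `max` occurrences of from[0..from_len) with to[0..to_len),
 * in place; max < 0 means "all" (the /g case).  Returns the number of
 * replacements, or -1 on an empty `from` or on allocation failure.
 */
static int sbuf_replace(struct sbuf *sb, const char *from, size_t from_len,
			const char *to, size_t to_len, int max)
{
	size_t pos = 0;
	int count = 0;

	if (!from_len)
		return -1;

	while (pos + from_len <= sb->len && (max < 0 || count < max)) {
		if (memcmp(sb->buf + pos, from, from_len)) {
			pos++;
			continue;
		}
		if (to_len > from_len) {
			/* Grow only when the replacement is longer than the match. */
			size_t need = sb->len + (to_len - from_len) + 1;
			if (need > sb->alloc) {
				size_t new_alloc = need > 2 * sb->alloc ?
						   need : 2 * sb->alloc;
				char *p = realloc(sb->buf, new_alloc);
				if (!p)
					return -1;
				sb->buf = p;
				sb->alloc = new_alloc;
			}
		}
		/* Shift the tail (including the NUL), then copy the replacement in. */
		memmove(sb->buf + pos + to_len, sb->buf + pos + from_len,
			sb->len - pos - from_len + 1);
		memcpy(sb->buf + pos, to, to_len);
		sb->len = sb->len + to_len - from_len;
		pos += to_len; /* never rescan text we just inserted */
		count++;
	}
	return count;
}
```

With this sketch, the newline-indenting case above would read
sbuf_replace(sb, "\n", 1, "\n ", 2, -1). Each replacement costs a memmove()
of the tail, so an input with many matches would probably be better served
by a count-then-copy variant; this version only aims to show the interface.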


* Re: [PATCH v2 0/8] Add a new --remerge-diff capability to show & log
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
                     ` (7 preceding siblings ...)
  2021-12-25  7:59   ` [PATCH v2 8/8] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
@ 2021-12-26 21:52   ` Ævar Arnfjörð Bjarmason
  2021-12-27 21:11     ` Elijah Newren
  2021-12-28 10:55   ` Johannes Altmanninger
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
  10 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-26 21:52 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Neeraj Singh, Elijah Newren


On Sat, Dec 25 2021, Elijah Newren via GitGitGadget wrote:

> === FURTHER BACKGROUND (original cover letter material) ==
>
> Here are some example commits you can try this out on (with git show
> --remerge-diff $COMMIT):
>
>  * git.git conflicted merge: 07601b5b36
>  * git.git non-conflicted change: bf04590ecd
>  * linux.git conflicted merge: eab3540562fb
>  * linux.git non-conflicted change: 223cea6a4f05
>
> Many more can be found by just running git log --merges --remerge-diff in
> your repository of choice and searching for diffs (most merges tend to be
> clean and unmodified and thus produce no diff but a search of '^diff' in the
> log output tends to find the examples nicely).
>
> Some basic high level details about this new option:
>
>  * This option is most naturally compared to --cc, though the output seems
>    to be much more understandable to most users than --cc output.
>  * Since merges are often clean and unmodified, this new option results in
>    an empty diff for most merges.
>  * This new option shows things like the removal of conflict markers, which
>    hunks users picked from the various conflicted sides to keep or remove,
>    and shows changes made outside of conflict markers (which might reflect
>    changes needed to resolve semantic conflicts or cleanups of e.g.
>    compilation warnings or other additional changes an integrator felt
>    belonged in the merged result).
>  * This new option does not (currently) work for octopus merges, since
>    merge-ort is specific to two-parent merges[1].
>  * This option will not work on a read-only or full filesystem[2].
>  * We discussed this capability at Git Merge 2020, and one of the
>    suggestions was doing a periodic git gc --auto during the operation (due
>    to potential new blobs and trees created during the operation). I found a
>    way to avoid that; see [2].
>  * This option is faster than you'd probably expect; it handles 33.5 merge
>    commits per second in linux.git on my computer; see below.
>
> In regards to the performance point above, the timing for running the
> following command:
>
> time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l

I've been trying to come up with some other useful recipes for this new
option (which is already very useful, thanks!).

Some of these (if correct) are suggestions for incorporating into the
(now rather sparse) documentation. I.e. walking users through how to use
this, and how (if at all) it combines with other options.

I wanted to find all merges between "master".."seen" for which Junio has
had to resolve a conflict. A naïve version is:

    $ git log --oneline --remerge-diff -p --min-parents=2 origin/master..origin/seen|grep ^diff -B1 | grep Merge
    [...]

But I found that this new option nicely integrates with --diff-filter,
i.e. we'll end up showing a diff, and the diff machinery allows you
to filter on it.

It seems to me like all the diffs you show fall under "M", so for
master..seen (2ae0a9cb829..61055c2920d) this is equivalent (and the
output is the same as the above):

    $ git -P log --oneline --remerge-diff --no-patch --min-parents=2 --diff-filter=M origin/master..origin/seen 
    95daa54b1c3 Merge branch 'hn/reftable-fixes' into seen
    26c4c09dd34 Merge branch 'gc/fetch-negotiate-only-early-return' into seen
    e3dc8d073f6 Merge branch 'gc/branch-recurse-submodules' into seen
    aeada898196 Merge branch 'js/branch-track-inherit' into seen
    4dd30e0da45 Merge branch 'jh/builtin-fsmonitor-part2' into seen
    337743b17d0 Merge branch 'ab/config-based-hooks-2' into seen
    261672178c0 Merge branch 'pw/fix-some-issues-in-reset-head' into seen
    1296d35b041 Merge branch 'ms/customizable-ident-expansion' into seen
    7a3d7d05126 Merge branch 'ja/i18n-similar-messages' into seen
    eda714bb8bc Merge branch 'tb/midx-bitmap-corruption-fix' into seen
    ba02295e3f8 Merge branch 'jh/p4-human-unit-numbers' into jch
    751773fc38b Merge branch 'es/test-chain-lint' into jch
    ec17879f495 Merge branch 'tb/cruft-packs' into tb/midx-bitmap-corruption-fix

However for "origin/master..origin/next" (next = 510f9eba9a2 currently)
we'll oddly show this with "-p":
    
    9af51fd1d0d Sync with 'master'
    diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
    CONFLICT (content): Merge conflict in t/lib-gpg.sh
    d6f56f3248e Merge branch 'es/test-chain-lint' into next
    diff --git a/t/t4126-apply-empty.sh b/t/t4126-apply-empty.sh
    CONFLICT (content): Merge conflict in t/t4126-apply-empty.sh
    index 996c93329c6..33860d38290 100755
    --- a/t/t4126-apply-empty.sh
    +++ b/t/t4126-apply-empty.sh
    [...]

The "oddly" applies only to that "9af51fd1d0d Sync with 'master'", not
the second d6f56f3248e, which shows the sort of conflict I'd expect. The
two-line "diff" of:

    diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
    CONFLICT (content): Merge conflict in t/lib-gpg.sh

shows up with -p --remerge-diff, not a mere -p. I also tried the other
--diff-merges=* options; that behavior is new in
--diff-merges=remerge. Is this a bug?

My local build also has a --pickaxe-patch option. It's something I
submitted on-list before[1] and have been meaning to re-roll.

I'm discussing it here because it skips the stripping of the "+ " and "-
" prefixes under -G<regex> and allows you to search through the -U<n>
context. With that I'm able to do:

    git log --oneline --remerge-diff -p --min-parents=2 --pickaxe-patch -G'^\+' --diff-filter=M origin/master..origin/seen

I.e. on top of the above filter, only show those diffs that have
additions. AFAICT the conflicting diffs where the committer of the merge
picked one side or the other will only have "-" lines.

So those diffs that have additions look to be those where the person
doing the merge needed to combine the two.

Well, usually. E.g. 26c4c09dd34 (Merge branch
'gc/fetch-negotiate-only-early-return' into seen, 2021-12-25) in that
range shows that isn't strictly true. Most such deletion-only diffs are
less interesting because they simply pick one side or the other of the
conflict, but that one combines the two:
    
    -<<<<<<< d3419aac9f4 (Merge branch 'pw/add-p-hunk-split-fix' into seen)
                            warning(_("protocol does not support --negotiate-only, exiting"));
    -                       return 1;
    -=======
    -                       warning(_("Protocol does not support --negotiate-only, exiting."));
                            result = 1;
                            goto cleanup;
    ->>>>>>> 495e8601f28 (builtin/fetch: die on --negotiate-only and --recurse-submodules)

Which I guess is partially commentary and partially a request (either
for this series, or some follow-up) for something like a
--remerge-diff-filter option. I.e. it would be very useful to be able to
filter on some combination of:

 * Which side(s) of the conflict(s) were picked, or a combination?
 * Is there "new work" in the diff to resolve the conflict?
   AFAICT this will always mean we'll have "+ " lines.

Or maybe that's not useful at all, and just -G<rx> (maybe combined with
my --pickaxe-patch) will cover it?

1. https://lore.kernel.org/git/20190424152215.16251-3-avarab@gmail.com/


* Re: [PATCH v2 0/8] Add a new --remerge-diff capability to show & log
  2021-12-26 21:52   ` [PATCH v2 0/8] Add a new --remerge-diff capability to show & log Ævar Arnfjörð Bjarmason
@ 2021-12-27 21:11     ` Elijah Newren
  2022-01-10 15:48       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren @ 2021-12-27 21:11 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh

On Sun, Dec 26, 2021 at 2:28 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Sat, Dec 25 2021, Elijah Newren via GitGitGadget wrote:
>
> > === FURTHER BACKGROUND (original cover letter material) ==
> >
> > Here are some example commits you can try this out on (with git show
> > --remerge-diff $COMMIT):
> >
> >  * git.git conflicted merge: 07601b5b36
> >  * git.git non-conflicted change: bf04590ecd
> >  * linux.git conflicted merge: eab3540562fb
> >  * linux.git non-conflicted change: 223cea6a4f05
> >
> > Many more can be found by just running git log --merges --remerge-diff in
> > your repository of choice and searching for diffs (most merges tend to be
> > clean and unmodified and thus produce no diff but a search of '^diff' in the
> > log output tends to find the examples nicely).
> >
> > Some basic high level details about this new option:
> >
> >  * This option is most naturally compared to --cc, though the output seems
> >    to be much more understandable to most users than --cc output.
> >  * Since merges are often clean and unmodified, this new option results in
> >    an empty diff for most merges.
> >  * This new option shows things like the removal of conflict markers, which
> >    hunks users picked from the various conflicted sides to keep or remove,
> >    and shows changes made outside of conflict markers (which might reflect
> >    changes needed to resolve semantic conflicts or cleanups of e.g.
> >    compilation warnings or other additional changes an integrator felt
> >    belonged in the merged result).
> >  * This new option does not (currently) work for octopus merges, since
> >    merge-ort is specific to two-parent merges[1].
> >  * This option will not work on a read-only or full filesystem[2].
> >  * We discussed this capability at Git Merge 2020, and one of the
> >    suggestions was doing a periodic git gc --auto during the operation (due
> >    to potential new blobs and trees created during the operation). I found a
> >    way to avoid that; see [2].
> >  * This option is faster than you'd probably expect; it handles 33.5 merge
> >    commits per second in linux.git on my computer; see below.
> >
> > In regards to the performance point above, the timing for running the
> > following command:
> >
> > time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l
>
> I've been trying to come up with some other useful recipes for this new
> option (which is already very useful, thanks!).

I'm glad you like it.  :-)

> Some of these (if correct) are suggestions for incorporating into the
> (now rather sparse) documentation. I.e. walking users through how to use
> this, and how (if at all) it combines with other options.
>
> I wanted to find all merges between "master".."seen" for which Junio has
> had to resolve a conflict. A naïve version is:
>
>     $ git log --oneline --remerge-diff -p --min-parents=2 origin/master..origin/seen|grep ^diff -B1 | grep Merge
>     [...]

I think the naive version is
  $ git log --remerge-diff --min-parents=2 origin/master..origin/seen
  <search for "^diff" using your pager's search functionality>

Where the "--min-parents=2 origin/master..origin/seen" comes from your
problem description ("find all merges between master..seen").

You can add --oneline to format it, though it's an orthogonal concern.
Also, adding -p is unnecessary: --remerge-diff, like --cc, implies -p.

> But I found that this new option nicely integrates with --diff-filter,
> i.e. we'll end up showing a diff, and the diff machinery allows you
> to filter on it.
>
> It seems to me like all the diffs you show fall under "M", so for

Yes, the diffs I happened to pick all fell under "M", but by no means
should you rely on that happening for all merges in history.  For
example, make a new merge commit, then add a completely new file (or
delete a file, or rename a file, or copy a file, or change its
mode/type), stage the new/deleted/renamed/copied/changed file, and run
"git commit --amend".

So, although --diff-filter=M can be interesting, I would not rely on it.

> master..seen (2ae0a9cb829..61055c2920d) this is equivalent (and the
> output is the same as the above):
>
>     $ git -P log --oneline --remerge-diff --no-patch --min-parents=2 --diff-filter=M origin/master..origin/seen
>     95daa54b1c3 Merge branch 'hn/reftable-fixes' into seen
>     26c4c09dd34 Merge branch 'gc/fetch-negotiate-only-early-return' into seen
>     e3dc8d073f6 Merge branch 'gc/branch-recurse-submodules' into seen
>     aeada898196 Merge branch 'js/branch-track-inherit' into seen
>     4dd30e0da45 Merge branch 'jh/builtin-fsmonitor-part2' into seen
>     337743b17d0 Merge branch 'ab/config-based-hooks-2' into seen
>     261672178c0 Merge branch 'pw/fix-some-issues-in-reset-head' into seen
>     1296d35b041 Merge branch 'ms/customizable-ident-expansion' into seen
>     7a3d7d05126 Merge branch 'ja/i18n-similar-messages' into seen
>     eda714bb8bc Merge branch 'tb/midx-bitmap-corruption-fix' into seen
>     ba02295e3f8 Merge branch 'jh/p4-human-unit-numbers' into jch
>     751773fc38b Merge branch 'es/test-chain-lint' into jch
>     ec17879f495 Merge branch 'tb/cruft-packs' into tb/midx-bitmap-corruption-fix
>
> However for "origin/master..origin/next" (next = 510f9eba9a2 currently)
> we'll oddly show this with "-p":
>
>     9af51fd1d0d Sync with 'master'
>     diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>     CONFLICT (content): Merge conflict in t/lib-gpg.sh
>     d6f56f3248e Merge branch 'es/test-chain-lint' into next
>     diff --git a/t/t4126-apply-empty.sh b/t/t4126-apply-empty.sh
>     CONFLICT (content): Merge conflict in t/t4126-apply-empty.sh
>     index 996c93329c6..33860d38290 100755
>     --- a/t/t4126-apply-empty.sh
>     +++ b/t/t4126-apply-empty.sh
>     [...]
>
> The "oddly" applies only to that "9af51fd1d0d Sync with 'master'", not
> the second d6f56f3248e, which shows the sort of conflict I'd expect. The
> two-line "diff" of:
>
>     diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>     CONFLICT (content): Merge conflict in t/lib-gpg.sh
>
> shows up with -p --remerge-diff, not a mere -p. I also tried the other
> --diff-merges=* options; that behavior is new in
> --diff-merges=remerge. Is this a bug?

Ugh, this is related to my comment elsewhere that conflicts from inner
merges are not nicely differentiated.  If I also apply my other series
(which has not yet been submitted), this instead appears as follows:

$ git show --oneline --remerge-diff 9af51fd1d0d
9af51fd1d0 Sync with 'master'
diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
  From inner merge:  CONFLICT (content): Merge conflict in t/lib-gpg.sh

and the addition of the "From inner merge: " text makes it clearer why
that line appears.  This is an interesting case where a conflict
notice _only_ appears in the inner merge (i.e. the merge of merge
bases), which means that both sides on the outer merge changed the
relevant portion of the file in the same way, so the outer merge had
no conflict.

However, instead of trying to differentiate messages from inner
merges, I think for --remerge-diff's purposes we should just drop all
notices that come from the inner merges.  Those conflict notices might
be helpful when initially resolving a merge, but at the --remerge-diff
level, they're more likely to be distracting than helpful.
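
As a rough illustration of that idea (not the code in this series), the toy
model below tags each notice with the merge depth that produced it and keeps
only those from the outer merge. `struct notice`, its `call_depth` field and
`drop_inner_merge_notices()` are hypothetical names made up for this sketch;
the assumption is that depth 0 means the outer merge and anything deeper
means a merge of merge bases.

```c
#include <stddef.h>

/* Hypothetical record: one notice plus the merge depth that produced it. */
struct notice {
	const char *path;
	const char *msg;
	int call_depth; /* assumed: 0 = outer merge, >0 = merge of merge bases */
};

/*
 * Compact the array in place, keeping only notices from the outer merge.
 * Relative order is preserved.  Returns the number of notices kept.
 */
static size_t drop_inner_merge_notices(struct notice *n, size_t nr)
{
	size_t i, kept = 0;

	for (i = 0; i < nr; i++)
		if (n[i].call_depth == 0)
			n[kept++] = n[i];
	return kept;
}
```

In a real implementation it would likely be simpler to skip recording such
notices in the first place than to filter them afterwards; the filter form
is used here only because it is easy to show in isolation.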

> My local build also has a --pickaxe-patch option. It's something I
> submitted on-list before[1] and have been meaning to re-roll.
>
> I'm discussing it here because it skips the stripping of the "+ " and "-
> " prefixes under -G<regex> and allows you to search through the -U<n>
> context. With that I'm able to do:
>
>     git log --oneline --remerge-diff -p --min-parents=2 --pickaxe-patch -G'^\+' --diff-filter=M origin/master..origin/seen
>
> I.e. on top of the above filter, only show those diffs that have
> additions. AFAICT the conflicting diffs where the committer of the merge
> picked one side or the other will only have "-" lines.
>
> So those diffs that have additions look to be those where the person
> doing the merge needed to combine the two.
>
> Well, usually. E.g. 26c4c09dd34 (Merge branch
> 'gc/fetch-negotiate-only-early-return' into seen, 2021-12-25) in that
> range shows that isn't strictly true. Most such deletion-only diffs are
> less interesting because they simply pick one side or the other of the
> conflict, but that one combines the two:
>
>     -<<<<<<< d3419aac9f4 (Merge branch 'pw/add-p-hunk-split-fix' into seen)
>                             warning(_("protocol does not support --negotiate-only, exiting"));
>     -                       return 1;
>     -=======
>     -                       warning(_("Protocol does not support --negotiate-only, exiting."));
>                             result = 1;
>                             goto cleanup;
>     ->>>>>>> 495e8601f28 (builtin/fetch: die on --negotiate-only and --recurse-submodules)
>
> Which I guess is partially commentary and partially a request (either
> for this series, or some follow-up) for something like a
> --remerge-diff-filter option. I.e. it would be very useful to be able to
> filter on some combination of:
>
>  * Which side(s) of the conflict(s) were picked, or a combination?
>  * Is there "new work" in the diff to resolve the conflict?
>    AFAICT this will always mean we'll have "+ " lines.

Do any of the following count as "new work"? :

  * the deletion of a file (perhaps one that had no conflict but was
deleted anyway)
  * mode changes (again, perhaps on files that had no conflict)
  * renames of files/directories?

If so, searching for "^+" lines might be insufficient, but it depends
on what you mean by new work.
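
To show concretely what a "^+" search does and does not see, here is a small
sketch of such a check over unified diff text. `diff_has_added_lines()` is a
hypothetical helper written for this example, not a git function. It treats
any line starting with '+' as an addition unless it looks like a "+++ " file
header, so it would misclassify an added line whose own content starts with
"++ ". Deletions, mode changes and pure renames produce no such lines, which
is why they would slip past this kind of filter.

```c
#include <string.h>

/*
 * Return 1 if the NUL-terminated unified diff text contains at least one
 * added line, i.e. a line starting with '+' that is not a "+++ " file
 * header; return 0 otherwise.
 */
static int diff_has_added_lines(const char *diff)
{
	const char *line = diff;

	while (*line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		if (len > 0 && line[0] == '+' &&
		    !(len >= 4 && !strncmp(line, "+++ ", 4)))
			return 1;
		if (!eol)
			break;
		line = eol + 1; /* step past the newline to the next line */
	}
	return 0;
}
```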

> Or maybe that's not useful at all, and just -G<rx> (maybe combined with
> my --pickaxe-patch) will cover it?

I'd rather wait until we have a good idea of the potential range of
usecases before adding a filter.  (And I think for now, the -G and
--pickaxe-patch are probably good enough for this usecase.)  These
particular usecases you point out are interesting; thanks for
detailing them.  Here's some others to consider:

  * Finding out when text was added or removed: `git log
--remerge-diff -S<text>` (note that with only -p instead of
--remerge-diff, that command will annoyingly miss cases where a
merge introduced or removed the text)
  * Finding out how a merge differed from one run with some
non-default options (e.g. `git show --remerge-diff -Xours` or `git
show --remerge-diff -Xno-space-change`; although show doesn't take -X
options so this is just an idea at this point)
  * Finding out how a merge would have differed had it been run with
different options (so instead of comparing a remerge to the merge
recorded in history, compare one remerge with default options with a
different merge that uses e.g. -Xno-space-change)

Also, I've got a follow-up series that also introduces a
--remerge-diff-only flag which:
  * For single parent commits that cannot be identified as a revert or
cherry-pick, do not show a diff.
  * For single parent commits that can be identified as a revert or
cherry-pick, instead of showing a diff against the parent of the
commit, redo the revert or cherry-pick in memory and show a diff
against that.
  * For merge commits, act the same as --remerge-diff


* Re: [PATCH v2 0/8] Add a new --remerge-diff capability to show & log
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
                     ` (8 preceding siblings ...)
  2021-12-26 21:52   ` [PATCH v2 0/8] Add a new --remerge-diff capability to show & log Ævar Arnfjörð Bjarmason
@ 2021-12-28 10:55   ` Johannes Altmanninger
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
  10 siblings, 0 replies; 113+ messages in thread
From: Johannes Altmanninger @ 2021-12-28 10:55 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

On Sat, Dec 25, 2021 at 07:59:11AM +0000, Elijah Newren via GitGitGadget wrote:
> Here are some example commits you can try this out on (with git show
> --remerge-diff $COMMIT):
> 
>  * git.git conflicted merge: 07601b5b36
>  * git.git non-conflicted change: bf04590ecd
>  * linux.git conflicted merge: eab3540562fb
>  * linux.git non-conflicted change: 223cea6a4f05
> 
> Many more can be found by just running git log --merges --remerge-diff in
> your repository of choice and searching for diffs (most merges tend to be
> clean and unmodified and thus produce no diff but a search of '^diff' in the
> log output tends to find the examples nicely).
> 
> Some basic high level details about this new option:
> 
>  * This option is most naturally compared to --cc, though the output seems
>    to be much more understandable to most users than --cc output.

Agreed. --cc is *simple* but I'm more comfortable reading conflict markers
from --remerge-diff, since I'm used to that.  So at least for content
conflicts it looks simpler.

* Re: [PATCH v2 1/8] show, log: provide a --remerge-diff capability
  2021-12-25  7:59   ` [PATCH v2 1/8] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
@ 2021-12-28 10:56     ` Johannes Altmanninger
  2021-12-28 22:34       ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Johannes Altmanninger @ 2021-12-28 10:56 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

On Sat, Dec 25, 2021 at 07:59:12AM +0000, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> When this option is specified, we remerge all (two parent) merge commits
> and diff the actual merge commit to the automatically created version,
> in order to show how users removed conflict markers, resolved the
> different conflict versions, and potentially added new changes outside
> of conflict regions in order to resolve semantic merge problems (or,
> possibly, just to hide other random changes).
> 
> This capability works by creating a temporary object directory and
> marking it as the primary object store.  This makes it so that any blobs
> or trees created during the automatic merge easily removable afterwards

s/easily/are easily/ ?

> by just deleting all objects from the temporary object directory.
> 
> There are a few ways that this implementation is suboptimal:
>   * `log --remerge-diff` becomes slow, because the temporary object
>     directory can fills with many loose objects while running

s/can fills/can fill/

>   * the log output can be muddied with misplaced "warning: cannot merge
>     binary files" messages, since ll-merge.c unconditionally writes those
>     messages to stderr while running instead of allowing callers to
>     manage them.
>   * important conflict and warning messages are simply dropped; thus for
>     conflicts like modify/delete or rename/rename or file/directory which
>     are not representable with content conflict markers, there may be no
>     way for a user of --remerge-diff to know that there had been a
>     conflict which was resolved (and which possibly motivated other
>     changes in the merge commit).
> Subsequent commits will address these issues.
> 
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  Documentation/diff-options.txt |  8 ++++
>  builtin/log.c                  | 14 ++++++
>  diff-merges.c                  | 12 +++++
>  log-tree.c                     | 59 +++++++++++++++++++++++
>  revision.h                     |  3 +-
>  t/t4069-remerge-diff.sh        | 86 ++++++++++++++++++++++++++++++++++
>  6 files changed, 181 insertions(+), 1 deletion(-)
>  create mode 100755 t/t4069-remerge-diff.sh
> 
> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index c89d530d3d1..b05f1c9f1c9 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -64,6 +64,14 @@ ifdef::git-log[]
>  	each of the parents. Separate log entry and diff is generated
>  	for each parent.
>  +
> +--diff-merges=remerge:::
> +--diff-merges=r:::
> +--remerge-diff:::

The synopsis above needs an update, too:

	diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
	index c89d530d3d..7a98ab3f85 100644
	--- a/Documentation/diff-options.txt
	+++ b/Documentation/diff-options.txt
	@@ -36,3 +36,3 @@ endif::git-format-patch[]
	 ifdef::git-log[]
	---diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc)::
	+--diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc|remerge|r)::
	 --no-diff-merges::

> +	With this option, two-parent merge commits are remerged to
> +	create a temporary tree object -- potentially containing files
> +	with conflict markers and such.  A diff is then shown between
> +	that temporary tree and the actual merge commit.

I had not really looked at any of the --diff-merges options before.  The term
"remerge" felt a bit opaque at first, because I didn't know what the diff
would look like. I might have found this easier:

--diff-merges=resolution:::
--diff-merges=r:::
--resolution-diff:::
	This makes two-parent merge commits show the diff with respect to
	a mechanical merge of their parents -- potentially containing files
	with conflict markers and such.

But on second thought, remerge is actually consistent with the rest,
because it states _what_ we compare to the merge commit, so never mind.

> ++
>  --diff-merges=combined:::
>  --diff-merges=c:::
>  -c:::
> diff --git a/builtin/log.c b/builtin/log.c
> index f75d87e8d7f..d053418fddd 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -35,6 +35,7 @@
>  #include "repository.h"
>  #include "commit-reach.h"
>  #include "range-diff.h"
> +#include "tmp-objdir.h"
>  
>  #define MAIL_DEFAULT_WRAP 72
>  #define COVER_FROM_AUTO_MAX_SUBJECT_LEN 100
> @@ -406,6 +407,14 @@ static int cmd_log_walk(struct rev_info *rev)
>  	struct commit *commit;
>  	int saved_nrl = 0;
>  	int saved_dcctc = 0;
> +	struct tmp_objdir *remerge_objdir = NULL;
> +
> +	if (rev->remerge_diff) {
> +		remerge_objdir = tmp_objdir_create("remerge-diff");
> +		if (!remerge_objdir)
> +			die_errno(_("unable to create temporary object directory"));
> +		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
> +	}
>  
>  	if (rev->early_output)
>  		setup_early_output();
> @@ -449,6 +458,9 @@ static int cmd_log_walk(struct rev_info *rev)
>  	rev->diffopt.no_free = 0;
>  	diff_free(&rev->diffopt);
>  
> +	if (rev->remerge_diff)
> +		tmp_objdir_destroy(remerge_objdir);
> +
>  	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
>  	    rev->diffopt.flags.check_failed) {
>  		return 02;
> @@ -1943,6 +1955,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>  		die(_("--name-status does not make sense"));
>  	if (rev.diffopt.output_format & DIFF_FORMAT_CHECKDIFF)
>  		die(_("--check does not make sense"));
> +	if (rev.remerge_diff)
> +		die(_("--remerge-diff does not make sense"));
>  
>  	if (!use_patch_format &&
>  		(!rev.diffopt.output_format ||
> diff --git a/diff-merges.c b/diff-merges.c
> index 5060ccd890b..0af4b3f9191 100644
> --- a/diff-merges.c
> +++ b/diff-merges.c
> @@ -17,6 +17,7 @@ static void suppress(struct rev_info *revs)
>  	revs->combined_all_paths = 0;
>  	revs->merges_imply_patch = 0;
>  	revs->merges_need_diff = 0;
> +	revs->remerge_diff = 0;
>  }
>  
>  static void set_separate(struct rev_info *revs)
> @@ -45,6 +46,12 @@ static void set_dense_combined(struct rev_info *revs)
>  	revs->dense_combined_merges = 1;
>  }
>  
> +static void set_remerge_diff(struct rev_info *revs)
> +{
> +	suppress(revs);
> +	revs->remerge_diff = 1;
> +}
> +
>  static diff_merges_setup_func_t func_by_opt(const char *optarg)
>  {
>  	if (!strcmp(optarg, "off") || !strcmp(optarg, "none"))
> @@ -57,6 +64,8 @@ static diff_merges_setup_func_t func_by_opt(const char *optarg)
>  		return set_combined;
>  	else if (!strcmp(optarg, "cc") || !strcmp(optarg, "dense-combined"))
>  		return set_dense_combined;
> +	else if (!strcmp(optarg, "r") || !strcmp(optarg, "remerge"))
> +		return set_remerge_diff;
>  	else if (!strcmp(optarg, "m") || !strcmp(optarg, "on"))
>  		return set_to_default;
>  	return NULL;
> @@ -110,6 +119,9 @@ int diff_merges_parse_opts(struct rev_info *revs, const char **argv)
>  	} else if (!strcmp(arg, "--cc")) {
>  		set_dense_combined(revs);
>  		revs->merges_imply_patch = 1;
> +	} else if (!strcmp(arg, "--remerge-diff")) {
> +		set_remerge_diff(revs);
> +		revs->merges_imply_patch = 1;
>  	} else if (!strcmp(arg, "--no-diff-merges")) {
>  		suppress(revs);
>  	} else if (!strcmp(arg, "--combined-all-paths")) {
> diff --git a/log-tree.c b/log-tree.c
> index 644893fd8cf..84ed864fc81 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -1,4 +1,5 @@
>  #include "cache.h"
> +#include "commit-reach.h"
>  #include "config.h"
>  #include "diff.h"
>  #include "object-store.h"
> @@ -7,6 +8,7 @@
>  #include "tag.h"
>  #include "graph.h"
>  #include "log-tree.h"
> +#include "merge-ort.h"
>  #include "reflog-walk.h"
>  #include "refs.h"
>  #include "string-list.h"
> @@ -902,6 +904,51 @@ static int do_diff_combined(struct rev_info *opt, struct commit *commit)
>  	return !opt->loginfo;
>  }
>  
> +static int do_remerge_diff(struct rev_info *opt,
> +			   struct commit_list *parents,
> +			   struct object_id *oid,
> +			   struct commit *commit)
> +{
> +	struct merge_options o;
> +	struct commit_list *bases;
> +	struct merge_result res = {0};
> +	struct pretty_print_context ctx = {0};
> +	struct commit *parent1 = parents->item;
> +	struct commit *parent2 = parents->next->item;
> +	struct strbuf parent1_desc = STRBUF_INIT;
> +	struct strbuf parent2_desc = STRBUF_INIT;
> +
> +	/* Setup merge options */
> +	init_merge_options(&o, the_repository);
> +	o.show_rename_progress = 0;

Is there a reason why we are repeating the default here (but not anywhere else)?
For example sequencer.c::do_merge() and builtin/am.c::fall_back_threeway()
don't, and probably also rely on this being disabled(?).

> +
> +	ctx.abbrev = DEFAULT_ABBREV;
> +	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
> +	format_commit_message(parent2, "%h (%s)", &parent2_desc, &ctx);
> +	o.branch1 = parent1_desc.buf;
> +	o.branch2 = parent2_desc.buf;
> +
> +	/* Parse the relevant commits and get the merge bases */
> +	parse_commit_or_die(parent1);
> +	parse_commit_or_die(parent2);
> +	bases = get_merge_bases(parent1, parent2);
> +
> +	/* Re-merge the parents */
> +	merge_incore_recursive(&o, bases, parent1, parent2, &res);
> +
> +	/* Show the diff */
> +	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
> +	log_tree_diff_flush(opt);
> +
> +	/* Cleanup */
> +	strbuf_release(&parent1_desc);
> +	strbuf_release(&parent2_desc);
> +	merge_finalize(&o, &res);
> +	/* TODO: clean up the temporary object directory */
> +
> +	return !opt->loginfo;
> +}
> +
>  /*
>   * Show the diff of a commit.
>   *
> @@ -936,6 +983,18 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log
>  	}
>  
>  	if (is_merge) {
> +		int octopus = (parents->next->next != NULL);
> +
> +		if (opt->remerge_diff) {
> +			if (octopus) {
> +				show_log(opt);
> +				fprintf(opt->diffopt.file,
> +					"diff: warning: Skipping remerge-diff "
> +					"for octopus merges.\n");
> +				return 1;
> +			}
> +			return do_remerge_diff(opt, parents, oid, commit);
> +		}
>  		if (opt->combine_merges)
>  			return do_diff_combined(opt, commit);
>  		if (opt->separate_merges) {
> diff --git a/revision.h b/revision.h
> index 5578bb4720a..13178e6b8f3 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -195,7 +195,8 @@ struct rev_info {
>  			combine_merges:1,
>  			combined_all_paths:1,
>  			dense_combined_merges:1,
> -			first_parent_merges:1;
> +			first_parent_merges:1,
> +			remerge_diff:1;
>  
>  	/* Format info */
>  	int		show_notes;
> diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
> new file mode 100755
> index 00000000000..192dbce2bfe
> --- /dev/null
> +++ b/t/t4069-remerge-diff.sh
> @@ -0,0 +1,86 @@
> +#!/bin/sh
> +
> +test_description='remerge-diff handling'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup basic merges' '
> +	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
> +	git add numbers &&
> +	git commit -m base &&
> +
> +	git branch feature_a &&
> +	git branch feature_b &&
> +	git branch feature_c &&
> +
> +	git branch ab_resolution &&
> +	git branch bc_resolution &&
> +
> +	git checkout feature_a &&
> +	test_write_lines 1 2 three 4 5 6 7 eight 9 >numbers &&
> +	git commit -a -m change_a &&
> +
> +	git checkout feature_b &&
> +	test_write_lines 1 2 tres 4 5 6 7 8 9 >numbers &&
> +	git commit -a -m change_b &&
> +
> +	git checkout feature_c &&
> +	test_write_lines 1 2 3 4 5 6 7 8 9 10 >numbers &&
> +	git commit -a -m change_c &&
> +
> +	git checkout bc_resolution &&
> +	# fast forward
> +	git merge feature_b &&

maybe use --ff-only instead of the comment? Same below.
(But if we did that we probably want to drop the "no conflict" comment too.)

> +	# no conflict
> +	git merge feature_c &&
> +
> +	git checkout ab_resolution &&
> +	# fast forward
> +	git merge feature_a &&
> +	# conflicts!
> +	test_must_fail git merge feature_b &&
> +	# Resolve conflict...and make another change elsewhere
> +	test_write_lines 1 2 drei 4 5 6 7 acht 9 >numbers &&
> +	git add numbers &&
> +	git merge --continue
> +'
> +
> +test_expect_success 'remerge-diff on a clean merge' '
> +	git log -1 --oneline bc_resolution >expect &&
> +	git show --oneline --remerge-diff bc_resolution >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'remerge-diff with both a resolved conflict and an unrelated change' '
> +	git log -1 --oneline ab_resolution >tmp &&
> +	cat <<-EOF >>tmp &&
> +	diff --git a/numbers b/numbers
> +	index a1fb731..6875544 100644
> +	--- a/numbers
> +	+++ b/numbers
> +	@@ -1,13 +1,9 @@
> +	 1
> +	 2
> +	-<<<<<<< b0ed5cb (change_a)
> +	-three
> +	-=======
> +	-tres
> +	->>>>>>> 6cd3f82 (change_b)
> +	+drei

nice

> +	 4
> +	 5
> +	 6
> +	 7
> +	-eight
> +	+acht
> +	 9
> +	EOF
> +	# Hashes above are sha1; rip them out so test works with sha256
> +	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&

Right, sha256 could cause many noisy test changes. I wonder if there is a
more general way to avoid this; maybe default to SHA1 for existing tests?

> +
> +	git show --oneline --remerge-diff ab_resolution >tmp &&
> +	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_done
> -- 
> gitgitgadget
> 

* Re: [PATCH v2 3/8] ll-merge: make callers responsible for showing warnings
  2021-12-25  7:59   ` [PATCH v2 3/8] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
@ 2021-12-28 10:56     ` Johannes Altmanninger
  2021-12-28 19:37       ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Johannes Altmanninger @ 2021-12-28 10:56 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

On Sat, Dec 25, 2021 at 07:59:14AM +0000, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> Since some callers may want to send warning messages to somewhere other
> than stdout/stderr, stop printing "warning: Cannot merge binary files"
> from ll-merge and instead modify the return status of ll_merge() to
> indicate when a merge of binary files has occurred.
> 
> This commit continues printing the message as-is; future changes will
> start handling the new commit differently in the merge-ort codepath.

"the new commit" looks like a typo, do you mean "binary conflicts"?

> 
> Note that my methodology included first modifying ll_merge() to return
> a struct, so that the compiler would catch all the callers for me and
> ensure I had modified all of them.  After modifying all of them, I then
> changed the struct to an enum.

Heh, this is a clever way to work around C's weak typing.
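
For readers who have not seen the trick, here is a minimal sketch of it
outside of git. It is an illustration only, not code from this series:
do_merge(), caller_saw_binary_conflict() and struct merge_result_wrapper
are hypothetical names, and do_merge() is just a stand-in for ll_merge().
While the return type is temporarily a struct, a caller that still assigns
the result to an int, or compares it to 0, no longer compiles.

```c
#include <assert.h>

/* Final form of the return type, mirroring the enum added in this patch. */
enum ll_merge_result {
	LL_MERGE_ERROR = -1,
	LL_MERGE_OK = 0,
	LL_MERGE_CONFLICT,
	LL_MERGE_BINARY_CONFLICT,
};

/*
 * Hypothetical intermediate step: wrap the result in a struct.  C converts
 * silently between int and enum, but not between int and a struct, so
 *
 *	int status = do_merge(0);	(error: incompatible types)
 *
 * stops compiling and the compiler points at every caller to update.
 */
struct merge_result_wrapper {
	enum ll_merge_result value;
};

/* Hypothetical stand-in for ll_merge(): reports a binary conflict on request. */
struct merge_result_wrapper do_merge(int is_binary)
{
	struct merge_result_wrapper ret;

	ret.value = is_binary ? LL_MERGE_BINARY_CONFLICT : LL_MERGE_OK;
	return ret;
}

/* An updated caller has to unwrap the value explicitly. */
int caller_saw_binary_conflict(int is_binary)
{
	struct merge_result_wrapper res = do_merge(is_binary);

	assert(res.value != LL_MERGE_ERROR); /* the stand-in never fails */
	return res.value == LL_MERGE_BINARY_CONFLICT;
}
```

Once every caller has been converted, the wrapper can be replaced by the
plain enum, which is the end state the commit message describes.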

The language server I'm using (clangd) supports the Call Hierarchy feature,
which is intended to list callers or callees of the function at the editor's
cursor. If I ask the server for callers of ll_merge I get this response
(on 510f9eba9 plus this series)

	ll-merge.h:98:1: ll_merge - list of callers
	  builtin/checkout.c:242:12: checkout_merged
	    builtin/checkout.c:279:17: 	merge_status = ll_merge(&result_buf, path, &ancestor, "base",
	  rerere.c:943:12: handle_cache
	    rerere.c:984:2: 	ll_merge(&result, path, &mmfile[0], NULL,
	  notes-merge.c:342:12: ll_merge_in_worktree
	    notes-merge.c:353:11: 	status = ll_merge(&result_buf, oid_to_hex(&p->obj), &base, NULL,
	  merge-recursive.c:1035:12: merge_3way
	    merge-recursive.c:1090:17: 	merge_status = ll_merge(result_buf, a->path, &orig, base,
	  merge-ort.c:1763:12: merge_3way
	    merge-ort.c:1816:17: 	merge_status = ll_merge(result_buf, path, &orig, base,
	  merge-blobs.c:32:14: three_way_filemerge
	    merge-blobs.c:48:17: 	merge_status = ll_merge(&res, path, base, NULL,
	  apply.c:3491:12: three_way_merge
	    apply.c:3511:11: 	status = ll_merge(&result, path,
	  rerere.c:608:12: try_merge
	    rerere.c:623:9: 		ret = ll_merge(result, path, &base, NULL, cur, "", &other, "",

So there are 8 callers in total; but only 7 print the warning (including the
one in merge-ort which will change in the next commit). I think you missed
the call at rerere.c:984 because we ignore its return value.

> 
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  apply.c            |  5 ++++-
>  builtin/checkout.c | 12 ++++++++----
>  ll-merge.c         | 40 ++++++++++++++++++++++------------------
>  ll-merge.h         |  9 ++++++++-
>  merge-blobs.c      |  5 ++++-
>  merge-ort.c        |  5 ++++-
>  merge-recursive.c  |  5 ++++-
>  notes-merge.c      |  5 ++++-
>  rerere.c           | 12 ++++++++----
>  9 files changed, 66 insertions(+), 32 deletions(-)
> 
> diff --git a/apply.c b/apply.c
> index 43a0aebf4ee..8079395755f 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -3492,7 +3492,7 @@ static int three_way_merge(struct apply_state *state,
>  {
>  	mmfile_t base_file, our_file, their_file;
>  	mmbuffer_t result = { NULL };
> -	int status;
> +	enum ll_merge_result status;
>  
>  	/* resolve trivial cases first */
>  	if (oideq(base, ours))
> @@ -3509,6 +3509,9 @@ static int three_way_merge(struct apply_state *state,
>  			  &their_file, "theirs",
>  			  state->repo->index,
>  			  NULL);
> +	if (status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			path, "ours", "theirs");
>  	free(base_file.ptr);
>  	free(our_file.ptr);
>  	free(their_file.ptr);
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index cbf73b8c9f6..3a559d69303 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -237,6 +237,7 @@ static int checkout_merged(int pos, const struct checkout *state,
>  	struct cache_entry *ce = active_cache[pos];
>  	const char *path = ce->name;
>  	mmfile_t ancestor, ours, theirs;
> +	enum ll_merge_result merge_status;
>  	int status;
>  	struct object_id oid;
>  	mmbuffer_t result_buf;
> @@ -267,13 +268,16 @@ static int checkout_merged(int pos, const struct checkout *state,
>  	memset(&ll_opts, 0, sizeof(ll_opts));
>  	git_config_get_bool("merge.renormalize", &renormalize);
>  	ll_opts.renormalize = renormalize;
> -	status = ll_merge(&result_buf, path, &ancestor, "base",
> -			  &ours, "ours", &theirs, "theirs",
> -			  state->istate, &ll_opts);
> +	merge_status = ll_merge(&result_buf, path, &ancestor, "base",
> +				&ours, "ours", &theirs, "theirs",
> +				state->istate, &ll_opts);
>  	free(ancestor.ptr);
>  	free(ours.ptr);
>  	free(theirs.ptr);
> -	if (status < 0 || !result_buf.ptr) {
> +	if (merge_status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			path, "ours", "theirs");
> +	if (merge_status < 0 || !result_buf.ptr) {
>  		free(result_buf.ptr);
>  		return error(_("path '%s': cannot merge"), path);
>  	}
> diff --git a/ll-merge.c b/ll-merge.c
> index 261657578c7..669c09eed6c 100644
> --- a/ll-merge.c
> +++ b/ll-merge.c
> @@ -14,7 +14,7 @@
>  
>  struct ll_merge_driver;
>  
> -typedef int (*ll_merge_fn)(const struct ll_merge_driver *,
> +typedef enum ll_merge_result (*ll_merge_fn)(const struct ll_merge_driver *,
>  			   mmbuffer_t *result,
>  			   const char *path,
>  			   mmfile_t *orig, const char *orig_name,
> @@ -49,7 +49,7 @@ void reset_merge_attributes(void)
>  /*
>   * Built-in low-levels
>   */
> -static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
> +static enum ll_merge_result ll_binary_merge(const struct ll_merge_driver *drv_unused,
>  			   mmbuffer_t *result,
>  			   const char *path,
>  			   mmfile_t *orig, const char *orig_name,
> @@ -58,6 +58,7 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
>  			   const struct ll_merge_options *opts,
>  			   int marker_size)
>  {
> +	enum ll_merge_result ret;
>  	mmfile_t *stolen;
>  	assert(opts);
>  
> @@ -68,16 +69,19 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
>  	 */
>  	if (opts->virtual_ancestor) {
>  		stolen = orig;
> +		ret = LL_MERGE_OK;
>  	} else {
>  		switch (opts->variant) {
>  		default:
> -			warning("Cannot merge binary files: %s (%s vs. %s)",
> -				path, name1, name2);
> -			/* fallthru */
> +			ret = LL_MERGE_BINARY_CONFLICT;
> +			stolen = src1;
> +			break;
>  		case XDL_MERGE_FAVOR_OURS:
> +			ret = LL_MERGE_OK;
>  			stolen = src1;
>  			break;
>  		case XDL_MERGE_FAVOR_THEIRS:
> +			ret = LL_MERGE_OK;
>  			stolen = src2;
>  			break;
>  		}
> @@ -87,16 +91,10 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
>  	result->size = stolen->size;
>  	stolen->ptr = NULL;
>  
> -	/*
> -	 * With -Xtheirs or -Xours, we have cleanly merged;
> -	 * otherwise we got a conflict.
> -	 */
> -	return opts->variant == XDL_MERGE_FAVOR_OURS ||
> -	       opts->variant == XDL_MERGE_FAVOR_THEIRS ?
> -	       0 : 1;
> +	return ret;
>  }
>  
> -static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
> +static enum ll_merge_result ll_xdl_merge(const struct ll_merge_driver *drv_unused,
>  			mmbuffer_t *result,
>  			const char *path,
>  			mmfile_t *orig, const char *orig_name,
> @@ -105,7 +103,9 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
>  			const struct ll_merge_options *opts,
>  			int marker_size)
>  {
> +	enum ll_merge_result ret;
>  	xmparam_t xmp;
> +	int status;
>  	assert(opts);
>  
>  	if (orig->size > MAX_XDIFF_SIZE ||
> @@ -133,10 +133,12 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
>  	xmp.ancestor = orig_name;
>  	xmp.file1 = name1;
>  	xmp.file2 = name2;
> -	return xdl_merge(orig, src1, src2, &xmp, result);
> +	status = xdl_merge(orig, src1, src2, &xmp, result);
> +	ret = (status > 1 ) ? LL_MERGE_CONFLICT : status;

" (status > 1 )" has an extra space

I'm not sure it's wise to handle status=1 and status=2 in two different code paths.
Both mean the same thing (the only difference is the number of conflicts);
status=1 matches LL_MERGE_CONFLICT only by coincidence. I'd suggest:

	ret = (status > 0) ? LL_MERGE_CONFLICT : status;
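
Spelled out as a stand-alone sketch, the mapping could look like the code
below. status_to_result() is a hypothetical helper, not something in the
series, and it assumes that a negative status means an error, 0 means a
clean merge, and any positive value is a conflict count. Unlike the
one-liner above, it also collapses every negative value to LL_MERGE_ERROR,
which is an extra assumption of this sketch.

```c
#include <assert.h>

/* Same values as the enum added to ll-merge.h in this patch. */
enum ll_merge_result {
	LL_MERGE_ERROR = -1,
	LL_MERGE_OK = 0,
	LL_MERGE_CONFLICT,
	LL_MERGE_BINARY_CONFLICT,
};

/*
 * Hypothetical helper: map an xdl_merge()-style status to the enum.
 * One conflict and many conflicts take the same branch, so the result
 * does not depend on LL_MERGE_CONFLICT happening to equal 1.
 */
enum ll_merge_result status_to_result(int status)
{
	if (status < 0)
		return LL_MERGE_ERROR;
	return (status > 0) ? LL_MERGE_CONFLICT : LL_MERGE_OK;
}
```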

> +	return ret;
>  }
>  
> -static int ll_union_merge(const struct ll_merge_driver *drv_unused,
> +static enum ll_merge_result ll_union_merge(const struct ll_merge_driver *drv_unused,
>  			  mmbuffer_t *result,
>  			  const char *path,
>  			  mmfile_t *orig, const char *orig_name,
> @@ -178,7 +180,7 @@ static void create_temp(mmfile_t *src, char *path, size_t len)
>  /*
>   * User defined low-level merge driver support.
>   */
> -static int ll_ext_merge(const struct ll_merge_driver *fn,
> +static enum ll_merge_result ll_ext_merge(const struct ll_merge_driver *fn,
>  			mmbuffer_t *result,
>  			const char *path,
>  			mmfile_t *orig, const char *orig_name,
> @@ -194,6 +196,7 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
>  	const char *args[] = { NULL, NULL };
>  	int status, fd, i;
>  	struct stat st;
> +	enum ll_merge_result ret;
>  	assert(opts);
>  
>  	sq_quote_buf(&path_sq, path);
> @@ -236,7 +239,8 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
>  		unlink_or_warn(temp[i]);
>  	strbuf_release(&cmd);
>  	strbuf_release(&path_sq);
> -	return status;
> +	ret = (status > 1) ? LL_MERGE_CONFLICT : status;

same here, I'd test for "status > 0" because that's the convention for
external programs

> +	return ret;
>  }
>  
>  /*
> @@ -362,7 +366,7 @@ static void normalize_file(mmfile_t *mm, const char *path, struct index_state *i
>  	}
>  }
>  
> -int ll_merge(mmbuffer_t *result_buf,
> +enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
>  	     const char *path,
>  	     mmfile_t *ancestor, const char *ancestor_label,
>  	     mmfile_t *ours, const char *our_label,
> diff --git a/ll-merge.h b/ll-merge.h
> index aceb1b24132..e4a20e81a3a 100644
> --- a/ll-merge.h
> +++ b/ll-merge.h
> @@ -82,13 +82,20 @@ struct ll_merge_options {
>  	long xdl_opts;
>  };
>  
> +enum ll_merge_result {
> +	LL_MERGE_ERROR = -1,
> +	LL_MERGE_OK = 0,
> +	LL_MERGE_CONFLICT,
> +	LL_MERGE_BINARY_CONFLICT,
> +};
> +
>  /**
>   * Perform a three-way single-file merge in core.  This is a thin wrapper
>   * around `xdl_merge` that takes the path and any merge backend specified in
>   * `.gitattributes` or `.git/info/attributes` into account.
>   * Returns 0 for a clean merge.
>   */
> -int ll_merge(mmbuffer_t *result_buf,
> +enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
>  	     const char *path,
>  	     mmfile_t *ancestor, const char *ancestor_label,
>  	     mmfile_t *ours, const char *our_label,
> diff --git a/merge-blobs.c b/merge-blobs.c
> index ee0a0e90c94..8138090f81c 100644
> --- a/merge-blobs.c
> +++ b/merge-blobs.c
> @@ -36,7 +36,7 @@ static void *three_way_filemerge(struct index_state *istate,
>  				 mmfile_t *their,
>  				 unsigned long *size)
>  {
> -	int merge_status;
> +	enum ll_merge_result merge_status;
>  	mmbuffer_t res;
>  
>  	/*
> @@ -50,6 +50,9 @@ static void *three_way_filemerge(struct index_state *istate,
>  				istate, NULL);
>  	if (merge_status < 0)
>  		return NULL;
> +	if (merge_status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			path, ".our", ".their");
>  
>  	*size = res.size;
>  	return res.ptr;
> diff --git a/merge-ort.c b/merge-ort.c
> index 0342f104836..c24da2ba3cb 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -1743,7 +1743,7 @@ static int merge_3way(struct merge_options *opt,
>  	mmfile_t orig, src1, src2;
>  	struct ll_merge_options ll_opts = {0};
>  	char *base, *name1, *name2;
> -	int merge_status;
> +	enum ll_merge_result merge_status;
>  
>  	if (!opt->priv->attr_index.initialized)
>  		initialize_attr_index(opt);
> @@ -1787,6 +1787,9 @@ static int merge_3way(struct merge_options *opt,
>  	merge_status = ll_merge(result_buf, path, &orig, base,
>  				&src1, name1, &src2, name2,
>  				&opt->priv->attr_index, &ll_opts);
> +	if (merge_status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			path, name1, name2);
>  
>  	free(base);
>  	free(name1);
> diff --git a/merge-recursive.c b/merge-recursive.c
> index d9457797dbb..bc73c52dd84 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -1044,7 +1044,7 @@ static int merge_3way(struct merge_options *opt,
>  	mmfile_t orig, src1, src2;
>  	struct ll_merge_options ll_opts = {0};
>  	char *base, *name1, *name2;
> -	int merge_status;
> +	enum ll_merge_result merge_status;
>  
>  	ll_opts.renormalize = opt->renormalize;
>  	ll_opts.extra_marker_size = extra_marker_size;
> @@ -1090,6 +1090,9 @@ static int merge_3way(struct merge_options *opt,
>  	merge_status = ll_merge(result_buf, a->path, &orig, base,
>  				&src1, name1, &src2, name2,
>  				opt->repo->index, &ll_opts);
> +	if (merge_status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			a->path, name1, name2);
>  
>  	free(base);
>  	free(name1);
> diff --git a/notes-merge.c b/notes-merge.c
> index b4a3a903e86..01d596920ea 100644
> --- a/notes-merge.c
> +++ b/notes-merge.c
> @@ -344,7 +344,7 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
>  {
>  	mmbuffer_t result_buf;
>  	mmfile_t base, local, remote;
> -	int status;
> +	enum ll_merge_result status;
>  
>  	read_mmblob(&base, &p->base);
>  	read_mmblob(&local, &p->local);
> @@ -358,6 +358,9 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
>  	free(local.ptr);
>  	free(remote.ptr);
>  
> +	if (status == LL_MERGE_BINARY_CONFLICT)
> +		warning("Cannot merge binary files: %s (%s vs. %s)",
> +			oid_to_hex(&p->obj), o->local_ref, o->remote_ref);
>  	if ((status < 0) || !result_buf.ptr)
>  		die("Failed to execute internal merge");
>  
> diff --git a/rerere.c b/rerere.c
> index d83d58df4fb..b1f8961ed9e 100644
> --- a/rerere.c
> +++ b/rerere.c
> @@ -609,19 +609,23 @@ static int try_merge(struct index_state *istate,
>  		     const struct rerere_id *id, const char *path,
>  		     mmfile_t *cur, mmbuffer_t *result)
>  {
> -	int ret;
> +	enum ll_merge_result ret;
>  	mmfile_t base = {NULL, 0}, other = {NULL, 0};
>  
>  	if (read_mmfile(&base, rerere_path(id, "preimage")) ||
> -	    read_mmfile(&other, rerere_path(id, "postimage")))
> -		ret = 1;
> -	else
> +	    read_mmfile(&other, rerere_path(id, "postimage"))) {
> +		ret = LL_MERGE_CONFLICT;
> +	} else {
>  		/*
>  		 * A three-way merge. Note that this honors user-customizable
>  		 * low-level merge driver settings.
>  		 */
>  		ret = ll_merge(result, path, &base, NULL, cur, "", &other, "",
>  			       istate, NULL);
> +		if (ret == LL_MERGE_BINARY_CONFLICT)
> +			warning("Cannot merge binary files: %s (%s vs. %s)",
> +				path, "", "");

With the next patch, 7/8 callers of ll_merge (almost) immediately print
that warning.  Looks fine as is, but does it make sense to introduce a helper
function for the common case, or add a flag to ll_merge_options?
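
To make the helper idea concrete, here is a rough, self-contained sketch.
warn_on_binary_conflict() is a hypothetical name, and writing to a FILE *
is an assumption made so the example builds outside of git; inside git the
body would presumably just call warning().

```c
#include <assert.h>
#include <stdio.h>

/* Same values as the enum added to ll-merge.h in this patch. */
enum ll_merge_result {
	LL_MERGE_ERROR = -1,
	LL_MERGE_OK = 0,
	LL_MERGE_CONFLICT,
	LL_MERGE_BINARY_CONFLICT,
};

/*
 * Hypothetical helper: print the common warning when the merge result is
 * a binary conflict.  Returns 1 if a warning was printed, 0 otherwise.
 */
int warn_on_binary_conflict(FILE *out, enum ll_merge_result status,
			    const char *path,
			    const char *name1, const char *name2)
{
	if (status != LL_MERGE_BINARY_CONFLICT)
		return 0;
	fprintf(out, "warning: Cannot merge binary files: %s (%s vs. %s)\n",
		path, name1, name2);
	return 1;
}
```

Callers that want the message somewhere else, such as the merge-ort path
in the next patch, would simply not call the helper and inspect the enum
themselves.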

> +	}
>  
>  	free(base.ptr);
>  	free(other.ptr);
> -- 
> gitgitgadget
> 

* Re: [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers
  2021-12-25  7:59   ` [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
  2021-12-26 18:30     ` In-tree strbuf "in-place" search/replace (was: [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers) Ævar Arnfjörð Bjarmason
@ 2021-12-28 10:56     ` Johannes Altmanninger
  2021-12-28 21:48       ` Elijah Newren
  1 sibling, 1 reply; 113+ messages in thread
From: Johannes Altmanninger @ 2021-12-28 10:56 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

On Sat, Dec 25, 2021 at 07:59:17AM +0000, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> When users run
>     git show --remerge-diff $MERGE_COMMIT
> or
>     git log -p --remerge-diff ...
> stdout is not an appropriate location to dump conflict messages, but we
> do want to provide them to users.  We will include them in the diff
> headers instead...but for that to work, we need for any multiline
> messages to replace newlines with both a newline and a space.  Add a new
> flag to signal when we want these messages modified in such a fashion,
> and use it in path_msg() to modify these messages this way.

makes sense (same for patches 4 & 5)
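
For reference, a minimal standalone sketch of the transformation the commit
message describes (every newline followed by one space), assuming plain C
strings instead of git's strbuf. The name indent_continuation_lines() is
hypothetical and does not appear in the patch:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: return a newly allocated
 * copy of `msg` in which every newline is followed by a single space.
 * The caller frees the result.  Returns NULL if allocation fails.
 */
static char *indent_continuation_lines(const char *msg)
{
	size_t len = strlen(msg);
	/* Worst case: every byte is a newline, so the output doubles. */
	char *out = malloc(2 * len + 1);
	size_t i, j = 0;

	if (!out)
		return NULL;
	for (i = 0; i < len; i++) {
		out[j++] = msg[i];
		if (msg[i] == '\n')
			out[j++] = ' ';
	}
	out[j] = '\0';
	return out;
}
```

Keeping separate read and write indices (i and j here) makes it harder to
mix up the offset into the source with the offset into the destination.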

> 
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-ort.c       | 36 ++++++++++++++++++++++++++++++++++--
>  merge-recursive.c |  3 +++
>  merge-recursive.h |  1 +
>  3 files changed, 38 insertions(+), 2 deletions(-)
> 
> diff --git a/merge-ort.c b/merge-ort.c
> index 998e92ec593..9142d56e0ad 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -634,17 +634,46 @@ static void path_msg(struct merge_options *opt,
>  		     const char *fmt, ...)
>  {
>  	va_list ap;
> -	struct strbuf *sb = strmap_get(&opt->priv->output, path);
> +	struct strbuf *sb, *dest;
> +	struct strbuf tmp = STRBUF_INIT;
> +
> +	if (opt->record_conflict_msgs_as_headers && omittable_hint)
> +		return; /* Do not record mere hints in tree */
> +	sb = strmap_get(&opt->priv->output, path);
>  	if (!sb) {
>  		sb = xmalloc(sizeof(*sb));
>  		strbuf_init(sb, 0);
>  		strmap_put(&opt->priv->output, path, sb);
>  	}
>  
> +	dest = (opt->record_conflict_msgs_as_headers ? &tmp : sb);
> +
>  	va_start(ap, fmt);
> -	strbuf_vaddf(sb, fmt, ap);
> +	strbuf_vaddf(dest, fmt, ap);
>  	va_end(ap);
>  
> +	if (opt->record_conflict_msgs_as_headers) {
> +		int i_sb = 0, i_tmp = 0;
> +
> +		/* Copy tmp to sb, adding spaces after newlines */
> +		strbuf_grow(sb, 2*tmp.len); /* more than sufficient */
> +		for (; i_tmp < tmp.len; i_tmp++, i_sb++) {
> +			/* Copy next character from tmp to sb */
> +			sb->buf[sb->len + i_sb] = tmp.buf[i_tmp];
> +
> +			/* If we copied a newline, add a space */
> +			if (tmp.buf[i_tmp] == '\n')
> +				sb->buf[++i_sb] = ' ';
> +		}
> +		/* Update length and ensure it's NUL-terminated */

I think this and the two comments inside the loop are mostly redundant. I'd
drop them (except maybe this one because it's a common oversight I guess).

> +		sb->len += i_sb;
> +		sb->buf[sb->len] = '\0';
> +
> +		/* Clean up tmp */

Also this one I guess

> +		strbuf_release(&tmp);
> +	}
> +
> +	/* Add final newline character to sb */
>  	strbuf_addch(sb, '\n');
>  }
>  
> @@ -4246,6 +4275,9 @@ void merge_switch_to_result(struct merge_options *opt,
>  		struct string_list olist = STRING_LIST_INIT_NODUP;
>  		int i;
>  
> +		if (opt->record_conflict_msgs_as_headers)
> +			BUG("Either display conflict messages or record them as headers, not both");
> +
>  		trace2_region_enter("merge", "display messages", opt->repo);
>  
>  		/* Hack to pre-allocate olist to the desired size */
> diff --git a/merge-recursive.c b/merge-recursive.c
> index bc73c52dd84..c9ba7e904a6 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -3714,6 +3714,9 @@ static int merge_start(struct merge_options *opt, struct tree *head)
>  
>  	assert(opt->priv == NULL);
>  
> +	/* Not supported; option specific to merge-ort */
> +	assert(!opt->record_conflict_msgs_as_headers);
> +
>  	/* Sanity check on repo state; index must match head */
>  	if (repo_index_has_changes(opt->repo, head, &sb)) {
>  		err(opt, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
> diff --git a/merge-recursive.h b/merge-recursive.h
> index 0795a1d3ec1..ebfdb7f994e 100644
> --- a/merge-recursive.h
> +++ b/merge-recursive.h
> @@ -46,6 +46,7 @@ struct merge_options {
>  	/* miscellaneous control options */
>  	const char *subtree_shift;
>  	unsigned renormalize : 1;
> +	unsigned record_conflict_msgs_as_headers : 1;
>  
>  	/* internal fields used by the implementation */
>  	struct merge_options_internal *priv;
> -- 
> gitgitgadget
> 

* Re: [PATCH v2 7/8] diff: add ability to insert additional headers for paths
  2021-12-25  7:59   ` [PATCH v2 7/8] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
@ 2021-12-28 10:57     ` Johannes Altmanninger
  2021-12-28 21:09       ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Johannes Altmanninger @ 2021-12-28 10:57 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

On Sat, Dec 25, 2021 at 07:59:18AM +0000, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> When additional headers are provided, we need to
>   * add diff_filepairs to diff_queued_diff for each path in the
>     additional headers map, unless that path is part of
>     another diff_filepair already found in diff_queued_diff
>   * format the headers (colorization, line_prefix for --graph)
>   * make sure the various codepaths that attempt to return early
>     if there are "no changes" take into account the headers that
>     need to be shown.
> 
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  diff.c     | 116 +++++++++++++++++++++++++++++++++++++++++++++++++++--
>  diff.h     |   3 +-
>  log-tree.c |   2 +-
>  3 files changed, 115 insertions(+), 6 deletions(-)
> 
> diff --git a/diff.c b/diff.c
> index 861282db1c3..aaa6a19f158 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -27,6 +27,7 @@
>  #include "help.h"
>  #include "promisor-remote.h"
>  #include "dir.h"
> +#include "strmap.h"
>  
>  #ifdef NO_FAST_WORKING_DIRECTORY
>  #define FAST_WORKING_DIRECTORY 0
> @@ -3406,6 +3407,31 @@ struct userdiff_driver *get_textconv(struct repository *r,
>  	return userdiff_get_textconv(r, one->driver);
>  }
>  
> +static struct strbuf *additional_headers(struct diff_options *o,
> +					 const char *path)
> +{
> +	if (!o->additional_path_headers)
> +		return NULL;
> +	return strmap_get(o->additional_path_headers, path);
> +}
> +
> +static void add_formatted_headers(struct strbuf *msg,
> +				  struct strbuf *more_headers,
> +				  const char *line_prefix,
> +				  const char *meta,
> +				  const char *reset)
> +{
> +	char *next, *newline;
> +
> +	for (next = more_headers->buf; *next; next = newline) {
> +		newline = strchrnul(next, '\n');
> +		strbuf_addf(msg, "%s%s%.*s%s\n", line_prefix, meta,
> +			    (int)(newline - next), next, reset);
> +		if (*newline)
> +			newline++;
> +	}
> +}
> +
>  static void builtin_diff(const char *name_a,
>  			 const char *name_b,
>  			 struct diff_filespec *one,
> @@ -3464,6 +3490,17 @@ static void builtin_diff(const char *name_a,
>  	b_two = quote_two(b_prefix, name_b + (*name_b == '/'));
>  	lbl[0] = DIFF_FILE_VALID(one) ? a_one : "/dev/null";
>  	lbl[1] = DIFF_FILE_VALID(two) ? b_two : "/dev/null";
> +	if (!DIFF_FILE_VALID(one) && !DIFF_FILE_VALID(two)) {
> +		/*
> +		 * We should only reach this point for pairs from
> +		 * create_filepairs_for_header_only_notifications().  For
> +		 * these, we should avoid the "/dev/null" special casing
> +		 * above, meaning we avoid showing such pairs as either
> +		 * "new file" or "deleted file" below.
> +		 */
> +		lbl[0] = a_one;
> +		lbl[1] = b_two;
> +	}

I'm not so familiar with this logic, but I saw that without this change, the
rename/rename conflict test fails. Is this because we add a file pair under
the original name (which has been renamed on both sides)? I wonder if we
can sketch such a case in the comment.

>  	strbuf_addf(&header, "%s%sdiff --git %s %s%s\n", line_prefix, meta, a_one, b_two, reset);
>  	if (lbl[0][0] == '/') {
>  		/* /dev/null */
> @@ -4328,6 +4365,7 @@ static void fill_metainfo(struct strbuf *msg,
>  	const char *set = diff_get_color(use_color, DIFF_METAINFO);
>  	const char *reset = diff_get_color(use_color, DIFF_RESET);
>  	const char *line_prefix = diff_line_prefix(o);
> +	struct strbuf *more_headers = NULL;
>  
>  	*must_show_header = 1;
>  	strbuf_init(msg, PATH_MAX * 2 + 300);
> @@ -4364,6 +4402,11 @@ static void fill_metainfo(struct strbuf *msg,
>  	default:
>  		*must_show_header = 0;
>  	}
> +	if ((more_headers = additional_headers(o, name))) {
> +		add_formatted_headers(msg, more_headers,
> +				      line_prefix, set, reset);
> +		*must_show_header = 1;
> +	}
>  	if (one && two && !oideq(&one->oid, &two->oid)) {
>  		const unsigned hexsz = the_hash_algo->hexsz;
>  		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
> @@ -5852,12 +5895,22 @@ int diff_unmodified_pair(struct diff_filepair *p)
>  
>  static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o)
>  {
> -	if (diff_unmodified_pair(p))
> +	/*
> +	 * Check if we can return early without showing a diff.  Note that
> +	 * diff_filepair only stores {oid, path, mode, is_valid}
> +	 * information for each path, and thus diff_unmodified_pair() only
> +	 * considers those bits of info.  However, we do not want pairs
> +	 * created by create_filepairs_for_header_only_notifications() to
> +	 * be ignored, so return early if both p is unmodified AND
> +	 * p->one->path is not in additional headers.
> +	 */
> +	if (diff_unmodified_pair(p) && !additional_headers(o, p->one->path))
>  		return;
>  
> +	/* Actually, we can also return early to avoid showing tree diffs */
>  	if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) ||
>  	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
> -		return; /* no tree diffs in patch format */
> +		return;
>  
>  	run_diff(p, o);
>  }
> @@ -5888,10 +5941,14 @@ static void diff_flush_checkdiff(struct diff_filepair *p,
>  	run_checkdiff(p, o);
>  }
>  
> -int diff_queue_is_empty(void)
> +int diff_queue_is_empty(struct diff_options *o)
>  {
>  	struct diff_queue_struct *q = &diff_queued_diff;
>  	int i;
> +
> +	if (o->additional_path_headers &&
> +	    !strmap_empty(o->additional_path_headers))
> +		return 0;
>  	for (i = 0; i < q->nr; i++)
>  		if (!diff_unmodified_pair(q->queue[i]))
>  			return 0;
> @@ -6325,6 +6382,54 @@ void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc)
>  		warning(_(rename_limit_advice), varname, needed);
>  }
>  
> +static void create_filepairs_for_header_only_notifications(struct diff_options *o)
> +{
> +	struct strset present;
> +	struct diff_queue_struct *q = &diff_queued_diff;
> +	struct hashmap_iter iter;
> +	struct strmap_entry *e;
> +	int i;
> +
> +	strset_init_with_options(&present, /*pool*/ NULL, /*strdup*/ 0);
> +
> +	/*
> +	 * Find out which paths exist in diff_queued_diff, preferring
> +	 * one->path for any pair that has multiple paths.

Why do we prefer one->path?

> +	 */
> +	for (i = 0; i < q->nr; i++) {
> +		struct diff_filepair *p = q->queue[i];
> +		char *path = p->one->path ? p->one->path : p->two->path;
> +
> +		if (strmap_contains(o->additional_path_headers, path))
> +			strset_add(&present, path);
> +	}
> +
> +	/*
> +	 * Loop over paths in additional_path_headers; for each NOT already
> +	 * in diff_queued_diff, create a synthetic filepair and insert that
> +	 * into diff_queued_diff.
> +	 */
> +	strmap_for_each_entry(o->additional_path_headers, &iter, e) {
> +		if (!strset_contains(&present, e->key)) {
> +			struct diff_filespec *one, *two;
> +			struct diff_filepair *p;
> +
> +			one = alloc_filespec(e->key);
> +			two = alloc_filespec(e->key);
> +			fill_filespec(one, null_oid(), 0, 0);
> +			fill_filespec(two, null_oid(), 0, 0);
> +			p = diff_queue(q, one, two);
> +			p->status = DIFF_STATUS_MODIFIED;
> +		}
> +	}

All these string hash-maps are not really typical for a C program. I'm sure
they are the best choice for an advanced merge algorithm but they are not
really necessary for computing/printing a diff. It feels like this is an
implementation detail from merge-ort that's leaking into other components.

What we want to do is

	for file_pair in additional_headers:
		if not already_queued(file_pair):
			queue(file_pair)

To do that, you use a temporary hash set ("present") that records everything
that's already queued (already_queued() is a lookup in that set).

Let's assume both the queue and additional_headers are sorted arrays.
Then we could efficiently merge them (like a merge-sort algorithm)
without ever allocating a temporary hash map.

I haven't checked if this is practical (better wait for feedback).
We'd probably need to convert the strmap additional_path_headers into an
array and sort it (I guess our hash map does not guarantee any ordering?)
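
To make the idea concrete, here is a rough sketch of that merge-style walk.
It assumes both inputs are plain arrays of paths already sorted in strcmp()
order; paths_missing_from_queue() and its parameters are made up for
illustration and are not existing git functions:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: walk two sorted arrays of paths in lockstep and
 * collect every header path that has no queued counterpart.  Returns
 * the number of paths written to `missing`, which must have room for
 * `nr_headers` entries.
 */
static size_t paths_missing_from_queue(const char **queued, size_t nr_queued,
				       const char **headers, size_t nr_headers,
				       const char **missing)
{
	size_t q = 0, h = 0, n = 0;

	while (h < nr_headers) {
		/* Once the queue is exhausted, every remaining header is missing. */
		int cmp = (q < nr_queued) ? strcmp(queued[q], headers[h]) : 1;

		if (cmp < 0) {
			q++;		/* queued path without a header; skip it */
		} else if (cmp == 0) {
			q++;		/* already queued; nothing to add */
			h++;
		} else {
			missing[n++] = headers[h++];	/* header-only path */
		}
	}
	return n;
}
```

This avoids the temporary set, but it only works if both sides use the same
ordering, and the strmap would still have to be flattened and sorted first.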

> +
> +	/* Re-sort the filepairs */
> +	diffcore_fix_diff_index();
> +
> +	/* Cleanup */
> +	strset_clear(&present);

Not a strong opinion, but I'd probably drop this comment

> +}
> +
>  static void diff_flush_patch_all_file_pairs(struct diff_options *o)
>  {
>  	int i;
> @@ -6337,6 +6442,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
>  	if (o->color_moved)
>  		o->emitted_symbols = &esm;
>  
> +	if (o->additional_path_headers)
> +		create_filepairs_for_header_only_notifications(o);
> +
>  	for (i = 0; i < q->nr; i++) {
>  		struct diff_filepair *p = q->queue[i];
>  		if (check_pair_status(p))
> @@ -6413,7 +6521,7 @@ void diff_flush(struct diff_options *options)
>  	 * Order: raw, stat, summary, patch
>  	 * or:    name/name-status/checkdiff (other bits clear)
>  	 */
> -	if (!q->nr)
> +	if (!q->nr && !options->additional_path_headers)
>  		goto free_queue;
>  
>  	if (output_format & (DIFF_FORMAT_RAW |
> diff --git a/diff.h b/diff.h
> index 8ba85c5e605..06a0a67afda 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -395,6 +395,7 @@ struct diff_options {
>  
>  	struct repository *repo;
>  	struct option *parseopts;
> +	struct strmap *additional_path_headers;
>  
>  	int no_free;
>  };
> @@ -593,7 +594,7 @@ void diffcore_fix_diff_index(void);
>  "                show all files diff when -S is used and hit is found.\n" \
>  "  -a  --text    treat all files as text.\n"
>  
> -int diff_queue_is_empty(void);
> +int diff_queue_is_empty(struct diff_options*);
>  void diff_flush(struct diff_options*);
>  void diff_free(struct diff_options*);
>  void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc);
> diff --git a/log-tree.c b/log-tree.c
> index d4655b63d75..33c28f537a6 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -850,7 +850,7 @@ int log_tree_diff_flush(struct rev_info *opt)
>  	opt->shown_dashes = 0;
>  	diffcore_std(&opt->diffopt);
>  
> -	if (diff_queue_is_empty()) {
> +	if (diff_queue_is_empty(&opt->diffopt)) {
>  		int saved_fmt = opt->diffopt.output_format;
>  		opt->diffopt.output_format = DIFF_FORMAT_NO_OUTPUT;
>  		diff_flush(&opt->diffopt);
> -- 
> gitgitgadget
> 

* Re: [PATCH v2 8/8] show, log: include conflict/warning messages in --remerge-diff headers
  2021-12-25  7:59   ` [PATCH v2 8/8] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
@ 2021-12-28 10:57     ` Johannes Altmanninger
  2021-12-28 23:42       ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Johannes Altmanninger @ 2021-12-28 10:57 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

On Sat, Dec 25, 2021 at 07:59:19AM +0000, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> Conflicts such as modify/delete, rename/rename, or file/directory are
> not representable via content conflict markers, and the normal output
> messages notifying users about these were dropped with --remerge-diff.
> While we don't want these messages randomly shown before the commit
> and diff headers, we do want them to still be shown; include them as
> part of the diff headers instead.
> 
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  log-tree.c              |  3 ++
>  merge-ort.c             |  1 +
>  merge-ort.h             | 10 +++++
>  t/t4069-remerge-diff.sh | 86 +++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 100 insertions(+)
> 
> diff --git a/log-tree.c b/log-tree.c
> index 33c28f537a6..97fbb756d21 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -922,6 +922,7 @@ static int do_remerge_diff(struct rev_info *opt,
>  	/* Setup merge options */
>  	init_merge_options(&o, the_repository);
>  	o.show_rename_progress = 0;
> +	o.record_conflict_msgs_as_headers = 1;
>  
>  	ctx.abbrev = DEFAULT_ABBREV;
>  	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
> @@ -938,10 +939,12 @@ static int do_remerge_diff(struct rev_info *opt,
>  	merge_incore_recursive(&o, bases, parent1, parent2, &res);
>  
>  	/* Show the diff */
> +	opt->diffopt.additional_path_headers = res.path_messages;
>  	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
>  	log_tree_diff_flush(opt);
>  
>  	/* Cleanup */
> +	opt->diffopt.additional_path_headers = NULL;
>  	strbuf_release(&parent1_desc);
>  	strbuf_release(&parent2_desc);
>  	merge_finalize(&o, &res);
> diff --git a/merge-ort.c b/merge-ort.c
> index 9142d56e0ad..07e53083cbd 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -4579,6 +4579,7 @@ redo:
>  	trace2_region_leave("merge", "process_entries", opt->repo);
>  
>  	/* Set return values */
> +	result->path_messages = &opt->priv->output;
>  	result->tree = parse_tree_indirect(&working_tree_oid);
>  	/* existence of conflicted entries implies unclean */
>  	result->clean &= strmap_empty(&opt->priv->conflicted);
> diff --git a/merge-ort.h b/merge-ort.h
> index c011864ffeb..fe599b87868 100644
> --- a/merge-ort.h
> +++ b/merge-ort.h
> @@ -5,6 +5,7 @@
>  
>  struct commit;
>  struct tree;
> +struct strmap;
>  
>  struct merge_result {
>  	/*
> @@ -23,6 +24,15 @@ struct merge_result {
>  	 */
>  	struct tree *tree;
>  
> +	/*
> +	 * Special messages and conflict notices for various paths
> +	 *
> +	 * This is a map of pathnames to strbufs.  It contains various
> +	 * warning/conflict/notice messages (possibly multiple per path)
> +	 * that callers may want to use.
> +	 */
> +	struct strmap *path_messages;
> +
>  	/*
>  	 * Additional metadata used by merge_switch_to_result() or future calls
>  	 * to merge_incore_*().  Includes data needed to update the index (if
> diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
> index 192dbce2bfe..a040d3bcd91 100755
> --- a/t/t4069-remerge-diff.sh
> +++ b/t/t4069-remerge-diff.sh
> @@ -4,6 +4,15 @@ test_description='remerge-diff handling'
>  
>  . ./test-lib.sh
>  
> +# --remerge-diff uses ort under the hood regardless of setting.  However,
> +# we set up a file/directory conflict beforehand, and the different backends
> +# handle the conflict differently, which would require separate code paths
> +# to resolve.  There's not much point in making the code uglier to do that,
> +# though, when the real thing we are testing (--remerge-diff) will hardcode
> +# calls directly into the merge-ort API anyway.  So just force the use of
> +# ort on the setup too.
> +GIT_TEST_MERGE_ALGORITHM=ort
> +
>  test_expect_success 'setup basic merges' '
>  	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
>  	git add numbers &&
> @@ -55,6 +64,7 @@ test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
>  	git log -1 --oneline ab_resolution >tmp &&
>  	cat <<-EOF >>tmp &&
>  	diff --git a/numbers b/numbers
> +	CONFLICT (content): Merge conflict in numbers
>  	index a1fb731..6875544 100644
>  	--- a/numbers
>  	+++ b/numbers
> @@ -83,4 +93,80 @@ test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'setup non-content conflicts' '
> +	git switch --orphan base &&
> +
> +	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
> +	test_write_lines a b c d e f g h i >letters &&
> +	test_write_lines in the way >content &&
> +	git add numbers letters content &&
> +	git commit -m base &&
> +
> +	git branch side1 &&
> +	git branch side2 &&
> +
> +	git checkout side1 &&
> +	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
> +	git mv letters letters_side1 &&
> +	git mv content file_or_directory &&
> +	git add numbers &&
> +	git commit -m side1 &&
> +
> +	git checkout side2 &&
> +	git rm numbers &&
> +	git mv letters letters_side2 &&
> +	mkdir file_or_directory &&
> +	echo hello >file_or_directory/world &&
> +	git add file_or_directory/world &&
> +	git commit -m side2 &&
> +
> +	git checkout -b resolution side1 &&
> +	test_must_fail git merge side2 &&
> +	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
> +	git add numbers &&
> +	git add letters_side1 &&
> +	git rm letters &&
> +	git rm letters_side2 &&
> +	git add file_or_directory~HEAD &&
> +	git mv file_or_directory~HEAD wanted_content &&
> +	git commit -m resolved
> +'
> +
> +test_expect_success 'remerge-diff with non-content conflicts' '
> +	git log -1 --oneline resolution >tmp &&
> +	cat <<-EOF >>tmp &&
> +	diff --git a/file_or_directory~HASH (side1) b/wanted_content

the "~HASH (side1)" suffix will probably mess with some programs that extract
the filename from the diff.
I don't know what programs are supposed to expect.  I can see arguments for
either dropping the suffix or including only "~HASH" since that's part of
the actual filename that's left in the worktree.
Maybe it's not so important.

The file/link typechange conflict test I'll add below exposes what looks
like an accidental interaction with the trailing tab characters that we emit
on --- and +++ lines if the "filename" contains a space (since 1a9eb3b9d5
(git-diff/git-apply: make diff output a bit friendlier to GNU patch (part
2), 2006-09-22)).

	index 70885e4..0000000
	--- a/typechange~738109f (side1)	<-- git diff adds a trailing tab!
	+++ /dev/null

I haven't formed an opinion yet, but since Tig uses the --- and +++ lines
to extract file names, I'd drop the " (side1)" suffix from at least the ---
and +++ lines. Maybe also the ^diff lines, I'm not sure
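
To illustrate the interaction, a small sketch of the rule as I understand it:
a tab is appended to the --- and +++ lines whenever the name contains a
space. format_file_header() is a made-up name for this example and is not
how diff.c is actually structured:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: format a "---" or "+++" line into `buf`,
 * appending a tab when the name contains a space so that tools such as
 * GNU patch can tell where the file name ends.  Returns what snprintf()
 * returns.
 */
static int format_file_header(char *buf, size_t size,
			      const char *marker, const char *name)
{
	const char *tab = strchr(name, ' ') ? "\t" : "";

	return snprintf(buf, size, "%s %s%s\n", marker, name, tab);
}
```

With a name like "a/typechange~738109f (side1)" the space inside " (side1)"
is enough to trigger the tab, which is why the suffix changes the output
beyond just the visible text.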

> +	similarity index 100%
> +	rename from file_or_directory~HASH (side1)
> +	rename to wanted_content
> +	CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.

I wonder if it's better to have this line further up, before the "rename"
resolution, to correct the temporal order.

> +	diff --git a/letters b/letters
> +	CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
> +	diff --git a/letters_side2 b/letters_side2
> +	deleted file mode 100644
> +	index b236ae5..0000000
> +	--- a/letters_side2
> +	+++ /dev/null
> +	@@ -1,9 +0,0 @@
> +	-a
> +	-b
> +	-c
> +	-d
> +	-e
> +	-f
> +	-g
> +	-h
> +	-i
> +	diff --git a/numbers b/numbers
> +	CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
> +	EOF

Took me some time to grok these but the output makes sense (it's loud and
ugly but that's okay since these are serious conflicts).

> +	# We still have some sha1 hashes above; rip them out so test works
> +	# with sha256
> +	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
> +
> +	git show --oneline --remerge-diff resolution >tmp &&
> +	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
> +	test_cmp expect actual
> +'
> +
>  test_done
> -- 
> gitgitgadget

We're missing a test case for typechange.  Here's a quick draft I've been
playing around with. It seems ugly that the "diff --git a/typechange b/typechange"
header is doubled, but okay.

Maybe a rename/delete conflict is interesting as well, I'm not sure.  (Also I
wonder if switching the order of parents will give any interesting difference,
I guess not)

test_expect_success 'remerge-diff with file/link conflict' '
	git branch -d base side1 side2 &&
	git switch --orphan base &&

	echo base >typechange &&
	git add typechange &&
	git commit -m base &&

	git branch side1 &&
	git branch side2 &&

	git checkout side1 &&
	echo orig-file-contents >typechange &&
	git commit -a -m side1 &&

	git checkout side2 &&
	ln -sf . typechange &&
	git add typechange &&
	git commit -m side2 &&

	git checkout -b resolution2 side1 &&
	test_must_fail git merge side2 &&
	rm typechange &&
	mv typechange~HEAD typechange &&
	echo resolved >>typechange &&
	git add typechange~HEAD typechange &&
	git merge --continue &&

	git show --oneline --remerge-diff resolution2 >tmp &&
	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&

	cat <<-EOF >tmp &&
	7759b27 Merge branch ${SQ}side2${SQ} into resolution2
	diff --git a/typechange b/typechange
	deleted file mode 120000
	CONFLICT (distinct types): typechange had different types on each side; renamed one of them so each can be recorded somewhere.
	index 945c9b4..0000000
	--- a/typechange
	+++ /dev/null
	@@ -1 +0,0 @@
	-.
	\ No newline at end of file
	diff --git a/typechange b/typechange
	new file mode 100644
	CONFLICT (distinct types): typechange had different types on each side; renamed one of them so each can be recorded somewhere.
	index 0000000..70885e4
	--- /dev/null
	+++ b/typechange
	@@ -0,0 +1,2 @@
	+orig-file-contents
	+resolved
	diff --git a/typechange~738109f (side1) b/typechange~738109f (side1)
	deleted file mode 100644
	index 70885e4..0000000
	--- a/typechange~738109f (side1)	
	+++ /dev/null
	@@ -1 +0,0 @@
	-orig-file-contents
	EOF
	# We still have some sha1 hashes above; rip them out so test works
	# with sha256
	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&

	test_cmp expect actual
'

* Re: [PATCH v2 3/8] ll-merge: make callers responsible for showing warnings
  2021-12-28 10:56     ` Johannes Altmanninger
@ 2021-12-28 19:37       ` Elijah Newren
  2021-12-28 22:05         ` Johannes Altmanninger
  0 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren @ 2021-12-28 19:37 UTC (permalink / raw)
  To: Johannes Altmanninger
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 28, 2021 at 2:56 AM Johannes Altmanninger <aclopte@gmail.com> wrote:
>
> On Sat, Dec 25, 2021 at 07:59:14AM +0000, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@gmail.com>
> >
> > Since some callers may want to send warning messages to somewhere other
> > than stdout/stderr, stop printing "warning: Cannot merge binary files"
> > from ll-merge and instead modify the return status of ll_merge() to
> > indicate when a merge of binary files has occurred.
> >
> > This commit continues printing the message as-is; future changes will
> > start handling the new commit differently in the merge-ort codepath.
>
> "the new commit" looks like a typo, do you mean "binary conflicts"?

Good catch, yeah should be "the binary conflicts message"

> >
> > Note that my methodology included first modifying ll_merge() to return
> > a struct, so that the compiler would catch all the callers for me and
> > ensure I had modified all of them.  After modifying all of them, I then
> > changed the struct to an enum.
>
> Heh, this is a clever way to work around C's weak typing.
>
> The language server I'm using (clangd) supports the Call Hierarchy feature,
> which is intended to list callers or callees of the function at the editor's
> cursor. If I ask the server for callers of ll_merge I get this response
> (on 510f9eba9 plus this series)
>
>         ll-merge.h:98:1: ll_merge - list of callers
>           builtin/checkout.c:242:12: checkout_merged
>             builtin/checkout.c:279:17:  merge_status = ll_merge(&result_buf, path, &ancestor, "base",
>           rerere.c:943:12: handle_cache
>             rerere.c:984:2:     ll_merge(&result, path, &mmfile[0], NULL,
>           notes-merge.c:342:12: ll_merge_in_worktree
>             notes-merge.c:353:11:       status = ll_merge(&result_buf, oid_to_hex(&p->obj), &base, NULL,
>           merge-recursive.c:1035:12: merge_3way
>             merge-recursive.c:1090:17:  merge_status = ll_merge(result_buf, a->path, &orig, base,
>           merge-ort.c:1763:12: merge_3way
>             merge-ort.c:1816:17:        merge_status = ll_merge(result_buf, path, &orig, base,
>           merge-blobs.c:32:14: three_way_filemerge
>             merge-blobs.c:48:17:        merge_status = ll_merge(&res, path, base, NULL,
>           apply.c:3491:12: three_way_merge
>             apply.c:3511:11:    status = ll_merge(&result, path,
>           rerere.c:608:12: try_merge
>             rerere.c:623:9:             ret = ll_merge(result, path, &base, NULL, cur, "", &other, "",
>
> So there are 8 callers in total; but only 7 print the warning (including the
> one in merge-ort which will change in the next commit). I think you missed
> the call at rerere.c:984 because we ignore its return value.

Doh, I missed one!  Though, as pointed out by Junio, rerere won't
operate on binary files and thus can't hit that codepath.  Still, I
should either have it in both rerere codepaths or neither.

> > @@ -133,10 +133,12 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
> >       xmp.ancestor = orig_name;
> >       xmp.file1 = name1;
> >       xmp.file2 = name2;
> > -     return xdl_merge(orig, src1, src2, &xmp, result);
> > +     status = xdl_merge(orig, src1, src2, &xmp, result);
> > +     ret = (status > 1 ) ? LL_MERGE_CONFLICT : status;
>
> " (status > 1 )" has an extra space
>
> I'm not sure it's wise to handle status=1 and status=2 in two different code paths.
> Both mean the same (the only difference is the number of conflicts).
> status=1 coincides with LL_MERGE_CONFLICT but that's purely coincidental
>
>         ret = (status > 0) ? LL_MERGE_CONFLICT : status;

Um, whoops.  Yeah, this should be > 0, not > 1.  (As per
xdl_do_merge() comment, status >= 0 means status returns the number of
conflicts)  No clue how I messed that up so badly; kind of
embarrassing, honestly.
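
A minimal sketch of the intended mapping, using a stand-in enum so it
compiles on its own. The names and numeric values are assumptions for
illustration, and folding every negative status into a single error value is
a simplification of the one-liner above, which passes the status through:

```c
/* Illustrative stand-in for the enum added by the patch; values assumed. */
enum merge_result_sketch {
	SKETCH_MERGE_ERROR = -1,
	SKETCH_MERGE_OK = 0,
	SKETCH_MERGE_CONFLICT = 1
};

/*
 * Hypothetical helper: a positive status is a conflict count, so any
 * value above zero collapses to the single "conflict" result.
 */
static enum merge_result_sketch status_to_result(int status)
{
	if (status > 0)
		return SKETCH_MERGE_CONFLICT;
	return status < 0 ? SKETCH_MERGE_ERROR : SKETCH_MERGE_OK;
}
```

With the "> 1" test, a status of exactly 1 only produced the right answer
because it happened to equal the conflict value.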

Thanks for the careful reading.

> > @@ -236,7 +239,8 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
> >               unlink_or_warn(temp[i]);
> >       strbuf_release(&cmd);
> >       strbuf_release(&path_sq);
> > -     return status;
> > +     ret = (status > 1) ? LL_MERGE_CONFLICT : status;
>
> same here, I'd test for "status > 0" because that's the convention for
> external programs

Yep.

...
> > diff --git a/rerere.c b/rerere.c
> > index d83d58df4fb..b1f8961ed9e 100644
> > --- a/rerere.c
> > +++ b/rerere.c
> > @@ -609,19 +609,23 @@ static int try_merge(struct index_state *istate,
> >                    const struct rerere_id *id, const char *path,
> >                    mmfile_t *cur, mmbuffer_t *result)
> >  {
> > -     int ret;
> > +     enum ll_merge_result ret;
> >       mmfile_t base = {NULL, 0}, other = {NULL, 0};
> >
> >       if (read_mmfile(&base, rerere_path(id, "preimage")) ||
> > -         read_mmfile(&other, rerere_path(id, "postimage")))
> > -             ret = 1;
> > -     else
> > +         read_mmfile(&other, rerere_path(id, "postimage"))) {
> > +             ret = LL_MERGE_CONFLICT;
> > +     } else {
> >               /*
> >                * A three-way merge. Note that this honors user-customizable
> >                * low-level merge driver settings.
> >                */
> >               ret = ll_merge(result, path, &base, NULL, cur, "", &other, "",
> >                              istate, NULL);
> > +             if (ret == LL_MERGE_BINARY_CONFLICT)
> > +                     warning("Cannot merge binary files: %s (%s vs. %s)",
> > +                             path, "", "");
>
> With the next patch, 7/8 callers of ll_merge (almost) immediately print
> that warning.  Looks fine as is, but does it make sense to introduce a helper
> function for the common case, or add a flag to ll_merge_options?

I started by adding a flag, and Peff suggested not doing so (because
the printing doesn't belong in a "low-level" merge, as ll_merge stands
for[1]), but instead making the callers responsible.  We could add a
helper function, outside of ll-merge.[ch], but I'm not sure where to
put it or what to call it and I'm leaning towards just leaving things
as-is (well, other than fixing up the important issues you brought up
before this).

[1] https://lore.kernel.org/git/YVOZRhWttzF18Xql@coredump.intra.peff.net/

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v2 7/8] diff: add ability to insert additional headers for paths
  2021-12-28 10:57     ` Johannes Altmanninger
@ 2021-12-28 21:09       ` Elijah Newren
  2021-12-29  0:16         ` Johannes Altmanninger
  0 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren @ 2021-12-28 21:09 UTC (permalink / raw)
  To: Johannes Altmanninger
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 28, 2021 at 2:57 AM Johannes Altmanninger <aclopte@gmail.com> wrote:
>
> On Sat, Dec 25, 2021 at 07:59:18AM +0000, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@gmail.com>
> >
> > When additional headers are provided, we need to
> >   * add diff_filepairs to diff_queued_diff for each path in the
> >     additional headers map, unless that path is part of
> >     another diff_filepair already found in diff_queued_diff
> >   * format the headers (colorization, line_prefix for --graph)
> >   * make sure the various codepaths that attempt to return early
> >     if there are "no changes" take into account the headers that
> >     need to be shown.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  diff.c     | 116 +++++++++++++++++++++++++++++++++++++++++++++++++++--
> >  diff.h     |   3 +-
> >  log-tree.c |   2 +-
> >  3 files changed, 115 insertions(+), 6 deletions(-)
> >
> > diff --git a/diff.c b/diff.c
> > index 861282db1c3..aaa6a19f158 100644
> > --- a/diff.c
> > +++ b/diff.c
> > @@ -27,6 +27,7 @@
> >  #include "help.h"
> >  #include "promisor-remote.h"
> >  #include "dir.h"
> > +#include "strmap.h"
> >
> >  #ifdef NO_FAST_WORKING_DIRECTORY
> >  #define FAST_WORKING_DIRECTORY 0
> > @@ -3406,6 +3407,31 @@ struct userdiff_driver *get_textconv(struct repository *r,
> >       return userdiff_get_textconv(r, one->driver);
> >  }
> >
> > +static struct strbuf *additional_headers(struct diff_options *o,
> > +                                      const char *path)
> > +{
> > +     if (!o->additional_path_headers)
> > +             return NULL;
> > +     return strmap_get(o->additional_path_headers, path);
> > +}
> > +
> > +static void add_formatted_headers(struct strbuf *msg,
> > +                               struct strbuf *more_headers,
> > +                               const char *line_prefix,
> > +                               const char *meta,
> > +                               const char *reset)
> > +{
> > +     char *next, *newline;
> > +
> > +     for (next = more_headers->buf; *next; next = newline) {
> > +             newline = strchrnul(next, '\n');
> > +             strbuf_addf(msg, "%s%s%.*s%s\n", line_prefix, meta,
> > +                         (int)(newline - next), next, reset);
> > +             if (*newline)
> > +                     newline++;
> > +     }
> > +}
> > +
> >  static void builtin_diff(const char *name_a,
> >                        const char *name_b,
> >                        struct diff_filespec *one,
> > @@ -3464,6 +3490,17 @@ static void builtin_diff(const char *name_a,
> >       b_two = quote_two(b_prefix, name_b + (*name_b == '/'));
> >       lbl[0] = DIFF_FILE_VALID(one) ? a_one : "/dev/null";
> >       lbl[1] = DIFF_FILE_VALID(two) ? b_two : "/dev/null";
> > +     if (!DIFF_FILE_VALID(one) && !DIFF_FILE_VALID(two)) {
> > +             /*
> > +              * We should only reach this point for pairs from
> > +              * create_filepairs_for_header_only_notifications().  For
> > +              * these, we should avoid the "/dev/null" special casing
> > +              * above, meaning we avoid showing such pairs as either
> > +              * "new file" or "deleted file" below.
> > +              */
> > +             lbl[0] = a_one;
> > +             lbl[1] = b_two;
> > +     }
>
> not so familiar with this logic, but I saw that without this change, the
> rename/rename conflict test fails. Is this because we add a file pair under
> the original name (that's been renamed on both sides)? I wonder if we
> can sketch such a case in the comment.

That may be the only current test in the testsuite that fails without
this bit of logic, but I don't want the comment to be specific to the
rename/rename case.  Whenever we have a conflict/warning/whatever
message from the merge machinery tied to a path which doesn't show up
in either the automatic merge or the recorded merge commit, we will
hit this situation.  Even if I were to give a complete listing of all
the current cases, more could be added in the future.

> > +static void create_filepairs_for_header_only_notifications(struct diff_options *o)
> > +{
> > +     struct strset present;
> > +     struct diff_queue_struct *q = &diff_queued_diff;
> > +     struct hashmap_iter iter;
> > +     struct strmap_entry *e;
> > +     int i;
> > +
> > +     strset_init_with_options(&present, /*pool*/ NULL, /*strdup*/ 0);
> > +
> > +     /*
> > +      * Find out which paths exist in diff_queued_diff, preferring
> > +      * one->path for any pair that has multiple paths.
>
> Why do we prefer one->path?

run_diff() sets name = one->path, passes it along to run_diff_cmd(),
and from there it goes to fill_metainfo() and either
run_external_diff() or builtin_diff().

I'm wondering if I should just ignore two->path entirely and only use
one->path; I think I partially looked at both because of various
places in diff.c that already do but give preferential treatment to
one->path (diffnamecmp(), the calls to show_submodule*diff*(), what is
passed to write_name_quoted() in diff_flush_raw()).

> > +      */
> > +     for (i = 0; i < q->nr; i++) {
> > +             struct diff_filepair *p = q->queue[i];
> > +             char *path = p->one->path ? p->one->path : p->two->path;
> > +
> > +             if (strmap_contains(o->additional_path_headers, path))
> > +                     strset_add(&present, path);
> > +     }
> > +
> > +     /*
> > +      * Loop over paths in additional_path_headers; for each NOT already
> > +      * in diff_queued_diff, create a synthetic filepair and insert that
> > +      * into diff_queued_diff.
> > +      */
> > +     strmap_for_each_entry(o->additional_path_headers, &iter, e) {
> > +             if (!strset_contains(&present, e->key)) {
> > +                     struct diff_filespec *one, *two;
> > +                     struct diff_filepair *p;
> > +
> > +                     one = alloc_filespec(e->key);
> > +                     two = alloc_filespec(e->key);
> > +                     fill_filespec(one, null_oid(), 0, 0);
> > +                     fill_filespec(two, null_oid(), 0, 0);
> > +                     p = diff_queue(q, one, two);
> > +                     p->status = DIFF_STATUS_MODIFIED;
> > +             }
> > +     }
>
> All these string hash-maps are not really typical for a C program. I'm sure
> they are the best choice for an advanced merge algorithm

Agreed up to here.

> but they are not
> really necessary for computing/printing a diff.

Technically agree that it _could_ be solved a different way, but the
strmaps are a much more natural solution to this problem in this
particular case; more on this below.

> It feels like this is an
> implementation detail from merge-ort that's leaking into other components.

And I disagree here, on _both_ the explicit point and the underlying
suggestion that you seem to be making that strmap should be avoided
outside of merging.  The strmap.[ch] type was originally a suggestion
from Peff for areas of git completely unrelated to merging (see the
beginning of https://lore.kernel.org/git/20200821194857.GD1165@coredump.intra.peff.net/,
and the first link in that email).  It's a new datatype for git, much
like strbuf or string_list or whatever before it, that is there to be
used when it's a natural fit for the problem at hand.  The lack of
strmap previously led folks to abuse other existing data structures
(and in a way that often led to poor performance to boot).

> What we want to do is
>
>         for file_pair in additional_headers:
>                 if not already_queued(file_pair):
>                         queue(file_pair)

Yes, precisely.

> to do that, you use a temporary hash-set ("present") that records everything
> that's already queued (already_queued() is a lookup in that set).
>
> Let's assume both the queue and additional_headers are sorted arrays.

That's a bad assumption; we can't rely on *either* being sorted.  I
actually started my implementation by trying exactly what you mention
first; I too thought it'd be more natural and clearer to do this.  Of
course, before implementing it, I had to verify whether
diff_queued_diff was sorted.  So, I added some code that would check
the order and fail if the queue wasn't sorted.  7 of the test files in
the regression testsuite had one or more failing tests.

I think the queue was intended to be sorted (see
diffcore_fix_diff_index()), but in practice it's not.  And I'm worried
that if I find the current cases where it fails to be sorted and "fix"
them (though I don't actually know if this was intentional or not so I
don't know if that's really a fix or a break), that I'd end up with
additional cases in the future where they fail to be sorted anyway.
So, no matter what, relying on diff_queued_diff being sorted seems
ill-advised.
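
The kind of order check described above might look roughly like the sketch below.  It works on a plain array of path strings and compares with strcmp(); both choices, and the name first_unsorted(), are assumptions for illustration (the real queue holds diff_filepair entries and git has its own name comparison).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the first entry that sorts before its
 * predecessor, or n if the whole array is in non-decreasing order.
 */
size_t first_unsorted(const char *const *paths, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++)
		if (strcmp(paths[i - 1], paths[i]) > 0)
			return i;
	return n;
}
```

A debugging build could call such a function on the queue and abort when the result is less than n, which is one way to find out which tests produce an unsorted queue.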

Also...

> Then we could efficiently merge them (like a merge-sort algorithm)
> without ever allocating a temporary hash map.
>
> I haven't checked if this is practical (better wait for feedback).
> We'd probably need to convert the strmap additional_path_headers into an
> array and sort it (I guess our hash map does not guarantee any ordering?)

Right, strmap has no ordering either.  I was willing to stick those
into a string_list and sort them, but making temporary copies of both
the strmap and the diff_queued_diff just to sort them so that I can
reasonably cheaply ask "are items from this thing present in this
other thing?" seems to be stretching things a bit too far.
maps/hashes provide a very nice "is this item present" lookup and are
a natural way to ask that.  Since that is exactly the question I am
asking, I think they are the better data structure here.  So, this was
not at all a leak of merge-ort datastructures, but rather a picking of
the appropriate data structures for the problem at hand.
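
To make the "is this item present?" argument concrete, here is a self-contained sketch of the two-pass approach using a small hand-rolled chained hash set.  It does not use git's strset/strmap; every name in it (str_set, set_add, set_contains, set_clear, count_missing) is hypothetical.  As a simplification it records every queued path and only counts the header paths that would need a synthetic filepair, rather than queueing anything.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64

struct node { const char *key; struct node *next; };
struct str_set { struct node *bucket[NBUCKETS]; };

/* djb2-style string hash, reduced to a bucket index */
static unsigned hash_str(const char *s)
{
	unsigned h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h % NBUCKETS;
}

int set_contains(const struct str_set *set, const char *key)
{
	const struct node *n;

	for (n = set->bucket[hash_str(key)]; n; n = n->next)
		if (!strcmp(n->key, key))
			return 1;
	return 0;
}

/* Keys are borrowed, not copied; they must outlive the set. */
void set_add(struct str_set *set, const char *key)
{
	struct node *n;
	unsigned h = hash_str(key);

	if (set_contains(set, key))
		return;
	n = malloc(sizeof(*n));
	assert(n);
	n->key = key;
	n->next = set->bucket[h];
	set->bucket[h] = n;
}

void set_clear(struct str_set *set)
{
	size_t i;

	for (i = 0; i < NBUCKETS; i++) {
		struct node *n = set->bucket[i];
		while (n) {
			struct node *next = n->next;
			free(n);
			n = next;
		}
		set->bucket[i] = NULL;
	}
}

/* Count header paths with no queued entry; neither input is sorted. */
size_t count_missing(const char *const *queued, size_t nq,
		     const char *const *headers, size_t nh)
{
	struct str_set present = {{NULL}};
	size_t i, missing = 0;

	for (i = 0; i < nq; i++)
		set_add(&present, queued[i]);
	for (i = 0; i < nh; i++)
		if (!set_contains(&present, headers[i]))
			missing++;
	set_clear(&present);
	return missing;
}
```

Neither input needs to be sorted or copied, and each lookup is expected constant time, which is the property the sorted-array alternative would have to pay for with two temporary sorted copies.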

> > +
> > +     /* Re-sort the filepairs */
> > +     diffcore_fix_diff_index();
> > +
> > +     /* Cleanup */
> > +     strset_clear(&present);
>
> Not a strong opinion, but I'd probably drop this comment
>
> > +}
> > +
> >  static void diff_flush_patch_all_file_pairs(struct diff_options *o)
> >  {
> >       int i;
> > @@ -6337,6 +6442,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
> >       if (o->color_moved)
> >               o->emitted_symbols = &esm;
> >
> > +     if (o->additional_path_headers)
> > +             create_filepairs_for_header_only_notifications(o);
> > +
> >       for (i = 0; i < q->nr; i++) {
> >               struct diff_filepair *p = q->queue[i];
> >               if (check_pair_status(p))
> > @@ -6413,7 +6521,7 @@ void diff_flush(struct diff_options *options)
> >        * Order: raw, stat, summary, patch
> >        * or:    name/name-status/checkdiff (other bits clear)
> >        */
> > -     if (!q->nr)
> > +     if (!q->nr && !options->additional_path_headers)
> >               goto free_queue;
> >
> >       if (output_format & (DIFF_FORMAT_RAW |
> > diff --git a/diff.h b/diff.h
> > index 8ba85c5e605..06a0a67afda 100644
> > --- a/diff.h
> > +++ b/diff.h
> > @@ -395,6 +395,7 @@ struct diff_options {
> >
> >       struct repository *repo;
> >       struct option *parseopts;
> > +     struct strmap *additional_path_headers;
> >
> >       int no_free;
> >  };
> > @@ -593,7 +594,7 @@ void diffcore_fix_diff_index(void);
> >  "                show all files diff when -S is used and hit is found.\n" \
> >  "  -a  --text    treat all files as text.\n"
> >
> > -int diff_queue_is_empty(void);
> > +int diff_queue_is_empty(struct diff_options*);
> >  void diff_flush(struct diff_options*);
> >  void diff_free(struct diff_options*);
> >  void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc);
> > diff --git a/log-tree.c b/log-tree.c
> > index d4655b63d75..33c28f537a6 100644
> > --- a/log-tree.c
> > +++ b/log-tree.c
> > @@ -850,7 +850,7 @@ int log_tree_diff_flush(struct rev_info *opt)
> >       opt->shown_dashes = 0;
> >       diffcore_std(&opt->diffopt);
> >
> > -     if (diff_queue_is_empty()) {
> > +     if (diff_queue_is_empty(&opt->diffopt)) {
> >               int saved_fmt = opt->diffopt.output_format;
> >               opt->diffopt.output_format = DIFF_FORMAT_NO_OUTPUT;
> >               diff_flush(&opt->diffopt);
> > --
> > gitgitgadget
> >

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers
  2021-12-28 10:56     ` [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers Johannes Altmanninger
@ 2021-12-28 21:48       ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2021-12-28 21:48 UTC (permalink / raw)
  To: Johannes Altmanninger
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 28, 2021 at 2:56 AM Johannes Altmanninger <aclopte@gmail.com> wrote:
>
> On Sat, Dec 25, 2021 at 07:59:17AM +0000, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@gmail.com>
> >
> > When users run
> >     git show --remerge-diff $MERGE_COMMIT
> > or
> >     git log -p --remerge-diff ...
> > stdout is not an appropriate location to dump conflict messages, but we
> > do want to provide them to users.  We will include them in the diff
> > headers instead...but for that to work, we need for any multiline
> > messages to replace newlines with both a newline and a space.  Add a new
> > flag to signal when we want these messages modified in such a fashion,
> > and use it in path_msg() to modify these messages this way.
>
> makes sense (same for patches 4 & 5)
>
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  merge-ort.c       | 36 ++++++++++++++++++++++++++++++++++--
> >  merge-recursive.c |  3 +++
> >  merge-recursive.h |  1 +
> >  3 files changed, 38 insertions(+), 2 deletions(-)
> >
> > diff --git a/merge-ort.c b/merge-ort.c
> > index 998e92ec593..9142d56e0ad 100644
> > --- a/merge-ort.c
> > +++ b/merge-ort.c
> > @@ -634,17 +634,46 @@ static void path_msg(struct merge_options *opt,
> >                    const char *fmt, ...)
> >  {
> >       va_list ap;
> > -     struct strbuf *sb = strmap_get(&opt->priv->output, path);
> > +     struct strbuf *sb, *dest;
> > +     struct strbuf tmp = STRBUF_INIT;
> > +
> > +     if (opt->record_conflict_msgs_as_headers && omittable_hint)
> > +             return; /* Do not record mere hints in tree */
> > +     sb = strmap_get(&opt->priv->output, path);
> >       if (!sb) {
> >               sb = xmalloc(sizeof(*sb));
> >               strbuf_init(sb, 0);
> >               strmap_put(&opt->priv->output, path, sb);
> >       }
> >
> > +     dest = (opt->record_conflict_msgs_as_headers ? &tmp : sb);
> > +
> >       va_start(ap, fmt);
> > -     strbuf_vaddf(sb, fmt, ap);
> > +     strbuf_vaddf(dest, fmt, ap);
> >       va_end(ap);
> >
> > +     if (opt->record_conflict_msgs_as_headers) {
> > +             int i_sb = 0, i_tmp = 0;
> > +
> > +             /* Copy tmp to sb, adding spaces after newlines */
> > +             strbuf_grow(sb, 2*tmp.len); /* more than sufficient */
> > +             for (; i_tmp < tmp.len; i_tmp++, i_sb++) {
> > +                     /* Copy next character from tmp to sb */
> > +                     sb->buf[sb->len + i_sb] = tmp.buf[i_tmp];
> > +
> > +                     /* If we copied a newline, add a space */
> > +                     if (tmp.buf[i_tmp] == '\n')
> > +                             sb->buf[++i_sb] = ' ';
> > +             }
> > +             /* Update length and ensure it's NUL-terminated */
>
> I think this and the two comments inside the loop are mostly redundant. I'd
> drop them (except maybe this one because it's a common oversight I guess).

I don't think redundancy is (necessarily) a reason to drop comments.
Take for example the following from early in abspath.c:

    /* Find start of the last component */
    while (offset < len && !is_dir_sep(path->buf[len - 1]))
        len--;
    /* Skip sequences of multiple path-separators */
    while (offset < len && is_dir_sep(path->buf[len - 1]))
        len--;

The comment quickly explains what might take a bit more time to reason
out.  Since I'm dealing with multiple different indices and various
arithmetic, I figured a quick explanation was helpful.  And, of
course, the reminder to make it NUL-terminated.  Granted, if the code
is very readily obvious then comments are not helpful, and there's a
gray area somewhere in between.  I think the code we're discussing
here is in that gray area, where it's a matter of taste what the
threshold is.  I don't find your taste here unreasonable, but I don't
find mine for the above examples to be unreasonable either.  I'd
rather leave these in.
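
For readers following the index arithmetic under discussion, the same transformation (insert a space after every newline so that continuation lines are indented) can be sketched standalone as below.  This uses plain malloc() and a single output index instead of strbuf and two indices; the function name indent_continuation_lines() is hypothetical and this is not the patch's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of msg with a space inserted after
 * each newline.  The caller must free() the result.
 */
char *indent_continuation_lines(const char *msg)
{
	size_t len = strlen(msg);
	size_t in, out = 0;
	char *dst = malloc(2 * len + 1); /* worst case: every byte is '\n' */

	assert(dst);
	for (in = 0; in < len; in++) {
		dst[out++] = msg[in];   /* copy the next character */
		if (msg[in] == '\n')
			dst[out++] = ' '; /* indent the line that follows */
	}
	dst[out] = '\0';                /* keep the result NUL-terminated */
	return dst;
}
```

Keeping one running output index that already includes any existing length is one way to avoid mixing relative and absolute offsets when writing into the destination buffer.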

>
> > +             sb->len += i_sb;
> > +             sb->buf[sb->len] = '\0';
> > +
> > +             /* Clean up tmp */
>
> Also this one I guess

Yeah, I'll nuke this one.  The reason for this comment was more that
sometimes I like having a comment apply to the code below it until the
next comment; doing things that way avoids the need to put a
here-ends-the-previous-comment comment, or arbitrarily avoid blank
lines after a comment.  But that reasoning is a bit weaker here, and
it clearly doesn't need any explanation, so I'll just drop it.

> > +             strbuf_release(&tmp);
> > +     }
> > +
> > +     /* Add final newline character to sb */
> >       strbuf_addch(sb, '\n');
> >  }
> >
> > @@ -4246,6 +4275,9 @@ void merge_switch_to_result(struct merge_options *opt,
> >               struct string_list olist = STRING_LIST_INIT_NODUP;
> >               int i;
> >
> > +             if (opt->record_conflict_msgs_as_headers)
> > +                     BUG("Either display conflict messages or record them as headers, not both");
> > +
> >               trace2_region_enter("merge", "display messages", opt->repo);
> >
> >               /* Hack to pre-allocate olist to the desired size */
> > diff --git a/merge-recursive.c b/merge-recursive.c
> > index bc73c52dd84..c9ba7e904a6 100644
> > --- a/merge-recursive.c
> > +++ b/merge-recursive.c
> > @@ -3714,6 +3714,9 @@ static int merge_start(struct merge_options *opt, struct tree *head)
> >
> >       assert(opt->priv == NULL);
> >
> > +     /* Not supported; option specific to merge-ort */
> > +     assert(!opt->record_conflict_msgs_as_headers);
> > +
> >       /* Sanity check on repo state; index must match head */
> >       if (repo_index_has_changes(opt->repo, head, &sb)) {
> >               err(opt, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
> > diff --git a/merge-recursive.h b/merge-recursive.h
> > index 0795a1d3ec1..ebfdb7f994e 100644
> > --- a/merge-recursive.h
> > +++ b/merge-recursive.h
> > @@ -46,6 +46,7 @@ struct merge_options {
> >       /* miscellaneous control options */
> >       const char *subtree_shift;
> >       unsigned renormalize : 1;
> > +     unsigned record_conflict_msgs_as_headers : 1;
> >
> >       /* internal fields used by the implementation */
> >       struct merge_options_internal *priv;
> > --
> > gitgitgadget
> >

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v2 3/8] ll-merge: make callers responsible for showing warnings
  2021-12-28 19:37       ` Elijah Newren
@ 2021-12-28 22:05         ` Johannes Altmanninger
  0 siblings, 0 replies; 113+ messages in thread
From: Johannes Altmanninger @ 2021-12-28 22:05 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 28, 2021 at 11:37:01AM -0800, Elijah Newren wrote:
> On Tue, Dec 28, 2021 at 2:56 AM Johannes Altmanninger <aclopte@gmail.com> wrote:
> >
> > On Sat, Dec 25, 2021 at 07:59:14AM +0000, Elijah Newren via GitGitGadget wrote:
> >
> > So there are 8 callers in total; but only 7 print the warning (including the
> > one in merge-ort which will change in the next commit). I think you missed
> > the call at rerere.c:984 because we ignore its return value.
> 
> Doh, I missed one!  Though, as pointed out by Junio, rerere won't
> operate on binary files and thus can't hit that codepath.  Still, I
> should either have it in both rerere codepaths or neither.

"neither" sounds good

> > > +             if (ret == LL_MERGE_BINARY_CONFLICT)
> > > +                     warning("Cannot merge binary files: %s (%s vs. %s)",
> > > +                             path, "", "");
> >
> > With the next patch, 7/8 callers of ll_merge (almost) immediately print
> > that warning.  Looks fine as is, but does it make sense to introduce a helper
> > function for the common case, or add a flag to ll_merge_options?
> 
> I started by adding a flag, and Peff suggested not doing so (because
> the printing doesn't belong in a "low-level" merge, as ll_merge stands
> for[1]), but instead making the callers responsible.  We could add a
> helper function, outside of ll-merge.[ch], but I'm not sure where to
> put it or what to call it and I'm leaning towards just leaving things
> as-is (well, other than fixing up the important issues you brought up
> before this).

Sure, leaving this sounds fine.  If we can formulate good reasons against
the discarded approaches we should add them to the commit message.  I guess
in this case the small number of call sites is a good indication that it's
probably not worth it.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v2 1/8] show, log: provide a --remerge-diff capability
  2021-12-28 10:56     ` Johannes Altmanninger
@ 2021-12-28 22:34       ` Elijah Newren
  2021-12-28 23:01         ` brian m. carlson
  0 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren @ 2021-12-28 22:34 UTC (permalink / raw)
  To: Johannes Altmanninger
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh,
	brian m. carlson

CC'ing brian in case he has comments on the sha256 stuff and whether
he thinks there's a cleaner way to make my tests work with sha256.
(brian: See the very end of the email.)

On Tue, Dec 28, 2021 at 2:56 AM Johannes Altmanninger <aclopte@gmail.com> wrote:
>
> On Sat, Dec 25, 2021 at 07:59:12AM +0000, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@gmail.com>
> >
> > When this option is specified, we remerge all (two parent) merge commits
> > and diff the actual merge commit to the automatically created version,
> > in order to show how users removed conflict markers, resolved the
> > different conflict versions, and potentially added new changes outside
> > of conflict regions in order to resolve semantic merge problems (or,
> > possibly, just to hide other random changes).
> >
> > This capability works by creating a temporary object directory and
> > marking it as the primary object store.  This makes it so that any blobs
> > or trees created during the automatic merge easily removable afterwards
>
> s/easily/are easily/ ?

sure

> > by just deleting all objects from the temporary object directory.
> >
> > There are a few ways that this implementation is suboptimal:
> >   * `log --remerge-diff` becomes slow, because the temporary object
> >     directory can fills with many loose objects while running
>
> s/can fills/can fill/

Thanks.

>
> >   * the log output can be muddied with misplaced "warning: cannot merge
> >     binary files" messages, since ll-merge.c unconditionally writes those
> >     messages to stderr while running instead of allowing callers to
> >     manage them.
> >   * important conflict and warning messages are simply dropped; thus for
> >     conflicts like modify/delete or rename/rename or file/directory which
> >     are not representable with content conflict markers, there may be no
> >     way for a user of --remerge-diff to know that there had been a
> >     conflict which was resolved (and which possibly motivated other
> >     changes in the merge commit).
> > Subsequent commits will address these issues.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  Documentation/diff-options.txt |  8 ++++
> >  builtin/log.c                  | 14 ++++++
> >  diff-merges.c                  | 12 +++++
> >  log-tree.c                     | 59 +++++++++++++++++++++++
> >  revision.h                     |  3 +-
> >  t/t4069-remerge-diff.sh        | 86 ++++++++++++++++++++++++++++++++++
> >  6 files changed, 181 insertions(+), 1 deletion(-)
> >  create mode 100755 t/t4069-remerge-diff.sh
> >
> > diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> > index c89d530d3d1..b05f1c9f1c9 100644
> > --- a/Documentation/diff-options.txt
> > +++ b/Documentation/diff-options.txt
> > @@ -64,6 +64,14 @@ ifdef::git-log[]
> >       each of the parents. Separate log entry and diff is generated
> >       for each parent.
> >  +
> > +--diff-merges=remerge:::
> > +--diff-merges=r:::
> > +--remerge-diff:::
>
> The synopsis above needs an update, too:
>
>         diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
>         index c89d530d3d..7a98ab3f85 100644
>         --- a/Documentation/diff-options.txt
>         +++ b/Documentation/diff-options.txt
>         @@ -36,3 +36,3 @@ endif::git-format-patch[]
>          ifdef::git-log[]
>         ---diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc)::
>         +--diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc|remerge|r)::
>          --no-diff-merges::

Ah, good catch.

> > +     With this option, two-parent merge commits are remerged to
> > +     create a temporary tree object -- potentially containing files
> > +     with conflict markers and such.  A diff is then shown between
> > +     that temporary tree and the actual merge commit.
>
> I had not really looked at any of the --diff-merges options before.  The term
> "remerge" felt a bit opaque at first, because I didn't know what the diff
> would look like. I might have found this easier:
>
> --diff-merges=resolution:::
> --diff-merges=r:::
> --resolution-diff:::
>         This makes two-parent merge commits show the diff with respect to
>         a mechanical merge of their parents -- potentially containing files
>         with conflict markers and such.
>
> But on second thought, remerge is actually consistent with the rest,
> because it states _what_ we compare to the merge commit, so never mind.
>
> > ++
> >  --diff-merges=combined:::
> >  --diff-merges=c:::
> >  -c:::
> > diff --git a/builtin/log.c b/builtin/log.c
> > index f75d87e8d7f..d053418fddd 100644
> > --- a/builtin/log.c
> > +++ b/builtin/log.c
> > @@ -35,6 +35,7 @@
> >  #include "repository.h"
> >  #include "commit-reach.h"
> >  #include "range-diff.h"
> > +#include "tmp-objdir.h"
> >
> >  #define MAIL_DEFAULT_WRAP 72
> >  #define COVER_FROM_AUTO_MAX_SUBJECT_LEN 100
> > @@ -406,6 +407,14 @@ static int cmd_log_walk(struct rev_info *rev)
> >       struct commit *commit;
> >       int saved_nrl = 0;
> >       int saved_dcctc = 0;
> > +     struct tmp_objdir *remerge_objdir = NULL;
> > +
> > +     if (rev->remerge_diff) {
> > +             remerge_objdir = tmp_objdir_create("remerge-diff");
> > +             if (!remerge_objdir)
> > +                     die_errno(_("unable to create temporary object directory"));
> > +             tmp_objdir_replace_primary_odb(remerge_objdir, 1);
> > +     }
> >
> >       if (rev->early_output)
> >               setup_early_output();
> > @@ -449,6 +458,9 @@ static int cmd_log_walk(struct rev_info *rev)
> >       rev->diffopt.no_free = 0;
> >       diff_free(&rev->diffopt);
> >
> > +     if (rev->remerge_diff)
> > +             tmp_objdir_destroy(remerge_objdir);
> > +
> >       if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
> >           rev->diffopt.flags.check_failed) {
> >               return 02;
> > @@ -1943,6 +1955,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
> >               die(_("--name-status does not make sense"));
> >       if (rev.diffopt.output_format & DIFF_FORMAT_CHECKDIFF)
> >               die(_("--check does not make sense"));
> > +     if (rev.remerge_diff)
> > +             die(_("--remerge-diff does not make sense"));
> >
> >       if (!use_patch_format &&
> >               (!rev.diffopt.output_format ||
> > diff --git a/diff-merges.c b/diff-merges.c
> > index 5060ccd890b..0af4b3f9191 100644
> > --- a/diff-merges.c
> > +++ b/diff-merges.c
> > @@ -17,6 +17,7 @@ static void suppress(struct rev_info *revs)
> >       revs->combined_all_paths = 0;
> >       revs->merges_imply_patch = 0;
> >       revs->merges_need_diff = 0;
> > +     revs->remerge_diff = 0;
> >  }
> >
> >  static void set_separate(struct rev_info *revs)
> > @@ -45,6 +46,12 @@ static void set_dense_combined(struct rev_info *revs)
> >       revs->dense_combined_merges = 1;
> >  }
> >
> > +static void set_remerge_diff(struct rev_info *revs)
> > +{
> > +     suppress(revs);
> > +     revs->remerge_diff = 1;
> > +}
> > +
> >  static diff_merges_setup_func_t func_by_opt(const char *optarg)
> >  {
> >       if (!strcmp(optarg, "off") || !strcmp(optarg, "none"))
> > @@ -57,6 +64,8 @@ static diff_merges_setup_func_t func_by_opt(const char *optarg)
> >               return set_combined;
> >       else if (!strcmp(optarg, "cc") || !strcmp(optarg, "dense-combined"))
> >               return set_dense_combined;
> > +     else if (!strcmp(optarg, "r") || !strcmp(optarg, "remerge"))
> > +             return set_remerge_diff;
> >       else if (!strcmp(optarg, "m") || !strcmp(optarg, "on"))
> >               return set_to_default;
> >       return NULL;
> > @@ -110,6 +119,9 @@ int diff_merges_parse_opts(struct rev_info *revs, const char **argv)
> >       } else if (!strcmp(arg, "--cc")) {
> >               set_dense_combined(revs);
> >               revs->merges_imply_patch = 1;
> > +     } else if (!strcmp(arg, "--remerge-diff")) {
> > +             set_remerge_diff(revs);
> > +             revs->merges_imply_patch = 1;
> >       } else if (!strcmp(arg, "--no-diff-merges")) {
> >               suppress(revs);
> >       } else if (!strcmp(arg, "--combined-all-paths")) {
> > diff --git a/log-tree.c b/log-tree.c
> > index 644893fd8cf..84ed864fc81 100644
> > --- a/log-tree.c
> > +++ b/log-tree.c
> > @@ -1,4 +1,5 @@
> >  #include "cache.h"
> > +#include "commit-reach.h"
> >  #include "config.h"
> >  #include "diff.h"
> >  #include "object-store.h"
> > @@ -7,6 +8,7 @@
> >  #include "tag.h"
> >  #include "graph.h"
> >  #include "log-tree.h"
> > +#include "merge-ort.h"
> >  #include "reflog-walk.h"
> >  #include "refs.h"
> >  #include "string-list.h"
> > @@ -902,6 +904,51 @@ static int do_diff_combined(struct rev_info *opt, struct commit *commit)
> >       return !opt->loginfo;
> >  }
> >
> > +static int do_remerge_diff(struct rev_info *opt,
> > +                        struct commit_list *parents,
> > +                        struct object_id *oid,
> > +                        struct commit *commit)
> > +{
> > +     struct merge_options o;
> > +     struct commit_list *bases;
> > +     struct merge_result res = {0};
> > +     struct pretty_print_context ctx = {0};
> > +     struct commit *parent1 = parents->item;
> > +     struct commit *parent2 = parents->next->item;
> > +     struct strbuf parent1_desc = STRBUF_INIT;
> > +     struct strbuf parent2_desc = STRBUF_INIT;
> > +
> > +     /* Setup merge options */
> > +     init_merge_options(&o, the_repository);
> > +     o.show_rename_progress = 0;
>
> Is there a reason why we are repeating the default here (but not anywhere else)?
> For example sequencer.c::do_merge() and builtin/am.c::fall_back_threeway()
> don't, and probably also rely on this being disabled(?).

No, I think each of rebase, am, and merge could sensibly have progress
output be shown -- whether or not they do currently.  Whether or not
showing progress is the default now or in the future, though, we don't
want it for remerge diff.  So, yes, I explicitly made sure to turn it
off.

> > +
> > +     ctx.abbrev = DEFAULT_ABBREV;
> > +     format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
> > +     format_commit_message(parent2, "%h (%s)", &parent2_desc, &ctx);
> > +     o.branch1 = parent1_desc.buf;
> > +     o.branch2 = parent2_desc.buf;
> > +
> > +     /* Parse the relevant commits and get the merge bases */
> > +     parse_commit_or_die(parent1);
> > +     parse_commit_or_die(parent2);
> > +     bases = get_merge_bases(parent1, parent2);
> > +
> > +     /* Re-merge the parents */
> > +     merge_incore_recursive(&o, bases, parent1, parent2, &res);
> > +
> > +     /* Show the diff */
> > +     diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
> > +     log_tree_diff_flush(opt);
> > +
> > +     /* Cleanup */
> > +     strbuf_release(&parent1_desc);
> > +     strbuf_release(&parent2_desc);
> > +     merge_finalize(&o, &res);
> > +     /* TODO: clean up the temporary object directory */
> > +
> > +     return !opt->loginfo;
> > +}
> > +
> >  /*
> >   * Show the diff of a commit.
> >   *
> > @@ -936,6 +983,18 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log
> >       }
> >
> >       if (is_merge) {
> > +             int octopus = (parents->next->next != NULL);
> > +
> > +             if (opt->remerge_diff) {
> > +                     if (octopus) {
> > +                             show_log(opt);
> > +                             fprintf(opt->diffopt.file,
> > +                                     "diff: warning: Skipping remerge-diff "
> > +                                     "for octopus merges.\n");
> > +                             return 1;
> > +                     }
> > +                     return do_remerge_diff(opt, parents, oid, commit);
> > +             }
> >               if (opt->combine_merges)
> >                       return do_diff_combined(opt, commit);
> >               if (opt->separate_merges) {
> > diff --git a/revision.h b/revision.h
> > index 5578bb4720a..13178e6b8f3 100644
> > --- a/revision.h
> > +++ b/revision.h
> > @@ -195,7 +195,8 @@ struct rev_info {
> >                       combine_merges:1,
> >                       combined_all_paths:1,
> >                       dense_combined_merges:1,
> > -                     first_parent_merges:1;
> > +                     first_parent_merges:1,
> > +                     remerge_diff:1;
> >
> >       /* Format info */
> >       int             show_notes;
> > diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
> > new file mode 100755
> > index 00000000000..192dbce2bfe
> > --- /dev/null
> > +++ b/t/t4069-remerge-diff.sh
> > @@ -0,0 +1,86 @@
> > +#!/bin/sh
> > +
> > +test_description='remerge-diff handling'
> > +
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'setup basic merges' '
> > +     test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
> > +     git add numbers &&
> > +     git commit -m base &&
> > +
> > +     git branch feature_a &&
> > +     git branch feature_b &&
> > +     git branch feature_c &&
> > +
> > +     git branch ab_resolution &&
> > +     git branch bc_resolution &&
> > +
> > +     git checkout feature_a &&
> > +     test_write_lines 1 2 three 4 5 6 7 eight 9 >numbers &&
> > +     git commit -a -m change_a &&
> > +
> > +     git checkout feature_b &&
> > +     test_write_lines 1 2 tres 4 5 6 7 8 9 >numbers &&
> > +     git commit -a -m change_b &&
> > +
> > +     git checkout feature_c &&
> > +     test_write_lines 1 2 3 4 5 6 7 8 9 10 >numbers &&
> > +     git commit -a -m change_c &&
> > +
> > +     git checkout bc_resolution &&
> > +     # fast forward
> > +     git merge feature_b &&
>
> maybe use --ff-only instead of the comment? Same below.

That'd be fine.

> (But if we did that we probably want to drop the "no conflict" comment too.)

Nah, I'd rather keep it.

> > +     # no conflict
> > +     git merge feature_c &&
> > +
> > +     git checkout ab_resolution &&
> > +     # fast forward
> > +     git merge feature_a &&
> > +     # conflicts!
> > +     test_must_fail git merge feature_b &&
> > +     # Resolve conflict...and make another change elsewhere
> > +     test_write_lines 1 2 drei 4 5 6 7 acht 9 >numbers &&
> > +     git add numbers &&
> > +     git merge --continue
> > +'
> > +
> > +test_expect_success 'remerge-diff on a clean merge' '
> > +     git log -1 --oneline bc_resolution >expect &&
> > +     git show --oneline --remerge-diff bc_resolution >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> > +test_expect_success 'remerge-diff with both a resolved conflict and an unrelated change' '
> > +     git log -1 --oneline ab_resolution >tmp &&
> > +     cat <<-EOF >>tmp &&
> > +     diff --git a/numbers b/numbers
> > +     index a1fb731..6875544 100644
> > +     --- a/numbers
> > +     +++ b/numbers
> > +     @@ -1,13 +1,9 @@
> > +      1
> > +      2
> > +     -<<<<<<< b0ed5cb (change_a)
> > +     -three
> > +     -=======
> > +     -tres
> > +     ->>>>>>> 6cd3f82 (change_b)
> > +     +drei
>
> nice
>
> > +      4
> > +      5
> > +      6
> > +      7
> > +     -eight
> > +     +acht
> > +      9
> > +     EOF
> > +     # Hashes above are sha1; rip them out so test works with sha256
> > +     sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
>
> Right, sha256 could cause many noisy test changes. I wonder if there is a
> more general way to avoid this; maybe default to SHA1 for existing tests?

Not "could", but "does".  And this is not something to be avoided.
The default testsuite we run in CI involves a run of
GIT_TEST_DEFAULT_HASH=sha256 under linux-clang.  Making these tests
SHA1-only just reduces our coverage and makes the transition to SHA256
harder; I think that's the opposite of the direction we want to go.

These changes I've made here are sufficient to make these tests work
under sha256; you can see the test results here:
https://github.com/gitgitgadget/git/runs/4646949283?check_suite_focus=true.
Under "Run ci/run-build-and-tests.sh" note that there are two runs of
tests, and the second has "export GIT_TEST_DEFAULT_HASH=sha256"
preceding it.

There might be a cleaner way to make these tests sha256-compatible,
but this seemed like a pretty simple way to me.
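
To make the effect of that sed expression concrete, here is a minimal
sketch in Python (not part of the patch; mask_hashes is a made-up name
for illustration).  It assumes, as the test does, that any run of seven
or more lowercase hex digits is an object ID:

```python
import re

# Same pattern as the sed expression: seven or more lowercase hex digits.
HEX_RUN = re.compile(r"[0-9a-f]{7,}")


def mask_hashes(text: str) -> str:
    """Replace every abbreviated or full object ID with the literal HASH."""
    return HEX_RUN.sub("HASH", text)


# The file mode (six digits) is too short to match, so it survives.
print(mask_hashes("index a1fb731..6875544 100644"))
# The subject in parentheses is untouched; only the abbreviation is masked.
print(mask_hashes("-<<<<<<< b0ed5cb (change_a)"))
```

One caveat of this approach: a word of seven or more characters made
only of the letters a-f and digits would be masked too, which does not
matter for the data in these tests.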

* Re: [PATCH v2 1/8] show, log: provide a --remerge-diff capability
  2021-12-28 22:34       ` Elijah Newren
@ 2021-12-28 23:01         ` brian m. carlson
  2021-12-28 23:45           ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: brian m. carlson @ 2021-12-28 23:01 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Altmanninger, Elijah Newren via GitGitGadget,
	Git Mailing List, Jeff King, Jonathan Nieder, Sergey Organov,
	Bagas Sanjaya, Ævar Arnfjörð Bjarmason,
	Neeraj Singh

On 2021-12-28 at 22:34:03, Elijah Newren wrote:
> CC'ing brian in case he has comments on the sha256 stuff and whether
> he thinks there's a cleaner way to make my tests work with sha256.
> (brian: See the very end of the email.)
> 
> On Tue, Dec 28, 2021 at 2:56 AM Johannes Altmanninger <aclopte@gmail.com> wrote:
> >
> > On Sat, Dec 25, 2021 at 07:59:12AM +0000, Elijah Newren via GitGitGadget wrote:
> > > +test_expect_success 'remerge-diff with both a resolved conflict and an unrelated change' '
> > > +     git log -1 --oneline ab_resolution >tmp &&
> > > +     cat <<-EOF >>tmp &&
> > > +     diff --git a/numbers b/numbers
> > > +     index a1fb731..6875544 100644
> > > +     --- a/numbers
> > > +     +++ b/numbers
> > > +     @@ -1,13 +1,9 @@
> > > +      1
> > > +      2
> > > +     -<<<<<<< b0ed5cb (change_a)
> > > +     -three
> > > +     -=======
> > > +     -tres
> > > +     ->>>>>>> 6cd3f82 (change_b)
> > > +     +drei
> >
> > nice
> >
> > > +      4
> > > +      5
> > > +      6
> > > +      7
> > > +     -eight
> > > +     +acht
> > > +      9
> > > +     EOF
> > > +     # Hashes above are sha1; rip them out so test works with sha256
> > > +     sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
> >
> > Right, sha256 could cause many noisy test changes. I wonder if there is a
> > more general way to avoid this; maybe default to SHA1 for existing tests?
> 
> Not "could", but "does".  And this is not something to be avoided.
> The default testsuite we run in CI involves a run of
> GIT_TEST_DEFAULT_HASH=sha256 under linux-clang.  Making these tests
> SHA1-only just reduces our coverage and makes the transition to SHA256
> harder; I think that's the opposite of the direction we want to go.
>
> These changes I've made here are sufficient to make these tests work
> under sha256; you can see the test results here:
> https://github.com/gitgitgadget/git/runs/4646949283?check_suite_focus=true.
> Under "Run ci/run-build-and-tests.sh" note that there are two runs of
> tests, and the second has "export GIT_TEST_DEFAULT_HASH=sha256"
> preceding it.
> 
> There might be a cleaner way to make these tests sha256-compatible,
> but this seemed like a pretty simple way to me.

The question here is, do we care very much about testing these specific
hashes?  If so, then we should use test_oid_cache to set up some OIDs
and make sure they're correct for both SHA-1 and SHA-256, and then
replace them in the code with calls to test_oid.

However, my impression is that we probably don't care very much about
what the specific values are, and in that case, this is completely fine.
We do similar things elsewhere in the testsuite.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

* Re: [PATCH v2 8/8] show, log: include conflict/warning messages in --remerge-diff headers
  2021-12-28 10:57     ` Johannes Altmanninger
@ 2021-12-28 23:42       ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2021-12-28 23:42 UTC (permalink / raw)
  To: Johannes Altmanninger
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 28, 2021 at 2:57 AM Johannes Altmanninger <aclopte@gmail.com> wrote:
>
> On Sat, Dec 25, 2021 at 07:59:19AM +0000, Elijah Newren via GitGitGadget wrote:
...
> > +test_expect_success 'remerge-diff with non-content conflicts' '
> > +     git log -1 --oneline resolution >tmp &&
> > +     cat <<-EOF >>tmp &&
> > +     diff --git a/file_or_directory~HASH (side1) b/wanted_content
>
> the "~HASH (side1)" suffix will probably mess with some programs that extract
> the filename from the diff.

"~HASH (side1)" is part of the filename, so this won't mess those
programs up at all (unless those programs can't deal with filenames
containing spaces or parentheses or something).

> I don't know what programs are supposed to expect.  I can see arguments for
> either dropping the suffix or including only "~HASH" since that's part of
> the actual filename that's left in the worktree.

When there are conflicts that prevent the file from being recorded in
the tree, such as file/directory conflicts, we have to rename the file
elsewhere.  We want the new name to be something that the user can
find and reason about.  So, both merge-recursive and merge-ort use
${filename}~${branchname}, where ${branchname} comes from the
struct merge_options (opt->branch1 or opt->branch2) that was passed
in to the function.

For a regular `git merge` we just use opt->branch1 = "HEAD" and
opt->branch2 = <name of branch/commit typed on command line to merge>.

Neither of those strings makes sense for remerge-diff.  We could just
use hashes, but why are users going to be familiar with the hashes of
the parents of a merge commit when looking at --remerge-diff output?
Parents are not part of the output by default.  So, log-tree.c uses
this bit of logic

    format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
    format_commit_message(parent2, "%h (%s)", &parent2_desc, &ctx);
    o.branch1 = parent1_desc.buf;
    o.branch2 = parent2_desc.buf;

which means that the branch name is of the form "$HASH
($EXTRA_DESCRIPTION)", and thus that files in file/directory conflicts
(or add/add + file/symlink conflicts or file/submodule conflicts) will
get renamed to a file of the form "$FILENAME~$HASH
($EXTRA_DESCRIPTION)".
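
As a rough illustration of how those two pieces compose, here is a
small Python sketch.  The function names are hypothetical, and it
deliberately ignores anything the real merge code does beyond the plain
${filename}~${branchname} concatenation described above:

```python
def branch_description(abbrev_hash: str, subject: str) -> str:
    """Mimic the "%h (%s)" format used to fill o.branch1 / o.branch2."""
    return f"{abbrev_hash} ({subject})"


def conflict_path(filename: str, branch: str) -> str:
    """Mimic the ${filename}~${branchname} name for a moved-aside file."""
    return f"{filename}~{branch}"


side1 = branch_description("738109f", "side1")
print(conflict_path("typechange", side1))  # typechange~738109f (side1)
```

The space and parentheses end up in the path simply because they are
part of the branch description.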

Note that these branch names also appear in CONFLICT messages, in
conflict markers, etc., and in fact are used much more frequently in
those locations.  In those places it's perhaps even more important to
attempt to provide meaningful names, so dropping the extra description
doesn't make sense.

> The file/link typechange conflict test I'll add below exposes what looks
> like an accidental interaction with the trailing tab characters that we emit
> on --- and +++ lines if the "filename" contains a space (since 1a9eb3b9d5
> (git-diff/git-apply: make diff output a bit friendlier to GNU patch (part
> 2), 2006-09-22)).
>
>         index 70885e4..0000000
>         --- a/typechange~738109f (side1)        <-- git diff adds a trailing tab!
>         +++ /dev/null
>
> I haven't formed an opinion yet, but since Tig uses the --- and +++ lines
> to extract file names, I'd drop the " (side1)" suffix from at least the ---
> and +++ lines. Maybe also the ^diff lines, I'm not sure

As above, " (side1)" is part of the filename and thus belongs here.
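
For what it's worth, a consumer that wants the path from a --- or +++
line could handle the trailing tab along these lines.  This is only a
sketch in Python with a made-up function name; it assumes the default
a/ and b/ prefixes and at most one trailing tab, and it does not handle
quoted (escaped) path names:

```python
from typing import Optional


def path_from_header(line: str) -> Optional[str]:
    """Extract the path from a '--- ' or '+++ ' line; None for /dev/null."""
    if not line.startswith(("--- ", "+++ ")):
        raise ValueError("not a ---/+++ header line")
    name = line[4:].rstrip("\n")
    # A single tab is appended when the name contains a space; drop it.
    if name.endswith("\t"):
        name = name[:-1]
    if name == "/dev/null":
        return None
    # Strip the default a/ or b/ prefix.
    if name.startswith(("a/", "b/")):
        name = name[2:]
    return name


print(path_from_header("--- a/typechange~738109f (side1)\t"))
```

With that, the space and the " (side1)" suffix come through as part of
the name, which is what they are.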

> > +     similarity index 100%
> > +     rename from file_or_directory~HASH (side1)
> > +     rename to wanted_content
> > +     CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
>
> I wonder if it's better to have this line further up, before the "rename"
> resolution, to correct the temporal order.

Yeah, I've gone back and forth about where these would best be placed.
You make a good point, even if the code is slightly uglier to move
earlier.  However, I do really like having the CONFLICT notices being
close to the file text being shown, which makes me conflicted (no pun
intended) about moving it earlier.  Hmm....

> > +     diff --git a/letters b/letters
> > +     CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
> > +     diff --git a/letters_side2 b/letters_side2
> > +     deleted file mode 100644
> > +     index b236ae5..0000000
> > +     --- a/letters_side2
> > +     +++ /dev/null
> > +     @@ -1,9 +0,0 @@
> > +     -a
> > +     -b
> > +     -c
> > +     -d
> > +     -e
> > +     -f
> > +     -g
> > +     -h
> > +     -i
> > +     diff --git a/numbers b/numbers
> > +     CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
> > +     EOF
>
> Took me some time to grok these but the output makes sense (it's loud and
> ugly but that's okay since these are serious conflicts).
>
> > +     # We still have some sha1 hashes above; rip them out so test works
> > +     # with sha256
> > +     sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
> > +
> > +     git show --oneline --remerge-diff resolution >tmp &&
> > +     sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> >  test_done
> > --
> > gitgitgadget
>
> We're missing a test case for typechange.

<grin>
...and a testcase for an add/add conflict, and a file/submodule
conflict, and file/symlink, and symlink/submodule, and
submodule/submodule (different submodules both added at same path),
and binary merge conflict, and a variety of
failure-to-merge-submodule-updates, and at least half a dozen regular
rename-based conflict types, and several directory-rename-based
conflict types...and that's just beginning to scratch the surface once
you start dreaming of combinations of the different conflict types
occurring for the same path (in particular, I'm thinking of examples
from the testcases found in t6416, t6422, t6423 -- or at least
sections 7, 9, & 12 of t6423).

I don't think providing a comprehensive set of possible conflicts is
useful here; we just need a representative sample.  I was curious
whether that was best served with just two examples or three, but
ultimately decided on 3.  I would have been more likely to pick 2 than
4, though.

However, while I fail to see how typechange stresses --remerge-diff in
ways the other conflict types don't, or how it might help clarify the
output for users, I might be overlooking something.  Is there a
particular reason you wanted to see the typechange conflict included?

>  Here is a quick draft I've been
> playing around with. Seems ugly that the "diff --git a/typechange b/typechange"
> is doubled but okay.
>
> Maybe a rename/delete conflict is interesting as well, I'm not sure.  (Also I
> wonder if switching the order of parents will give any interesting difference,
> I guess not)
>
> test_expect_success 'remerge-diff with file/link conflict' '
>         git branch -d base side1 side2 &&
>         git switch --orphan base &&

I'd rather have subdirectories with git repositories (much like t6416,
t6422, and t6423) if we're going to be adding many more tests here.


>         echo base >typechange &&
>         git add typechange &&
>         git commit -m base &&
>
>         git branch side1 &&
>         git branch side2 &&
>
>         git checkout side1 &&
>         echo orig-file-contents >typechange &&
>         git commit -a -m side1 &&
>
>         git checkout side2 &&
>         ln -sf . typechange &&
>         git add typechange &&
>         git commit -m side2 &&
>
>         git checkout -b resolution2 side1 &&
>         test_must_fail git merge side2 &&
>         rm typechange &&
>         mv typechange~HEAD typechange &&
>         echo resolved >>typechange &&
>         git add typechange~HEAD typechange &&
>         git merge --continue &&
>
>         git show --oneline --remerge-diff resolution2 >tmp &&
>         sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
>
>         cat <<-EOF >tmp &&
>         7759b27 Merge branch ${SQ}side2${SQ} into resolution2
>         diff --git a/typechange b/typechange
>         deleted file mode 120000
>         CONFLICT (distinct types): typechange had different types on each side; renamed one of them so each can be recorded somewhere.
>         index 945c9b4..0000000
>         --- a/typechange
>         +++ /dev/null
>         @@ -1 +0,0 @@
>         -.
>         \ No newline at end of file
>         diff --git a/typechange b/typechange
>         new file mode 100644
>         CONFLICT (distinct types): typechange had different types on each side; renamed one of them so each can be recorded somewhere.
>         index 0000000..70885e4
>         --- /dev/null
>         +++ b/typechange
>         @@ -0,0 +1,2 @@
>         +orig-file-contents
>         +resolved
>         diff --git a/typechange~738109f (side1) b/typechange~738109f (side1)
>         deleted file mode 100644
>         index 70885e4..0000000
>         --- a/typechange~738109f (side1)
>         +++ /dev/null
>         @@ -1 +0,0 @@
>         -orig-file-contents
>         EOF
>         # We still have some sha1 hashes above; rip them out so test works
>         # with sha256
>         sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
>
>         test_cmp expect actual
> '

but otherwise the testcase looks good, if we want it.

* Re: [PATCH v2 1/8] show, log: provide a --remerge-diff capability
  2021-12-28 23:01         ` brian m. carlson
@ 2021-12-28 23:45           ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2021-12-28 23:45 UTC (permalink / raw)
  To: brian m. carlson, Elijah Newren, Johannes Altmanninger,
	Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 28, 2021 at 3:01 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On 2021-12-28 at 22:34:03, Elijah Newren wrote:
> > CC'ing brian in case he has comments on the sha256 stuff and whether
> > he thinks there's a cleaner way to make my tests work with sha256.
> > (brian: See the very end of the email.)
> >
> > On Tue, Dec 28, 2021 at 2:56 AM Johannes Altmanninger <aclopte@gmail.com> wrote:
> > >
> > > On Sat, Dec 25, 2021 at 07:59:12AM +0000, Elijah Newren via GitGitGadget wrote:
> > > > +test_expect_success 'remerge-diff with both a resolved conflict and an unrelated change' '
> > > > +     git log -1 --oneline ab_resolution >tmp &&
> > > > +     cat <<-EOF >>tmp &&
> > > > +     diff --git a/numbers b/numbers
> > > > +     index a1fb731..6875544 100644
> > > > +     --- a/numbers
> > > > +     +++ b/numbers
> > > > +     @@ -1,13 +1,9 @@
> > > > +      1
> > > > +      2
> > > > +     -<<<<<<< b0ed5cb (change_a)
> > > > +     -three
> > > > +     -=======
> > > > +     -tres
> > > > +     ->>>>>>> 6cd3f82 (change_b)
> > > > +     +drei
> > >
> > > nice
> > >
> > > > +      4
> > > > +      5
> > > > +      6
> > > > +      7
> > > > +     -eight
> > > > +     +acht
> > > > +      9
> > > > +     EOF
> > > > +     # Hashes above are sha1; rip them out so test works with sha256
> > > > +     sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
> > >
> > > Right, sha256 could cause many noisy test changes. I wonder if there is a
> > > more general way to avoid this; maybe default to SHA1 for existing tests?
> >
> > Not "could", but "does".  And this is not something to be avoided.
> > The default testsuite we run in CI involves a run of
> > GIT_TEST_DEFAULT_HASH=sha256 under linux-clang.  Making these tests
> > SHA1-only just reduces our coverage and makes the transition to SHA256
> > harder; I think that's the opposite of the direction we want to go.
> >
> > These changes I've made here are sufficient to make these tests work
> > under sha256; you can see the test results here:
> > https://github.com/gitgitgadget/git/runs/4646949283?check_suite_focus=true.
> > Under "Run ci/run-build-and-tests.sh" note that there are two runs of
> > tests, and the second has "export GIT_TEST_DEFAULT_HASH=sha256"
> > preceding it.
> >
> > There might be a cleaner way to make these tests sha256-compatible,
> > but this seemed like a pretty simple way to me.
>
> The question here is, do we care very much about testing these specific
> hashes?  If so, then we should use test_oid_cache to set up some OIDs
> and make sure they're correct for both SHA-1 and SHA-256, and then
> replace them in the code with calls to test_oid.
>
> However, my impression is that we probably don't care very much about
> what the specific values are, and in that case, this is completely fine.
> We do similar things elsewhere in the testsuite.

Thanks for chiming in.  Your impression is right; I don't care about
the specific hashes, just the general form of the diff content, so
I'll keep it as is.

* Re: [PATCH v2 7/8] diff: add ability to insert additional headers for paths
  2021-12-28 21:09       ` Elijah Newren
@ 2021-12-29  0:16         ` Johannes Altmanninger
  2021-12-30 22:04           ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Johannes Altmanninger @ 2021-12-29  0:16 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 28, 2021 at 01:09:57PM -0800, Elijah Newren wrote:
> On Tue, Dec 28, 2021 at 2:57 AM Johannes Altmanninger <aclopte@gmail.com> wrote:
> >
> > On Sat, Dec 25, 2021 at 07:59:18AM +0000, Elijah Newren via GitGitGadget wrote:
> > > +     for (i = 0; i < q->nr; i++) {
> > > +             struct diff_filepair *p = q->queue[i];
> > > +             char *path = p->one->path ? p->one->path : p->two->path;
> > > +
> > > +             if (strmap_contains(o->additional_path_headers, path))
> > > +                     strset_add(&present, path);
> > > +     }
> > > +
> > > +     /*
> > > +      * Loop over paths in additional_path_headers; for each NOT already
> > > +      * in diff_queued_diff, create a synthetic filepair and insert that
> > > +      * into diff_queued_diff.
> > > +      */
> > > +     strmap_for_each_entry(o->additional_path_headers, &iter, e) {
> > > +             if (!strset_contains(&present, e->key)) {
> > > +                     struct diff_filespec *one, *two;
> > > +                     struct diff_filepair *p;
> > > +
> > > +                     one = alloc_filespec(e->key);
> > > +                     two = alloc_filespec(e->key);
> > > +                     fill_filespec(one, null_oid(), 0, 0);
> > > +                     fill_filespec(two, null_oid(), 0, 0);
> > > +                     p = diff_queue(q, one, two);
> > > +                     p->status = DIFF_STATUS_MODIFIED;
> > > +             }
> > > +     }
> >
> > All these string hash-maps are not really typical for a C program. I'm sure
> > they are the best choice for an advanced merge algorithm
> 
> Agreed up to here.
> 
> > but they are not
> > really necessary for computing/printing a diff.
> 
> Technically agree that it _could_ be solved a different way, but the
> strmaps are a much more natural solution to this problem in this
> particular case; more on this below.

Oh yeah, I agree that strmaps are the more intuitive solution.

> 
> > It feels like this is an
> > implementation detail from merge-ort that's leaking into other components.
> 
> And I disagree here, on _both_ the explicit point and the underlying
> suggestion that you seem to be making that strmap should be avoided
> outside of merging.  The strmap.[ch] type was originally a suggestion
> from Peff for areas of git completely unrelated to merging (see the
> beginning of https://lore.kernel.org/git/20200821194857.GD1165@coredump.intra.peff.net/,
> and the first link in that email).  It's a new datatype for git, much
> like strbuf or string_list or whatever before it, that is there to be
> used when it's a natural fit for the problem at hand.  The lack of
> strmap previously led folks to abuse other existing data structures
> (and in a way that often led to poor performance to boot).

Right, all those rename-detection performance fixes were pretty dazzling.

> 
> > What we want to do is
> >
> >         for file_pair in additional_headers:
> >                 if not already_queued(file_pair):
> >                         queue(file_pair)
> 
> Yes, precisely.
> 
> > to do that, you use a temporary hash-set ("present") that records everything
> > that's already queued (already_queued() is a lookup in that set).
> >
> > Let's assume both the queue and additional_headers are sorted arrays.
> 
> That's a bad assumption; we can't rely on *either* being sorted.  I

OK, I hadn't checked whether the queue is sorted.

> actually started my implementation by trying exactly what you mention
> first; I too thought it'd be more natural and clearer to do this.  Of
> course, before implementing it, I had to verify whether
> diff_queued_diff was sorted.  So, I added some code that would check
> the order and fail if the queue wasn't sorted.  7 of the test files in
> the regression testsuite had one or more failing tests.
> 
> I think the queue was intended to be sorted (see
> diffcore_fix_diff_index()), but in practice it's not.  And I'm worried
> that if I find the current cases where it fails to be sorted and "fix"
> them (though I don't actually know if this was intentional or not so I
> don't know if that's really a fix or a break), that I'd end up with
> additional cases in the future where they fail to be sorted anyway.
> So, no matter what, relying on diff_queued_diff being sorted seems
> ill-advised.
> 
> Also...
> 
> > Then we could efficiently merge them (like a merge-sort algorithm)
> > without ever allocating a temporary hash map.
> >
> > I haven't checked if this is practical (better wait for feedback).
> > We'd probably need to convert the strmap additional_path_headers into an
> > array and sort it (I guess our hash map does not guarantee any ordering?)
> 
> Right, strmap has no ordering either.  I was willing to stick those
> into a string_list and sort them, but making temporary copies of both
> the strmap and the diff_queued_diff just to sort them so that I can

But you already sort diff_queued_diff at the end of
create_filepairs_for_header_only_notifications(), so sorting a bit earlier
in that function, before enqueueing the new entries, won't change the final
result and allows us to work with a sorted queue; no need for a temporary
copy (we'd only need to copy the strmap).

> reasonably cheaply ask "are items from this thing present in this
> other thing?" seems to be stretching things a bit too far.
> maps/hashes provide a very nice "is this item present" lookup and are
> a natural way to ask that.  Since that is exactly the question I am
> asking, I think they are the better data structure here.

Yeah, that makes sense.  In theory, if we ask
"What is the union of the queued pairs and the extra pairs induced by
conflict messages?", we could abstract away the "is this item present"
lookup, but in practice that's hard.
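
For reference, the presence-set approach from the patch boils down to
something like the following Python sketch.  FilePair, pair_key and
add_header_only_pairs are illustrative names rather than anything in
git, and the final sort stands in for the sort the patch already does at
the end of the function:

```python
from typing import Dict, List, NamedTuple, Optional


class FilePair(NamedTuple):
    one_path: Optional[str]
    two_path: Optional[str]


def pair_key(pair: FilePair) -> str:
    # Same choice as the patch: prefer one's path, fall back to two's.
    return pair.one_path if pair.one_path else pair.two_path


def add_header_only_pairs(queue: List[FilePair],
                          additional_headers: Dict[str, str]) -> List[FilePair]:
    # Record which header paths already have a queued pair.
    present = {pair_key(p) for p in queue if pair_key(p) in additional_headers}
    # Queue a synthetic pair for every header path that has none.
    for path in additional_headers:
        if path not in present:
            queue.append(FilePair(path, path))
    queue.sort(key=pair_key)
    return queue


queue = [FilePair("numbers", "numbers"), FilePair("letters", "letters")]
headers = {"numbers": "CONFLICT (modify/delete)",
           "file_or_directory": "CONFLICT (file/directory)"}
print([pair_key(p) for p in add_header_only_pairs(queue, headers)])
```

Nothing here depends on the order of either input, which is the main
attraction.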

> So, this was not at all a leak of merge-ort datastructures, but rather a
> picking of the appropriate data structures for the problem at hand.

I think we have two viable solutions to this problem:
1. Use a temporary strset to figure out which pairs to add.
2. Use a temporary array, sort it, and "merge" the two arrays.

I agree that 1 is more intuitive and natural for humans, and it's probably the
way to go. But it is a bit less elegant because it adds a strmap entry for
each pair in the queue, whereas 2 only needs to add an array element for
each pair with non-content conflicts, which are much fewer. (Okay that's a
minor detail.)  With the right abstractions 2 is pretty simple as well:

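	j = 0
	extra_headers = sorted((key, val) for key, val in additional_headers)
	n = len(queue)  # only walk the pairs that were queued before we started
	for i in 0..n:
		while j < len(extra_headers) && compare(extra_headers[j].key, queue[i]) <= 0:
			if compare(extra_headers[j].key, queue[i]) < 0:
				enqueue(file_pair_for(extra_headers[j]))
			j++
	# headers that sort after every queued pair still need a synthetic pair
	while j < len(extra_headers):
		enqueue(file_pair_for(extra_headers[j]))
		j++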

where

	def compare(key: str, pair: diff_filepair) -> int:
		other = pair.one ? pair.one.path : pair.two.path # Mimic diffnamecmp
		return strcmp(key, other)
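
Spelled out as runnable Python, option 2 might look like the sketch
below.  It works on plain path strings instead of diff_filepair, assumes
the queued paths are already sorted, and uses a made-up function name.
Python compares strings by code point, which matches strcmp() only for
ASCII paths:

```python
from typing import Dict, List


def header_only_paths(queued_paths: List[str],
                      additional_headers: Dict[str, str]) -> List[str]:
    """Return header paths that have no queued pair.

    queued_paths must be sorted ascending.
    """
    extra = sorted(additional_headers)
    to_add = []
    j = 0
    for path in queued_paths:
        # Consume every header path that sorts at or before this pair.
        while j < len(extra) and extra[j] <= path:
            if extra[j] < path:
                to_add.append(extra[j])  # strictly smaller: not queued
            j += 1
    # Header paths that sort after every queued path are missing as well.
    to_add.extend(extra[j:])
    return to_add


queued = ["letters", "numbers"]
headers = {"file_or_directory": "CONFLICT (file/directory)",
           "numbers": "CONFLICT (modify/delete)",
           "zebra": "CONFLICT (hypothetical)"}
print(header_only_paths(queued, headers))  # ['file_or_directory', 'zebra']
```

Collecting the additions in a separate list avoids appending to the
queue while walking it.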

* Re: [PATCH v2 7/8] diff: add ability to insert additional headers for paths
  2021-12-29  0:16         ` Johannes Altmanninger
@ 2021-12-30 22:04           ` Elijah Newren
  2021-12-31  3:07             ` Johannes Altmanninger
  0 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren @ 2021-12-30 22:04 UTC (permalink / raw)
  To: Johannes Altmanninger
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Tue, Dec 28, 2021 at 4:16 PM Johannes Altmanninger <aclopte@gmail.com> wrote:
>
> On Tue, Dec 28, 2021 at 01:09:57PM -0800, Elijah Newren wrote:
> > On Tue, Dec 28, 2021 at 2:57 AM Johannes Altmanninger <aclopte@gmail.com> wrote:
> > >
> > > On Sat, Dec 25, 2021 at 07:59:18AM +0000, Elijah Newren via GitGitGadget wrote:
....
> > > but they are not
> > > really necessary for computing/printing a diff.
> >
> > Technically agree that it _could_ be solved a different way, but the
> > strmaps are a much more natural solution to this problem in this
> > particular case; more on this below.
>
> Oh yeah, I agree that strmaps are the more intuitive solution.

Cool, sounds like we're heading towards consensus.

> > > It feels like this is an
> > > implementation detail from merge-ort that's leaking into other components.
> >
> > And I disagree here, on _both_ the explicit point and the underlying
> > suggestion that you seem to be making that strmap should be avoided
> > outside of merging.  The strmap.[ch] type was originally a suggestion
> > from Peff for areas of git completely unrelated to merging (see the
> > beginning of https://lore.kernel.org/git/20200821194857.GD1165@coredump.intra.peff.net/,
> > and the first link in that email).  It's a new datatype for git, much
> > like strbuf or string_list or whatever before it, that is there to be
> > used when it's a natural fit for the problem at hand.  The lack of
> > strmap previously led folks to abuse other existing data structures
> > (and in a way that often led to poor performance to boot).
>
> Right, all those rename-detection performance fixes were pretty dazzling

I actually wasn't talking about rename-detection or merge machinery
(though it applies there too).  The original strmap proposal was
suggested in an email entitled, "ordered string-list considered
harmful", and neither rename-detection nor merge machinery were part
of the thread at the time.  I also didn't participate in the
thread...until I implemented the suggested API with some tweaks and
submitted it about a year later.

> > > What we want to do is
> > >
> > >         for file_pair in additional_headers:
> > >                 if not already_queued(file_pair):
> > >                         queue(file_pair)
> >
> > Yes, precisely.
> >
> > > to do that, you use a temporary hash-set ("present") that records everything
> > > that's already queued (already_queued() is a lookup in that set).
> > >
> > > Let's assume both the queue and additional_headers are sorted arrays.
> >
> > That's a bad assumption; we can't rely on *either* being sorted.  I
>
...
> > > Then we could efficiently merge them (like a merge-sort algorithm)
> > > without ever allocating a temporary hash map.
> > >
> > > I haven't checked if this is practical (better wait for feedback).
> > > We'd probably need to convert the strmap additional_path_headers into an
> > > array and sort it (I guess our hash map does not guarantee any ordering?)
> >
> > Right, strmap has no ordering either.  I was willing to stick those
> > into a string_list and sort them, but making temporary copies of both
> > the strmap and the diff_queued_diff just to sort them so that I can
>
> But you already sort diff_queued_diff at the end of
> create_filepairs_for_header_only_notifications(), so sorting a bit earlier
> in that function, before enqueueing the new entries won't change the final
> result, and allows us to work with a sorted queue; no need for a temporary
> copy (we'd only need to copy the strmap).

Good point, although...

> > reasonably cheaply ask "are items from this thing present in this
> > other thing?" seems to be stretching things a bit too far.
> > maps/hashes provide a very nice "is this item present" lookup and are
> > a natural way to ask that.  Since that is exactly the question I am
> > asking, I think they are the better data structure here.
>
> Yeah that makes sense. In theory if we ask
> "What is the union of the queued pairs and the extra pairs induced by
> conflict messages?"  we could abstract away the "is this item present"
> lookup but in practice that's hard.
>
> > So, this was not at all a leak of merge-ort datastructures, but rather a
> > picking of the appropriate data structures for the problem at hand.
>
> I think we have two viable solutions to this problem
> 1. use a temporary strset to figure out which pairs to add
> 2. use a temporary array, sort it, and "merge" the two arrays
>
> I agree that 1 is more intuitive and natural for humans, and it's probably the
> way to go. But it is a bit less elegant because it adds a strmap entry for
> each pair in the queue, whereas 2 only needs to add an array element for
> each pair with non-content conflicts, which are much fewer. (Okay that's a
> minor detail.)  With the right abstractions 2 is pretty simple as well:
>
>         j = 0
>         extra_headers = sorted((key, val) for key, val in additional_headers)

Right, so this is two sorts instead of one.  (Sorting both the
diff_queued_diff initially, as well as the additional_headers, before
then attempting to merge the two.)  Probably a win performance-wise,
but just noting that it makes the code slightly less simple.

>         for i in 0..len(queue):
>                 while j < len(extra_headers) && compare(extra_headers[j].key, queue[i]) <= 0:
>                         if compare(extra_headers[j].key, queue[i]) < 0:

The duplicate comparison (two calls to strcmp) probably kills any
performance gain you were aiming for with this strategy.  Fixable, but
it does make the code longer.

>                                 enqueue(file_pair_for(extra_headers[j]))

The queue is an array of sorted items, so enqueue here would be
insertion into an already sorted list.  Inserting N items into a list
of M items is quadratic (O(N*M)) -- unless you meant to just append to
the end and add a third sort at the end?

>                         j++

At the end of the for loop, there may be remaining additional headers
that sort after all those found in the queue, so you'll need an
additional loop to handle those.

> where
>
>         def compare(key: str, pair: diff_filepair) -> int:
>                 other = pair.one ? pair.one.path : pair.two.path # Mimic diffnamecmp
>                 return strcmp(key, other)


Since you clearly felt this approach might be better, I went and
implemented it (+ tested and debugged):

 diff.c | 112 +++++++++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 73 insertions(+), 38 deletions(-)

diff --git a/diff.c b/diff.c
index d771406e69..0cdaa2e2ab 100644
--- a/diff.c
+++ b/diff.c
@@ -6389,52 +6389,87 @@ void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc)
                warning(_(rename_limit_advice), varname, needed);
 }

+static void diff_queue_placeholder(char *path)
+{
+       struct diff_filespec *one, *two;
+       struct diff_filepair *p;
+
+       one = alloc_filespec(path);
+       two = alloc_filespec(path);
+       fill_filespec(one, null_oid(), 0, 0);
+       fill_filespec(two, null_oid(), 0, 0);
+       p = diff_queue(&diff_queued_diff, one, two);
+       p->status = DIFF_STATUS_MODIFIED;
+}
+
 static void create_filepairs_for_header_only_notifications(struct diff_options *o)
 {
-       struct strset present;
-       struct diff_queue_struct *q = &diff_queued_diff;
+       struct diff_queue_struct tmp_queue = { 0 };
+       struct string_list tmp_list = STRING_LIST_INIT_NODUP;
        struct hashmap_iter iter;
        struct strmap_entry *e;
+       char *queue_path = NULL, *list_path = NULL;
        int i;
+       int j;

-       strset_init_with_options(&present, /*pool*/ NULL, /*strdup*/ 0);
-
-       /*
-        * Find out which paths exist in diff_queued_diff, preferring
-        * one->path for any pair that has multiple paths.
-        */
-       for (i = 0; i < q->nr; i++) {
-               struct diff_filepair *p = q->queue[i];
-               char *path = p->one->path ? p->one->path : p->two->path;
-
-               if (strmap_contains(o->additional_path_headers, path))
-                       strset_add(&present, path);
-       }
-
-       /*
-        * Loop over paths in additional_path_headers; for each NOT already
-        * in diff_queued_diff, create a synthetic filepair and insert that
-        * into diff_queued_diff.
-        */
-       strmap_for_each_entry(o->additional_path_headers, &iter, e) {
-               if (!strset_contains(&present, e->key)) {
-                       struct diff_filespec *one, *two;
-                       struct diff_filepair *p;
-
-                       one = alloc_filespec(e->key);
-                       two = alloc_filespec(e->key);
-                       fill_filespec(one, null_oid(), 0, 0);
-                       fill_filespec(two, null_oid(), 0, 0);
-                       p = diff_queue(q, one, two);
-                       p->status = DIFF_STATUS_MODIFIED;
-               }
-       }
-
-       /* Re-sort the filepairs */
+       /* Ensure existing filepairs are sorted */
        diffcore_fix_diff_index();

-       /* Cleanup */
-       strset_clear(&present);
+       /* Get a sorted list of additional_path_headers */
+       strmap_for_each_entry(o->additional_path_headers, &iter, e) {
+               string_list_append(&tmp_list, e->key);
+       }
+       string_list_sort(&tmp_list);
+
+       /*
+        * Move everything from diff_queued_diff to tmp_queue.  We'll copy
+        * them back one-by-one, with extra entries inserted from tmp_list.
+        */
+       SWAP(tmp_queue, diff_queued_diff);
+
+       /*
+        * Add entries from tmp_queue and tmp_list to diff_queued_diff, keeping
+        * the overall list sorted.
+        */
+       j = 0;
+       for (i = 0; i < tmp_queue.nr; i++) {
+               struct diff_filepair *p = tmp_queue.queue[i];
+               queue_path = p->one->path ? p->one->path : p->two->path;
+
+               while (j < tmp_list.nr) {
+                       int cmp;
+
+                       list_path = tmp_list.items[j].string;
+                       cmp = strcmp(queue_path, list_path);
+
+                       if (cmp < 0)
+                               break;
+                       else if (cmp > 0)
+                               diff_queue_placeholder(list_path);
+                       j++;
+               }
+               diff_q(&diff_queued_diff, p);
+       }
+       /*
+        * We've got all the entries from tmp_queue now, but may have more
+        * from tmp_list to insert.  Make sure to only add new entries for
+        * strings not already in diff_queued_diff.
+        */
+       if (j < tmp_list.nr && !strcmp(queue_path, list_path))
+               j++;
+       while (j < tmp_list.nr) {
+               char *list_path = tmp_list.items[j].string;
+               diff_queue_placeholder(list_path);
+               j++;
+       }
+
+       /*
+        * We *only* free tmp_queue.queue, not the stuff it points to because
+        * that has been copied into diff_queued_diff.  Zero out tmp_queue to
+        * make it clear we don't want to free anything else.
+        */
+       free(tmp_queue.queue);
+       memset(&tmp_queue, 0, sizeof(tmp_queue));
 }

 static void diff_flush_patch_all_file_pairs(struct diff_options *o)


It's actually considerably more code as you can see from the diffstat,
and feels like we're reaching into some ugly internals with tmp_queue
(the SWAP and the special-case freeing) in order to get the desired
performance improvements.  And it was already O(NlogN) overall (due to
the sort), which doesn't change with this new algorithm.  It's really,
really hard for me to imagine a case where we have large numbers of
additional headers.  Even if someone else can imagine that we for some
reason have a huge number of conflicts in order to generate a huge
number of additional headers...how could the performance of sorting
O(N) filenames and merging these lists possibly matter in comparison
to the O(N) three-way file merges that would likely have been
performed from those conflicts?

So, I'm going to throw this code away and keep the original.

It was an interesting idea and exercise; thanks for keeping me on my toes.
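
For readers comparing the two strategies discussed above, here is a minimal
Python sketch of both. It is an illustration only, not the diff.c code: plain
path strings stand in for diff_filepair entries, the function names are
hypothetical, and Python's string ordering stands in for strcmp() (the two
agree for ASCII paths).

```python
from typing import Iterable, List


def add_missing_with_set(queue: List[str], extra: Iterable[str]) -> List[str]:
    """Approach 1: remember queued paths in a set, append the rest, sort once."""
    present = set(queue)
    merged = list(queue)
    merged.extend(path for path in extra if path not in present)
    return sorted(merged)


def add_missing_with_merge(queue: List[str], extra: Iterable[str]) -> List[str]:
    """Approach 2: sort both inputs, then merge them in a single pass."""
    queue = sorted(queue)
    extra = sorted(set(extra))
    merged: List[str] = []
    j = 0
    for path in queue:
        # Emit every extra path that sorts strictly before the queued one.
        while j < len(extra) and extra[j] < path:
            merged.append(extra[j])
            j += 1
        # An extra path equal to the queued one is already represented.
        if j < len(extra) and extra[j] == path:
            j += 1
        merged.append(path)
    # Extra paths that sort after everything in the queue still need entries.
    merged.extend(extra[j:])
    return merged


if __name__ == "__main__":
    queued = ["b", "d", "a"]
    headers = ["c", "a", "z"]
    print(add_missing_with_set(queued, headers))
    print(add_missing_with_merge(queued, headers))
```

Both functions return the same sorted list. The second one needs the explicit
equality check and the trailing extend() that the review comments above point
out, which is part of why it ends up longer.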

^ permalink raw reply related	[flat|nested] 113+ messages in thread

* [PATCH v3 0/9] Add a new --remerge-diff capability to show & log
  2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
                     ` (9 preceding siblings ...)
  2021-12-28 10:55   ` Johannes Altmanninger
@ 2021-12-30 23:36   ` Elijah Newren via GitGitGadget
  2021-12-30 23:36     ` [PATCH v3 1/9] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
                       ` (10 more replies)
  10 siblings, 11 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-30 23:36 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren

Here are some patches to add a --remerge-diff capability to show & log,
which works by comparing merge commits to an automatic remerge (note that
the automatic remerge tree can contain files with conflict markers).

Changes NOT included (mostly because I'm not sure what to add or where):

 * Ævar suggested also extending the docs with usage guidelines, but the
   example he picked was IMO best handled by just adding --remerge-diff, so I'm
   not sure what to add to the docs. Maybe the log -S<string> --remerge-diff
   example as a way to more reliably determine when a string was added to or
   removed from the codebase? Where would that go anyway?
 * Johannes Altmanninger suggested changing the ordering of the new headers
   relative to other headers. He made a good point, but I also like having
   the conflict messages next to the text, so I'm conflicted about what's
   best.
 * (Technically not part of this feature, but kind of related.) Months ago,
   Junio suggested documenting ${GIT_DIR}/AUTO_MERGE better
   (https://lore.kernel.org/git/xmqqtuj4nepe.fsf@gitster.g/). I looked at
   the time, but couldn't find a place to put it that made sense to me.

Changes since v2 (of the restarted submission):

 * Numerous small improvements suggested by Johannes Altmanninger
 * Avoid including conflict messages from inner merges (due to example
   pointed out by Ævar).
 * Added a "remerge" prefix to all the new diff headers (suggested by Junio
   in a previous round, but I couldn't come up with a good name before. It
   suddenly hit me that "remerge" is an obvious prefix to use, and even
   helps explain what the rest of the line is for.)
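
As a rough illustration of the header formatting described in the commit
message of patch 6 (a prefix, plus a space after every newline so that
continuation lines stay part of the header), here is a small Python sketch.
The function name is hypothetical, and the real logic lives in path_msg() in
merge-ort.c, which this does not reproduce exactly.

```python
def format_header_message(message: str, prefix: str = "") -> str:
    """Prefix a message and indent continuation lines by one space."""
    # Every newline becomes newline + space, so later lines read as a
    # continuation of the same header rather than as new diff content.
    body = message.replace("\n", "\n ")
    return f"{prefix} {body}" if prefix else body


if __name__ == "__main__":
    print(format_header_message(
        "CONFLICT (content): Merge conflict in numbers", "remerge"))
    print(format_header_message("first line\nsecond line", "remerge"))
```

With the "remerge" prefix, the first call yields the same header line that the
updated t4069 expectations contain.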

Changes since v1 (of the restarted submission, which technically was v2):

 * Restructured the series, so the first patch introduces the feature --
   with a bunch of caveats. Subsequent patches clean up those caveats. This
   avoids introducing not-yet-used functions, and hopefully makes review
   easier.
 * added testcases
 * numerous small improvements suggested by Ævar and Junio

Changes since original submission[1]:

 * Rebased on top of the version of ns/tmp-objdir that Neeraj submitted
   (Neeraj's patches were based on v2.34, but ns/tmp-objdir got applied on
   an old commit and does not even build because of that).
 * Modify ll-merge API to return a status, instead of printing "Cannot merge
   binary files" on stdout[2] (as suggested by Peff)
 * Make conflict messages and other such warnings into diff headers of the
   subsequent remerge-diff rather than appearing in the diff as file content
   of some funny looking filenames (as suggested by Peff[3] and Junio[4])
 * Sergey ack'ed the diff-merges.c portion of the patches, but that wasn't
   limited to one patch so not sure where to record that ack.
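
The ll-merge change above (return a status and let the caller decide how to
report it) can be sketched in Python as follows. This is only an illustration
under stated assumptions: the enum member names echo LL_MERGE_CONFLICT and
LL_MERGE_BINARY_CONFLICT from the patches, but the numeric values, the ERROR
and OK members, and the helper names are assumptions made for the example.

```python
from enum import IntEnum
from typing import List


class MergeStatus(IntEnum):
    # Numeric values are assumptions for this sketch.
    ERROR = -1
    OK = 0
    CONFLICT = 1
    BINARY_CONFLICT = 2


def normalize_status(raw_status: int) -> int:
    """Collapse any positive conflict count into a single CONFLICT value."""
    return MergeStatus.CONFLICT if raw_status > 0 else raw_status


def report(status: int, path: str, warnings: List[str]) -> None:
    """Caller-side reporting: the low-level merge itself prints nothing."""
    if status == MergeStatus.BINARY_CONFLICT:
        warnings.append(f"Cannot merge binary files: {path}")


if __name__ == "__main__":
    collected: List[str] = []
    report(MergeStatus.BINARY_CONFLICT, "image.png", collected)
    report(normalize_status(3), "numbers", collected)
    print(collected)
```

Because the caller owns the message, one caller can print it as a warning
while another stores it for later use as a diff header.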

[1] https://lore.kernel.org/git/pull.1080.git.git.1630376800.gitgitgadget@gmail.com/;
    GitHub wouldn't let me change the target branch for the PR, so I had to
    create a new one with the new base, which is why this was not sent as v2
    even though it is one.
[2] https://lore.kernel.org/git/YVOZRhWttzF18Xql@coredump.intra.peff.net/,
    https://lore.kernel.org/git/YVOZty9D7NRbzhE5@coredump.intra.peff.net/
[3] https://lore.kernel.org/git/YVOXPTjsp9lrxmS6@coredump.intra.peff.net/
[4] https://lore.kernel.org/git/xmqqr1d7e4ug.fsf@gitster.g/

=== FURTHER BACKGROUND (original cover letter material) ===

Here are some example commits you can try this out on (with git show
--remerge-diff $COMMIT):

 * git.git conflicted merge: 07601b5b36
 * git.git non-conflicted change: bf04590ecd
 * linux.git conflicted merge: eab3540562fb
 * linux.git non-conflicted change: 223cea6a4f05

Many more can be found by just running git log --merges --remerge-diff in
your repository of choice and searching for diffs (most merges tend to be
clean and unmodified and thus produce no diff but a search of '^diff' in the
log output tends to find the examples nicely).

Some basic high level details about this new option:

 * This option is most naturally compared to --cc, though the output seems
   to be much more understandable to most users than --cc output.
 * Since merges are often clean and unmodified, this new option results in
   an empty diff for most merges.
 * This new option shows things like the removal of conflict markers, which
   hunks users picked from the various conflicted sides to keep or remove,
   and shows changes made outside of conflict markers (which might reflect
   changes needed to resolve semantic conflicts or cleanups of e.g.
   compilation warnings or other additional changes an integrator felt
   belonged in the merged result).
 * This new option does not (currently) work for octopus merges, since
   merge-ort is specific to two-parent merges[1].
 * This option will not work on a read-only or full filesystem[2].
 * We discussed this capability at Git Merge 2020, and one of the
   suggestions was doing a periodic git gc --auto during the operation (due
   to potential new blobs and trees created during the operation). I found a
   way to avoid that; see [2].
 * This option is faster than you'd probably expect; it handles 33.5 merge
   commits per second in linux.git on my computer; see below.

Regarding the performance point above, the timing for running the
following command:

time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l


in linux.git (with v5.4 checked out, since my copy of linux is very out of
date) is as follows:

DIFF_FLAG=--cc:            71m 31.536s
DIFF_FLAG=--remerge-diff:  31m  3.170s


Note that there are 62476 merges in this history. Also, output size is:

DIFF_FLAG=--cc:            2169111 lines
DIFF_FLAG=--remerge-diff:  2458020 lines


So roughly the same amount of output as --cc, as you'd expect.
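
The per-second figure quoted earlier follows directly from these numbers. A
small Python check, using only the timings, merge count, and line counts
reported above (the variable and function names are just for this sketch):

```python
def seconds(minutes: int, secs: float) -> float:
    """Convert a 'XXm YY.YYYs' timing into seconds."""
    return minutes * 60 + secs


MERGES = 62476
cc_time = seconds(71, 31.536)
remerge_time = seconds(31, 3.170)

cc_rate = MERGES / cc_time
remerge_rate = MERGES / remerge_time
speedup = cc_time / remerge_time
output_ratio = 2458020 / 2169111

print(f"--cc:            {cc_rate:.1f} merges/s")
print(f"--remerge-diff:  {remerge_rate:.1f} merges/s")
print(f"speedup:         {speedup:.1f}x")
print(f"output ratio:    {output_ratio:.2f}")
```

This gives about 33.5 merges per second for --remerge-diff versus about 14.6
for --cc (roughly 2.3x faster), with about 13% more output lines.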

As a side note: git log --remerge-diff, when run in various repositories and
allowed to run all the way back to the beginning(s) of history, is a nice
stress test of sorts for merge-ort. Especially when users run it for you on
the repositories they are working on, whether intentionally or via a bug
in a tool triggering that command to be run unexpectedly. Long story short,
such a bug in an internal tool existed in December 2020 and this command was
run on an internal repository and found a platform-specific bug in merge-ort
on some really old merge commit from that repo. I fixed that bug (a
STABLE_QSORT thing) while upstreaming all the merge-ort patches in the
meantime, but it was nice getting extra testing. Having more folks run this on
their repositories might be useful extra testing of the new merge strategy.

Also, I previously mentioned --remerge-diff-only (a flag to show how
cherry-picks or reverts differ from an automatic cherry-pick or revert, in
addition to showing how merges differ from an automatic merge). This series
does not include the patches to introduce that option; I'll submit them
later.

Two other things that might be interesting but are not included and which I
haven't investigated:

 * some mechanism for passing extra merge options through (e.g.
   -Xignore-space-change)
 * a capability to compare the automatic merge to a second automatic merge
   done with different merge options. (Not sure if this would be of interest
   to end users, but might be interesting while developing a new
   --strategy-option, or maybe checking how changing some default in the
   merge algorithm would affect historical merges in various repositories).

[1] I have nebulous ideas of how an Octopus-centric ORT strategy could be
    written -- basically, just repeatedly invoking ort and trying to make sure
    nested conflicts can be differentiated. For now, though, a simple warning is
    printed that octopus merges are not handled and no diff will be shown.
[2] New blobs/trees can be written by the three-way merging step. These are
    written to a temporary area (via tmp-objdir.c) under the git object store
    that is cleaned up at the end of the operation, with the new loose objects
    from the remerge being cleaned up after each individual merge.

Elijah Newren (9):
  show, log: provide a --remerge-diff capability
  log: clean unneeded objects during `log --remerge-diff`
  ll-merge: make callers responsible for showing warnings
  merge-ort: capture and print ll-merge warnings in our preferred
    fashion
  merge-ort: mark a few more conflict messages as omittable
  merge-ort: format messages slightly different for use in headers
  diff: add ability to insert additional headers for paths
  show, log: include conflict/warning messages in --remerge-diff headers
  merge-ort: mark conflict/warning messages from inner merges as
    omittable

 Documentation/diff-options.txt |  10 +-
 apply.c                        |   5 +-
 builtin/checkout.c             |  12 ++-
 builtin/log.c                  |  15 +++
 diff-merges.c                  |  12 +++
 diff.c                         | 116 +++++++++++++++++++++-
 diff.h                         |   3 +-
 ll-merge.c                     |  40 ++++----
 ll-merge.h                     |   9 +-
 log-tree.c                     |  71 +++++++++++++-
 merge-blobs.c                  |   5 +-
 merge-ort.c                    |  55 ++++++++++-
 merge-ort.h                    |  10 ++
 merge-recursive.c              |   9 +-
 merge-recursive.h              |   2 +
 notes-merge.c                  |   5 +-
 rerere.c                       |   9 +-
 revision.h                     |   6 +-
 t/t4069-remerge-diff.sh        | 170 +++++++++++++++++++++++++++++++++
 t/t6404-recursive-merge.sh     |   9 +-
 t/t6406-merge-attr.sh          |   9 +-
 tmp-objdir.c                   |   5 +
 tmp-objdir.h                   |   6 ++
 23 files changed, 545 insertions(+), 48 deletions(-)
 create mode 100755 t/t4069-remerge-diff.sh


base-commit: 4e44121c2d7bced65e25eb7ec5156290132bec94
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1103%2Fnewren%2Fremerge-diff-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1103/newren/remerge-diff-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1103

Range-diff vs v2:

  1:  b3ae62083e1 !  1:  d57ae218cf9 show, log: provide a --remerge-diff capability
     @@ Commit message
      
          This capability works by creating a temporary object directory and
          marking it as the primary object store.  This makes it so that any blobs
     -    or trees created during the automatic merge easily removable afterwards
     -    by just deleting all objects from the temporary object directory.
     +    or trees created during the automatic merge are easily removable
     +    afterwards by just deleting all objects from the temporary object
     +    directory.
      
          There are a few ways that this implementation is suboptimal:
            * `log --remerge-diff` becomes slow, because the temporary object
     -        directory can fills with many loose objects while running
     +        directory can fill with many loose objects while running
            * the log output can be muddied with misplaced "warning: cannot merge
              binary files" messages, since ll-merge.c unconditionally writes those
              messages to stderr while running instead of allowing callers to
     @@ Commit message
              way for a user of --remerge-diff to know that there had been a
              conflict which was resolved (and which possibly motivated other
              changes in the merge commit).
     +      * when fixing the previous issue, note that some unimportant conflict
     +        and warning messages might start being included.  We should instead
     +        make sure these remain dropped.
          Subsequent commits will address these issues.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
       ## Documentation/diff-options.txt ##
     +@@ Documentation/diff-options.txt: endif::git-diff[]
     + endif::git-format-patch[]
     + 
     + ifdef::git-log[]
     +---diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc)::
     ++--diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc|remerge|r)::
     + --no-diff-merges::
     + 	Specify diff format to be used for merge commits. Default is
     + 	{diff-merges-default} unless `--first-parent` is in use, in which case
      @@ Documentation/diff-options.txt: ifdef::git-log[]
       	each of the parents. Separate log entry and diff is generated
       	for each parent.
     @@ t/t4069-remerge-diff.sh (new)
      +	git commit -a -m change_c &&
      +
      +	git checkout bc_resolution &&
     -+	# fast forward
     -+	git merge feature_b &&
     ++	git merge --ff-only feature_b &&
      +	# no conflict
      +	git merge feature_c &&
      +
      +	git checkout ab_resolution &&
     -+	# fast forward
     -+	git merge feature_a &&
     ++	git merge --ff-only feature_a &&
      +	# conflicts!
      +	test_must_fail git merge feature_b &&
      +	# Resolve conflict...and make another change elsewhere
  2:  54f1fb31d04 =  2:  798625b53f2 log: clean unneeded objects during `log --remerge-diff`
  3:  d5566f5d136 !  3:  b952f674df1 ll-merge: make callers responsible for showing warnings
     @@ Commit message
          Since some callers may want to send warning messages to somewhere other
          than stdout/stderr, stop printing "warning: Cannot merge binary files"
          from ll-merge and instead modify the return status of ll_merge() to
     -    indicate when a merge of binary files has occurred.
     +    indicate when a merge of binary files has occurred.  Message printing
     +    probably does not belong in a "low-level merge" anyway.
      
     -    This commit continues printing the message as-is; future changes will
     -    start handling the new commit differently in the merge-ort codepath.
     +    This commit continues printing the message as-is, just from the callers
     +    instead of within ll_merge().  Future changes will start handling the
     +    message differently in the merge-ort codepath.
     +
     +    There was one special case here: the callers in rerere.c do NOT check
     +    for and print such a message; since those code paths explicitly skip
     +    over binary files, there is no reason to check for a return status of
     +    LL_MERGE_BINARY_CONFLICT or print the related message.
      
          Note that my methodology included first modifying ll_merge() to return
          a struct, so that the compiler would catch all the callers for me and
     @@ ll-merge.c: static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
       	xmp.file2 = name2;
      -	return xdl_merge(orig, src1, src2, &xmp, result);
      +	status = xdl_merge(orig, src1, src2, &xmp, result);
     -+	ret = (status > 1 ) ? LL_MERGE_CONFLICT : status;
     ++	ret = (status > 0) ? LL_MERGE_CONFLICT : status;
      +	return ret;
       }
       
     @@ ll-merge.c: static int ll_ext_merge(const struct ll_merge_driver *fn,
       	strbuf_release(&cmd);
       	strbuf_release(&path_sq);
      -	return status;
     -+	ret = (status > 1) ? LL_MERGE_CONFLICT : status;
     ++	ret = (status > 0) ? LL_MERGE_CONFLICT : status;
      +	return ret;
       }
       
     @@ rerere.c: static int try_merge(struct index_state *istate,
       		 */
       		ret = ll_merge(result, path, &base, NULL, cur, "", &other, "",
       			       istate, NULL);
     -+		if (ret == LL_MERGE_BINARY_CONFLICT)
     -+			warning("Cannot merge binary files: %s (%s vs. %s)",
     -+				path, "", "");
      +	}
       
       	free(base.ptr);
  4:  a02845f12db =  4:  e8cf1626960 merge-ort: capture and print ll-merge warnings in our preferred fashion
  5:  000933c5d7f =  5:  4d1848c8a29 merge-ort: mark a few more conflict messages as omittable
  6:  887e46435c0 !  6:  81e736b847e merge-ort: format messages slightly different for use in headers
     @@ Commit message
          headers instead...but for that to work, we need for any multiline
          messages to replace newlines with both a newline and a space.  Add a new
          flag to signal when we want these messages modified in such a fashion,
     -    and use it in path_msg() to modify these messages this way.
     +    and use it in path_msg() to modify these messages this way.  Also, allow
     +    a special prefix to be specified for these headers.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     @@ merge-ort.c: static void path_msg(struct merge_options *opt,
      +	if (opt->record_conflict_msgs_as_headers) {
      +		int i_sb = 0, i_tmp = 0;
      +
     ++		/* Start with the specified prefix */
     ++		if (opt->msg_header_prefix)
     ++			strbuf_addf(sb, "%s ", opt->msg_header_prefix);
     ++
      +		/* Copy tmp to sb, adding spaces after newlines */
     -+		strbuf_grow(sb, 2*tmp.len); /* more than sufficient */
     ++		strbuf_grow(sb, sb->len + 2*tmp.len); /* more than sufficient */
      +		for (; i_tmp < tmp.len; i_tmp++, i_sb++) {
      +			/* Copy next character from tmp to sb */
      +			sb->buf[sb->len + i_sb] = tmp.buf[i_tmp];
     @@ merge-ort.c: static void path_msg(struct merge_options *opt,
      +		sb->len += i_sb;
      +		sb->buf[sb->len] = '\0';
      +
     -+		/* Clean up tmp */
      +		strbuf_release(&tmp);
      +	}
      +
     @@ merge-ort.c: void merge_switch_to_result(struct merge_options *opt,
       		trace2_region_enter("merge", "display messages", opt->repo);
       
       		/* Hack to pre-allocate olist to the desired size */
     +@@ merge-ort.c: static void merge_start(struct merge_options *opt, struct merge_result *result)
     + 	assert(opt->recursive_variant >= MERGE_VARIANT_NORMAL &&
     + 	       opt->recursive_variant <= MERGE_VARIANT_THEIRS);
     + 
     ++	if (opt->msg_header_prefix)
     ++		assert(opt->record_conflict_msgs_as_headers);
     ++
     + 	/*
     + 	 * detect_renames, verbosity, buffer_output, and obuf are ignored
     + 	 * fields that were used by "recursive" rather than "ort" -- but
      
       ## merge-recursive.c ##
      @@ merge-recursive.c: static int merge_start(struct merge_options *opt, struct tree *head)
     @@ merge-recursive.c: static int merge_start(struct merge_options *opt, struct tree
       
      +	/* Not supported; option specific to merge-ort */
      +	assert(!opt->record_conflict_msgs_as_headers);
     ++	assert(!opt->msg_header_prefix);
      +
       	/* Sanity check on repo state; index must match head */
       	if (repo_index_has_changes(opt->repo, head, &sb)) {
     @@ merge-recursive.h: struct merge_options {
       	const char *subtree_shift;
       	unsigned renormalize : 1;
      +	unsigned record_conflict_msgs_as_headers : 1;
     ++	const char *msg_header_prefix;
       
       	/* internal fields used by the implementation */
       	struct merge_options_internal *priv;
  7:  e9470651303 =  7:  5000a94aa98 diff: add ability to insert additional headers for paths
  8:  4cc53c55a6e !  8:  78ec1f44e4e show, log: include conflict/warning messages in --remerge-diff headers
     @@ log-tree.c: static int do_remerge_diff(struct rev_info *opt,
       	init_merge_options(&o, the_repository);
       	o.show_rename_progress = 0;
      +	o.record_conflict_msgs_as_headers = 1;
     ++	o.msg_header_prefix = "remerge";
       
       	ctx.abbrev = DEFAULT_ABBREV;
       	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
     @@ t/t4069-remerge-diff.sh: test_expect_success 'remerge-diff with both a resolved
       	git log -1 --oneline ab_resolution >tmp &&
       	cat <<-EOF >>tmp &&
       	diff --git a/numbers b/numbers
     -+	CONFLICT (content): Merge conflict in numbers
     ++	remerge CONFLICT (content): Merge conflict in numbers
       	index a1fb731..6875544 100644
       	--- a/numbers
       	+++ b/numbers
     @@ t/t4069-remerge-diff.sh: test_expect_success 'remerge-diff with both a resolved
      +	similarity index 100%
      +	rename from file_or_directory~HASH (side1)
      +	rename to wanted_content
     -+	CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
     ++	remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
      +	diff --git a/letters b/letters
     -+	CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
     ++	remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
      +	diff --git a/letters_side2 b/letters_side2
      +	deleted file mode 100644
      +	index b236ae5..0000000
     @@ t/t4069-remerge-diff.sh: test_expect_success 'remerge-diff with both a resolved
      +	-h
      +	-i
      +	diff --git a/numbers b/numbers
     -+	CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
     ++	remerge CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
      +	EOF
      +	# We still have some sha1 hashes above; rip them out so test works
      +	# with sha256
  -:  ----------- >  9:  64b44ee84f3 merge-ort: mark conflict/warning messages from inner merges as omittable

-- 
gitgitgadget


* [PATCH v3 1/9] show, log: provide a --remerge-diff capability
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
@ 2021-12-30 23:36     ` Elijah Newren via GitGitGadget
  2022-01-19 15:49       ` Ævar Arnfjörð Bjarmason
  2022-01-19 16:01       ` Ævar Arnfjörð Bjarmason
  2021-12-30 23:36     ` [PATCH v3 2/9] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
                       ` (9 subsequent siblings)
  10 siblings, 2 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-30 23:36 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When this option is specified, we remerge all two-parent merge commits
and diff the actual merge commit against the automatically created
version, in order to show how users removed conflict markers, resolved
the different conflict versions, and potentially added new changes
outside of conflict regions in order to resolve semantic merge problems
(or, possibly, just to hide other random changes).

This capability works by creating a temporary object directory and
marking it as the primary object store.  This makes it so that any blobs
or trees created during the automatic merge are easily removable
afterwards by just deleting all objects from the temporary object
directory.

There are a few ways that this implementation is suboptimal:
  * `log --remerge-diff` becomes slow, because the temporary object
    directory can fill with many loose objects while running.
  * the log output can be muddied with misplaced "warning: Cannot merge
    binary files" messages, since ll-merge.c unconditionally writes those
    messages to stderr while running instead of allowing callers to
    manage them.
  * important conflict and warning messages are simply dropped; thus for
    conflicts like modify/delete or rename/rename or file/directory which
    are not representable with content conflict markers, there may be no
    way for a user of --remerge-diff to know that there had been a
    conflict which was resolved (and which possibly motivated other
    changes in the merge commit).
  * when fixing the previous issue, note that some unimportant conflict
    and warning messages might start being included.  We should instead
    make sure these remain dropped.
Subsequent commits will address these issues.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/diff-options.txt | 10 +++-
 builtin/log.c                  | 14 ++++++
 diff-merges.c                  | 12 +++++
 log-tree.c                     | 59 ++++++++++++++++++++++++
 revision.h                     |  3 +-
 t/t4069-remerge-diff.sh        | 84 ++++++++++++++++++++++++++++++++++
 6 files changed, 180 insertions(+), 2 deletions(-)
 create mode 100755 t/t4069-remerge-diff.sh
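
Illustration only, not part of the patch: the commit message limits
remerge-diff to two-parent merges, and the log-tree.c hunk below skips
octopus merges by checking parents->next->next.  A minimal standalone
sketch of that decision follows.  The names struct parent_list, enum
remerge_action, and classify_for_remerge() are hypothetical stand-ins
invented for this sketch; Git itself uses struct commit_list and makes
the check inline in log_tree_diff().

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for Git's struct commit_list. */
struct parent_list {
	const char *name;
	struct parent_list *next;
};

/* Hypothetical result type; Git has no such enum. */
enum remerge_action {
	REMERGE_NOT_A_MERGE,	/* zero or one parent */
	REMERGE_DIFF,		/* exactly two parents: remerge and diff */
	REMERGE_SKIP_OCTOPUS	/* three or more parents: warn and skip */
};

static enum remerge_action classify_for_remerge(const struct parent_list *parents)
{
	if (!parents || !parents->next)
		return REMERGE_NOT_A_MERGE;
	/* Same test as "octopus" in the patch: is there a third parent? */
	if (parents->next->next)
		return REMERGE_SKIP_OCTOPUS;
	return REMERGE_DIFF;
}
```

Only the REMERGE_DIFF case would correspond to a call to
do_remerge_diff(); the octopus case corresponds to the "Skipping
remerge-diff for octopus merges" warning.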

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index c89d530d3d1..6b8175defe6 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -34,7 +34,7 @@ endif::git-diff[]
 endif::git-format-patch[]
 
 ifdef::git-log[]
---diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc)::
+--diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc|remerge|r)::
 --no-diff-merges::
 	Specify diff format to be used for merge commits. Default is
 	{diff-merges-default} unless `--first-parent` is in use, in which case
@@ -64,6 +64,14 @@ ifdef::git-log[]
 	each of the parents. Separate log entry and diff is generated
 	for each parent.
 +
+--diff-merges=remerge:::
+--diff-merges=r:::
+--remerge-diff:::
+	With this option, two-parent merge commits are remerged to
+	create a temporary tree object -- potentially containing files
+	with conflict markers and such.  A diff is then shown between
+	that temporary tree and the actual merge commit.
++
 --diff-merges=combined:::
 --diff-merges=c:::
 -c:::
diff --git a/builtin/log.c b/builtin/log.c
index f75d87e8d7f..d053418fddd 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -35,6 +35,7 @@
 #include "repository.h"
 #include "commit-reach.h"
 #include "range-diff.h"
+#include "tmp-objdir.h"
 
 #define MAIL_DEFAULT_WRAP 72
 #define COVER_FROM_AUTO_MAX_SUBJECT_LEN 100
@@ -406,6 +407,14 @@ static int cmd_log_walk(struct rev_info *rev)
 	struct commit *commit;
 	int saved_nrl = 0;
 	int saved_dcctc = 0;
+	struct tmp_objdir *remerge_objdir = NULL;
+
+	if (rev->remerge_diff) {
+		remerge_objdir = tmp_objdir_create("remerge-diff");
+		if (!remerge_objdir)
+			die_errno(_("unable to create temporary object directory"));
+		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
+	}
 
 	if (rev->early_output)
 		setup_early_output();
@@ -449,6 +458,9 @@ static int cmd_log_walk(struct rev_info *rev)
 	rev->diffopt.no_free = 0;
 	diff_free(&rev->diffopt);
 
+	if (rev->remerge_diff)
+		tmp_objdir_destroy(remerge_objdir);
+
 	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
 	    rev->diffopt.flags.check_failed) {
 		return 02;
@@ -1943,6 +1955,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		die(_("--name-status does not make sense"));
 	if (rev.diffopt.output_format & DIFF_FORMAT_CHECKDIFF)
 		die(_("--check does not make sense"));
+	if (rev.remerge_diff)
+		die(_("--remerge-diff does not make sense"));
 
 	if (!use_patch_format &&
 		(!rev.diffopt.output_format ||
diff --git a/diff-merges.c b/diff-merges.c
index 5060ccd890b..0af4b3f9191 100644
--- a/diff-merges.c
+++ b/diff-merges.c
@@ -17,6 +17,7 @@ static void suppress(struct rev_info *revs)
 	revs->combined_all_paths = 0;
 	revs->merges_imply_patch = 0;
 	revs->merges_need_diff = 0;
+	revs->remerge_diff = 0;
 }
 
 static void set_separate(struct rev_info *revs)
@@ -45,6 +46,12 @@ static void set_dense_combined(struct rev_info *revs)
 	revs->dense_combined_merges = 1;
 }
 
+static void set_remerge_diff(struct rev_info *revs)
+{
+	suppress(revs);
+	revs->remerge_diff = 1;
+}
+
 static diff_merges_setup_func_t func_by_opt(const char *optarg)
 {
 	if (!strcmp(optarg, "off") || !strcmp(optarg, "none"))
@@ -57,6 +64,8 @@ static diff_merges_setup_func_t func_by_opt(const char *optarg)
 		return set_combined;
 	else if (!strcmp(optarg, "cc") || !strcmp(optarg, "dense-combined"))
 		return set_dense_combined;
+	else if (!strcmp(optarg, "r") || !strcmp(optarg, "remerge"))
+		return set_remerge_diff;
 	else if (!strcmp(optarg, "m") || !strcmp(optarg, "on"))
 		return set_to_default;
 	return NULL;
@@ -110,6 +119,9 @@ int diff_merges_parse_opts(struct rev_info *revs, const char **argv)
 	} else if (!strcmp(arg, "--cc")) {
 		set_dense_combined(revs);
 		revs->merges_imply_patch = 1;
+	} else if (!strcmp(arg, "--remerge-diff")) {
+		set_remerge_diff(revs);
+		revs->merges_imply_patch = 1;
 	} else if (!strcmp(arg, "--no-diff-merges")) {
 		suppress(revs);
 	} else if (!strcmp(arg, "--combined-all-paths")) {
diff --git a/log-tree.c b/log-tree.c
index 644893fd8cf..84ed864fc81 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "commit-reach.h"
 #include "config.h"
 #include "diff.h"
 #include "object-store.h"
@@ -7,6 +8,7 @@
 #include "tag.h"
 #include "graph.h"
 #include "log-tree.h"
+#include "merge-ort.h"
 #include "reflog-walk.h"
 #include "refs.h"
 #include "string-list.h"
@@ -902,6 +904,51 @@ static int do_diff_combined(struct rev_info *opt, struct commit *commit)
 	return !opt->loginfo;
 }
 
+static int do_remerge_diff(struct rev_info *opt,
+			   struct commit_list *parents,
+			   struct object_id *oid,
+			   struct commit *commit)
+{
+	struct merge_options o;
+	struct commit_list *bases;
+	struct merge_result res = {0};
+	struct pretty_print_context ctx = {0};
+	struct commit *parent1 = parents->item;
+	struct commit *parent2 = parents->next->item;
+	struct strbuf parent1_desc = STRBUF_INIT;
+	struct strbuf parent2_desc = STRBUF_INIT;
+
+	/* Setup merge options */
+	init_merge_options(&o, the_repository);
+	o.show_rename_progress = 0;
+
+	ctx.abbrev = DEFAULT_ABBREV;
+	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
+	format_commit_message(parent2, "%h (%s)", &parent2_desc, &ctx);
+	o.branch1 = parent1_desc.buf;
+	o.branch2 = parent2_desc.buf;
+
+	/* Parse the relevant commits and get the merge bases */
+	parse_commit_or_die(parent1);
+	parse_commit_or_die(parent2);
+	bases = get_merge_bases(parent1, parent2);
+
+	/* Re-merge the parents */
+	merge_incore_recursive(&o, bases, parent1, parent2, &res);
+
+	/* Show the diff */
+	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
+	log_tree_diff_flush(opt);
+
+	/* Cleanup */
+	strbuf_release(&parent1_desc);
+	strbuf_release(&parent2_desc);
+	merge_finalize(&o, &res);
+	/* TODO: clean up the temporary object directory */
+
+	return !opt->loginfo;
+}
+
 /*
  * Show the diff of a commit.
  *
@@ -936,6 +983,18 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log
 	}
 
 	if (is_merge) {
+		int octopus = (parents->next->next != NULL);
+
+		if (opt->remerge_diff) {
+			if (octopus) {
+				show_log(opt);
+				fprintf(opt->diffopt.file,
+					"diff: warning: Skipping remerge-diff "
+					"for octopus merges.\n");
+				return 1;
+			}
+			return do_remerge_diff(opt, parents, oid, commit);
+		}
 		if (opt->combine_merges)
 			return do_diff_combined(opt, commit);
 		if (opt->separate_merges) {
diff --git a/revision.h b/revision.h
index 5578bb4720a..13178e6b8f3 100644
--- a/revision.h
+++ b/revision.h
@@ -195,7 +195,8 @@ struct rev_info {
 			combine_merges:1,
 			combined_all_paths:1,
 			dense_combined_merges:1,
-			first_parent_merges:1;
+			first_parent_merges:1,
+			remerge_diff:1;
 
 	/* Format info */
 	int		show_notes;
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
new file mode 100755
index 00000000000..1b32028e897
--- /dev/null
+++ b/t/t4069-remerge-diff.sh
@@ -0,0 +1,84 @@
+#!/bin/sh
+
+test_description='remerge-diff handling'
+
+. ./test-lib.sh
+
+test_expect_success 'setup basic merges' '
+	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git commit -m base &&
+
+	git branch feature_a &&
+	git branch feature_b &&
+	git branch feature_c &&
+
+	git branch ab_resolution &&
+	git branch bc_resolution &&
+
+	git checkout feature_a &&
+	test_write_lines 1 2 three 4 5 6 7 eight 9 >numbers &&
+	git commit -a -m change_a &&
+
+	git checkout feature_b &&
+	test_write_lines 1 2 tres 4 5 6 7 8 9 >numbers &&
+	git commit -a -m change_b &&
+
+	git checkout feature_c &&
+	test_write_lines 1 2 3 4 5 6 7 8 9 10 >numbers &&
+	git commit -a -m change_c &&
+
+	git checkout bc_resolution &&
+	git merge --ff-only feature_b &&
+	# no conflict
+	git merge feature_c &&
+
+	git checkout ab_resolution &&
+	git merge --ff-only feature_a &&
+	# conflicts!
+	test_must_fail git merge feature_b &&
+	# Resolve conflict...and make another change elsewhere
+	test_write_lines 1 2 drei 4 5 6 7 acht 9 >numbers &&
+	git add numbers &&
+	git merge --continue
+'
+
+test_expect_success 'remerge-diff on a clean merge' '
+	git log -1 --oneline bc_resolution >expect &&
+	git show --oneline --remerge-diff bc_resolution >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'remerge-diff with both a resolved conflict and an unrelated change' '
+	git log -1 --oneline ab_resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/numbers b/numbers
+	index a1fb731..6875544 100644
+	--- a/numbers
+	+++ b/numbers
+	@@ -1,13 +1,9 @@
+	 1
+	 2
+	-<<<<<<< b0ed5cb (change_a)
+	-three
+	-=======
+	-tres
+	->>>>>>> 6cd3f82 (change_b)
+	+drei
+	 4
+	 5
+	 6
+	 7
+	-eight
+	+acht
+	 9
+	EOF
+	# Hashes above are sha1; rip them out so test works with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff ab_resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
gitgitgadget



* [PATCH v3 2/9] log: clean unneeded objects during `log --remerge-diff`
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
  2021-12-30 23:36     ` [PATCH v3 1/9] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
@ 2021-12-30 23:36     ` Elijah Newren via GitGitGadget
  2021-12-30 23:36     ` [PATCH v3 3/9] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
                       ` (8 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-30 23:36 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

The --remerge-diff option will need to create new blobs and trees
representing the "automatic merge" state.  If one is traversing a
long project history, one can easily get hundreds of thousands of
loose objects generated during `log --remerge-diff`.  However, none of
those loose objects are needed after we have completed our diff
operation; they can be summarily deleted.

Add a new helper function to tmp_objdir to discard all the contained
objects, and call it after each merge is handled.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/log.c | 13 +++++++------
 log-tree.c    |  8 +++++++-
 revision.h    |  3 +++
 tmp-objdir.c  |  5 +++++
 tmp-objdir.h  |  6 ++++++
 5 files changed, 28 insertions(+), 7 deletions(-)
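
Illustration only, not part of the patch: the new helper empties the
temporary object directory but keeps the directory itself, so later
merges can keep writing into it.  The sketch below shows that idea with
plain POSIX calls.  It is an assumption-laden stand-in, not Git's
implementation: remove_tree() and discard_objects_sketch() are
hypothetical names, whereas the patch simply calls
remove_dir_recursively() with REMOVE_DIR_KEEP_TOPLEVEL.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: recursively remove everything below `dir`.
 * `dir` itself is removed only when keep_toplevel is 0.
 * Returns 0 on success, nonzero on the first failure.
 */
static int remove_tree(const char *dir, int keep_toplevel)
{
	DIR *d = opendir(dir);
	struct dirent *e;
	int ret = 0;

	if (!d)
		return -1;
	while (!ret && (e = readdir(d))) {
		char path[4096];
		struct stat st;

		if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, ".."))
			continue;
		if (snprintf(path, sizeof(path), "%s/%s", dir, e->d_name)
		    >= (int)sizeof(path))
			ret = -1;	/* path too long for this sketch */
		else if (lstat(path, &st))
			ret = -1;
		else if (S_ISDIR(st.st_mode))
			ret = remove_tree(path, 0);
		else
			ret = unlink(path);
	}
	closedir(d);
	if (!ret && !keep_toplevel)
		ret = rmdir(dir);
	return ret;
}

/* Hypothetical analogue of tmp_objdir_discard_objects(). */
static int discard_objects_sketch(const char *objdir)
{
	return remove_tree(objdir, 1);
}
```

Calling discard_objects_sketch() after each merge keeps the number of
loose objects bounded by what a single remerge creates, which is the
point of this patch.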

diff --git a/builtin/log.c b/builtin/log.c
index d053418fddd..e6a080df914 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -407,13 +407,12 @@ static int cmd_log_walk(struct rev_info *rev)
 	struct commit *commit;
 	int saved_nrl = 0;
 	int saved_dcctc = 0;
-	struct tmp_objdir *remerge_objdir = NULL;
 
 	if (rev->remerge_diff) {
-		remerge_objdir = tmp_objdir_create("remerge-diff");
-		if (!remerge_objdir)
+		rev->remerge_objdir = tmp_objdir_create("remerge-diff");
+		if (!rev->remerge_objdir)
 			die_errno(_("unable to create temporary object directory"));
-		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
+		tmp_objdir_replace_primary_odb(rev->remerge_objdir, 1);
 	}
 
 	if (rev->early_output)
@@ -458,8 +457,10 @@ static int cmd_log_walk(struct rev_info *rev)
 	rev->diffopt.no_free = 0;
 	diff_free(&rev->diffopt);
 
-	if (rev->remerge_diff)
-		tmp_objdir_destroy(remerge_objdir);
+	if (rev->remerge_diff) {
+		tmp_objdir_destroy(rev->remerge_objdir);
+		rev->remerge_objdir = NULL;
+	}
 
 	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
 	    rev->diffopt.flags.check_failed) {
diff --git a/log-tree.c b/log-tree.c
index 84ed864fc81..d4655b63d75 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -4,6 +4,7 @@
 #include "diff.h"
 #include "object-store.h"
 #include "repository.h"
+#include "tmp-objdir.h"
 #include "commit.h"
 #include "tag.h"
 #include "graph.h"
@@ -944,7 +945,12 @@ static int do_remerge_diff(struct rev_info *opt,
 	strbuf_release(&parent1_desc);
 	strbuf_release(&parent2_desc);
 	merge_finalize(&o, &res);
-	/* TODO: clean up the temporary object directory */
+
+	/* Clean up the contents of the temporary object directory */
+	if (opt->remerge_objdir)
+		tmp_objdir_discard_objects(opt->remerge_objdir);
+	else
+		BUG("unable to remove temporary object directory");
 
 	return !opt->loginfo;
 }
diff --git a/revision.h b/revision.h
index 13178e6b8f3..44efce3f410 100644
--- a/revision.h
+++ b/revision.h
@@ -318,6 +318,9 @@ struct rev_info {
 
 	/* misc. flags related to '--no-kept-objects' */
 	unsigned keep_pack_cache_flags;
+
+	/* Location where temporary objects for remerge-diff are written. */
+	struct tmp_objdir *remerge_objdir;
 };
 
 int ref_excluded(struct string_list *, const char *path);
diff --git a/tmp-objdir.c b/tmp-objdir.c
index 3d38eeab66b..adf6033549e 100644
--- a/tmp-objdir.c
+++ b/tmp-objdir.c
@@ -79,6 +79,11 @@ static void remove_tmp_objdir_on_signal(int signo)
 	raise(signo);
 }
 
+void tmp_objdir_discard_objects(struct tmp_objdir *t)
+{
+	remove_dir_recursively(&t->path, REMOVE_DIR_KEEP_TOPLEVEL);
+}
+
 /*
  * These env_* functions are for setting up the child environment; the
  * "replace" variant overrides the value of any existing variable with that
diff --git a/tmp-objdir.h b/tmp-objdir.h
index cda5ec76778..76efc7edee5 100644
--- a/tmp-objdir.h
+++ b/tmp-objdir.h
@@ -46,6 +46,12 @@ int tmp_objdir_migrate(struct tmp_objdir *);
  */
 int tmp_objdir_destroy(struct tmp_objdir *);
 
+/*
+ * Remove all objects from the temporary object directory, while leaving it
+ * around so more objects can be added.
+ */
+void tmp_objdir_discard_objects(struct tmp_objdir *);
+
 /*
  * Add the temporary object directory as an alternate object store in the
  * current process.
-- 
gitgitgadget



* [PATCH v3 3/9] ll-merge: make callers responsible for showing warnings
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
  2021-12-30 23:36     ` [PATCH v3 1/9] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
  2021-12-30 23:36     ` [PATCH v3 2/9] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
@ 2021-12-30 23:36     ` Elijah Newren via GitGitGadget
  2022-01-19 16:41       ` Ævar Arnfjörð Bjarmason
  2021-12-30 23:36     ` [PATCH v3 4/9] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
                       ` (7 subsequent siblings)
  10 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-30 23:36 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Since some callers may want to send warning messages to somewhere other
than stdout/stderr, stop printing "warning: Cannot merge binary files"
from ll-merge and instead modify the return status of ll_merge() to
indicate when a merge of binary files has occurred.  Message printing
probably does not belong in a "low-level merge" anyway.

This commit continues printing the message as-is, just from the callers
instead of within ll_merge().  Future changes will start handling the
message differently in the merge-ort codepath.

There was one special case here: the callers in rerere.c do NOT check
for and print such a message; since those code paths explicitly skip
over binary files, there is no reason to check for a return status of
LL_MERGE_BINARY_CONFLICT or print the related message.

Note that my methodology included first modifying ll_merge() to return
a struct, so that the compiler would catch all the callers for me and
ensure I had modified all of them.  After modifying all of them, I then
changed the struct to an enum.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 apply.c            |  5 ++++-
 builtin/checkout.c | 12 ++++++++----
 ll-merge.c         | 40 ++++++++++++++++++++++------------------
 ll-merge.h         |  9 ++++++++-
 merge-blobs.c      |  5 ++++-
 merge-ort.c        |  5 ++++-
 merge-recursive.c  |  5 ++++-
 notes-merge.c      |  5 ++++-
 rerere.c           |  9 +++++----
 9 files changed, 63 insertions(+), 32 deletions(-)
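
Illustration only, not part of the patch: a minimal standalone sketch
of the caller-side pattern this commit introduces, where the low-level
merge only reports a status and the caller decides what to do with the
"Cannot merge binary files" message.  The enum mirrors the one added to
ll-merge.h below.  fake_ll_merge() and handle_merge_status() are
hypothetical names made up for this sketch; the real ll_merge() takes
buffers, labels, and options rather than two flags.

```c
#include <assert.h>
#include <stdio.h>

/* Mirrors the enum this patch adds to ll-merge.h. */
enum ll_merge_result {
	LL_MERGE_ERROR = -1,
	LL_MERGE_OK = 0,
	LL_MERGE_CONFLICT,
	LL_MERGE_BINARY_CONFLICT,
};

/* Hypothetical stand-in for ll_merge(): reports a status, prints nothing. */
static enum ll_merge_result fake_ll_merge(int is_binary, int sides_differ)
{
	if (!sides_differ)
		return LL_MERGE_OK;
	return is_binary ? LL_MERGE_BINARY_CONFLICT : LL_MERGE_CONFLICT;
}

/*
 * Hypothetical caller: writes the warning into a caller-owned buffer
 * instead of stderr, so the caller controls where it ends up.
 * Returns -1 on error, 0 for a clean merge, 1 for any conflict.
 */
static int handle_merge_status(enum ll_merge_result status, const char *path,
			       char *msg, size_t msglen)
{
	assert(msg && msglen > 0);
	msg[0] = '\0';
	if (status == LL_MERGE_BINARY_CONFLICT)
		snprintf(msg, msglen,
			 "Cannot merge binary files: %s (%s vs. %s)",
			 path, "ours", "theirs");
	if (status < 0)
		return -1;
	return status != LL_MERGE_OK;
}
```

Existing "status < 0" error checks in callers keep working because
LL_MERGE_ERROR stays negative and LL_MERGE_OK stays 0; only callers
that want the binary-file warning need the extra comparison.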

diff --git a/apply.c b/apply.c
index 43a0aebf4ee..8079395755f 100644
--- a/apply.c
+++ b/apply.c
@@ -3492,7 +3492,7 @@ static int three_way_merge(struct apply_state *state,
 {
 	mmfile_t base_file, our_file, their_file;
 	mmbuffer_t result = { NULL };
-	int status;
+	enum ll_merge_result status;
 
 	/* resolve trivial cases first */
 	if (oideq(base, ours))
@@ -3509,6 +3509,9 @@ static int three_way_merge(struct apply_state *state,
 			  &their_file, "theirs",
 			  state->repo->index,
 			  NULL);
+	if (status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, "ours", "theirs");
 	free(base_file.ptr);
 	free(our_file.ptr);
 	free(their_file.ptr);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index cbf73b8c9f6..3a559d69303 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -237,6 +237,7 @@ static int checkout_merged(int pos, const struct checkout *state,
 	struct cache_entry *ce = active_cache[pos];
 	const char *path = ce->name;
 	mmfile_t ancestor, ours, theirs;
+	enum ll_merge_result merge_status;
 	int status;
 	struct object_id oid;
 	mmbuffer_t result_buf;
@@ -267,13 +268,16 @@ static int checkout_merged(int pos, const struct checkout *state,
 	memset(&ll_opts, 0, sizeof(ll_opts));
 	git_config_get_bool("merge.renormalize", &renormalize);
 	ll_opts.renormalize = renormalize;
-	status = ll_merge(&result_buf, path, &ancestor, "base",
-			  &ours, "ours", &theirs, "theirs",
-			  state->istate, &ll_opts);
+	merge_status = ll_merge(&result_buf, path, &ancestor, "base",
+				&ours, "ours", &theirs, "theirs",
+				state->istate, &ll_opts);
 	free(ancestor.ptr);
 	free(ours.ptr);
 	free(theirs.ptr);
-	if (status < 0 || !result_buf.ptr) {
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, "ours", "theirs");
+	if (merge_status < 0 || !result_buf.ptr) {
 		free(result_buf.ptr);
 		return error(_("path '%s': cannot merge"), path);
 	}
diff --git a/ll-merge.c b/ll-merge.c
index 261657578c7..a937cec59a6 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -14,7 +14,7 @@
 
 struct ll_merge_driver;
 
-typedef int (*ll_merge_fn)(const struct ll_merge_driver *,
+typedef enum ll_merge_result (*ll_merge_fn)(const struct ll_merge_driver *,
 			   mmbuffer_t *result,
 			   const char *path,
 			   mmfile_t *orig, const char *orig_name,
@@ -49,7 +49,7 @@ void reset_merge_attributes(void)
 /*
  * Built-in low-levels
  */
-static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_binary_merge(const struct ll_merge_driver *drv_unused,
 			   mmbuffer_t *result,
 			   const char *path,
 			   mmfile_t *orig, const char *orig_name,
@@ -58,6 +58,7 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 			   const struct ll_merge_options *opts,
 			   int marker_size)
 {
+	enum ll_merge_result ret;
 	mmfile_t *stolen;
 	assert(opts);
 
@@ -68,16 +69,19 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 	 */
 	if (opts->virtual_ancestor) {
 		stolen = orig;
+		ret = LL_MERGE_OK;
 	} else {
 		switch (opts->variant) {
 		default:
-			warning("Cannot merge binary files: %s (%s vs. %s)",
-				path, name1, name2);
-			/* fallthru */
+			ret = LL_MERGE_BINARY_CONFLICT;
+			stolen = src1;
+			break;
 		case XDL_MERGE_FAVOR_OURS:
+			ret = LL_MERGE_OK;
 			stolen = src1;
 			break;
 		case XDL_MERGE_FAVOR_THEIRS:
+			ret = LL_MERGE_OK;
 			stolen = src2;
 			break;
 		}
@@ -87,16 +91,10 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 	result->size = stolen->size;
 	stolen->ptr = NULL;
 
-	/*
-	 * With -Xtheirs or -Xours, we have cleanly merged;
-	 * otherwise we got a conflict.
-	 */
-	return opts->variant == XDL_MERGE_FAVOR_OURS ||
-	       opts->variant == XDL_MERGE_FAVOR_THEIRS ?
-	       0 : 1;
+	return ret;
 }
 
-static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 			mmbuffer_t *result,
 			const char *path,
 			mmfile_t *orig, const char *orig_name,
@@ -105,7 +103,9 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 			const struct ll_merge_options *opts,
 			int marker_size)
 {
+	enum ll_merge_result ret;
 	xmparam_t xmp;
+	int status;
 	assert(opts);
 
 	if (orig->size > MAX_XDIFF_SIZE ||
@@ -133,10 +133,12 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 	xmp.ancestor = orig_name;
 	xmp.file1 = name1;
 	xmp.file2 = name2;
-	return xdl_merge(orig, src1, src2, &xmp, result);
+	status = xdl_merge(orig, src1, src2, &xmp, result);
+	ret = (status > 0) ? LL_MERGE_CONFLICT : status;
+	return ret;
 }
 
-static int ll_union_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_union_merge(const struct ll_merge_driver *drv_unused,
 			  mmbuffer_t *result,
 			  const char *path,
 			  mmfile_t *orig, const char *orig_name,
@@ -178,7 +180,7 @@ static void create_temp(mmfile_t *src, char *path, size_t len)
 /*
  * User defined low-level merge driver support.
  */
-static int ll_ext_merge(const struct ll_merge_driver *fn,
+static enum ll_merge_result ll_ext_merge(const struct ll_merge_driver *fn,
 			mmbuffer_t *result,
 			const char *path,
 			mmfile_t *orig, const char *orig_name,
@@ -194,6 +196,7 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 	const char *args[] = { NULL, NULL };
 	int status, fd, i;
 	struct stat st;
+	enum ll_merge_result ret;
 	assert(opts);
 
 	sq_quote_buf(&path_sq, path);
@@ -236,7 +239,8 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 		unlink_or_warn(temp[i]);
 	strbuf_release(&cmd);
 	strbuf_release(&path_sq);
-	return status;
+	ret = (status > 0) ? LL_MERGE_CONFLICT : status;
+	return ret;
 }
 
 /*
@@ -362,7 +366,7 @@ static void normalize_file(mmfile_t *mm, const char *path, struct index_state *i
 	}
 }
 
-int ll_merge(mmbuffer_t *result_buf,
+enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
 	     const char *path,
 	     mmfile_t *ancestor, const char *ancestor_label,
 	     mmfile_t *ours, const char *our_label,
diff --git a/ll-merge.h b/ll-merge.h
index aceb1b24132..e4a20e81a3a 100644
--- a/ll-merge.h
+++ b/ll-merge.h
@@ -82,13 +82,20 @@ struct ll_merge_options {
 	long xdl_opts;
 };
 
+enum ll_merge_result {
+	LL_MERGE_ERROR = -1,
+	LL_MERGE_OK = 0,
+	LL_MERGE_CONFLICT,
+	LL_MERGE_BINARY_CONFLICT,
+};
+
 /**
  * Perform a three-way single-file merge in core.  This is a thin wrapper
  * around `xdl_merge` that takes the path and any merge backend specified in
  * `.gitattributes` or `.git/info/attributes` into account.
  * Returns 0 for a clean merge.
  */
-int ll_merge(mmbuffer_t *result_buf,
+enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
 	     const char *path,
 	     mmfile_t *ancestor, const char *ancestor_label,
 	     mmfile_t *ours, const char *our_label,
diff --git a/merge-blobs.c b/merge-blobs.c
index ee0a0e90c94..8138090f81c 100644
--- a/merge-blobs.c
+++ b/merge-blobs.c
@@ -36,7 +36,7 @@ static void *three_way_filemerge(struct index_state *istate,
 				 mmfile_t *their,
 				 unsigned long *size)
 {
-	int merge_status;
+	enum ll_merge_result merge_status;
 	mmbuffer_t res;
 
 	/*
@@ -50,6 +50,9 @@ static void *three_way_filemerge(struct index_state *istate,
 				istate, NULL);
 	if (merge_status < 0)
 		return NULL;
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, ".our", ".their");
 
 	*size = res.size;
 	return res.ptr;
diff --git a/merge-ort.c b/merge-ort.c
index 0342f104836..c24da2ba3cb 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1743,7 +1743,7 @@ static int merge_3way(struct merge_options *opt,
 	mmfile_t orig, src1, src2;
 	struct ll_merge_options ll_opts = {0};
 	char *base, *name1, *name2;
-	int merge_status;
+	enum ll_merge_result merge_status;
 
 	if (!opt->priv->attr_index.initialized)
 		initialize_attr_index(opt);
@@ -1787,6 +1787,9 @@ static int merge_3way(struct merge_options *opt,
 	merge_status = ll_merge(result_buf, path, &orig, base,
 				&src1, name1, &src2, name2,
 				&opt->priv->attr_index, &ll_opts);
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/merge-recursive.c b/merge-recursive.c
index d9457797dbb..bc73c52dd84 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1044,7 +1044,7 @@ static int merge_3way(struct merge_options *opt,
 	mmfile_t orig, src1, src2;
 	struct ll_merge_options ll_opts = {0};
 	char *base, *name1, *name2;
-	int merge_status;
+	enum ll_merge_result merge_status;
 
 	ll_opts.renormalize = opt->renormalize;
 	ll_opts.extra_marker_size = extra_marker_size;
@@ -1090,6 +1090,9 @@ static int merge_3way(struct merge_options *opt,
 	merge_status = ll_merge(result_buf, a->path, &orig, base,
 				&src1, name1, &src2, name2,
 				opt->repo->index, &ll_opts);
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			a->path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/notes-merge.c b/notes-merge.c
index b4a3a903e86..01d596920ea 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -344,7 +344,7 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
 {
 	mmbuffer_t result_buf;
 	mmfile_t base, local, remote;
-	int status;
+	enum ll_merge_result status;
 
 	read_mmblob(&base, &p->base);
 	read_mmblob(&local, &p->local);
@@ -358,6 +358,9 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
 	free(local.ptr);
 	free(remote.ptr);
 
+	if (status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			oid_to_hex(&p->obj), o->local_ref, o->remote_ref);
 	if ((status < 0) || !result_buf.ptr)
 		die("Failed to execute internal merge");
 
diff --git a/rerere.c b/rerere.c
index d83d58df4fb..d26627c5932 100644
--- a/rerere.c
+++ b/rerere.c
@@ -609,19 +609,20 @@ static int try_merge(struct index_state *istate,
 		     const struct rerere_id *id, const char *path,
 		     mmfile_t *cur, mmbuffer_t *result)
 {
-	int ret;
+	enum ll_merge_result ret;
 	mmfile_t base = {NULL, 0}, other = {NULL, 0};
 
 	if (read_mmfile(&base, rerere_path(id, "preimage")) ||
-	    read_mmfile(&other, rerere_path(id, "postimage")))
-		ret = 1;
-	else
+	    read_mmfile(&other, rerere_path(id, "postimage"))) {
+		ret = LL_MERGE_CONFLICT;
+	} else {
 		/*
 		 * A three-way merge. Note that this honors user-customizable
 		 * low-level merge driver settings.
 		 */
 		ret = ll_merge(result, path, &base, NULL, cur, "", &other, "",
 			       istate, NULL);
+	}
 
 	free(base.ptr);
 	free(other.ptr);
-- 
gitgitgadget



* [PATCH v3 4/9] merge-ort: capture and print ll-merge warnings in our preferred fashion
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
                       ` (2 preceding siblings ...)
  2021-12-30 23:36     ` [PATCH v3 3/9] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
@ 2021-12-30 23:36     ` Elijah Newren via GitGitGadget
  2021-12-30 23:36     ` [PATCH v3 5/9] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
                       ` (6 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-30 23:36 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Instead of immediately printing ll-merge warnings to stderr, we save
them in our output strbuf.  Besides allowing us to move these warnings
to a special file for --remerge-diff, this has two other benefits for
regular merges done by merge-ort:

  * The deferral of messages ensures we can print all messages about
    any given path together (merge-recursive was known to sometimes
    intersperse messages about other paths, particularly when renames
    were involved).

  * The deferral of messages means we can avoid printing spurious
    conflict messages when we just end up aborting due to local user
    modifications in the way.  (In contrast to merge-recursive.c which
    prematurely checks for local modifications in the way via
    unpack_trees() and gets the check wrong both in terms of false
    positives and false negatives relative to renames, merge-ort does
    not perform the local modifications in the way check until the
    checkout() step after the full merge has been computed.)
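
As a rough illustration of why deferral helps with grouping, messages can
be collected as they are produced and only ordered by path at display
time.  This is a standalone sketch, not merge-ort's implementation (which
keeps one strbuf per path and appends to it from path_msg()); the names
struct path_note and group_notes_by_path() are hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one deferred message about one path */
struct path_note {
	const char *path;
	const char *msg;
	size_t seq;	/* arrival order, filled in by group_notes_by_path() */
};

/* Order by path first, then by arrival order so the sort is stable */
static int cmp_note(const void *a, const void *b)
{
	const struct path_note *x = a, *y = b;
	int c = strcmp(x->path, y->path);

	if (c)
		return c;
	return (x->seq > y->seq) - (x->seq < y->seq);
}

/* Reorder notes so that all messages about a given path are adjacent */
void group_notes_by_path(struct path_note *notes, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		notes[i].seq = i;
	qsort(notes, nr, sizeof(*notes), cmp_note);
}
```

Because nothing is printed until the whole set is known, a caller that
ends up aborting can simply discard the notes instead of displaying them.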

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c                | 5 +++--
 t/t6404-recursive-merge.sh | 9 +++++++--
 t/t6406-merge-attr.sh      | 9 +++++++--
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index c24da2ba3cb..a18f47e23c5 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1788,8 +1788,9 @@ static int merge_3way(struct merge_options *opt,
 				&src1, name1, &src2, name2,
 				&opt->priv->attr_index, &ll_opts);
 	if (merge_status == LL_MERGE_BINARY_CONFLICT)
-		warning("Cannot merge binary files: %s (%s vs. %s)",
-			path, name1, name2);
+		path_msg(opt, path, 0,
+			 "warning: Cannot merge binary files: %s (%s vs. %s)",
+			 path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/t/t6404-recursive-merge.sh b/t/t6404-recursive-merge.sh
index eaf48e941e2..b8735c6db4d 100755
--- a/t/t6404-recursive-merge.sh
+++ b/t/t6404-recursive-merge.sh
@@ -108,8 +108,13 @@ test_expect_success 'refuse to merge binary files' '
 	printf "\0\0" >binary-file &&
 	git add binary-file &&
 	git commit -m binary2 &&
-	test_must_fail git merge F >merge.out 2>merge.err &&
-	grep "Cannot merge binary files: binary-file (HEAD vs. F)" merge.err
+	if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+	then
+		test_must_fail git merge F >merge_output
+	else
+		test_must_fail git merge F 2>merge_output
+	fi &&
+	grep "Cannot merge binary files: binary-file (HEAD vs. F)" merge_output
 '
 
 test_expect_success 'mark rename/delete as unmerged' '
diff --git a/t/t6406-merge-attr.sh b/t/t6406-merge-attr.sh
index 84946458371..c41584eb33e 100755
--- a/t/t6406-merge-attr.sh
+++ b/t/t6406-merge-attr.sh
@@ -221,8 +221,13 @@ test_expect_success 'binary files with union attribute' '
 	printf "two\0" >bin.txt &&
 	git commit -am two &&
 
-	test_must_fail git merge bin-main 2>stderr &&
-	grep -i "warning.*cannot merge.*HEAD vs. bin-main" stderr
+	if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+	then
+		test_must_fail git merge bin-main >output
+	else
+		test_must_fail git merge bin-main 2>output
+	fi &&
+	grep -i "warning.*cannot merge.*HEAD vs. bin-main" output
 '
 
 test_done
-- 
gitgitgadget



* [PATCH v3 5/9] merge-ort: mark a few more conflict messages as omittable
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
                       ` (3 preceding siblings ...)
  2021-12-30 23:36     ` [PATCH v3 4/9] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
@ 2021-12-30 23:36     ` Elijah Newren via GitGitGadget
  2021-12-30 23:36     ` [PATCH v3 6/9] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
                       ` (5 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-30 23:36 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

path_msg() has the ability to mark messages as omittable, designed for
remerge-diff where we'll instead be showing conflict messages as diff
headers for a subsequent diff.  While all these messages are very useful
when trying to create a merge initially, early use with the
--remerge-diff feature (the only user of this omittable conflict message
capability) suggests that the particular messages marked in this commit
are just noise when trying to see what changes users made to create a
merge commit.  Mark them as omittable.

Note that there were already a few messages marked as omittable in
merge-ort when doing a remerge-diff, because the development of
--remerge-diff preceded the upstreaming of merge-ort and I was trying to
ensure merge-ort could handle all the necessary requirements.  See
commit c5a6f65527 ("merge-ort: add modify/delete handling and delayed
output processing", 2020-12-03) for the initial details.  For some
examples of already-marked-as-omittable messages, see either
"Auto-merging <path>" or some of the submodule update hints.  This
commit just adds two more messages that should also be omittable.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index a18f47e23c5..998e92ec593 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -2420,7 +2420,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
 		 */
 		ci->path_conflict = 1;
 		if (pair->status == 'A')
-			path_msg(opt, new_path, 0,
+			path_msg(opt, new_path, 1,
 				 _("CONFLICT (file location): %s added in %s "
 				   "inside a directory that was renamed in %s, "
 				   "suggesting it should perhaps be moved to "
@@ -2428,7 +2428,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
 				 old_path, branch_with_new_path,
 				 branch_with_dir_rename, new_path);
 		else
-			path_msg(opt, new_path, 0,
+			path_msg(opt, new_path, 1,
 				 _("CONFLICT (file location): %s renamed to %s "
 				   "in %s, inside a directory that was renamed "
 				   "in %s, suggesting it should perhaps be "
-- 
gitgitgadget



* [PATCH v3 6/9] merge-ort: format messages slightly different for use in headers
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
                       ` (4 preceding siblings ...)
  2021-12-30 23:36     ` [PATCH v3 5/9] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
@ 2021-12-30 23:36     ` Elijah Newren via GitGitGadget
  2021-12-30 23:36     ` [PATCH v3 7/9] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
                       ` (4 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-30 23:36 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When users run
    git show --remerge-diff $MERGE_COMMIT
or
    git log -p --remerge-diff ...
stdout is not an appropriate location to dump conflict messages, but we
do want to provide them to users.  We will include them in the diff
headers instead, but for that to work, each newline in a multiline
message must be replaced with a newline followed by a space.  Add a new
flag to signal when we want these messages modified in such a fashion,
and use it in path_msg() to modify these messages this way.  Also, allow
a special prefix to be specified for these headers.
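
The newline handling can be sketched outside of git as follows.  This is
a minimal standalone illustration under stated assumptions, not the code
from this patch: format_as_header() is a hypothetical helper, and it uses
plain malloc() rather than the strbuf API.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): return a newly allocated copy
 * of msg, preceded by "<prefix> " when prefix is non-NULL, in which
 * every newline is followed by a single space.  The caller frees the
 * result.  Returns NULL if allocation fails.
 */
char *format_as_header(const char *prefix, const char *msg)
{
	size_t prefix_len = prefix ? strlen(prefix) + 1 : 0;
	size_t msg_len = strlen(msg);
	/* Worst case: every byte of msg is a newline and doubles in size */
	char *out = malloc(prefix_len + 2 * msg_len + 1);
	size_t i, o = 0;

	if (!out)
		return NULL;
	if (prefix) {
		memcpy(out, prefix, prefix_len - 1);
		out[prefix_len - 1] = ' ';
		o = prefix_len;
	}
	for (i = 0; i < msg_len; i++) {
		out[o++] = msg[i];
		/* If we copied a newline, follow it with a space */
		if (msg[i] == '\n')
			out[o++] = ' ';
	}
	out[o] = '\0';
	return out;
}
```

Using a single output index for every write keeps the copied character
and the added space at consecutive offsets.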

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c       | 42 ++++++++++++++++++++++++++++++++++++++++--
 merge-recursive.c |  4 ++++
 merge-recursive.h |  2 ++
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 998e92ec593..481305d2bcf 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -634,17 +634,49 @@ static void path_msg(struct merge_options *opt,
 		     const char *fmt, ...)
 {
 	va_list ap;
-	struct strbuf *sb = strmap_get(&opt->priv->output, path);
+	struct strbuf *sb, *dest;
+	struct strbuf tmp = STRBUF_INIT;
+
+	if (opt->record_conflict_msgs_as_headers && omittable_hint)
+		return; /* Do not record mere hints in tree */
+	sb = strmap_get(&opt->priv->output, path);
 	if (!sb) {
 		sb = xmalloc(sizeof(*sb));
 		strbuf_init(sb, 0);
 		strmap_put(&opt->priv->output, path, sb);
 	}
 
+	dest = (opt->record_conflict_msgs_as_headers ? &tmp : sb);
+
 	va_start(ap, fmt);
-	strbuf_vaddf(sb, fmt, ap);
+	strbuf_vaddf(dest, fmt, ap);
 	va_end(ap);
 
+	if (opt->record_conflict_msgs_as_headers) {
+		int i_sb = 0, i_tmp = 0;
+
+		/* Start with the specified prefix */
+		if (opt->msg_header_prefix)
+			strbuf_addf(sb, "%s ", opt->msg_header_prefix);
+
+		/* Copy tmp to sb, adding spaces after newlines */
+		strbuf_grow(sb, sb->len + 2*tmp.len); /* more than sufficient */
+		for (; i_tmp < tmp.len; i_tmp++, i_sb++) {
+			/* Copy next character from tmp to sb */
+			sb->buf[sb->len + i_sb] = tmp.buf[i_tmp];
+
+			/* If we copied a newline, add a space */
+			if (tmp.buf[i_tmp] == '\n')
+				sb->buf[sb->len + ++i_sb] = ' ';
+		}
+		/* Update length and ensure it's NUL-terminated */
+		sb->len += i_sb;
+		sb->buf[sb->len] = '\0';
+
+		strbuf_release(&tmp);
+	}
+
+	/* Add final newline character to sb */
 	strbuf_addch(sb, '\n');
 }
 
@@ -4246,6 +4278,9 @@ void merge_switch_to_result(struct merge_options *opt,
 		struct string_list olist = STRING_LIST_INIT_NODUP;
 		int i;
 
+		if (opt->record_conflict_msgs_as_headers)
+			BUG("Either display conflict messages or record them as headers, not both");
+
 		trace2_region_enter("merge", "display messages", opt->repo);
 
 		/* Hack to pre-allocate olist to the desired size */
@@ -4347,6 +4382,9 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
 	assert(opt->recursive_variant >= MERGE_VARIANT_NORMAL &&
 	       opt->recursive_variant <= MERGE_VARIANT_THEIRS);
 
+	if (opt->msg_header_prefix)
+		assert(opt->record_conflict_msgs_as_headers);
+
 	/*
 	 * detect_renames, verbosity, buffer_output, and obuf are ignored
 	 * fields that were used by "recursive" rather than "ort" -- but
diff --git a/merge-recursive.c b/merge-recursive.c
index bc73c52dd84..9ec1e6d043a 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3714,6 +3714,10 @@ static int merge_start(struct merge_options *opt, struct tree *head)
 
 	assert(opt->priv == NULL);
 
+	/* Not supported; option specific to merge-ort */
+	assert(!opt->record_conflict_msgs_as_headers);
+	assert(!opt->msg_header_prefix);
+
 	/* Sanity check on repo state; index must match head */
 	if (repo_index_has_changes(opt->repo, head, &sb)) {
 		err(opt, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
diff --git a/merge-recursive.h b/merge-recursive.h
index 0795a1d3ec1..b88000e3c25 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -46,6 +46,8 @@ struct merge_options {
 	/* miscellaneous control options */
 	const char *subtree_shift;
 	unsigned renormalize : 1;
+	unsigned record_conflict_msgs_as_headers : 1;
+	const char *msg_header_prefix;
 
 	/* internal fields used by the implementation */
 	struct merge_options_internal *priv;
-- 
gitgitgadget



* [PATCH v3 7/9] diff: add ability to insert additional headers for paths
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
                       ` (5 preceding siblings ...)
  2021-12-30 23:36     ` [PATCH v3 6/9] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
@ 2021-12-30 23:36     ` Elijah Newren via GitGitGadget
  2021-12-30 23:36     ` [PATCH v3 8/9] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
                       ` (3 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-30 23:36 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When additional headers are provided, we need to
  * add diff_filepairs to diff_queued_diff for each path in the
    additional headers map, unless that path is part of
    another diff_filepair already found in diff_queued_diff
  * format the headers (colorization, line_prefix for --graph)
  * make sure the various codepaths that attempt to return early
    if there are "no changes" take into account the headers that
    need to be shown.
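
The header formatting step can be sketched in isolation as below.  This
is only an illustration of the per-line wrapping, assuming a fixed-size
output buffer; format_header_lines() is a hypothetical name and it does
not use git's strbuf or color machinery.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not git's API): write each line of headers into
 * out, wrapped as "<line_prefix><meta><line><reset>\n".  Returns the
 * number of lines written, or -1 if out (of size out_size) is too small.
 */
int format_header_lines(char *out, size_t out_size, const char *headers,
			const char *line_prefix, const char *meta,
			const char *reset)
{
	const char *next = headers;
	size_t used = 0;
	int lines = 0;

	if (out_size)
		out[0] = '\0';
	while (*next) {
		const char *newline = strchr(next, '\n');
		size_t len = newline ? (size_t)(newline - next) : strlen(next);
		int n = snprintf(out + used, out_size - used, "%s%s%.*s%s\n",
				 line_prefix, meta, (int)len, next, reset);

		if (n < 0 || (size_t)n >= out_size - used)
			return -1;
		used += (size_t)n;
		lines++;
		/* Step past the newline, if any, to the start of the next line */
		next += len + (newline ? 1 : 0);
	}
	return lines;
}
```

Each header line gets its own prefix and color reset, so a multi-line
message stays aligned under --graph style line prefixes.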

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 diff.c     | 116 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 diff.h     |   3 +-
 log-tree.c |   2 +-
 3 files changed, 115 insertions(+), 6 deletions(-)

diff --git a/diff.c b/diff.c
index 861282db1c3..aaa6a19f158 100644
--- a/diff.c
+++ b/diff.c
@@ -27,6 +27,7 @@
 #include "help.h"
 #include "promisor-remote.h"
 #include "dir.h"
+#include "strmap.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
@@ -3406,6 +3407,31 @@ struct userdiff_driver *get_textconv(struct repository *r,
 	return userdiff_get_textconv(r, one->driver);
 }
 
+static struct strbuf *additional_headers(struct diff_options *o,
+					 const char *path)
+{
+	if (!o->additional_path_headers)
+		return NULL;
+	return strmap_get(o->additional_path_headers, path);
+}
+
+static void add_formatted_headers(struct strbuf *msg,
+				  struct strbuf *more_headers,
+				  const char *line_prefix,
+				  const char *meta,
+				  const char *reset)
+{
+	char *next, *newline;
+
+	for (next = more_headers->buf; *next; next = newline) {
+		newline = strchrnul(next, '\n');
+		strbuf_addf(msg, "%s%s%.*s%s\n", line_prefix, meta,
+			    (int)(newline - next), next, reset);
+		if (*newline)
+			newline++;
+	}
+}
+
 static void builtin_diff(const char *name_a,
 			 const char *name_b,
 			 struct diff_filespec *one,
@@ -3464,6 +3490,17 @@ static void builtin_diff(const char *name_a,
 	b_two = quote_two(b_prefix, name_b + (*name_b == '/'));
 	lbl[0] = DIFF_FILE_VALID(one) ? a_one : "/dev/null";
 	lbl[1] = DIFF_FILE_VALID(two) ? b_two : "/dev/null";
+	if (!DIFF_FILE_VALID(one) && !DIFF_FILE_VALID(two)) {
+		/*
+		 * We should only reach this point for pairs from
+		 * create_filepairs_for_header_only_notifications().  For
+		 * these, we should avoid the "/dev/null" special casing
+		 * above, meaning we avoid showing such pairs as either
+		 * "new file" or "deleted file" below.
+		 */
+		lbl[0] = a_one;
+		lbl[1] = b_two;
+	}
 	strbuf_addf(&header, "%s%sdiff --git %s %s%s\n", line_prefix, meta, a_one, b_two, reset);
 	if (lbl[0][0] == '/') {
 		/* /dev/null */
@@ -4328,6 +4365,7 @@ static void fill_metainfo(struct strbuf *msg,
 	const char *set = diff_get_color(use_color, DIFF_METAINFO);
 	const char *reset = diff_get_color(use_color, DIFF_RESET);
 	const char *line_prefix = diff_line_prefix(o);
+	struct strbuf *more_headers = NULL;
 
 	*must_show_header = 1;
 	strbuf_init(msg, PATH_MAX * 2 + 300);
@@ -4364,6 +4402,11 @@ static void fill_metainfo(struct strbuf *msg,
 	default:
 		*must_show_header = 0;
 	}
+	if ((more_headers = additional_headers(o, name))) {
+		add_formatted_headers(msg, more_headers,
+				      line_prefix, set, reset);
+		*must_show_header = 1;
+	}
 	if (one && two && !oideq(&one->oid, &two->oid)) {
 		const unsigned hexsz = the_hash_algo->hexsz;
 		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
@@ -5852,12 +5895,22 @@ int diff_unmodified_pair(struct diff_filepair *p)
 
 static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o)
 {
-	if (diff_unmodified_pair(p))
+	/*
+	 * Check if we can return early without showing a diff.  Note that
+	 * diff_filepair only stores {oid, path, mode, is_valid}
+	 * information for each path, and thus diff_unmodified_pair() only
+	 * considers those bits of info.  However, we do not want pairs
+	 * created by create_filepairs_for_header_only_notifications() to
+	 * be ignored, so return early if both p is unmodified AND
+	 * p->one->path is not in additional headers.
+	 */
+	if (diff_unmodified_pair(p) && !additional_headers(o, p->one->path))
 		return;
 
+	/* Actually, we can also return early to avoid showing tree diffs */
 	if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) ||
 	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
-		return; /* no tree diffs in patch format */
+		return;
 
 	run_diff(p, o);
 }
@@ -5888,10 +5941,14 @@ static void diff_flush_checkdiff(struct diff_filepair *p,
 	run_checkdiff(p, o);
 }
 
-int diff_queue_is_empty(void)
+int diff_queue_is_empty(struct diff_options *o)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	int i;
+
+	if (o->additional_path_headers &&
+	    !strmap_empty(o->additional_path_headers))
+		return 0;
 	for (i = 0; i < q->nr; i++)
 		if (!diff_unmodified_pair(q->queue[i]))
 			return 0;
@@ -6325,6 +6382,54 @@ void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc)
 		warning(_(rename_limit_advice), varname, needed);
 }
 
+static void create_filepairs_for_header_only_notifications(struct diff_options *o)
+{
+	struct strset present;
+	struct diff_queue_struct *q = &diff_queued_diff;
+	struct hashmap_iter iter;
+	struct strmap_entry *e;
+	int i;
+
+	strset_init_with_options(&present, /*pool*/ NULL, /*strdup*/ 0);
+
+	/*
+	 * Find out which paths exist in diff_queued_diff, preferring
+	 * one->path for any pair that has multiple paths.
+	 */
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p = q->queue[i];
+		char *path = p->one->path ? p->one->path : p->two->path;
+
+		if (strmap_contains(o->additional_path_headers, path))
+			strset_add(&present, path);
+	}
+
+	/*
+	 * Loop over paths in additional_path_headers; for each NOT already
+	 * in diff_queued_diff, create a synthetic filepair and insert that
+	 * into diff_queued_diff.
+	 */
+	strmap_for_each_entry(o->additional_path_headers, &iter, e) {
+		if (!strset_contains(&present, e->key)) {
+			struct diff_filespec *one, *two;
+			struct diff_filepair *p;
+
+			one = alloc_filespec(e->key);
+			two = alloc_filespec(e->key);
+			fill_filespec(one, null_oid(), 0, 0);
+			fill_filespec(two, null_oid(), 0, 0);
+			p = diff_queue(q, one, two);
+			p->status = DIFF_STATUS_MODIFIED;
+		}
+	}
+
+	/* Re-sort the filepairs */
+	diffcore_fix_diff_index();
+
+	/* Cleanup */
+	strset_clear(&present);
+}
+
 static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 {
 	int i;
@@ -6337,6 +6442,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	if (o->color_moved)
 		o->emitted_symbols = &esm;
 
+	if (o->additional_path_headers)
+		create_filepairs_for_header_only_notifications(o);
+
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 		if (check_pair_status(p))
@@ -6413,7 +6521,7 @@ void diff_flush(struct diff_options *options)
 	 * Order: raw, stat, summary, patch
 	 * or:    name/name-status/checkdiff (other bits clear)
 	 */
-	if (!q->nr)
+	if (!q->nr && !options->additional_path_headers)
 		goto free_queue;
 
 	if (output_format & (DIFF_FORMAT_RAW |
diff --git a/diff.h b/diff.h
index 8ba85c5e605..06a0a67afda 100644
--- a/diff.h
+++ b/diff.h
@@ -395,6 +395,7 @@ struct diff_options {
 
 	struct repository *repo;
 	struct option *parseopts;
+	struct strmap *additional_path_headers;
 
 	int no_free;
 };
@@ -593,7 +594,7 @@ void diffcore_fix_diff_index(void);
 "                show all files diff when -S is used and hit is found.\n" \
 "  -a  --text    treat all files as text.\n"
 
-int diff_queue_is_empty(void);
+int diff_queue_is_empty(struct diff_options*);
 void diff_flush(struct diff_options*);
 void diff_free(struct diff_options*);
 void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc);
diff --git a/log-tree.c b/log-tree.c
index d4655b63d75..33c28f537a6 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -850,7 +850,7 @@ int log_tree_diff_flush(struct rev_info *opt)
 	opt->shown_dashes = 0;
 	diffcore_std(&opt->diffopt);
 
-	if (diff_queue_is_empty()) {
+	if (diff_queue_is_empty(&opt->diffopt)) {
 		int saved_fmt = opt->diffopt.output_format;
 		opt->diffopt.output_format = DIFF_FORMAT_NO_OUTPUT;
 		diff_flush(&opt->diffopt);
-- 
gitgitgadget



* [PATCH v3 8/9] show, log: include conflict/warning messages in --remerge-diff headers
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
                       ` (6 preceding siblings ...)
  2021-12-30 23:36     ` [PATCH v3 7/9] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
@ 2021-12-30 23:36     ` Elijah Newren via GitGitGadget
  2022-01-19 16:19       ` Ævar Arnfjörð Bjarmason
  2021-12-30 23:36     ` [PATCH v3 9/9] merge-ort: mark conflict/warning messages from inner merges as omittable Elijah Newren via GitGitGadget
                       ` (2 subsequent siblings)
  10 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-30 23:36 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Conflicts such as modify/delete, rename/rename, or file/directory are
not representable via content conflict markers, and the normal output
messages notifying users about these were dropped with --remerge-diff.
While we don't want these messages randomly shown before the commit
and diff headers, we do want them to still be shown; include them as
part of the diff headers instead.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 log-tree.c              |  4 ++
 merge-ort.c             |  1 +
 merge-ort.h             | 10 +++++
 t/t4069-remerge-diff.sh | 86 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 101 insertions(+)

diff --git a/log-tree.c b/log-tree.c
index 33c28f537a6..a04172d2908 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -922,6 +922,8 @@ static int do_remerge_diff(struct rev_info *opt,
 	/* Setup merge options */
 	init_merge_options(&o, the_repository);
 	o.show_rename_progress = 0;
+	o.record_conflict_msgs_as_headers = 1;
+	o.msg_header_prefix = "remerge";
 
 	ctx.abbrev = DEFAULT_ABBREV;
 	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
@@ -938,10 +940,12 @@ static int do_remerge_diff(struct rev_info *opt,
 	merge_incore_recursive(&o, bases, parent1, parent2, &res);
 
 	/* Show the diff */
+	opt->diffopt.additional_path_headers = res.path_messages;
 	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
 	log_tree_diff_flush(opt);
 
 	/* Cleanup */
+	opt->diffopt.additional_path_headers = NULL;
 	strbuf_release(&parent1_desc);
 	strbuf_release(&parent2_desc);
 	merge_finalize(&o, &res);
diff --git a/merge-ort.c b/merge-ort.c
index 481305d2bcf..43f980d2586 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4585,6 +4585,7 @@ redo:
 	trace2_region_leave("merge", "process_entries", opt->repo);
 
 	/* Set return values */
+	result->path_messages = &opt->priv->output;
 	result->tree = parse_tree_indirect(&working_tree_oid);
 	/* existence of conflicted entries implies unclean */
 	result->clean &= strmap_empty(&opt->priv->conflicted);
diff --git a/merge-ort.h b/merge-ort.h
index c011864ffeb..fe599b87868 100644
--- a/merge-ort.h
+++ b/merge-ort.h
@@ -5,6 +5,7 @@
 
 struct commit;
 struct tree;
+struct strmap;
 
 struct merge_result {
 	/*
@@ -23,6 +24,15 @@ struct merge_result {
 	 */
 	struct tree *tree;
 
+	/*
+	 * Special messages and conflict notices for various paths
+	 *
+	 * This is a map of pathnames to strbufs.  It contains various
+	 * warning/conflict/notice messages (possibly multiple per path)
+	 * that callers may want to use.
+	 */
+	struct strmap *path_messages;
+
 	/*
 	 * Additional metadata used by merge_switch_to_result() or future calls
 	 * to merge_incore_*().  Includes data needed to update the index (if
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index 1b32028e897..c1b44138145 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -4,6 +4,15 @@ test_description='remerge-diff handling'
 
 . ./test-lib.sh
 
+# --remerge-diff uses ort under the hood regardless of setting.  However,
+# we set up a file/directory conflict beforehand, and the different backends
+# handle the conflict differently, which would require separate code paths
+# to resolve.  There's not much point in making the code uglier to do that,
+# though, when the real thing we are testing (--remerge-diff) will hardcode
+# calls directly into the merge-ort API anyway.  So just force the use of
+# ort on the setup too.
+GIT_TEST_MERGE_ALGORITHM=ort
+
 test_expect_success 'setup basic merges' '
 	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
 	git add numbers &&
@@ -53,6 +62,7 @@ test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
 	git log -1 --oneline ab_resolution >tmp &&
 	cat <<-EOF >>tmp &&
 	diff --git a/numbers b/numbers
+	remerge CONFLICT (content): Merge conflict in numbers
 	index a1fb731..6875544 100644
 	--- a/numbers
 	+++ b/numbers
@@ -81,4 +91,80 @@ test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
 	test_cmp expect actual
 '
 
+test_expect_success 'setup non-content conflicts' '
+	git switch --orphan base &&
+
+	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
+	test_write_lines a b c d e f g h i >letters &&
+	test_write_lines in the way >content &&
+	git add numbers letters content &&
+	git commit -m base &&
+
+	git branch side1 &&
+	git branch side2 &&
+
+	git checkout side1 &&
+	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
+	git mv letters letters_side1 &&
+	git mv content file_or_directory &&
+	git add numbers &&
+	git commit -m side1 &&
+
+	git checkout side2 &&
+	git rm numbers &&
+	git mv letters letters_side2 &&
+	mkdir file_or_directory &&
+	echo hello >file_or_directory/world &&
+	git add file_or_directory/world &&
+	git commit -m side2 &&
+
+	git checkout -b resolution side1 &&
+	test_must_fail git merge side2 &&
+	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git add letters_side1 &&
+	git rm letters &&
+	git rm letters_side2 &&
+	git add file_or_directory~HEAD &&
+	git mv file_or_directory~HEAD wanted_content &&
+	git commit -m resolved
+'
+
+test_expect_success 'remerge-diff with non-content conflicts' '
+	git log -1 --oneline resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/file_or_directory~HASH (side1) b/wanted_content
+	similarity index 100%
+	rename from file_or_directory~HASH (side1)
+	rename to wanted_content
+	remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
+	diff --git a/letters b/letters
+	remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
+	diff --git a/letters_side2 b/letters_side2
+	deleted file mode 100644
+	index b236ae5..0000000
+	--- a/letters_side2
+	+++ /dev/null
+	@@ -1,9 +0,0 @@
+	-a
+	-b
+	-c
+	-d
+	-e
+	-f
+	-g
+	-h
+	-i
+	diff --git a/numbers b/numbers
+	remerge CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
+	EOF
+	# We still have some sha1 hashes above; rip them out so test works
+	# with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
gitgitgadget



* [PATCH v3 9/9] merge-ort: mark conflict/warning messages from inner merges as omittable
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
                       ` (7 preceding siblings ...)
  2021-12-30 23:36     ` [PATCH v3 8/9] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
@ 2021-12-30 23:36     ` Elijah Newren via GitGitGadget
  2021-12-31  8:46     ` [PATCH v3 0/9] Add a new --remerge-diff capability to show & log Junio C Hamano
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-12-30 23:36 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

A recursive merge involves merging the merge bases of the two branches
being merged.  Such an inner merge can itself generate conflict notices.
While such notices may be useful when initially trying to create a
merge, they seem to just be noise when investigating merges later with
--remerge-diff.  (Especially when both sides of the outer merge resolved
the conflict the same way leading to no overall conflict.)  Remove them.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 43f980d2586..9bf15a01db8 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -638,7 +638,9 @@ static void path_msg(struct merge_options *opt,
 	struct strbuf tmp = STRBUF_INIT;
 
 	if (opt->record_conflict_msgs_as_headers && omittable_hint)
-		return; /* Do not record mere hints in tree */
+		return; /* Do not record mere hints in headers */
+	if (opt->record_conflict_msgs_as_headers && opt->priv->call_depth)
+		return; /* Do not record inner merge issues in headers */
 	sb = strmap_get(&opt->priv->output, path);
 	if (!sb) {
 		sb = xmalloc(sizeof(*sb));
-- 
gitgitgadget


* Re: [PATCH v2 7/8] diff: add ability to insert additional headers for paths
  2021-12-30 22:04           ` Elijah Newren
@ 2021-12-31  3:07             ` Johannes Altmanninger
  0 siblings, 0 replies; 113+ messages in thread
From: Johannes Altmanninger @ 2021-12-31  3:07 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Ævar Arnfjörð Bjarmason, Neeraj Singh

On Thu, Dec 30, 2021 at 02:04:30PM -0800, Elijah Newren wrote:
> >                                 enqueue(file_pair_for(extra_headers[j]))
> 
> The queue is an array of sorted items, so enqueue here would be
> insertion into an already sorted list.  Inserting N items into a list
> of M items is quadratic (O(N*M)) -- unless you meant to just append to
> the end and add a third sort at the end?

yeah I would have probably used a third sort

> 
> >                         j++
> 
> At the end of the for loop, there may be remaining additional headers
> that sort after all those found in the queue, so you'll need an
> additional loop to handle those.

my bad, I should have tried it

> It's actually considerably more code as you can see from the diffstat,
> and feels like we're reaching into some ugly internals with tmp_queue
> (the SWAP and the special-case freeing) in order to get the desired
> performance improvements.  And it was already O(NlogN) overall (due to
> the sort), which doesn't change with this new algorithm.  It's really,
> really hard for me to imagine a case where we have large numbers of
> additional headers.  Even if someone else can imagine that we for some
> reason have a huge number of conflicts in order to generate a huge
> number of additional headers...how could the performance of sorting
> O(N) filenames and merging these lists possibly matter in comparison
> to the O(N) three-way file merges that would likely have been
> performed from those conflicts?

Yeah, I agree with that conclusion, it's surely not worth the added complexity.
Seeing the code definitely helps, thanks.

> 
> So, I'm going to throw this code away and keep the original.
> 
> It was an interesting idea and exercise; thanks for keeping me on my toes.
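
For reference, the linear-time alternative discussed above (merging two
already-sorted lists, including the trailing loops for leftovers that
were missing from the pseudocode) can be sketched roughly as below.
This is a standalone illustration, not code from the series:
merge_sorted_paths() and its parameters are hypothetical, and plain C
strings stand in for the queued file pairs.

```c
#include <stddef.h>
#include <string.h>

/*
 * Merge two sorted arrays of path names into `out`, which must have
 * room for na + nb entries.  Returns the number of entries written.
 * Hypothetical helper for illustration; not taken from diff.c.
 */
static size_t merge_sorted_paths(const char **a, size_t na,
				 const char **b, size_t nb,
				 const char **out)
{
	size_t i = 0, j = 0, n = 0;

	/* Walk both lists, always taking the smaller head. */
	while (i < na && j < nb) {
		if (strcmp(a[i], b[j]) <= 0)
			out[n++] = a[i++];
		else
			out[n++] = b[j++];
	}
	/* Entries of `a` that sort after everything in `b`. */
	while (i < na)
		out[n++] = a[i++];
	/* Entries of `b` that sort after everything in `a`. */
	while (j < nb)
		out[n++] = b[j++];
	return n;
}
```

The merge itself is O(N+M), but as noted above the inputs still have to
be sorted first, so the overall cost stays O(N log N).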


* Re: [PATCH v3 0/9] Add a new --remerge-diff capability to show & log
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
                       ` (8 preceding siblings ...)
  2021-12-30 23:36     ` [PATCH v3 9/9] merge-ort: mark conflict/warning messages from inner merges as omittable Elijah Newren via GitGitGadget
@ 2021-12-31  8:46     ` Junio C Hamano
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
  10 siblings, 0 replies; 113+ messages in thread
From: Junio C Hamano @ 2021-12-31  8:46 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Changes since v2 (of the restarted submission):
>
>  * Numerous small improvements suggested by Johannes Altmanninger
>  * Avoid including conflict messages from inner merges (due to example
>    pointed out by Ævar).
>  * Added a "remerge" prefix to all the new diff headers (suggested by Junio
>    in a previous round, but I couldn't come up with a good name before. It
>    suddenly hit me that "remerge" is an obvious prefix to use, and even
>    helps explain what the rest of the line is for.)

That sounds sensible.



* Re: [PATCH v2 0/8] Add a new --remerge-diff capability to show & log
  2021-12-27 21:11     ` Elijah Newren
@ 2022-01-10 15:48       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-10 15:48 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh


On Mon, Dec 27 2021, Elijah Newren wrote:

> On Sun, Dec 26, 2021 at 2:28 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> On Sat, Dec 25 2021, Elijah Newren via GitGitGadget wrote:
>>
>> > === FURTHER BACKGROUND (original cover letter material) ==
>> >
>> > Here are some example commits you can try this out on (with git show
>> > --remerge-diff $COMMIT):
>> >
>> >  * git.git conflicted merge: 07601b5b36
>> >  * git.git non-conflicted change: bf04590ecd
>> >  * linux.git conflicted merge: eab3540562fb
>> >  * linux.git non-conflicted change: 223cea6a4f05
>> >
>> > Many more can be found by just running git log --merges --remerge-diff in
>> > your repository of choice and searching for diffs (most merges tend to be
>> > clean and unmodified and thus produce no diff but a search of '^diff' in the
>> > log output tends to find the examples nicely).
>> >
>> > Some basic high level details about this new option:
>> >
>> >  * This option is most naturally compared to --cc, though the output seems
>> >    to be much more understandable to most users than --cc output.
>> >  * Since merges are often clean and unmodified, this new option results in
>> >    an empty diff for most merges.
>> >  * This new option shows things like the removal of conflict markers, which
>> >    hunks users picked from the various conflicted sides to keep or remove,
>> >    and shows changes made outside of conflict markers (which might reflect
>> >    changes needed to resolve semantic conflicts or cleanups of e.g.
>> >    compilation warnings or other additional changes an integrator felt
>> >    belonged in the merged result).
>> >  * This new option does not (currently) work for octopus merges, since
>> >    merge-ort is specific to two-parent merges[1].
>> >  * This option will not work on a read-only or full filesystem[2].
>> >  * We discussed this capability at Git Merge 2020, and one of the
>> >    suggestions was doing a periodic git gc --auto during the operation (due
>> >    to potential new blobs and trees created during the operation). I found a
>> >    way to avoid that; see [2].
>> >  * This option is faster than you'd probably expect; it handles 33.5 merge
>> >    commits per second in linux.git on my computer; see below.
>> >
>> > In regards to the performance point above, the timing for running the
>> > following command:
>> >
>> > time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l
>>
>> I've been trying to come up with some other useful recipes for this new
>> option (which is already very useful, thanks!)
>
> I'm glad you like it.  :-)
>
>> Some of these (if correct) are suggestions for incorporating into the
>> (now rather sparse) documentation. I.e. walking users through how to use
>> this, and how (if at all) it combines with other options.
>>
>> I wanted to find all merges between "master".."seen" for which Junio's
>> had to resolve a conflict, a naïve version is:
>>
>>     $ git log --oneline --remerge-diff -p --min-parents=2 origin/master..origin/seen|grep ^diff -B1 | grep Merge
>>     [...]
>
> I think the naive version is
>   $ git log --remerge-diff --min-parents=2 origin/master..origin/seen
>   <search for "^diff" using your pager's search functionality>
>
> Where the "--min-parents=2 origin/master..origin/seen" comes from your
> problem description ("find all merges between master..seen").
>
> You can add --oneline to format it, though it's an orthogonal concern.
> Also, adding -p is unnecessary: --remerge-diff, like --cc, implies -p.
>
>> But I found that this new option nicely integrates with --diff-filter,
>> i.e. we'll end up showing a diff, and the diff machinery allows you to
>> filter on it.
>>
>> It seems to me like all the diffs you show fall under "M", so for
>
> Yes, the diffs I happened to pick all fell under "M", but by no means
> should you rely on that happening for all merges in history.  For
> example, make a new merge commit, then add a completely new file (or
> delete a file, or rename a file, or copy a file, or change its
> mode/type), stage the new/deleted/renamed/copied/changed file, and run
> "git commit --amend".
>
> So, although --diff-filter=M can be interesting, I would not rely on it.

*Nod* I hadn't thought of those (in retrospect, rather obvious) cases.

>> master..seen (2ae0a9cb829..61055c2920d) this is equivalent (and the
>> output is the same as the above):
>>
>>     $ git -P log --oneline --remerge-diff --no-patch --min-parents=2 --diff-filter=M origin/master..origin/seen
>>     95daa54b1c3 Merge branch 'hn/reftable-fixes' into seen
>>     26c4c09dd34 Merge branch 'gc/fetch-negotiate-only-early-return' into seen
>>     e3dc8d073f6 Merge branch 'gc/branch-recurse-submodules' into seen
>>     aeada898196 Merge branch 'js/branch-track-inherit' into seen
>>     4dd30e0da45 Merge branch 'jh/builtin-fsmonitor-part2' into seen
>>     337743b17d0 Merge branch 'ab/config-based-hooks-2' into seen
>>     261672178c0 Merge branch 'pw/fix-some-issues-in-reset-head' into seen
>>     1296d35b041 Merge branch 'ms/customizable-ident-expansion' into seen
>>     7a3d7d05126 Merge branch 'ja/i18n-similar-messages' into seen
>>     eda714bb8bc Merge branch 'tb/midx-bitmap-corruption-fix' into seen
>>     ba02295e3f8 Merge branch 'jh/p4-human-unit-numbers' into jch
>>     751773fc38b Merge branch 'es/test-chain-lint' into jch
>>     ec17879f495 Merge branch 'tb/cruft-packs' into tb/midx-bitmap-corruption-fix
>>
>> However for "origin/master..origin/next" (next = 510f9eba9a2 currently)
>> we'll oddly show this with "-p":
>>
>>     9af51fd1d0d Sync with 'master'
>>     diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>>     CONFLICT (content): Merge conflict in t/lib-gpg.sh
>>     d6f56f3248e Merge branch 'es/test-chain-lint' into next
>>     diff --git a/t/t4126-apply-empty.sh b/t/t4126-apply-empty.sh
>>     CONFLICT (content): Merge conflict in t/t4126-apply-empty.sh
>>     index 996c93329c6..33860d38290 100755
>>     --- a/t/t4126-apply-empty.sh
>>     +++ b/t/t4126-apply-empty.sh
>>     [...]
>>
>> The "oddly" applying only to that "9af51fd1d0d Sync with 'master'", not
>> the second d6f56f3248e, which shows the sort of conflict I'd expect. The
>> two-line "diff" of:
>>
>>     diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>>     CONFLICT (content): Merge conflict in t/lib-gpg.sh
>>
>> Shows up with -p --remerge-diff, not a mere -p. I also tried the other
>> --diff-merges=* options, that behavior is new in
>> --diff-merges=remerge. Is this a bug?
>
> Ugh, this is related to my comment elsewhere that conflicts from inner
> merges are not nicely differentiated.  If I also apply my other series
> (which has not yet been submitted), this instead appears as follows:
>
> $ git show --oneline --remerge-diff 9af51fd1d0d
> 9af51fd1d0 Sync with 'master'
> diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>   From inner merge:  CONFLICT (content): Merge conflict in t/lib-gpg.sh
>
> and the addition of the "From inner merge: " text makes it clearer why
> that line appears.  This is an interesting case where a conflict
> notice _only_ appears in the inner merge (i.e. the merge of merge
> bases), which means that both sides on the outer merge changed the
> relevant portion of the file in the same way, so the outer merge had
> no conflict.
>
> However, instead of trying to differentiate messages from inner
> merges, I think for --remerge-diff's purposes we should just drop all
> notices that come from the inner merges.  Those conflict notices might
> be helpful when initially resolving a merge, but at the --remerge-diff
> level, they're more likely to be distracting than helpful.

Thanks. I see that v3 addresses this. Will try it out...

>> My local build also has a --pickaxe-patch option. It's something I
>> submitted on-list before[1] and have been meaning to re-roll.
>>
>> I'm discussing it here because it skips the stripping of the "+ " and "-
>> " prefixes under -G<regex> and allows you to search through the -U<n>
>> context. With that I'm able to do:
>>
>>     git log --oneline --remerge-diff -p --min-parents=2 --pickaxe-patch -G'^\+' --diff-filter=M origin/master..origin/seen
>>
>> I.e. on top of the above filter only show those diffs that have
>> additions. AFAICT the conflicting diffs where the committer of the merge
>> conflict picked one side or the other will only have "-" lines.
>>
>> So those diffs that have additions look to be those where the person
>> doing the merge needed to combine the two.
>>
>> Well, usually. E.g. 26c4c09dd34 (Merge branch
>> 'gc/fetch-negotiate-only-early-return' into seen, 2021-12-25) in that
>> range shows that isn't strictly true. Most such deletion-only diffs are
>> less interesting in picking one side or the other of the conflict, but
>> that one combines the two:
>>
>>     -<<<<<<< d3419aac9f4 (Merge branch 'pw/add-p-hunk-split-fix' into seen)
>>                             warning(_("protocol does not support --negotiate-only, exiting"));
>>     -                       return 1;
>>     -=======
>>     -                       warning(_("Protocol does not support --negotiate-only, exiting."));
>>                             result = 1;
>>                             goto cleanup;
>>     ->>>>>>> 495e8601f28 (builtin/fetch: die on --negotiate-only and --recurse-submodules)
>>
>> Which I guess is partially commentary and partially a request (either
>> for this series, or some follow-up) for something like a
>> --remerge-diff-filter option. I.e. it would be very useful to be able to
>> filter on some combination of:
>>
>>  * Which side(s) of the conflict(s) were picked, or a combination?
>>  * Is there "new work" in the diff to resolve the conflict?
>>    AFAICT this will always mean we'll have "+ " lines.
>
> Do any of the following count as "new work"? :
>
>   * the deletion of a file (perhaps one that had no conflict but was
> deleted anyway)
>   * mode changes (again, perhaps on files that had no conflict)
>   * renames of files/directories?
>
> If so, searching for "^+" lines might be insufficient, but it depends
> on what you mean by new work.

Yes, I think at least for the use-cases I had in mind the useful thing
is knowing which state a merge commit is in:

 A: "Vanilla" merge (i.e. no --remerge-diff output)
 B: non-"vanilla" merge (i.e. we have --remerge-diff output)

What I'm requesting/musing about here (which is very much in the
category of stuff we can/should do later, and shouldn't prevent this
"good enough for now" series going in) is being able to disambiguate
those "B" cases.

I.e. being able to see if their/ours side "won", some mixture of the two
etc.

>> Or maybe that's not useful at all, and just -G<rx> (maybe combined with
>> my --pickaxe-patch) will cover it?
>
> I'd rather wait until we have a good idea of the potential range of
> usecases before adding a filter.  (And I think for now, the -G and
> --pickaxe-patch are probably good enough for this usecase.)  These
> particular usecases you point out are interesting; thanks for
> detailing them.  Here's some others to consider:
>
>   * Finding out when text was added or removed: `git log
> --remerge-diff -S<text>` (note that with only -p instead of
> --remerge-diff, that command will annoyingly miss cases where a
> merge introduced or removed the text)
>   * Finding out how a merge differed from one run with some
> non-default options (e.g. `git show --remerge-diff -Xours` or `git
> show --remerge-diff -Xno-space-change`; although show doesn't take -X
> options so this is just an idea at this point)
>   * Finding out how a merge would have differed had it been run with
> different options (so instead of comparing a remerge to the merge
> recorded in history, compare one remerge with default options with a
> different merge that uses e.g. -Xno-space-change)
>
> Also, I've got a follow-up series that also introduces a
> --remerge-diff-only flag which:
>   * For single parent commits that cannot be identified as a revert or
> cherry-pick, do not show a diff.
>   * For single parent commits that can be identified as a revert or
> cherry-pick, instead of showing a diff against the parent of the
> commit, redo the revert or cherry-pick in memory and show a diff
> against that.
>   * For merge commits, act the same as --remerge-diff

*nod*


* Re: [PATCH v3 1/9] show, log: provide a --remerge-diff capability
  2021-12-30 23:36     ` [PATCH v3 1/9] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
@ 2022-01-19 15:49       ` Ævar Arnfjörð Bjarmason
  2022-01-20  2:31         ` Elijah Newren
  2022-01-19 16:01       ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-19 15:49 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren


On Thu, Dec 30 2021, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>

> +static int do_remerge_diff(struct rev_info *opt,
> +			   struct commit_list *parents,
> +			   struct object_id *oid,
> +			   struct commit *commit)
> +{
> +	struct merge_options o;
> +	struct commit_list *bases;
> +	struct merge_result res = {0};
> +	struct pretty_print_context ctx = {0};
> +	struct commit *parent1 = parents->item;
> +	struct commit *parent2 = parents->next->item;
> +	struct strbuf parent1_desc = STRBUF_INIT;
> +	struct strbuf parent2_desc = STRBUF_INIT;
> +
> +	/* Setup merge options */
> +	init_merge_options(&o, the_repository);
> +	o.show_rename_progress = 0;
> +
> +	ctx.abbrev = DEFAULT_ABBREV;
> +	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
> +	format_commit_message(parent2, "%h (%s)", &parent2_desc, &ctx);
> +	o.branch1 = parent1_desc.buf;
> +	o.branch2 = parent2_desc.buf;
> +
> +	/* Parse the relevant commits and get the merge bases */
> +	parse_commit_or_die(parent1);
> +	parse_commit_or_die(parent2);
> +	bases = get_merge_bases(parent1, parent2);

There are existing leaks all over the place here unrelated to this new
code, so this is no big deal, but I noticed that get_merge_bases() here
leaks.

Shouldn't it call free_commit_list() like e.g. diff_get_merge_base()
which invokes get_merge_bases() does on the return value?

> +test_description='remerge-diff handling'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup basic merges' '
> +	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
> +	git add numbers &&
> +	git commit -m base &&
> +
> +	git branch feature_a &&
> +	git branch feature_b &&
> +	git branch feature_c &&
> +
> +	git branch ab_resolution &&
> +	git branch bc_resolution &&
> +
> +	git checkout feature_a &&
> +	test_write_lines 1 2 three 4 5 6 7 eight 9 >numbers &&
> +	git commit -a -m change_a &&
> +
> +	git checkout feature_b &&
> +	test_write_lines 1 2 tres 4 5 6 7 8 9 >numbers &&
> +	git commit -a -m change_b &&
> +
> +	git checkout feature_c &&
> +	test_write_lines 1 2 3 4 5 6 7 8 9 10 >numbers &&
> +	git commit -a -m change_c &&
> +
> +	git checkout bc_resolution &&
> +	git merge --ff-only feature_b &&
> +	# no conflict
> +	git merge feature_c &&
> +
> +	git checkout ab_resolution &&
> +	git merge --ff-only feature_a &&
> +	# conflicts!
> +	test_must_fail git merge feature_b &&
> +	# Resolve conflict...and make another change elsewhere
> +	test_write_lines 1 2 drei 4 5 6 7 acht 9 >numbers &&
> +	git add numbers &&

Just a matter of taste, but FWIW some of the custom
test_write_lines/commit here could nowadays use test_commit with
--printf: 47c88d16ba6 (test-lib functions: add --printf option to
test_commit, 2021-05-10)

I don't think it's worth the churn to change it here, just an FYI.


* Re: [PATCH v3 1/9] show, log: provide a --remerge-diff capability
  2021-12-30 23:36     ` [PATCH v3 1/9] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
  2022-01-19 15:49       ` Ævar Arnfjörð Bjarmason
@ 2022-01-19 16:01       ` Ævar Arnfjörð Bjarmason
  2022-01-20  2:33         ` Elijah Newren
  1 sibling, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-19 16:01 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren


On Thu, Dec 30 2021, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
> +	struct tmp_objdir *remerge_objdir = NULL;
> +
> +	if (rev->remerge_diff) {
> +		remerge_objdir = tmp_objdir_create("remerge-diff");
> +		if (!remerge_objdir)
> +			die_errno(_("unable to create temporary object directory"));
> +		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
> +	}

Re the errno feedback on v1
https://lore.kernel.org/git/211221.864k71r6kz.gmgdl@evledraar.gmail.com/
the API might lose the "errno" due to e.g. the remove_dir_recurse()
codepath. This seems like it would take care of that:

diff --git a/builtin/log.c b/builtin/log.c
index 944d9c0d9b5..d4b8b1aa4b6 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -424,9 +424,9 @@ static int cmd_log_walk(struct rev_info *rev)
 	int saved_dcctc = 0;
 
 	if (rev->remerge_diff) {
-		rev->remerge_objdir = tmp_objdir_create("remerge-diff");
+		rev->remerge_objdir = tmp_objdir_create_gently("remerge-diff", 0);
 		if (!rev->remerge_objdir)
-			die_errno(_("unable to create temporary object directory"));
+			exit(128);
 		tmp_objdir_replace_primary_odb(rev->remerge_objdir, 1);
 	}
 
diff --git a/tmp-objdir.c b/tmp-objdir.c
index adf6033549e..3c656120003 100644
--- a/tmp-objdir.c
+++ b/tmp-objdir.c
@@ -121,19 +121,21 @@ static void env_replace(struct strvec *env, const char *key, const char *val)
 	strvec_pushf(env, "%s=%s", key, val);
 }
 
-static int setup_tmp_objdir(const char *root)
+static int setup_tmp_objdir(const char *root, int quiet)
 {
 	char *path;
 	int ret = 0;
 
 	path = xstrfmt("%s/pack", root);
 	ret = mkdir(path, 0777);
+	if (!quiet && ret < 0)
+		die_errno(_("unable to create temporary object directory '%s'"), path);
 	free(path);
 
 	return ret;
 }
 
-struct tmp_objdir *tmp_objdir_create(const char *prefix)
+struct tmp_objdir *tmp_objdir_create_gently(const char *prefix, int quiet)
 {
 	static int installed_handlers;
 	struct tmp_objdir *t;
@@ -161,6 +163,8 @@ struct tmp_objdir *tmp_objdir_create(const char *prefix)
 	strbuf_grow(&t->path, 1024);
 
 	if (!mkdtemp(t->path.buf)) {
+		if (!quiet)
+			error_errno(_("unable to create temporary directory '%s'"), t->path.buf);
 		/* free, not destroy, as we never touched the filesystem */
 		tmp_objdir_free(t);
 		return NULL;
@@ -173,7 +177,7 @@ struct tmp_objdir *tmp_objdir_create(const char *prefix)
 		installed_handlers++;
 	}
 
-	if (setup_tmp_objdir(t->path.buf)) {
+	if (setup_tmp_objdir(t->path.buf, quiet)) {
 		tmp_objdir_destroy(t);
 		return NULL;
 	}
diff --git a/tmp-objdir.h b/tmp-objdir.h
index 76efc7edee5..5072fb860d9 100644
--- a/tmp-objdir.h
+++ b/tmp-objdir.h
@@ -24,8 +24,15 @@ struct tmp_objdir;
 /*
  * Create a new temporary object directory with the specified prefix;
  * returns NULL on failure.
+ *
+ * tmp_objdir_create() is a wrapper for
+ * tmp_objdir_create_gently(..., 1).
  */
-struct tmp_objdir *tmp_objdir_create(const char *prefix);
+struct tmp_objdir *tmp_objdir_create_gently(const char *prefix, int quiet);
+static inline struct tmp_objdir *tmp_objdir_create(const char *prefix)
+{
+	return tmp_objdir_create_gently(prefix, 1);
+}
 
 /*
  * Return a list of environment strings, suitable for use with
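
The general hazard here, a cleanup step overwriting errno before the
caller gets to report it, can also be handled by saving and restoring
errno around the cleanup.  That is a different technique from the patch
above, which reports the error at the failure site instead.  A minimal
sketch, where failing_create() and noisy_cleanup() are hypothetical
stand-ins and not git functions:

```c
#include <errno.h>

/* Stand-in for a failing mkdtemp()/mkdir(): fails and sets errno. */
static int failing_create(void)
{
	errno = EACCES;
	return -1;
}

/* Stand-in for a cleanup path that itself fails and overwrites errno. */
static void noisy_cleanup(void)
{
	errno = ENOENT;
}

/* Returns -1 on failure, with errno describing the create failure. */
static int create_preserving_errno(void)
{
	if (failing_create() < 0) {
		int saved_errno = errno;	/* capture before cleanup runs */

		noisy_cleanup();
		errno = saved_errno;		/* restore for the caller's report */
		return -1;
	}
	return 0;
}
```

With this shape a caller could keep using a die_errno()-style report
after a NULL return, at the cost of every cleanup path having to
remember the save/restore.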


* Re: [PATCH v3 8/9] show, log: include conflict/warning messages in --remerge-diff headers
  2021-12-30 23:36     ` [PATCH v3 8/9] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
@ 2022-01-19 16:19       ` Ævar Arnfjörð Bjarmason
  2022-01-21  2:16         ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-19 16:19 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren


On Thu, Dec 30 2021, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
>
> Conflicts such as modify/delete, rename/rename, or file/directory are
> not representable via content conflict markers, and the normal output
> messages notifying users about these were dropped with --remerge-diff.
> While we don't want these messages randomly shown before the commit
> and diff headers, we do want them to still be shown; include them as
> part of the diff headers instead.
> [...]
> +test_expect_success 'setup non-content conflicts' '
> +	git switch --orphan base &&
> +
> +	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
> +	test_write_lines a b c d e f g h i >letters &&
> +	test_write_lines in the way >content &&
> +	git add numbers letters content &&
> +	git commit -m base &&
> +
> +	git branch side1 &&
> +	git branch side2 &&
> +
> +	git checkout side1 &&
> +	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
> +	git mv letters letters_side1 &&
> +	git mv content file_or_directory &&
> +	git add numbers &&
> +	git commit -m side1 &&
> +
> +	git checkout side2 &&
> +	git rm numbers &&
> +	git mv letters letters_side2 &&
> +	mkdir file_or_directory &&
> +	echo hello >file_or_directory/world &&
> +	git add file_or_directory/world &&
> +	git commit -m side2 &&
> +
> +	git checkout -b resolution side1 &&
> +	test_must_fail git merge side2 &&
> +	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
> +	git add numbers &&
> +	git add letters_side1 &&
> +	git rm letters &&
> +	git rm letters_side2 &&
> +	git add file_or_directory~HEAD &&
> +	git mv file_or_directory~HEAD wanted_content &&
> +	git commit -m resolved
> +'
> +
> +test_expect_success 'remerge-diff with non-content conflicts' '
> +	git log -1 --oneline resolution >tmp &&
> +	cat <<-EOF >>tmp &&
> +	diff --git a/file_or_directory~HASH (side1) b/wanted_content
> +	similarity index 100%
> +	rename from file_or_directory~HASH (side1)
> +	rename to wanted_content
> +	remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
> +	diff --git a/letters b/letters
> +	remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
> +	diff --git a/letters_side2 b/letters_side2
> +	deleted file mode 100644
> +	index b236ae5..0000000
> +	--- a/letters_side2
> +	+++ /dev/null
> +	@@ -1,9 +0,0 @@
> +	-a
> +	-b
> +	-c
> +	-d
> +	-e
> +	-f
> +	-g
> +	-h
> +	-i
> +	diff --git a/numbers b/numbers
> +	remerge CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
> +	EOF
> +	# We still have some sha1 hashes above; rip them out so test works
> +	# with sha256
> +	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
> +
> +	git show --oneline --remerge-diff resolution >tmp &&
> +	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
> +	test_cmp expect actual
> +'
> +
>  test_done

Re my comment about --diff-filter in an earlier round, I think testing
for that option should really be added here.

With current master and seen:

    $ git rev-parse origin/master
    50b2d72e110cad39ecaf2322bfdf1b60cd13dd96
    $ git rev-parse origin/seen
    9e835a8bdafce2aaeb6df5f57f11014051bbfdca

I will, with A, M, D get:

    for i in A M D; do echo With $i: && git -P log --oneline --remerge-diff --diff-filter=$i origin/master..origin/seen; done

Some of which is expected, and some of which is still weird, e.g.:
    
    $ git log --oneline --remerge-diff --diff-filter=D origin/master..origin/seen
    d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
    diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
    remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
    61239ae3ee7 Merge branch 'pw/fix-some-issues-in-reset-head' into seen
    diff --git a/reset.c b/reset.c
    remerge CONFLICT (content): Merge conflict in reset.c
    diff --git a/sequencer.c b/sequencer.c
    remerge CONFLICT (content): Merge conflict in sequencer.c
    9b44aca15e4 Merge branch 'hn/reftable-coverity-fixes' into jch
    diff --git a/reftable/stack.c b/reftable/stack.c
    remerge CONFLICT (content): Merge conflict in reftable/stack.c
    [...]

Let's take the jh/builtin-fsmonitor-part2 merge, with =M I get this
output:
        
    $ git -P show --oneline --remerge-diff --diff-filter=M d120673d7cc
    d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
    diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
    remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
    index 03269b5553b..e70252ed65a 100755
    --- a/t/perf/p7519-fsmonitor.sh
    +++ b/t/perf/p7519-fsmonitor.sh
    @@ -127,18 +127,11 @@ test_expect_success "one time repo setup" '
            fi &&
     
            mkdir 1_file 10_files 100_files 1000_files 10000_files &&
    -<<<<<<< 61239ae3ee7 (Merge branch 'pw/fix-some-issues-in-reset-head' into seen)
    -       for i in $(test_seq 1 10); do touch 10_files/$i || return 1; done &&
    -       for i in $(test_seq 1 100); do touch 100_files/$i || return 1; done &&
    -       for i in $(test_seq 1 1000); do touch 1000_files/$i || return 1; done &&
    -       for i in $(test_seq 1 10000); do touch 10000_files/$i || return 1; done &&
    -=======
            touch_files 1 &&
            touch_files 10 &&
            touch_files 100 &&
            touch_files 1000 &&
            touch_files 10000 &&
    ->>>>>>> e89980feb1d (t7527: test status with untracked-cache and fsmonitor--daemon)
            git add 1_file 10_files 100_files 1000_files 10000_files &&
            git commit -qm "Add files" &&

Which is fully expected, i.e. here the diff is modified (M).

But there aren't any added lines, so why do I get it under =A, and why
isn't the diff shown with =D (compare a normal 'git log --diff-filter=D
-p')?:
    
    $ for i in A D; do echo With $i: && git -P show --oneline --remerge-diff --diff-filter=$i d120673d7cc; done
    With A:
    d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
    diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
    remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
    With D:
    d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
    diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
    remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh

Furthermore, pathspec arguments seem to be broken. Taking the file from
that commit, we can see without --remerge-diff that no non-merge commit
in that range modifies it directly:
    
    $ git -P log --oneline origin/master..origin/next -- t/perf/p7519-fsmonitor.sh
    d6f56f3248e Merge branch 'es/test-chain-lint' into next
    
This should surely work, but it doesn't. It's faking up a diff with =M,
so the pathspec filters should show it, shouldn't they?

    $ for i in A M D; do git -P show --oneline --remerge-diff --diff-filter=$i d120673d7cc -- t/perf/p7519-fsmonitor.sh; done
    $

Probably what's happening is that the filtering is being done on the
pre-"--remerge-diff" output. I.e. the traversal code needs to be updated
to inject modified paths into the commits we show --remerge-diff output
for (but I'm just guessing).

For the rest of the --diff-filter flags the behavior also seems wrong; I
really didn't expect this to show any output:

    $ for i in R T U X B; do echo With $i: && git -P show --oneline --remerge-diff --diff-filter=$i d120673d7cc; done
    With R:
    d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
    diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
    remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
    With T:
    d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
    diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
    remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
    With U:
    d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
    diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
    remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
    With X:
    d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
    diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
    remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
    With B:
    d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
    diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
    remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh

I.e. we don't have a (R)ename, (T)ype change, (U)nmerged (well, maybe,
but isn't it just for the index? See t6060-merge-index.sh) or Unknown
(X) there. Are they all being shown because of that generic "remerge
CONFLICT" line?

If the answer to all of the above is "yes, some of it is weird or
unintended, but let's deal with it later" I'd think that would also be
fine.

But let's then at least add something like what I added to the
git-range-diff.txt docs in df569c3f31f (range-diff doc: add a section
about output stability, 2018-11-09). I.e. explicitly say that we might
change the output when combined with other log options in the future,
and that any combination not currently documented won't be supported.

Re the CL mention of:
    
     * Ævar suggested also extending the docs with usage guidelines, but the
       example he picked was IMO best handled by just add --remerge-diff, so I'm
       not sure what to add to the docs. Maybe the log -S<string> --remerge-diff
       example as a way to more reliably determine when a string was added to or
       removed from the codebase? Where would that go anyway?

I don't think we need to document how --remerge-diff interacts with -S,
-G, or perhaps even most of --diff-filter.

But per the above it seems to me that we should at least have basic
tests (perhaps TODO tests), or explicitly document/note that some of the
interactions are buggy/weird (or not, maybe I'm just missing something).

The same goes for some other diff options, particularly those where
we're showing output we didn't before because of --remerge-diff,
e.g. --check is one such option. When I alter your tests with:
    
    diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
    index c1b44138145..d96320e6ab8 100755
    --- a/t/t4069-remerge-diff.sh
    +++ b/t/t4069-remerge-diff.sh
    @@ -120,7 +120,8 @@ test_expect_success 'setup non-content conflicts' '
     
            git checkout -b resolution side1 &&
            test_must_fail git merge side2 &&
    -       test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
    +       test_write_lines 1 2 three 4 5 6 7 8 >numbers &&
    +       echo "9 " >>numbers &&
            git add numbers &&
            git add letters_side1 &&
            git rm letters &&

The --check option works as expected, but we've got no test for the
combination of the two. Maybe we don't need them since we're confident
enough in the shared machinery, but I'd think it would be better to
consider this a black box and test it. I.e. maybe another --check
implementation would filter on whatever we use for the pathspecs
(showing it doesn't need to look at merge commits), and show nothing.
    
All of the above is just noting the journey of testing this, i.e. "hrm,
will it work with XYZ? No? Seems odd, and it's not tested at all...".

As noted before I find the current output really useful already. I've
just been trying to poke it in various ways to see if I can uncover any
bugs or unintended behavior.


* Re: [PATCH v3 3/9] ll-merge: make callers responsible for showing warnings
  2021-12-30 23:36     ` [PATCH v3 3/9] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
@ 2022-01-19 16:41       ` Ævar Arnfjörð Bjarmason
  2022-01-20  3:29         ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-19 16:41 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren


On Thu, Dec 30 2021, Elijah Newren via GitGitGadget wrote:

> Note that my methodology included first modifying ll_merge() to return
> a struct, so that the compiler would catch all the callers for me and
> ensure I had modified all of them.  After modifying all of them, I then
> changed the struct to an enum.
> [...]
> -int ll_merge(mmbuffer_t *result_buf,
> +enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
>  	     const char *path,
>  	     mmfile_t *ancestor, const char *ancestor_label,
>  	     mmfile_t *ours, const char *our_label,
> diff --git a/ll-merge.h b/ll-merge.h
> index aceb1b24132..e4a20e81a3a 100644
> --- a/ll-merge.h
> +++ b/ll-merge.h
> @@ -82,13 +82,20 @@ struct ll_merge_options {
>  	long xdl_opts;
>  };
>  
> +enum ll_merge_result {
> +	LL_MERGE_ERROR = -1,
> +	LL_MERGE_OK = 0,
> +	LL_MERGE_CONFLICT,
> +	LL_MERGE_BINARY_CONFLICT,
> +};
> +

Isn't the other side of the enum checking missing in many cases?

E.g. ll_ext_merge() returns "enum ll_merge_result" now, and does:

        status = run_command_v_opt(args, RUN_USING_SHELL);
        ret = (status > 0) ? LL_MERGE_CONFLICT : status;

And grepping at the tip of this series shows:
    
    $ git grep LL_MERGE_OK
    ll-merge.c:             ret = LL_MERGE_OK;
    ll-merge.c:                     ret = LL_MERGE_OK;
    ll-merge.c:                     ret = LL_MERGE_OK;
    ll-merge.h:     LL_MERGE_OK = 0,

Similar for LL_MERGE_CONFLICT, the only one that's used outside of the
file itself and its header is LL_MERGE_BINARY_CONFLICT.

I.e. shouldn't these codepaths:

    git grep -w ll_merge

Be doing a switch() on that new enum? E.g. we lose the type in
three_way_merge() in apply.c, it seems to me that that function should
switch over this new enum, and return the "int" that the callers of
three_way_merge() care about (i.e. just <0, 0, 1, not this enum's -1, 0,
1, 2).
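
To make the suggestion concrete, here is a rough sketch of such a
mapping. The enum is copied from the quoted hunk; the helper name
ll_merge_result_to_status() is made up for illustration and does not
exist in the series, and the real three_way_merge() does considerably
more than this:

```c
/* Copied from the quoted ll-merge.h hunk. */
enum ll_merge_result {
	LL_MERGE_ERROR = -1,
	LL_MERGE_OK = 0,
	LL_MERGE_CONFLICT,
	LL_MERGE_BINARY_CONFLICT,
};

/*
 * Hypothetical helper (not part of git): collapse the enum into the
 * <0 / 0 / 1 convention discussed above, using a switch so the
 * compiler can warn when a new enum value is left unhandled.
 */
static int ll_merge_result_to_status(enum ll_merge_result res)
{
	switch (res) {
	case LL_MERGE_OK:
		return 0;
	case LL_MERGE_CONFLICT:
	case LL_MERGE_BINARY_CONFLICT:
		return 1;
	case LL_MERGE_ERROR:
		return -1;
	}
	return -1; /* value outside the enum: treat as an error */
}
```

With no "default:" label, compilers that support -Wswitch can flag a
newly added enum member that this function forgets to handle.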


* Re: [PATCH v3 1/9] show, log: provide a --remerge-diff capability
  2022-01-19 15:49       ` Ævar Arnfjörð Bjarmason
@ 2022-01-20  2:31         ` Elijah Newren
  2022-01-20  7:53           ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren @ 2022-01-20  2:31 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh,
	Johannes Altmanninger

On Wed, Jan 19, 2022 at 7:53 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Thu, Dec 30 2021, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
>
> > +static int do_remerge_diff(struct rev_info *opt,
> > +                        struct commit_list *parents,
> > +                        struct object_id *oid,
> > +                        struct commit *commit)
> > +{
> > +     struct merge_options o;
> > +     struct commit_list *bases;
> > +     struct merge_result res = {0};
> > +     struct pretty_print_context ctx = {0};
> > +     struct commit *parent1 = parents->item;
> > +     struct commit *parent2 = parents->next->item;
> > +     struct strbuf parent1_desc = STRBUF_INIT;
> > +     struct strbuf parent2_desc = STRBUF_INIT;
> > +
> > +     /* Setup merge options */
> > +     init_merge_options(&o, the_repository);
> > +     o.show_rename_progress = 0;
> > +
> > +     ctx.abbrev = DEFAULT_ABBREV;
> > +     format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
> > +     format_commit_message(parent2, "%h (%s)", &parent2_desc, &ctx);
> > +     o.branch1 = parent1_desc.buf;
> > +     o.branch2 = parent2_desc.buf;
> > +
> > +     /* Parse the relevant commits and get the merge bases */
> > +     parse_commit_or_die(parent1);
> > +     parse_commit_or_die(parent2);
> > +     bases = get_merge_bases(parent1, parent2);
>
> There's existing leaks all over the place here unrelated to this new
> code, so this is no big deal, but I noticed that get_merge_bases() here
> leaks.

Interesting.

> Shouldn't it call free_commit_list() like e.g. diff_get_merge_base()
> which invokes get_merge_bases() does on the return value?

See the comment describing merge_incore_recursive() in merge-ort.h,
particularly this part:

* merge_bases will be consumed (emptied) so make a copy if you need it.

So free_commit_list() seems like it'd lead to a double free or use-after-free.

However, looking at merge_ort_internal() it looks like there is a bug
in its consumption of the merge bases (which I copied from
merge_recursive; oops).  It pops the first one off the commit list,
but then merely iterates through the remainder of the list without
popping.  So, if there's only one merge base, it'll consume it and the
code will look leak free (which must have been the cases I was looking
at when I was doing leak testing).  But in recursive cases, it leaks
the second and later ones.

Since the caller still has a pointer referring to the first (already
free'd) commit, I think that if they attempt to use it then it would
probably cause a use-after-free.
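
A toy model of that consumption pattern, for illustration only. This is
not the merge-ort code: struct node, push(), pop(), consume_all() and
consume_first_only() are made-up stand-ins for a commit list and its
helpers, and the "return the remainder" trick exists purely so the
leftover nodes can be observed:

```c
#include <stdlib.h>

/* Toy stand-in for a commit list; not git's struct commit_list. */
struct node {
	int item;
	struct node *next;
};

/* Prepend an item and return the new head. */
static struct node *push(struct node *head, int item)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		abort();
	n->item = item;
	n->next = head;
	return n;
}

/* Remove and free the first node, returning its item. */
static int pop(struct node **list)
{
	struct node *top = *list;
	int item = top->item;

	*list = top->next;
	free(top);
	return item;
}

/* Fully consuming variant: every node is popped and freed. */
static int consume_all(struct node **list)
{
	int sum = 0;

	while (*list)
		sum += pop(list);
	return sum;
}

/*
 * The pattern described above: pop the first entry, then merely walk
 * the remainder. The caller's original pointer now refers to freed
 * memory, and the remaining nodes are never freed. They are returned
 * here only so that the effect is visible.
 */
static struct node *consume_first_only(struct node *list, int *sum)
{
	struct node *iter;

	*sum = pop(&list);
	for (iter = list; iter; iter = iter->next)
		*sum += iter->item;
	return list;
}
```

With a single-element list the remainder is empty, so nothing leaks and
the problem stays hidden; with two or more elements the remainder is
what gets lost.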


So, yes, I think there's a leak, but it's not due to this patch.  It's
one that has been around since...the introduction of merge-recursive
(though it originally computed the merge bases internally rather than
allowing them to be passed in).  So, it's been around for quite a
while.

I'll look into it, and see if I can come up with a fix, but it doesn't
really belong in this series.  I'll submit it separately.

Thanks for the report.

> > +test_description='remerge-diff handling'
> > +
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'setup basic merges' '
> > +     test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
> > +     git add numbers &&
> > +     git commit -m base &&
> > +
> > +     git branch feature_a &&
> > +     git branch feature_b &&
> > +     git branch feature_c &&
> > +
> > +     git branch ab_resolution &&
> > +     git branch bc_resolution &&
> > +
> > +     git checkout feature_a &&
> > +     test_write_lines 1 2 three 4 5 6 7 eight 9 >numbers &&
> > +     git commit -a -m change_a &&
> > +
> > +     git checkout feature_b &&
> > +     test_write_lines 1 2 tres 4 5 6 7 8 9 >numbers &&
> > +     git commit -a -m change_b &&
> > +
> > +     git checkout feature_c &&
> > +     test_write_lines 1 2 3 4 5 6 7 8 9 10 >numbers &&
> > +     git commit -a -m change_c &&
> > +
> > +     git checkout bc_resolution &&
> > +     git merge --ff-only feature_b &&
> > +     # no conflict
> > +     git merge feature_c &&
> > +
> > +     git checkout ab_resolution &&
> > +     git merge --ff-only feature_a &&
> > +     # conflicts!
> > +     test_must_fail git merge feature_b &&
> > +     # Resolve conflict...and make another change elsewhere
> > +     test_write_lines 1 2 drei 4 5 6 7 acht 9 >numbers &&
> > +     git add numbers &&
>
> Just a matter of taste, but FWIW some of the custom
> test_write_lines/commit here could nowadays use test_commit with
> --printf: 47c88d16ba6 (test-lib functions: add --printf option to
> test_commit, 2021-05-10)
>
> I don't think it's worth the churn to change it here, just an FYI.

Good to know; thanks for the heads up.

Note, though, and this has nothing to do with your patches, but I'm
not sure I'll ever use this particular feature since I don't much care
for test_commit except in trivial cases.  Others have recommended the
function to me before, but my attempts to use it have cost me far more
time than it has saved due to its quirks not working well with the
merges I have attempted to set up.  Beyond the fact that its
documentation is a lie and the filename defaults to <message>.t, one
also has to memorize the order of three positional arguments and add a
smattering of additional flags (--printf, --append, --no-tag) and add
a bunch of newline directives to get things right.  The function can
be useful and nice on non-merge tests (e.g. when you can pass it just
one positional argument) and I'm happy to use it there, but for
merge-related tests it's a needless time sink where the best you can
hope for after fighting it is getting code that is overall _less_
readable than what you started with.


* Re: [PATCH v3 1/9] show, log: provide a --remerge-diff capability
  2022-01-19 16:01       ` Ævar Arnfjörð Bjarmason
@ 2022-01-20  2:33         ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2022-01-20  2:33 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh,
	Johannes Altmanninger

On Wed, Jan 19, 2022 at 8:06 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Thu, Dec 30 2021, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
> > +     struct tmp_objdir *remerge_objdir = NULL;
> > +
> > +     if (rev->remerge_diff) {
> > +             remerge_objdir = tmp_objdir_create("remerge-diff");
> > +             if (!remerge_objdir)
> > +                     die_errno(_("unable to create temporary object directory"));
> > +             tmp_objdir_replace_primary_odb(remerge_objdir, 1);
> > +     }
>
> Re the errno feedback on v1
> https://lore.kernel.org/git/211221.864k71r6kz.gmgdl@evledraar.gmail.com/
> the API might lose the "errno" due to e.g. the remove_dir_recurse()
> codepath. This seems like it would take care of that:
>
> diff --git a/builtin/log.c b/builtin/log.c
> index 944d9c0d9b5..d4b8b1aa4b6 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -424,9 +424,9 @@ static int cmd_log_walk(struct rev_info *rev)
>         int saved_dcctc = 0;
>
>         if (rev->remerge_diff) {
> -               rev->remerge_objdir = tmp_objdir_create("remerge-diff");
> +               rev->remerge_objdir = tmp_objdir_create_gently("remerge-diff", 0);
>                 if (!rev->remerge_objdir)
> -                       die_errno(_("unable to create temporary object directory"));
> +                       exit(128);
>                 tmp_objdir_replace_primary_odb(rev->remerge_objdir, 1);
>         }
>
> diff --git a/tmp-objdir.c b/tmp-objdir.c
> index adf6033549e..3c656120003 100644
> --- a/tmp-objdir.c
> +++ b/tmp-objdir.c
> @@ -121,19 +121,21 @@ static void env_replace(struct strvec *env, const char *key, const char *val)
>         strvec_pushf(env, "%s=%s", key, val);
>  }
>
> -static int setup_tmp_objdir(const char *root)
> +static int setup_tmp_objdir(const char *root, int quiet)
>  {
>         char *path;
>         int ret = 0;
>
>         path = xstrfmt("%s/pack", root);
>         ret = mkdir(path, 0777);
> +       if (!quiet && ret < 0)
> +               die_errno(_("unable to create temporary object directory '%s'"), path);
>         free(path);
>
>         return ret;
>  }
>
> -struct tmp_objdir *tmp_objdir_create(const char *prefix)
> +struct tmp_objdir *tmp_objdir_create_gently(const char *prefix, int quiet)
>  {
>         static int installed_handlers;
>         struct tmp_objdir *t;
> @@ -161,6 +163,8 @@ struct tmp_objdir *tmp_objdir_create(const char *prefix)
>         strbuf_grow(&t->path, 1024);
>
>         if (!mkdtemp(t->path.buf)) {
> +               if (!quiet)
> +                       error_errno(_("unable to create temporary directory '%s'"), t->path.buf);
>                 /* free, not destroy, as we never touched the filesystem */
>                 tmp_objdir_free(t);
>                 return NULL;
> @@ -173,7 +177,7 @@ struct tmp_objdir *tmp_objdir_create(const char *prefix)
>                 installed_handlers++;
>         }
>
> -       if (setup_tmp_objdir(t->path.buf)) {
> +       if (setup_tmp_objdir(t->path.buf, quiet)) {
>                 tmp_objdir_destroy(t);
>                 return NULL;
>         }
> diff --git a/tmp-objdir.h b/tmp-objdir.h
> index 76efc7edee5..5072fb860d9 100644
> --- a/tmp-objdir.h
> +++ b/tmp-objdir.h
> @@ -24,8 +24,15 @@ struct tmp_objdir;
>  /*
>   * Create a new temporary object directory with the specified prefix;
>   * returns NULL on failure.
> + *
> + * The tmp_objdir_create() is an a wrapper for
> + * tmp_objdir_create_gently(..., 1).
>   */
> -struct tmp_objdir *tmp_objdir_create(const char *prefix);
> +struct tmp_objdir *tmp_objdir_create_gently(const char *prefix, int quiet);
> +static inline struct tmp_objdir *tmp_objdir_create(const char *prefix)
> +{
> +       return tmp_objdir_create_gently(prefix, 1);
> +}
>
>  /*
>   * Return a list of environment strings, suitable for use with

Yeah, I think this suggests that switching from die() to die_errno()
was a mistake.  Your patch looks right (though most of it belongs as
part of ns/tmp-objdir rather than this series), but I think it makes
the code uglier and I don't see why this theoretical error path is
worth all this trouble.  A die() is totally sufficient here.
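
For reference, the general hazard under discussion (cleanup code
overwriting errno before the caller gets to report it) can be modelled
as below. This is a toy sketch, not the tmp-objdir code: both functions
are hypothetical, and EACCES merely stands in for whatever mkdir() or
mkdtemp() might have reported:

```c
#include <errno.h>

/* Hypothetical cleanup step that clobbers errno, as real cleanup may. */
static void cleanup_clobbering_errno(void)
{
	errno = 0;
}

/*
 * Hypothetical failing operation. It records the failure cause in
 * errno, runs cleanup, and (if asked to) restores errno afterwards so
 * that a caller using a die_errno()-style report sees the real cause.
 */
static int create_or_fail(int preserve_errno)
{
	int saved_errno;

	errno = EACCES;		/* pretend the directory creation failed */
	saved_errno = errno;
	cleanup_clobbering_errno();
	if (preserve_errno)
		errno = saved_errno;
	return -1;
}
```

Without the save/restore step the caller sees whatever the cleanup left
behind, which is why an errno-based message is only as reliable as the
API beneath it; a plain die() sidesteps the question.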


* Re: [PATCH v3 3/9] ll-merge: make callers responsible for showing warnings
  2022-01-19 16:41       ` Ævar Arnfjörð Bjarmason
@ 2022-01-20  3:29         ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2022-01-20  3:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh,
	Johannes Altmanninger

On Wed, Jan 19, 2022 at 8:51 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Thu, Dec 30 2021, Elijah Newren via GitGitGadget wrote:
>
> > Note that my methodology included first modifying ll_merge() to return
> > a struct, so that the compiler would catch all the callers for me and
> > ensure I had modified all of them.  After modifying all of them, I then
> > changed the struct to an enum.
> > [...]
> > -int ll_merge(mmbuffer_t *result_buf,
> > +enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
> >            const char *path,
> >            mmfile_t *ancestor, const char *ancestor_label,
> >            mmfile_t *ours, const char *our_label,
> > diff --git a/ll-merge.h b/ll-merge.h
> > index aceb1b24132..e4a20e81a3a 100644
> > --- a/ll-merge.h
> > +++ b/ll-merge.h
> > @@ -82,13 +82,20 @@ struct ll_merge_options {
> >       long xdl_opts;
> >  };
> >
> > +enum ll_merge_result {
> > +     LL_MERGE_ERROR = -1,
> > +     LL_MERGE_OK = 0,
> > +     LL_MERGE_CONFLICT,
> > +     LL_MERGE_BINARY_CONFLICT,
> > +};
> > +
>
> Isn't the other side of the enum checking missing in many cases?
>
> E.g. ll_ext_merge() returns "enum ll_merge_result" now, and does:
>
>         status = run_command_v_opt(args, RUN_USING_SHELL);
>         ret = (status > 0) ? LL_MERGE_CONFLICT : status;
>
> And grepping at the tip of this series shows:
>
>     $ git grep LL_MERGE_OK
>     ll-merge.c:             ret = LL_MERGE_OK;
>     ll-merge.c:                     ret = LL_MERGE_OK;
>     ll-merge.c:                     ret = LL_MERGE_OK;
>     ll-merge.h:     LL_MERGE_OK = 0,
>
> Similar for LL_MERGE_CONFLICT, the only one that's used outside of the
> file itself and its header is LL_MERGE_BINARY_CONFLICT.
>
> I.e. shouldn't these codepaths:
>
>     git grep -w ll_merge
>
> Be doing a switch() on that new enum? E.g. we lose the type in
> three_way_merge() in apply.c, it seems to me that that function should
> switch over this new enum, and return the "int" that the callers of
> three_way_merge() care about (i.e. just <0, 0, 1, not this enum's -1, 0,
> 1, 2.

Actually, three_way_merge()'s callers operate on <0, 0, >0; 1 is not
special.  That's also the interface that ll_merge() traditionally
always used, and which still mostly applies; it's just that callers now
want to differentiate between two of the >0 cases (namely
LL_MERGE_CONFLICT vs. LL_MERGE_BINARY_CONFLICT).

That may sound like a big change, but every single caller can just
check for LL_MERGE_BINARY_CONFLICT as a first step if it cares (some
callers don't), and then fall back to the old code as-is, which
assumes <0 vs. 0 vs. >0.  That seems like a much simpler change, and
it's what this patch does.
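
Roughly, the caller-side pattern looks like the sketch below. It is an
illustration only: handle_merge_status() is a hypothetical function,
not code from the patch, and the warning text is a placeholder. Only
the enum comes from the series:

```c
#include <stdio.h>

/* Copied from the quoted ll-merge.h hunk. */
enum ll_merge_result {
	LL_MERGE_ERROR = -1,
	LL_MERGE_OK = 0,
	LL_MERGE_CONFLICT,
	LL_MERGE_BINARY_CONFLICT,
};

/*
 * Hypothetical caller: special-case the binary conflict first, then
 * treat the value exactly as the old int-returning code did
 * (<0 error, 0 clean, >0 conflict).
 */
static int handle_merge_status(enum ll_merge_result status, const char *path)
{
	if (status == LL_MERGE_BINARY_CONFLICT)
		fprintf(stderr, "warning: Cannot merge binary files: %s\n", path);

	if (status < 0)
		return -1;	/* error */
	return status > 0;	/* 0 = clean merge, 1 = some kind of conflict */
}
```

A caller that does not care about binary conflicts would simply omit
the first check and keep its existing comparisons.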

Trying to convert all these callers over to switch statements seems
like unnecessary churn to me.  Since Junio has already reviewed a
former round of this patch and found it to his liking (modulo a
completely different one-line issue that I since corrected), I think
I'd like to stick with the patch as-is.  If folks feel really strongly
about changing something here, though, I can change the return type of
the ll_*merge() functions back to int.


* Re: [PATCH v3 1/9] show, log: provide a --remerge-diff capability
  2022-01-20  2:31         ` Elijah Newren
@ 2022-01-20  7:53           ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2022-01-20  7:53 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh,
	Johannes Altmanninger

On Wed, Jan 19, 2022 at 6:31 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Wed, Jan 19, 2022 at 7:53 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
> >
> > On Thu, Dec 30 2021, Elijah Newren via GitGitGadget wrote:
> >
> > > From: Elijah Newren <newren@gmail.com>
> >
> > > +static int do_remerge_diff(struct rev_info *opt,
> > > +                        struct commit_list *parents,
> > > +                        struct object_id *oid,
> > > +                        struct commit *commit)
> > > +{
> > > +     struct merge_options o;
> > > +     struct commit_list *bases;
> > > +     struct merge_result res = {0};
> > > +     struct pretty_print_context ctx = {0};
> > > +     struct commit *parent1 = parents->item;
> > > +     struct commit *parent2 = parents->next->item;
> > > +     struct strbuf parent1_desc = STRBUF_INIT;
> > > +     struct strbuf parent2_desc = STRBUF_INIT;
> > > +
> > > +     /* Setup merge options */
> > > +     init_merge_options(&o, the_repository);
> > > +     o.show_rename_progress = 0;
> > > +
> > > +     ctx.abbrev = DEFAULT_ABBREV;
> > > +     format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
> > > +     format_commit_message(parent2, "%h (%s)", &parent2_desc, &ctx);
> > > +     o.branch1 = parent1_desc.buf;
> > > +     o.branch2 = parent2_desc.buf;
> > > +
> > > +     /* Parse the relevant commits and get the merge bases */
> > > +     parse_commit_or_die(parent1);
> > > +     parse_commit_or_die(parent2);
> > > +     bases = get_merge_bases(parent1, parent2);
> >
> > There's existing leaks all over the place here unrelated to this new
> > code, so this is no big deal, but I noticed that get_merge_bases() here
> > leaks.
>
> Interesting.
>
> > Shouldn't it call free_commit_list() like e.g. diff_get_merge_base()
> > which invokes get_merge_bases() does on the return value?

...
> So, yes, I think there's a leak, but it's not due to this patch.  It's
> one that has been around since...the introduction of merge-recursive
> (though it originally computed the merge bases internally rather than
> allowing them to be passed in).  So, it's been around for quite a
> while.
>
> I'll look into it, and see if I can come up with a fix, but it doesn't
> really belong in this series.  I'll submit it separately.
>
> Thanks for the report.

Fix over here: https://lore.kernel.org/git/pull.1200.git.git.1642664835.gitgitgadget@gmail.com/T/

Has a couple bonus memory leak fixes too.


* Re: [PATCH v3 8/9] show, log: include conflict/warning messages in --remerge-diff headers
  2022-01-19 16:19       ` Ævar Arnfjörð Bjarmason
@ 2022-01-21  2:16         ` Elijah Newren
  2022-01-21 16:55           ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren @ 2022-01-21  2:16 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh,
	Johannes Altmanninger

On Thu, Jan 20, 2022 at 3:27 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Thu, Dec 30 2021, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > Conflicts such as modify/delete, rename/rename, or file/directory are
> > not representable via content conflict markers, and the normal output
> > messages notifying users about these were dropped with --remerge-diff.
> > While we don't want these messages randomly shown before the commit
> > and diff headers, we do want them to still be shown; include them as
> > part of the diff headers instead.
> > [...]
> > +test_expect_success 'setup non-content conflicts' '
> > +     git switch --orphan base &&
> > +
> > +     test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
> > +     test_write_lines a b c d e f g h i >letters &&
> > +     test_write_lines in the way >content &&
> > +     git add numbers letters content &&
> > +     git commit -m base &&
> > +
> > +     git branch side1 &&
> > +     git branch side2 &&
> > +
> > +     git checkout side1 &&
> > +     test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
> > +     git mv letters letters_side1 &&
> > +     git mv content file_or_directory &&
> > +     git add numbers &&
> > +     git commit -m side1 &&
> > +
> > +     git checkout side2 &&
> > +     git rm numbers &&
> > +     git mv letters letters_side2 &&
> > +     mkdir file_or_directory &&
> > +     echo hello >file_or_directory/world &&
> > +     git add file_or_directory/world &&
> > +     git commit -m side2 &&
> > +
> > +     git checkout -b resolution side1 &&
> > +     test_must_fail git merge side2 &&
> > +     test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
> > +     git add numbers &&
> > +     git add letters_side1 &&
> > +     git rm letters &&
> > +     git rm letters_side2 &&
> > +     git add file_or_directory~HEAD &&
> > +     git mv file_or_directory~HEAD wanted_content &&
> > +     git commit -m resolved
> > +'
> > +
> > +test_expect_success 'remerge-diff with non-content conflicts' '
> > +     git log -1 --oneline resolution >tmp &&
> > +     cat <<-EOF >>tmp &&
> > +     diff --git a/file_or_directory~HASH (side1) b/wanted_content
> > +     similarity index 100%
> > +     rename from file_or_directory~HASH (side1)
> > +     rename to wanted_content
> > +     remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
> > +     diff --git a/letters b/letters
> > +     remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
> > +     diff --git a/letters_side2 b/letters_side2
> > +     deleted file mode 100644
> > +     index b236ae5..0000000
> > +     --- a/letters_side2
> > +     +++ /dev/null
> > +     @@ -1,9 +0,0 @@
> > +     -a
> > +     -b
> > +     -c
> > +     -d
> > +     -e
> > +     -f
> > +     -g
> > +     -h
> > +     -i
> > +     diff --git a/numbers b/numbers
> > +     remerge CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
> > +     EOF
> > +     # We still have some sha1 hashes above; rip them out so test works
> > +     # with sha256
> > +     sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
> > +
> > +     git show --oneline --remerge-diff resolution >tmp &&
> > +     sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
> > +     test_cmp expect actual
> > +'
> > +
> >  test_done
>
> Re my comment about --diff-filter in an earlier round, I think testing
> for that option should really be added here.
>
> With current master and seen:
>
>     $ git rev-parse origin/master
>     50b2d72e110cad39ecaf2322bfdf1b60cd13dd96
>     $ git rev-parse origin/seen
>     9e835a8bdafce2aaeb6df5f57f11014051bbfdca
>
> I will, with A, M, D get:
>
>     for i in A M D; do echo With $i: && git -P log --oneline --remerge-diff --diff-filter=$i origin/master..origin/seen; done
>
> Some of which is expected, and some of which is still weird, e.g.:
>
>     $ git log --oneline --remerge-diff --diff-filter=D origin/master..origin/seen
>     d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
>     diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>     remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
>     61239ae3ee7 Merge branch 'pw/fix-some-issues-in-reset-head' into seen
>     diff --git a/reset.c b/reset.c
>     remerge CONFLICT (content): Merge conflict in reset.c
>     diff --git a/sequencer.c b/sequencer.c
>     remerge CONFLICT (content): Merge conflict in sequencer.c
>     9b44aca15e4 Merge branch 'hn/reftable-coverity-fixes' into jch
>     diff --git a/reftable/stack.c b/reftable/stack.c
>     remerge CONFLICT (content): Merge conflict in reftable/stack.c
>     [...]

Thanks for the detailed testing and report.  Much appreciated.

I agree that this output doesn't make sense...but...we might have a
bit of a pickle.  I want to be able to do
   $ git show --remerge-diff ${merge_commit}
and have it show me whether the user did something to resolve that
merge.  That is NOT the same as asking whether the
as-merged-as-possible-tree matches the final tree.  Some examples
where those results differ:
  * The original merge had a binary conflict.  Merge machinery bailed
and put the copy from the first parent in the working tree.  User
resolved it by keeping copy from the first parent.
  * The original merge had a modify/delete conflict.  Merge machinery
bailed and left the file from the modified side in place.  User
resolved by keeping the file.
  * Similar to the previous case, but rename/delete instead of modify/delete.
  * Directory rename detection -- the merge machinery put the file in
the new directory but marked it as conflicted.  User decided they
liked the location and kept it.
In any of the above cases, there were conflicts for the user to
resolve, but none of them will show up by diffing the
tree-at-start-of-conflict-resolution to the final-tree.  The only
notification we will have is the conflict header.  When I'm looking
for how users resolved merge conflicts, I do not want these merges to
show up empty for me (implying they were clean merges); that would
miss out on important information.

Perhaps this means that the conflict header is a change of some type,
but not one that fits into the traditional ACDMRTUXB categories of
diff-filter?  Maybe we could attempt to categorize them (e.g. content
conflicts are modifications, modify/delete are...deletes? adds?
modifies?), but I think madness and corner cases lie down that route.
My attempt to make these conflict headers show up despite empty diffs
in the simple case (which required extra code in patch 7), caused it
to also show up for all your diff-filter cases.  I agree it doesn't
seem to make as much sense in your diff-filter cases...and I can
remove it from them by removing my extra code in patch 7, but then I'm
back to not seeing any conflict headers for diffs that are
otherwise empty even when that is the info I'm looking for.

What should be done?  Treat conflict notices as an "unmerged" type of
change (maybe flagged with diff-filter=U)?  Treat it as some new type
of diff-filter change?  Something else?

I'm open to bright ideas here.

> Let's take the jh/builtin-fsmonitor-part2 merge, with =M I get this
> output:
>
>     $ git -P show --oneline --remerge-diff --diff-filter=M d120673d7cc
>     d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
>     diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>     remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
>     index 03269b5553b..e70252ed65a 100755
>     --- a/t/perf/p7519-fsmonitor.sh
>     +++ b/t/perf/p7519-fsmonitor.sh
>     @@ -127,18 +127,11 @@ test_expect_success "one time repo setup" '
>             fi &&
>
>             mkdir 1_file 10_files 100_files 1000_files 10000_files &&
>     -<<<<<<< 61239ae3ee7 (Merge branch 'pw/fix-some-issues-in-reset-head' into seen)
>     -       for i in $(test_seq 1 10); do touch 10_files/$i || return 1; done &&
>     -       for i in $(test_seq 1 100); do touch 100_files/$i || return 1; done &&
>     -       for i in $(test_seq 1 1000); do touch 1000_files/$i || return 1; done &&
>     -       for i in $(test_seq 1 10000); do touch 10000_files/$i || return 1; done &&
>     -=======
>             touch_files 1 &&
>             touch_files 10 &&
>             touch_files 100 &&
>             touch_files 1000 &&
>             touch_files 10000 &&
>     ->>>>>>> e89980feb1d (t7527: test status with untracked-cache and fsmonitor--daemon)
>             git add 1_file 10_files 100_files 1000_files 10000_files &&
>             git commit -qm "Add files" &&
>
> Which is fully expected, i.e. here the diff is modified (M).
>
> But there aren't any added lines, so why do I get it under =A, and why
> isn't the diff shown with =D (compare a normal 'git log --diff-filter=D
> -p')?:

Note that 'A' and 'D' diff filters are NOT about added/deleted lines,
they are about added/deleted files.  Those diff-filters will not find
added/deleted lines except when the entire file is added/deleted.  So
this part is fully expected.
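
As a rough illustration of why the status letter is a per-file property, here is a simplified sketch in C. The struct is hypothetical (git's real diff_filepair carries more state), and it assumes a pair is only queued when the two sides differ; renames, copies, and type changes are ignored.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified file pair: a NULL side means the file does
 * not exist on that side of the diff.
 */
struct file_pair {
	const char *old_blob;
	const char *new_blob;
};

/* Return the --diff-filter status letter for a whole-file change. */
static char file_status(const struct file_pair *p)
{
	assert(p->old_blob || p->new_blob);
	if (!p->old_blob)
		return 'A'; /* the file was added */
	if (!p->new_blob)
		return 'D'; /* the file was deleted */
	return 'M';         /* the file exists on both sides */
}
```

A hunk that only removes lines from a file present on both sides still yields 'M' here, which matches what you saw for t/perf/p7519-fsmonitor.sh.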

>     $ for i in A D; do echo With $i: && git -P show --oneline --remerge-diff --diff-filter=$i d120673d7cc; done
>     With A:
>     d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
>     diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>     remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
>     With D:
>     d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
>     diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>     remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh

Yes, this is the same issue as the first one above: my extra code in
patch 7 makes it always show the conflict headers (there are no add or
delete changes to show for this particular commit).

> Furthermore pathspec arguments seem to be broken. I.e. to use that
> commit we can see without --remerge-diff that it's not directly modified
> in a non-merge in that range:
>
>     $ git -P log --oneline origin/master..origin/next -- t/perf/p7519-fsmonitor.sh
>     d6f56f3248e Merge branch 'es/test-chain-lint' into next
>
> But this should surely work, but doesn't. It's faking up a diff with =M,
> so the pathspec filters should show it, shouldn't they?
>
>     $ for i in A M D; do git -P show --oneline --remerge-diff --diff-filter=$i d120673d7cc -- t/perf/p7519-fsmonitor.sh; done
>     $

To pin this down a bit more, you could simplify this command
line to just

    $ git show --remerge-diff d120673d7cc -- t/perf/p7519-fsmonitor.sh
    $

and see that it doesn't include any output.  In other words, it's not
an interaction with --diff-filter (or --oneline), but just the
pathspecs.

(Also, it turns out that you can also leave out --remerge-diff and
you'll also get no output, but I'm not sure if that helps or hurts
with the "debugging", which I'll continue below.)

> Probably what's happening is that the filtering is being done on the
> pre-"-remerge-diff" output. I.e. the traversal code needs to be updated
> to inject modified paths into the commits we show --remerge-diff commits
> for (but I'm just guessing).

Yes, the filtering is being done on the commit before
do_remerge_diff() is ever called.  In fact, do_remerge_diff() is NOT
called; it never even gets a chance to look at this commit.

But this isn't unique to remerge-diff; check this out:

    $ git show --diff-merges=first-parent d120673d7cc -- t/perf/p7519-fsmonitor.sh
    $ git show --diff-merges=separate d120673d7cc -- t/perf/p7519-fsmonitor.sh
    $

This might explain what's going on to you:

    $ git diff --raw d120673d7cc^1 d120673d7cc -- t/perf/p7519-fsmonitor.sh
    :100755 100755 c8be58f3c7 e70252ed65 M  t/perf/p7519-fsmonitor.sh
    $ git diff --raw d120673d7cc^2 d120673d7cc -- t/perf/p7519-fsmonitor.sh
    $

Basically, git-log's (and git-show's) default history simplification
says "oh, the file version matches what was seen in the second parent?
 UNINTERESTING."  Which is kinda broken when diffing against the first
parent OR against the remerge-diff.
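
A deliberately simplified model of that decision, in C, might look like the following. The function names are made up for illustration and this is not how revision.c is structured; it only assumes that, for the limited path, the merge is dropped when its blob matches at least one parent unless full history is requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True when the merge's version of the path matches at least one parent. */
static bool same_as_some_parent(const char *merge_blob,
				const char **parent_blobs, int nr)
{
	assert(nr > 0);
	for (int i = 0; i < nr; i++)
		if (!strcmp(merge_blob, parent_blobs[i]))
			return true;
	return false;
}

/* Simplified: would a pathspec-limited walk show this merge at all? */
static bool merge_is_shown(const char *merge_blob,
			   const char **parent_blobs, int nr,
			   bool full_history)
{
	return full_history ||
	       !same_as_some_parent(merge_blob, parent_blobs, nr);
}
```

Plugging in the blob ids from the --raw output above (first parent c8be58f3c7, merge e70252ed65, second parent identical to the merge), the merge is skipped by default even though it differs from its first parent.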

This particular issue is probably worth bringing up in the
remerge-diff documentation (and perhaps also the diff-merges=separate
and diff-merges=first-parent documentation), so that users know they
may want to specify --full-history together with these options.  (Or
maybe these particular types of diff-merges should just automatically
turn full history on?)

Now, just to show that this really is a history simplification issue,
here's the output when I include the --full-history flag together with
your command line:

$ git show --remerge-diff --oneline --full-history --diff-filter=M d120673d7cc -- t/perf/p7519-fsmonitor.sh
d120673d7c Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
index 02b53bcaec..e70252ed65 100755
--- a/t/perf/p7519-fsmonitor.sh
+++ b/t/perf/p7519-fsmonitor.sh
@@ -127,23 +127,11 @@ test_expect_success "one time repo setup" '
        fi &&

        mkdir 1_file 10_files 100_files 1000_files 10000_files &&
-<<<<<<< 61239ae3ee (Merge branch 'pw/fix-some-issues-in-reset-head' into seen)
-       for i in $(test_seq 1 10); do touch 10_files/$i || return 1; done &&
-       for i in $(test_seq 1 100); do touch 100_files/$i || return 1; done &&
-       for i in $(test_seq 1 1000); do touch 1000_files/$i || return 1; done &&
-       for i in $(test_seq 1 10000); do touch 10000_files/$i || return 1; done &&
-||||||| 9d530dc002
-       for i in $(test_seq 1 10); do touch 10_files/$i; done &&
-       for i in $(test_seq 1 100); do touch 100_files/$i; done &&
-       for i in $(test_seq 1 1000); do touch 1000_files/$i; done &&
-       for i in $(test_seq 1 10000); do touch 10000_files/$i; done &&
-=======
        touch_files 1 &&
        touch_files 10 &&
        touch_files 100 &&
        touch_files 1000 &&
        touch_files 10000 &&
->>>>>>> e89980feb1 (t7527: test status with untracked-cache and fsmonitor--daemon)
        git add 1_file 10_files 100_files 1000_files 10000_files &&
        git commit -qm "Add files" &&

So, for this pathspec with remerge-diff, we're at least behaving
exactly as git-log is documented.  But I do agree that folks might not
be expecting the default history simplification that log/show do when
using these other diff types.

> For the rest of the --diff-filter flags the behavior also seems wrong, I
> really didn't expect this to show any output:
>
>     $ for i in R T U X B; do echo With $i: && git -P show --oneline --remerge-diff --diff-filter=$i d120673d7cc; done
>     With R:
>     d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
>     diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>     remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
>     With T:
>     d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
>     diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>     remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
>     With U:
>     d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
>     diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>     remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
>     With X:
>     d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
>     diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>     remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
>     With B:
>     d120673d7cc Merge branch 'jh/builtin-fsmonitor-part2' (early part) into seen
>     diff --git a/t/perf/p7519-fsmonitor.sh b/t/perf/p7519-fsmonitor.sh
>     remerge CONFLICT (content): Merge conflict in t/perf/p7519-fsmonitor.sh
>
> I.e. we don't have a (R)ename, (T)ype change, (U)nmerged (well, maybe,
> but isn't it just for the index? See t6060-merge-index.sh) or Unknown
> (X) there. Are they all being shown because of that generic "remerge
> CONFLICT" line?

Here you're repeating the same issue as at the beginning of the email,
though I guess just to check all the other diff-filter types besides
A, M, and D?  I agree we want to be able to filter these away somehow,
but, if a merge had conflicts and was resolved without making any file
changes, I don't want those conflicts notices to be filtered away when
I do a simple
    $ git show --remerge-diff ${merge_commit}
So, we need a clever solution that handles both kinds of cases, and
hopefully in a way that generically makes sense.  Perhaps the conflict
notices are selected by --diff-filter=U; maybe something else.  I'm
open to ideas.

> If the answer to all of the above is "yes, some of it is weird or
> unintended, but let's deal with it later" I'd think that would also be
> fine.
>
> But let's then at least add something like what I added to the
> git-range-diff.txt docs in df569c3f31f (range-diff doc: add a section
> about output stability, 2018-11-09). I.e. explicitly say that we might
> change the output when combined with other log options in the future,
> and that any combination not currently documented won't be supported.
>
> Re the CL mention of:
>
>      * Ævar suggested also extending the docs with usage guidelines, but the
>        example he picked was IMO best handled by just add --remerge-diff, so I'm
>        not sure what to add to the docs. Maybe the log -S<string> --remerge-diff
>        example as a way to more reliably determine when a string was added to or
>        removed from the codebase? Where would that go anyway?
>
> I don't think we need to document how --remerge-diff interacts with -S,
> -G, or perhaps even most of --diff-filter.
>
> But per the above it seems to me that we should at least have basic
> tests (perhaps TODO tests), or explicitly document/note that some of the
> interactions are buggy/weird (or not, maybe I'm just missing something).
>
> The same goes for some other diff options, particularly those where
> we're showing output we didn't before because of --remerge-diff,
> e.g. --check is one such option. When I alter your tests with:
>
>     diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
>     index c1b44138145..d96320e6ab8 100755
>     --- a/t/t4069-remerge-diff.sh
>     +++ b/t/t4069-remerge-diff.sh
>     @@ -120,7 +120,8 @@ test_expect_success 'setup non-content conflicts' '
>
>             git checkout -b resolution side1 &&
>             test_must_fail git merge side2 &&
>     -       test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
>     +       test_write_lines 1 2 three 4 5 6 7 8 >numbers &&
>     +       echo "9 " >>numbers &&
>             git add numbers &&
>             git add letters_side1 &&
>             git rm letters &&
>
> The --check option works as expected, but we've got no test for the
> combination of the two. Maybe we don't need them since we're confident
> enough in the shared machinery, but I'd think it would be better to
> consider this a black box and test it. I.e. maybe another --check
> implementation would filter on whatever we use for the pathspecs
> (showing it doesn't need to look at merge commits), and show nothing.
>
> All of the above is just noting the journey of testing this, i.e. "hrm,
> will it work with XYZ? No? Seems odd, and it's not tested at all...".
>
> As noted before I find the current output really useful already. I've
> just been trying to poke it in various ways to see if I can uncover any
> bugs or unintended behavior.

Very helpful, thanks.  So, I think there are three issues here:

   * default history simplification is surprising for
--diff-merges={separate,first-parent,remerge}, particularly in
combination with pathspecs.  Document that, or just have these options
turn on --full-history.
   * more tests would be useful (though I'm worried about
combinatorial explosion, so I think just a few more would be good,
particularly around the --diff-filter options)
   * interactions with --diff-filter are suboptimal.  We need
something better, but if I revert my other changes to fix that issue
then I break the simple "Did this merge have conflicts?" usecase.  We
need some clever solution.


* Re: [PATCH v3 8/9] show, log: include conflict/warning messages in --remerge-diff headers
  2022-01-21  2:16         ` Elijah Newren
@ 2022-01-21 16:55           ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2022-01-21 16:55 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh,
	Johannes Altmanninger

On Thu, Jan 20, 2022 at 6:16 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Thu, Jan 20, 2022 at 3:27 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
> >
...
> > As noted before I find the current output really useful already. I've
> > just been trying to poke it in various ways to see if I can uncover any
> > bugs or unintended behavior.
>
> Very helpful, thanks.  So, I think there are three issues here:
>
>    * default history simplification is surprising for
> --diff-merges={separate,first-parent,remerge}, particularly in
> combination with pathspecs.  Document that, or just have these options
> turn on --full-history.
>    * more tests would be useful (though I'm worried about
> combinatorial explosion, so I think just a few more would be good,
> particularly around the --diff-filter options)
>    * interactions with --diff-filter are suboptimal.  We need
> something better, but if I revert my other changes to fix that issue
> then I break the simple "Did this merge have conflicts?" usecase.  We
> need some clever solution.

I tried out making the conflict headers be shown by --diff-filter=U
*OR* whenever the associated diff for that file would be shown (e.g.
if --diff-filter=R selects the file and it had a conflict header, it'd
show the conflict header.)  For diff-filters other than U that don't
match the filediff, nothing for the file shows up.  Seems to work
great, and cleans up the output quite a bit.
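
Spelled out as a small C sketch, the rule I tried is roughly the following. The names here are hypothetical and the filter is modeled as a plain string of selected status letters (NULL meaning no --diff-filter was given); the real patch works on the diff_options filter bits instead, and exclusion (lowercase) filters are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-path result of a remerge-diff. */
struct path_result {
	bool has_conflict_header;
	char status; /* 'A', 'M', 'D', 'R', ...; 0 if there is no file diff */
};

/* Is the file diff itself selected by the filter? */
static bool show_file_diff(const struct path_result *p, const char *filter)
{
	assert(p);
	if (!p->status)
		return false;
	return !filter || strchr(filter, p->status) != NULL;
}

/* Header shows with no filter, with U, or alongside a selected file diff. */
static bool show_conflict_header(const struct path_result *p,
				 const char *filter)
{
	assert(p);
	if (!p->has_conflict_header)
		return false;
	if (!filter)
		return true;
	return strchr(filter, 'U') != NULL || show_file_diff(p, filter);
}
```

With this, the p7519-fsmonitor.sh case (conflict header plus an 'M' diff) shows its header for =M and =U but not for =A or =D, while a merge resolved without file changes still shows its header when no filter is given.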

I've added more tests, including one for pathspecs since that did need
just a little more work.  I'll get it cleaned up and resubmitted soon.


* [PATCH v4 00/10] Add a new --remerge-diff capability to show & log
  2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
                       ` (9 preceding siblings ...)
  2021-12-31  8:46     ` [PATCH v3 0/9] Add a new --remerge-diff capability to show & log Junio C Hamano
@ 2022-01-21 19:12     ` Elijah Newren via GitGitGadget
  2022-01-21 19:12       ` [PATCH v4 01/10] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
                         ` (10 more replies)
  10 siblings, 11 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-01-21 19:12 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren

Here are some patches to add a --remerge-diff capability to show & log,
which works by comparing merge commits to an automatic remerge (note that
the automatic remerge tree can contain files with conflict markers).

Changes since v3:

 * Filter conflict headers according to pathspecs
 * Instead of always including conflict headers for all diff types, only
   select them with --diff-filter=U OR whenever the associated diff in
   question is selected
 * New testcases dealing with --diff-filter, pathspecs, and default history
   simplification
 * Switched back from die_errno() to die()

Changes NOT included (mostly because I'm not sure what to add or where):

 * Johannes Altmanninger suggested changing the ordering of the new headers
   relative to other headers. He made a good point, but I also like having
   the conflict messages next to the text, so I'm conflicted about what's
   best.
 * (Technically not part of this feature, but kind of related.) Months ago,
   Junio suggested documenting ${GIT_DIR}/AUTO_MERGE better
   (https://lore.kernel.org/git/xmqqtuj4nepe.fsf@gitster.g/). I looked at
   the time, but couldn't find a place to put it that made sense to me.

Changes since v2 (of the restarted submission):

 * Numerous small improvements suggested by Johannes Altmanninger
 * Avoid including conflict messages from inner merges (due to example
   pointed out by Ævar).
 * Added a "remerge" prefix to all the new diff headers (suggested by Junio
   in a previous round, but I couldn't come up with a good name before. It
   suddenly hit me that "remerge" is an obvious prefix to use, and even
   helps explain what the rest of the line is for.)

Changes since v1 (of the restarted submission, which technically was v2):

 * Restructured the series, so the first patch introduces the feature --
   with a bunch of caveats. Subsequent patches clean up those caveats. This
   avoids introducing not-yet-used functions, and hopefully makes review
   easier.
 * added testcases
 * numerous small improvements suggested by Ævar and Junio

Changes since original submission[1]:

 * Rebased on top of the version of ns/tmp-objdir that Neeraj submitted
   (Neeraj's patches were based on v2.34, but ns/tmp-objdir got applied on
   an old commit and does not even build because of that).
 * Modify ll-merge API to return a status, instead of printing "Cannot merge
   binary files" on stdout[2] (as suggested by Peff)
 * Make conflict messages and other such warnings into diff headers of the
   subsequent remerge-diff rather than appearing in the diff as file content
   of some funny looking filenames (as suggested by Peff[3] and Junio[4])
 * Sergey ack'ed the diff-merges.c portion of the patches, but that wasn't
   limited to one patch so not sure where to record that ack.

[1] https://lore.kernel.org/git/pull.1080.git.git.1630376800.gitgitgadget@gmail.com/;
    GitHub wouldn't let me change the target branch for the PR, so I had to
    create a new one with the new base and thus the reason for not sending this
    as v2 even though it is.
[2] https://lore.kernel.org/git/YVOZRhWttzF18Xql@coredump.intra.peff.net/,
    https://lore.kernel.org/git/YVOZty9D7NRbzhE5@coredump.intra.peff.net/
[3] https://lore.kernel.org/git/YVOXPTjsp9lrxmS6@coredump.intra.peff.net/
[4] https://lore.kernel.org/git/xmqqr1d7e4ug.fsf@gitster.g/

=== FURTHER BACKGROUND (original cover letter material) ===

Here are some example commits you can try this out on (with git show
--remerge-diff $COMMIT):

 * git.git conflicted merge: 07601b5b36
 * git.git non-conflicted change: bf04590ecd
 * linux.git conflicted merge: eab3540562fb
 * linux.git non-conflicted change: 223cea6a4f05

Many more can be found by just running git log --merges --remerge-diff in
your repository of choice and searching for diffs (most merges tend to be
clean and unmodified and thus produce no diff but a search of '^diff' in the
log output tends to find the examples nicely).

Some basic high level details about this new option:

 * This option is most naturally compared to --cc, though the output seems
   to be much more understandable to most users than --cc output.
 * Since merges are often clean and unmodified, this new option results in
   an empty diff for most merges.
 * This new option shows things like the removal of conflict markers, which
   hunks users picked from the various conflicted sides to keep or remove,
   and shows changes made outside of conflict markers (which might reflect
   changes needed to resolve semantic conflicts or cleanups of e.g.
   compilation warnings or other additional changes an integrator felt
   belonged in the merged result).
 * This new option does not (currently) work for octopus merges, since
   merge-ort is specific to two-parent merges[1].
 * This option will not work on a read-only or full filesystem[2].
 * We discussed this capability at Git Merge 2020, and one of the
   suggestions was doing a periodic git gc --auto during the operation (due
   to potential new blobs and trees created during the operation). I found a
   way to avoid that; see [2].
 * This option is faster than you'd probably expect; it handles 33.5 merge
   commits per second in linux.git on my computer; see below.

In regards to the performance point above, the timing for running the
following command:

time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l


in linux.git (with v5.4 checked out, since my copy of linux is very out of
date) is as follows:

DIFF_FLAG=--cc:            71m 31.536s
DIFF_FLAG=--remerge-diff:  31m  3.170s


Note that there are 62476 merges in this history. Also, output size is:

DIFF_FLAG=--cc:            2169111 lines
DIFF_FLAG=--remerge-diff:  2458020 lines


So roughly the same amount of output as --cc, as you'd expect.
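
For anyone wanting to check the throughput figure quoted earlier, it follows directly from the merge count and the wall-clock timings above. A tiny C helper (the function name is made up for this note) does the arithmetic:

```c
#include <assert.h>

/* Merges handled per second, given a wall-clock time in minutes + seconds. */
static double merges_per_second(long merges, long minutes, double seconds)
{
	double elapsed = minutes * 60.0 + seconds;

	assert(elapsed > 0);
	return merges / elapsed;
}
```

Calling merges_per_second(62476, 31, 3.170) gives a little over 33.5 for --remerge-diff, and merges_per_second(62476, 71, 31.536) gives between 14.5 and 14.6 for --cc.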

As a side note: git log --remerge-diff, when run in various repositories and
allowed to run all the way back to the beginning(s) of history, is a nice
stress test of sorts for merge-ort. Especially when users run it for you on
their repositories they are working on, whether intentionally or via a bug
in a tool triggering that command to be run unexpectedly. Long story short,
such a bug in an internal tool existed in December 2020 and this command was
run on an internal repository and found a platform-specific bug in merge-ort
on some really old merge commit from that repo. I fixed that bug (a
STABLE_QSORT thing) while upstreaming all the merge-ort patches in the
meantime, but it was nice getting extra testing. Having more folks run this on
their repositories might be useful extra testing of the new merge strategy.

Also, I previously mentioned --remerge-diff-only (a flag to show how
cherry-picks or reverts differ from an automatic cherry-pick or revert, in
addition to showing how merges differ from an automatic merge). This series
does not include the patches to introduce that option; I'll submit them
later.

Two other things that might be interesting but are not included and which I
haven't investigated:

 * some mechanism for passing extra merge options through (e.g.
   -Xignore-space-change)
 * a capability to compare the automatic merge to a second automatic merge
   done with different merge options. (Not sure if this would be of interest
   to end users, but might be interesting while developing a new
   --strategy-option, or maybe checking how changing some default in the
   merge algorithm would affect historical merges in various repositories).

[1] I have nebulous ideas of how an Octopus-centric ORT strategy could be
written -- basically, just repeatedly invoking ort and trying to make sure
nested conflicts can be differentiated. For now, though, a simple warning is
printed that octopus merges are not handled and no diff will be shown.

[2] New blobs/trees can be written by the three-way merging step. These are
written to a temporary area (via tmp-objdir.c) under the git object store
that is cleaned up at the end of the operation, with the new loose objects
from the remerge being cleaned up after each individual merge.

Elijah Newren (10):
  show, log: provide a --remerge-diff capability
  log: clean unneeded objects during `log --remerge-diff`
  ll-merge: make callers responsible for showing warnings
  merge-ort: capture and print ll-merge warnings in our preferred
    fashion
  merge-ort: mark a few more conflict messages as omittable
  merge-ort: format messages slightly different for use in headers
  diff: add ability to insert additional headers for paths
  show, log: include conflict/warning messages in --remerge-diff headers
  merge-ort: mark conflict/warning messages from inner merges as
    omittable
  diff-merges: avoid history simplifications when diffing merges

 Documentation/diff-options.txt |  10 +-
 apply.c                        |   5 +-
 builtin/checkout.c             |  12 +-
 builtin/log.c                  |  15 ++
 diff-merges.c                  |  14 ++
 diff.c                         | 124 +++++++++++++-
 diff.h                         |   3 +-
 ll-merge.c                     |  40 +++--
 ll-merge.h                     |   9 +-
 log-tree.c                     | 118 +++++++++++++-
 merge-blobs.c                  |   5 +-
 merge-ort.c                    |  55 ++++++-
 merge-ort.h                    |  10 ++
 merge-recursive.c              |   9 +-
 merge-recursive.h              |   2 +
 notes-merge.c                  |   5 +-
 rerere.c                       |   9 +-
 revision.h                     |   6 +-
 t/t4069-remerge-diff.sh        | 290 +++++++++++++++++++++++++++++++++
 t/t6404-recursive-merge.sh     |   9 +-
 t/t6406-merge-attr.sh          |   9 +-
 tmp-objdir.c                   |   5 +
 tmp-objdir.h                   |   6 +
 23 files changed, 722 insertions(+), 48 deletions(-)
 create mode 100755 t/t4069-remerge-diff.sh


base-commit: 4e44121c2d7bced65e25eb7ec5156290132bec94
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1103%2Fnewren%2Fremerge-diff-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1103/newren/remerge-diff-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1103

Range-diff vs v3:

  1:  d57ae218cf9 !  1:  0b94724311d show, log: provide a --remerge-diff capability
     @@ builtin/log.c: static int cmd_log_walk(struct rev_info *rev)
      +	if (rev->remerge_diff) {
      +		remerge_objdir = tmp_objdir_create("remerge-diff");
      +		if (!remerge_objdir)
     -+			die_errno(_("unable to create temporary object directory"));
     ++			die(_("unable to create temporary object directory"));
      +		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
      +	}
       
     @@ t/t4069-remerge-diff.sh (new)
      +
      +. ./test-lib.sh
      +
     ++# This test is ort-specific
     ++test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort || {
     ++	skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
     ++	test_done
     ++}
     ++
      +test_expect_success 'setup basic merges' '
      +	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
      +	git add numbers &&
  2:  798625b53f2 !  2:  f06de6c1b2f log: clean unneeded objects during `log --remerge-diff`
     @@ builtin/log.c: static int cmd_log_walk(struct rev_info *rev)
      -		if (!remerge_objdir)
      +		rev->remerge_objdir = tmp_objdir_create("remerge-diff");
      +		if (!rev->remerge_objdir)
     - 			die_errno(_("unable to create temporary object directory"));
     + 			die(_("unable to create temporary object directory"));
      -		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
      +		tmp_objdir_replace_primary_odb(rev->remerge_objdir, 1);
       	}
  3:  b952f674df1 =  3:  8d6c3d48f0e ll-merge: make callers responsible for showing warnings
  4:  e8cf1626960 =  4:  de8e8f88fa4 merge-ort: capture and print ll-merge warnings in our preferred fashion
  5:  4d1848c8a29 =  5:  6b535a4d55a merge-ort: mark a few more conflict messages as omittable
  6:  81e736b847e =  6:  e2441608c63 merge-ort: format messages slightly different for use in headers
  7:  5000a94aa98 !  7:  62734beb693 diff: add ability to insert additional headers for paths
     @@ diff.c: int diff_unmodified_pair(struct diff_filepair *p)
       static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o)
       {
      -	if (diff_unmodified_pair(p))
     ++	int include_conflict_headers =
     ++	    (additional_headers(o, p->one->path) &&
     ++	     (!o->filter || filter_bit_tst(DIFF_STATUS_UNMERGED, o)));
     ++
      +	/*
      +	 * Check if we can return early without showing a diff.  Note that
      +	 * diff_filepair only stores {oid, path, mode, is_valid}
      +	 * information for each path, and thus diff_unmodified_pair() only
      +	 * considers those bits of info.  However, we do not want pairs
     -+	 * created by create_filepairs_for_header_only_notifications() to
     -+	 * be ignored, so return early if both p is unmodified AND
     -+	 * p->one->path is not in additional headers.
     ++	 * created by create_filepairs_for_header_only_notifications()
     ++	 * (which always look like unmodified pairs) to be ignored, so
     ++	 * return early if both p is unmodified AND we don't want to
     ++	 * include_conflict_headers.
      +	 */
     -+	if (diff_unmodified_pair(p) && !additional_headers(o, p->one->path))
     ++	if (diff_unmodified_pair(p) && !include_conflict_headers)
       		return;
       
      +	/* Actually, we can also return early to avoid showing tree diffs */
     @@ diff.c: static void diff_flush_checkdiff(struct diff_filepair *p,
       {
       	struct diff_queue_struct *q = &diff_queued_diff;
       	int i;
     ++	int include_conflict_headers =
     ++	    (o->additional_path_headers &&
     ++	     (!o->filter || filter_bit_tst(DIFF_STATUS_UNMERGED, o)));
      +
     -+	if (o->additional_path_headers &&
     -+	    !strmap_empty(o->additional_path_headers))
     ++	if (include_conflict_headers)
      +		return 0;
     ++
       	for (i = 0; i < q->nr; i++)
       		if (!diff_unmodified_pair(q->queue[i]))
       			return 0;
     @@ diff.h: void diffcore_fix_diff_index(void);
       "  -a  --text    treat all files as text.\n"
       
      -int diff_queue_is_empty(void);
     -+int diff_queue_is_empty(struct diff_options*);
     ++int diff_queue_is_empty(struct diff_options *o);
       void diff_flush(struct diff_options*);
       void diff_free(struct diff_options*);
       void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc);
  8:  78ec1f44e4e !  8:  17eccf7e0d6 show, log: include conflict/warning messages in --remerge-diff headers
     @@ Commit message
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
       ## log-tree.c ##
     +@@
     + #include "line-log.h"
     + #include "help.h"
     + #include "range-diff.h"
     ++#include "strmap.h"
     + 
     + static struct decoration name_decoration = { "object names" };
     + static int decoration_loaded;
     +@@ log-tree.c: static int do_diff_combined(struct rev_info *opt, struct commit *commit)
     + 	return !opt->loginfo;
     + }
     + 
     ++static void setup_additional_headers(struct diff_options *o,
     ++				     struct strmap *all_headers)
     ++{
     ++	struct hashmap_iter iter;
     ++	struct strmap_entry *entry;
     ++
     ++	/*
     ++	 * Make o->additional_path_headers contain the subset of all_headers
     ++	 * that match o->pathspec.  If there aren't any that match o->pathspec,
     ++	 * then make o->additional_path_headers be NULL.
     ++	 */
     ++
     ++	if (!o->pathspec.nr) {
     ++		o->additional_path_headers = all_headers;
     ++		return;
     ++	}
     ++
     ++	o->additional_path_headers = xmalloc(sizeof(struct strmap));
     ++	strmap_init_with_options(o->additional_path_headers, NULL, 0);
     ++	strmap_for_each_entry(all_headers, &iter, entry) {
     ++		if (match_pathspec(the_repository->index, &o->pathspec,
     ++				   entry->key, strlen(entry->key),
     ++				   0 /* prefix */, NULL /* seen */,
     ++				   0 /* is_dir */))
     ++			strmap_put(o->additional_path_headers,
     ++				   entry->key, entry->value);
     ++	}
     ++	if (!strmap_get_size(o->additional_path_headers)) {
     ++		strmap_clear(o->additional_path_headers, 0);
     ++		FREE_AND_NULL(o->additional_path_headers);
     ++	}
     ++}
     ++
     ++static void cleanup_additional_headers(struct diff_options *o)
     ++{
     ++	if (!o->pathspec.nr) {
     ++		o->additional_path_headers = NULL;
     ++		return;
     ++	}
     ++	if (!o->additional_path_headers)
     ++		return;
     ++
     ++	strmap_clear(o->additional_path_headers, 0);
     ++	FREE_AND_NULL(o->additional_path_headers);
     ++}
     ++
     + static int do_remerge_diff(struct rev_info *opt,
     + 			   struct commit_list *parents,
     + 			   struct object_id *oid,
      @@ log-tree.c: static int do_remerge_diff(struct rev_info *opt,
       	/* Setup merge options */
       	init_merge_options(&o, the_repository);
     @@ log-tree.c: static int do_remerge_diff(struct rev_info *opt,
       	merge_incore_recursive(&o, bases, parent1, parent2, &res);
       
       	/* Show the diff */
     -+	opt->diffopt.additional_path_headers = res.path_messages;
     ++	setup_additional_headers(&opt->diffopt, res.path_messages);
       	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
       	log_tree_diff_flush(opt);
       
       	/* Cleanup */
     -+	opt->diffopt.additional_path_headers = NULL;
     ++	cleanup_additional_headers(&opt->diffopt);
       	strbuf_release(&parent1_desc);
       	strbuf_release(&parent2_desc);
       	merge_finalize(&o, &res);
     @@ merge-ort.h: struct merge_result {
       	 * to merge_incore_*().  Includes data needed to update the index (if
      
       ## t/t4069-remerge-diff.sh ##
     -@@ t/t4069-remerge-diff.sh: test_description='remerge-diff handling'
     - 
     - . ./test-lib.sh
     - 
     -+# --remerge-diff uses ort under the hood regardless of setting.  However,
     -+# we set up a file/directory conflict beforehand, and the different backends
     -+# handle the conflict differently, which would require separate code paths
     -+# to resolve.  There's not much point in making the code uglier to do that,
     -+# though, when the real thing we are testing (--remerge-diff) will hardcode
     -+# calls directly into the merge-ort API anyway.  So just force the use of
     -+# ort on the setup too.
     -+GIT_TEST_MERGE_ALGORITHM=ort
     -+
     - test_expect_success 'setup basic merges' '
     - 	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
     - 	git add numbers &&
      @@ t/t4069-remerge-diff.sh: test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
       	git log -1 --oneline ab_resolution >tmp &&
       	cat <<-EOF >>tmp &&
     @@ t/t4069-remerge-diff.sh: test_expect_success 'remerge-diff with both a resolved
      +	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
      +	test_cmp expect actual
      +'
     ++
     ++test_expect_success 'remerge-diff w/ diff-filter=U: all conflict headers, no diff content' '
     ++	git log -1 --oneline resolution >tmp &&
     ++	cat <<-EOF >>tmp &&
     ++	diff --git a/file_or_directory~HASH (side1) b/file_or_directory~HASH (side1)
     ++	remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
     ++	diff --git a/letters b/letters
     ++	remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
     ++	diff --git a/numbers b/numbers
     ++	remerge CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
     ++	EOF
     ++	# We still have some sha1 hashes above; rip them out so test works
     ++	# with sha256
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
     ++
     ++	git show --oneline --remerge-diff --diff-filter=U resolution >tmp &&
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
     ++	test_cmp expect actual
     ++'
     ++
     ++test_expect_success 'remerge-diff w/ diff-filter=R: relevant file + conflict header' '
     ++	git log -1 --oneline resolution >tmp &&
     ++	cat <<-EOF >>tmp &&
     ++	diff --git a/file_or_directory~HASH (side1) b/wanted_content
     ++	similarity index 100%
     ++	rename from file_or_directory~HASH (side1)
     ++	rename to wanted_content
     ++	remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
     ++	EOF
     ++	# We still have some sha1 hashes above; rip them out so test works
     ++	# with sha256
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
     ++
     ++	git show --oneline --remerge-diff --diff-filter=R resolution >tmp &&
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
     ++	test_cmp expect actual
     ++'
     ++
     ++test_expect_success 'remerge-diff w/ pathspec: limits to relevant file including conflict header' '
     ++	git log -1 --oneline resolution >tmp &&
     ++	cat <<-EOF >>tmp &&
     ++	diff --git a/letters b/letters
     ++	remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
     ++	diff --git a/letters_side2 b/letters_side2
     ++	deleted file mode 100644
     ++	index b236ae5..0000000
     ++	--- a/letters_side2
     ++	+++ /dev/null
     ++	@@ -1,9 +0,0 @@
     ++	-a
     ++	-b
     ++	-c
     ++	-d
     ++	-e
     ++	-f
     ++	-g
     ++	-h
     ++	-i
     ++	EOF
     ++	# We still have some sha1 hashes above; rip them out so test works
     ++	# with sha256
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
     ++
     ++	git show --oneline --remerge-diff --full-history resolution -- "letters*" >tmp &&
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
     ++	test_cmp expect actual
     ++'
      +
       test_done
  9:  64b44ee84f3 =  9:  b3e7656cfc6 merge-ort: mark conflict/warning messages from inner merges as omittable
  -:  ----------- > 10:  ea5df61cf35 diff-merges: avoid history simplifications when diffing merges

-- 
gitgitgadget


* [PATCH v4 01/10] show, log: provide a --remerge-diff capability
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
@ 2022-01-21 19:12       ` Elijah Newren via GitGitGadget
  2022-02-01  9:09         ` Ævar Arnfjörð Bjarmason
  2022-01-21 19:12       ` [PATCH v4 02/10] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
                         ` (9 subsequent siblings)
  10 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-01-21 19:12 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When this option is specified, we remerge all two-parent merge commits
and diff the automatically created version against the actual merge
commit, in order to show how users removed conflict markers, resolved
the different conflict versions, and potentially added new changes
outside of conflict regions in order to resolve semantic merge problems
(or, possibly, just to hide other random changes).
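
As a rough illustration of which commits qualify, the sketch below
classifies a commit by its number of parents.  The names parent_list,
merge_kind and classify_commit are hypothetical, simplified stand-ins
and not git's own API; the patch itself makes the equivalent check on
the commit's parent list in log_tree_diff().

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a commit's list of parents. */
struct parent_list {
	const char *name;
	struct parent_list *next;
};

enum merge_kind {
	NOT_A_MERGE,      /* zero or one parent */
	TWO_PARENT_MERGE, /* candidate for a remerge-diff */
	OCTOPUS_MERGE     /* three or more parents; skipped with a warning */
};

/* Classify a commit purely by how many parents it has. */
enum merge_kind classify_commit(const struct parent_list *parents)
{
	if (!parents || !parents->next)
		return NOT_A_MERGE;
	if (parents->next->next)
		return OCTOPUS_MERGE;
	return TWO_PARENT_MERGE;
}
```

Only the TWO_PARENT_MERGE case would be remerged; octopus merges are
reported and skipped.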

This capability works by creating a temporary object directory and
marking it as the primary object store.  This makes it so that any blobs
or trees created during the automatic merge are easily removable
afterwards by just deleting all objects from the temporary object
directory.

There are a few ways that this implementation is suboptimal:
  * `log --remerge-diff` becomes slow, because the temporary object
    directory can fill with many loose objects while running.
  * the log output can be muddied with misplaced "warning: cannot merge
    binary files" messages, since ll-merge.c unconditionally writes those
    messages to stderr while running instead of allowing callers to
    manage them.
  * important conflict and warning messages are simply dropped; thus for
    conflicts like modify/delete or rename/rename or file/directory which
    are not representable with content conflict markers, there may be no
    way for a user of --remerge-diff to know that there had been a
    conflict which was resolved (and which possibly motivated other
    changes in the merge commit).
  * when fixing the previous issue, note that some unimportant conflict
    and warning messages might start being included.  We should instead
    make sure these remain dropped.
Subsequent commits will address these issues.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/diff-options.txt | 10 +++-
 builtin/log.c                  | 14 ++++++
 diff-merges.c                  | 12 +++++
 log-tree.c                     | 59 ++++++++++++++++++++++
 revision.h                     |  3 +-
 t/t4069-remerge-diff.sh        | 90 ++++++++++++++++++++++++++++++++++
 6 files changed, 186 insertions(+), 2 deletions(-)
 create mode 100755 t/t4069-remerge-diff.sh

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index c89d530d3d1..6b8175defe6 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -34,7 +34,7 @@ endif::git-diff[]
 endif::git-format-patch[]
 
 ifdef::git-log[]
---diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc)::
+--diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc|remerge|r)::
 --no-diff-merges::
 	Specify diff format to be used for merge commits. Default is
 	{diff-merges-default} unless `--first-parent` is in use, in which case
@@ -64,6 +64,14 @@ ifdef::git-log[]
 	each of the parents. Separate log entry and diff is generated
 	for each parent.
 +
+--diff-merges=remerge:::
+--diff-merges=r:::
+--remerge-diff:::
+	With this option, two-parent merge commits are remerged to
+	create a temporary tree object -- potentially containing files
+	with conflict markers and such.  A diff is then shown between
+	that temporary tree and the actual merge commit.
++
 --diff-merges=combined:::
 --diff-merges=c:::
 -c:::
diff --git a/builtin/log.c b/builtin/log.c
index f75d87e8d7f..846ba0f995a 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -35,6 +35,7 @@
 #include "repository.h"
 #include "commit-reach.h"
 #include "range-diff.h"
+#include "tmp-objdir.h"
 
 #define MAIL_DEFAULT_WRAP 72
 #define COVER_FROM_AUTO_MAX_SUBJECT_LEN 100
@@ -406,6 +407,14 @@ static int cmd_log_walk(struct rev_info *rev)
 	struct commit *commit;
 	int saved_nrl = 0;
 	int saved_dcctc = 0;
+	struct tmp_objdir *remerge_objdir = NULL;
+
+	if (rev->remerge_diff) {
+		remerge_objdir = tmp_objdir_create("remerge-diff");
+		if (!remerge_objdir)
+			die(_("unable to create temporary object directory"));
+		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
+	}
 
 	if (rev->early_output)
 		setup_early_output();
@@ -449,6 +458,9 @@ static int cmd_log_walk(struct rev_info *rev)
 	rev->diffopt.no_free = 0;
 	diff_free(&rev->diffopt);
 
+	if (rev->remerge_diff)
+		tmp_objdir_destroy(remerge_objdir);
+
 	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
 	    rev->diffopt.flags.check_failed) {
 		return 02;
@@ -1943,6 +1955,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		die(_("--name-status does not make sense"));
 	if (rev.diffopt.output_format & DIFF_FORMAT_CHECKDIFF)
 		die(_("--check does not make sense"));
+	if (rev.remerge_diff)
+		die(_("--remerge-diff does not make sense"));
 
 	if (!use_patch_format &&
 		(!rev.diffopt.output_format ||
diff --git a/diff-merges.c b/diff-merges.c
index 5060ccd890b..0af4b3f9191 100644
--- a/diff-merges.c
+++ b/diff-merges.c
@@ -17,6 +17,7 @@ static void suppress(struct rev_info *revs)
 	revs->combined_all_paths = 0;
 	revs->merges_imply_patch = 0;
 	revs->merges_need_diff = 0;
+	revs->remerge_diff = 0;
 }
 
 static void set_separate(struct rev_info *revs)
@@ -45,6 +46,12 @@ static void set_dense_combined(struct rev_info *revs)
 	revs->dense_combined_merges = 1;
 }
 
+static void set_remerge_diff(struct rev_info *revs)
+{
+	suppress(revs);
+	revs->remerge_diff = 1;
+}
+
 static diff_merges_setup_func_t func_by_opt(const char *optarg)
 {
 	if (!strcmp(optarg, "off") || !strcmp(optarg, "none"))
@@ -57,6 +64,8 @@ static diff_merges_setup_func_t func_by_opt(const char *optarg)
 		return set_combined;
 	else if (!strcmp(optarg, "cc") || !strcmp(optarg, "dense-combined"))
 		return set_dense_combined;
+	else if (!strcmp(optarg, "r") || !strcmp(optarg, "remerge"))
+		return set_remerge_diff;
 	else if (!strcmp(optarg, "m") || !strcmp(optarg, "on"))
 		return set_to_default;
 	return NULL;
@@ -110,6 +119,9 @@ int diff_merges_parse_opts(struct rev_info *revs, const char **argv)
 	} else if (!strcmp(arg, "--cc")) {
 		set_dense_combined(revs);
 		revs->merges_imply_patch = 1;
+	} else if (!strcmp(arg, "--remerge-diff")) {
+		set_remerge_diff(revs);
+		revs->merges_imply_patch = 1;
 	} else if (!strcmp(arg, "--no-diff-merges")) {
 		suppress(revs);
 	} else if (!strcmp(arg, "--combined-all-paths")) {
diff --git a/log-tree.c b/log-tree.c
index 644893fd8cf..84ed864fc81 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "commit-reach.h"
 #include "config.h"
 #include "diff.h"
 #include "object-store.h"
@@ -7,6 +8,7 @@
 #include "tag.h"
 #include "graph.h"
 #include "log-tree.h"
+#include "merge-ort.h"
 #include "reflog-walk.h"
 #include "refs.h"
 #include "string-list.h"
@@ -902,6 +904,51 @@ static int do_diff_combined(struct rev_info *opt, struct commit *commit)
 	return !opt->loginfo;
 }
 
+static int do_remerge_diff(struct rev_info *opt,
+			   struct commit_list *parents,
+			   struct object_id *oid,
+			   struct commit *commit)
+{
+	struct merge_options o;
+	struct commit_list *bases;
+	struct merge_result res = {0};
+	struct pretty_print_context ctx = {0};
+	struct commit *parent1 = parents->item;
+	struct commit *parent2 = parents->next->item;
+	struct strbuf parent1_desc = STRBUF_INIT;
+	struct strbuf parent2_desc = STRBUF_INIT;
+
+	/* Setup merge options */
+	init_merge_options(&o, the_repository);
+	o.show_rename_progress = 0;
+
+	ctx.abbrev = DEFAULT_ABBREV;
+	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
+	format_commit_message(parent2, "%h (%s)", &parent2_desc, &ctx);
+	o.branch1 = parent1_desc.buf;
+	o.branch2 = parent2_desc.buf;
+
+	/* Parse the relevant commits and get the merge bases */
+	parse_commit_or_die(parent1);
+	parse_commit_or_die(parent2);
+	bases = get_merge_bases(parent1, parent2);
+
+	/* Re-merge the parents */
+	merge_incore_recursive(&o, bases, parent1, parent2, &res);
+
+	/* Show the diff */
+	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
+	log_tree_diff_flush(opt);
+
+	/* Cleanup */
+	strbuf_release(&parent1_desc);
+	strbuf_release(&parent2_desc);
+	merge_finalize(&o, &res);
+	/* TODO: clean up the temporary object directory */
+
+	return !opt->loginfo;
+}
+
 /*
  * Show the diff of a commit.
  *
@@ -936,6 +983,18 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log
 	}
 
 	if (is_merge) {
+		int octopus = (parents->next->next != NULL);
+
+		if (opt->remerge_diff) {
+			if (octopus) {
+				show_log(opt);
+				fprintf(opt->diffopt.file,
+					"diff: warning: Skipping remerge-diff "
+					"for octopus merges.\n");
+				return 1;
+			}
+			return do_remerge_diff(opt, parents, oid, commit);
+		}
 		if (opt->combine_merges)
 			return do_diff_combined(opt, commit);
 		if (opt->separate_merges) {
diff --git a/revision.h b/revision.h
index 5578bb4720a..13178e6b8f3 100644
--- a/revision.h
+++ b/revision.h
@@ -195,7 +195,8 @@ struct rev_info {
 			combine_merges:1,
 			combined_all_paths:1,
 			dense_combined_merges:1,
-			first_parent_merges:1;
+			first_parent_merges:1,
+			remerge_diff:1;
 
 	/* Format info */
 	int		show_notes;
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
new file mode 100755
index 00000000000..5ef191f4fc9
--- /dev/null
+++ b/t/t4069-remerge-diff.sh
@@ -0,0 +1,90 @@
+#!/bin/sh
+
+test_description='remerge-diff handling'
+
+. ./test-lib.sh
+
+# This test is ort-specific
+test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort || {
+	skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
+	test_done
+}
+
+test_expect_success 'setup basic merges' '
+	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git commit -m base &&
+
+	git branch feature_a &&
+	git branch feature_b &&
+	git branch feature_c &&
+
+	git branch ab_resolution &&
+	git branch bc_resolution &&
+
+	git checkout feature_a &&
+	test_write_lines 1 2 three 4 5 6 7 eight 9 >numbers &&
+	git commit -a -m change_a &&
+
+	git checkout feature_b &&
+	test_write_lines 1 2 tres 4 5 6 7 8 9 >numbers &&
+	git commit -a -m change_b &&
+
+	git checkout feature_c &&
+	test_write_lines 1 2 3 4 5 6 7 8 9 10 >numbers &&
+	git commit -a -m change_c &&
+
+	git checkout bc_resolution &&
+	git merge --ff-only feature_b &&
+	# no conflict
+	git merge feature_c &&
+
+	git checkout ab_resolution &&
+	git merge --ff-only feature_a &&
+	# conflicts!
+	test_must_fail git merge feature_b &&
+	# Resolve conflict...and make another change elsewhere
+	test_write_lines 1 2 drei 4 5 6 7 acht 9 >numbers &&
+	git add numbers &&
+	git merge --continue
+'
+
+test_expect_success 'remerge-diff on a clean merge' '
+	git log -1 --oneline bc_resolution >expect &&
+	git show --oneline --remerge-diff bc_resolution >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'remerge-diff with both a resolved conflict and an unrelated change' '
+	git log -1 --oneline ab_resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/numbers b/numbers
+	index a1fb731..6875544 100644
+	--- a/numbers
+	+++ b/numbers
+	@@ -1,13 +1,9 @@
+	 1
+	 2
+	-<<<<<<< b0ed5cb (change_a)
+	-three
+	-=======
+	-tres
+	->>>>>>> 6cd3f82 (change_b)
+	+drei
+	 4
+	 5
+	 6
+	 7
+	-eight
+	+acht
+	 9
+	EOF
+	# Hashes above are sha1; rip them out so test works with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff ab_resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
gitgitgadget



* [PATCH v4 02/10] log: clean unneeded objects during `log --remerge-diff`
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
  2022-01-21 19:12       ` [PATCH v4 01/10] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
@ 2022-01-21 19:12       ` Elijah Newren via GitGitGadget
  2022-02-01  9:35         ` Ævar Arnfjörð Bjarmason
  2022-01-21 19:12       ` [PATCH v4 03/10] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
                         ` (8 subsequent siblings)
  10 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-01-21 19:12 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

The --remerge-diff option will need to create new blobs and trees
representing the "automatic merge" state.  If one is traversing a
long project history, one can easily get hundreds of thousands of
loose objects generated during `log --remerge-diff`.  However, none of
those loose objects are needed after we have completed our diff
operation; they can be summarily deleted.

Add a new helper function to tmp_objdir to discard all the contained
objects, and call it after each merge is handled.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/log.c | 13 +++++++------
 log-tree.c    |  8 +++++++-
 revision.h    |  3 +++
 tmp-objdir.c  |  5 +++++
 tmp-objdir.h  |  6 ++++++
 5 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 846ba0f995a..ac550e1ae62 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -407,13 +407,12 @@ static int cmd_log_walk(struct rev_info *rev)
 	struct commit *commit;
 	int saved_nrl = 0;
 	int saved_dcctc = 0;
-	struct tmp_objdir *remerge_objdir = NULL;
 
 	if (rev->remerge_diff) {
-		remerge_objdir = tmp_objdir_create("remerge-diff");
-		if (!remerge_objdir)
+		rev->remerge_objdir = tmp_objdir_create("remerge-diff");
+		if (!rev->remerge_objdir)
 			die(_("unable to create temporary object directory"));
-		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
+		tmp_objdir_replace_primary_odb(rev->remerge_objdir, 1);
 	}
 
 	if (rev->early_output)
@@ -458,8 +457,10 @@ static int cmd_log_walk(struct rev_info *rev)
 	rev->diffopt.no_free = 0;
 	diff_free(&rev->diffopt);
 
-	if (rev->remerge_diff)
-		tmp_objdir_destroy(remerge_objdir);
+	if (rev->remerge_diff) {
+		tmp_objdir_destroy(rev->remerge_objdir);
+		rev->remerge_objdir = NULL;
+	}
 
 	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
 	    rev->diffopt.flags.check_failed) {
diff --git a/log-tree.c b/log-tree.c
index 84ed864fc81..d4655b63d75 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -4,6 +4,7 @@
 #include "diff.h"
 #include "object-store.h"
 #include "repository.h"
+#include "tmp-objdir.h"
 #include "commit.h"
 #include "tag.h"
 #include "graph.h"
@@ -944,7 +945,12 @@ static int do_remerge_diff(struct rev_info *opt,
 	strbuf_release(&parent1_desc);
 	strbuf_release(&parent2_desc);
 	merge_finalize(&o, &res);
-	/* TODO: clean up the temporary object directory */
+
+	/* Clean up the contents of the temporary object directory */
+	if (opt->remerge_objdir)
+		tmp_objdir_discard_objects(opt->remerge_objdir);
+	else
+		BUG("unable to remove temporary object directory");
 
 	return !opt->loginfo;
 }
diff --git a/revision.h b/revision.h
index 13178e6b8f3..44efce3f410 100644
--- a/revision.h
+++ b/revision.h
@@ -318,6 +318,9 @@ struct rev_info {
 
 	/* misc. flags related to '--no-kept-objects' */
 	unsigned keep_pack_cache_flags;
+
+	/* Location where temporary objects for remerge-diff are written. */
+	struct tmp_objdir *remerge_objdir;
 };
 
 int ref_excluded(struct string_list *, const char *path);
diff --git a/tmp-objdir.c b/tmp-objdir.c
index 3d38eeab66b..adf6033549e 100644
--- a/tmp-objdir.c
+++ b/tmp-objdir.c
@@ -79,6 +79,11 @@ static void remove_tmp_objdir_on_signal(int signo)
 	raise(signo);
 }
 
+void tmp_objdir_discard_objects(struct tmp_objdir *t)
+{
+	remove_dir_recursively(&t->path, REMOVE_DIR_KEEP_TOPLEVEL);
+}
+
 /*
  * These env_* functions are for setting up the child environment; the
  * "replace" variant overrides the value of any existing variable with that
diff --git a/tmp-objdir.h b/tmp-objdir.h
index cda5ec76778..76efc7edee5 100644
--- a/tmp-objdir.h
+++ b/tmp-objdir.h
@@ -46,6 +46,12 @@ int tmp_objdir_migrate(struct tmp_objdir *);
  */
 int tmp_objdir_destroy(struct tmp_objdir *);
 
+/*
+ * Remove all objects from the temporary object directory, while leaving it
+ * around so more objects can be added.
+ */
+void tmp_objdir_discard_objects(struct tmp_objdir *);
+
 /*
  * Add the temporary object directory as an alternate object store in the
  * current process.
-- 
gitgitgadget



* [PATCH v4 03/10] ll-merge: make callers responsible for showing warnings
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
  2022-01-21 19:12       ` [PATCH v4 01/10] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
  2022-01-21 19:12       ` [PATCH v4 02/10] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
@ 2022-01-21 19:12       ` Elijah Newren via GitGitGadget
  2022-01-21 19:12       ` [PATCH v4 04/10] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
                         ` (7 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-01-21 19:12 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Since some callers may want to send warning messages to somewhere other
than stdout/stderr, stop printing "warning: Cannot merge binary files"
from ll-merge and instead modify the return status of ll_merge() to
indicate when a merge of binary files has occurred.  Message printing
probably does not belong in a "low-level merge" anyway.

This commit continues printing the message as-is, just from the callers
instead of within ll_merge().  Future changes will start handling the
message differently in the merge-ort codepath.
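
The division of labor described above can be sketched as follows.  The
enum values mirror the ones this patch adds to ll-merge.h, but
toy_merge and merge_and_report are hypothetical helpers written for
illustration only (the NUL-byte check is a crude stand-in for real
binary detection), not the actual ll_merge() implementation.

```c
#include <stdio.h>
#include <string.h>

/* Same values as the enum ll_merge_result added by this patch. */
enum merge_result {
	MERGE_ERROR = -1,
	MERGE_OK = 0,
	MERGE_CONFLICT,
	MERGE_BINARY_CONFLICT
};

/* Hypothetical low-level merge: it classifies but never prints. */
enum merge_result toy_merge(const char *ours, size_t ours_len,
			    const char *theirs, size_t theirs_len)
{
	if (!ours || !theirs)
		return MERGE_ERROR;
	if (ours_len == theirs_len && !memcmp(ours, theirs, ours_len))
		return MERGE_OK;
	/* A NUL byte is used here as a crude "is binary" heuristic. */
	if (memchr(ours, '\0', ours_len) || memchr(theirs, '\0', theirs_len))
		return MERGE_BINARY_CONFLICT;
	return MERGE_CONFLICT;
}

/* The caller, not the merge routine, decides where the warning goes. */
int merge_and_report(FILE *out, const char *path,
		     const char *ours, size_t ours_len,
		     const char *theirs, size_t theirs_len)
{
	enum merge_result status = toy_merge(ours, ours_len,
					     theirs, theirs_len);

	if (status == MERGE_BINARY_CONFLICT)
		fprintf(out,
			"warning: Cannot merge binary files: %s (%s vs. %s)\n",
			path, "ours", "theirs");
	return status;
}
```

A caller that wants the message elsewhere (for example, in a diff
header) can check for MERGE_BINARY_CONFLICT itself instead of calling
a reporting wrapper.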

There was one special case here: the callers in rerere.c do NOT check
for and print such a message; since those code paths explicitly skip
over binary files, there is no reason to check for a return status of
LL_MERGE_BINARY_CONFLICT or print the related message.

Note that my methodology included first modifying ll_merge() to return
a struct, so that the compiler would catch all the callers for me and
ensure I had modified all of them.  After modifying all of them, I then
changed the struct to an enum.
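
A minimal sketch of that intermediate step, with hypothetical names
(step_result, step_status, merge_step) that do not appear in the
patch: while the function returns a struct, any caller that still
stores the result in a plain int fails to compile and is therefore
easy to find.

```c
enum step_result {
	STEP_ERROR = -1,
	STEP_OK = 0,
	STEP_CONFLICT
};

/* Temporary wrapper: exists only so stale callers stop compiling. */
struct step_status {
	enum step_result value;
};

/* Map a raw xdl_merge()-style status onto the wrapped enum. */
struct step_status merge_step(int raw_status)
{
	struct step_status s;

	if (raw_status < 0)
		s.value = STEP_ERROR;
	else if (raw_status > 0)
		s.value = STEP_CONFLICT;
	else
		s.value = STEP_OK;
	return s;
}

/*
 * A caller that was not updated, such as
 *
 *	int status = merge_step(1);
 *
 * is rejected by the compiler, because a struct cannot be assigned
 * to an int.  Once every caller reads `.value`, the wrapper can be
 * replaced by the bare enum.
 */
```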

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 apply.c            |  5 ++++-
 builtin/checkout.c | 12 ++++++++----
 ll-merge.c         | 40 ++++++++++++++++++++++------------------
 ll-merge.h         |  9 ++++++++-
 merge-blobs.c      |  5 ++++-
 merge-ort.c        |  5 ++++-
 merge-recursive.c  |  5 ++++-
 notes-merge.c      |  5 ++++-
 rerere.c           |  9 +++++----
 9 files changed, 63 insertions(+), 32 deletions(-)

diff --git a/apply.c b/apply.c
index 43a0aebf4ee..8079395755f 100644
--- a/apply.c
+++ b/apply.c
@@ -3492,7 +3492,7 @@ static int three_way_merge(struct apply_state *state,
 {
 	mmfile_t base_file, our_file, their_file;
 	mmbuffer_t result = { NULL };
-	int status;
+	enum ll_merge_result status;
 
 	/* resolve trivial cases first */
 	if (oideq(base, ours))
@@ -3509,6 +3509,9 @@ static int three_way_merge(struct apply_state *state,
 			  &their_file, "theirs",
 			  state->repo->index,
 			  NULL);
+	if (status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, "ours", "theirs");
 	free(base_file.ptr);
 	free(our_file.ptr);
 	free(their_file.ptr);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index cbf73b8c9f6..3a559d69303 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -237,6 +237,7 @@ static int checkout_merged(int pos, const struct checkout *state,
 	struct cache_entry *ce = active_cache[pos];
 	const char *path = ce->name;
 	mmfile_t ancestor, ours, theirs;
+	enum ll_merge_result merge_status;
 	int status;
 	struct object_id oid;
 	mmbuffer_t result_buf;
@@ -267,13 +268,16 @@ static int checkout_merged(int pos, const struct checkout *state,
 	memset(&ll_opts, 0, sizeof(ll_opts));
 	git_config_get_bool("merge.renormalize", &renormalize);
 	ll_opts.renormalize = renormalize;
-	status = ll_merge(&result_buf, path, &ancestor, "base",
-			  &ours, "ours", &theirs, "theirs",
-			  state->istate, &ll_opts);
+	merge_status = ll_merge(&result_buf, path, &ancestor, "base",
+				&ours, "ours", &theirs, "theirs",
+				state->istate, &ll_opts);
 	free(ancestor.ptr);
 	free(ours.ptr);
 	free(theirs.ptr);
-	if (status < 0 || !result_buf.ptr) {
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, "ours", "theirs");
+	if (merge_status < 0 || !result_buf.ptr) {
 		free(result_buf.ptr);
 		return error(_("path '%s': cannot merge"), path);
 	}
diff --git a/ll-merge.c b/ll-merge.c
index 261657578c7..a937cec59a6 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -14,7 +14,7 @@
 
 struct ll_merge_driver;
 
-typedef int (*ll_merge_fn)(const struct ll_merge_driver *,
+typedef enum ll_merge_result (*ll_merge_fn)(const struct ll_merge_driver *,
 			   mmbuffer_t *result,
 			   const char *path,
 			   mmfile_t *orig, const char *orig_name,
@@ -49,7 +49,7 @@ void reset_merge_attributes(void)
 /*
  * Built-in low-levels
  */
-static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_binary_merge(const struct ll_merge_driver *drv_unused,
 			   mmbuffer_t *result,
 			   const char *path,
 			   mmfile_t *orig, const char *orig_name,
@@ -58,6 +58,7 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 			   const struct ll_merge_options *opts,
 			   int marker_size)
 {
+	enum ll_merge_result ret;
 	mmfile_t *stolen;
 	assert(opts);
 
@@ -68,16 +69,19 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 	 */
 	if (opts->virtual_ancestor) {
 		stolen = orig;
+		ret = LL_MERGE_OK;
 	} else {
 		switch (opts->variant) {
 		default:
-			warning("Cannot merge binary files: %s (%s vs. %s)",
-				path, name1, name2);
-			/* fallthru */
+			ret = LL_MERGE_BINARY_CONFLICT;
+			stolen = src1;
+			break;
 		case XDL_MERGE_FAVOR_OURS:
+			ret = LL_MERGE_OK;
 			stolen = src1;
 			break;
 		case XDL_MERGE_FAVOR_THEIRS:
+			ret = LL_MERGE_OK;
 			stolen = src2;
 			break;
 		}
@@ -87,16 +91,10 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 	result->size = stolen->size;
 	stolen->ptr = NULL;
 
-	/*
-	 * With -Xtheirs or -Xours, we have cleanly merged;
-	 * otherwise we got a conflict.
-	 */
-	return opts->variant == XDL_MERGE_FAVOR_OURS ||
-	       opts->variant == XDL_MERGE_FAVOR_THEIRS ?
-	       0 : 1;
+	return ret;
 }
 
-static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 			mmbuffer_t *result,
 			const char *path,
 			mmfile_t *orig, const char *orig_name,
@@ -105,7 +103,9 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 			const struct ll_merge_options *opts,
 			int marker_size)
 {
+	enum ll_merge_result ret;
 	xmparam_t xmp;
+	int status;
 	assert(opts);
 
 	if (orig->size > MAX_XDIFF_SIZE ||
@@ -133,10 +133,12 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 	xmp.ancestor = orig_name;
 	xmp.file1 = name1;
 	xmp.file2 = name2;
-	return xdl_merge(orig, src1, src2, &xmp, result);
+	status = xdl_merge(orig, src1, src2, &xmp, result);
+	ret = (status > 0) ? LL_MERGE_CONFLICT : status;
+	return ret;
 }
 
-static int ll_union_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_union_merge(const struct ll_merge_driver *drv_unused,
 			  mmbuffer_t *result,
 			  const char *path,
 			  mmfile_t *orig, const char *orig_name,
@@ -178,7 +180,7 @@ static void create_temp(mmfile_t *src, char *path, size_t len)
 /*
  * User defined low-level merge driver support.
  */
-static int ll_ext_merge(const struct ll_merge_driver *fn,
+static enum ll_merge_result ll_ext_merge(const struct ll_merge_driver *fn,
 			mmbuffer_t *result,
 			const char *path,
 			mmfile_t *orig, const char *orig_name,
@@ -194,6 +196,7 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 	const char *args[] = { NULL, NULL };
 	int status, fd, i;
 	struct stat st;
+	enum ll_merge_result ret;
 	assert(opts);
 
 	sq_quote_buf(&path_sq, path);
@@ -236,7 +239,8 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 		unlink_or_warn(temp[i]);
 	strbuf_release(&cmd);
 	strbuf_release(&path_sq);
-	return status;
+	ret = (status > 0) ? LL_MERGE_CONFLICT : status;
+	return ret;
 }
 
 /*
@@ -362,7 +366,7 @@ static void normalize_file(mmfile_t *mm, const char *path, struct index_state *i
 	}
 }
 
-int ll_merge(mmbuffer_t *result_buf,
+enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
 	     const char *path,
 	     mmfile_t *ancestor, const char *ancestor_label,
 	     mmfile_t *ours, const char *our_label,
diff --git a/ll-merge.h b/ll-merge.h
index aceb1b24132..e4a20e81a3a 100644
--- a/ll-merge.h
+++ b/ll-merge.h
@@ -82,13 +82,20 @@ struct ll_merge_options {
 	long xdl_opts;
 };
 
+enum ll_merge_result {
+	LL_MERGE_ERROR = -1,
+	LL_MERGE_OK = 0,
+	LL_MERGE_CONFLICT,
+	LL_MERGE_BINARY_CONFLICT,
+};
+
 /**
  * Perform a three-way single-file merge in core.  This is a thin wrapper
  * around `xdl_merge` that takes the path and any merge backend specified in
  * `.gitattributes` or `.git/info/attributes` into account.
  * Returns 0 for a clean merge.
  */
-int ll_merge(mmbuffer_t *result_buf,
+enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
 	     const char *path,
 	     mmfile_t *ancestor, const char *ancestor_label,
 	     mmfile_t *ours, const char *our_label,
diff --git a/merge-blobs.c b/merge-blobs.c
index ee0a0e90c94..8138090f81c 100644
--- a/merge-blobs.c
+++ b/merge-blobs.c
@@ -36,7 +36,7 @@ static void *three_way_filemerge(struct index_state *istate,
 				 mmfile_t *their,
 				 unsigned long *size)
 {
-	int merge_status;
+	enum ll_merge_result merge_status;
 	mmbuffer_t res;
 
 	/*
@@ -50,6 +50,9 @@ static void *three_way_filemerge(struct index_state *istate,
 				istate, NULL);
 	if (merge_status < 0)
 		return NULL;
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, ".our", ".their");
 
 	*size = res.size;
 	return res.ptr;
diff --git a/merge-ort.c b/merge-ort.c
index 0342f104836..c24da2ba3cb 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1743,7 +1743,7 @@ static int merge_3way(struct merge_options *opt,
 	mmfile_t orig, src1, src2;
 	struct ll_merge_options ll_opts = {0};
 	char *base, *name1, *name2;
-	int merge_status;
+	enum ll_merge_result merge_status;
 
 	if (!opt->priv->attr_index.initialized)
 		initialize_attr_index(opt);
@@ -1787,6 +1787,9 @@ static int merge_3way(struct merge_options *opt,
 	merge_status = ll_merge(result_buf, path, &orig, base,
 				&src1, name1, &src2, name2,
 				&opt->priv->attr_index, &ll_opts);
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/merge-recursive.c b/merge-recursive.c
index d9457797dbb..bc73c52dd84 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1044,7 +1044,7 @@ static int merge_3way(struct merge_options *opt,
 	mmfile_t orig, src1, src2;
 	struct ll_merge_options ll_opts = {0};
 	char *base, *name1, *name2;
-	int merge_status;
+	enum ll_merge_result merge_status;
 
 	ll_opts.renormalize = opt->renormalize;
 	ll_opts.extra_marker_size = extra_marker_size;
@@ -1090,6 +1090,9 @@ static int merge_3way(struct merge_options *opt,
 	merge_status = ll_merge(result_buf, a->path, &orig, base,
 				&src1, name1, &src2, name2,
 				opt->repo->index, &ll_opts);
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			a->path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/notes-merge.c b/notes-merge.c
index b4a3a903e86..01d596920ea 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -344,7 +344,7 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
 {
 	mmbuffer_t result_buf;
 	mmfile_t base, local, remote;
-	int status;
+	enum ll_merge_result status;
 
 	read_mmblob(&base, &p->base);
 	read_mmblob(&local, &p->local);
@@ -358,6 +358,9 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
 	free(local.ptr);
 	free(remote.ptr);
 
+	if (status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			oid_to_hex(&p->obj), o->local_ref, o->remote_ref);
 	if ((status < 0) || !result_buf.ptr)
 		die("Failed to execute internal merge");
 
diff --git a/rerere.c b/rerere.c
index d83d58df4fb..d26627c5932 100644
--- a/rerere.c
+++ b/rerere.c
@@ -609,19 +609,20 @@ static int try_merge(struct index_state *istate,
 		     const struct rerere_id *id, const char *path,
 		     mmfile_t *cur, mmbuffer_t *result)
 {
-	int ret;
+	enum ll_merge_result ret;
 	mmfile_t base = {NULL, 0}, other = {NULL, 0};
 
 	if (read_mmfile(&base, rerere_path(id, "preimage")) ||
-	    read_mmfile(&other, rerere_path(id, "postimage")))
-		ret = 1;
-	else
+	    read_mmfile(&other, rerere_path(id, "postimage"))) {
+		ret = LL_MERGE_CONFLICT;
+	} else {
 		/*
 		 * A three-way merge. Note that this honors user-customizable
 		 * low-level merge driver settings.
 		 */
 		ret = ll_merge(result, path, &base, NULL, cur, "", &other, "",
 			       istate, NULL);
+	}
 
 	free(base.ptr);
 	free(other.ptr);
-- 
gitgitgadget



* [PATCH v4 04/10] merge-ort: capture and print ll-merge warnings in our preferred fashion
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
                         ` (2 preceding siblings ...)
  2022-01-21 19:12       ` [PATCH v4 03/10] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
@ 2022-01-21 19:12       ` Elijah Newren via GitGitGadget
  2022-01-21 19:12       ` [PATCH v4 05/10] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
                         ` (6 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-01-21 19:12 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Instead of immediately printing ll-merge warnings to stderr, we save
them in our output strbuf.  Besides allowing us to move these warnings
to a special file for --remerge-diff, this has two other benefits for
regular merges done by merge-ort:

  * The deferral of messages ensures we can print all messages about
    any given path together (merge-recursive was known to sometimes
    intersperse messages about other paths, particularly when renames
    were involved).

  * The deferral of messages means we can avoid printing spurious
    conflict messages when we just end up aborting due to local user
    modifications in the way.  (In contrast to merge-recursive.c which
    prematurely checks for local modifications in the way via
    unpack_trees() and gets the check wrong both in terms of false
    positives and false negatives relative to renames, merge-ort does
    not perform the local modifications in the way check until the
    checkout() step after the full merge has been computed.)
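
The deferral described above can be sketched roughly as follows.  This
is a standalone illustration, not the merge-ort code: struct msg_store,
store_msg() and flush_msgs() are hypothetical names, and a small
fixed-size table stands in for the strmap of strbufs that merge-ort
keeps.  Paths are flushed in first-seen order here, which is a
simplification.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PATHS 16
#define MAX_MSG   512

/* Hypothetical stand-in for merge-ort's path -> strbuf map. */
struct path_log {
	char path[128];
	char msgs[MAX_MSG];
};

struct msg_store {
	struct path_log logs[MAX_PATHS];
	int nr;
};

/* Append one message line to the buffer kept for path; 0 on success. */
int store_msg(struct msg_store *s, const char *path, const char *msg)
{
	struct path_log *log = NULL;
	size_t used;
	int i;

	for (i = 0; i < s->nr; i++) {
		if (!strcmp(s->logs[i].path, path)) {
			log = &s->logs[i];
			break;
		}
	}
	if (!log) {
		if (s->nr == MAX_PATHS ||
		    strlen(path) >= sizeof(s->logs[0].path))
			return -1;
		log = &s->logs[s->nr++];
		strcpy(log->path, path);
		log->msgs[0] = '\0';
	}
	used = strlen(log->msgs);
	if (used + strlen(msg) + 2 > sizeof(log->msgs))
		return -1;
	snprintf(log->msgs + used, sizeof(log->msgs) - used, "%s\n", msg);
	return 0;
}

/* Copy every stored message into out, one path at a time. */
void flush_msgs(const struct msg_store *s, char *out, size_t outsz)
{
	size_t pos = 0;
	int i;

	if (!outsz)
		return;
	out[0] = '\0';
	for (i = 0; i < s->nr; i++) {
		int n = snprintf(out + pos, outsz - pos, "%s",
				 s->logs[i].msgs);
		if (n < 0 || (size_t)n >= outsz - pos)
			return; /* out of room; stop here */
		pos += (size_t)n;
	}
}
```

Because nothing is printed until flush_msgs() runs, a caller that ends
up aborting can simply never flush, and messages recorded for one path
at different times still come out next to each other.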

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c                | 5 +++--
 t/t6404-recursive-merge.sh | 9 +++++++--
 t/t6406-merge-attr.sh      | 9 +++++++--
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index c24da2ba3cb..a18f47e23c5 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1788,8 +1788,9 @@ static int merge_3way(struct merge_options *opt,
 				&src1, name1, &src2, name2,
 				&opt->priv->attr_index, &ll_opts);
 	if (merge_status == LL_MERGE_BINARY_CONFLICT)
-		warning("Cannot merge binary files: %s (%s vs. %s)",
-			path, name1, name2);
+		path_msg(opt, path, 0,
+			 "warning: Cannot merge binary files: %s (%s vs. %s)",
+			 path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/t/t6404-recursive-merge.sh b/t/t6404-recursive-merge.sh
index eaf48e941e2..b8735c6db4d 100755
--- a/t/t6404-recursive-merge.sh
+++ b/t/t6404-recursive-merge.sh
@@ -108,8 +108,13 @@ test_expect_success 'refuse to merge binary files' '
 	printf "\0\0" >binary-file &&
 	git add binary-file &&
 	git commit -m binary2 &&
-	test_must_fail git merge F >merge.out 2>merge.err &&
-	grep "Cannot merge binary files: binary-file (HEAD vs. F)" merge.err
+	if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+	then
+		test_must_fail git merge F >merge_output
+	else
+		test_must_fail git merge F 2>merge_output
+	fi &&
+	grep "Cannot merge binary files: binary-file (HEAD vs. F)" merge_output
 '
 
 test_expect_success 'mark rename/delete as unmerged' '
diff --git a/t/t6406-merge-attr.sh b/t/t6406-merge-attr.sh
index 84946458371..c41584eb33e 100755
--- a/t/t6406-merge-attr.sh
+++ b/t/t6406-merge-attr.sh
@@ -221,8 +221,13 @@ test_expect_success 'binary files with union attribute' '
 	printf "two\0" >bin.txt &&
 	git commit -am two &&
 
-	test_must_fail git merge bin-main 2>stderr &&
-	grep -i "warning.*cannot merge.*HEAD vs. bin-main" stderr
+	if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+	then
+		test_must_fail git merge bin-main >output
+	else
+		test_must_fail git merge bin-main 2>output
+	fi &&
+	grep -i "warning.*cannot merge.*HEAD vs. bin-main" output
 '
 
 test_done
-- 
gitgitgadget



* [PATCH v4 05/10] merge-ort: mark a few more conflict messages as omittable
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
                         ` (3 preceding siblings ...)
  2022-01-21 19:12       ` [PATCH v4 04/10] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
@ 2022-01-21 19:12       ` Elijah Newren via GitGitGadget
  2022-01-21 19:12       ` [PATCH v4 06/10] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
                         ` (5 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-01-21 19:12 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

path_msg() has the ability to mark messages as omittable, designed for
remerge-diff where we'll instead be showing conflict messages as diff
headers for a subsequent diff.  While all these messages are very useful
when trying to create a merge initially, early use with the
--remerge-diff feature (the only user of this omittable conflict message
capability) suggests that the particular messages marked in this commit
are just noise when trying to see what changes users made to create a
merge commit.  Mark them as omittable.

Note that there were already a few messages marked as omittable in
merge-ort when doing a remerge-diff, because the development of
--remerge-diff preceded the upstreaming of merge-ort and I was trying to
ensure merge-ort could handle all the necessary requirements.  See
commit c5a6f65527 ("merge-ort: add modify/delete handling and delayed
output processing", 2020-12-03) for the initial details.  For some
examples of already-marked-as-omittable messages, see either
"Auto-merging <path>" or some of the submodule update hints.  This
commit just adds two more messages that should also be omittable.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index a18f47e23c5..998e92ec593 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -2420,7 +2420,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
 		 */
 		ci->path_conflict = 1;
 		if (pair->status == 'A')
-			path_msg(opt, new_path, 0,
+			path_msg(opt, new_path, 1,
 				 _("CONFLICT (file location): %s added in %s "
 				   "inside a directory that was renamed in %s, "
 				   "suggesting it should perhaps be moved to "
@@ -2428,7 +2428,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
 				 old_path, branch_with_new_path,
 				 branch_with_dir_rename, new_path);
 		else
-			path_msg(opt, new_path, 0,
+			path_msg(opt, new_path, 1,
 				 _("CONFLICT (file location): %s renamed to %s "
 				   "in %s, inside a directory that was renamed "
 				   "in %s, suggesting it should perhaps be "
-- 
gitgitgadget



* [PATCH v4 06/10] merge-ort: format messages slightly different for use in headers
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
                         ` (4 preceding siblings ...)
  2022-01-21 19:12       ` [PATCH v4 05/10] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
@ 2022-01-21 19:12       ` Elijah Newren via GitGitGadget
  2022-01-21 19:12       ` [PATCH v4 07/10] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
                         ` (4 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-01-21 19:12 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When users run
    git show --remerge-diff $MERGE_COMMIT
or
    git log -p --remerge-diff ...
stdout is not an appropriate location to dump conflict messages, but we
do want to provide them to users.  We will include them in the diff
headers instead, but for that to work, every newline in a multiline
message must be followed by a space.  Add a new flag to signal when we
want messages modified in this fashion, and use it in path_msg() to
apply the change.  Also, allow a special prefix to be specified for
these headers.
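
As a rough illustration of the transformation described above, here is
a standalone sketch.  format_msg_as_header() is a hypothetical helper
that uses malloc() rather than git's strbuf API; it is not the code
added by this patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): return a newly allocated copy
 * of msg in which every newline is followed by a space, preceded by
 * "prefix " when prefix is non-NULL.  The caller frees the result.
 */
char *format_msg_as_header(const char *prefix, const char *msg)
{
	size_t prefix_len = prefix ? strlen(prefix) + 1 : 0; /* +1: ' ' */
	size_t msg_len = strlen(msg);
	/* Worst case: every character of msg is a newline and doubles. */
	char *out = malloc(prefix_len + 2 * msg_len + 1);
	size_t i, pos = 0;

	if (!out)
		return NULL;
	if (prefix) {
		memcpy(out, prefix, prefix_len - 1);
		out[prefix_len - 1] = ' ';
		pos = prefix_len;
	}
	for (i = 0; i < msg_len; i++) {
		out[pos++] = msg[i];
		if (msg[i] == '\n')
			out[pos++] = ' ';
	}
	out[pos] = '\0';
	return out;
}
```

With a prefix of "remerge", the message "CONFLICT (x): a\nb" becomes
"remerge CONFLICT (x): a\n b", so each continuation line starts with a
space when shown among the diff headers.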

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c       | 42 ++++++++++++++++++++++++++++++++++++++++--
 merge-recursive.c |  4 ++++
 merge-recursive.h |  2 ++
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 998e92ec593..481305d2bcf 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -634,17 +634,49 @@ static void path_msg(struct merge_options *opt,
 		     const char *fmt, ...)
 {
 	va_list ap;
-	struct strbuf *sb = strmap_get(&opt->priv->output, path);
+	struct strbuf *sb, *dest;
+	struct strbuf tmp = STRBUF_INIT;
+
+	if (opt->record_conflict_msgs_as_headers && omittable_hint)
+		return; /* Do not record mere hints in tree */
+	sb = strmap_get(&opt->priv->output, path);
 	if (!sb) {
 		sb = xmalloc(sizeof(*sb));
 		strbuf_init(sb, 0);
 		strmap_put(&opt->priv->output, path, sb);
 	}
 
+	dest = (opt->record_conflict_msgs_as_headers ? &tmp : sb);
+
 	va_start(ap, fmt);
-	strbuf_vaddf(sb, fmt, ap);
+	strbuf_vaddf(dest, fmt, ap);
 	va_end(ap);
 
+	if (opt->record_conflict_msgs_as_headers) {
+		int i_sb = 0, i_tmp = 0;
+
+		/* Start with the specified prefix */
+		if (opt->msg_header_prefix)
+			strbuf_addf(sb, "%s ", opt->msg_header_prefix);
+
+		/* Copy tmp to sb, adding spaces after newlines */
+		strbuf_grow(sb, sb->len + 2*tmp.len); /* more than sufficient */
+		for (; i_tmp < tmp.len; i_tmp++, i_sb++) {
+			/* Copy next character from tmp to sb */
+			sb->buf[sb->len + i_sb] = tmp.buf[i_tmp];
+
+			/* If we copied a newline, add a space */
+			if (tmp.buf[i_tmp] == '\n')
+				sb->buf[sb->len + ++i_sb] = ' ';
+		}
+		/* Update length and ensure it's NUL-terminated */
+		sb->len += i_sb;
+		sb->buf[sb->len] = '\0';
+
+		strbuf_release(&tmp);
+	}
+
+	/* Add final newline character to sb */
 	strbuf_addch(sb, '\n');
 }
 
@@ -4246,6 +4278,9 @@ void merge_switch_to_result(struct merge_options *opt,
 		struct string_list olist = STRING_LIST_INIT_NODUP;
 		int i;
 
+		if (opt->record_conflict_msgs_as_headers)
+			BUG("Either display conflict messages or record them as headers, not both");
+
 		trace2_region_enter("merge", "display messages", opt->repo);
 
 		/* Hack to pre-allocate olist to the desired size */
@@ -4347,6 +4382,9 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
 	assert(opt->recursive_variant >= MERGE_VARIANT_NORMAL &&
 	       opt->recursive_variant <= MERGE_VARIANT_THEIRS);
 
+	if (opt->msg_header_prefix)
+		assert(opt->record_conflict_msgs_as_headers);
+
 	/*
 	 * detect_renames, verbosity, buffer_output, and obuf are ignored
 	 * fields that were used by "recursive" rather than "ort" -- but
diff --git a/merge-recursive.c b/merge-recursive.c
index bc73c52dd84..9ec1e6d043a 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3714,6 +3714,10 @@ static int merge_start(struct merge_options *opt, struct tree *head)
 
 	assert(opt->priv == NULL);
 
+	/* Not supported; option specific to merge-ort */
+	assert(!opt->record_conflict_msgs_as_headers);
+	assert(!opt->msg_header_prefix);
+
 	/* Sanity check on repo state; index must match head */
 	if (repo_index_has_changes(opt->repo, head, &sb)) {
 		err(opt, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
diff --git a/merge-recursive.h b/merge-recursive.h
index 0795a1d3ec1..b88000e3c25 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -46,6 +46,8 @@ struct merge_options {
 	/* miscellaneous control options */
 	const char *subtree_shift;
 	unsigned renormalize : 1;
+	unsigned record_conflict_msgs_as_headers : 1;
+	const char *msg_header_prefix;
 
 	/* internal fields used by the implementation */
 	struct merge_options_internal *priv;
-- 
gitgitgadget



* [PATCH v4 07/10] diff: add ability to insert additional headers for paths
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
                         ` (5 preceding siblings ...)
  2022-01-21 19:12       ` [PATCH v4 06/10] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
@ 2022-01-21 19:12       ` Elijah Newren via GitGitGadget
  2022-01-21 19:12       ` [PATCH v4 08/10] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
                         ` (3 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-01-21 19:12 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When additional headers are provided, we need to
  * add diff_filepairs to diff_queued_diff for each path in the
    additional headers map, unless that path is part of another
    diff_filepair already found in diff_queued_diff
  * format the headers (colorization, line_prefix for --graph)
  * make sure the various codepaths that attempt to return early
    if there are "no changes" take into account the headers that
    need to be shown.
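
The header formatting step in the list above can be sketched as
follows.  This is a simplified standalone version: format_headers() is
a hypothetical name, it writes into a caller-supplied buffer instead of
a strbuf, and it uses strchr() where the patch uses strchrnul().  In
git, meta and reset would be color escape sequences and line_prefix the
--graph prefix; any strings work here.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: wrap each line of headers as
 * <line_prefix><meta><line><reset>\n and write the result to out.
 * Returns 0 on success, -1 if out is too small.
 */
int format_headers(char *out, size_t outsz, const char *headers,
		   const char *line_prefix, const char *meta,
		   const char *reset)
{
	const char *next = headers;
	size_t pos = 0;

	if (!outsz)
		return -1;
	out[0] = '\0';
	while (*next) {
		const char *newline = strchr(next, '\n');
		size_t len = newline ? (size_t)(newline - next)
				     : strlen(next);
		int n = snprintf(out + pos, outsz - pos, "%s%s%.*s%s\n",
				 line_prefix, meta, (int)len, next, reset);

		if (n < 0 || (size_t)n >= outsz - pos)
			return -1;
		pos += (size_t)n;
		next += len;
		if (*next == '\n')
			next++; /* skip the newline we just consumed */
	}
	return 0;
}
```

Each header line gets its own prefix and its own color reset, so a
multiline conflict message still lines up under --graph.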

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 diff.c     | 124 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 diff.h     |   3 +-
 log-tree.c |   2 +-
 3 files changed, 123 insertions(+), 6 deletions(-)

diff --git a/diff.c b/diff.c
index 861282db1c3..1bfb01c18ec 100644
--- a/diff.c
+++ b/diff.c
@@ -27,6 +27,7 @@
 #include "help.h"
 #include "promisor-remote.h"
 #include "dir.h"
+#include "strmap.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
@@ -3406,6 +3407,31 @@ struct userdiff_driver *get_textconv(struct repository *r,
 	return userdiff_get_textconv(r, one->driver);
 }
 
+static struct strbuf *additional_headers(struct diff_options *o,
+					 const char *path)
+{
+	if (!o->additional_path_headers)
+		return NULL;
+	return strmap_get(o->additional_path_headers, path);
+}
+
+static void add_formatted_headers(struct strbuf *msg,
+				  struct strbuf *more_headers,
+				  const char *line_prefix,
+				  const char *meta,
+				  const char *reset)
+{
+	char *next, *newline;
+
+	for (next = more_headers->buf; *next; next = newline) {
+		newline = strchrnul(next, '\n');
+		strbuf_addf(msg, "%s%s%.*s%s\n", line_prefix, meta,
+			    (int)(newline - next), next, reset);
+		if (*newline)
+			newline++;
+	}
+}
+
 static void builtin_diff(const char *name_a,
 			 const char *name_b,
 			 struct diff_filespec *one,
@@ -3464,6 +3490,17 @@ static void builtin_diff(const char *name_a,
 	b_two = quote_two(b_prefix, name_b + (*name_b == '/'));
 	lbl[0] = DIFF_FILE_VALID(one) ? a_one : "/dev/null";
 	lbl[1] = DIFF_FILE_VALID(two) ? b_two : "/dev/null";
+	if (!DIFF_FILE_VALID(one) && !DIFF_FILE_VALID(two)) {
+		/*
+		 * We should only reach this point for pairs from
+		 * create_filepairs_for_header_only_notifications().  For
+		 * these, we should avoid the "/dev/null" special casing
+		 * above, meaning we avoid showing such pairs as either
+		 * "new file" or "deleted file" below.
+		 */
+		lbl[0] = a_one;
+		lbl[1] = b_two;
+	}
 	strbuf_addf(&header, "%s%sdiff --git %s %s%s\n", line_prefix, meta, a_one, b_two, reset);
 	if (lbl[0][0] == '/') {
 		/* /dev/null */
@@ -4328,6 +4365,7 @@ static void fill_metainfo(struct strbuf *msg,
 	const char *set = diff_get_color(use_color, DIFF_METAINFO);
 	const char *reset = diff_get_color(use_color, DIFF_RESET);
 	const char *line_prefix = diff_line_prefix(o);
+	struct strbuf *more_headers = NULL;
 
 	*must_show_header = 1;
 	strbuf_init(msg, PATH_MAX * 2 + 300);
@@ -4364,6 +4402,11 @@ static void fill_metainfo(struct strbuf *msg,
 	default:
 		*must_show_header = 0;
 	}
+	if ((more_headers = additional_headers(o, name))) {
+		add_formatted_headers(msg, more_headers,
+				      line_prefix, set, reset);
+		*must_show_header = 1;
+	}
 	if (one && two && !oideq(&one->oid, &two->oid)) {
 		const unsigned hexsz = the_hash_algo->hexsz;
 		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
@@ -5852,12 +5895,27 @@ int diff_unmodified_pair(struct diff_filepair *p)
 
 static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o)
 {
-	if (diff_unmodified_pair(p))
+	int include_conflict_headers =
+	    (additional_headers(o, p->one->path) &&
+	     (!o->filter || filter_bit_tst(DIFF_STATUS_UNMERGED, o)));
+
+	/*
+	 * Check if we can return early without showing a diff.  Note that
+	 * diff_filepair only stores {oid, path, mode, is_valid}
+	 * information for each path, and thus diff_unmodified_pair() only
+	 * considers those bits of info.  However, we do not want pairs
+	 * created by create_filepairs_for_header_only_notifications()
+	 * (which always look like unmodified pairs) to be ignored, so
+	 * return early if both p is unmodified AND we don't want to
+	 * include_conflict_headers.
+	 */
+	if (diff_unmodified_pair(p) && !include_conflict_headers)
 		return;
 
+	/* Actually, we can also return early to avoid showing tree diffs */
 	if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) ||
 	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
-		return; /* no tree diffs in patch format */
+		return;
 
 	run_diff(p, o);
 }
@@ -5888,10 +5946,17 @@ static void diff_flush_checkdiff(struct diff_filepair *p,
 	run_checkdiff(p, o);
 }
 
-int diff_queue_is_empty(void)
+int diff_queue_is_empty(struct diff_options *o)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	int i;
+	int include_conflict_headers =
+	    (o->additional_path_headers &&
+	     (!o->filter || filter_bit_tst(DIFF_STATUS_UNMERGED, o)));
+
+	if (include_conflict_headers)
+		return 0;
+
 	for (i = 0; i < q->nr; i++)
 		if (!diff_unmodified_pair(q->queue[i]))
 			return 0;
@@ -6325,6 +6390,54 @@ void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc)
 		warning(_(rename_limit_advice), varname, needed);
 }
 
+static void create_filepairs_for_header_only_notifications(struct diff_options *o)
+{
+	struct strset present;
+	struct diff_queue_struct *q = &diff_queued_diff;
+	struct hashmap_iter iter;
+	struct strmap_entry *e;
+	int i;
+
+	strset_init_with_options(&present, /*pool*/ NULL, /*strdup*/ 0);
+
+	/*
+	 * Find out which paths exist in diff_queued_diff, preferring
+	 * one->path for any pair that has multiple paths.
+	 */
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p = q->queue[i];
+		char *path = p->one->path ? p->one->path : p->two->path;
+
+		if (strmap_contains(o->additional_path_headers, path))
+			strset_add(&present, path);
+	}
+
+	/*
+	 * Loop over paths in additional_path_headers; for each NOT already
+	 * in diff_queued_diff, create a synthetic filepair and insert that
+	 * into diff_queued_diff.
+	 */
+	strmap_for_each_entry(o->additional_path_headers, &iter, e) {
+		if (!strset_contains(&present, e->key)) {
+			struct diff_filespec *one, *two;
+			struct diff_filepair *p;
+
+			one = alloc_filespec(e->key);
+			two = alloc_filespec(e->key);
+			fill_filespec(one, null_oid(), 0, 0);
+			fill_filespec(two, null_oid(), 0, 0);
+			p = diff_queue(q, one, two);
+			p->status = DIFF_STATUS_MODIFIED;
+		}
+	}
+
+	/* Re-sort the filepairs */
+	diffcore_fix_diff_index();
+
+	/* Cleanup */
+	strset_clear(&present);
+}
+
 static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 {
 	int i;
@@ -6337,6 +6450,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	if (o->color_moved)
 		o->emitted_symbols = &esm;
 
+	if (o->additional_path_headers)
+		create_filepairs_for_header_only_notifications(o);
+
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 		if (check_pair_status(p))
@@ -6413,7 +6529,7 @@ void diff_flush(struct diff_options *options)
 	 * Order: raw, stat, summary, patch
 	 * or:    name/name-status/checkdiff (other bits clear)
 	 */
-	if (!q->nr)
+	if (!q->nr && !options->additional_path_headers)
 		goto free_queue;
 
 	if (output_format & (DIFF_FORMAT_RAW |
diff --git a/diff.h b/diff.h
index 8ba85c5e605..ce9e2cf2e4f 100644
--- a/diff.h
+++ b/diff.h
@@ -395,6 +395,7 @@ struct diff_options {
 
 	struct repository *repo;
 	struct option *parseopts;
+	struct strmap *additional_path_headers;
 
 	int no_free;
 };
@@ -593,7 +594,7 @@ void diffcore_fix_diff_index(void);
 "                show all files diff when -S is used and hit is found.\n" \
 "  -a  --text    treat all files as text.\n"
 
-int diff_queue_is_empty(void);
+int diff_queue_is_empty(struct diff_options *o);
 void diff_flush(struct diff_options*);
 void diff_free(struct diff_options*);
 void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc);
diff --git a/log-tree.c b/log-tree.c
index d4655b63d75..33c28f537a6 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -850,7 +850,7 @@ int log_tree_diff_flush(struct rev_info *opt)
 	opt->shown_dashes = 0;
 	diffcore_std(&opt->diffopt);
 
-	if (diff_queue_is_empty()) {
+	if (diff_queue_is_empty(&opt->diffopt)) {
 		int saved_fmt = opt->diffopt.output_format;
 		opt->diffopt.output_format = DIFF_FORMAT_NO_OUTPUT;
 		diff_flush(&opt->diffopt);
-- 
gitgitgadget



* [PATCH v4 08/10] show, log: include conflict/warning messages in --remerge-diff headers
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
                         ` (6 preceding siblings ...)
  2022-01-21 19:12       ` [PATCH v4 07/10] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
@ 2022-01-21 19:12       ` Elijah Newren via GitGitGadget
  2022-01-21 19:12       ` [PATCH v4 09/10] merge-ort: mark conflict/warning messages from inner merges as omittable Elijah Newren via GitGitGadget
                         ` (2 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-01-21 19:12 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Conflicts such as modify/delete, rename/rename, or file/directory are
not representable via content conflict markers, and the normal output
messages notifying users about these were dropped with --remerge-diff.
While we don't want these messages randomly shown before the commit
and diff headers, we do want them to still be shown; include them as
part of the diff headers instead.
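
When a pathspec is given, only headers for matching paths should be
attached to the diff.  A minimal sketch of that selection, assuming a
plain directory-prefix comparison as a stand-in for match_pathspec()
(struct header and select_headers() are hypothetical names, not part
of this patch):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pairing of a path with its recorded header text. */
struct header {
	const char *path;
	const char *msg;
};

/*
 * Store in selected[] pointers to the entries of all[] whose path
 * starts with dir_prefix, and return how many were stored.  selected[]
 * must have room for nr pointers.  An empty dir_prefix selects all.
 */
size_t select_headers(const struct header *all, size_t nr,
		      const char *dir_prefix,
		      const struct header **selected)
{
	size_t i, n = 0, plen = strlen(dir_prefix);

	for (i = 0; i < nr; i++)
		if (!strncmp(all[i].path, dir_prefix, plen))
			selected[n++] = &all[i];
	return n;
}
```

A return value of 0 corresponds to the case where no additional
headers should be attached to the diff at all.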

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 log-tree.c              |  51 ++++++++++++++
 merge-ort.c             |   1 +
 merge-ort.h             |  10 +++
 t/t4069-remerge-diff.sh | 144 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 206 insertions(+)

diff --git a/log-tree.c b/log-tree.c
index 33c28f537a6..85bfd9e50d8 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -19,6 +19,7 @@
 #include "line-log.h"
 #include "help.h"
 #include "range-diff.h"
+#include "strmap.h"
 
 static struct decoration name_decoration = { "object names" };
 static int decoration_loaded;
@@ -905,6 +906,52 @@ static int do_diff_combined(struct rev_info *opt, struct commit *commit)
 	return !opt->loginfo;
 }
 
+static void setup_additional_headers(struct diff_options *o,
+				     struct strmap *all_headers)
+{
+	struct hashmap_iter iter;
+	struct strmap_entry *entry;
+
+	/*
+	 * Make o->additional_path_headers contain the subset of all_headers
+	 * that match o->pathspec.  If there aren't any that match o->pathspec,
+	 * then make o->additional_path_headers be NULL.
+	 */
+
+	if (!o->pathspec.nr) {
+		o->additional_path_headers = all_headers;
+		return;
+	}
+
+	o->additional_path_headers = xmalloc(sizeof(struct strmap));
+	strmap_init_with_options(o->additional_path_headers, NULL, 0);
+	strmap_for_each_entry(all_headers, &iter, entry) {
+		if (match_pathspec(the_repository->index, &o->pathspec,
+				   entry->key, strlen(entry->key),
+				   0 /* prefix */, NULL /* seen */,
+				   0 /* is_dir */))
+			strmap_put(o->additional_path_headers,
+				   entry->key, entry->value);
+	}
+	if (!strmap_get_size(o->additional_path_headers)) {
+		strmap_clear(o->additional_path_headers, 0);
+		FREE_AND_NULL(o->additional_path_headers);
+	}
+}
+
+static void cleanup_additional_headers(struct diff_options *o)
+{
+	if (!o->pathspec.nr) {
+		o->additional_path_headers = NULL;
+		return;
+	}
+	if (!o->additional_path_headers)
+		return;
+
+	strmap_clear(o->additional_path_headers, 0);
+	FREE_AND_NULL(o->additional_path_headers);
+}
+
 static int do_remerge_diff(struct rev_info *opt,
 			   struct commit_list *parents,
 			   struct object_id *oid,
@@ -922,6 +969,8 @@ static int do_remerge_diff(struct rev_info *opt,
 	/* Setup merge options */
 	init_merge_options(&o, the_repository);
 	o.show_rename_progress = 0;
+	o.record_conflict_msgs_as_headers = 1;
+	o.msg_header_prefix = "remerge";
 
 	ctx.abbrev = DEFAULT_ABBREV;
 	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
@@ -938,10 +987,12 @@ static int do_remerge_diff(struct rev_info *opt,
 	merge_incore_recursive(&o, bases, parent1, parent2, &res);
 
 	/* Show the diff */
+	setup_additional_headers(&opt->diffopt, res.path_messages);
 	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
 	log_tree_diff_flush(opt);
 
 	/* Cleanup */
+	cleanup_additional_headers(&opt->diffopt);
 	strbuf_release(&parent1_desc);
 	strbuf_release(&parent2_desc);
 	merge_finalize(&o, &res);
diff --git a/merge-ort.c b/merge-ort.c
index 481305d2bcf..43f980d2586 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4585,6 +4585,7 @@ redo:
 	trace2_region_leave("merge", "process_entries", opt->repo);
 
 	/* Set return values */
+	result->path_messages = &opt->priv->output;
 	result->tree = parse_tree_indirect(&working_tree_oid);
 	/* existence of conflicted entries implies unclean */
 	result->clean &= strmap_empty(&opt->priv->conflicted);
diff --git a/merge-ort.h b/merge-ort.h
index c011864ffeb..fe599b87868 100644
--- a/merge-ort.h
+++ b/merge-ort.h
@@ -5,6 +5,7 @@
 
 struct commit;
 struct tree;
+struct strmap;
 
 struct merge_result {
 	/*
@@ -23,6 +24,15 @@ struct merge_result {
 	 */
 	struct tree *tree;
 
+	/*
+	 * Special messages and conflict notices for various paths
+	 *
+	 * This is a map of pathnames to strbufs.  It contains various
+	 * warning/conflict/notice messages (possibly multiple per path)
+	 * that callers may want to use.
+	 */
+	struct strmap *path_messages;
+
 	/*
 	 * Additional metadata used by merge_switch_to_result() or future calls
 	 * to merge_incore_*().  Includes data needed to update the index (if
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index 5ef191f4fc9..86c5a33bd77 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -59,6 +59,7 @@ test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
 	git log -1 --oneline ab_resolution >tmp &&
 	cat <<-EOF >>tmp &&
 	diff --git a/numbers b/numbers
+	remerge CONFLICT (content): Merge conflict in numbers
 	index a1fb731..6875544 100644
 	--- a/numbers
 	+++ b/numbers
@@ -87,4 +88,147 @@ test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
 	test_cmp expect actual
 '
 
+test_expect_success 'setup non-content conflicts' '
+	git switch --orphan base &&
+
+	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
+	test_write_lines a b c d e f g h i >letters &&
+	test_write_lines in the way >content &&
+	git add numbers letters content &&
+	git commit -m base &&
+
+	git branch side1 &&
+	git branch side2 &&
+
+	git checkout side1 &&
+	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
+	git mv letters letters_side1 &&
+	git mv content file_or_directory &&
+	git add numbers &&
+	git commit -m side1 &&
+
+	git checkout side2 &&
+	git rm numbers &&
+	git mv letters letters_side2 &&
+	mkdir file_or_directory &&
+	echo hello >file_or_directory/world &&
+	git add file_or_directory/world &&
+	git commit -m side2 &&
+
+	git checkout -b resolution side1 &&
+	test_must_fail git merge side2 &&
+	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git add letters_side1 &&
+	git rm letters &&
+	git rm letters_side2 &&
+	git add file_or_directory~HEAD &&
+	git mv file_or_directory~HEAD wanted_content &&
+	git commit -m resolved
+'
+
+test_expect_success 'remerge-diff with non-content conflicts' '
+	git log -1 --oneline resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/file_or_directory~HASH (side1) b/wanted_content
+	similarity index 100%
+	rename from file_or_directory~HASH (side1)
+	rename to wanted_content
+	remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
+	diff --git a/letters b/letters
+	remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
+	diff --git a/letters_side2 b/letters_side2
+	deleted file mode 100644
+	index b236ae5..0000000
+	--- a/letters_side2
+	+++ /dev/null
+	@@ -1,9 +0,0 @@
+	-a
+	-b
+	-c
+	-d
+	-e
+	-f
+	-g
+	-h
+	-i
+	diff --git a/numbers b/numbers
+	remerge CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
+	EOF
+	# We still have some sha1 hashes above; rip them out so test works
+	# with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'remerge-diff w/ diff-filter=U: all conflict headers, no diff content' '
+	git log -1 --oneline resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/file_or_directory~HASH (side1) b/file_or_directory~HASH (side1)
+	remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
+	diff --git a/letters b/letters
+	remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
+	diff --git a/numbers b/numbers
+	remerge CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
+	EOF
+	# We still have some sha1 hashes above; rip them out so test works
+	# with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff --diff-filter=U resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'remerge-diff w/ diff-filter=R: relevant file + conflict header' '
+	git log -1 --oneline resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/file_or_directory~HASH (side1) b/wanted_content
+	similarity index 100%
+	rename from file_or_directory~HASH (side1)
+	rename to wanted_content
+	remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
+	EOF
+	# We still have some sha1 hashes above; rip them out so test works
+	# with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff --diff-filter=R resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'remerge-diff w/ pathspec: limits to relevant file including conflict header' '
+	git log -1 --oneline resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/letters b/letters
+	remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
+	diff --git a/letters_side2 b/letters_side2
+	deleted file mode 100644
+	index b236ae5..0000000
+	--- a/letters_side2
+	+++ /dev/null
+	@@ -1,9 +0,0 @@
+	-a
+	-b
+	-c
+	-d
+	-e
+	-f
+	-g
+	-h
+	-i
+	EOF
+	# We still have some sha1 hashes above; rip them out so test works
+	# with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff --full-history resolution -- "letters*" >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 113+ messages in thread

* [PATCH v4 09/10] merge-ort: mark conflict/warning messages from inner merges as omittable
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
                         ` (7 preceding siblings ...)
  2022-01-21 19:12       ` [PATCH v4 08/10] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
@ 2022-01-21 19:12       ` Elijah Newren via GitGitGadget
  2022-01-21 19:12       ` [PATCH v4 10/10] diff-merges: avoid history simplifications when diffing merges Elijah Newren via GitGitGadget
  2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-01-21 19:12 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

A recursive merge involves merging the merge bases of the two branches
being merged.  Such an inner merge can itself generate conflict notices.
While such notices may be useful when initially trying to create a
merge, they seem to just be noise when investigating merges later with
--remerge-diff.  (Especially when both sides of the outer merge resolved
the conflict the same way leading to no overall conflict.)  Remove them.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 43f980d2586..9bf15a01db8 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -638,7 +638,9 @@ static void path_msg(struct merge_options *opt,
 	struct strbuf tmp = STRBUF_INIT;
 
 	if (opt->record_conflict_msgs_as_headers && omittable_hint)
-		return; /* Do not record mere hints in tree */
+		return; /* Do not record mere hints in headers */
+	if (opt->record_conflict_msgs_as_headers && opt->priv->call_depth)
+		return; /* Do not record inner merge issues in headers */
 	sb = strmap_get(&opt->priv->output, path);
 	if (!sb) {
 		sb = xmalloc(sizeof(*sb));
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 113+ messages in thread

* [PATCH v4 10/10] diff-merges: avoid history simplifications when diffing merges
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
                         ` (8 preceding siblings ...)
  2022-01-21 19:12       ` [PATCH v4 09/10] merge-ort: mark conflict/warning messages from inner merges as omittable Elijah Newren via GitGitGadget
@ 2022-01-21 19:12       ` Elijah Newren via GitGitGadget
  2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
  10 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-01-21 19:12 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Diffs of merges are special; they should typically avoid history
simplification.  For example, with

    git log --diff-merges=first-parent -- path

the default history simplification would remove merge commits from
consideration if the file "path" matched the second parent.  That is
counter to what the user wants when looking for first-parent diffs.
Similar comments can be made for --diff-merges=separate (which diffs
against both parents) and --diff-merges=remerge (which diffs against a
remerge of the merge commit).

However, history simplification still makes sense when not diffing
merges, and it also makes sense for the combined and dense-combined
forms of diffing merges (because both of those are defined to only show
a diff when the merge result at the relevant paths differs from *both*
parents).

So, for separate, first-parent, and remerge styles of diff-merges, turn
off history simplification.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 diff-merges.c           |  2 ++
 t/t4069-remerge-diff.sh | 58 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/diff-merges.c b/diff-merges.c
index 0af4b3f9191..a833fd747ad 100644
--- a/diff-merges.c
+++ b/diff-merges.c
@@ -24,6 +24,7 @@ static void set_separate(struct rev_info *revs)
 {
 	suppress(revs);
 	revs->separate_merges = 1;
+	revs->simplify_history = 0;
 }
 
 static void set_first_parent(struct rev_info *revs)
@@ -50,6 +51,7 @@ static void set_remerge_diff(struct rev_info *revs)
 {
 	suppress(revs);
 	revs->remerge_diff = 1;
+	revs->simplify_history = 0;
 }
 
 static diff_merges_setup_func_t func_by_opt(const char *optarg)
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index 86c5a33bd77..962888cc7fb 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -226,7 +226,63 @@ test_expect_success 'remerge-diff w/ pathspec: limits to relevant file including
 	# with sha256
 	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
 
-	git show --oneline --remerge-diff --full-history resolution -- "letters*" >tmp &&
+	git show --oneline --remerge-diff resolution -- "letters*" >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'setup non-content conflicts' '
+	git switch --orphan newbase &&
+
+	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git commit -m base &&
+
+	git branch newside1 &&
+	git branch newside2 &&
+
+	git checkout newside1 &&
+	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git commit -m side1 &&
+
+	git checkout newside2 &&
+	test_write_lines 1 2 drei 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git commit -m side2 &&
+
+	git checkout -b newresolution newside1 &&
+	test_must_fail git merge newside2 &&
+	git checkout --theirs numbers &&
+	git add -u numbers &&
+	git commit -m resolved
+'
+
+test_expect_success 'remerge-diff turns off history simplification' '
+	git log -1 --oneline newresolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/numbers b/numbers
+	remerge CONFLICT (content): Merge conflict in numbers
+	index 070e9e7..5335e78 100644
+	--- a/numbers
+	+++ b/numbers
+	@@ -1,10 +1,6 @@
+	 1
+	 2
+	-<<<<<<< 96f1e45 (side1)
+	-three
+	-=======
+	 drei
+	->>>>>>> 4fd522f (side2)
+	 4
+	 5
+	 6
+	EOF
+	# We still have some sha1 hashes above; rip them out so test works
+	# with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff newresolution -- numbers >tmp &&
 	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
 	test_cmp expect actual
 '
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 113+ messages in thread

* Re: [PATCH v4 01/10] show, log: provide a --remerge-diff capability
  2022-01-21 19:12       ` [PATCH v4 01/10] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
@ 2022-02-01  9:09         ` Ævar Arnfjörð Bjarmason
  2022-02-01 16:40           ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-01  9:09 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren


On Fri, Jan 21 2022, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
> [...]
>  ifdef::git-log[]
> ---diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc)::
> +--diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc|remerge|r)::
>  --no-diff-merges::
>  	Specify diff format to be used for merge commits. Default is
>  	{diff-merges-default} unless `--first-parent` is in use, in which case
> @@ -64,6 +64,14 @@ ifdef::git-log[]
>  	each of the parents. Separate log entry and diff is generated
>  	for each parent.
>  +
> +--diff-merges=remerge:::
> +--diff-merges=r:::
> +--remerge-diff:::
> +	With this option, two-parent merge commits are remerged to
> +	create a temporary tree object -- potentially containing files
> +	with conflict markers and such.  A diff is then shown between
> +	that temporary tree and the actual merge commit.
> ++

Re some previous discussion. I really think we should add something like
this paragraph to this:
    
    The output emitted when this option is used is subject to change, and so
    is its interaction with other options (unless explicitly
    documented). I.e. many of the same caveats as the "OUTPUT STABILITY" in
    the linkgit:git-range-diff[1] documentation describes apply here. In
    particular other diff filtering options, pathspec limitations etc. may
    not produce the expected results, as some of those may apply to the
    "real" diff of the merge, and not on the generated "remerge-diff".

I think that would nicely give us permission to develop this further
without having to think about all the option interaction etc.

This is really useful right now, but I'd hate for it to get merged with
some bug/behavior that's not obvious to us now, and it being hard to fix
that because we'd have to consider the implicitly promised backwards
compatibility.

>  	int saved_dcctc = 0;
> +	struct tmp_objdir *remerge_objdir = NULL;
> +
> +	if (rev->remerge_diff) {
> +		remerge_objdir = tmp_objdir_create("remerge-diff");
> +		if (!remerge_objdir)
> +			die(_("unable to create temporary object directory"));

I guess the s/die_errno/die/ here is better for now, as we won't report
the wrong errno, but we also lose the common case of errno being right.
That can be fixed up with some other series to the tmp-objdir API.

> [...]
> +# This test is ort-specific
> +test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort || {
> +	skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
> +	test_done
> +}

FWIW this still uses a more complex pattern than it needs to; see
this v1 discussion (which you seemed to ack):

https://lore.kernel.org/git/CABPp-BE+4rZNP-5mT2MNOWR6y6BgEG6mt1r_qcrZtarom6aGsw@mail.gmail.com/

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v4 02/10] log: clean unneeded objects during `log --remerge-diff`
  2022-01-21 19:12       ` [PATCH v4 02/10] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
@ 2022-02-01  9:35         ` Ævar Arnfjörð Bjarmason
  2022-02-01 16:54           ` Elijah Newren
  0 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-01  9:35 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget
  Cc: git, Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren


On Fri, Jan 21 2022, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
> [...]
> @@ -944,7 +945,12 @@ static int do_remerge_diff(struct rev_info *opt,
>  	strbuf_release(&parent1_desc);
>  	strbuf_release(&parent2_desc);
>  	merge_finalize(&o, &res);
> -	/* TODO: clean up the temporary object directory */
> +
> +	/* Clean up the contents of the temporary object directory */
> +	if (opt->remerge_objdir)
> +		tmp_objdir_discard_objects(opt->remerge_objdir);
> +	else
> +		BUG("unable to remove temporary object directory");

Re the die in 1/10 I don't think this will ever trigger the way this bug
suggests.

If we didn't manage to remove the directory, that would be signalled by
the return value of tmp_objdir_discard_objects(), which you're adding
here, but it returns void and so cannot report failure.

So shouldn't it first of all be returning the "int" like the
remove_dir_recursively() user in tmp_objdir_destroy_1() makes use of?

What this bug is really about is:

    BUG("our juggling of opt->remerge_objdir between here and builtin/log.c is screwy")

Or something, because if we failed to remove the director(ies) we'll
just ignore that here.

> +void tmp_objdir_discard_objects(struct tmp_objdir *t)
> +{
> +	remove_dir_recursively(&t->path, REMOVE_DIR_KEEP_TOPLEVEL);
> +}

I skimmed remove_dir_recurse() a bit, but didn't test this, does this
remove just the "de/eadbeef..." in "de/eadbeef..." or also "de/",
i.e. do we (and do we want) to keep the fanned-out 256 loose top-level
directories throughout the operation?

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v4 01/10] show, log: provide a --remerge-diff capability
  2022-02-01  9:09         ` Ævar Arnfjörð Bjarmason
@ 2022-02-01 16:40           ` Elijah Newren
  0 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren @ 2022-02-01 16:40 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh,
	Johannes Altmanninger

On Tue, Feb 1, 2022 at 1:35 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> On Fri, Jan 21 2022, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
> > [...]
> >  ifdef::git-log[]
> > ---diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc)::
> > +--diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc|remerge|r)::
> >  --no-diff-merges::
> >       Specify diff format to be used for merge commits. Default is
> >       {diff-merges-default} unless `--first-parent` is in use, in which case
> > @@ -64,6 +64,14 @@ ifdef::git-log[]
> >       each of the parents. Separate log entry and diff is generated
> >       for each parent.
> >  +
> > +--diff-merges=remerge:::
> > +--diff-merges=r:::
> > +--remerge-diff:::
> > +     With this option, two-parent merge commits are remerged to
> > +     create a temporary tree object -- potentially containing files
> > +     with conflict markers and such.  A diff is then shown between
> > +     that temporary tree and the actual merge commit.
> > ++
>
> Re some previous discussion. I really think we should add something like
> this paragraph to this:
>
>     The output emitted when this option is used is subject to change, and so
>     is its interaction with other options (unless explicitly
>     documented). I.e. many of the same caveats as the "OUTPUT STABILITY" in
>     the linkgit:git-range-diff[1] documentation describes apply here. In
>     particular other diff filtering options, pathspec limitations etc. may
>     not produce the expected results, as some of those may apply to the
>     "real" diff of the merge, and not on the generated "remerge-diff".
>
> I think that would nicely give us permission to develop this further
> without having to think about all the option interaction etc.
>
> This is really useful right now, but I'd hate for it to get merged with
> some bug/behavior that's not obvious to us now, and it being hard to fix
> that because we'd have to consider the implicitly promised backwards
> compatibility.

Sure, I can add something.  I think the first sentence should be
sufficient though.

> >       int saved_dcctc = 0;
> > +     struct tmp_objdir *remerge_objdir = NULL;
> > +
> > +     if (rev->remerge_diff) {
> > +             remerge_objdir = tmp_objdir_create("remerge-diff");
> > +             if (!remerge_objdir)
> > +                     die(_("unable to create temporary object directory"));
>
> I guess the s/die_errno/die/ here is better for now, as we won't report
> the wrong errno, but we also lose the common case of errno being right.
> That can be fixed up with some other series to the tmp-objdir API.
>
> > [...]
> > +# This test is ort-specific
> > +test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort || {
> > +     skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
> > +     test_done
> > +}
>
> FWIW this still uses a more complex pattern than it needs to; see
> this v1 discussion (which you seemed to ack):
>
> https://lore.kernel.org/git/CABPp-BE+4rZNP-5mT2MNOWR6y6BgEG6mt1r_qcrZtarom6aGsw@mail.gmail.com/

Um, I thought I made this change.  How did I lose it?

Thanks for catching; will fix.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v4 02/10] log: clean unneeded objects during `log --remerge-diff`
  2022-02-01  9:35         ` Ævar Arnfjörð Bjarmason
@ 2022-02-01 16:54           ` Elijah Newren
  2022-02-02 11:17             ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 113+ messages in thread
From: Elijah Newren @ 2022-02-01 16:54 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh,
	Johannes Altmanninger

On Tue, Feb 1, 2022 at 1:45 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> On Fri, Jan 21 2022, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
> > [...]
> > @@ -944,7 +945,12 @@ static int do_remerge_diff(struct rev_info *opt,
> >       strbuf_release(&parent1_desc);
> >       strbuf_release(&parent2_desc);
> >       merge_finalize(&o, &res);
> > -     /* TODO: clean up the temporary object directory */
> > +
> > +     /* Clean up the contents of the temporary object directory */
> > +     if (opt->remerge_objdir)
> > +             tmp_objdir_discard_objects(opt->remerge_objdir);
> > +     else
> > +             BUG("unable to remove temporary object directory");
>
> Re the die in 1/10 I don't think this will ever trigger the way this bug
> suggests.
>
> If we didn't manage to remove the directory, that would be signalled by
> the return value of tmp_objdir_discard_objects(), which you're adding
> here, but it returns void and so cannot report failure.
>
> So shouldn't it first of all be returning the "int" like the
> remove_dir_recursively() user in tmp_objdir_destroy_1() makes use of?
>
> What this bug is really about is:
>
>     BUG("our juggling of opt->remerge_objdir between here and builtin/log.c is screwy")
>
> Or something, because if we failed to remove the director(ies) we'll
> just ignore that here.

Yeah, I think I'm suffering from leftover bits from earlier versions
since this patch series has been waiting for 17 months now.  I
switched it to

    BUG("did a remerge diff without remerge_objdir?!?");

>
> > +void tmp_objdir_discard_objects(struct tmp_objdir *t)
> > +{
> > +     remove_dir_recursively(&t->path, REMOVE_DIR_KEEP_TOPLEVEL);
> > +}
>
> I skimmed remove_dir_recurse() a bit, but didn't test this, does this
> remove just the "de/eadbeef..." in "de/eadbeef..." or also "de/",
> i.e. do we (and do we want) to keep the fanned-out 256 loose top-level
> directories throughout the operation?

It will remove everything below t->path, but leave t->path.  As such,
it'll nuke any of the 256 loose top-level directories that exist.
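
For illustration only, the "remove everything below the path, but keep
the path" behavior can be sketched in standalone C with POSIX nftw().
This is not git's implementation: discard_contents() and
remove_below_toplevel() are hypothetical names, and the patch itself
calls git's remove_dir_recursively() with REMOVE_DIR_KEEP_TOPLEVEL.

```c
#define _XOPEN_SOURCE 700
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* nftw() callback: delete every entry except the starting directory. */
int remove_below_toplevel(const char *path, const struct stat *sb,
			  int typeflag, struct FTW *ftwbuf)
{
	(void)sb;
	(void)typeflag;
	if (ftwbuf->level == 0)
		return 0; /* level 0 is the top-level directory; keep it */
	return remove(path); /* unlinks files, removes emptied directories */
}

/*
 * Hypothetical stand-in for tmp_objdir_discard_objects(): empties `dir`
 * but leaves `dir` itself in place.  Returns 0 on success and nonzero
 * on failure, so a caller could report the error.
 */
int discard_contents(const char *dir)
{
	/*
	 * FTW_DEPTH visits children before their parent directory, so
	 * each directory is already empty when it is removed.
	 * FTW_PHYS avoids following symbolic links.
	 */
	return nftw(dir, remove_below_toplevel, 16, FTW_DEPTH | FTW_PHYS);
}
```

With a layout like "de/adbeef..." under the temporary directory, this
sketch removes both the object file and the "de/" fan-out directory,
and only the top-level directory survives.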

If someone wants to come along later and measure performance and
determine if leaving those 256 loose top-level directories around
improves things, I think that's fine, but I'm not going to look at it
as part of this series.  I'm more curious about where tmp_objdir
creates the temporary directory; when the intent is to migrate the
objects into the main directory, it should probably be created on the
same filesystem.  When the intent is scratch space, like it is for
--remerge-diff, the tmp_objdir should probably be shoved in /dev/shm
or something like that.  But again, that's outside of this series.
This series already has had a long list of things keeping it from the
light of day; there's no need to add frills to it as part of the
initial submission.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH v5 00/10] Add a new --remerge-diff capability to show & log
  2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
                         ` (9 preceding siblings ...)
  2022-01-21 19:12       ` [PATCH v4 10/10] diff-merges: avoid history simplifications when diffing merges Elijah Newren via GitGitGadget
@ 2022-02-02  2:37       ` Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 01/10] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
                           ` (9 more replies)
  10 siblings, 10 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-02-02  2:37 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren

Here are some patches to add a --remerge-diff capability to show & log,
which works by comparing merge commits to an automatic remerge (note that
the automatic remerge tree can contain files with conflict markers).

Changes since v4:

 * Just a few minor tweaks -- comment on output being subject to change,
   simpler test skipping check, and improved bug message

Changes since v3:

 * Filter conflict headers according to pathspecs
 * Instead of always including conflict headers for all diff types, only
   select them with --diff-filter=U OR whenever the associated diff in
   question is selected
 * New testcases dealing with --diff-filter, pathspecs, and default history
   simplification
 * Switched back from die_errno() to die()
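
As a rough sketch of the pathspec filtering mentioned in the list
above: keep only the conflict headers whose path matches the pathspec,
and keep all of them when no pathspec is given.  The names below
(struct path_header, filter_headers) are hypothetical, and fnmatch(3)
is only a simplified stand-in for git's match_pathspec(), which
handles much more than shell globs.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* One conflict notice keyed by path (stand-in for a strmap entry). */
struct path_header {
	const char *path;
	const char *message;
};

/*
 * Copy into `out` the entries of `all` whose path matches `pattern`.
 * A NULL pattern means "no pathspec", so every entry is kept.
 * `out` must have room for `n` entries.
 * Returns the number of entries written to `out`.
 */
size_t filter_headers(const struct path_header *all, size_t n,
		      const char *pattern, struct path_header *out)
{
	size_t kept = 0;

	assert(n == 0 || (all && out));
	for (size_t i = 0; i < n; i++) {
		if (!pattern || fnmatch(pattern, all[i].path, 0) == 0)
			out[kept++] = all[i];
	}
	return kept;
}
```

With the three conflicted paths from the new tests (the renamed
file_or_directory entry, letters, and numbers), the pattern "letters*"
keeps only the letters header in this sketch.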

Changes NOT included (mostly because I'm not sure what to add or where):

 * Johannes Altmanninger suggested changing the ordering of the new headers
   relative to other headers. He made a good point, but I also like having
   the conflict messages next to the text, so I'm conflicted about what's
   best.
 * (Technically not part of this feature, but kind of related.) Months ago,
   Junio suggested documenting ${GIT_DIR}/AUTO_MERGE better
   (https://lore.kernel.org/git/xmqqtuj4nepe.fsf@gitster.g/). I looked at
   the time, but couldn't find a place to put it that made sense to me.

Changes since v2 (of the restarted submission):

 * Numerous small improvements suggested by Johannes Altmanninger
 * Avoid including conflict messages from inner merges (due to example
   pointed out by Ævar).
 * Added a "remerge" prefix to all the new diff headers (suggested by Junio
   in a previous round, but I couldn't come up with a good name before. It
   suddenly hit me that "remerge" is an obvious prefix to use, and even
   helps explain what the rest of the line is for.)
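
The inner-merge item in the list above boils down to a small decision
rule, sketched here as standalone C.  The struct and function names
are hypothetical; the two conditions mirror the checks that patch 09
adds to path_msg() in merge-ort.c, where the real fields are
record_conflict_msgs_as_headers and priv->call_depth.

```c
#include <stdbool.h>

/* Simplified stand-in for the relevant merge options. */
struct msg_opts {
	bool record_as_headers; /* messages become remerge-diff headers */
	int call_depth;         /* 0 for the outer merge, >0 for inner merges */
};

/* Decide whether a conflict/warning message should be recorded. */
bool should_record_message(const struct msg_opts *opt, bool omittable_hint)
{
	if (opt->record_as_headers && omittable_hint)
		return false; /* mere hints do not belong in headers */
	if (opt->record_as_headers && opt->call_depth)
		return false; /* inner-merge notices are just noise here */
	return true;
}
```

When messages are not being recorded as headers, nothing is filtered
in this sketch, which matches how the checks in the patch are guarded.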

Changes since v1 (of the restarted submission, which technically was v2):

 * Restructured the series, so the first patch introduces the feature --
   with a bunch of caveats. Subsequent patches clean up those caveats. This
   avoids introducing not-yet-used functions, and hopefully makes review
   easier.
 * added testcases
 * numerous small improvements suggested by Ævar and Junio

Changes since original submission[1]:

 * Rebased on top of the version of ns/tmp-objdir that Neeraj submitted
   (Neeraj's patches were based on v2.34, but ns/tmp-objdir got applied on
   an old commit and does not even build because of that).
 * Modify ll-merge API to return a status, instead of printing "Cannot merge
   binary files" on stdout[2] (as suggested by Peff)
 * Make conflict messages and other such warnings into diff headers of the
   subsequent remerge-diff rather than appearing in the diff as file content
   of some funny looking filenames (as suggested by Peff[3] and Junio[4])
 * Sergey ack'ed the diff-merges.c portion of the patches, but that wasn't
   limited to one patch so not sure where to record that ack.

[1] https://lore.kernel.org/git/pull.1080.git.git.1630376800.gitgitgadget@gmail.com/;
    GitHub wouldn't let me change the target branch for the PR, so I had to
    create a new one with the new base, which is why this was not sent as
    v2 even though it is.
[2] https://lore.kernel.org/git/YVOZRhWttzF18Xql@coredump.intra.peff.net/,
    https://lore.kernel.org/git/YVOZty9D7NRbzhE5@coredump.intra.peff.net/
[3] https://lore.kernel.org/git/YVOXPTjsp9lrxmS6@coredump.intra.peff.net/
[4] https://lore.kernel.org/git/xmqqr1d7e4ug.fsf@gitster.g/

=== FURTHER BACKGROUND (original cover letter material) ===

Here are some example commits you can try this out on (with git show
--remerge-diff $COMMIT):

 * git.git conflicted merge: 07601b5b36
 * git.git non-conflicted change: bf04590ecd
 * linux.git conflicted merge: eab3540562fb
 * linux.git non-conflicted change: 223cea6a4f05

Many more can be found by just running git log --merges --remerge-diff in
your repository of choice and searching for diffs (most merges tend to be
clean and unmodified and thus produce no diff but a search of '^diff' in the
log output tends to find the examples nicely).

Some basic high level details about this new option:

 * This option is most naturally compared to --cc, though the output seems
   to be much more understandable to most users than --cc output.
 * Since merges are often clean and unmodified, this new option results in
   an empty diff for most merges.
 * This new option shows things like the removal of conflict markers, which
   hunks users picked from the various conflicted sides to keep or remove,
   and shows changes made outside of conflict markers (which might reflect
   changes needed to resolve semantic conflicts or cleanups of e.g.
   compilation warnings or other additional changes an integrator felt
   belonged in the merged result).
 * This new option does not (currently) work for octopus merges, since
   merge-ort is specific to two-parent merges[1].
 * This option will not work on a read-only or full filesystem[2].
 * We discussed this capability at Git Merge 2020, and one of the
   suggestions was doing a periodic git gc --auto during the operation (due
   to potential new blobs and trees created during the operation). I found a
   way to avoid that; see [2].
 * This option is faster than you'd probably expect; it handles 33.5 merge
   commits per second in linux.git on my computer; see below.

Regarding the performance point above, the timing for running the
following command:

time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l


in linux.git (with v5.4 checked out, since my copy of linux is very out of
date) is as follows:

DIFF_FLAG=--cc:            71m 31.536s
DIFF_FLAG=--remerge-diff:  31m  3.170s


Note that there are 62476 merges in this history. Also, output size is:

DIFF_FLAG=--cc:            2169111 lines
DIFF_FLAG=--remerge-diff:  2458020 lines


So roughly the same amount of output as --cc, as you'd expect.
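
The merges-per-second figure quoted earlier follows directly from the
timings and the merge count above. A minimal sketch of that arithmetic (the
helper names are made up for illustration and are not part of this series):

```c
#include <assert.h>
#include <stdio.h>

/* Convert a `time`-style "Xm Y.YYYs" reading into seconds. */
static double to_seconds(int minutes, double seconds)
{
	assert(minutes >= 0 && seconds >= 0.0);
	return minutes * 60.0 + seconds;
}

/* Merge commits handled per second of wall-clock time. */
static double merges_per_second(long merges, double elapsed_seconds)
{
	assert(merges >= 0 && elapsed_seconds > 0.0);
	return merges / elapsed_seconds;
}

/* Print one summary line for a given diff flag. */
static void report(const char *flag, long merges, int minutes, double seconds)
{
	double elapsed = to_seconds(minutes, seconds);

	printf("%-15s %9.3fs  %5.1f merges/s\n",
	       flag, elapsed, merges_per_second(merges, elapsed));
}
```

Calling report("--remerge-diff", 62476, 31, 3.170) and
report("--cc", 62476, 71, 31.536) with the figures above works out to
roughly 33.5 merges per second for --remerge-diff and roughly 14.6 for --cc.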

As a side note: git log --remerge-diff, when run in various repositories and
allowed to run all the way back to the beginning(s) of history, is a nice
stress test of sorts for merge-ort, especially when users run it for you on
the repositories they work on, whether intentionally or via a bug in a tool
that triggers the command unexpectedly. Long story short, such a bug existed
in an internal tool in December 2020; the command was run on an internal
repository and found a platform-specific bug in merge-ort on some really old
merge commit from that repo. I fixed that bug (a STABLE_QSORT thing) while
upstreaming all the merge-ort patches in the meantime, but it was nice
getting extra testing. Having more folks run this on their repositories
might be useful extra testing of the new merge strategy.

Also, I previously mentioned --remerge-diff-only (a flag to show how
cherry-picks or reverts differ from an automatic cherry-pick or revert, in
addition to showing how merges differ from an automatic merge). This series
does not include the patches to introduce that option; I'll submit them
later.

Two other things that might be interesting but are not included and which I
haven't investigated:

 * some mechanism for passing extra merge options through (e.g.
   -Xignore-space-change)
 * a capability to compare the automatic merge to a second automatic merge
   done with different merge options. (Not sure if this would be of interest
   to end users, but it might be interesting while developing a new
   --strategy-option, or maybe for checking how changing some default in the
   merge algorithm would affect historical merges in various repositories.)

[1] I have nebulous ideas of how an Octopus-centric ORT strategy could be
written -- basically, just repeatedly invoking ort and trying to make sure
nested conflicts can be differentiated. For now, though, a simple warning is
printed that octopus merges are not handled and no diff will be shown.

[2] New blobs/trees can be written by the three-way merging step. These are
written to a temporary area (via tmp-objdir.c) under the git object store
that is cleaned up at the end of the operation, with the new loose objects
from the remerge being cleaned up after each individual merge.

Elijah Newren (10):
  show, log: provide a --remerge-diff capability
  log: clean unneeded objects during `log --remerge-diff`
  ll-merge: make callers responsible for showing warnings
  merge-ort: capture and print ll-merge warnings in our preferred
    fashion
  merge-ort: mark a few more conflict messages as omittable
  merge-ort: format messages slightly different for use in headers
  diff: add ability to insert additional headers for paths
  show, log: include conflict/warning messages in --remerge-diff headers
  merge-ort: mark conflict/warning messages from inner merges as
    omittable
  diff-merges: avoid history simplifications when diffing merges

 Documentation/diff-options.txt |  14 +-
 apply.c                        |   5 +-
 builtin/checkout.c             |  12 +-
 builtin/log.c                  |  15 ++
 diff-merges.c                  |  14 ++
 diff.c                         | 124 +++++++++++++-
 diff.h                         |   3 +-
 ll-merge.c                     |  40 +++--
 ll-merge.h                     |   9 +-
 log-tree.c                     | 118 ++++++++++++-
 merge-blobs.c                  |   5 +-
 merge-ort.c                    |  55 ++++++-
 merge-ort.h                    |  10 ++
 merge-recursive.c              |   9 +-
 merge-recursive.h              |   2 +
 notes-merge.c                  |   5 +-
 rerere.c                       |   9 +-
 revision.h                     |   6 +-
 t/t4069-remerge-diff.sh        | 291 +++++++++++++++++++++++++++++++++
 t/t6404-recursive-merge.sh     |   9 +-
 t/t6406-merge-attr.sh          |   9 +-
 tmp-objdir.c                   |   5 +
 tmp-objdir.h                   |   6 +
 23 files changed, 727 insertions(+), 48 deletions(-)
 create mode 100755 t/t4069-remerge-diff.sh


base-commit: 4e44121c2d7bced65e25eb7ec5156290132bec94
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1103%2Fnewren%2Fremerge-diff-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1103/newren/remerge-diff-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/1103

Range-diff vs v4:

  1:  0b94724311d !  1:  0a260125266 show, log: provide a --remerge-diff capability
     @@ Documentation/diff-options.txt: ifdef::git-log[]
      +	create a temporary tree object -- potentially containing files
      +	with conflict markers and such.  A diff is then shown between
      +	that temporary tree and the actual merge commit.
     +++
     ++The output emitted when this option is used is subject to change, and
     ++so is its interaction with other options (unless explicitly
     ++documented).
      ++
       --diff-merges=combined:::
       --diff-merges=c:::
     @@ t/t4069-remerge-diff.sh (new)
      +. ./test-lib.sh
      +
      +# This test is ort-specific
     -+test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort || {
     ++if test "${GIT_TEST_MERGE_ALGORITHM}" != ort
     ++then
      +	skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
      +	test_done
     -+}
     ++fi
      +
      +test_expect_success 'setup basic merges' '
      +	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
  2:  f06de6c1b2f !  2:  ed0d60de24c log: clean unneeded objects during `log --remerge-diff`
     @@ log-tree.c: static int do_remerge_diff(struct rev_info *opt,
      +	if (opt->remerge_objdir)
      +		tmp_objdir_discard_objects(opt->remerge_objdir);
      +	else
     -+		BUG("unable to remove temporary object directory");
     ++		BUG("did a remerge diff without remerge_objdir?!?");
       
       	return !opt->loginfo;
       }
  3:  8d6c3d48f0e =  3:  ba4de88f2c4 ll-merge: make callers responsible for showing warnings
  4:  de8e8f88fa4 =  4:  d7a1f4e1f9f merge-ort: capture and print ll-merge warnings in our preferred fashion
  5:  6b535a4d55a =  5:  cbde4e5d372 merge-ort: mark a few more conflict messages as omittable
  6:  e2441608c63 =  6:  d3e4242a5bd merge-ort: format messages slightly different for use in headers
  7:  62734beb693 =  7:  4d79da6e20a diff: add ability to insert additional headers for paths
  8:  17eccf7e0d6 =  8:  ff9c14b0b7c show, log: include conflict/warning messages in --remerge-diff headers
  9:  b3e7656cfc6 =  9:  aa63860cd0f merge-ort: mark conflict/warning messages from inner merges as omittable
 10:  ea5df61cf35 = 10:  59d12f213b2 diff-merges: avoid history simplifications when diffing merges

-- 
gitgitgadget


* [PATCH v5 01/10] show, log: provide a --remerge-diff capability
  2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
@ 2022-02-02  2:37         ` Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 02/10] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
                           ` (8 subsequent siblings)
  9 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-02-02  2:37 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When this option is specified, we remerge all (two parent) merge commits
and diff the actual merge commit to the automatically created version,
in order to show how users removed conflict markers, resolved the
different conflict versions, and potentially added new changes outside
of conflict regions in order to resolve semantic merge problems (or,
possibly, just to hide other random changes).

This capability works by creating a temporary object directory and
marking it as the primary object store.  This makes it so that any blobs
or trees created during the automatic merge are easily removable
afterwards by just deleting all objects from the temporary object
directory.

There are a few ways that this implementation is suboptimal:
  * `log --remerge-diff` becomes slow, because the temporary object
    directory can fill with many loose objects while running
  * the log output can be muddied with misplaced "warning: cannot merge
    binary files" messages, since ll-merge.c unconditionally writes those
    messages to stderr while running instead of allowing callers to
    manage them.
  * important conflict and warning messages are simply dropped; thus for
    conflicts like modify/delete or rename/rename or file/directory which
    are not representable with content conflict markers, there may be no
    way for a user of --remerge-diff to know that there had been a
    conflict which was resolved (and which possibly motivated other
    changes in the merge commit).
  * when fixing the previous issue, note that some unimportant conflict
    and warning messages might start being included.  We should instead
    make sure these remain dropped.
Subsequent commits will address these issues.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/diff-options.txt | 14 +++++-
 builtin/log.c                  | 14 ++++++
 diff-merges.c                  | 12 +++++
 log-tree.c                     | 59 ++++++++++++++++++++++
 revision.h                     |  3 +-
 t/t4069-remerge-diff.sh        | 91 ++++++++++++++++++++++++++++++++++
 6 files changed, 191 insertions(+), 2 deletions(-)
 create mode 100755 t/t4069-remerge-diff.sh

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index c89d530d3d1..7e27841a95b 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -34,7 +34,7 @@ endif::git-diff[]
 endif::git-format-patch[]
 
 ifdef::git-log[]
---diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc)::
+--diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc|remerge|r)::
 --no-diff-merges::
 	Specify diff format to be used for merge commits. Default is
 	{diff-merges-default} unless `--first-parent` is in use, in which case
@@ -64,6 +64,18 @@ ifdef::git-log[]
 	each of the parents. Separate log entry and diff is generated
 	for each parent.
 +
+--diff-merges=remerge:::
+--diff-merges=r:::
+--remerge-diff:::
+	With this option, two-parent merge commits are remerged to
+	create a temporary tree object -- potentially containing files
+	with conflict markers and such.  A diff is then shown between
+	that temporary tree and the actual merge commit.
++
+The output emitted when this option is used is subject to change, and
+so is its interaction with other options (unless explicitly
+documented).
++
 --diff-merges=combined:::
 --diff-merges=c:::
 -c:::
diff --git a/builtin/log.c b/builtin/log.c
index f75d87e8d7f..846ba0f995a 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -35,6 +35,7 @@
 #include "repository.h"
 #include "commit-reach.h"
 #include "range-diff.h"
+#include "tmp-objdir.h"
 
 #define MAIL_DEFAULT_WRAP 72
 #define COVER_FROM_AUTO_MAX_SUBJECT_LEN 100
@@ -406,6 +407,14 @@ static int cmd_log_walk(struct rev_info *rev)
 	struct commit *commit;
 	int saved_nrl = 0;
 	int saved_dcctc = 0;
+	struct tmp_objdir *remerge_objdir = NULL;
+
+	if (rev->remerge_diff) {
+		remerge_objdir = tmp_objdir_create("remerge-diff");
+		if (!remerge_objdir)
+			die(_("unable to create temporary object directory"));
+		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
+	}
 
 	if (rev->early_output)
 		setup_early_output();
@@ -449,6 +458,9 @@ static int cmd_log_walk(struct rev_info *rev)
 	rev->diffopt.no_free = 0;
 	diff_free(&rev->diffopt);
 
+	if (rev->remerge_diff)
+		tmp_objdir_destroy(remerge_objdir);
+
 	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
 	    rev->diffopt.flags.check_failed) {
 		return 02;
@@ -1943,6 +1955,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 		die(_("--name-status does not make sense"));
 	if (rev.diffopt.output_format & DIFF_FORMAT_CHECKDIFF)
 		die(_("--check does not make sense"));
+	if (rev.remerge_diff)
+		die(_("--remerge-diff does not make sense"));
 
 	if (!use_patch_format &&
 		(!rev.diffopt.output_format ||
diff --git a/diff-merges.c b/diff-merges.c
index 5060ccd890b..0af4b3f9191 100644
--- a/diff-merges.c
+++ b/diff-merges.c
@@ -17,6 +17,7 @@ static void suppress(struct rev_info *revs)
 	revs->combined_all_paths = 0;
 	revs->merges_imply_patch = 0;
 	revs->merges_need_diff = 0;
+	revs->remerge_diff = 0;
 }
 
 static void set_separate(struct rev_info *revs)
@@ -45,6 +46,12 @@ static void set_dense_combined(struct rev_info *revs)
 	revs->dense_combined_merges = 1;
 }
 
+static void set_remerge_diff(struct rev_info *revs)
+{
+	suppress(revs);
+	revs->remerge_diff = 1;
+}
+
 static diff_merges_setup_func_t func_by_opt(const char *optarg)
 {
 	if (!strcmp(optarg, "off") || !strcmp(optarg, "none"))
@@ -57,6 +64,8 @@ static diff_merges_setup_func_t func_by_opt(const char *optarg)
 		return set_combined;
 	else if (!strcmp(optarg, "cc") || !strcmp(optarg, "dense-combined"))
 		return set_dense_combined;
+	else if (!strcmp(optarg, "r") || !strcmp(optarg, "remerge"))
+		return set_remerge_diff;
 	else if (!strcmp(optarg, "m") || !strcmp(optarg, "on"))
 		return set_to_default;
 	return NULL;
@@ -110,6 +119,9 @@ int diff_merges_parse_opts(struct rev_info *revs, const char **argv)
 	} else if (!strcmp(arg, "--cc")) {
 		set_dense_combined(revs);
 		revs->merges_imply_patch = 1;
+	} else if (!strcmp(arg, "--remerge-diff")) {
+		set_remerge_diff(revs);
+		revs->merges_imply_patch = 1;
 	} else if (!strcmp(arg, "--no-diff-merges")) {
 		suppress(revs);
 	} else if (!strcmp(arg, "--combined-all-paths")) {
diff --git a/log-tree.c b/log-tree.c
index 644893fd8cf..84ed864fc81 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "commit-reach.h"
 #include "config.h"
 #include "diff.h"
 #include "object-store.h"
@@ -7,6 +8,7 @@
 #include "tag.h"
 #include "graph.h"
 #include "log-tree.h"
+#include "merge-ort.h"
 #include "reflog-walk.h"
 #include "refs.h"
 #include "string-list.h"
@@ -902,6 +904,51 @@ static int do_diff_combined(struct rev_info *opt, struct commit *commit)
 	return !opt->loginfo;
 }
 
+static int do_remerge_diff(struct rev_info *opt,
+			   struct commit_list *parents,
+			   struct object_id *oid,
+			   struct commit *commit)
+{
+	struct merge_options o;
+	struct commit_list *bases;
+	struct merge_result res = {0};
+	struct pretty_print_context ctx = {0};
+	struct commit *parent1 = parents->item;
+	struct commit *parent2 = parents->next->item;
+	struct strbuf parent1_desc = STRBUF_INIT;
+	struct strbuf parent2_desc = STRBUF_INIT;
+
+	/* Setup merge options */
+	init_merge_options(&o, the_repository);
+	o.show_rename_progress = 0;
+
+	ctx.abbrev = DEFAULT_ABBREV;
+	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
+	format_commit_message(parent2, "%h (%s)", &parent2_desc, &ctx);
+	o.branch1 = parent1_desc.buf;
+	o.branch2 = parent2_desc.buf;
+
+	/* Parse the relevant commits and get the merge bases */
+	parse_commit_or_die(parent1);
+	parse_commit_or_die(parent2);
+	bases = get_merge_bases(parent1, parent2);
+
+	/* Re-merge the parents */
+	merge_incore_recursive(&o, bases, parent1, parent2, &res);
+
+	/* Show the diff */
+	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
+	log_tree_diff_flush(opt);
+
+	/* Cleanup */
+	strbuf_release(&parent1_desc);
+	strbuf_release(&parent2_desc);
+	merge_finalize(&o, &res);
+	/* TODO: clean up the temporary object directory */
+
+	return !opt->loginfo;
+}
+
 /*
  * Show the diff of a commit.
  *
@@ -936,6 +983,18 @@ static int log_tree_diff(struct rev_info *opt, struct commit *commit, struct log
 	}
 
 	if (is_merge) {
+		int octopus = (parents->next->next != NULL);
+
+		if (opt->remerge_diff) {
+			if (octopus) {
+				show_log(opt);
+				fprintf(opt->diffopt.file,
+					"diff: warning: Skipping remerge-diff "
+					"for octopus merges.\n");
+				return 1;
+			}
+			return do_remerge_diff(opt, parents, oid, commit);
+		}
 		if (opt->combine_merges)
 			return do_diff_combined(opt, commit);
 		if (opt->separate_merges) {
diff --git a/revision.h b/revision.h
index 5578bb4720a..13178e6b8f3 100644
--- a/revision.h
+++ b/revision.h
@@ -195,7 +195,8 @@ struct rev_info {
 			combine_merges:1,
 			combined_all_paths:1,
 			dense_combined_merges:1,
-			first_parent_merges:1;
+			first_parent_merges:1,
+			remerge_diff:1;
 
 	/* Format info */
 	int		show_notes;
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
new file mode 100755
index 00000000000..d7ab0f50066
--- /dev/null
+++ b/t/t4069-remerge-diff.sh
@@ -0,0 +1,91 @@
+#!/bin/sh
+
+test_description='remerge-diff handling'
+
+. ./test-lib.sh
+
+# This test is ort-specific
+if test "${GIT_TEST_MERGE_ALGORITHM}" != ort
+then
+	skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
+	test_done
+fi
+
+test_expect_success 'setup basic merges' '
+	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git commit -m base &&
+
+	git branch feature_a &&
+	git branch feature_b &&
+	git branch feature_c &&
+
+	git branch ab_resolution &&
+	git branch bc_resolution &&
+
+	git checkout feature_a &&
+	test_write_lines 1 2 three 4 5 6 7 eight 9 >numbers &&
+	git commit -a -m change_a &&
+
+	git checkout feature_b &&
+	test_write_lines 1 2 tres 4 5 6 7 8 9 >numbers &&
+	git commit -a -m change_b &&
+
+	git checkout feature_c &&
+	test_write_lines 1 2 3 4 5 6 7 8 9 10 >numbers &&
+	git commit -a -m change_c &&
+
+	git checkout bc_resolution &&
+	git merge --ff-only feature_b &&
+	# no conflict
+	git merge feature_c &&
+
+	git checkout ab_resolution &&
+	git merge --ff-only feature_a &&
+	# conflicts!
+	test_must_fail git merge feature_b &&
+	# Resolve conflict...and make another change elsewhere
+	test_write_lines 1 2 drei 4 5 6 7 acht 9 >numbers &&
+	git add numbers &&
+	git merge --continue
+'
+
+test_expect_success 'remerge-diff on a clean merge' '
+	git log -1 --oneline bc_resolution >expect &&
+	git show --oneline --remerge-diff bc_resolution >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'remerge-diff with both a resolved conflict and an unrelated change' '
+	git log -1 --oneline ab_resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/numbers b/numbers
+	index a1fb731..6875544 100644
+	--- a/numbers
+	+++ b/numbers
+	@@ -1,13 +1,9 @@
+	 1
+	 2
+	-<<<<<<< b0ed5cb (change_a)
+	-three
+	-=======
+	-tres
+	->>>>>>> 6cd3f82 (change_b)
+	+drei
+	 4
+	 5
+	 6
+	 7
+	-eight
+	+acht
+	 9
+	EOF
+	# Hashes above are sha1; rip them out so test works with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff ab_resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
gitgitgadget



* [PATCH v5 02/10] log: clean unneeded objects during `log --remerge-diff`
  2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 01/10] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
@ 2022-02-02  2:37         ` Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 03/10] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
                           ` (7 subsequent siblings)
  9 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-02-02  2:37 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

The --remerge-diff option will need to create new blobs and trees
representing the "automatic merge" state.  If one is traversing a
long project history, one can easily get hundreds of thousands of
loose objects generated during `log --remerge-diff`.  However, none of
those loose objects are needed after we have completed our diff
operation; they can be summarily deleted.

Add a new helper function to tmp_objdir to discard all the contained
objects, and call it after each merge is handled.
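
To illustrate the idea outside of git: the helper empties a directory but
keeps the directory itself, so the next merge can write into it again. The
patch does this with remove_dir_recursively() and REMOVE_DIR_KEEP_TOPLEVEL;
the following is only a standalone POSIX sketch of the same behaviour, and
remove_children(), remove_tree() and discard_contents() are hypothetical
names, not git functions:

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

static int remove_tree(const char *path, int keep_toplevel);

/* Remove every entry inside `dir`; returns 0 on success, -1 on failure. */
static int remove_children(const char *dir)
{
	DIR *d = opendir(dir);
	struct dirent *e;
	int ret = 0;

	if (!d)
		return -1;
	while ((e = readdir(d)) != NULL) {
		size_t len;
		char *child;

		if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, ".."))
			continue;
		len = strlen(dir) + strlen(e->d_name) + 2;
		child = malloc(len);
		if (!child) {
			ret = -1;
			break;
		}
		snprintf(child, len, "%s/%s", dir, e->d_name);
		if (remove_tree(child, 0))
			ret = -1;
		free(child);
	}
	closedir(d);
	return ret;
}

/* Remove `path` recursively; a directory survives if keep_toplevel is set. */
static int remove_tree(const char *path, int keep_toplevel)
{
	struct stat st;

	if (lstat(path, &st))
		return -1;
	if (!S_ISDIR(st.st_mode))
		return unlink(path);
	if (remove_children(path))
		return -1;
	return keep_toplevel ? 0 : rmdir(path);
}

/*
 * Hypothetical stand-in for tmp_objdir_discard_objects(): empty `dir`
 * (which must be a directory) but leave it in place for reuse.
 */
static int discard_contents(const char *dir)
{
	assert(dir);
	return remove_tree(dir, 1);
}
```

Calling discard_contents() after each merge keeps the number of loose
objects on disk bounded by what a single remerge produces, instead of
growing with the length of the history being walked.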

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/log.c | 13 +++++++------
 log-tree.c    |  8 +++++++-
 revision.h    |  3 +++
 tmp-objdir.c  |  5 +++++
 tmp-objdir.h  |  6 ++++++
 5 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 846ba0f995a..ac550e1ae62 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -407,13 +407,12 @@ static int cmd_log_walk(struct rev_info *rev)
 	struct commit *commit;
 	int saved_nrl = 0;
 	int saved_dcctc = 0;
-	struct tmp_objdir *remerge_objdir = NULL;
 
 	if (rev->remerge_diff) {
-		remerge_objdir = tmp_objdir_create("remerge-diff");
-		if (!remerge_objdir)
+		rev->remerge_objdir = tmp_objdir_create("remerge-diff");
+		if (!rev->remerge_objdir)
 			die(_("unable to create temporary object directory"));
-		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
+		tmp_objdir_replace_primary_odb(rev->remerge_objdir, 1);
 	}
 
 	if (rev->early_output)
@@ -458,8 +457,10 @@ static int cmd_log_walk(struct rev_info *rev)
 	rev->diffopt.no_free = 0;
 	diff_free(&rev->diffopt);
 
-	if (rev->remerge_diff)
-		tmp_objdir_destroy(remerge_objdir);
+	if (rev->remerge_diff) {
+		tmp_objdir_destroy(rev->remerge_objdir);
+		rev->remerge_objdir = NULL;
+	}
 
 	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
 	    rev->diffopt.flags.check_failed) {
diff --git a/log-tree.c b/log-tree.c
index 84ed864fc81..89da7de5dbf 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -4,6 +4,7 @@
 #include "diff.h"
 #include "object-store.h"
 #include "repository.h"
+#include "tmp-objdir.h"
 #include "commit.h"
 #include "tag.h"
 #include "graph.h"
@@ -944,7 +945,12 @@ static int do_remerge_diff(struct rev_info *opt,
 	strbuf_release(&parent1_desc);
 	strbuf_release(&parent2_desc);
 	merge_finalize(&o, &res);
-	/* TODO: clean up the temporary object directory */
+
+	/* Clean up the contents of the temporary object directory */
+	if (opt->remerge_objdir)
+		tmp_objdir_discard_objects(opt->remerge_objdir);
+	else
+		BUG("did a remerge diff without remerge_objdir?!?");
 
 	return !opt->loginfo;
 }
diff --git a/revision.h b/revision.h
index 13178e6b8f3..44efce3f410 100644
--- a/revision.h
+++ b/revision.h
@@ -318,6 +318,9 @@ struct rev_info {
 
 	/* misc. flags related to '--no-kept-objects' */
 	unsigned keep_pack_cache_flags;
+
+	/* Location where temporary objects for remerge-diff are written. */
+	struct tmp_objdir *remerge_objdir;
 };
 
 int ref_excluded(struct string_list *, const char *path);
diff --git a/tmp-objdir.c b/tmp-objdir.c
index 3d38eeab66b..adf6033549e 100644
--- a/tmp-objdir.c
+++ b/tmp-objdir.c
@@ -79,6 +79,11 @@ static void remove_tmp_objdir_on_signal(int signo)
 	raise(signo);
 }
 
+void tmp_objdir_discard_objects(struct tmp_objdir *t)
+{
+	remove_dir_recursively(&t->path, REMOVE_DIR_KEEP_TOPLEVEL);
+}
+
 /*
  * These env_* functions are for setting up the child environment; the
  * "replace" variant overrides the value of any existing variable with that
diff --git a/tmp-objdir.h b/tmp-objdir.h
index cda5ec76778..76efc7edee5 100644
--- a/tmp-objdir.h
+++ b/tmp-objdir.h
@@ -46,6 +46,12 @@ int tmp_objdir_migrate(struct tmp_objdir *);
  */
 int tmp_objdir_destroy(struct tmp_objdir *);
 
+/*
+ * Remove all objects from the temporary object directory, while leaving it
+ * around so more objects can be added.
+ */
+void tmp_objdir_discard_objects(struct tmp_objdir *);
+
 /*
  * Add the temporary object directory as an alternate object store in the
  * current process.
-- 
gitgitgadget



* [PATCH v5 03/10] ll-merge: make callers responsible for showing warnings
  2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 01/10] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 02/10] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
@ 2022-02-02  2:37         ` Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 04/10] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
                           ` (6 subsequent siblings)
  9 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-02-02  2:37 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Since some callers may want to send warning messages to somewhere other
than stdout/stderr, stop printing "warning: Cannot merge binary files"
from ll-merge and instead modify the return status of ll_merge() to
indicate when a merge of binary files has occurred.  Message printing
probably does not belong in a "low-level merge" anyway.

This commit continues printing the message as-is, just from the callers
instead of within ll_merge().  Future changes will start handling the
message differently in the merge-ort codepath.

There was one special case here: the callers in rerere.c do NOT check
for and print such a message; since those code paths explicitly skip
over binary files, there is no reason to check for a return status of
LL_MERGE_BINARY_CONFLICT or print the related message.

Note that my methodology included first modifying ll_merge() to return
a struct, so that the compiler would catch all the callers for me and
ensure I had modified all of them.  After modifying all of them, I then
changed the struct to an enum.
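
A rough, standalone sketch of what this looks like from a caller's point of
view. The enumerator names LL_MERGE_OK, LL_MERGE_CONFLICT and
LL_MERGE_BINARY_CONFLICT come from this patch; LL_MERGE_ERROR, the numeric
values, and both helper functions are assumptions made for illustration
only:

```c
#include <assert.h>
#include <stdio.h>

/* Assumed shape of the status type; the values are illustrative. */
enum ll_merge_result {
	LL_MERGE_ERROR = -1,	/* assumed name for the "negative means error" case */
	LL_MERGE_OK = 0,
	LL_MERGE_CONFLICT,
	LL_MERGE_BINARY_CONFLICT,
};

/*
 * Hypothetical helper mirroring the normalization in the diff below:
 * any positive raw driver status becomes LL_MERGE_CONFLICT, while zero
 * and negative values are passed through.
 */
static enum ll_merge_result normalize_status(int raw)
{
	if (raw > 0)
		return LL_MERGE_CONFLICT;
	return raw < 0 ? LL_MERGE_ERROR : LL_MERGE_OK;
}

/*
 * Hypothetical caller: the low-level merge no longer prints anything, so
 * the caller decides where the binary-file warning goes.
 * Returns -1 on error, 1 if the path conflicted, 0 if it merged cleanly.
 */
static int report_merge(enum ll_merge_result status, const char *path,
			FILE *out)
{
	assert(path && out);
	if (status == LL_MERGE_BINARY_CONFLICT)
		fprintf(out,
			"warning: Cannot merge binary files: %s (%s vs. %s)\n",
			path, "ours", "theirs");
	if (status < 0)
		return -1;
	return status != LL_MERGE_OK;
}
```

The intermediate "return a struct" step works because an assignment such as
`int status = ll_merge(...)` no longer compiles once the return type is a
struct, so no caller can be missed; after the switch to an enum, existing
checks like `status < 0` keep working unchanged.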

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 apply.c            |  5 ++++-
 builtin/checkout.c | 12 ++++++++----
 ll-merge.c         | 40 ++++++++++++++++++++++------------------
 ll-merge.h         |  9 ++++++++-
 merge-blobs.c      |  5 ++++-
 merge-ort.c        |  5 ++++-
 merge-recursive.c  |  5 ++++-
 notes-merge.c      |  5 ++++-
 rerere.c           |  9 +++++----
 9 files changed, 63 insertions(+), 32 deletions(-)

diff --git a/apply.c b/apply.c
index 43a0aebf4ee..8079395755f 100644
--- a/apply.c
+++ b/apply.c
@@ -3492,7 +3492,7 @@ static int three_way_merge(struct apply_state *state,
 {
 	mmfile_t base_file, our_file, their_file;
 	mmbuffer_t result = { NULL };
-	int status;
+	enum ll_merge_result status;
 
 	/* resolve trivial cases first */
 	if (oideq(base, ours))
@@ -3509,6 +3509,9 @@ static int three_way_merge(struct apply_state *state,
 			  &their_file, "theirs",
 			  state->repo->index,
 			  NULL);
+	if (status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, "ours", "theirs");
 	free(base_file.ptr);
 	free(our_file.ptr);
 	free(their_file.ptr);
diff --git a/builtin/checkout.c b/builtin/checkout.c
index cbf73b8c9f6..3a559d69303 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -237,6 +237,7 @@ static int checkout_merged(int pos, const struct checkout *state,
 	struct cache_entry *ce = active_cache[pos];
 	const char *path = ce->name;
 	mmfile_t ancestor, ours, theirs;
+	enum ll_merge_result merge_status;
 	int status;
 	struct object_id oid;
 	mmbuffer_t result_buf;
@@ -267,13 +268,16 @@ static int checkout_merged(int pos, const struct checkout *state,
 	memset(&ll_opts, 0, sizeof(ll_opts));
 	git_config_get_bool("merge.renormalize", &renormalize);
 	ll_opts.renormalize = renormalize;
-	status = ll_merge(&result_buf, path, &ancestor, "base",
-			  &ours, "ours", &theirs, "theirs",
-			  state->istate, &ll_opts);
+	merge_status = ll_merge(&result_buf, path, &ancestor, "base",
+				&ours, "ours", &theirs, "theirs",
+				state->istate, &ll_opts);
 	free(ancestor.ptr);
 	free(ours.ptr);
 	free(theirs.ptr);
-	if (status < 0 || !result_buf.ptr) {
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, "ours", "theirs");
+	if (merge_status < 0 || !result_buf.ptr) {
 		free(result_buf.ptr);
 		return error(_("path '%s': cannot merge"), path);
 	}
diff --git a/ll-merge.c b/ll-merge.c
index 261657578c7..a937cec59a6 100644
--- a/ll-merge.c
+++ b/ll-merge.c
@@ -14,7 +14,7 @@
 
 struct ll_merge_driver;
 
-typedef int (*ll_merge_fn)(const struct ll_merge_driver *,
+typedef enum ll_merge_result (*ll_merge_fn)(const struct ll_merge_driver *,
 			   mmbuffer_t *result,
 			   const char *path,
 			   mmfile_t *orig, const char *orig_name,
@@ -49,7 +49,7 @@ void reset_merge_attributes(void)
 /*
  * Built-in low-levels
  */
-static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_binary_merge(const struct ll_merge_driver *drv_unused,
 			   mmbuffer_t *result,
 			   const char *path,
 			   mmfile_t *orig, const char *orig_name,
@@ -58,6 +58,7 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 			   const struct ll_merge_options *opts,
 			   int marker_size)
 {
+	enum ll_merge_result ret;
 	mmfile_t *stolen;
 	assert(opts);
 
@@ -68,16 +69,19 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 	 */
 	if (opts->virtual_ancestor) {
 		stolen = orig;
+		ret = LL_MERGE_OK;
 	} else {
 		switch (opts->variant) {
 		default:
-			warning("Cannot merge binary files: %s (%s vs. %s)",
-				path, name1, name2);
-			/* fallthru */
+			ret = LL_MERGE_BINARY_CONFLICT;
+			stolen = src1;
+			break;
 		case XDL_MERGE_FAVOR_OURS:
+			ret = LL_MERGE_OK;
 			stolen = src1;
 			break;
 		case XDL_MERGE_FAVOR_THEIRS:
+			ret = LL_MERGE_OK;
 			stolen = src2;
 			break;
 		}
@@ -87,16 +91,10 @@ static int ll_binary_merge(const struct ll_merge_driver *drv_unused,
 	result->size = stolen->size;
 	stolen->ptr = NULL;
 
-	/*
-	 * With -Xtheirs or -Xours, we have cleanly merged;
-	 * otherwise we got a conflict.
-	 */
-	return opts->variant == XDL_MERGE_FAVOR_OURS ||
-	       opts->variant == XDL_MERGE_FAVOR_THEIRS ?
-	       0 : 1;
+	return ret;
 }
 
-static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 			mmbuffer_t *result,
 			const char *path,
 			mmfile_t *orig, const char *orig_name,
@@ -105,7 +103,9 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 			const struct ll_merge_options *opts,
 			int marker_size)
 {
+	enum ll_merge_result ret;
 	xmparam_t xmp;
+	int status;
 	assert(opts);
 
 	if (orig->size > MAX_XDIFF_SIZE ||
@@ -133,10 +133,12 @@ static int ll_xdl_merge(const struct ll_merge_driver *drv_unused,
 	xmp.ancestor = orig_name;
 	xmp.file1 = name1;
 	xmp.file2 = name2;
-	return xdl_merge(orig, src1, src2, &xmp, result);
+	status = xdl_merge(orig, src1, src2, &xmp, result);
+	ret = (status > 0) ? LL_MERGE_CONFLICT : status;
+	return ret;
 }
 
-static int ll_union_merge(const struct ll_merge_driver *drv_unused,
+static enum ll_merge_result ll_union_merge(const struct ll_merge_driver *drv_unused,
 			  mmbuffer_t *result,
 			  const char *path,
 			  mmfile_t *orig, const char *orig_name,
@@ -178,7 +180,7 @@ static void create_temp(mmfile_t *src, char *path, size_t len)
 /*
  * User defined low-level merge driver support.
  */
-static int ll_ext_merge(const struct ll_merge_driver *fn,
+static enum ll_merge_result ll_ext_merge(const struct ll_merge_driver *fn,
 			mmbuffer_t *result,
 			const char *path,
 			mmfile_t *orig, const char *orig_name,
@@ -194,6 +196,7 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 	const char *args[] = { NULL, NULL };
 	int status, fd, i;
 	struct stat st;
+	enum ll_merge_result ret;
 	assert(opts);
 
 	sq_quote_buf(&path_sq, path);
@@ -236,7 +239,8 @@ static int ll_ext_merge(const struct ll_merge_driver *fn,
 		unlink_or_warn(temp[i]);
 	strbuf_release(&cmd);
 	strbuf_release(&path_sq);
-	return status;
+	ret = (status > 0) ? LL_MERGE_CONFLICT : status;
+	return ret;
 }
 
 /*
@@ -362,7 +366,7 @@ static void normalize_file(mmfile_t *mm, const char *path, struct index_state *i
 	}
 }
 
-int ll_merge(mmbuffer_t *result_buf,
+enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
 	     const char *path,
 	     mmfile_t *ancestor, const char *ancestor_label,
 	     mmfile_t *ours, const char *our_label,
diff --git a/ll-merge.h b/ll-merge.h
index aceb1b24132..e4a20e81a3a 100644
--- a/ll-merge.h
+++ b/ll-merge.h
@@ -82,13 +82,20 @@ struct ll_merge_options {
 	long xdl_opts;
 };
 
+enum ll_merge_result {
+	LL_MERGE_ERROR = -1,
+	LL_MERGE_OK = 0,
+	LL_MERGE_CONFLICT,
+	LL_MERGE_BINARY_CONFLICT,
+};
+
 /**
  * Perform a three-way single-file merge in core.  This is a thin wrapper
  * around `xdl_merge` that takes the path and any merge backend specified in
  * `.gitattributes` or `.git/info/attributes` into account.
  * Returns 0 for a clean merge.
  */
-int ll_merge(mmbuffer_t *result_buf,
+enum ll_merge_result ll_merge(mmbuffer_t *result_buf,
 	     const char *path,
 	     mmfile_t *ancestor, const char *ancestor_label,
 	     mmfile_t *ours, const char *our_label,
diff --git a/merge-blobs.c b/merge-blobs.c
index ee0a0e90c94..8138090f81c 100644
--- a/merge-blobs.c
+++ b/merge-blobs.c
@@ -36,7 +36,7 @@ static void *three_way_filemerge(struct index_state *istate,
 				 mmfile_t *their,
 				 unsigned long *size)
 {
-	int merge_status;
+	enum ll_merge_result merge_status;
 	mmbuffer_t res;
 
 	/*
@@ -50,6 +50,9 @@ static void *three_way_filemerge(struct index_state *istate,
 				istate, NULL);
 	if (merge_status < 0)
 		return NULL;
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, ".our", ".their");
 
 	*size = res.size;
 	return res.ptr;
diff --git a/merge-ort.c b/merge-ort.c
index 0342f104836..c24da2ba3cb 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1743,7 +1743,7 @@ static int merge_3way(struct merge_options *opt,
 	mmfile_t orig, src1, src2;
 	struct ll_merge_options ll_opts = {0};
 	char *base, *name1, *name2;
-	int merge_status;
+	enum ll_merge_result merge_status;
 
 	if (!opt->priv->attr_index.initialized)
 		initialize_attr_index(opt);
@@ -1787,6 +1787,9 @@ static int merge_3way(struct merge_options *opt,
 	merge_status = ll_merge(result_buf, path, &orig, base,
 				&src1, name1, &src2, name2,
 				&opt->priv->attr_index, &ll_opts);
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/merge-recursive.c b/merge-recursive.c
index d9457797dbb..bc73c52dd84 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1044,7 +1044,7 @@ static int merge_3way(struct merge_options *opt,
 	mmfile_t orig, src1, src2;
 	struct ll_merge_options ll_opts = {0};
 	char *base, *name1, *name2;
-	int merge_status;
+	enum ll_merge_result merge_status;
 
 	ll_opts.renormalize = opt->renormalize;
 	ll_opts.extra_marker_size = extra_marker_size;
@@ -1090,6 +1090,9 @@ static int merge_3way(struct merge_options *opt,
 	merge_status = ll_merge(result_buf, a->path, &orig, base,
 				&src1, name1, &src2, name2,
 				opt->repo->index, &ll_opts);
+	if (merge_status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			a->path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/notes-merge.c b/notes-merge.c
index b4a3a903e86..01d596920ea 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -344,7 +344,7 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
 {
 	mmbuffer_t result_buf;
 	mmfile_t base, local, remote;
-	int status;
+	enum ll_merge_result status;
 
 	read_mmblob(&base, &p->base);
 	read_mmblob(&local, &p->local);
@@ -358,6 +358,9 @@ static int ll_merge_in_worktree(struct notes_merge_options *o,
 	free(local.ptr);
 	free(remote.ptr);
 
+	if (status == LL_MERGE_BINARY_CONFLICT)
+		warning("Cannot merge binary files: %s (%s vs. %s)",
+			oid_to_hex(&p->obj), o->local_ref, o->remote_ref);
 	if ((status < 0) || !result_buf.ptr)
 		die("Failed to execute internal merge");
 
diff --git a/rerere.c b/rerere.c
index d83d58df4fb..d26627c5932 100644
--- a/rerere.c
+++ b/rerere.c
@@ -609,19 +609,20 @@ static int try_merge(struct index_state *istate,
 		     const struct rerere_id *id, const char *path,
 		     mmfile_t *cur, mmbuffer_t *result)
 {
-	int ret;
+	enum ll_merge_result ret;
 	mmfile_t base = {NULL, 0}, other = {NULL, 0};
 
 	if (read_mmfile(&base, rerere_path(id, "preimage")) ||
-	    read_mmfile(&other, rerere_path(id, "postimage")))
-		ret = 1;
-	else
+	    read_mmfile(&other, rerere_path(id, "postimage"))) {
+		ret = LL_MERGE_CONFLICT;
+	} else {
 		/*
 		 * A three-way merge. Note that this honors user-customizable
 		 * low-level merge driver settings.
 		 */
 		ret = ll_merge(result, path, &base, NULL, cur, "", &other, "",
 			       istate, NULL);
+	}
 
 	free(base.ptr);
 	free(other.ptr);
-- 
gitgitgadget



* [PATCH v5 04/10] merge-ort: capture and print ll-merge warnings in our preferred fashion
  2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                           ` (2 preceding siblings ...)
  2022-02-02  2:37         ` [PATCH v5 03/10] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
@ 2022-02-02  2:37         ` Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 05/10] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
                           ` (5 subsequent siblings)
  9 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-02-02  2:37 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Instead of immediately printing ll-merge warnings to stderr, we save
them in our output strbuf.  Besides allowing us to move these warnings
to a special file for --remerge-diff, this has two other benefits for
regular merges done by merge-ort:

  * The deferral of messages ensures we can print all messages about
    any given path together (merge-recursive was known to sometimes
    intersperse messages about other paths, particularly when renames
    were involved).

  * The deferral of messages means we can avoid printing spurious
    conflict messages when we just end up aborting due to local user
    modifications in the way.  (In contrast to merge-recursive.c which
    prematurely checks for local modifications in the way via
    unpack_trees() and gets the check wrong both in terms of false
    positives and false negatives relative to renames, merge-ort does
    not perform the local modifications in the way check until the
    checkout() step after the full merge has been computed.)

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c                | 5 +++--
 t/t6404-recursive-merge.sh | 9 +++++++--
 t/t6406-merge-attr.sh      | 9 +++++++--
 3 files changed, 17 insertions(+), 6 deletions(-)
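
As an illustration of the deferral described above (not the merge-ort
code itself), here is a minimal standalone sketch of buffering messages
per path so they can be emitted together once the merge is complete.
The names path_log, path_log_add, path_log_get and path_log_flush, and
the fixed-size limits, are assumptions made up for this example;
merge-ort keeps a strmap of strbufs instead.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PATHS 16
#define MAX_TEXT 512

/* Hypothetical stand-in for merge-ort's path -> strbuf map. */
struct path_log {
	struct {
		char path[128];
		char text[MAX_TEXT];
	} entry[MAX_PATHS];
	int nr;
};

/* Return the buffered text for path, or NULL if nothing was recorded. */
const char *path_log_get(const struct path_log *log, const char *path)
{
	for (int i = 0; i < log->nr; i++)
		if (!strcmp(log->entry[i].path, path))
			return log->entry[i].text;
	return NULL;
}

/* Append msg plus a newline to path's buffer; 0 on success, -1 if full. */
int path_log_add(struct path_log *log, const char *path, const char *msg)
{
	int i;
	size_t used;

	for (i = 0; i < log->nr; i++)
		if (!strcmp(log->entry[i].path, path))
			break;
	if (i == log->nr) {
		/* First message for this path: claim a new slot. */
		if (log->nr == MAX_PATHS ||
		    strlen(path) >= sizeof(log->entry[0].path))
			return -1;
		strcpy(log->entry[i].path, path);
		log->entry[i].text[0] = '\0';
		log->nr++;
	}
	used = strlen(log->entry[i].text);
	if (used + strlen(msg) + 2 > MAX_TEXT)
		return -1;
	snprintf(log->entry[i].text + used, MAX_TEXT - used, "%s\n", msg);
	return 0;
}

/* Emit everything at once, grouped by path, after the merge is computed. */
void path_log_flush(const struct path_log *log, FILE *out)
{
	for (int i = 0; i < log->nr; i++)
		fputs(log->entry[i].text, out);
}
```

Because nothing is printed until path_log_flush() is called, a caller
that ends up aborting can simply skip the flush, and messages about one
path are never interleaved with messages about another.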

diff --git a/merge-ort.c b/merge-ort.c
index c24da2ba3cb..a18f47e23c5 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1788,8 +1788,9 @@ static int merge_3way(struct merge_options *opt,
 				&src1, name1, &src2, name2,
 				&opt->priv->attr_index, &ll_opts);
 	if (merge_status == LL_MERGE_BINARY_CONFLICT)
-		warning("Cannot merge binary files: %s (%s vs. %s)",
-			path, name1, name2);
+		path_msg(opt, path, 0,
+			 "warning: Cannot merge binary files: %s (%s vs. %s)",
+			 path, name1, name2);
 
 	free(base);
 	free(name1);
diff --git a/t/t6404-recursive-merge.sh b/t/t6404-recursive-merge.sh
index eaf48e941e2..b8735c6db4d 100755
--- a/t/t6404-recursive-merge.sh
+++ b/t/t6404-recursive-merge.sh
@@ -108,8 +108,13 @@ test_expect_success 'refuse to merge binary files' '
 	printf "\0\0" >binary-file &&
 	git add binary-file &&
 	git commit -m binary2 &&
-	test_must_fail git merge F >merge.out 2>merge.err &&
-	grep "Cannot merge binary files: binary-file (HEAD vs. F)" merge.err
+	if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+	then
+		test_must_fail git merge F >merge_output
+	else
+		test_must_fail git merge F 2>merge_output
+	fi &&
+	grep "Cannot merge binary files: binary-file (HEAD vs. F)" merge_output
 '
 
 test_expect_success 'mark rename/delete as unmerged' '
diff --git a/t/t6406-merge-attr.sh b/t/t6406-merge-attr.sh
index 84946458371..c41584eb33e 100755
--- a/t/t6406-merge-attr.sh
+++ b/t/t6406-merge-attr.sh
@@ -221,8 +221,13 @@ test_expect_success 'binary files with union attribute' '
 	printf "two\0" >bin.txt &&
 	git commit -am two &&
 
-	test_must_fail git merge bin-main 2>stderr &&
-	grep -i "warning.*cannot merge.*HEAD vs. bin-main" stderr
+	if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+	then
+		test_must_fail git merge bin-main >output
+	else
+		test_must_fail git merge bin-main 2>output
+	fi &&
+	grep -i "warning.*cannot merge.*HEAD vs. bin-main" output
 '
 
 test_done
-- 
gitgitgadget



* [PATCH v5 05/10] merge-ort: mark a few more conflict messages as omittable
  2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                           ` (3 preceding siblings ...)
  2022-02-02  2:37         ` [PATCH v5 04/10] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
@ 2022-02-02  2:37         ` Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 06/10] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
                           ` (4 subsequent siblings)
  9 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-02-02  2:37 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

path_msg() can mark messages as omittable, a capability designed for
remerge-diff, where conflict messages are instead shown as diff headers
for a subsequent diff.  While all these messages are very useful when
initially creating a merge, early use of the --remerge-diff feature (the
only user of this omittable conflict message capability) suggests that
the particular messages marked in this commit are just noise when trying
to see what changes users made to create a merge commit.  Mark them as
omittable.

Note that there were already a few messages marked as omittable in
merge-ort when doing a remerge-diff, because the development of
--remerge-diff preceded the upstreaming of merge-ort and I was trying to
ensure merge-ort could handle all the necessary requirements.  See
commit c5a6f65527 ("merge-ort: add modify/delete handling and delayed
output processing", 2020-12-03) for the initial details.  For some
examples of already-marked-as-omittable messages, see either
"Auto-merging <path>" or some of the submodule update hints.  This
commit just adds two more messages that should also be omittable.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
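
To make the effect of the third argument concrete, here is a small
standalone sketch of the idea, not the path_msg() implementation.  The
struct msg_sink and sink_msg() names are hypothetical; the only
behaviour taken from this series is that a message flagged as an
omittable hint is dropped when messages are being recorded as headers.

```c
#include <stdio.h>

/* Hypothetical sink; record_as_headers mirrors the remerge-diff mode. */
struct msg_sink {
	int record_as_headers;
	int nr_kept;
	int nr_dropped;
};

/*
 * Keep or drop one message.  Hints are only dropped in header mode;
 * a regular merge still shows them.  Returns 1 if kept, 0 if dropped.
 */
int sink_msg(struct msg_sink *s, int omittable_hint, FILE *out,
	     const char *msg)
{
	if (s->record_as_headers && omittable_hint) {
		s->nr_dropped++;
		return 0;
	}
	fprintf(out, "%s\n", msg);
	s->nr_kept++;
	return 1;
}
```

With record_as_headers unset, both kinds of message are kept, which
matches the claim above that these messages stay useful when creating a
merge.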

diff --git a/merge-ort.c b/merge-ort.c
index a18f47e23c5..998e92ec593 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -2420,7 +2420,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
 		 */
 		ci->path_conflict = 1;
 		if (pair->status == 'A')
-			path_msg(opt, new_path, 0,
+			path_msg(opt, new_path, 1,
 				 _("CONFLICT (file location): %s added in %s "
 				   "inside a directory that was renamed in %s, "
 				   "suggesting it should perhaps be moved to "
@@ -2428,7 +2428,7 @@ static void apply_directory_rename_modifications(struct merge_options *opt,
 				 old_path, branch_with_new_path,
 				 branch_with_dir_rename, new_path);
 		else
-			path_msg(opt, new_path, 0,
+			path_msg(opt, new_path, 1,
 				 _("CONFLICT (file location): %s renamed to %s "
 				   "in %s, inside a directory that was renamed "
 				   "in %s, suggesting it should perhaps be "
-- 
gitgitgadget



* [PATCH v5 06/10] merge-ort: format messages slightly different for use in headers
  2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                           ` (4 preceding siblings ...)
  2022-02-02  2:37         ` [PATCH v5 05/10] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
@ 2022-02-02  2:37         ` Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 07/10] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
                           ` (3 subsequent siblings)
  9 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-02-02  2:37 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When users run
    git show --remerge-diff $MERGE_COMMIT
or
    git log -p --remerge-diff ...
stdout is not an appropriate location to dump conflict messages, but we
do want to provide them to users.  We will include them in the diff
headers instead.  For that to work, every newline within a multiline
message must be followed by a space.  Add a new flag to signal when we
want messages modified in this fashion, and use it in path_msg() to
rewrite them accordingly.  Also, allow a special prefix to be specified
for these headers.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c       | 42 ++++++++++++++++++++++++++++++++++++++++--
 merge-recursive.c |  4 ++++
 merge-recursive.h |  2 ++
 3 files changed, 46 insertions(+), 2 deletions(-)
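
The transformation described above can be sketched on its own, outside
of strbuf.  This is only an illustration under stated assumptions:
format_as_header() is a hypothetical helper, it allocates with plain
malloc(), and it mirrors the intended result (optional prefix plus a
space, a space after every newline, one trailing newline) rather than
the exact code in the patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated "<prefix> <msg>\n" string in which every
 * newline inside msg is followed by a space.  prefix may be NULL.
 * The caller frees the result; NULL is returned on allocation failure.
 */
char *format_as_header(const char *prefix, const char *msg)
{
	size_t plen = prefix ? strlen(prefix) + 1 : 0; /* +1 for the space */
	size_t mlen = strlen(msg);
	/* Worst case every byte of msg is a newline and doubles in size. */
	char *out = malloc(plen + 2 * mlen + 2);
	size_t o = 0;

	if (!out)
		return NULL;
	if (prefix) {
		memcpy(out, prefix, plen - 1);
		o = plen - 1;
		out[o++] = ' ';
	}
	for (size_t i = 0; i < mlen; i++) {
		out[o++] = msg[i];
		if (msg[i] == '\n')
			out[o++] = ' ';
	}
	out[o++] = '\n';	/* final newline terminating the header */
	out[o] = '\0';
	return out;
}
```

For example, with the prefix "remerge", a two-line message comes back
as a first line starting with "remerge " and a second line starting
with a single space, so a diff parser can tell the continuation line
belongs to the header.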

diff --git a/merge-ort.c b/merge-ort.c
index 998e92ec593..481305d2bcf 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -634,17 +634,49 @@ static void path_msg(struct merge_options *opt,
 		     const char *fmt, ...)
 {
 	va_list ap;
-	struct strbuf *sb = strmap_get(&opt->priv->output, path);
+	struct strbuf *sb, *dest;
+	struct strbuf tmp = STRBUF_INIT;
+
+	if (opt->record_conflict_msgs_as_headers && omittable_hint)
+		return; /* Do not record mere hints in tree */
+	sb = strmap_get(&opt->priv->output, path);
 	if (!sb) {
 		sb = xmalloc(sizeof(*sb));
 		strbuf_init(sb, 0);
 		strmap_put(&opt->priv->output, path, sb);
 	}
 
+	dest = (opt->record_conflict_msgs_as_headers ? &tmp : sb);
+
 	va_start(ap, fmt);
-	strbuf_vaddf(sb, fmt, ap);
+	strbuf_vaddf(dest, fmt, ap);
 	va_end(ap);
 
+	if (opt->record_conflict_msgs_as_headers) {
+		int i_sb = 0, i_tmp = 0;
+
+		/* Start with the specified prefix */
+		if (opt->msg_header_prefix)
+			strbuf_addf(sb, "%s ", opt->msg_header_prefix);
+
+		/* Copy tmp to sb, adding spaces after newlines */
+		strbuf_grow(sb, sb->len + 2*tmp.len); /* more than sufficient */
+		for (; i_tmp < tmp.len; i_tmp++, i_sb++) {
+			/* Copy next character from tmp to sb */
+			sb->buf[sb->len + i_sb] = tmp.buf[i_tmp];
+
+			/* If we copied a newline, add a space */
+			if (tmp.buf[i_tmp] == '\n')
+				sb->buf[sb->len + ++i_sb] = ' ';
+		}
+		/* Update length and ensure it's NUL-terminated */
+		sb->len += i_sb;
+		sb->buf[sb->len] = '\0';
+
+		strbuf_release(&tmp);
+	}
+
+	/* Add final newline character to sb */
 	strbuf_addch(sb, '\n');
 }
 
@@ -4246,6 +4278,9 @@ void merge_switch_to_result(struct merge_options *opt,
 		struct string_list olist = STRING_LIST_INIT_NODUP;
 		int i;
 
+		if (opt->record_conflict_msgs_as_headers)
+			BUG("Either display conflict messages or record them as headers, not both");
+
 		trace2_region_enter("merge", "display messages", opt->repo);
 
 		/* Hack to pre-allocate olist to the desired size */
@@ -4347,6 +4382,9 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
 	assert(opt->recursive_variant >= MERGE_VARIANT_NORMAL &&
 	       opt->recursive_variant <= MERGE_VARIANT_THEIRS);
 
+	if (opt->msg_header_prefix)
+		assert(opt->record_conflict_msgs_as_headers);
+
 	/*
 	 * detect_renames, verbosity, buffer_output, and obuf are ignored
 	 * fields that were used by "recursive" rather than "ort" -- but
diff --git a/merge-recursive.c b/merge-recursive.c
index bc73c52dd84..9ec1e6d043a 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3714,6 +3714,10 @@ static int merge_start(struct merge_options *opt, struct tree *head)
 
 	assert(opt->priv == NULL);
 
+	/* Not supported; option specific to merge-ort */
+	assert(!opt->record_conflict_msgs_as_headers);
+	assert(!opt->msg_header_prefix);
+
 	/* Sanity check on repo state; index must match head */
 	if (repo_index_has_changes(opt->repo, head, &sb)) {
 		err(opt, _("Your local changes to the following files would be overwritten by merge:\n  %s"),
diff --git a/merge-recursive.h b/merge-recursive.h
index 0795a1d3ec1..b88000e3c25 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -46,6 +46,8 @@ struct merge_options {
 	/* miscellaneous control options */
 	const char *subtree_shift;
 	unsigned renormalize : 1;
+	unsigned record_conflict_msgs_as_headers : 1;
+	const char *msg_header_prefix;
 
 	/* internal fields used by the implementation */
 	struct merge_options_internal *priv;
-- 
gitgitgadget



* [PATCH v5 07/10] diff: add ability to insert additional headers for paths
  2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                           ` (5 preceding siblings ...)
  2022-02-02  2:37         ` [PATCH v5 06/10] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
@ 2022-02-02  2:37         ` Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 08/10] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
                           ` (2 subsequent siblings)
  9 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-02-02  2:37 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

When additional headers are provided, we need to
  * add diff_filepairs to diff_queued_diff for each path in the
    additional headers map, unless that path is part of another
    diff_filepair already found in diff_queued_diff
  * format the headers (colorization, line_prefix for --graph)
  * make sure the various codepaths that attempt to return early
    if there are "no changes" take into account the headers that
    need to be shown.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 diff.c     | 124 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 diff.h     |   3 +-
 log-tree.c |   2 +-
 3 files changed, 123 insertions(+), 6 deletions(-)
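
The header formatting step in the list above can be sketched as a
standalone function.  This is an illustration only: format_headers() is
a hypothetical name, it writes into a caller-supplied buffer instead of
a strbuf, and the meta/reset arguments stand in for whatever color
escape sequences the caller uses.

```c
#include <stdio.h>
#include <string.h>

/*
 * Split headers on newlines and wrap each line as
 * "<line_prefix><meta><line><reset>\n".  Returns the number of bytes
 * written (excluding the NUL), or -1 if out is too small.
 */
int format_headers(char *out, size_t outsz, const char *headers,
		   const char *line_prefix, const char *meta,
		   const char *reset)
{
	size_t used = 0;
	const char *next = headers;

	if (outsz)
		out[0] = '\0';
	while (*next) {
		const char *nl = strchr(next, '\n');
		size_t len = nl ? (size_t)(nl - next) : strlen(next);
		int n = snprintf(out + used, outsz - used, "%s%s%.*s%s\n",
				 line_prefix, meta, (int)len, next, reset);

		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
		next += len + (nl ? 1 : 0);	/* skip past the newline */
	}
	return (int)used;
}
```

Wrapping each line separately, rather than the whole block at once, is
what keeps the --graph line prefix and the color reset correct on
continuation lines.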

diff --git a/diff.c b/diff.c
index 861282db1c3..1bfb01c18ec 100644
--- a/diff.c
+++ b/diff.c
@@ -27,6 +27,7 @@
 #include "help.h"
 #include "promisor-remote.h"
 #include "dir.h"
+#include "strmap.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
@@ -3406,6 +3407,31 @@ struct userdiff_driver *get_textconv(struct repository *r,
 	return userdiff_get_textconv(r, one->driver);
 }
 
+static struct strbuf *additional_headers(struct diff_options *o,
+					 const char *path)
+{
+	if (!o->additional_path_headers)
+		return NULL;
+	return strmap_get(o->additional_path_headers, path);
+}
+
+static void add_formatted_headers(struct strbuf *msg,
+				  struct strbuf *more_headers,
+				  const char *line_prefix,
+				  const char *meta,
+				  const char *reset)
+{
+	char *next, *newline;
+
+	for (next = more_headers->buf; *next; next = newline) {
+		newline = strchrnul(next, '\n');
+		strbuf_addf(msg, "%s%s%.*s%s\n", line_prefix, meta,
+			    (int)(newline - next), next, reset);
+		if (*newline)
+			newline++;
+	}
+}
+
 static void builtin_diff(const char *name_a,
 			 const char *name_b,
 			 struct diff_filespec *one,
@@ -3464,6 +3490,17 @@ static void builtin_diff(const char *name_a,
 	b_two = quote_two(b_prefix, name_b + (*name_b == '/'));
 	lbl[0] = DIFF_FILE_VALID(one) ? a_one : "/dev/null";
 	lbl[1] = DIFF_FILE_VALID(two) ? b_two : "/dev/null";
+	if (!DIFF_FILE_VALID(one) && !DIFF_FILE_VALID(two)) {
+		/*
+		 * We should only reach this point for pairs from
+		 * create_filepairs_for_header_only_notifications().  For
+		 * these, we should avoid the "/dev/null" special casing
+		 * above, meaning we avoid showing such pairs as either
+		 * "new file" or "deleted file" below.
+		 */
+		lbl[0] = a_one;
+		lbl[1] = b_two;
+	}
 	strbuf_addf(&header, "%s%sdiff --git %s %s%s\n", line_prefix, meta, a_one, b_two, reset);
 	if (lbl[0][0] == '/') {
 		/* /dev/null */
@@ -4328,6 +4365,7 @@ static void fill_metainfo(struct strbuf *msg,
 	const char *set = diff_get_color(use_color, DIFF_METAINFO);
 	const char *reset = diff_get_color(use_color, DIFF_RESET);
 	const char *line_prefix = diff_line_prefix(o);
+	struct strbuf *more_headers = NULL;
 
 	*must_show_header = 1;
 	strbuf_init(msg, PATH_MAX * 2 + 300);
@@ -4364,6 +4402,11 @@ static void fill_metainfo(struct strbuf *msg,
 	default:
 		*must_show_header = 0;
 	}
+	if ((more_headers = additional_headers(o, name))) {
+		add_formatted_headers(msg, more_headers,
+				      line_prefix, set, reset);
+		*must_show_header = 1;
+	}
 	if (one && two && !oideq(&one->oid, &two->oid)) {
 		const unsigned hexsz = the_hash_algo->hexsz;
 		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
@@ -5852,12 +5895,27 @@ int diff_unmodified_pair(struct diff_filepair *p)
 
 static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o)
 {
-	if (diff_unmodified_pair(p))
+	int include_conflict_headers =
+	    (additional_headers(o, p->one->path) &&
+	     (!o->filter || filter_bit_tst(DIFF_STATUS_UNMERGED, o)));
+
+	/*
+	 * Check if we can return early without showing a diff.  Note that
+	 * diff_filepair only stores {oid, path, mode, is_valid}
+	 * information for each path, and thus diff_unmodified_pair() only
+	 * considers those bits of info.  However, we do not want pairs
+	 * created by create_filepairs_for_header_only_notifications()
+	 * (which always look like unmodified pairs) to be ignored, so
+	 * return early if both p is unmodified AND we don't want to
+	 * include_conflict_headers.
+	 */
+	if (diff_unmodified_pair(p) && !include_conflict_headers)
 		return;
 
+	/* Actually, we can also return early to avoid showing tree diffs */
 	if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) ||
 	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
-		return; /* no tree diffs in patch format */
+		return;
 
 	run_diff(p, o);
 }
@@ -5888,10 +5946,17 @@ static void diff_flush_checkdiff(struct diff_filepair *p,
 	run_checkdiff(p, o);
 }
 
-int diff_queue_is_empty(void)
+int diff_queue_is_empty(struct diff_options *o)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	int i;
+	int include_conflict_headers =
+	    (o->additional_path_headers &&
+	     (!o->filter || filter_bit_tst(DIFF_STATUS_UNMERGED, o)));
+
+	if (include_conflict_headers)
+		return 0;
+
 	for (i = 0; i < q->nr; i++)
 		if (!diff_unmodified_pair(q->queue[i]))
 			return 0;
@@ -6325,6 +6390,54 @@ void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc)
 		warning(_(rename_limit_advice), varname, needed);
 }
 
+static void create_filepairs_for_header_only_notifications(struct diff_options *o)
+{
+	struct strset present;
+	struct diff_queue_struct *q = &diff_queued_diff;
+	struct hashmap_iter iter;
+	struct strmap_entry *e;
+	int i;
+
+	strset_init_with_options(&present, /*pool*/ NULL, /*strdup*/ 0);
+
+	/*
+	 * Find out which paths exist in diff_queued_diff, preferring
+	 * one->path for any pair that has multiple paths.
+	 */
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p = q->queue[i];
+		char *path = p->one->path ? p->one->path : p->two->path;
+
+		if (strmap_contains(o->additional_path_headers, path))
+			strset_add(&present, path);
+	}
+
+	/*
+	 * Loop over paths in additional_path_headers; for each NOT already
+	 * in diff_queued_diff, create a synthetic filepair and insert that
+	 * into diff_queued_diff.
+	 */
+	strmap_for_each_entry(o->additional_path_headers, &iter, e) {
+		if (!strset_contains(&present, e->key)) {
+			struct diff_filespec *one, *two;
+			struct diff_filepair *p;
+
+			one = alloc_filespec(e->key);
+			two = alloc_filespec(e->key);
+			fill_filespec(one, null_oid(), 0, 0);
+			fill_filespec(two, null_oid(), 0, 0);
+			p = diff_queue(q, one, two);
+			p->status = DIFF_STATUS_MODIFIED;
+		}
+	}
+
+	/* Re-sort the filepairs */
+	diffcore_fix_diff_index();
+
+	/* Cleanup */
+	strset_clear(&present);
+}
+
 static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 {
 	int i;
@@ -6337,6 +6450,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o)
 	if (o->color_moved)
 		o->emitted_symbols = &esm;
 
+	if (o->additional_path_headers)
+		create_filepairs_for_header_only_notifications(o);
+
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 		if (check_pair_status(p))
@@ -6413,7 +6529,7 @@ void diff_flush(struct diff_options *options)
 	 * Order: raw, stat, summary, patch
 	 * or:    name/name-status/checkdiff (other bits clear)
 	 */
-	if (!q->nr)
+	if (!q->nr && !options->additional_path_headers)
 		goto free_queue;
 
 	if (output_format & (DIFF_FORMAT_RAW |
diff --git a/diff.h b/diff.h
index 8ba85c5e605..ce9e2cf2e4f 100644
--- a/diff.h
+++ b/diff.h
@@ -395,6 +395,7 @@ struct diff_options {
 
 	struct repository *repo;
 	struct option *parseopts;
+	struct strmap *additional_path_headers;
 
 	int no_free;
 };
@@ -593,7 +594,7 @@ void diffcore_fix_diff_index(void);
 "                show all files diff when -S is used and hit is found.\n" \
 "  -a  --text    treat all files as text.\n"
 
-int diff_queue_is_empty(void);
+int diff_queue_is_empty(struct diff_options *o);
 void diff_flush(struct diff_options*);
 void diff_free(struct diff_options*);
 void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc);
diff --git a/log-tree.c b/log-tree.c
index 89da7de5dbf..8013edcc5d4 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -850,7 +850,7 @@ int log_tree_diff_flush(struct rev_info *opt)
 	opt->shown_dashes = 0;
 	diffcore_std(&opt->diffopt);
 
-	if (diff_queue_is_empty()) {
+	if (diff_queue_is_empty(&opt->diffopt)) {
 		int saved_fmt = opt->diffopt.output_format;
 		opt->diffopt.output_format = DIFF_FORMAT_NO_OUTPUT;
 		diff_flush(&opt->diffopt);
-- 
gitgitgadget



* [PATCH v5 08/10] show, log: include conflict/warning messages in --remerge-diff headers
  2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                           ` (6 preceding siblings ...)
  2022-02-02  2:37         ` [PATCH v5 07/10] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
@ 2022-02-02  2:37         ` Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 09/10] merge-ort: mark conflict/warning messages from inner merges as omittable Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 10/10] diff-merges: avoid history simplifications when diffing merges Elijah Newren via GitGitGadget
  9 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-02-02  2:37 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Conflicts such as modify/delete, rename/rename, or file/directory are
not representable via content conflict markers, and the normal output
messages notifying users about these were dropped with --remerge-diff.
While we don't want these messages randomly shown before the commit
and diff headers, we do want them to still be shown; include them as
part of the diff headers instead.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 log-tree.c              |  51 ++++++++++++++
 merge-ort.c             |   1 +
 merge-ort.h             |  10 +++
 t/t4069-remerge-diff.sh | 144 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 206 insertions(+)
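
Since the headers are keyed by path, showing them has to respect any
pathspec the user gave.  The sketch below illustrates that selection
step in isolation.  It is a deliberate simplification: spec_matches()
and select_header_paths() are hypothetical names, and the matching rule
(exact path or leading directory) is an assumption that ignores
wildcards and pathspec magic, which the real match_pathspec() handles.

```c
#include <stddef.h>
#include <string.h>

/* Simplified pathspec: spec names the path itself or a parent directory. */
int spec_matches(const char *spec, const char *path)
{
	size_t n = strlen(spec);

	if (!n || strncmp(spec, path, n))
		return 0;
	return path[n] == '\0' || path[n] == '/' || spec[n - 1] == '/';
}

/*
 * Copy into out the paths selected by specs, preserving their order.
 * With no specs at all, every path is kept.  Returns the number kept;
 * out must have room for nr_paths entries.
 */
size_t select_header_paths(const char **paths, size_t nr_paths,
			   const char **specs, size_t nr_specs,
			   const char **out)
{
	size_t kept = 0;

	for (size_t i = 0; i < nr_paths; i++) {
		int match = !nr_specs;	/* no pathspec: keep everything */

		for (size_t j = 0; j < nr_specs && !match; j++)
			match = spec_matches(specs[j], paths[i]);
		if (match)
			out[kept++] = paths[i];
	}
	return kept;
}
```

A return value of 0 corresponds to the case where no header matches the
pathspec, in which there is nothing extra to show for that commit.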

diff --git a/log-tree.c b/log-tree.c
index 8013edcc5d4..d93bafa5be3 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -19,6 +19,7 @@
 #include "line-log.h"
 #include "help.h"
 #include "range-diff.h"
+#include "strmap.h"
 
 static struct decoration name_decoration = { "object names" };
 static int decoration_loaded;
@@ -905,6 +906,52 @@ static int do_diff_combined(struct rev_info *opt, struct commit *commit)
 	return !opt->loginfo;
 }
 
+static void setup_additional_headers(struct diff_options *o,
+				     struct strmap *all_headers)
+{
+	struct hashmap_iter iter;
+	struct strmap_entry *entry;
+
+	/*
+	 * Make o->additional_path_headers contain the subset of all_headers
+	 * that match o->pathspec.  If there aren't any that match o->pathspec,
+	 * then make o->additional_path_headers be NULL.
+	 */
+
+	if (!o->pathspec.nr) {
+		o->additional_path_headers = all_headers;
+		return;
+	}
+
+	o->additional_path_headers = xmalloc(sizeof(struct strmap));
+	strmap_init_with_options(o->additional_path_headers, NULL, 0);
+	strmap_for_each_entry(all_headers, &iter, entry) {
+		if (match_pathspec(the_repository->index, &o->pathspec,
+				   entry->key, strlen(entry->key),
+				   0 /* prefix */, NULL /* seen */,
+				   0 /* is_dir */))
+			strmap_put(o->additional_path_headers,
+				   entry->key, entry->value);
+	}
+	if (!strmap_get_size(o->additional_path_headers)) {
+		strmap_clear(o->additional_path_headers, 0);
+		FREE_AND_NULL(o->additional_path_headers);
+	}
+}
+
+static void cleanup_additional_headers(struct diff_options *o)
+{
+	if (!o->pathspec.nr) {
+		o->additional_path_headers = NULL;
+		return;
+	}
+	if (!o->additional_path_headers)
+		return;
+
+	strmap_clear(o->additional_path_headers, 0);
+	FREE_AND_NULL(o->additional_path_headers);
+}
+
 static int do_remerge_diff(struct rev_info *opt,
 			   struct commit_list *parents,
 			   struct object_id *oid,
@@ -922,6 +969,8 @@ static int do_remerge_diff(struct rev_info *opt,
 	/* Setup merge options */
 	init_merge_options(&o, the_repository);
 	o.show_rename_progress = 0;
+	o.record_conflict_msgs_as_headers = 1;
+	o.msg_header_prefix = "remerge";
 
 	ctx.abbrev = DEFAULT_ABBREV;
 	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
@@ -938,10 +987,12 @@ static int do_remerge_diff(struct rev_info *opt,
 	merge_incore_recursive(&o, bases, parent1, parent2, &res);
 
 	/* Show the diff */
+	setup_additional_headers(&opt->diffopt, res.path_messages);
 	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
 	log_tree_diff_flush(opt);
 
 	/* Cleanup */
+	cleanup_additional_headers(&opt->diffopt);
 	strbuf_release(&parent1_desc);
 	strbuf_release(&parent2_desc);
 	merge_finalize(&o, &res);
diff --git a/merge-ort.c b/merge-ort.c
index 481305d2bcf..43f980d2586 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4585,6 +4585,7 @@ redo:
 	trace2_region_leave("merge", "process_entries", opt->repo);
 
 	/* Set return values */
+	result->path_messages = &opt->priv->output;
 	result->tree = parse_tree_indirect(&working_tree_oid);
 	/* existence of conflicted entries implies unclean */
 	result->clean &= strmap_empty(&opt->priv->conflicted);
diff --git a/merge-ort.h b/merge-ort.h
index c011864ffeb..fe599b87868 100644
--- a/merge-ort.h
+++ b/merge-ort.h
@@ -5,6 +5,7 @@
 
 struct commit;
 struct tree;
+struct strmap;
 
 struct merge_result {
 	/*
@@ -23,6 +24,15 @@ struct merge_result {
 	 */
 	struct tree *tree;
 
+	/*
+	 * Special messages and conflict notices for various paths
+	 *
+	 * This is a map of pathnames to strbufs.  It contains various
+	 * warning/conflict/notice messages (possibly multiple per path)
+	 * that callers may want to use.
+	 */
+	struct strmap *path_messages;
+
 	/*
 	 * Additional metadata used by merge_switch_to_result() or future calls
 	 * to merge_incore_*().  Includes data needed to update the index (if
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index d7ab0f50066..fd6bce64781 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -60,6 +60,7 @@ test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
 	git log -1 --oneline ab_resolution >tmp &&
 	cat <<-EOF >>tmp &&
 	diff --git a/numbers b/numbers
+	remerge CONFLICT (content): Merge conflict in numbers
 	index a1fb731..6875544 100644
 	--- a/numbers
 	+++ b/numbers
@@ -88,4 +89,147 @@ test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
 	test_cmp expect actual
 '
 
+test_expect_success 'setup non-content conflicts' '
+	git switch --orphan base &&
+
+	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
+	test_write_lines a b c d e f g h i >letters &&
+	test_write_lines in the way >content &&
+	git add numbers letters content &&
+	git commit -m base &&
+
+	git branch side1 &&
+	git branch side2 &&
+
+	git checkout side1 &&
+	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
+	git mv letters letters_side1 &&
+	git mv content file_or_directory &&
+	git add numbers &&
+	git commit -m side1 &&
+
+	git checkout side2 &&
+	git rm numbers &&
+	git mv letters letters_side2 &&
+	mkdir file_or_directory &&
+	echo hello >file_or_directory/world &&
+	git add file_or_directory/world &&
+	git commit -m side2 &&
+
+	git checkout -b resolution side1 &&
+	test_must_fail git merge side2 &&
+	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git add letters_side1 &&
+	git rm letters &&
+	git rm letters_side2 &&
+	git add file_or_directory~HEAD &&
+	git mv file_or_directory~HEAD wanted_content &&
+	git commit -m resolved
+'
+
+test_expect_success 'remerge-diff with non-content conflicts' '
+	git log -1 --oneline resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/file_or_directory~HASH (side1) b/wanted_content
+	similarity index 100%
+	rename from file_or_directory~HASH (side1)
+	rename to wanted_content
+	remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
+	diff --git a/letters b/letters
+	remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
+	diff --git a/letters_side2 b/letters_side2
+	deleted file mode 100644
+	index b236ae5..0000000
+	--- a/letters_side2
+	+++ /dev/null
+	@@ -1,9 +0,0 @@
+	-a
+	-b
+	-c
+	-d
+	-e
+	-f
+	-g
+	-h
+	-i
+	diff --git a/numbers b/numbers
+	remerge CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
+	EOF
+	# We still have some sha1 hashes above; rip them out so test works
+	# with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'remerge-diff w/ diff-filter=U: all conflict headers, no diff content' '
+	git log -1 --oneline resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/file_or_directory~HASH (side1) b/file_or_directory~HASH (side1)
+	remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
+	diff --git a/letters b/letters
+	remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
+	diff --git a/numbers b/numbers
+	remerge CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
+	EOF
+	# We still have some sha1 hashes above; rip them out so test works
+	# with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff --diff-filter=U resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'remerge-diff w/ diff-filter=R: relevant file + conflict header' '
+	git log -1 --oneline resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/file_or_directory~HASH (side1) b/wanted_content
+	similarity index 100%
+	rename from file_or_directory~HASH (side1)
+	rename to wanted_content
+	remerge CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
+	EOF
+	# We still have some sha1 hashes above; rip them out so test works
+	# with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff --diff-filter=R resolution >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'remerge-diff w/ pathspec: limits to relevant file including conflict header' '
+	git log -1 --oneline resolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/letters b/letters
+	remerge CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
+	diff --git a/letters_side2 b/letters_side2
+	deleted file mode 100644
+	index b236ae5..0000000
+	--- a/letters_side2
+	+++ /dev/null
+	@@ -1,9 +0,0 @@
+	-a
+	-b
+	-c
+	-d
+	-e
+	-f
+	-g
+	-h
+	-i
+	EOF
+	# We still have some sha1 hashes above; rip them out so test works
+	# with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff --full-history resolution -- "letters*" >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
gitgitgadget
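
As a reading aid for the pathspec handling in setup_additional_headers()
above, here is a minimal standalone sketch of the same idea: keep only the
per-path messages whose path matches what the user asked for.  It is not
git code.  struct path_msg and filter_path_messages() are hypothetical
names, and POSIX fnmatch() is assumed as a simplified stand-in for
match_pathspec(), whose real semantics are richer.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical stand-in for one strmap entry: a path and its messages. */
struct path_msg {
	const char *path;
	const char *msg;
};

/*
 * Copy into 'out' the entries of 'all' whose path matches the glob
 * 'pattern'.  A NULL pattern plays the role of "no pathspec given" and
 * selects everything.  Returns the number of entries copied; 'out' must
 * have room for 'nr' entries.
 */
size_t filter_path_messages(const struct path_msg *all, size_t nr,
			    const char *pattern, struct path_msg *out)
{
	size_t i, kept = 0;

	assert(nr == 0 || (all != NULL && out != NULL));
	for (i = 0; i < nr; i++) {
		/* fnmatch() returns 0 when the path matches the pattern */
		if (!pattern || !fnmatch(pattern, all[i].path, 0))
			out[kept++] = all[i];
	}
	return kept;
}
```

A caller that gets zero entries back would show no extra headers at all,
which corresponds to the patch setting o->additional_path_headers to NULL
when nothing matches the pathspec.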



* [PATCH v5 09/10] merge-ort: mark conflict/warning messages from inner merges as omittable
  2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                           ` (7 preceding siblings ...)
  2022-02-02  2:37         ` [PATCH v5 08/10] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
@ 2022-02-02  2:37         ` Elijah Newren via GitGitGadget
  2022-02-02  2:37         ` [PATCH v5 10/10] diff-merges: avoid history simplifications when diffing merges Elijah Newren via GitGitGadget
  9 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-02-02  2:37 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

A recursive merge involves merging the merge bases of the two branches
being merged.  Such an inner merge can itself generate conflict notices.
While such notices may be useful when initially trying to create a
merge, they seem to just be noise when investigating merges later with
--remerge-diff.  (Especially when both sides of the outer merge resolved
the conflict the same way leading to no overall conflict.)  Remove them.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 43f980d2586..9bf15a01db8 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -638,7 +638,9 @@ static void path_msg(struct merge_options *opt,
 	struct strbuf tmp = STRBUF_INIT;
 
 	if (opt->record_conflict_msgs_as_headers && omittable_hint)
-		return; /* Do not record mere hints in tree */
+		return; /* Do not record mere hints in headers */
+	if (opt->record_conflict_msgs_as_headers && opt->priv->call_depth)
+		return; /* Do not record inner merge issues in headers */
 	sb = strmap_get(&opt->priv->output, path);
 	if (!sb) {
 		sb = xmalloc(sizeof(*sb));
-- 
gitgitgadget



* [PATCH v5 10/10] diff-merges: avoid history simplifications when diffing merges
  2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
                           ` (8 preceding siblings ...)
  2022-02-02  2:37         ` [PATCH v5 09/10] merge-ort: mark conflict/warning messages from inner merges as omittable Elijah Newren via GitGitGadget
@ 2022-02-02  2:37         ` Elijah Newren via GitGitGadget
  9 siblings, 0 replies; 113+ messages in thread
From: Elijah Newren via GitGitGadget @ 2022-02-02  2:37 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Jonathan Nieder, Sergey Organov, Bagas Sanjaya,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Neeraj Singh, Johannes Altmanninger, Elijah Newren, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Diffs for merges are special; they should typically avoid history
simplification.  For example, with

    git log --diff-merges=first-parent -- path

the default history simplification would remove merge commits from
consideration if the file "path" matched the second parent.  That is
counter to what the user wants when looking for first-parent diffs.
Similar comments can be made for --diff-merges=separate (which diffs
against both parents) and --diff-merges=remerge (which diffs against a
remerge of the merge commit).

However, history simplification still makes sense when not diffing
merges, and it also makes sense for the combined and dense-combined
forms of diffing merges (because both of those are defined to only show
a diff when the merge result at the relevant paths differs from *both*
parents).

So, for separate, first-parent, and remerge styles of diff-merges, turn
off history simplification.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 diff-merges.c           |  2 ++
 t/t4069-remerge-diff.sh | 58 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/diff-merges.c b/diff-merges.c
index 0af4b3f9191..a833fd747ad 100644
--- a/diff-merges.c
+++ b/diff-merges.c
@@ -24,6 +24,7 @@ static void set_separate(struct rev_info *revs)
 {
 	suppress(revs);
 	revs->separate_merges = 1;
+	revs->simplify_history = 0;
 }
 
 static void set_first_parent(struct rev_info *revs)
@@ -50,6 +51,7 @@ static void set_remerge_diff(struct rev_info *revs)
 {
 	suppress(revs);
 	revs->remerge_diff = 1;
+	revs->simplify_history = 0;
 }
 
 static diff_merges_setup_func_t func_by_opt(const char *optarg)
diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh
index fd6bce64781..35f94957fce 100755
--- a/t/t4069-remerge-diff.sh
+++ b/t/t4069-remerge-diff.sh
@@ -227,7 +227,63 @@ test_expect_success 'remerge-diff w/ pathspec: limits to relevant file including
 	# with sha256
 	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
 
-	git show --oneline --remerge-diff --full-history resolution -- "letters*" >tmp &&
+	git show --oneline --remerge-diff resolution -- "letters*" >tmp &&
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'setup non-content conflicts' '
+	git switch --orphan newbase &&
+
+	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git commit -m base &&
+
+	git branch newside1 &&
+	git branch newside2 &&
+
+	git checkout newside1 &&
+	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git commit -m side1 &&
+
+	git checkout newside2 &&
+	test_write_lines 1 2 drei 4 5 6 7 8 9 >numbers &&
+	git add numbers &&
+	git commit -m side2 &&
+
+	git checkout -b newresolution newside1 &&
+	test_must_fail git merge newside2 &&
+	git checkout --theirs numbers &&
+	git add -u numbers &&
+	git commit -m resolved
+'
+
+test_expect_success 'remerge-diff turns off history simplification' '
+	git log -1 --oneline newresolution >tmp &&
+	cat <<-EOF >>tmp &&
+	diff --git a/numbers b/numbers
+	remerge CONFLICT (content): Merge conflict in numbers
+	index 070e9e7..5335e78 100644
+	--- a/numbers
+	+++ b/numbers
+	@@ -1,10 +1,6 @@
+	 1
+	 2
+	-<<<<<<< 96f1e45 (side1)
+	-three
+	-=======
+	 drei
+	->>>>>>> 4fd522f (side2)
+	 4
+	 5
+	 6
+	EOF
+	# We still have some sha1 hashes above; rip them out so test works
+	# with sha256
+	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
+
+	git show --oneline --remerge-diff newresolution -- numbers >tmp &&
 	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
 	test_cmp expect actual
 '
-- 
gitgitgadget
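
The rule from the commit message can be written down as a small table.
The sketch below is not the diff-merges.c implementation, which sets
revs->simplify_history in the per-style setup functions.  The enum and
keeps_history_simplification() are hypothetical names used only to
restate which styles keep history simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enumeration of the --diff-merges styles discussed above. */
enum diff_merges_style {
	DM_OFF,
	DM_SEPARATE,
	DM_FIRST_PARENT,
	DM_REMERGE,
	DM_COMBINED,
	DM_DENSE_COMBINED
};

/*
 * Returns true if path-limited history simplification should stay
 * enabled for the given style, following the commit message's rule.
 */
bool keeps_history_simplification(enum diff_merges_style style)
{
	switch (style) {
	case DM_SEPARATE:
	case DM_FIRST_PARENT:
	case DM_REMERGE:
		/* these diff against something other than "both parents" */
		return false;
	case DM_OFF:
	case DM_COMBINED:
	case DM_DENSE_COMBINED:
		/* only show a diff when the result differs from *both* parents */
		return true;
	}
	assert(!"unknown diff-merges style");
	return true;
}
```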


* Re: [PATCH v4 02/10] log: clean unneeded objects during `log --remerge-diff`
  2022-02-01 16:54           ` Elijah Newren
@ 2022-02-02 11:17             ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-02 11:17 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jeff King,
	Jonathan Nieder, Sergey Organov, Bagas Sanjaya, Neeraj Singh,
	Johannes Altmanninger


On Tue, Feb 01 2022, Elijah Newren wrote:

> On Tue, Feb 1, 2022 at 1:45 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>
>> On Fri, Jan 21 2022, Elijah Newren via GitGitGadget wrote:
>>
>> > From: Elijah Newren <newren@gmail.com>
>> > [...]
>> > @@ -944,7 +945,12 @@ static int do_remerge_diff(struct rev_info *opt,
>> >       strbuf_release(&parent1_desc);
>> >       strbuf_release(&parent2_desc);
>> >       merge_finalize(&o, &res);
>> > -     /* TODO: clean up the temporary object directory */
>> > +
>> > +     /* Clean up the contents of the temporary object directory */
>> > +     if (opt->remerge_objdir)
>> > +             tmp_objdir_discard_objects(opt->remerge_objdir);
>> > +     else
>> > +             BUG("unable to remove temporary object directory");
>>
>> Re the die in 1/10 I don't think this will ever trigger the way this bug
>> suggests.
>>
>> If we didn't manage to remove the directory that'll be signalled with
>> the return code of tmp_objdir_discard_objects() which you're adding
>> here, but which doesn't have a meaningful return value.
>>
>> So shouldn't it first of all be returning the "int" like the
>> remove_dir_recursively() user in tmp_objdir_destroy_1() makes use of?
>>
>> What this bug is really about is:
>>
>>     BUG("our juggling of opt->remerge_objdir between here and builtin/log.c is screwy")
>>
>> Or something, because if we failed to remove the director(ies) we'll
>> just ignore that here.
>
> Yeah, I think I'm suffering from leftover bits from earlier versions
> since this patch series has been waiting for 17 months now.  I
> switched it to
>
>     BUG("did a remerge diff without remerge_objdir?!?");

Thanks :)

>>
>> > +void tmp_objdir_discard_objects(struct tmp_objdir *t)
>> > +{
>> > +     remove_dir_recursively(&t->path, REMOVE_DIR_KEEP_TOPLEVEL);
>> > +}
>>
>> I skimmed remove_dir_recurse() a bit, but didn't test this, does this
>> remove just the "de/eadbeef..." in "de/eadbeef..." or also "de/",
>> i.e. do we (and do we want) to keep the fanned-out 256 loose top-level
>> directories throughout the operation?
>
> It will remove everything below t->path, but leave t->path.  As such,
> it'll nuke any of the 256 loose top-level directories that exist.
>
> If someone wants to come along later and measure performance and
> determine if leaving those 256 loose top-level directories around
> improves things, I think that's fine, but I'm not going to look at it
> as part of this series.  I'm more curious about where tmp_objdir
> creates the temporary directory; when the intent is to migrate the
> objects into the main directory, it should probably be created on the
> same filesystem.  When the intent is scratch space, like it is for
> --remerge-diff, the tmp_objdir should probably be shoved in /dev/shm
> or something like that.  But again, that's outside of this series.
> This series already has had a long list of things keeping it from the
> light of day; there's no need to add frills to it as part of the
> initial submission.

Sorry to add to the frustration. I really didn't mean that as a
suggestion for a thing to be addressed, I think this is way past good
enough. It was just something I found curious, didn't quite know how it
worked, and thought I'd ask if you knew offhand. Thanks!
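
For readers curious about the behaviour Elijah describes (everything
below t->path goes away, including any fan-out directories, while
t->path itself stays), here is a minimal standalone sketch.  It is not
git's remove_dir_recursively(); remove_tree() and its keep_top parameter
are hypothetical, and error handling is deliberately simple.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: remove everything below 'top'.  If 'keep_top' is
 * zero, remove 'top' itself as well.  Returns 0 on success, -1 on error.
 */
int remove_tree(const char *top, int keep_top)
{
	DIR *dir;
	struct dirent *e;
	int ret = 0;

	assert(top != NULL);
	dir = opendir(top);
	if (!dir)
		return -1;
	while ((e = readdir(dir)) != NULL) {
		char path[4096];
		struct stat st;
		int len;

		if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, ".."))
			continue;
		len = snprintf(path, sizeof(path), "%s/%s", top, e->d_name);
		if (len < 0 || (size_t)len >= sizeof(path) || lstat(path, &st)) {
			ret = -1;
			continue;
		}
		if (S_ISDIR(st.st_mode))
			ret |= remove_tree(path, 0); /* subdirectories go away entirely */
		else if (unlink(path))
			ret = -1;
	}
	closedir(dir);
	if (!keep_top && rmdir(top))
		ret = -1;
	return ret ? -1 : 0;
}
```

Calling remove_tree(dir, 1) on a directory containing "de/eadbeef..."
removes both the loose object file and the "de/" directory, and leaves
only the empty top-level directory behind.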



Thread overview: 113+ messages
2021-12-21 18:05 [PATCH 0/9] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
2021-12-21 18:05 ` [PATCH 1/9] tmp_objdir: add a helper function for discarding all contained objects Elijah Newren via GitGitGadget
2021-12-21 23:26   ` Junio C Hamano
2021-12-21 23:51     ` Elijah Newren
2021-12-22  6:23       ` Junio C Hamano
2021-12-25  2:29         ` Elijah Newren
2021-12-21 18:05 ` [PATCH 2/9] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
2021-12-21 21:19   ` Ævar Arnfjörð Bjarmason
2021-12-21 21:57     ` Elijah Newren
2021-12-21 23:02       ` Ævar Arnfjörð Bjarmason
2021-12-21 23:15         ` Elijah Newren
2021-12-21 23:44   ` Junio C Hamano
2021-12-23 18:26     ` Elijah Newren
2021-12-21 18:05 ` [PATCH 3/9] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
2021-12-22  0:00   ` Junio C Hamano
2021-12-23 18:36     ` Elijah Newren
2021-12-21 18:05 ` [PATCH 4/9] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
2021-12-22  0:06   ` Junio C Hamano
2021-12-23 18:38     ` Elijah Newren
2021-12-21 18:05 ` [PATCH 5/9] merge-ort: make path_messages available to external callers Elijah Newren via GitGitGadget
2021-12-21 18:05 ` [PATCH 6/9] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
2021-12-22  0:24   ` Junio C Hamano
2021-12-25  2:35     ` Elijah Newren
2021-12-21 18:05 ` [PATCH 7/9] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
2021-12-21 18:05 ` [PATCH 8/9] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
2021-12-21 21:23   ` Ævar Arnfjörð Bjarmason
2021-12-21 22:18     ` Elijah Newren
2021-12-21 18:05 ` [PATCH 9/9] doc/diff-options: explain the new --remerge-diff option Elijah Newren via GitGitGadget
2021-12-21 21:28   ` Ævar Arnfjörð Bjarmason
2021-12-21 22:24     ` Elijah Newren
2021-12-21 23:47       ` Ævar Arnfjörð Bjarmason
2021-12-22 19:05         ` Elijah Newren
2021-12-21 23:20 ` [PATCH 0/9] Add a new --remerge-diff capability to show & log Junio C Hamano
2021-12-21 23:43   ` Elijah Newren
2021-12-22  0:33 ` Junio C Hamano
2021-12-25  7:59 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget
2021-12-25  7:59   ` [PATCH v2 1/8] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
2021-12-28 10:56     ` Johannes Altmanninger
2021-12-28 22:34       ` Elijah Newren
2021-12-28 23:01         ` brian m. carlson
2021-12-28 23:45           ` Elijah Newren
2021-12-25  7:59   ` [PATCH v2 2/8] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
2021-12-25  7:59   ` [PATCH v2 3/8] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
2021-12-28 10:56     ` Johannes Altmanninger
2021-12-28 19:37       ` Elijah Newren
2021-12-28 22:05         ` Johannes Altmanninger
2021-12-25  7:59   ` [PATCH v2 4/8] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
2021-12-25  7:59   ` [PATCH v2 5/8] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
2021-12-25  7:59   ` [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
2021-12-26 18:30     ` In-tree strbuf "in-place" search/replace (was: [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers) Ævar Arnfjörð Bjarmason
2021-12-28 10:56     ` [PATCH v2 6/8] merge-ort: format messages slightly different for use in headers Johannes Altmanninger
2021-12-28 21:48       ` Elijah Newren
2021-12-25  7:59   ` [PATCH v2 7/8] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
2021-12-28 10:57     ` Johannes Altmanninger
2021-12-28 21:09       ` Elijah Newren
2021-12-29  0:16         ` Johannes Altmanninger
2021-12-30 22:04           ` Elijah Newren
2021-12-31  3:07             ` Johannes Altmanninger
2021-12-25  7:59   ` [PATCH v2 8/8] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
2021-12-28 10:57     ` Johannes Altmanninger
2021-12-28 23:42       ` Elijah Newren
2021-12-26 21:52   ` [PATCH v2 0/8] Add a new --remerge-diff capability to show & log Ævar Arnfjörð Bjarmason
2021-12-27 21:11     ` Elijah Newren
2022-01-10 15:48       ` Ævar Arnfjörð Bjarmason
2021-12-28 10:55   ` Johannes Altmanninger
2021-12-30 23:36   ` [PATCH v3 0/9] " Elijah Newren via GitGitGadget
2021-12-30 23:36     ` [PATCH v3 1/9] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
2022-01-19 15:49       ` Ævar Arnfjörð Bjarmason
2022-01-20  2:31         ` Elijah Newren
2022-01-20  7:53           ` Elijah Newren
2022-01-19 16:01       ` Ævar Arnfjörð Bjarmason
2022-01-20  2:33         ` Elijah Newren
2021-12-30 23:36     ` [PATCH v3 2/9] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
2021-12-30 23:36     ` [PATCH v3 3/9] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
2022-01-19 16:41       ` Ævar Arnfjörð Bjarmason
2022-01-20  3:29         ` Elijah Newren
2021-12-30 23:36     ` [PATCH v3 4/9] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
2021-12-30 23:36     ` [PATCH v3 5/9] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
2021-12-30 23:36     ` [PATCH v3 6/9] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
2021-12-30 23:36     ` [PATCH v3 7/9] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
2021-12-30 23:36     ` [PATCH v3 8/9] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
2022-01-19 16:19       ` Ævar Arnfjörð Bjarmason
2022-01-21  2:16         ` Elijah Newren
2022-01-21 16:55           ` Elijah Newren
2021-12-30 23:36     ` [PATCH v3 9/9] merge-ort: mark conflict/warning messages from inner merges as omittable Elijah Newren via GitGitGadget
2021-12-31  8:46     ` [PATCH v3 0/9] Add a new --remerge-diff capability to show & log Junio C Hamano
2022-01-21 19:12     ` [PATCH v4 00/10] " Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 01/10] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
2022-02-01  9:09         ` Ævar Arnfjörð Bjarmason
2022-02-01 16:40           ` Elijah Newren
2022-01-21 19:12       ` [PATCH v4 02/10] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
2022-02-01  9:35         ` Ævar Arnfjörð Bjarmason
2022-02-01 16:54           ` Elijah Newren
2022-02-02 11:17             ` Ævar Arnfjörð Bjarmason
2022-01-21 19:12       ` [PATCH v4 03/10] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 04/10] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 05/10] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 06/10] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 07/10] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 08/10] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 09/10] merge-ort: mark conflict/warning messages from inner merges as omittable Elijah Newren via GitGitGadget
2022-01-21 19:12       ` [PATCH v4 10/10] diff-merges: avoid history simplifications when diffing merges Elijah Newren via GitGitGadget
2022-02-02  2:37       ` [PATCH v5 00/10] Add a new --remerge-diff capability to show & log Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 01/10] show, log: provide a --remerge-diff capability Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 02/10] log: clean unneeded objects during `log --remerge-diff` Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 03/10] ll-merge: make callers responsible for showing warnings Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 04/10] merge-ort: capture and print ll-merge warnings in our preferred fashion Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 05/10] merge-ort: mark a few more conflict messages as omittable Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 06/10] merge-ort: format messages slightly different for use in headers Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 07/10] diff: add ability to insert additional headers for paths Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 08/10] show, log: include conflict/warning messages in --remerge-diff headers Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 09/10] merge-ort: mark conflict/warning messages from inner merges as omittable Elijah Newren via GitGitGadget
2022-02-02  2:37         ` [PATCH v5 10/10] diff-merges: avoid history simplifications when diffing merges Elijah Newren via GitGitGadget
