git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames
@ 2019-01-24 16:46 Elijah Newren
  2019-01-25  2:19 ` brian m. carlson
                   ` (3 more replies)
  0 siblings, 4 replies; 39+ messages in thread
From: Elijah Newren @ 2019-01-24 16:46 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Elijah Newren

The raw diff format for merges with -c or --cc will only list one
filename, even if rename detection is active and a rename was detected
for the given path.  Examples:

  ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
  ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
  ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c

This doesn't let us know what the original name of bar.sh was in the
first parent, and doesn't let us know what either of the original
names of phooey.c were in either of the parents.  In contrast, for
non-merge commits, raw format does provide original filenames (and a
rename score to boot).  In order to also provide original filenames
for merge commits, add a --combined-with-paths option (which is only
useful in conjunction with -c, --raw, and -M and thus implies all
those options) so that we can print tab-separated filenames when
renames are involved.  This transforms the above output to:

  ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
  ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh
  ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c

Signed-off-by: Elijah Newren <newren@gmail.com>
---

The critical part of the patch is the few line change to
show_raw_diff(), the rest is plumbing to set that up.

This patch was based out of Peff's suggestion[1] to fix diff-tree to
do what I needed rather than bending fast-export to cover my usecase;
I've been running with a hacky version of this patch for a while and
finally cleaned it up.

[1] https://public-inbox.org/git/20181114071454.GB19904@sigill.intra.peff.net/

As an alternative, I considered perhaps trying to sell it as a bugfix
(how often do people use -M, -c, and --raw together and have renames
in merge commits -- can I just change the format to include the old
names), but was worried that since diff-tree is plumbing and that the
format was documented to not include original filename(s), that I'd be
breaking backward compatibility in an important way for someone and
thus opted for a new flag to get the behavior I needed.

I did struggle a bit to come up with a name for the option; if others
have better suggestions, I'm happy to switch.


Range-diff:
1:  29e9ddf532 = 1:  29e9ddf532 log,diff-tree: add --combined-with-paths options for merges with renames

 Documentation/diff-format.txt      | 23 +++++++++++++---
 Documentation/git-diff-tree.txt    |  9 +++++--
 Documentation/rev-list-options.txt |  5 ++++
 combine-diff.c                     | 42 ++++++++++++++++++++++++++----
 diff.h                             |  1 +
 revision.c                         |  7 +++++
 revision.h                         |  1 +
 7 files changed, 78 insertions(+), 10 deletions(-)

diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
index cdcc17f0ad..6e6cfc5d56 100644
--- a/Documentation/diff-format.txt
+++ b/Documentation/diff-format.txt
@@ -87,7 +87,7 @@ diff format for merges
 ----------------------
 
 "git-diff-tree", "git-diff-files" and "git-diff --raw"
-can take `-c` or `--cc` option
+can take `-c`, `--cc`, or `--combined-with-paths` option
 to generate diff output also for merge commits.  The output differs
 from the format described above in the following way:
 
@@ -95,12 +95,29 @@ from the format described above in the following way:
 . there are more "src" modes and "src" sha1
 . status is concatenated status characters for each parent
 . no optional "score" number
-. single path, only for "dst"
+. tab-separated pathname(s) of the file
 
-Example:
+For `-c` and `--cc`, only the destination or final path is shown even
+if the file was renamed on any side of history.  With
+`--combined-with-paths`, the number of paths printed will be one more
+than the number of 'R' characters in the concatenated status.  For
+each 'R' in the concatenated status characters, the original pathname
+on that side of history will be shown, and the final path shown on the
+line will be the path used in the merge.
+
+Examples for `-c` and `-cc`:
+------------------------------------------------
+::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
+::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
+::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c
+------------------------------------------------
+
+Examples for `--combined-with-paths`:
 
 ------------------------------------------------
 ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
+::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh
+::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c
 ------------------------------------------------
 
 Note that 'combined diff' lists only files which were modified from
diff --git a/Documentation/git-diff-tree.txt b/Documentation/git-diff-tree.txt
index 2319b2b192..2d9ba382bc 100644
--- a/Documentation/git-diff-tree.txt
+++ b/Documentation/git-diff-tree.txt
@@ -10,8 +10,8 @@ SYNOPSIS
 --------
 [verse]
 'git diff-tree' [--stdin] [-m] [-s] [-v] [--no-commit-id] [--pretty]
-	      [-t] [-r] [-c | --cc] [--root] [<common diff options>]
-	      <tree-ish> [<tree-ish>] [<path>...]
+	      [-t] [-r] [-c | --cc | --combined-with-paths] [--root]
+	      [<common diff options>] <tree-ish> [<tree-ish>] [<path>...]
 
 DESCRIPTION
 -----------
@@ -108,6 +108,11 @@ include::pretty-options.txt[]
 	itself and the commit log message is not shown, just like in any other
 	"empty diff" case.
 
+--combined-with-paths::
+	This flag is similar to -c, but modifies the raw output format for
+	merges to also show the original paths when renames are found.
+	Implies	-c, -M, and --raw.
+
 --always::
 	Show the commit itself and the commit log message even
 	if the diff itself is empty.
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 98b538bc77..4fdcaef52d 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -948,6 +948,11 @@ options may be given. See linkgit:git-diff-files[1] for more options.
 	the parents have only two variants and the merge result picks
 	one of them without modification.
 
+--combined-with-paths::
+	This flag is similar to -c, but modifies the raw output format for
+	merges to also show the original paths when renames are found.
+	Implies	-c, -M, and --raw.
+
 -m::
 	This flag makes the merge commits show the full diff like
 	regular commits; for each merge parent, a separate log entry
diff --git a/combine-diff.c b/combine-diff.c
index a143c00634..77f7de14c9 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -23,11 +23,15 @@ static int compare_paths(const struct combine_diff_path *one,
 				 two->path, strlen(two->path), two->mode);
 }
 
-static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr, int n, int num_parent)
+static struct combine_diff_path *intersect_paths(
+	struct combine_diff_path *curr,
+	int n,
+	int num_parent,
+	int combined_with_paths)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct combine_diff_path *p, **tail = &curr;
-	int i, cmp;
+	int i, j, cmp;
 
 	if (!n) {
 		for (i = 0; i < q->nr; i++) {
@@ -50,6 +54,13 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 			oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid);
 			p->parent[n].mode = q->queue[i]->one->mode;
 			p->parent[n].status = q->queue[i]->status;
+
+			if (combined_with_paths &&
+			    p->parent[n].status == 'R') {
+				strbuf_init(&p->parent[n].path, 0);
+				strbuf_addstr(&p->parent[n].path,
+					      q->queue[i]->one->path);
+			}
 			*tail = p;
 			tail = &p->next;
 		}
@@ -68,6 +79,10 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 		if (cmp < 0) {
 			/* p->path not in q->queue[]; drop it */
 			*tail = p->next;
+			for (j = 0; j < num_parent; j++)
+				if (combined_with_paths &&
+				    p->parent[j].status == 'R')
+					strbuf_release(&p->parent[j].path);
 			free(p);
 			continue;
 		}
@@ -81,6 +96,10 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 		oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid);
 		p->parent[n].mode = q->queue[i]->one->mode;
 		p->parent[n].status = q->queue[i]->status;
+		if (combined_with_paths &&
+		    p->parent[n].status == 'R')
+			strbuf_addstr(&p->parent[n].path,
+				      q->queue[i]->one->path);
 
 		tail = &p->next;
 		i++;
@@ -1227,6 +1246,11 @@ static void show_raw_diff(struct combine_diff_path *p, int num_parent, struct re
 		putchar(inter_name_termination);
 	}
 
+	for (i = 0; i < num_parent; i++)
+		if (rev->combined_with_paths &&
+		    p->parent[i].status == 'R')
+			write_name_quoted(p->parent[i].path.buf, stdout,
+					  inter_name_termination);
 	write_name_quoted(p->path, stdout, line_termination);
 }
 
@@ -1324,7 +1348,9 @@ static const char *path_path(void *obj)
 
 /* find set of paths that every parent touches */
 static struct combine_diff_path *find_paths_generic(const struct object_id *oid,
-	const struct oid_array *parents, struct diff_options *opt)
+	const struct oid_array *parents,
+	struct diff_options *opt,
+	int combined_with_paths)
 {
 	struct combine_diff_path *paths = NULL;
 	int i, num_parent = parents->nr;
@@ -1350,7 +1376,8 @@ static struct combine_diff_path *find_paths_generic(const struct object_id *oid,
 			opt->output_format = DIFF_FORMAT_NO_OUTPUT;
 		diff_tree_oid(&parents->oid[i], oid, "", opt);
 		diffcore_std(opt);
-		paths = intersect_paths(paths, i, num_parent);
+		paths = intersect_paths(paths, i, num_parent,
+					combined_with_paths);
 
 		/* if showing diff, show it in requested order */
 		if (opt->output_format != DIFF_FORMAT_NO_OUTPUT &&
@@ -1460,7 +1487,8 @@ void diff_tree_combined(const struct object_id *oid,
 		 * diff(sha1,parent_i) for all i to do the job, specifically
 		 * for parent0.
 		 */
-		paths = find_paths_generic(oid, parents, &diffopts);
+		paths = find_paths_generic(oid, parents, &diffopts,
+					   rev->combined_with_paths);
 	}
 	else {
 		int stat_opt;
@@ -1535,6 +1563,10 @@ void diff_tree_combined(const struct object_id *oid,
 	while (paths) {
 		struct combine_diff_path *tmp = paths;
 		paths = paths->next;
+		for (i = 0; i < num_parent; i++)
+			if (rev->combined_with_paths &&
+			    tmp->parent[i].status == 'R')
+				strbuf_release(&tmp->parent[i].path);
 		free(tmp);
 	}
 
diff --git a/diff.h b/diff.h
index b512d0477a..90ea0256a5 100644
--- a/diff.h
+++ b/diff.h
@@ -294,6 +294,7 @@ struct combine_diff_path {
 		char status;
 		unsigned int mode;
 		struct object_id oid;
+		struct strbuf path;
 	} parent[FLEX_ARRAY];
 };
 #define combine_diff_path_size(n, l) \
diff --git a/revision.c b/revision.c
index 13cfb59b38..21c818ba11 100644
--- a/revision.c
+++ b/revision.c
@@ -1995,6 +1995,13 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->diff = 1;
 		revs->dense_combined_merges = 0;
 		revs->combine_merges = 1;
+	} else if (!strcmp(arg, "--combined-with-paths")) {
+		revs->diff = 1;
+		revs->dense_combined_merges = 0;
+		revs->combine_merges = 1;
+		revs->combined_with_paths = 1;
+		revs->diffopt.detect_rename = DIFF_DETECT_RENAME;
+		revs->diffopt.output_format |= DIFF_FORMAT_RAW;
 	} else if (!strcmp(arg, "--cc")) {
 		revs->diff = 1;
 		revs->dense_combined_merges = 1;
diff --git a/revision.h b/revision.h
index 52e5a88ff5..59a15b0ed0 100644
--- a/revision.h
+++ b/revision.h
@@ -171,6 +171,7 @@ struct rev_info {
 			verbose_header:1,
 			ignore_merges:1,
 			combine_merges:1,
+			combined_with_paths:1,
 			dense_combined_merges:1,
 			always_show_header:1;
 
-- 
2.20.1.310.g29e9ddf532


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-24 16:46 [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames Elijah Newren
@ 2019-01-25  2:19 ` brian m. carlson
  2019-01-25 16:27   ` Elijah Newren
  2019-01-25 14:45 ` Derrick Stolee
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 39+ messages in thread
From: brian m. carlson @ 2019-01-25  2:19 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, Jeff King

[-- Attachment #1: Type: text/plain, Size: 2257 bytes --]

On Thu, Jan 24, 2019 at 08:46:54AM -0800, Elijah Newren wrote:
> The critical part of the patch is the few line change to
> show_raw_diff(), the rest is plumbing to set that up.
> 
> This patch was based out of Peff's suggestion[1] to fix diff-tree to
> do what I needed rather than bending fast-export to cover my usecase;
> I've been running with a hacky version of this patch for a while and
> finally cleaned it up.
> 
> [1] https://public-inbox.org/git/20181114071454.GB19904@sigill.intra.peff.net/
> 
> As an alternative, I considered perhaps trying to sell it as a bugfix
> (how often do people use -M, -c, and --raw together and have renames
> in merge commits -- can I just change the format to include the old
> names), but was worried that since diff-tree is plumbing and that the
> format was documented to not include original filename(s), that I'd be
> breaking backward compatibility in an important way for someone and
> thus opted for a new flag to get the behavior I needed.
> 
> I did struggle a bit to come up with a name for the option; if others
> have better suggestions, I'm happy to switch.

Maybe --all-names? You should definitely let other people chime in on
this as well; it should be obvious by now that I'm no expert on naming
things.

> Range-diff:
> 1:  29e9ddf532 = 1:  29e9ddf532 log,diff-tree: add --combined-with-paths options for merges with renames
> 
>  Documentation/diff-format.txt      | 23 +++++++++++++---
>  Documentation/git-diff-tree.txt    |  9 +++++--
>  Documentation/rev-list-options.txt |  5 ++++
>  combine-diff.c                     | 42 ++++++++++++++++++++++++++----
>  diff.h                             |  1 +
>  revision.c                         |  7 +++++
>  revision.h                         |  1 +
>  7 files changed, 78 insertions(+), 10 deletions(-)

I think it might be nice to see a test for this option so that we avoid
breaking it in the future. I'm also curious how this works with -z, and
a test for that would be interesting as well (as well as illustrative of
the format). For example, is it still unambiguous for machine parsing,
even with oddly named files?
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-24 16:46 [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames Elijah Newren
  2019-01-25  2:19 ` brian m. carlson
@ 2019-01-25 14:45 ` Derrick Stolee
  2019-01-25 16:30   ` Elijah Newren
  2019-01-25 16:54 ` [PATCH v2] " Elijah Newren
  2019-01-25 19:29 ` [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames Junio C Hamano
  3 siblings, 1 reply; 39+ messages in thread
From: Derrick Stolee @ 2019-01-25 14:45 UTC (permalink / raw)
  To: Elijah Newren, git; +Cc: Jeff King

On 1/24/2019 11:46 AM, Elijah Newren wrote:
> As an alternative, I considered perhaps trying to sell it as a bugfix
> (how often do people use -M, -c, and --raw together and have renames
> in merge commits -- can I just change the format to include the old
> names), but was worried that since diff-tree is plumbing and that the
> format was documented to not include original filename(s), that I'd be
> breaking backward compatibility in an important way for someone and
> thus opted for a new flag to get the behavior I needed.

This is wise. Changing the format is likely to break at least one GUI
client or tool, especially because many depend on the installed version
of Git instead of shipping with one "blessed" client.

In addition, there may be whitespace in the filenames. It appears the
diff format separates the filenames with tab characters. Is it
possible to have tab character in a file name? That would make the output
ambiguous, but is no worse than our current output in a non-combined
diff.

I'll repeat Brian's request for tests. I trust the compiler and the
test suite more than I trust my ability to read code.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-25  2:19 ` brian m. carlson
@ 2019-01-25 16:27   ` Elijah Newren
  0 siblings, 0 replies; 39+ messages in thread
From: Elijah Newren @ 2019-01-25 16:27 UTC (permalink / raw)
  To: brian m. carlson, Elijah Newren, Git Mailing List, Jeff King

Hi,

On Thu, Jan 24, 2019 at 6:19 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On Thu, Jan 24, 2019 at 08:46:54AM -0800, Elijah Newren wrote:
> > The critical part of the patch is the few line change to
> > show_raw_diff(), the rest is plumbing to set that up.
> >
> > This patch was based out of Peff's suggestion[1] to fix diff-tree to
> > do what I needed rather than bending fast-export to cover my usecase;
> > I've been running with a hacky version of this patch for a while and
> > finally cleaned it up.
> >
> > [1] https://public-inbox.org/git/20181114071454.GB19904@sigill.intra.peff.net/
> >
> > As an alternative, I considered perhaps trying to sell it as a bugfix
> > (how often do people use -M, -c, and --raw together and have renames
> > in merge commits -- can I just change the format to include the old
> > names), but was worried that since diff-tree is plumbing and that the
> > format was documented to not include original filename(s), that I'd be
> > breaking backward compatibility in an important way for someone and
> > thus opted for a new flag to get the behavior I needed.
> >
> > I did struggle a bit to come up with a name for the option; if others
> > have better suggestions, I'm happy to switch.
>
> Maybe --all-names? You should definitely let other people chime in on
> this as well; it should be obvious by now that I'm no expert on naming
> things.

--all-names may work.  Two possible worries:

1) Would people expect the output when using this option to be of the form:

::100644 100644 100644 fabadb8 cc95eb0 4866510 MM    describe.c
describe.c    describe.c

In other words, even if there are no renames involved, would they
expect us to show the filename from each side of the merge?  I was
trying to make the combined diff output format more similar to how we
handle non-merge commits; we only show more than one filename when a
rename is involved.

2) When looking through manpages I have often found an option that I
thought was related to what I wanted, only to discover that a short
semi-ambiguous option had a much different context in mind than I did
and thus performed some operation entirely unrelated to anything I was
interested in.  I suspect that this won't be a highly used option, as
such, a longer name to disambiguate seemed reasonable.


Of course --combined-with-paths isn't fully descriptive either, and
making it even longer might be annoying, so...  *shrug*

>
> > Range-diff:
> > 1:  29e9ddf532 = 1:  29e9ddf532 log,diff-tree: add --combined-with-paths options for merges with renames
> >
> >  Documentation/diff-format.txt      | 23 +++++++++++++---
> >  Documentation/git-diff-tree.txt    |  9 +++++--
> >  Documentation/rev-list-options.txt |  5 ++++
> >  combine-diff.c                     | 42 ++++++++++++++++++++++++++----
> >  diff.h                             |  1 +
> >  revision.c                         |  7 +++++
> >  revision.h                         |  1 +
> >  7 files changed, 78 insertions(+), 10 deletions(-)
>
> I think it might be nice to see a test for this option so that we avoid
> breaking it in the future. I'm also curious how this works with -z, and
> a test for that would be interesting as well (as well as illustrative of
> the format). For example, is it still unambiguous for machine parsing,
> even with oddly named files?

I'll add a test for both with and without -z.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-25 14:45 ` Derrick Stolee
@ 2019-01-25 16:30   ` Elijah Newren
  0 siblings, 0 replies; 39+ messages in thread
From: Elijah Newren @ 2019-01-25 16:30 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Git Mailing List, Jeff King

Hi,

On Fri, Jan 25, 2019 at 6:45 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 1/24/2019 11:46 AM, Elijah Newren wrote:
> > As an alternative, I considered perhaps trying to sell it as a bugfix
> > (how often do people use -M, -c, and --raw together and have renames
> > in merge commits -- can I just change the format to include the old
> > names), but was worried that since diff-tree is plumbing and that the
> > format was documented to not include original filename(s), that I'd be
> > breaking backward compatibility in an important way for someone and
> > thus opted for a new flag to get the behavior I needed.
>
> This is wise. Changing the format is likely to break at least one GUI
> client or tool, especially because many depend on the installed version
> of Git instead of shipping with one "blessed" client.
>
> In addition, there may be whitespace in the filenames. It appears the
> diff format separates the filenames with tab characters. Is it
> possible to have tab character in a file name? That would make the output
> ambiguous, but is no worse than our current output in a non-combined
> diff.

No, it's actually unambiguous with or without the -z option. Without
-z, pathnames with "unusual" characters (whose definition depends on
core.quotePath) will cause the pathname to be enclosed in
double-quotes with the unusual characters handed with backslashes and
control characters.  With -z, it separates pathnames with a nul
character instead.

> I'll repeat Brian's request for tests. I trust the compiler and the
> test suite more than I trust my ability to read code.

Yep, I'll add a pair and use filenames with multiple tabs.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-24 16:46 [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames Elijah Newren
  2019-01-25  2:19 ` brian m. carlson
  2019-01-25 14:45 ` Derrick Stolee
@ 2019-01-25 16:54 ` Elijah Newren
  2019-01-25 17:40   ` Derrick Stolee
  2019-01-26 22:18   ` [PATCH v3] log,diff-tree: add --combined-all-names option Elijah Newren
  2019-01-25 19:29 ` [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames Junio C Hamano
  3 siblings, 2 replies; 39+ messages in thread
From: Elijah Newren @ 2019-01-25 16:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, brian m . carlson, Derrick Stolee, Elijah Newren

The raw diff format for merges with -c or --cc will only list one
filename, even if rename detection is active and a rename was detected
for the given path.  Examples:

  ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
  ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
  ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c

This doesn't let us know what the original name of bar.sh was in the
first parent, and doesn't let us know what either of the original
names of phooey.c were in either of the parents.  In contrast, for
non-merge commits, raw format does provide original filenames (and a
rename score to boot).  In order to also provide original filenames
for merge commits, add a --combined-with-paths option (which is only
useful in conjunction with -c, --raw, and -M and thus implies all
those options) so that we can print tab-separated filenames when
renames are involved.  This transforms the above output to:

  ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
  ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh
  ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c

Signed-off-by: Elijah Newren <newren@gmail.com>
---

Differences since v1: added a testcase, with and without -z

 Documentation/diff-format.txt      | 23 +++++++++++++---
 Documentation/git-diff-tree.txt    |  9 +++++--
 Documentation/rev-list-options.txt |  5 ++++
 combine-diff.c                     | 42 ++++++++++++++++++++++++++----
 diff.h                             |  1 +
 revision.c                         |  7 +++++
 revision.h                         |  1 +
 t/t4038-diff-combined.sh           | 36 +++++++++++++++++++++++++
 8 files changed, 114 insertions(+), 10 deletions(-)

diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
index cdcc17f0ad..6e6cfc5d56 100644
--- a/Documentation/diff-format.txt
+++ b/Documentation/diff-format.txt
@@ -87,7 +87,7 @@ diff format for merges
 ----------------------
 
 "git-diff-tree", "git-diff-files" and "git-diff --raw"
-can take `-c` or `--cc` option
+can take `-c`, `--cc`, or `--combined-with-paths` option
 to generate diff output also for merge commits.  The output differs
 from the format described above in the following way:
 
@@ -95,12 +95,29 @@ from the format described above in the following way:
 . there are more "src" modes and "src" sha1
 . status is concatenated status characters for each parent
 . no optional "score" number
-. single path, only for "dst"
+. tab-separated pathname(s) of the file
 
-Example:
+For `-c` and `--cc`, only the destination or final path is shown even
+if the file was renamed on any side of history.  With
+`--combined-with-paths`, the number of paths printed will be one more
+than the number of 'R' characters in the concatenated status.  For
+each 'R' in the concatenated status characters, the original pathname
+on that side of history will be shown, and the final path shown on the
+line will be the path used in the merge.
+
+Examples for `-c` and `-cc`:
+------------------------------------------------
+::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
+::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
+::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c
+------------------------------------------------
+
+Examples for `--combined-with-paths`:
 
 ------------------------------------------------
 ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
+::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh
+::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c
 ------------------------------------------------
 
 Note that 'combined diff' lists only files which were modified from
diff --git a/Documentation/git-diff-tree.txt b/Documentation/git-diff-tree.txt
index 2319b2b192..2d9ba382bc 100644
--- a/Documentation/git-diff-tree.txt
+++ b/Documentation/git-diff-tree.txt
@@ -10,8 +10,8 @@ SYNOPSIS
 --------
 [verse]
 'git diff-tree' [--stdin] [-m] [-s] [-v] [--no-commit-id] [--pretty]
-	      [-t] [-r] [-c | --cc] [--root] [<common diff options>]
-	      <tree-ish> [<tree-ish>] [<path>...]
+	      [-t] [-r] [-c | --cc | --combined-with-paths] [--root]
+	      [<common diff options>] <tree-ish> [<tree-ish>] [<path>...]
 
 DESCRIPTION
 -----------
@@ -108,6 +108,11 @@ include::pretty-options.txt[]
 	itself and the commit log message is not shown, just like in any other
 	"empty diff" case.
 
+--combined-with-paths::
+	This flag is similar to -c, but modifies the raw output format for
+	merges to also show the original paths when renames are found.
+	Implies	-c, -M, and --raw.
+
 --always::
 	Show the commit itself and the commit log message even
 	if the diff itself is empty.
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 98b538bc77..4fdcaef52d 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -948,6 +948,11 @@ options may be given. See linkgit:git-diff-files[1] for more options.
 	the parents have only two variants and the merge result picks
 	one of them without modification.
 
+--combined-with-paths::
+	This flag is similar to -c, but modifies the raw output format for
+	merges to also show the original paths when renames are found.
+	Implies	-c, -M, and --raw.
+
 -m::
 	This flag makes the merge commits show the full diff like
 	regular commits; for each merge parent, a separate log entry
diff --git a/combine-diff.c b/combine-diff.c
index a143c00634..77f7de14c9 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -23,11 +23,15 @@ static int compare_paths(const struct combine_diff_path *one,
 				 two->path, strlen(two->path), two->mode);
 }
 
-static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr, int n, int num_parent)
+static struct combine_diff_path *intersect_paths(
+	struct combine_diff_path *curr,
+	int n,
+	int num_parent,
+	int combined_with_paths)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct combine_diff_path *p, **tail = &curr;
-	int i, cmp;
+	int i, j, cmp;
 
 	if (!n) {
 		for (i = 0; i < q->nr; i++) {
@@ -50,6 +54,13 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 			oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid);
 			p->parent[n].mode = q->queue[i]->one->mode;
 			p->parent[n].status = q->queue[i]->status;
+
+			if (combined_with_paths &&
+			    p->parent[n].status == 'R') {
+				strbuf_init(&p->parent[n].path, 0);
+				strbuf_addstr(&p->parent[n].path,
+					      q->queue[i]->one->path);
+			}
 			*tail = p;
 			tail = &p->next;
 		}
@@ -68,6 +79,10 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 		if (cmp < 0) {
 			/* p->path not in q->queue[]; drop it */
 			*tail = p->next;
+			for (j = 0; j < num_parent; j++)
+				if (combined_with_paths &&
+				    p->parent[j].status == 'R')
+					strbuf_release(&p->parent[j].path);
 			free(p);
 			continue;
 		}
@@ -81,6 +96,10 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 		oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid);
 		p->parent[n].mode = q->queue[i]->one->mode;
 		p->parent[n].status = q->queue[i]->status;
+		if (combined_with_paths &&
+		    p->parent[n].status == 'R')
+			strbuf_addstr(&p->parent[n].path,
+				      q->queue[i]->one->path);
 
 		tail = &p->next;
 		i++;
@@ -1227,6 +1246,11 @@ static void show_raw_diff(struct combine_diff_path *p, int num_parent, struct re
 		putchar(inter_name_termination);
 	}
 
+	for (i = 0; i < num_parent; i++)
+		if (rev->combined_with_paths &&
+		    p->parent[i].status == 'R')
+			write_name_quoted(p->parent[i].path.buf, stdout,
+					  inter_name_termination);
 	write_name_quoted(p->path, stdout, line_termination);
 }
 
@@ -1324,7 +1348,9 @@ static const char *path_path(void *obj)
 
 /* find set of paths that every parent touches */
 static struct combine_diff_path *find_paths_generic(const struct object_id *oid,
-	const struct oid_array *parents, struct diff_options *opt)
+	const struct oid_array *parents,
+	struct diff_options *opt,
+	int combined_with_paths)
 {
 	struct combine_diff_path *paths = NULL;
 	int i, num_parent = parents->nr;
@@ -1350,7 +1376,8 @@ static struct combine_diff_path *find_paths_generic(const struct object_id *oid,
 			opt->output_format = DIFF_FORMAT_NO_OUTPUT;
 		diff_tree_oid(&parents->oid[i], oid, "", opt);
 		diffcore_std(opt);
-		paths = intersect_paths(paths, i, num_parent);
+		paths = intersect_paths(paths, i, num_parent,
+					combined_with_paths);
 
 		/* if showing diff, show it in requested order */
 		if (opt->output_format != DIFF_FORMAT_NO_OUTPUT &&
@@ -1460,7 +1487,8 @@ void diff_tree_combined(const struct object_id *oid,
 		 * diff(sha1,parent_i) for all i to do the job, specifically
 		 * for parent0.
 		 */
-		paths = find_paths_generic(oid, parents, &diffopts);
+		paths = find_paths_generic(oid, parents, &diffopts,
+					   rev->combined_with_paths);
 	}
 	else {
 		int stat_opt;
@@ -1535,6 +1563,10 @@ void diff_tree_combined(const struct object_id *oid,
 	while (paths) {
 		struct combine_diff_path *tmp = paths;
 		paths = paths->next;
+		for (i = 0; i < num_parent; i++)
+			if (rev->combined_with_paths &&
+			    tmp->parent[i].status == 'R')
+				strbuf_release(&tmp->parent[i].path);
 		free(tmp);
 	}
 
diff --git a/diff.h b/diff.h
index b512d0477a..90ea0256a5 100644
--- a/diff.h
+++ b/diff.h
@@ -294,6 +294,7 @@ struct combine_diff_path {
 		char status;
 		unsigned int mode;
 		struct object_id oid;
+		struct strbuf path;
 	} parent[FLEX_ARRAY];
 };
 #define combine_diff_path_size(n, l) \
diff --git a/revision.c b/revision.c
index 13cfb59b38..21c818ba11 100644
--- a/revision.c
+++ b/revision.c
@@ -1995,6 +1995,13 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->diff = 1;
 		revs->dense_combined_merges = 0;
 		revs->combine_merges = 1;
+	} else if (!strcmp(arg, "--combined-with-paths")) {
+		revs->diff = 1;
+		revs->dense_combined_merges = 0;
+		revs->combine_merges = 1;
+		revs->combined_with_paths = 1;
+		revs->diffopt.detect_rename = DIFF_DETECT_RENAME;
+		revs->diffopt.output_format |= DIFF_FORMAT_RAW;
 	} else if (!strcmp(arg, "--cc")) {
 		revs->diff = 1;
 		revs->dense_combined_merges = 1;
diff --git a/revision.h b/revision.h
index 52e5a88ff5..59a15b0ed0 100644
--- a/revision.h
+++ b/revision.h
@@ -171,6 +171,7 @@ struct rev_info {
 			verbose_header:1,
 			ignore_merges:1,
 			combine_merges:1,
+			combined_with_paths:1,
 			dense_combined_merges:1,
 			always_show_header:1;
 
diff --git a/t/t4038-diff-combined.sh b/t/t4038-diff-combined.sh
index e2824d3437..9420c1318a 100755
--- a/t/t4038-diff-combined.sh
+++ b/t/t4038-diff-combined.sh
@@ -435,4 +435,40 @@ test_expect_success 'combine diff gets tree sorting right' '
 	test_cmp expect actual
 '
 
+test_expect_success 'setup for --combined-with-paths' '
+	git branch side1c &&
+	git branch side2c &&
+	git checkout side1c &&
+	test_seq 1 10 >$(printf "file\twith\ttabs") &&
+	git add file* &&
+	git commit -m with &&
+	git checkout side2c &&
+	test_seq 1 9 >$(printf "i\tam\ttabbed") &&
+	echo ten >>$(printf "i\tam\ttabbed") &&
+	git add *tabbed &&
+	git commit -m iam &&
+	git checkout -b mergery side1c &&
+	git merge --no-commit side2c &&
+	git rm *tabs &&
+	echo eleven >>$(printf "i\tam\ttabbed") &&
+	git mv "$(printf "i\tam\ttabbed")" "$(printf "fickle\tnaming")" &&
+	git add fickle* &&
+	git commit
+'
+
+test_expect_success '--combined-with-paths shows all paths involved' '
+	cat <<-\EOF >expect &&
+	::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR	"file\twith\ttabs"	"i\tam\ttabbed"	"fickle\tnaming"
+	EOF
+	git diff-tree --combined-with-paths HEAD >actual.tmp &&
+	sed 1d <actual.tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success '--combined-with-paths works with -z as well' '
+	printf "0f9645804ebb04cc3eef91f799eb7fb54d70cefb\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
+	git diff-tree --combined-with-paths -z HEAD >actual &&
+	test_cmp -a expect actual
+'
+
 test_done
-- 
2.20.1.310.g29e9ddf532


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v2] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-25 16:54 ` [PATCH v2] " Elijah Newren
@ 2019-01-25 17:40   ` Derrick Stolee
  2019-01-25 17:52     ` Elijah Newren
  2019-01-26 22:18   ` [PATCH v3] log,diff-tree: add --combined-all-names option Elijah Newren
  1 sibling, 1 reply; 39+ messages in thread
From: Derrick Stolee @ 2019-01-25 17:40 UTC (permalink / raw)
  To: Elijah Newren, git; +Cc: Jeff King, brian m . carlson

On 1/25/2019 11:54 AM, Elijah Newren wrote:
> +test_expect_success '--combined-with-paths works with -z as well' '
> +	printf "0f9645804ebb04cc3eef91f799eb7fb54d70cefb\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&

I'm guessing that you use printf here because the
'cat <<-\EOF' approach doesn't work with the special
tabs? Kudos for putting in the extra effort here for
the special formatting!

The rest of the patch looks good.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-25 17:40   ` Derrick Stolee
@ 2019-01-25 17:52     ` Elijah Newren
  2019-01-25 19:51       ` Eric Sunshine
  2019-01-27  1:52       ` brian m. carlson
  0 siblings, 2 replies; 39+ messages in thread
From: Elijah Newren @ 2019-01-25 17:52 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Git Mailing List, Jeff King, brian m . carlson

On Fri, Jan 25, 2019 at 9:41 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 1/25/2019 11:54 AM, Elijah Newren wrote:
> > +test_expect_success '--combined-with-paths works with -z as well' '
> > +     printf "0f9645804ebb04cc3eef91f799eb7fb54d70cefb\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
>
> I'm guessing that you use printf here because the
> 'cat <<-\EOF' approach doesn't work with the special
> tabs? Kudos for putting in the extra effort here for
> the special formatting!

Yeah, I didn't know how to easily get NUL bytes in the stream without
printf, and once I was using printf the EOF HEREDOC no longer had a
useful purpose.  In the first testcase, since there were only
printable characters in the expected output, a HEREDOC worked well.  I
guess I could have just used printf for both testcases, but having the
literal output shown where it's possible for a human to read it seemed
like an advantage worth capitalizing on.  If anyone feels strongly
that I should use printf for both tests, I can switch them though.

> The rest of the patch looks good.

Thanks for taking a look!

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-24 16:46 [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames Elijah Newren
                   ` (2 preceding siblings ...)
  2019-01-25 16:54 ` [PATCH v2] " Elijah Newren
@ 2019-01-25 19:29 ` Junio C Hamano
  2019-01-25 20:04   ` Elijah Newren
  3 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2019-01-25 19:29 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, Jeff King

Elijah Newren <newren@gmail.com> writes:

> The raw diff format for merges with -c or --cc will only list one
> filename, even if rename detection is active and a rename was detected
> for the given path.  Examples:
>
>   ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
>   ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
>   ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c
>
> This doesn't let us know what the original name of bar.sh was in the
> first parent, and doesn't let us know what either of the original
> names of phooey.c were in either of the parents.  In contrast, for
> non-merge commits, raw format does provide original filenames (and a
> rename score to boot).  In order to also provide original filenames
> for merge commits, add a --combined-with-paths option (which is only
> useful in conjunction with -c, --raw, and -M and thus implies all
> those options) so that we can print tab-separated filenames when
> renames are involved.  This transforms the above output to:
>
>   ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
>   ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh
>   ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c

I admit that I designed the original without too much thought.
Perhaps we should have avoided discarding info, but it is way too
late to fix with a default behaviour change.

I am not sure if it is easy for consumers to guess which name on the
output line corresponds to which input tree from the status letter,
though.  Would it make it easier for consumers if this showed names
in all input trees if any of them is different from the name in the
resulting tree, I wonder?  Even in that case, the consumer must know
some rule like "if R or C appears in the status column, then we have
N preimage names plus the name in the result for N-way merge", so it
may not be too bad to force them to know "for each of R or C in the
status column, the name in the preimage tree is emitted, and the
last name is the name in the result".  I dunno.

> +For `-c` and `--cc`, only the destination or final path is shown even
> +if the file was renamed on any side of history.  With
> +`--combined-with-paths`, the number of paths printed will be one more
> +than the number of 'R' characters in the concatenated status.  For
> +each 'R' in the concatenated status characters, the original pathname
> +on that side of history will be shown, and the final path shown on the
> +line will be the path used in the merge.

Is it safe for readers to pay attention to only 'R'?  Will it stay
forever that way?  My immediate worry is 'C', but there might be
other cases that original and result have different names.

> +--combined-with-paths::
> +	This flag is similar to -c, but modifies the raw output format for
> +	merges to also show the original paths when renames are found.
> +	Implies	-c, -M, and --raw.

So, --cc -p is not allowed to use this?  I was wondering if we want
to have a separate "even though traditionally we did not show
preimage names in combined output, this option tells Git to do so,
regardless of output format used, as long as 'combine-diff' is in
effect".

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-25 17:52     ` Elijah Newren
@ 2019-01-25 19:51       ` Eric Sunshine
  2019-01-26 22:07         ` Elijah Newren
  2019-01-27  1:52       ` brian m. carlson
  1 sibling, 1 reply; 39+ messages in thread
From: Eric Sunshine @ 2019-01-25 19:51 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Derrick Stolee, Git Mailing List, Jeff King, brian m . carlson

On Fri, Jan 25, 2019 at 12:52 PM Elijah Newren <newren@gmail.com> wrote:
> On Fri, Jan 25, 2019 at 9:41 AM Derrick Stolee <stolee@gmail.com> wrote:
> > On 1/25/2019 11:54 AM, Elijah Newren wrote:
> > > +     printf "0f9645804ebb04cc3eef91f799eb7fb54d70cefb\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
> >
> > I'm guessing that you use printf here because the
> > 'cat <<-\EOF' approach doesn't work with the special
> > tabs? Kudos for putting in the extra effort here for
> > the special formatting!
>
> Yeah, I didn't know how to easily get NUL bytes in the stream without
> printf, and once I was using printf the EOF HEREDOC no longer had a
> useful purpose.  In the first testcase, since there were only
> printable characters in the expected output, a HEREDOC worked well.  I
> guess I could have just used printf for both testcases, but having the
> literal output shown where it's possible for a human to read it seemed
> like an advantage worth capitalizing on.

If the readability of a here-doc is preferred, you should be able to
achieve the desired result with the q_to_tab() and lf_to_nul()
functions. For instance:

    q_to_tab <<-\EOF | lf_to_nul >expect &&
    ...Q...Q...
    EOF

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-25 19:29 ` [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames Junio C Hamano
@ 2019-01-25 20:04   ` Elijah Newren
  2019-01-25 22:21     ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Elijah Newren @ 2019-01-25 20:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Jeff King

On Fri, Jan 25, 2019 at 11:29 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > The raw diff format for merges with -c or --cc will only list one
> > filename, even if rename detection is active and a rename was detected
> > for the given path.  Examples:
> >
> >   ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM   describe.c
> >   ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM   bar.sh
> >   ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR   phooey.c
> >
> > This doesn't let us know what the original name of bar.sh was in the
> > first parent, and doesn't let us know what either of the original
> > names of phooey.c were in either of the parents.  In contrast, for
> > non-merge commits, raw format does provide original filenames (and a
> > rename score to boot).  In order to also provide original filenames
> > for merge commits, add a --combined-with-paths option (which is only
> > useful in conjunction with -c, --raw, and -M and thus implies all
> > those options) so that we can print tab-separated filenames when
> > renames are involved.  This transforms the above output to:
> >
> >   ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM   describe.c
> >   ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM   foo.sh  bar.sh
> >   ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR   fooey.c fuey.c  phooey.c
>
> I admit that I designed the original without too much thought.
> Perhaps we should have avoided discarding info, but it is way too
> late to fix with a default behaviour change.
>
> I am not sure if it is easy for consumers to guess which name on the
> output line corresponds to which input tree from the status letter,
> though.  Would it make it easier for consumers if this showed names
> in all input trees if any of them is different from the name in the
> resulting tree, I wonder?  Even in that case, the consumer must know
> some rule like "if R or C appears in the status column, then we have
> N preimage names plus the name in the result for N-way merge", so it
> may not be too bad to force them to know "for each of R or C in the
> status column, the name in the preimage tree is emitted, and the
> last name is the name in the result".  I dunno.

I wasn't able to guess the other fields in the raw format ("which of
the three blob shas is the final one and which ones were for parents")
without going and reading the diff-format.txt documentation.  Unless
we always list all names on all lines (even e.g. for 'MM' changes
which will list the same filename three times), we have a more
complicated case where people will have to refer to that
documentation.  I hope the extra paragraphs and examples I added there
are sufficient to spell it out.

Also, my first version of the patch actually showed all names, on all
lines, but I found the heavy repetition really annoying, and not in
keeping with how non-merge commits are handled (where original
filenames are only shown when they differ).  Granted, my change isn't
the only one.  We could just have all names shown if they are not all
identical, as you suggest and I also considered, but I liked this way
slightly better.  If others feel strongly, I can change it, that was
just my gut feel and preference.

> > +For `-c` and `--cc`, only the destination or final path is shown even
> > +if the file was renamed on any side of history.  With
> > +`--combined-with-paths`, the number of paths printed will be one more
> > +than the number of 'R' characters in the concatenated status.  For
> > +each 'R' in the concatenated status characters, the original pathname
> > +on that side of history will be shown, and the final path shown on the
> > +line will be the path used in the merge.
>
> Is it safe for readers to pay attention to only 'R'?  Will it stay
> forever that way?  My immediate worry is 'C', but there might be
> other cases that original and result have different names.

Oops, yeah, it should also handle 'C' and be worded so that if any
future change type comes along involving different names then it'd be
included.

> > +--combined-with-paths::
> > +     This flag is similar to -c, but modifies the raw output format for
> > +     merges to also show the original paths when renames are found.
> > +     Implies -c, -M, and --raw.
>
> So, --cc -p is not allowed to use this?  I was wondering if we want
> to have a separate "even though traditionally we did not show
> preimage names in combined output, this option tells Git to do so,
> regardless of output format used, as long as 'combine-diff' is in
> effect".

You could kind of ask the same question of -c -p, actually.  I looked
into that, but I was only interested in raw format output and --cc is
only about coalescing uninteresting hunks in patches.  Whenever git
shows a combined diff in patch format, it always lists two files in
the header, e.g.:
  a/foo.c
  b/foo.c
perhaps because people have a built-in expectation that a diff has to
involve exactly two files.  I wasn't sure how hard-baked in that
assumption is, but as I was only interested in raw format I didn't
mess with it.  We'd have to switch to showing three or more if we want
this to be relevant to such a mode.  Would that make sense for users
to show in a patch?  I guess the "combined" patch is already kind of
special, so it could make sense, but it kind of feels like a follow-on
feature for someone else to implement...unless leaving it out now
somehow boxes us in.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-25 20:04   ` Elijah Newren
@ 2019-01-25 22:21     ` Junio C Hamano
  2019-01-26 22:12       ` Elijah Newren
  0 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2019-01-25 22:21 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List, Jeff King

Elijah Newren <newren@gmail.com> writes:

> Also, my first version of the patch actually showed all names, on all
> lines, but I found the heavy repetition really annoying, and not in
> keeping with how non-merge commits are handled (where original
> filenames are only shown when they differ).  Granted, my change isn't
> the only one.  We could just have all names shown if they are not all
> identical, as you suggest and I also considered, but I liked this way
> slightly better.  If others feel strongly, I can change it, that was
> just my gut feel and preference.

I do not have a strong _preference_; I am only worried about ease of
use by consumers of the raw output, i.e. scripts.  From that point
of view, --all-names showing all names always even when there is no
rename is involved would certainly be the easiest to write a parser
for, and lines being overly repetitive is a non-issue.  For human
consumption, it would be a separate matter, but they won't be
reading from the --raw output that lists blob object names
repetitively (the object names may or may not be different, but they
are all equally gibberish) before showing the human-readble part,
which are names.

> You could kind of ask the same question of -c -p, actually.  I looked
> into that, but I was only interested in raw format output and --cc is
> only about coalescing uninteresting hunks in patches.  Whenever git
> shows a combined diff in patch format, it always lists two files in
> the header, e.g.:
>   a/foo.c
>   b/foo.c

Yeah, but with a new option, it does not have to stay that way.  It
is OK to show a/foo.c, b/bar.c, c/baz.c, or perhaps an easy way to
avoid overlong unreadable lines in these human readable output may
be to only show b/foo.c to report the result and list the original
names using the "rename from" "rename to" extended diff-headers
shown in normal diff output with rename detection enabled.

> ...  I guess the "combined" patch is already kind of
> special, so it could make sense, ...

Absolutely.  -c/--cc output with -p departs from the usual "diff is
to compare two things" by having two +/- indicator columns and more
than two '@'s in hunk header to begin with.  It is only natural if
we are showing more/all names in --raw, we should avoid losing info
in the same way.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-25 19:51       ` Eric Sunshine
@ 2019-01-26 22:07         ` Elijah Newren
  0 siblings, 0 replies; 39+ messages in thread
From: Elijah Newren @ 2019-01-26 22:07 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Derrick Stolee, Git Mailing List, Jeff King, brian m . carlson

On Fri, Jan 25, 2019 at 11:51 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Fri, Jan 25, 2019 at 12:52 PM Elijah Newren <newren@gmail.com> wrote:
> > On Fri, Jan 25, 2019 at 9:41 AM Derrick Stolee <stolee@gmail.com> wrote:
> > > On 1/25/2019 11:54 AM, Elijah Newren wrote:
> > > > +     printf "0f9645804ebb04cc3eef91f799eb7fb54d70cefb\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
> > >
> > > I'm guessing that you use printf here because the
> > > 'cat <<-\EOF' approach doesn't work with the special
> > > tabs? Kudos for putting in the extra effort here for
> > > the special formatting!
> >
> > Yeah, I didn't know how to easily get NUL bytes in the stream without
> > printf, and once I was using printf the EOF HEREDOC no longer had a
> > useful purpose.  In the first testcase, since there were only
> > printable characters in the expected output, a HEREDOC worked well.  I
> > guess I could have just used printf for both testcases, but having the
> > literal output shown where it's possible for a human to read it seemed
> > like an advantage worth capitalizing on.
>
> If the readability of a here-doc is preferred, you should be able to
> achieve the desired result with the q_to_tab() and lf_to_nul()
> functions. For instance:
>
>     q_to_tab <<-\EOF | lf_to_nul >expect &&
>     ...Q...Q...
>     EOF

I like the idea and I tried it out...but it actually looks harder to read to me:

diff --git a/t/t4038-diff-combined.sh b/t/t4038-diff-combined.sh
index 5bccc323f6..ddbd27d16a 100755
--- a/t/t4038-diff-combined.sh
+++ b/t/t4038-diff-combined.sh
@@ -457,8 +457,8 @@ test_expect_success 'setup for --combined-with-paths' '
 '

 test_expect_success '--combined-all-names and --raw' '
- cat <<-\EOF >expect &&
- ::100644 100644 100644 f00c965d8307308469e537302baa73048488f162
088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7
333b9c62519f285e1854830ade0fe1ef1d40ee1b RR "file\twith\ttabs"
"i\tam\ttabbed" "fickle\tnaming"
+ q_to_tab <<-\EOF >expect &&
+ ::100644 100644 100644 f00c965d8307308469e537302baa73048488f162
088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7
333b9c62519f285e1854830ade0fe1ef1d40ee1b
RRQ"file\twith\ttabs"Q"i\tam\ttabbed"Q"fickle\tnaming"
  EOF
  git diff-tree -c -M --raw --combined-all-names HEAD >actual.tmp &&
  sed 1d <actual.tmp >actual &&
@@ -466,7 +466,13 @@ test_expect_success '--combined-all-names and --raw' '
 '

 test_expect_success '--combined-all-names and --raw -and -z' '
- printf "0f9645804ebb04cc3eef91f799eb7fb54d70cefb\0::100644 100644
100644 f00c965d8307308469e537302baa73048488f162
088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7
333b9c62519f285e1854830ade0fe1ef1d40ee1b
RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
+ q_to_tab <<-\EOF | lf_to_nul >expect &&
+ 0f9645804ebb04cc3eef91f799eb7fb54d70cefb
+ ::100644 100644 100644 f00c965d8307308469e537302baa73048488f162
088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7
333b9c62519f285e1854830ade0fe1ef1d40ee1b RR
+ fileQwithQtabs
+ iQamQtabbed
+ fickleQnaming
+ EOF
  git diff-tree -c -M --raw --combined-all-names -z HEAD >actual &&
  test_cmp -a expect actual
 '

So I'm going to stick with the original.  Thanks for pointing out
q_to_tab and lf_to_nul, though; they may come in handy in the future.

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-25 22:21     ` Junio C Hamano
@ 2019-01-26 22:12       ` Elijah Newren
  2019-01-28  0:19         ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Elijah Newren @ 2019-01-26 22:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Jeff King

On Fri, Jan 25, 2019 at 2:21 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > Also, my first version of the patch actually showed all names, on all
> > lines, but I found the heavy repetition really annoying, and not in
> > keeping with how non-merge commits are handled (where original
> > filenames are only shown when they differ).  Granted, my change isn't
> > the only one.  We could just have all names shown if they are not all
> > identical, as you suggest and I also considered, but I liked this way
> > slightly better.  If others feel strongly, I can change it, that was
> > just my gut feel and preference.
>
> I do not have a strong _preference_; I am only worried about ease of
> use by consumers of the raw output, i.e. scripts.  From that point
> of view, --all-names showing all names always even when there is no
> rename is involved would certainly be the easiest to write a parser
> for, and lines being overly repetitive is a non-issue.  For human
> consumption, it would be a separate matter, but they won't be
> reading from the --raw output that lists blob object names
> repetitively (the object names may or may not be different, but they
> are all equally gibberish) before showing the human-readble part,
> which are names.

Okay, I've switched it over.

> > You could kind of ask the same question of -c -p, actually.  I looked
> > into that, but I was only interested in raw format output and --cc is
> > only about coalescing uninteresting hunks in patches.  Whenever git
> > shows a combined diff in patch format, it always lists two files in
> > the header, e.g.:
> >   a/foo.c
> >   b/foo.c
>
> Yeah, but with a new option, it does not have to stay that way.  It
> is OK to show a/foo.c, b/bar.c, c/baz.c, or perhaps an easy way to
> avoid overlong unreadable lines in these human readable output may
> be to only show b/foo.c to report the result and list the original
> names using the "rename from" "rename to" extended diff-headers
> shown in normal diff output with rename detection enabled.
>
> > ...  I guess the "combined" patch is already kind of
> > special, so it could make sense, ...
>
> Absolutely.  -c/--cc output with -p departs from the usual "diff is
> to compare two things" by having two +/- indicator columns and more
> than two '@'s in hunk header to begin with.  It is only natural if
> we are showing more/all names in --raw, we should avoid losing info
> in the same way.

This turned out to be easier to implement than I expected, with one
small wrinkle: We have the "a/" and "b/" prefixes, and switching to
"a/", "b/" and "c/" doesn't work due to things like the
diff.mnemonicPrefix setting -- plus I didn't want to plumb multiple
prefix options everywhere they'd be needed.  Instead, I decided that
instead of having a "from" and a "to" header, we would have N "from"
headers (since all parents are what we came from) and 1 "to" header.
I hope that makes sense, because otherwise I think this become to
unwieldy of a change.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v3] log,diff-tree: add --combined-all-names option
  2019-01-25 16:54 ` [PATCH v2] " Elijah Newren
  2019-01-25 17:40   ` Derrick Stolee
@ 2019-01-26 22:18   ` Elijah Newren
  2019-02-04 20:07     ` [PATCH v4] " Elijah Newren
  1 sibling, 1 reply; 39+ messages in thread
From: Elijah Newren @ 2019-01-26 22:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Eric Sunshine, brian m . carlson,
	Derrick Stolee, Elijah Newren

The combined diff format for merges will only list one filename, even if
rename or copy detection is active.  For example, with raw format one
might see:

  ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
  ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
  ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c

This doesn't let us know what the original name of bar.sh was in the
first parent, and doesn't let us know what either of the original names
of phooey.c were in either of the parents.  In contrast, for non-merge
commits, raw format does provide original filenames (and a rename score
to boot).  In order to also provide original filenames for merge
commits, add a --combined-all-names option (which must be used with
either -c or --cc, and is likely only useful with rename or copy
detection active) so that we can print tab-separated filenames when
renames are involved.  This transforms the above output to:

  ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c	desc.c	desc.c
  ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh	bar.sh
  ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c

Further, in patch format, this changes the from/to headers so that
instead of just having one "from" header, we get one for each parent.
For example, instead of having

  --- a/phooey.c
  +++ b/phooey.c

we would see

  --- a/fooey.c
  --- a/fuey.c
  +++ b/phooey.c

Signed-off-by: Elijah Newren <newren@gmail.com>
---
Changes since v2:
  * Renamed --combined-with-paths to --combined-all-names
  * No longer implies -c, --raw, or -M
  * Works with either -c or --cc, with either --raw or -p, and with
    either -M or -C
  * Always shows the name in each parent even if it doesn't differ
  * additional testcase checking patch format header with --cc

 Documentation/diff-format.txt         | 20 +++++--
 Documentation/diff-generate-patch.txt | 13 +++++
 Documentation/git-diff-tree.txt       | 11 +++-
 Documentation/rev-list-options.txt    |  7 +++
 builtin/diff-tree.c                   |  6 ++-
 combine-diff.c                        | 76 +++++++++++++++++++++++----
 diff.h                                |  1 +
 revision.c                            |  6 +++
 revision.h                            |  1 +
 t/t4038-diff-combined.sh              | 47 +++++++++++++++++
 10 files changed, 171 insertions(+), 17 deletions(-)

diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
index cdcc17f0ad..8fe800cd9b 100644
--- a/Documentation/diff-format.txt
+++ b/Documentation/diff-format.txt
@@ -95,12 +95,26 @@ from the format described above in the following way:
 . there are more "src" modes and "src" sha1
 . status is concatenated status characters for each parent
 . no optional "score" number
-. single path, only for "dst"
+. tab-separated pathname(s) of the file
 
-Example:
+For `-c` and `--cc`, only the destination or final path is shown even
+if the file was renamed on any side of history.  With
+`--combined-all-names`, the name of the path in each parent is shown
+followed by the name of the path in the merge commit.
+
+Examples for `-c` and `-cc` without `--combined-all-names`:
+------------------------------------------------
+::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c
+::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
+::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c
+------------------------------------------------
+
+Examples when `--combined-all-names` added to either `-c` or `--cc`:
 
 ------------------------------------------------
-::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
+::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c	desc.c	desc.c
+::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh	bar.sh
+::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c
 ------------------------------------------------
 
 Note that 'combined diff' lists only files which were modified from
diff --git a/Documentation/diff-generate-patch.txt b/Documentation/diff-generate-patch.txt
index 231105cff4..69cb3b0349 100644
--- a/Documentation/diff-generate-patch.txt
+++ b/Documentation/diff-generate-patch.txt
@@ -144,6 +144,19 @@ Similar to two-line header for traditional 'unified' diff
 format, `/dev/null` is used to signal created or deleted
 files.
 
+However, if the --combined-all-paths option is provided, instead of a
+two-line from-file/to-file you get a N+1 line from-file/to-file header,
+where N is the number of parents in the merge commit
+
+       --- a/file
+       --- a/file
+       --- a/file
+       +++ b/file
++
+This extended format can be useful if rename or copy detection is
+active, to allow you to see the original name of the file in different
+parents.
+
 4.   Chunk header format is modified to prevent people from
      accidentally feeding it to `patch -p1`. Combined diff format
      was created for review of merge commit changes, and was not
diff --git a/Documentation/git-diff-tree.txt b/Documentation/git-diff-tree.txt
index 2319b2b192..7b6dcbd7a8 100644
--- a/Documentation/git-diff-tree.txt
+++ b/Documentation/git-diff-tree.txt
@@ -10,8 +10,8 @@ SYNOPSIS
 --------
 [verse]
 'git diff-tree' [--stdin] [-m] [-s] [-v] [--no-commit-id] [--pretty]
-	      [-t] [-r] [-c | --cc] [--root] [<common diff options>]
-	      <tree-ish> [<tree-ish>] [<path>...]
+	      [-t] [-r] [-c | --cc] [--combined-all-names] [--root]
+	      [<common diff options>] <tree-ish> [<tree-ish>] [<path>...]
 
 DESCRIPTION
 -----------
@@ -108,6 +108,13 @@ include::pretty-options.txt[]
 	itself and the commit log message is not shown, just like in any other
 	"empty diff" case.
 
+--combined-all-names::
+	This flag causes combined diffs (used for merge commits) to
+	list the name of the file from all parents.  It thus only has
+	effect when -c or --cc are specified, and is likely only
+	useful if filename changes are detected (i.e. when either
+	rename or copy detection have been requested).
+
 --always::
 	Show the commit itself and the commit log message even
 	if the diff itself is empty.
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 98b538bc77..97a6651bae 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -948,6 +948,13 @@ options may be given. See linkgit:git-diff-files[1] for more options.
 	the parents have only two variants and the merge result picks
 	one of them without modification.
 
+--combined-all-names::
+	This flag causes combined diffs (used for merge commits) to
+	list the name of the file from all parents.  It thus only has
+	effect when -c or --cc are specified, and is likely only
+	useful if filename changes are detected (i.e. when either
+	rename or copy detection have been requested).
+
 -m::
 	This flag makes the merge commits show the full diff like
 	regular commits; for each merge parent, a separate log entry
diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
index ef996126d7..bbeebc833f 100644
--- a/builtin/diff-tree.c
+++ b/builtin/diff-tree.c
@@ -82,9 +82,13 @@ static int diff_tree_stdin(char *line)
 }
 
 static const char diff_tree_usage[] =
-"git diff-tree [--stdin] [-m] [-c] [--cc] [-s] [-v] [--pretty] [-t] [-r] [--root] "
+"git diff-tree [--stdin] [-m] [-c | --cc] [-s] [-v] [--pretty] [-t] [-r] [--root] "
 "[<common-diff-options>] <tree-ish> [<tree-ish>] [<path>...]\n"
 "  -r            diff recursively\n"
+"  -c            show combined diff for merge commits\n"
+"  --cc          show combined diff for merge commits removing uninteresting hunks\n"
+"  --combined-all-names\n"
+"                show name of file in all parents for combined diffs\n"
 "  --root        include the initial commit as diff against /dev/null\n"
 COMMON_DIFF_OPTIONS_HELP;
 
diff --git a/combine-diff.c b/combine-diff.c
index a143c00634..339daac02b 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -23,11 +23,20 @@ static int compare_paths(const struct combine_diff_path *one,
 				 two->path, strlen(two->path), two->mode);
 }
 
-static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr, int n, int num_parent)
+static int filename_changed(char status)
+{
+	return status == 'R' || status == 'C';
+}
+
+static struct combine_diff_path *intersect_paths(
+	struct combine_diff_path *curr,
+	int n,
+	int num_parent,
+	int combined_all_names)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct combine_diff_path *p, **tail = &curr;
-	int i, cmp;
+	int i, j, cmp;
 
 	if (!n) {
 		for (i = 0; i < q->nr; i++) {
@@ -50,6 +59,13 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 			oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid);
 			p->parent[n].mode = q->queue[i]->one->mode;
 			p->parent[n].status = q->queue[i]->status;
+
+			if (combined_all_names &&
+			    filename_changed(p->parent[n].status)) {
+				strbuf_init(&p->parent[n].path, 0);
+				strbuf_addstr(&p->parent[n].path,
+					      q->queue[i]->one->path);
+			}
 			*tail = p;
 			tail = &p->next;
 		}
@@ -68,6 +84,10 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 		if (cmp < 0) {
 			/* p->path not in q->queue[]; drop it */
 			*tail = p->next;
+			for (j = 0; j < num_parent; j++)
+				if (combined_all_names &&
+				    filename_changed(p->parent[j].status))
+					strbuf_release(&p->parent[j].path);
 			free(p);
 			continue;
 		}
@@ -81,6 +101,10 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 		oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid);
 		p->parent[n].mode = q->queue[i]->one->mode;
 		p->parent[n].status = q->queue[i]->status;
+		if (combined_all_names &&
+		    filename_changed(p->parent[n].status))
+			strbuf_addstr(&p->parent[n].path,
+				      q->queue[i]->one->path);
 
 		tail = &p->next;
 		i++;
@@ -960,12 +984,25 @@ static void show_combined_header(struct combine_diff_path *elem,
 	if (!show_file_header)
 		return;
 
-	if (added)
-		dump_quoted_path("--- ", "", "/dev/null",
-				 line_prefix, c_meta, c_reset);
-	else
-		dump_quoted_path("--- ", a_prefix, elem->path,
-				 line_prefix, c_meta, c_reset);
+	if (rev->combined_all_names) {
+		for (i = 0; i < num_parent; i++) {
+			char *path = filename_changed(elem->parent[i].status)
+				? elem->parent[i].path.buf : elem->path;
+			if (elem->parent[i].status == DIFF_STATUS_ADDED)
+				dump_quoted_path("--- ", "", "/dev/null",
+						 line_prefix, c_meta, c_reset);
+			else
+				dump_quoted_path("--- ", a_prefix, path,
+						 line_prefix, c_meta, c_reset);
+		}
+	} else {
+		if (added)
+			dump_quoted_path("--- ", "", "/dev/null",
+					 line_prefix, c_meta, c_reset);
+		else
+			dump_quoted_path("--- ", a_prefix, elem->path,
+					 line_prefix, c_meta, c_reset);
+	}
 	if (deleted)
 		dump_quoted_path("+++ ", "", "/dev/null",
 				 line_prefix, c_meta, c_reset);
@@ -1227,6 +1264,15 @@ static void show_raw_diff(struct combine_diff_path *p, int num_parent, struct re
 		putchar(inter_name_termination);
 	}
 
+	for (i = 0; i < num_parent; i++)
+		if (rev->combined_all_names) {
+			if (filename_changed(p->parent[i].status))
+				write_name_quoted(p->parent[i].path.buf, stdout,
+						  inter_name_termination);
+			else
+				write_name_quoted(p->path, stdout,
+						  inter_name_termination);
+		}
 	write_name_quoted(p->path, stdout, line_termination);
 }
 
@@ -1324,7 +1370,9 @@ static const char *path_path(void *obj)
 
 /* find set of paths that every parent touches */
 static struct combine_diff_path *find_paths_generic(const struct object_id *oid,
-	const struct oid_array *parents, struct diff_options *opt)
+	const struct oid_array *parents,
+	struct diff_options *opt,
+	int combined_all_names)
 {
 	struct combine_diff_path *paths = NULL;
 	int i, num_parent = parents->nr;
@@ -1350,7 +1398,8 @@ static struct combine_diff_path *find_paths_generic(const struct object_id *oid,
 			opt->output_format = DIFF_FORMAT_NO_OUTPUT;
 		diff_tree_oid(&parents->oid[i], oid, "", opt);
 		diffcore_std(opt);
-		paths = intersect_paths(paths, i, num_parent);
+		paths = intersect_paths(paths, i, num_parent,
+					combined_all_names);
 
 		/* if showing diff, show it in requested order */
 		if (opt->output_format != DIFF_FORMAT_NO_OUTPUT &&
@@ -1460,7 +1509,8 @@ void diff_tree_combined(const struct object_id *oid,
 		 * diff(sha1,parent_i) for all i to do the job, specifically
 		 * for parent0.
 		 */
-		paths = find_paths_generic(oid, parents, &diffopts);
+		paths = find_paths_generic(oid, parents, &diffopts,
+					   rev->combined_all_names);
 	}
 	else {
 		int stat_opt;
@@ -1535,6 +1585,10 @@ void diff_tree_combined(const struct object_id *oid,
 	while (paths) {
 		struct combine_diff_path *tmp = paths;
 		paths = paths->next;
+		for (i = 0; i < num_parent; i++)
+			if (rev->combined_all_names &&
+			    filename_changed(tmp->parent[i].status))
+				strbuf_release(&tmp->parent[i].path);
 		free(tmp);
 	}
 
diff --git a/diff.h b/diff.h
index b512d0477a..90ea0256a5 100644
--- a/diff.h
+++ b/diff.h
@@ -294,6 +294,7 @@ struct combine_diff_path {
 		char status;
 		unsigned int mode;
 		struct object_id oid;
+		struct strbuf path;
 	} parent[FLEX_ARRAY];
 };
 #define combine_diff_path_size(n, l) \
diff --git a/revision.c b/revision.c
index 13cfb59b38..a608884aad 100644
--- a/revision.c
+++ b/revision.c
@@ -1995,6 +1995,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->diff = 1;
 		revs->dense_combined_merges = 0;
 		revs->combine_merges = 1;
+	} else if (!strcmp(arg, "--combined-all-names")) {
+		revs->diff = 1;
+		revs->combined_all_names = 1;
 	} else if (!strcmp(arg, "--cc")) {
 		revs->diff = 1;
 		revs->dense_combined_merges = 1;
@@ -2491,6 +2494,9 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 	}
 	if (revs->combine_merges)
 		revs->ignore_merges = 0;
+	if (revs->combined_all_names && !revs->combine_merges)
+		die("--combined-all-names makes no sense without -c or --cc");
+
 	revs->diffopt.abbrev = revs->abbrev;
 
 	if (revs->line_level_traverse) {
diff --git a/revision.h b/revision.h
index 52e5a88ff5..074a584410 100644
--- a/revision.h
+++ b/revision.h
@@ -171,6 +171,7 @@ struct rev_info {
 			verbose_header:1,
 			ignore_merges:1,
 			combine_merges:1,
+			combined_all_names:1,
 			dense_combined_merges:1,
 			always_show_header:1;
 
diff --git a/t/t4038-diff-combined.sh b/t/t4038-diff-combined.sh
index e2824d3437..5bccc323f6 100755
--- a/t/t4038-diff-combined.sh
+++ b/t/t4038-diff-combined.sh
@@ -435,4 +435,51 @@ test_expect_success 'combine diff gets tree sorting right' '
 	test_cmp expect actual
 '
 
+test_expect_success 'setup for --combined-with-paths' '
+	git branch side1c &&
+	git branch side2c &&
+	git checkout side1c &&
+	test_seq 1 10 >$(printf "file\twith\ttabs") &&
+	git add file* &&
+	git commit -m with &&
+	git checkout side2c &&
+	test_seq 1 9 >$(printf "i\tam\ttabbed") &&
+	echo ten >>$(printf "i\tam\ttabbed") &&
+	git add *tabbed &&
+	git commit -m iam &&
+	git checkout -b mergery side1c &&
+	git merge --no-commit side2c &&
+	git rm *tabs &&
+	echo eleven >>$(printf "i\tam\ttabbed") &&
+	git mv "$(printf "i\tam\ttabbed")" "$(printf "fickle\tnaming")" &&
+	git add fickle* &&
+	git commit
+'
+
+test_expect_success '--combined-all-names and --raw' '
+	cat <<-\EOF >expect &&
+	::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR	"file\twith\ttabs"	"i\tam\ttabbed"	"fickle\tnaming"
+	EOF
+	git diff-tree -c -M --raw --combined-all-names HEAD >actual.tmp &&
+	sed 1d <actual.tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success '--combined-all-names and --raw -and -z' '
+	printf "0f9645804ebb04cc3eef91f799eb7fb54d70cefb\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
+	git diff-tree -c -M --raw --combined-all-names -z HEAD >actual &&
+	test_cmp -a expect actual
+'
+
+test_expect_success '--combined-all-names and --cc' '
+	cat <<-\EOF >expect &&
+	--- "a/file\twith\ttabs"
+	--- "a/i\tam\ttabbed"
+	+++ "b/fickle\tnaming"
+	EOF
+	git diff-tree --cc -M --combined-all-names HEAD >actual.tmp &&
+	grep ^[-+][-+][-+] <actual.tmp >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.20.1.310.gf2549f3af6


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v2] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-25 17:52     ` Elijah Newren
  2019-01-25 19:51       ` Eric Sunshine
@ 2019-01-27  1:52       ` brian m. carlson
  1 sibling, 0 replies; 39+ messages in thread
From: brian m. carlson @ 2019-01-27  1:52 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Derrick Stolee, Git Mailing List, Jeff King

[-- Attachment #1: Type: text/plain, Size: 1717 bytes --]

On Fri, Jan 25, 2019 at 09:52:07AM -0800, Elijah Newren wrote:
> On Fri, Jan 25, 2019 at 9:41 AM Derrick Stolee <stolee@gmail.com> wrote:
> >
> > On 1/25/2019 11:54 AM, Elijah Newren wrote:
> > > +test_expect_success '--combined-with-paths works with -z as well' '
> > > +     printf "0f9645804ebb04cc3eef91f799eb7fb54d70cefb\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
> >
> > I'm guessing that you use printf here because the
> > 'cat <<-\EOF' approach doesn't work with the special
> > tabs? Kudos for putting in the extra effort here for
> > the special formatting!
> 
> Yeah, I didn't know how to easily get NUL bytes in the stream without
> printf, and once I was using printf the EOF HEREDOC no longer had a
> useful purpose.  In the first testcase, since there were only
> printable characters in the expected output, a HEREDOC worked well.  I
> guess I could have just used printf for both testcases, but having the
> literal output shown where it's possible for a human to read it seemed
> like an advantage worth capitalizing on.  If anyone feels strongly
> that I should use printf for both tests, I can switch them though.

I'm fine with the patch as it stands, but just for future reference (for
you and anyone else), we have functions in the test library called
nul_to_q and q_to_nul that convert NUL back and forth to the letter "Q".
They can make it a bit easier to write tests where we expect some format
to have NULs in it.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames
  2019-01-26 22:12       ` Elijah Newren
@ 2019-01-28  0:19         ` Junio C Hamano
  0 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2019-01-28  0:19 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List, Jeff King

Elijah Newren <newren@gmail.com> writes:

> ... Instead, I decided that
> instead of having a "from" and a "to" header, we would have N "from"
> headers (since all parents are what we came from) and 1 "to" header.
> I hope that makes sense, because otherwise I think this become to
> unwieldy of a change.

Yeah, it makes sense for an N-way merge to have N 'from' pathnames
that resulted in a single 'to' pathname in the resulting tree.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-01-26 22:18   ` [PATCH v3] log,diff-tree: add --combined-all-names option Elijah Newren
@ 2019-02-04 20:07     ` Elijah Newren
  2019-02-04 21:20       ` Junio C Hamano
                         ` (3 more replies)
  0 siblings, 4 replies; 39+ messages in thread
From: Elijah Newren @ 2019-02-04 20:07 UTC (permalink / raw)
  To: gitster; +Cc: git, Jeff King, brian m . carlson, Derrick Stolee, Elijah Newren

The combined diff format for merges will only list one filename, even if
rename or copy detection is active.  For example, with raw format one
might see:

  ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
  ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
  ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c

This doesn't let us know what the original name of bar.sh was in the
first parent, and doesn't let us know what either of the original names
of phooey.c were in either of the parents.  In contrast, for non-merge
commits, raw format does provide original filenames (and a rename score
to boot).  In order to also provide original filenames for merge
commits, add a --combined-all-names option (which must be used with
either -c or --cc, and is likely only useful with rename or copy
detection active) so that we can print tab-separated filenames when
renames are involved.  This transforms the above output to:

  ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c	desc.c	desc.c
  ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh	bar.sh
  ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c

Further, in patch format, this changes the from/to headers so that
instead of just having one "from" header, we get one for each parent.
For example, instead of having

  --- a/phooey.c
  +++ b/phooey.c

we would see

  --- a/fooey.c
  --- a/fuey.c
  +++ b/phooey.c

Signed-off-by: Elijah Newren <newren@gmail.com>
---

Changes since v3:
  * None, just sending out to maintainer with list cc'ed as per
    SubmittingPatches; v3 had time for review and Junio elsewhere in the
    thread said the strategy of having multiple "from" versions makes
    sense.

 Documentation/diff-format.txt         | 20 +++++--
 Documentation/diff-generate-patch.txt | 13 +++++
 Documentation/git-diff-tree.txt       | 11 +++-
 Documentation/rev-list-options.txt    |  7 +++
 builtin/diff-tree.c                   |  6 ++-
 combine-diff.c                        | 76 +++++++++++++++++++++++----
 diff.h                                |  1 +
 revision.c                            |  6 +++
 revision.h                            |  1 +
 t/t4038-diff-combined.sh              | 47 +++++++++++++++++
 10 files changed, 171 insertions(+), 17 deletions(-)

diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
index cdcc17f0ad..8fe800cd9b 100644
--- a/Documentation/diff-format.txt
+++ b/Documentation/diff-format.txt
@@ -95,12 +95,26 @@ from the format described above in the following way:
 . there are more "src" modes and "src" sha1
 . status is concatenated status characters for each parent
 . no optional "score" number
-. single path, only for "dst"
+. tab-separated pathname(s) of the file
 
-Example:
+For `-c` and `--cc`, only the destination or final path is shown even
+if the file was renamed on any side of history.  With
+`--combined-all-names`, the name of the path in each parent is shown
+followed by the name of the path in the merge commit.
+
+Examples for `-c` and `-cc` without `--combined-all-names`:
+------------------------------------------------
+::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c
+::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
+::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c
+------------------------------------------------
+
+Examples when `--combined-all-names` added to either `-c` or `--cc`:
 
 ------------------------------------------------
-::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
+::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c	desc.c	desc.c
+::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh	bar.sh
+::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c
 ------------------------------------------------
 
 Note that 'combined diff' lists only files which were modified from
diff --git a/Documentation/diff-generate-patch.txt b/Documentation/diff-generate-patch.txt
index 231105cff4..69cb3b0349 100644
--- a/Documentation/diff-generate-patch.txt
+++ b/Documentation/diff-generate-patch.txt
@@ -144,6 +144,19 @@ Similar to two-line header for traditional 'unified' diff
 format, `/dev/null` is used to signal created or deleted
 files.
 
+However, if the --combined-all-paths option is provided, instead of a
+two-line from-file/to-file you get a N+1 line from-file/to-file header,
+where N is the number of parents in the merge commit
+
+       --- a/file
+       --- a/file
+       --- a/file
+       +++ b/file
++
+This extended format can be useful if rename or copy detection is
+active, to allow you to see the original name of the file in different
+parents.
+
 4.   Chunk header format is modified to prevent people from
      accidentally feeding it to `patch -p1`. Combined diff format
      was created for review of merge commit changes, and was not
diff --git a/Documentation/git-diff-tree.txt b/Documentation/git-diff-tree.txt
index 2319b2b192..7b6dcbd7a8 100644
--- a/Documentation/git-diff-tree.txt
+++ b/Documentation/git-diff-tree.txt
@@ -10,8 +10,8 @@ SYNOPSIS
 --------
 [verse]
 'git diff-tree' [--stdin] [-m] [-s] [-v] [--no-commit-id] [--pretty]
-	      [-t] [-r] [-c | --cc] [--root] [<common diff options>]
-	      <tree-ish> [<tree-ish>] [<path>...]
+	      [-t] [-r] [-c | --cc] [--combined-all-names] [--root]
+	      [<common diff options>] <tree-ish> [<tree-ish>] [<path>...]
 
 DESCRIPTION
 -----------
@@ -108,6 +108,13 @@ include::pretty-options.txt[]
 	itself and the commit log message is not shown, just like in any other
 	"empty diff" case.
 
+--combined-all-names::
+	This flag causes combined diffs (used for merge commits) to
+	list the name of the file from all parents.  It thus only has
+	effect when -c or --cc are specified, and is likely only
+	useful if filename changes are detected (i.e. when either
+	rename or copy detection have been requested).
+
 --always::
 	Show the commit itself and the commit log message even
 	if the diff itself is empty.
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 98b538bc77..97a6651bae 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -948,6 +948,13 @@ options may be given. See linkgit:git-diff-files[1] for more options.
 	the parents have only two variants and the merge result picks
 	one of them without modification.
 
+--combined-all-names::
+	This flag causes combined diffs (used for merge commits) to
+	list the name of the file from all parents.  It thus only has
+	effect when -c or --cc are specified, and is likely only
+	useful if filename changes are detected (i.e. when either
+	rename or copy detection have been requested).
+
 -m::
 	This flag makes the merge commits show the full diff like
 	regular commits; for each merge parent, a separate log entry
diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
index ef996126d7..bbeebc833f 100644
--- a/builtin/diff-tree.c
+++ b/builtin/diff-tree.c
@@ -82,9 +82,13 @@ static int diff_tree_stdin(char *line)
 }
 
 static const char diff_tree_usage[] =
-"git diff-tree [--stdin] [-m] [-c] [--cc] [-s] [-v] [--pretty] [-t] [-r] [--root] "
+"git diff-tree [--stdin] [-m] [-c | --cc] [-s] [-v] [--pretty] [-t] [-r] [--root] "
 "[<common-diff-options>] <tree-ish> [<tree-ish>] [<path>...]\n"
 "  -r            diff recursively\n"
+"  -c            show combined diff for merge commits\n"
+"  --cc          show combined diff for merge commits removing uninteresting hunks\n"
+"  --combined-all-names\n"
+"                show name of file in all parents for combined diffs\n"
 "  --root        include the initial commit as diff against /dev/null\n"
 COMMON_DIFF_OPTIONS_HELP;
 
diff --git a/combine-diff.c b/combine-diff.c
index a143c00634..339daac02b 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -23,11 +23,20 @@ static int compare_paths(const struct combine_diff_path *one,
 				 two->path, strlen(two->path), two->mode);
 }
 
-static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr, int n, int num_parent)
+static int filename_changed(char status)
+{
+	return status == 'R' || status == 'C';
+}
+
+static struct combine_diff_path *intersect_paths(
+	struct combine_diff_path *curr,
+	int n,
+	int num_parent,
+	int combined_all_names)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct combine_diff_path *p, **tail = &curr;
-	int i, cmp;
+	int i, j, cmp;
 
 	if (!n) {
 		for (i = 0; i < q->nr; i++) {
@@ -50,6 +59,13 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 			oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid);
 			p->parent[n].mode = q->queue[i]->one->mode;
 			p->parent[n].status = q->queue[i]->status;
+
+			if (combined_all_names &&
+			    filename_changed(p->parent[n].status)) {
+				strbuf_init(&p->parent[n].path, 0);
+				strbuf_addstr(&p->parent[n].path,
+					      q->queue[i]->one->path);
+			}
 			*tail = p;
 			tail = &p->next;
 		}
@@ -68,6 +84,10 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 		if (cmp < 0) {
 			/* p->path not in q->queue[]; drop it */
 			*tail = p->next;
+			for (j = 0; j < num_parent; j++)
+				if (combined_all_names &&
+				    filename_changed(p->parent[j].status))
+					strbuf_release(&p->parent[j].path);
 			free(p);
 			continue;
 		}
@@ -81,6 +101,10 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 		oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid);
 		p->parent[n].mode = q->queue[i]->one->mode;
 		p->parent[n].status = q->queue[i]->status;
+		if (combined_all_names &&
+		    filename_changed(p->parent[n].status))
+			strbuf_addstr(&p->parent[n].path,
+				      q->queue[i]->one->path);
 
 		tail = &p->next;
 		i++;
@@ -960,12 +984,25 @@ static void show_combined_header(struct combine_diff_path *elem,
 	if (!show_file_header)
 		return;
 
-	if (added)
-		dump_quoted_path("--- ", "", "/dev/null",
-				 line_prefix, c_meta, c_reset);
-	else
-		dump_quoted_path("--- ", a_prefix, elem->path,
-				 line_prefix, c_meta, c_reset);
+	if (rev->combined_all_names) {
+		for (i = 0; i < num_parent; i++) {
+			char *path = filename_changed(elem->parent[i].status)
+				? elem->parent[i].path.buf : elem->path;
+			if (elem->parent[i].status == DIFF_STATUS_ADDED)
+				dump_quoted_path("--- ", "", "/dev/null",
+						 line_prefix, c_meta, c_reset);
+			else
+				dump_quoted_path("--- ", a_prefix, path,
+						 line_prefix, c_meta, c_reset);
+		}
+	} else {
+		if (added)
+			dump_quoted_path("--- ", "", "/dev/null",
+					 line_prefix, c_meta, c_reset);
+		else
+			dump_quoted_path("--- ", a_prefix, elem->path,
+					 line_prefix, c_meta, c_reset);
+	}
 	if (deleted)
 		dump_quoted_path("+++ ", "", "/dev/null",
 				 line_prefix, c_meta, c_reset);
@@ -1227,6 +1264,15 @@ static void show_raw_diff(struct combine_diff_path *p, int num_parent, struct re
 		putchar(inter_name_termination);
 	}
 
+	for (i = 0; i < num_parent; i++)
+		if (rev->combined_all_names) {
+			if (filename_changed(p->parent[i].status))
+				write_name_quoted(p->parent[i].path.buf, stdout,
+						  inter_name_termination);
+			else
+				write_name_quoted(p->path, stdout,
+						  inter_name_termination);
+		}
 	write_name_quoted(p->path, stdout, line_termination);
 }
 
@@ -1324,7 +1370,9 @@ static const char *path_path(void *obj)
 
 /* find set of paths that every parent touches */
 static struct combine_diff_path *find_paths_generic(const struct object_id *oid,
-	const struct oid_array *parents, struct diff_options *opt)
+	const struct oid_array *parents,
+	struct diff_options *opt,
+	int combined_all_names)
 {
 	struct combine_diff_path *paths = NULL;
 	int i, num_parent = parents->nr;
@@ -1350,7 +1398,8 @@ static struct combine_diff_path *find_paths_generic(const struct object_id *oid,
 			opt->output_format = DIFF_FORMAT_NO_OUTPUT;
 		diff_tree_oid(&parents->oid[i], oid, "", opt);
 		diffcore_std(opt);
-		paths = intersect_paths(paths, i, num_parent);
+		paths = intersect_paths(paths, i, num_parent,
+					combined_all_names);
 
 		/* if showing diff, show it in requested order */
 		if (opt->output_format != DIFF_FORMAT_NO_OUTPUT &&
@@ -1460,7 +1509,8 @@ void diff_tree_combined(const struct object_id *oid,
 		 * diff(sha1,parent_i) for all i to do the job, specifically
 		 * for parent0.
 		 */
-		paths = find_paths_generic(oid, parents, &diffopts);
+		paths = find_paths_generic(oid, parents, &diffopts,
+					   rev->combined_all_names);
 	}
 	else {
 		int stat_opt;
@@ -1535,6 +1585,10 @@ void diff_tree_combined(const struct object_id *oid,
 	while (paths) {
 		struct combine_diff_path *tmp = paths;
 		paths = paths->next;
+		for (i = 0; i < num_parent; i++)
+			if (rev->combined_all_names &&
+			    filename_changed(tmp->parent[i].status))
+				strbuf_release(&tmp->parent[i].path);
 		free(tmp);
 	}
 
diff --git a/diff.h b/diff.h
index b512d0477a..90ea0256a5 100644
--- a/diff.h
+++ b/diff.h
@@ -294,6 +294,7 @@ struct combine_diff_path {
 		char status;
 		unsigned int mode;
 		struct object_id oid;
+		struct strbuf path;
 	} parent[FLEX_ARRAY];
 };
 #define combine_diff_path_size(n, l) \
diff --git a/revision.c b/revision.c
index 13cfb59b38..a608884aad 100644
--- a/revision.c
+++ b/revision.c
@@ -1995,6 +1995,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->diff = 1;
 		revs->dense_combined_merges = 0;
 		revs->combine_merges = 1;
+	} else if (!strcmp(arg, "--combined-all-names")) {
+		revs->diff = 1;
+		revs->combined_all_names = 1;
 	} else if (!strcmp(arg, "--cc")) {
 		revs->diff = 1;
 		revs->dense_combined_merges = 1;
@@ -2491,6 +2494,9 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 	}
 	if (revs->combine_merges)
 		revs->ignore_merges = 0;
+	if (revs->combined_all_names && !revs->combine_merges)
+		die("--combined-all-names makes no sense without -c or --cc");
+
 	revs->diffopt.abbrev = revs->abbrev;
 
 	if (revs->line_level_traverse) {
diff --git a/revision.h b/revision.h
index 52e5a88ff5..074a584410 100644
--- a/revision.h
+++ b/revision.h
@@ -171,6 +171,7 @@ struct rev_info {
 			verbose_header:1,
 			ignore_merges:1,
 			combine_merges:1,
+			combined_all_names:1,
 			dense_combined_merges:1,
 			always_show_header:1;
 
diff --git a/t/t4038-diff-combined.sh b/t/t4038-diff-combined.sh
index e2824d3437..5bccc323f6 100755
--- a/t/t4038-diff-combined.sh
+++ b/t/t4038-diff-combined.sh
@@ -435,4 +435,51 @@ test_expect_success 'combine diff gets tree sorting right' '
 	test_cmp expect actual
 '
 
+test_expect_success 'setup for --combined-with-paths' '
+	git branch side1c &&
+	git branch side2c &&
+	git checkout side1c &&
+	test_seq 1 10 >$(printf "file\twith\ttabs") &&
+	git add file* &&
+	git commit -m with &&
+	git checkout side2c &&
+	test_seq 1 9 >$(printf "i\tam\ttabbed") &&
+	echo ten >>$(printf "i\tam\ttabbed") &&
+	git add *tabbed &&
+	git commit -m iam &&
+	git checkout -b mergery side1c &&
+	git merge --no-commit side2c &&
+	git rm *tabs &&
+	echo eleven >>$(printf "i\tam\ttabbed") &&
+	git mv "$(printf "i\tam\ttabbed")" "$(printf "fickle\tnaming")" &&
+	git add fickle* &&
+	git commit
+'
+
+test_expect_success '--combined-all-names and --raw' '
+	cat <<-\EOF >expect &&
+	::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR	"file\twith\ttabs"	"i\tam\ttabbed"	"fickle\tnaming"
+	EOF
+	git diff-tree -c -M --raw --combined-all-names HEAD >actual.tmp &&
+	sed 1d <actual.tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success '--combined-all-names and --raw -and -z' '
+	printf "0f9645804ebb04cc3eef91f799eb7fb54d70cefb\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
+	git diff-tree -c -M --raw --combined-all-names -z HEAD >actual &&
+	test_cmp -a expect actual
+'
+
+test_expect_success '--combined-all-names and --cc' '
+	cat <<-\EOF >expect &&
+	--- "a/file\twith\ttabs"
+	--- "a/i\tam\ttabbed"
+	+++ "b/fickle\tnaming"
+	EOF
+	git diff-tree --cc -M --combined-all-names HEAD >actual.tmp &&
+	grep ^[-+][-+][-+] <actual.tmp >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.20.1.310.g17ca096f17


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-04 20:07     ` [PATCH v4] " Elijah Newren
@ 2019-02-04 21:20       ` Junio C Hamano
  2019-02-05 15:51         ` Elijah Newren
  2019-02-05  9:48       ` Johannes Schindelin
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2019-02-04 21:20 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, Jeff King, brian m . carlson, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

There is one place that says combined-all-paths, and everywhere else
it says combined-all-names.  The former is probably techincally more
correct, I think ;-)

> The combined diff format for merges will only list one filename, even if
> rename or copy detection is active.  For example, with raw format one
> might see:
>
>   ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
>   ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
>   ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c
>
> This doesn't let us know what the original name of bar.sh was in the
> first parent, and doesn't let us know what either of the original names
> of phooey.c were in either of the parents.  In contrast, for non-merge
> commits, raw format does provide original filenames (and a rename score
> to boot).  In order to also provide original filenames for merge
> commits, add a --combined-all-names option (which must be used with
> either -c or --cc, and is likely only useful with rename or copy
> detection active) so that we can print tab-separated filenames when
> renames are involved.  This transforms the above output to:
>
>   ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c	desc.c	desc.c
>   ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh	bar.sh
>   ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c
>
> Further, in patch format, this changes the from/to headers so that
> instead of just having one "from" header, we get one for each parent.
> For example, instead of having
>
>   --- a/phooey.c
>   +++ b/phooey.c
>
> we would see
>
>   --- a/fooey.c
>   --- a/fuey.c
>   +++ b/phooey.c

Do we have the three "rename from fooey.c", "rename from fuey.c" and
"rename to "phooey.c" extended headers, too?  That's what I meant in
my response, but I do like what I see in the above example ;-)

> -. single path, only for "dst"
> +. tab-separated pathname(s) of the file
>  
> -Example:
> +For `-c` and `--cc`, only the destination or final path is shown even
> +if the file was renamed on any side of history.  With
> +`--combined-all-names`, the name of the path in each parent is shown
> +followed by the name of the path in the merge commit.
> +
> +Examples for `-c` and `-cc` without `--combined-all-names`:
> +------------------------------------------------
> +::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c
> +::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
> +::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c
> +------------------------------------------------
> +
> +Examples when `--combined-all-names` added to either `-c` or `--cc`:

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-04 20:07     ` [PATCH v4] " Elijah Newren
  2019-02-04 21:20       ` Junio C Hamano
@ 2019-02-05  9:48       ` Johannes Schindelin
  2019-02-05 15:54         ` Elijah Newren
  2019-02-07 22:28       ` Junio C Hamano
  2019-02-08  1:12       ` [PATCH v5 0/2] add --combined-all-paths option to log and diff-tree Elijah Newren
  3 siblings, 1 reply; 39+ messages in thread
From: Johannes Schindelin @ 2019-02-05  9:48 UTC (permalink / raw)
  To: Elijah Newren; +Cc: gitster, git, Jeff King, brian m . carlson, Derrick Stolee

Hi Elijah,

On Mon, 4 Feb 2019, Elijah Newren wrote:

> The combined diff format for merges will only list one filename, even if
> rename or copy detection is active.  For example, with raw format one
> might see:
> 
>   ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
>   ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
>   ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c
> 
> This doesn't let us know what the original name of bar.sh was in the
> first parent, and doesn't let us know what either of the original names
> of phooey.c were in either of the parents.  In contrast, for non-merge
> commits, raw format does provide original filenames (and a rename score
> to boot).  In order to also provide original filenames for merge
> commits, add a --combined-all-names option (which must be used with
> either -c or --cc, and is likely only useful with rename or copy
> detection active) so that we can print tab-separated filenames when
> renames are involved.  This transforms the above output to:
> 
>   ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c	desc.c	desc.c
>   ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh	bar.sh
>   ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c
> 
> Further, in patch format, this changes the from/to headers so that
> instead of just having one "from" header, we get one for each parent.
> For example, instead of having
> 
>   --- a/phooey.c
>   +++ b/phooey.c
> 
> we would see
> 
>   --- a/fooey.c
>   --- a/fuey.c
>   +++ b/phooey.c
> 
> Signed-off-by: Elijah Newren <newren@gmail.com>

I do not really see where this needs to have tests with filenames
containing tabs, but your test does create such files. Which makes it
break on filesystems that do not allow tabs in file names, such as
NTFS. I need this to make the test pass again:

-- snip --
diff --git a/t/t4038-diff-combined.sh b/t/t4038-diff-combined.sh
index 5bccc323f648..596705985baa 100755
--- a/t/t4038-diff-combined.sh
+++ b/t/t4038-diff-combined.sh
@@ -435,7 +435,7 @@ test_expect_success 'combine diff gets tree sorting right' '
 	test_cmp expect actual
 '
 
-test_expect_success 'setup for --combined-with-paths' '
+test_expect_success FUNNYNAMES 'setup for --combined-with-paths' '
 	git branch side1c &&
 	git branch side2c &&
 	git checkout side1c &&
@@ -456,7 +456,7 @@ test_expect_success 'setup for --combined-with-paths' '
 	git commit
 '
 
-test_expect_success '--combined-all-names and --raw' '
+test_expect_success FUNNYNAMES '--combined-all-names and --raw' '
 	cat <<-\EOF >expect &&
 	::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR	"file\twith\ttabs"	"i\tam\ttabbed"	"fickle\tnaming"
 	EOF
@@ -465,13 +465,13 @@ test_expect_success '--combined-all-names and --raw' '
 	test_cmp expect actual
 '
 
-test_expect_success '--combined-all-names and --raw -and -z' '
+test_expect_success FUNNYNAMES '--combined-all-names and --raw -and -z' '
 	printf "0f9645804ebb04cc3eef91f799eb7fb54d70cefb\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
 	git diff-tree -c -M --raw --combined-all-names -z HEAD >actual &&
 	test_cmp -a expect actual
 '
 
-test_expect_success '--combined-all-names and --cc' '
+test_expect_success FUNNYNAMES '--combined-all-names and --cc' '
 	cat <<-\EOF >expect &&
 	--- "a/file\twith\ttabs"
 	--- "a/i\tam\ttabbed"
-- snap --

But maybe you want to get rid of the funny file names? Or test for them in
a separate, dedicated test case so that we do not have to guard *all* of
your added tests behind FUNNYNAMES?

Ciao,
Dscho

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-04 21:20       ` Junio C Hamano
@ 2019-02-05 15:51         ` Elijah Newren
  2019-02-05 20:39           ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Elijah Newren @ 2019-02-05 15:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, brian m . carlson, Derrick Stolee

On Mon, Feb 4, 2019 at 1:20 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> There is one place that says combined-all-paths, and everywhere else
> it says combined-all-names.  The former is probably techincally more
> correct, I think ;-)

Looks like I had a mixture of my original name (combined-with-paths)
and the one someone else suggested (combined-all-names).  Since you
like the hybrid better, I'll switch them all over to it.

> > The combined diff format for merges will only list one filename, even if
> > rename or copy detection is active.  For example, with raw format one
> > might see:
> >
> >   ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM   describe.c
> >   ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM   bar.sh
> >   ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR   phooey.c
> >
> > This doesn't let us know what the original name of bar.sh was in the
> > first parent, and doesn't let us know what either of the original names
> > of phooey.c were in either of the parents.  In contrast, for non-merge
> > commits, raw format does provide original filenames (and a rename score
> > to boot).  In order to also provide original filenames for merge
> > commits, add a --combined-all-names option (which must be used with
> > either -c or --cc, and is likely only useful with rename or copy
> > detection active) so that we can print tab-separated filenames when
> > renames are involved.  This transforms the above output to:
> >
> >   ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM   desc.c  desc.c  desc.c
> >   ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM   foo.sh  bar.sh  bar.sh
> >   ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR   fooey.c fuey.c  phooey.c
> >
> > Further, in patch format, this changes the from/to headers so that
> > instead of just having one "from" header, we get one for each parent.
> > For example, instead of having
> >
> >   --- a/phooey.c
> >   +++ b/phooey.c
> >
> > we would see
> >
> >   --- a/fooey.c
> >   --- a/fuey.c
> >   +++ b/phooey.c
>
> Do we have the three "rename from fooey.c", "rename from fuey.c" and
> "rename to "phooey.c" extended headers, too?  That's what I meant in
> my response, but I do like what I see in the above example ;-)

Ah, gotcha.  I'll look into whether it's possible to hook it up to
diff.c's fill_metainfo() .

Elijah

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-05  9:48       ` Johannes Schindelin
@ 2019-02-05 15:54         ` Elijah Newren
  2019-02-05 18:04           ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Elijah Newren @ 2019-02-05 15:54 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Git Mailing List, Jeff King, brian m . carlson,
	Derrick Stolee

Hi Dscho,

On Tue, Feb 5, 2019 at 1:48 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Elijah,
>
> On Mon, 4 Feb 2019, Elijah Newren wrote:
>
> > The combined diff format for merges will only list one filename, even if
> > rename or copy detection is active.  For example, with raw format one
> > might see:
> >
> >   ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM   describe.c
> >   ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM   bar.sh
> >   ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR   phooey.c
> >
> > This doesn't let us know what the original name of bar.sh was in the
> > first parent, and doesn't let us know what either of the original names
> > of phooey.c were in either of the parents.  In contrast, for non-merge
> > commits, raw format does provide original filenames (and a rename score
> > to boot).  In order to also provide original filenames for merge
> > commits, add a --combined-all-names option (which must be used with
> > either -c or --cc, and is likely only useful with rename or copy
> > detection active) so that we can print tab-separated filenames when
> > renames are involved.  This transforms the above output to:
> >
> >   ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM   desc.c  desc.c  desc.c
> >   ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM   foo.sh  bar.sh  bar.sh
> >   ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR   fooey.c fuey.c  phooey.c
> >
> > Further, in patch format, this changes the from/to headers so that
> > instead of just having one "from" header, we get one for each parent.
> > For example, instead of having
> >
> >   --- a/phooey.c
> >   +++ b/phooey.c
> >
> > we would see
> >
> >   --- a/fooey.c
> >   --- a/fuey.c
> >   +++ b/phooey.c
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
>
> I do not really see where this needs to have tests with filenames
> containing tabs, but your test does create such files. Which makes it
> break on filesystems that do not allow tabs in file names, such as
> NTFS. I need this to make the test pass again:
>
> -- snip --
> diff --git a/t/t4038-diff-combined.sh b/t/t4038-diff-combined.sh
> index 5bccc323f648..596705985baa 100755
> --- a/t/t4038-diff-combined.sh
> +++ b/t/t4038-diff-combined.sh
> @@ -435,7 +435,7 @@ test_expect_success 'combine diff gets tree sorting right' '
>         test_cmp expect actual
>  '
>
> -test_expect_success 'setup for --combined-with-paths' '
> +test_expect_success FUNNYNAMES 'setup for --combined-with-paths' '
>         git branch side1c &&
>         git branch side2c &&
>         git checkout side1c &&
> @@ -456,7 +456,7 @@ test_expect_success 'setup for --combined-with-paths' '
>         git commit
>  '
>
> -test_expect_success '--combined-all-names and --raw' '
> +test_expect_success FUNNYNAMES '--combined-all-names and --raw' '
>         cat <<-\EOF >expect &&
>         ::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR    "file\twith\ttabs"      "i\tam\ttabbed" "fickle\tnaming"
>         EOF
> @@ -465,13 +465,13 @@ test_expect_success '--combined-all-names and --raw' '
>         test_cmp expect actual
>  '
>
> -test_expect_success '--combined-all-names and --raw -and -z' '
> +test_expect_success FUNNYNAMES '--combined-all-names and --raw -and -z' '
>         printf "0f9645804ebb04cc3eef91f799eb7fb54d70cefb\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
>         git diff-tree -c -M --raw --combined-all-names -z HEAD >actual &&
>         test_cmp -a expect actual
>  '
>
> -test_expect_success '--combined-all-names and --cc' '
> +test_expect_success FUNNYNAMES '--combined-all-names and --cc' '
>         cat <<-\EOF >expect &&
>         --- "a/file\twith\ttabs"
>         --- "a/i\tam\ttabbed"
> -- snap --
>
> But maybe you want to get rid of the funny file names? Or test for them in
> a separate, dedicated test case so that we do not have to guard *all* of
> your added tests behind FUNNYNAMES?

I don't want to get rid of them; the initial reviews of my original
patch thought the format was ambiguous and would mishandle files with
tabs in them.  But yeah, I can definitely add a few testcases without
FUNNYNAMES.  Sorry for the headache; I'll fix this up.

Elijah

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-05 15:54         ` Elijah Newren
@ 2019-02-05 18:04           ` Junio C Hamano
  0 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2019-02-05 18:04 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin, Git Mailing List, Jeff King,
	brian m . carlson, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

> I don't want to get rid of them; the initial reviews of my original
> patch thought the format was ambiguous and would mishandle files with
> tabs in them.  But yeah, I can definitely add a few testcases without
> FUNNYNAMES.  Sorry for the headache; I'll fix this up.

Thanks, both.  I was about to squash the fix from Dscho in, but I
agree that it is a much better idea to cover the basics in separate
tests that would run without FUNNYNAMES.  Will hold and wait for
your updates.

Thanks.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-05 15:51         ` Elijah Newren
@ 2019-02-05 20:39           ` Junio C Hamano
  2019-02-07 19:50             ` Elijah Newren
  0 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2019-02-05 20:39 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Jeff King, brian m . carlson, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

>> > Further, in patch format, this changes the from/to headers so that
>> > instead of just having one "from" header, we get one for each parent.
>> > For example, instead of having
>> >
>> >   --- a/phooey.c
>> >   +++ b/phooey.c
>> >
>> > we would see
>> >
>> >   --- a/fooey.c
>> >   --- a/fuey.c
>> >   +++ b/phooey.c
>>
>> Do we have the three "rename from fooey.c", "rename from fuey.c" and
>> "rename to "phooey.c" extended headers, too?  That's what I meant in
>> my response, but I do like what I see in the above example ;-)
>
> Ah, gotcha.  I'll look into whether it's possible to hook it up to
> diff.c's fill_metainfo() .

Just to clarify.  I do not think these extended headers are "must
have"; the "--cc" output is not meant for machine consumption, as it
simplifies the output by omitting hunks that preimage trees agree
with each other etc., and making the resulting "patch" not showing
the whole picture, and these extended header lines might only become
waste of the screen real estate.

So, do not spend too much effort to emit these textual info that can
be easily seen with the N+1 plus/minus header lines.

Thanks.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-05 20:39           ` Junio C Hamano
@ 2019-02-07 19:50             ` Elijah Newren
  2019-02-07 20:25               ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Elijah Newren @ 2019-02-07 19:50 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, brian m . carlson, Derrick Stolee

On Tue, Feb 5, 2019 at 12:39 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> >> > Further, in patch format, this changes the from/to headers so that
> >> > instead of just having one "from" header, we get one for each parent.
> >> > For example, instead of having
> >> >
> >> >   --- a/phooey.c
> >> >   +++ b/phooey.c
> >> >
> >> > we would see
> >> >
> >> >   --- a/fooey.c
> >> >   --- a/fuey.c
> >> >   +++ b/phooey.c
> >>
> >> Do we have the three "rename from fooey.c", "rename from fuey.c" and
> >> "rename to "phooey.c" extended headers, too?  That's what I meant in
> >> my response, but I do like what I see in the above example ;-)
> >
> > Ah, gotcha.  I'll look into whether it's possible to hook it up to
> > diff.c's fill_metainfo() .
>
> Just to clarify.  I do not think these extended headers are "must
> have"; the "--cc" output is not meant for machine consumption, as it
> simplifies the output by omitting hunks that preimage trees agree
> with each other etc., and making the resulting "patch" not showing
> the whole picture, and these extended header lines might only become
> waste of the screen real estate.
>
> So, do not spend too much effort to emit these textual info that can
> be easily seen with the N+1 plus/minus header lines.
>
> Thanks.

Understood, thanks.

I think something can be done here, but I'm unsure exactly what.  From
Documentation/diff-generate-patch.txt, the extended headers for normal
diff mode are:

       old mode <mode>
       new mode <mode>
       deleted file mode <mode>
       new file mode <mode>
       copy from <path>
       copy to <path>
       rename from <path>
       rename to <path>
       similarity index <number>
       dissimilarity index <number>
       index <hash>..<hash> <mode>

and for combined diffs they are:

       index <hash>,<hash>..<hash>
       mode <mode>,<mode>..<mode>
       new file mode <mode>
       deleted file mode <mode>,<mode>

meaning that the ones we are missing are:

       copy from <path>
       copy to <path>
       rename from <path>
       rename to <path>
       similarity index <number>
       dissimilarity index <number>

I think "copy from" and "rename from" should be relatively
straightforward.  However, in a combined diff, we could have both a
modified status, a renamed status, and a copied status, meaning that
we'll need an array of both similarity and dissimilarity indexes...and
trying to present that to the user in a way that makes sense seems
like a lost cause to me.  Does anyone else know how to represent that?
 I'm inclined to just leave it out.

Also, I'm afraid "copy to" and "rename to" could be confusing if both
appeared, since there's only one "to" path.  If there is both a copy
and a rename involved relative to different parents, should these be
coalesced into a "copy/rename to" line?


Thanks,
Elijah

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-07 19:50             ` Elijah Newren
@ 2019-02-07 20:25               ` Junio C Hamano
  2019-02-07 22:10                 ` Elijah Newren
  0 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2019-02-07 20:25 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Jeff King, brian m . carlson, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

> I think "copy from" and "rename from" should be relatively
> straightforward.  However, in a combined diff, we could have both a
> modified status, a renamed status, and a copied status, meaning that
> we'll need an array of both similarity and dissimilarity indexes...and
> trying to present that to the user in a way that makes sense seems
> like a lost cause to me.  Does anyone else know how to represent that?
>  I'm inclined to just leave it out.
>
> Also, I'm afraid "copy to" and "rename to" could be confusing if both
> appeared, since there's only one "to" path.  If there is both a copy
> and a rename involved relative to different parents, should these be
> coalesced into a "copy/rename to" line?

There are three possible labels (i.e. 'in-place modification',
'rename from elsewhere' and 'copy from elsewhere'), and you can say
"this commit created file F by renaming from X (or by copying X)"
only when you know path F did not exist _immediately before_ this
commit.  The distinction between rename and copy is whether the path
X remains in the resulting commit (i.e. if there is no X, the commit
created path F by moving X; if there is X, the commit copied the
contents of X into a new path F).

So telling renames and copies apart is probably straight-forward (if
you have sufficient information---I am not sure if you do in this
codepath offhand); as long as you know what pathname each preimage
(i.e. parent of the perge) tree had and if that pathname is missing
in the postimage (luckily there is only one---the merge result), it
was renamed, and otherwise it was copied.

But telling in-place modification and other two might be
trickier. In one parent path F may be missing but in the other
parent path F may exist, and the result of the merge is made by
merging the contents of path X in the first parent and the contents
of path F in the second parent.  From the view of the transition
between the first parent to the merge result, we moved the path X to
path F and made some modifications (i.e. renamed).  From the view of
the transition from the other branch, we kept the contents in path F
during the transition and there is no renames or copies involved.

Actually what I had in mind when I mentioned the extended headers
the first time in this discussion was that we would have "rename
from", "copy from", etc. separately for each parent, as the contents
may have come from different paths in these parents.  And that was
where my earlier "... might only become waste of the screen real
estate" comes from.

So, again, do not spend too much effort to emit these textual info
that can be easily seen with the N+1 plus/minus header lines.

Thanks.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-07 20:25               ` Junio C Hamano
@ 2019-02-07 22:10                 ` Elijah Newren
  2019-02-07 23:31                   ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Elijah Newren @ 2019-02-07 22:10 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, brian m . carlson, Derrick Stolee

On Thu, Feb 7, 2019 at 12:25 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > I think "copy from" and "rename from" should be relatively
> > straightforward.  However, in a combined diff, we could have both a
> > modified status, a renamed status, and a copied status, meaning that
> > we'll need an array of both similarity and dissimilarity indexes...and
> > trying to present that to the user in a way that makes sense seems
> > like a lost cause to me.  Does anyone else know how to represent that?
> >  I'm inclined to just leave it out.
> >
> > Also, I'm afraid "copy to" and "rename to" could be confusing if both
> > appeared, since there's only one "to" path.  If there is both a copy
> > and a rename involved relative to different parents, should these be
> > coalesced into a "copy/rename to" line?
>
> There are three possible labels (i.e. 'in-place modification',
> 'rename from elsewhere' and 'copy from elsewhere'), and you can say
> "this commit created file F by renaming from X (or by copying X)"
> only when you know path F did not exist _immediately before_ this
> commit.  The distinction between rename and copy is whether the path
> X remains in the resulting commit (i.e. if there is no X, the commit
> created path F by moving X; if there is X, the commit copied the
> contents of X into a new path F).
>
> So telling renames and copies apart is probably straight-forward (if
> you have sufficient information---I am not sure if you do in this
> codepath offhand); as long as you know what pathname each preimage
> (i.e. parent of the perge) tree had and if that pathname is missing
> in the postimage (luckily there is only one---the merge result), it
> was renamed, and otherwise it was copied.

We have change status, M, C, A, R, D, etc.  So, R vs. C tells us
renamed or copied.  We also have the original filename.

> But telling in-place modification and other two might be
> trickier. In one parent path F may be missing but in the other
> parent path F may exist, and the result of the merge is made by
> merging the contents of path X in the first parent and the contents
> of path F in the second parent.  From the view of the transition
> between the first parent to the merge result, we moved the path X to
> path F and made some modifications (i.e. renamed).  From the view of
> the transition from the other branch, we kept the contents in path F
> during the transition and there is no renames or copies involved.
>
> Actually what I had in mind when I mentioned the extended headers
> the first time in this discussion was that we would have "rename
> from", "copy from", etc. separately for each parent, as the contents
> may have come from different paths in these parents.  And that was
> where my earlier "... might only become waste of the screen real
> estate" comes from.

I think I'm with you on everything you said here, but perhaps not
since I can't see an answer to my question.  Maybe an example will
help:

Let's say we have an octopus merge.  Parent 1 had file F.  Parent 2
had file X.  Parent 3 had file Y.  The octopus has two files: F' and
X, with F' being very similar to F, X, and Y.

There's no "modified from" header; it's not needed (unless we want to
add a new kind of noise header?)
We could emit a "copied from X" header, due to parent 2.
We could emit a "renamed from Y" header, due to parent 3.

Now, the question: In addition to the two "from" headers, how many
"to" headers do we emit?  In particular, do we emit both a "copied to
F" and a "renamed to F" header, or just a combined "renamed/copied to
F" header?  I'm inclined to go with the latter, to avoid giving the
idea that there are multiple targets, but maybe folks expect there to
be one "rename to" and "copy to" for each "rename from" or "copy from"
that appeared.

> So, again, do not spend too much effort to emit these textual info
> that can be easily seen with the N+1 plus/minus header lines.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-04 20:07     ` [PATCH v4] " Elijah Newren
  2019-02-04 21:20       ` Junio C Hamano
  2019-02-05  9:48       ` Johannes Schindelin
@ 2019-02-07 22:28       ` Junio C Hamano
  2019-02-07 23:48         ` Elijah Newren
  2019-02-08  1:12       ` [PATCH v5 0/2] add --combined-all-paths option to log and diff-tree Elijah Newren
  3 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2019-02-07 22:28 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, Jeff King, brian m . carlson, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

> diff --git a/Documentation/diff-generate-patch.txt b/Documentation/diff-generate-patch.txt
> index 231105cff4..69cb3b0349 100644
> --- a/Documentation/diff-generate-patch.txt
> +++ b/Documentation/diff-generate-patch.txt
> @@ -144,6 +144,19 @@ Similar to two-line header for traditional 'unified' diff
>  format, `/dev/null` is used to signal created or deleted
>  files.
>  
> +However, if the --combined-all-paths option is provided, instead of a
> +two-line from-file/to-file you get a N+1 line from-file/to-file header,
> +where N is the number of parents in the merge commit
> +
> +       --- a/file
> +       --- a/file
> +       --- a/file
> +       +++ b/file
> ++
> +This extended format can be useful if rename or copy detection is
> +active, to allow you to see the original name of the file in different
> +parents.
> +
>  4.   Chunk header format is modified to prevent people from
>       accidentally feeding it to `patch -p1`. Combined diff format
>       was created for review of merge commit changes, and was not

You need to replace the blank line before the new paragraph that
begins with "However" with a line with a single "+" on it, to tell
the formatter that the new text is still part of the third item in
the list.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-07 22:10                 ` Elijah Newren
@ 2019-02-07 23:31                   ` Junio C Hamano
  2019-02-07 23:48                     ` Elijah Newren
  0 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2019-02-07 23:31 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Jeff King, brian m . carlson, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

> Now, the question: In addition to the two "from" headers, how many
> "to" headers do we emit?  In particular, do we emit both a "copied to
> F" and a "renamed to F" header, or just a combined "renamed/copied to
> F" header?

There is only a single path that can be on the "to", as there is
only one final result, but _how_ the contents got to that path would
be different, so to be technically truly correct, you would end up
showing N "to" lines for a N-way merge, each of which gives the same
path in the postimage, but some may say renamed, another may say
copioed and some others may need to say in-place edited.

That would increase the number of necessary lines from N (from) + 1
(to) to N*2 (N for each of from and to), which makes it even less
economical.

And showing a single "renamed/copied" feels more like a cop-out to
avoid being techincally incorrect, than giving a useful piece of
information to the users.

I am inclined to say that we should do _without_ any "to" line.  And
if we can do without any "to", perhaps we do not need "from", either.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-07 23:31                   ` Junio C Hamano
@ 2019-02-07 23:48                     ` Elijah Newren
  0 siblings, 0 replies; 39+ messages in thread
From: Elijah Newren @ 2019-02-07 23:48 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, brian m . carlson, Derrick Stolee

On Thu, Feb 7, 2019 at 3:31 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > Now, the question: In addition to the two "from" headers, how many
> > "to" headers do we emit?  In particular, do we emit both a "copied to
> > F" and a "renamed to F" header, or just a combined "renamed/copied to
> > F" header?
>
> There is only a single path that can be on the "to", as there is
> only one final result, but _how_ the contents got to that path would
> be different, so to be technically truly correct, you would end up
> showing N "to" lines for a N-way merge, each of which gives the same
> path in the postimage, but some may say renamed, another may say
> copioed and some others may need to say in-place edited.
>
> That would increase the number of necessary lines from N (from) + 1
> (to) to N*2 (N for each of from and to), which makes it even less
> economical.
>
> And showing a single "renamed/copied" feels more like a cop-out to
> avoid being techincally incorrect, than giving a useful piece of
> information to the users.
>
> I am inclined to say that we should do _without_ any "to" line.  And

Works for me.

> if we can do without any "to", perhaps we do not need "from", either.

I would be okay with that, but looking through a few of them, I think
the "from" lines do help a little, just because lengthy filenames are
not uncommon and it's easy to miss the fact that there is a rename,
often in some directory somewhere in the middle.  A couple examples:


diff --combined packages/search/ete/src/test/resources/test-suite.yml
index 4ae66932ef9,1cc55276946..1cc55276946
rename from packages/search/src/geb/resources/test-suite.yml
--- a/packages/search/ete/src/test/resources/test-suite.yml
--- a/packages/search/src/geb/resources/test-suite.yml
+++ b/packages/search/ete/src/test/resources/test-suite.yml
@@@ -1,15 -1,5 +1,5 @@@


diff --combined packages/search/ete/var/conf/app/public/searchApp.yml
index 38cc5482e5b,487178ac0a9..487178ac0a9
rename from packages/search/ete/var/conf/app/public/searchApp
rename from packages/search/var/conf/app/public/searchApp.yml
--- a/packages/search/ete/var/conf/app/public/searchApp
--- a/packages/search/var/conf/app/public/searchApp.yml
+++ b/packages/search/ete/var/conf/app/public/searchApp.yml


If there's no renames or copies, then I don't add anything to the
combined diff.  With renames or copies, people can get the "to" name
from looking at the "+++ b/" line.

But not a real big deal, I could just omit this if you prefer.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4] log,diff-tree: add --combined-all-names option
  2019-02-07 22:28       ` Junio C Hamano
@ 2019-02-07 23:48         ` Elijah Newren
  0 siblings, 0 replies; 39+ messages in thread
From: Elijah Newren @ 2019-02-07 23:48 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, brian m . carlson, Derrick Stolee

On Thu, Feb 7, 2019 at 2:28 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > diff --git a/Documentation/diff-generate-patch.txt b/Documentation/diff-generate-patch.txt
> > index 231105cff4..69cb3b0349 100644
> > --- a/Documentation/diff-generate-patch.txt
> > +++ b/Documentation/diff-generate-patch.txt
> > @@ -144,6 +144,19 @@ Similar to two-line header for traditional 'unified' diff
> >  format, `/dev/null` is used to signal created or deleted
> >  files.
> >
> > +However, if the --combined-all-paths option is provided, instead of a
> > +two-line from-file/to-file you get a N+1 line from-file/to-file header,
> > +where N is the number of parents in the merge commit
> > +
> > +       --- a/file
> > +       --- a/file
> > +       --- a/file
> > +       +++ b/file
> > ++
> > +This extended format can be useful if rename or copy detection is
> > +active, to allow you to see the original name of the file in different
> > +parents.
> > +
> >  4.   Chunk header format is modified to prevent people from
> >       accidentally feeding it to `patch -p1`. Combined diff format
> >       was created for review of merge commit changes, and was not
>
> You need to replace the blank line before the new paragraph that
> begins with "However" with a line with a single "+" on it, to tell
> the formatter that the new text is still part of the third item in
> the list.

Sorry about that; including that in my fixes now.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v5 0/2] add --combined-all-paths option to log and diff-tree
  2019-02-04 20:07     ` [PATCH v4] " Elijah Newren
                         ` (2 preceding siblings ...)
  2019-02-07 22:28       ` Junio C Hamano
@ 2019-02-08  1:12       ` Elijah Newren
  2019-02-08  1:12         ` [PATCH v5 1/2] log,diff-tree: add --combined-all-paths option Elijah Newren
  2019-02-08  1:12         ` [PATCH v5 2/2] squash! " Elijah Newren
  3 siblings, 2 replies; 39+ messages in thread
From: Elijah Newren @ 2019-02-08  1:12 UTC (permalink / raw)
  To: gitster; +Cc: git, Jeff King, brian m . carlson, Derrick Stolee, Elijah Newren

Changes since v4:
  * Added a second patch that can be squashed in which will add
    'rename from' and 'copy from' extended headers.  I like it, but
    Junio sounded pessimistic about it.  See
    https://public-inbox.org/git/xmqqlg2rmazz.fsf@gitster-ct.c.googlers.com/
  * Micro fixes:
    * Renamed --combined-all-names to --combined-all-paths
    * Fixed formatting error (missed '+') in diff-generate-patch.txt
    * Marked tests which used tabs in filenames with FUNNYNAMES prereq
    * Added tests that didn't depend on FUNNYNAMES

Elijah Newren (2):
  log,diff-tree: add --combined-all-paths option
  squash! log,diff-tree: add --combined-all-paths option

 Documentation/diff-format.txt         | 20 +++++-
 Documentation/diff-generate-patch.txt | 20 ++++--
 Documentation/git-diff-tree.txt       | 11 +++-
 Documentation/rev-list-options.txt    |  7 +++
 builtin/diff-tree.c                   |  6 +-
 combine-diff.c                        | 91 +++++++++++++++++++++++----
 diff.h                                |  1 +
 revision.c                            |  6 ++
 revision.h                            |  1 +
 t/t4038-diff-combined.sh              | 88 ++++++++++++++++++++++++++
 10 files changed, 230 insertions(+), 21 deletions(-)

Range-diff:
1:  26c64cee8a ! 1:  2205640429 log,diff-tree: add --combined-all-names option
    @@ -1,6 +1,6 @@
     Author: Elijah Newren <newren@gmail.com>
     
    -    log,diff-tree: add --combined-all-names option
    +    log,diff-tree: add --combined-all-paths option
     
         The combined diff format for merges will only list one filename, even if
         rename or copy detection is active.  For example, with raw format one
    @@ -15,7 +15,7 @@
         of phooey.c were in either of the parents.  In contrast, for non-merge
         commits, raw format does provide original filenames (and a rename score
         to boot).  In order to also provide original filenames for merge
    -    commits, add a --combined-all-names option (which must be used with
    +    commits, add a --combined-all-paths option (which must be used with
         either -c or --cc, and is likely only useful with rename or copy
         detection active) so that we can print tab-separated filenames when
         renames are involved.  This transforms the above output to:
    @@ -38,8 +38,6 @@
           +++ b/phooey.c
     
         Signed-off-by: Elijah Newren <newren@gmail.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    -    Message-Id: <20190204200754.16413-1-newren@gmail.com>
     
      diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
      --- a/Documentation/diff-format.txt
    @@ -54,17 +52,17 @@
     -Example:
     +For `-c` and `--cc`, only the destination or final path is shown even
     +if the file was renamed on any side of history.  With
    -+`--combined-all-names`, the name of the path in each parent is shown
    ++`--combined-all-paths`, the name of the path in each parent is shown
     +followed by the name of the path in the merge commit.
     +
    -+Examples for `-c` and `-cc` without `--combined-all-names`:
    ++Examples for `-c` and `-cc` without `--combined-all-paths`:
     +------------------------------------------------
     +::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c
     +::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
     +::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c
     +------------------------------------------------
     +
    -+Examples when `--combined-all-names` added to either `-c` or `--cc`:
    ++Examples when `--combined-all-paths` added to either `-c` or `--cc`:
      
      ------------------------------------------------
     -::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
    @@ -79,9 +77,10 @@
      --- a/Documentation/diff-generate-patch.txt
      +++ b/Documentation/diff-generate-patch.txt
     @@
    + Similar to two-line header for traditional 'unified' diff
      format, `/dev/null` is used to signal created or deleted
      files.
    - 
    +++
     +However, if the --combined-all-paths option is provided, instead of a
     +two-line from-file/to-file you get a N+1 line from-file/to-file header,
     +where N is the number of parents in the merge commit
    @@ -94,10 +93,9 @@
     +This extended format can be useful if rename or copy detection is
     +active, to allow you to see the original name of the file in different
     +parents.
    -+
    + 
      4.   Chunk header format is modified to prevent people from
           accidentally feeding it to `patch -p1`. Combined diff format
    -      was created for review of merge commit changes, and was not
     
      diff --git a/Documentation/git-diff-tree.txt b/Documentation/git-diff-tree.txt
      --- a/Documentation/git-diff-tree.txt
    @@ -108,7 +106,7 @@
      'git diff-tree' [--stdin] [-m] [-s] [-v] [--no-commit-id] [--pretty]
     -	      [-t] [-r] [-c | --cc] [--root] [<common diff options>]
     -	      <tree-ish> [<tree-ish>] [<path>...]
    -+	      [-t] [-r] [-c | --cc] [--combined-all-names] [--root]
    ++	      [-t] [-r] [-c | --cc] [--combined-all-paths] [--root]
     +	      [<common diff options>] <tree-ish> [<tree-ish>] [<path>...]
      
      DESCRIPTION
    @@ -117,7 +115,7 @@
      	itself and the commit log message is not shown, just like in any other
      	"empty diff" case.
      
    -+--combined-all-names::
    ++--combined-all-paths::
     +	This flag causes combined diffs (used for merge commits) to
     +	list the name of the file from all parents.  It thus only has
     +	effect when -c or --cc are specified, and is likely only
    @@ -135,7 +133,7 @@
      	the parents have only two variants and the merge result picks
      	one of them without modification.
      
    -+--combined-all-names::
    ++--combined-all-paths::
     +	This flag causes combined diffs (used for merge commits) to
     +	list the name of the file from all parents.  It thus only has
     +	effect when -c or --cc are specified, and is likely only
    @@ -159,7 +157,7 @@
      "  -r            diff recursively\n"
     +"  -c            show combined diff for merge commits\n"
     +"  --cc          show combined diff for merge commits removing uninteresting hunks\n"
    -+"  --combined-all-names\n"
    ++"  --combined-all-paths\n"
     +"                show name of file in all parents for combined diffs\n"
      "  --root        include the initial commit as diff against /dev/null\n"
      COMMON_DIFF_OPTIONS_HELP;
    @@ -182,7 +180,7 @@
     +	struct combine_diff_path *curr,
     +	int n,
     +	int num_parent,
    -+	int combined_all_names)
    ++	int combined_all_paths)
      {
      	struct diff_queue_struct *q = &diff_queued_diff;
      	struct combine_diff_path *p, **tail = &curr;
    @@ -196,7 +194,7 @@
      			p->parent[n].mode = q->queue[i]->one->mode;
      			p->parent[n].status = q->queue[i]->status;
     +
    -+			if (combined_all_names &&
    ++			if (combined_all_paths &&
     +			    filename_changed(p->parent[n].status)) {
     +				strbuf_init(&p->parent[n].path, 0);
     +				strbuf_addstr(&p->parent[n].path,
    @@ -210,7 +208,7 @@
      			/* p->path not in q->queue[]; drop it */
      			*tail = p->next;
     +			for (j = 0; j < num_parent; j++)
    -+				if (combined_all_names &&
    ++				if (combined_all_paths &&
     +				    filename_changed(p->parent[j].status))
     +					strbuf_release(&p->parent[j].path);
      			free(p);
    @@ -220,7 +218,7 @@
      		oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid);
      		p->parent[n].mode = q->queue[i]->one->mode;
      		p->parent[n].status = q->queue[i]->status;
    -+		if (combined_all_names &&
    ++		if (combined_all_paths &&
     +		    filename_changed(p->parent[n].status))
     +			strbuf_addstr(&p->parent[n].path,
     +				      q->queue[i]->one->path);
    @@ -237,7 +235,7 @@
     -	else
     -		dump_quoted_path("--- ", a_prefix, elem->path,
     -				 line_prefix, c_meta, c_reset);
    -+	if (rev->combined_all_names) {
    ++	if (rev->combined_all_paths) {
     +		for (i = 0; i < num_parent; i++) {
     +			char *path = filename_changed(elem->parent[i].status)
     +				? elem->parent[i].path.buf : elem->path;
    @@ -264,7 +262,7 @@
      	}
      
     +	for (i = 0; i < num_parent; i++)
    -+		if (rev->combined_all_names) {
    ++		if (rev->combined_all_paths) {
     +			if (filename_changed(p->parent[i].status))
     +				write_name_quoted(p->parent[i].path.buf, stdout,
     +						  inter_name_termination);
    @@ -282,7 +280,7 @@
     -	const struct oid_array *parents, struct diff_options *opt)
     +	const struct oid_array *parents,
     +	struct diff_options *opt,
    -+	int combined_all_names)
    ++	int combined_all_paths)
      {
      	struct combine_diff_path *paths = NULL;
      	int i, num_parent = parents->nr;
    @@ -292,7 +290,7 @@
      		diffcore_std(opt);
     -		paths = intersect_paths(paths, i, num_parent);
     +		paths = intersect_paths(paths, i, num_parent,
    -+					combined_all_names);
    ++					combined_all_paths);
      
      		/* if showing diff, show it in requested order */
      		if (opt->output_format != DIFF_FORMAT_NO_OUTPUT &&
    @@ -302,7 +300,7 @@
      		 */
     -		paths = find_paths_generic(oid, parents, &diffopts);
     +		paths = find_paths_generic(oid, parents, &diffopts,
    -+					   rev->combined_all_names);
    ++					   rev->combined_all_paths);
      	}
      	else {
      		int stat_opt;
    @@ -311,7 +309,7 @@
      		struct combine_diff_path *tmp = paths;
      		paths = paths->next;
     +		for (i = 0; i < num_parent; i++)
    -+			if (rev->combined_all_names &&
    ++			if (rev->combined_all_paths &&
     +			    filename_changed(tmp->parent[i].status))
     +				strbuf_release(&tmp->parent[i].path);
      		free(tmp);
    @@ -337,9 +335,9 @@
      		revs->diff = 1;
      		revs->dense_combined_merges = 0;
      		revs->combine_merges = 1;
    -+	} else if (!strcmp(arg, "--combined-all-names")) {
    ++	} else if (!strcmp(arg, "--combined-all-paths")) {
     +		revs->diff = 1;
    -+		revs->combined_all_names = 1;
    ++		revs->combined_all_paths = 1;
      	} else if (!strcmp(arg, "--cc")) {
      		revs->diff = 1;
      		revs->dense_combined_merges = 1;
    @@ -347,8 +345,8 @@
      	}
      	if (revs->combine_merges)
      		revs->ignore_merges = 0;
    -+	if (revs->combined_all_names && !revs->combine_merges)
    -+		die("--combined-all-names makes no sense without -c or --cc");
    ++	if (revs->combined_all_paths && !revs->combine_merges)
    ++		die("--combined-all-paths makes no sense without -c or --cc");
     +
      	revs->diffopt.abbrev = revs->abbrev;
      
    @@ -361,7 +359,7 @@
      			verbose_header:1,
      			ignore_merges:1,
      			combine_merges:1,
    -+			combined_all_names:1,
    ++			combined_all_paths:1,
      			dense_combined_merges:1,
      			always_show_header:1;
      
    @@ -373,20 +371,61 @@
      	test_cmp expect actual
      '
      
    -+test_expect_success 'setup for --combined-with-paths' '
    ++test_expect_success 'setup for --combined-all-paths' '
     +	git branch side1c &&
     +	git branch side2c &&
     +	git checkout side1c &&
    ++	test_seq 1 10 >filename-side1c &&
    ++	git add filename-side1c &&
    ++	git commit -m with &&
    ++	git checkout side2c &&
    ++	test_seq 1 9 >filename-side2c &&
    ++	echo ten >>filename-side2c &&
    ++	git add filename-side2c &&
    ++	git commit -m iam &&
    ++	git checkout -b mergery side1c &&
    ++	git merge --no-commit side2c &&
    ++	git rm filename-side1c &&
    ++	echo eleven >>filename-side2c &&
    ++	git mv filename-side2c filename-merged &&
    ++	git add filename-merged &&
    ++	git commit
    ++'
    ++
    ++test_expect_success '--combined-all-paths and --raw' '
    ++	cat <<-\EOF >expect &&
    ++	::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR	filename-side1c	filename-side2c	filename-merged
    ++	EOF
    ++	git diff-tree -c -M --raw --combined-all-paths HEAD >actual.tmp &&
    ++	sed 1d <actual.tmp >actual &&
    ++	test_cmp expect actual
    ++'
    ++
    ++test_expect_success '--combined-all-paths and --cc' '
    ++	cat <<-\EOF >expect &&
    ++	--- a/filename-side1c
    ++	--- a/filename-side2c
    ++	+++ b/filename-merged
    ++	EOF
    ++	git diff-tree --cc -M --combined-all-paths HEAD >actual.tmp &&
    ++	grep ^[-+][-+][-+] <actual.tmp >actual &&
    ++	test_cmp expect actual
    ++'
    ++
    ++test_expect_success FUNNYNAMES 'setup for --combined-all-paths with funny names' '
    ++	git branch side1d &&
    ++	git branch side2d &&
    ++	git checkout side1d &&
     +	test_seq 1 10 >$(printf "file\twith\ttabs") &&
     +	git add file* &&
     +	git commit -m with &&
    -+	git checkout side2c &&
    ++	git checkout side2d &&
     +	test_seq 1 9 >$(printf "i\tam\ttabbed") &&
     +	echo ten >>$(printf "i\tam\ttabbed") &&
     +	git add *tabbed &&
     +	git commit -m iam &&
    -+	git checkout -b mergery side1c &&
    -+	git merge --no-commit side2c &&
    ++	git checkout -b funny-names-mergery side1d &&
    ++	git merge --no-commit side2d &&
     +	git rm *tabs &&
     +	echo eleven >>$(printf "i\tam\ttabbed") &&
     +	git mv "$(printf "i\tam\ttabbed")" "$(printf "fickle\tnaming")" &&
    @@ -394,28 +433,28 @@
     +	git commit
     +'
     +
    -+test_expect_success '--combined-all-names and --raw' '
    ++test_expect_success FUNNYNAMES '--combined-all-paths and --raw and funny names' '
     +	cat <<-\EOF >expect &&
     +	::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR	"file\twith\ttabs"	"i\tam\ttabbed"	"fickle\tnaming"
     +	EOF
    -+	git diff-tree -c -M --raw --combined-all-names HEAD >actual.tmp &&
    ++	git diff-tree -c -M --raw --combined-all-paths HEAD >actual.tmp &&
     +	sed 1d <actual.tmp >actual &&
     +	test_cmp expect actual
     +'
     +
    -+test_expect_success '--combined-all-names and --raw -and -z' '
    -+	printf "0f9645804ebb04cc3eef91f799eb7fb54d70cefb\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
    -+	git diff-tree -c -M --raw --combined-all-names -z HEAD >actual &&
    ++test_expect_success FUNNYNAMES '--combined-all-paths and --raw -and -z and funny names' '
    ++	printf "aaf8087c3cbd4db8e185a2d074cf27c53cfb75d7\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
    ++	git diff-tree -c -M --raw --combined-all-paths -z HEAD >actual &&
     +	test_cmp -a expect actual
     +'
     +
    -+test_expect_success '--combined-all-names and --cc' '
    ++test_expect_success FUNNYNAMES '--combined-all-paths and --cc and funny names' '
     +	cat <<-\EOF >expect &&
     +	--- "a/file\twith\ttabs"
     +	--- "a/i\tam\ttabbed"
     +	+++ "b/fickle\tnaming"
     +	EOF
    -+	git diff-tree --cc -M --combined-all-names HEAD >actual.tmp &&
    ++	git diff-tree --cc -M --combined-all-paths HEAD >actual.tmp &&
     +	grep ^[-+][-+][-+] <actual.tmp >actual &&
     +	test_cmp expect actual
     +'
2:  d93e6c5fee < -:  ---------- SQUASH??? fix mark-up
-:  ---------- > 2:  b8408a6075 squash! log,diff-tree: add --combined-all-paths option
-- 
2.20.1.311.gb8408a6075


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v5 1/2] log,diff-tree: add --combined-all-paths option
  2019-02-08  1:12       ` [PATCH v5 0/2] add --combined-all-paths option to log and diff-tree Elijah Newren
@ 2019-02-08  1:12         ` Elijah Newren
  2019-02-08  4:00           ` Junio C Hamano
  2019-02-08  1:12         ` [PATCH v5 2/2] squash! " Elijah Newren
  1 sibling, 1 reply; 39+ messages in thread
From: Elijah Newren @ 2019-02-08  1:12 UTC (permalink / raw)
  To: gitster; +Cc: git, Jeff King, brian m . carlson, Derrick Stolee, Elijah Newren

The combined diff format for merges will only list one filename, even if
rename or copy detection is active.  For example, with raw format one
might see:

  ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
  ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
  ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c

This doesn't let us know what the original name of bar.sh was in the
first parent, and doesn't let us know what either of the original names
of phooey.c were in either of the parents.  In contrast, for non-merge
commits, raw format does provide original filenames (and a rename score
to boot).  In order to also provide original filenames for merge
commits, add a --combined-all-paths option (which must be used with
either -c or --cc, and is likely only useful with rename or copy
detection active) so that we can print tab-separated filenames when
renames are involved.  This transforms the above output to:

  ::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c	desc.c	desc.c
  ::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh	bar.sh
  ::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c

Further, in patch format, this changes the from/to headers so that
instead of just having one "from" header, we get one for each parent.
For example, instead of having

  --- a/phooey.c
  +++ b/phooey.c

we would see

  --- a/fooey.c
  --- a/fuey.c
  +++ b/phooey.c

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/diff-format.txt         | 20 +++++-
 Documentation/diff-generate-patch.txt | 13 ++++
 Documentation/git-diff-tree.txt       | 11 +++-
 Documentation/rev-list-options.txt    |  7 +++
 builtin/diff-tree.c                   |  6 +-
 combine-diff.c                        | 76 +++++++++++++++++++----
 diff.h                                |  1 +
 revision.c                            |  6 ++
 revision.h                            |  1 +
 t/t4038-diff-combined.sh              | 88 +++++++++++++++++++++++++++
 10 files changed, 212 insertions(+), 17 deletions(-)

diff --git a/Documentation/diff-format.txt b/Documentation/diff-format.txt
index cdcc17f0ad..715cbb7604 100644
--- a/Documentation/diff-format.txt
+++ b/Documentation/diff-format.txt
@@ -95,12 +95,26 @@ from the format described above in the following way:
 . there are more "src" modes and "src" sha1
 . status is concatenated status characters for each parent
 . no optional "score" number
-. single path, only for "dst"
+. tab-separated pathname(s) of the file
 
-Example:
+For `-c` and `--cc`, only the destination or final path is shown even
+if the file was renamed on any side of history.  With
+`--combined-all-paths`, the name of the path in each parent is shown
+followed by the name of the path in the merge commit.
+
+Examples for `-c` and `-cc` without `--combined-all-paths`:
+------------------------------------------------
+::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c
+::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	bar.sh
+::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	phooey.c
+------------------------------------------------
+
+Examples when `--combined-all-paths` added to either `-c` or `--cc`:
 
 ------------------------------------------------
-::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	describe.c
+::100644 100644 100644 fabadb8 cc95eb0 4866510 MM	desc.c	desc.c	desc.c
+::100755 100755 100755 52b7a2d 6d1ac04 d2ac7d7 RM	foo.sh	bar.sh	bar.sh
+::100644 100644 100644 e07d6c5 9042e82 ee91881 RR	fooey.c	fuey.c	phooey.c
 ------------------------------------------------
 
 Note that 'combined diff' lists only files which were modified from
diff --git a/Documentation/diff-generate-patch.txt b/Documentation/diff-generate-patch.txt
index 231105cff4..f10ca410ad 100644
--- a/Documentation/diff-generate-patch.txt
+++ b/Documentation/diff-generate-patch.txt
@@ -143,6 +143,19 @@ copying detection) are designed to work with diff of two
 Similar to two-line header for traditional 'unified' diff
 format, `/dev/null` is used to signal created or deleted
 files.
++
+However, if the --combined-all-paths option is provided, instead of a
+two-line from-file/to-file you get a N+1 line from-file/to-file header,
+where N is the number of parents in the merge commit
+
+       --- a/file
+       --- a/file
+       --- a/file
+       +++ b/file
++
+This extended format can be useful if rename or copy detection is
+active, to allow you to see the original name of the file in different
+parents.
 
 4.   Chunk header format is modified to prevent people from
      accidentally feeding it to `patch -p1`. Combined diff format
diff --git a/Documentation/git-diff-tree.txt b/Documentation/git-diff-tree.txt
index 2319b2b192..28b93ecd54 100644
--- a/Documentation/git-diff-tree.txt
+++ b/Documentation/git-diff-tree.txt
@@ -10,8 +10,8 @@ SYNOPSIS
 --------
 [verse]
 'git diff-tree' [--stdin] [-m] [-s] [-v] [--no-commit-id] [--pretty]
-	      [-t] [-r] [-c | --cc] [--root] [<common diff options>]
-	      <tree-ish> [<tree-ish>] [<path>...]
+	      [-t] [-r] [-c | --cc] [--combined-all-paths] [--root]
+	      [<common diff options>] <tree-ish> [<tree-ish>] [<path>...]
 
 DESCRIPTION
 -----------
@@ -108,6 +108,13 @@ include::pretty-options.txt[]
 	itself and the commit log message is not shown, just like in any other
 	"empty diff" case.
 
+--combined-all-paths::
+	This flag causes combined diffs (used for merge commits) to
+	list the name of the file from all parents.  It thus only has
+	effect when -c or --cc are specified, and is likely only
+	useful if filename changes are detected (i.e. when either
+	rename or copy detection have been requested).
+
 --always::
 	Show the commit itself and the commit log message even
 	if the diff itself is empty.
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 98b538bc77..642ce943c1 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -948,6 +948,13 @@ options may be given. See linkgit:git-diff-files[1] for more options.
 	the parents have only two variants and the merge result picks
 	one of them without modification.
 
+--combined-all-paths::
+	This flag causes combined diffs (used for merge commits) to
+	list the name of the file from all parents.  It thus only has
+	effect when -c or --cc are specified, and is likely only
+	useful if filename changes are detected (i.e. when either
+	rename or copy detection have been requested).
+
 -m::
 	This flag makes the merge commits show the full diff like
 	regular commits; for each merge parent, a separate log entry
diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c
index ef996126d7..ebe48f60c1 100644
--- a/builtin/diff-tree.c
+++ b/builtin/diff-tree.c
@@ -82,9 +82,13 @@ static int diff_tree_stdin(char *line)
 }
 
 static const char diff_tree_usage[] =
-"git diff-tree [--stdin] [-m] [-c] [--cc] [-s] [-v] [--pretty] [-t] [-r] [--root] "
+"git diff-tree [--stdin] [-m] [-c | --cc] [-s] [-v] [--pretty] [-t] [-r] [--root] "
 "[<common-diff-options>] <tree-ish> [<tree-ish>] [<path>...]\n"
 "  -r            diff recursively\n"
+"  -c            show combined diff for merge commits\n"
+"  --cc          show combined diff for merge commits removing uninteresting hunks\n"
+"  --combined-all-paths\n"
+"                show name of file in all parents for combined diffs\n"
 "  --root        include the initial commit as diff against /dev/null\n"
 COMMON_DIFF_OPTIONS_HELP;
 
diff --git a/combine-diff.c b/combine-diff.c
index a143c00634..54cb892ae5 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -23,11 +23,20 @@ static int compare_paths(const struct combine_diff_path *one,
 				 two->path, strlen(two->path), two->mode);
 }
 
-static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr, int n, int num_parent)
+static int filename_changed(char status)
+{
+	return status == 'R' || status == 'C';
+}
+
+static struct combine_diff_path *intersect_paths(
+	struct combine_diff_path *curr,
+	int n,
+	int num_parent,
+	int combined_all_paths)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct combine_diff_path *p, **tail = &curr;
-	int i, cmp;
+	int i, j, cmp;
 
 	if (!n) {
 		for (i = 0; i < q->nr; i++) {
@@ -50,6 +59,13 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 			oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid);
 			p->parent[n].mode = q->queue[i]->one->mode;
 			p->parent[n].status = q->queue[i]->status;
+
+			if (combined_all_paths &&
+			    filename_changed(p->parent[n].status)) {
+				strbuf_init(&p->parent[n].path, 0);
+				strbuf_addstr(&p->parent[n].path,
+					      q->queue[i]->one->path);
+			}
 			*tail = p;
 			tail = &p->next;
 		}
@@ -68,6 +84,10 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 		if (cmp < 0) {
 			/* p->path not in q->queue[]; drop it */
 			*tail = p->next;
+			for (j = 0; j < num_parent; j++)
+				if (combined_all_paths &&
+				    filename_changed(p->parent[j].status))
+					strbuf_release(&p->parent[j].path);
 			free(p);
 			continue;
 		}
@@ -81,6 +101,10 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 		oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid);
 		p->parent[n].mode = q->queue[i]->one->mode;
 		p->parent[n].status = q->queue[i]->status;
+		if (combined_all_paths &&
+		    filename_changed(p->parent[n].status))
+			strbuf_addstr(&p->parent[n].path,
+				      q->queue[i]->one->path);
 
 		tail = &p->next;
 		i++;
@@ -960,12 +984,25 @@ static void show_combined_header(struct combine_diff_path *elem,
 	if (!show_file_header)
 		return;
 
-	if (added)
-		dump_quoted_path("--- ", "", "/dev/null",
-				 line_prefix, c_meta, c_reset);
-	else
-		dump_quoted_path("--- ", a_prefix, elem->path,
-				 line_prefix, c_meta, c_reset);
+	if (rev->combined_all_paths) {
+		for (i = 0; i < num_parent; i++) {
+			char *path = filename_changed(elem->parent[i].status)
+				? elem->parent[i].path.buf : elem->path;
+			if (elem->parent[i].status == DIFF_STATUS_ADDED)
+				dump_quoted_path("--- ", "", "/dev/null",
+						 line_prefix, c_meta, c_reset);
+			else
+				dump_quoted_path("--- ", a_prefix, path,
+						 line_prefix, c_meta, c_reset);
+		}
+	} else {
+		if (added)
+			dump_quoted_path("--- ", "", "/dev/null",
+					 line_prefix, c_meta, c_reset);
+		else
+			dump_quoted_path("--- ", a_prefix, elem->path,
+					 line_prefix, c_meta, c_reset);
+	}
 	if (deleted)
 		dump_quoted_path("+++ ", "", "/dev/null",
 				 line_prefix, c_meta, c_reset);
@@ -1227,6 +1264,15 @@ static void show_raw_diff(struct combine_diff_path *p, int num_parent, struct re
 		putchar(inter_name_termination);
 	}
 
+	for (i = 0; i < num_parent; i++)
+		if (rev->combined_all_paths) {
+			if (filename_changed(p->parent[i].status))
+				write_name_quoted(p->parent[i].path.buf, stdout,
+						  inter_name_termination);
+			else
+				write_name_quoted(p->path, stdout,
+						  inter_name_termination);
+		}
 	write_name_quoted(p->path, stdout, line_termination);
 }
 
@@ -1324,7 +1370,9 @@ static const char *path_path(void *obj)
 
 /* find set of paths that every parent touches */
 static struct combine_diff_path *find_paths_generic(const struct object_id *oid,
-	const struct oid_array *parents, struct diff_options *opt)
+	const struct oid_array *parents,
+	struct diff_options *opt,
+	int combined_all_paths)
 {
 	struct combine_diff_path *paths = NULL;
 	int i, num_parent = parents->nr;
@@ -1350,7 +1398,8 @@ static struct combine_diff_path *find_paths_generic(const struct object_id *oid,
 			opt->output_format = DIFF_FORMAT_NO_OUTPUT;
 		diff_tree_oid(&parents->oid[i], oid, "", opt);
 		diffcore_std(opt);
-		paths = intersect_paths(paths, i, num_parent);
+		paths = intersect_paths(paths, i, num_parent,
+					combined_all_paths);
 
 		/* if showing diff, show it in requested order */
 		if (opt->output_format != DIFF_FORMAT_NO_OUTPUT &&
@@ -1460,7 +1509,8 @@ void diff_tree_combined(const struct object_id *oid,
 		 * diff(sha1,parent_i) for all i to do the job, specifically
 		 * for parent0.
 		 */
-		paths = find_paths_generic(oid, parents, &diffopts);
+		paths = find_paths_generic(oid, parents, &diffopts,
+					   rev->combined_all_paths);
 	}
 	else {
 		int stat_opt;
@@ -1535,6 +1585,10 @@ void diff_tree_combined(const struct object_id *oid,
 	while (paths) {
 		struct combine_diff_path *tmp = paths;
 		paths = paths->next;
+		for (i = 0; i < num_parent; i++)
+			if (rev->combined_all_paths &&
+			    filename_changed(tmp->parent[i].status))
+				strbuf_release(&tmp->parent[i].path);
 		free(tmp);
 	}
 
diff --git a/diff.h b/diff.h
index b512d0477a..90ea0256a5 100644
--- a/diff.h
+++ b/diff.h
@@ -294,6 +294,7 @@ struct combine_diff_path {
 		char status;
 		unsigned int mode;
 		struct object_id oid;
+		struct strbuf path;
 	} parent[FLEX_ARRAY];
 };
 #define combine_diff_path_size(n, l) \
diff --git a/revision.c b/revision.c
index 13cfb59b38..3be3f7da83 100644
--- a/revision.c
+++ b/revision.c
@@ -1995,6 +1995,9 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->diff = 1;
 		revs->dense_combined_merges = 0;
 		revs->combine_merges = 1;
+	} else if (!strcmp(arg, "--combined-all-paths")) {
+		revs->diff = 1;
+		revs->combined_all_paths = 1;
 	} else if (!strcmp(arg, "--cc")) {
 		revs->diff = 1;
 		revs->dense_combined_merges = 1;
@@ -2491,6 +2494,9 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 	}
 	if (revs->combine_merges)
 		revs->ignore_merges = 0;
+	if (revs->combined_all_paths && !revs->combine_merges)
+		die("--combined-all-paths makes no sense without -c or --cc");
+
 	revs->diffopt.abbrev = revs->abbrev;
 
 	if (revs->line_level_traverse) {
diff --git a/revision.h b/revision.h
index 52e5a88ff5..253c12c29a 100644
--- a/revision.h
+++ b/revision.h
@@ -171,6 +171,7 @@ struct rev_info {
 			verbose_header:1,
 			ignore_merges:1,
 			combine_merges:1,
+			combined_all_paths:1,
 			dense_combined_merges:1,
 			always_show_header:1;
 
diff --git a/t/t4038-diff-combined.sh b/t/t4038-diff-combined.sh
index e2824d3437..07b49f6d6d 100755
--- a/t/t4038-diff-combined.sh
+++ b/t/t4038-diff-combined.sh
@@ -435,4 +435,92 @@ test_expect_success 'combine diff gets tree sorting right' '
 	test_cmp expect actual
 '
 
+test_expect_success 'setup for --combined-all-paths' '
+	git branch side1c &&
+	git branch side2c &&
+	git checkout side1c &&
+	test_seq 1 10 >filename-side1c &&
+	git add filename-side1c &&
+	git commit -m with &&
+	git checkout side2c &&
+	test_seq 1 9 >filename-side2c &&
+	echo ten >>filename-side2c &&
+	git add filename-side2c &&
+	git commit -m iam &&
+	git checkout -b mergery side1c &&
+	git merge --no-commit side2c &&
+	git rm filename-side1c &&
+	echo eleven >>filename-side2c &&
+	git mv filename-side2c filename-merged &&
+	git add filename-merged &&
+	git commit
+'
+
+test_expect_success '--combined-all-paths and --raw' '
+	cat <<-\EOF >expect &&
+	::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR	filename-side1c	filename-side2c	filename-merged
+	EOF
+	git diff-tree -c -M --raw --combined-all-paths HEAD >actual.tmp &&
+	sed 1d <actual.tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success '--combined-all-paths and --cc' '
+	cat <<-\EOF >expect &&
+	--- a/filename-side1c
+	--- a/filename-side2c
+	+++ b/filename-merged
+	EOF
+	git diff-tree --cc -M --combined-all-paths HEAD >actual.tmp &&
+	grep ^[-+][-+][-+] <actual.tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success FUNNYNAMES 'setup for --combined-all-paths with funny names' '
+	git branch side1d &&
+	git branch side2d &&
+	git checkout side1d &&
+	test_seq 1 10 >$(printf "file\twith\ttabs") &&
+	git add file* &&
+	git commit -m with &&
+	git checkout side2d &&
+	test_seq 1 9 >$(printf "i\tam\ttabbed") &&
+	echo ten >>$(printf "i\tam\ttabbed") &&
+	git add *tabbed &&
+	git commit -m iam &&
+	git checkout -b funny-names-mergery side1d &&
+	git merge --no-commit side2d &&
+	git rm *tabs &&
+	echo eleven >>$(printf "i\tam\ttabbed") &&
+	git mv "$(printf "i\tam\ttabbed")" "$(printf "fickle\tnaming")" &&
+	git add fickle* &&
+	git commit
+'
+
+test_expect_success FUNNYNAMES '--combined-all-paths and --raw and funny names' '
+	cat <<-\EOF >expect &&
+	::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR	"file\twith\ttabs"	"i\tam\ttabbed"	"fickle\tnaming"
+	EOF
+	git diff-tree -c -M --raw --combined-all-paths HEAD >actual.tmp &&
+	sed 1d <actual.tmp >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success FUNNYNAMES '--combined-all-paths and --raw -and -z and funny names' '
+	printf "aaf8087c3cbd4db8e185a2d074cf27c53cfb75d7\0::100644 100644 100644 f00c965d8307308469e537302baa73048488f162 088bd5d92c2a8e0203ca8e7e4c2a5c692f6ae3f7 333b9c62519f285e1854830ade0fe1ef1d40ee1b RR\0file\twith\ttabs\0i\tam\ttabbed\0fickle\tnaming\0" >expect &&
+	git diff-tree -c -M --raw --combined-all-paths -z HEAD >actual &&
+	test_cmp -a expect actual
+'
+
+test_expect_success FUNNYNAMES '--combined-all-paths and --cc and funny names' '
+	cat <<-\EOF >expect &&
+	--- "a/file\twith\ttabs"
+	--- "a/i\tam\ttabbed"
+	+++ "b/fickle\tnaming"
+	EOF
+	git diff-tree --cc -M --combined-all-paths HEAD >actual.tmp &&
+	grep ^[-+][-+][-+] <actual.tmp >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.20.1.311.gb8408a6075


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v5 2/2] squash! log,diff-tree: add --combined-all-paths option
  2019-02-08  1:12       ` [PATCH v5 0/2] add --combined-all-paths option to log and diff-tree Elijah Newren
  2019-02-08  1:12         ` [PATCH v5 1/2] log,diff-tree: add --combined-all-paths option Elijah Newren
@ 2019-02-08  1:12         ` Elijah Newren
  2019-02-08  4:14           ` Junio C Hamano
  1 sibling, 1 reply; 39+ messages in thread
From: Elijah Newren @ 2019-02-08  1:12 UTC (permalink / raw)
  To: gitster; +Cc: git, Jeff King, brian m . carlson, Derrick Stolee, Elijah Newren

This also adds "rename from <oldpath>" and "copy from <oldpath>"
extended headers when renames or copies are involved.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/diff-generate-patch.txt |  7 +++----
 combine-diff.c                        | 15 +++++++++++++++
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/Documentation/diff-generate-patch.txt b/Documentation/diff-generate-patch.txt
index f10ca410ad..f0c2a17aef 100644
--- a/Documentation/diff-generate-patch.txt
+++ b/Documentation/diff-generate-patch.txt
@@ -128,12 +128,11 @@ or like this (when `--cc` option is used):
        mode <mode>,<mode>..<mode>
        new file mode <mode>
        deleted file mode <mode>,<mode>
+       copy from <path>
+       rename from <path>
 +
 The `mode <mode>,<mode>..<mode>` line appears only if at least one of
-the <mode> is different from the rest. Extended headers with
-information about detected contents movement (renames and
-copying detection) are designed to work with diff of two
-<tree-ish> and are not used by combined diff format.
+the <mode> is different from the rest.
 
 3.   It is followed by two-line from-file/to-file header
 
diff --git a/combine-diff.c b/combine-diff.c
index 54cb892ae5..04a139ea03 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -981,6 +981,21 @@ static void show_combined_header(struct combine_diff_path *elem,
 		printf("%s\n", c_reset);
 	}
 
+	for (i = 0; i < num_parent; i++) {
+		switch (elem->parent[i].status) {
+		case DIFF_STATUS_COPIED:
+			dump_quoted_path("copy from ", "",
+					 elem->parent[i].path.buf,
+					 line_prefix, c_meta, c_reset);
+			break;
+		case DIFF_STATUS_RENAMED:
+			dump_quoted_path("rename from ", "",
+					 elem->parent[i].path.buf,
+					 line_prefix, c_meta, c_reset);
+			break;
+		}
+	}
+
 	if (!show_file_header)
 		return;
 
-- 
2.20.1.311.gb8408a6075


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v5 1/2] log,diff-tree: add --combined-all-paths option
  2019-02-08  1:12         ` [PATCH v5 1/2] log,diff-tree: add --combined-all-paths option Elijah Newren
@ 2019-02-08  4:00           ` Junio C Hamano
  2019-02-08  6:52             ` Elijah Newren
  0 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2019-02-08  4:00 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, Jeff King, brian m . carlson, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

> +For `-c` and `--cc`, only the destination or final path is shown even
> +if the file was renamed on any side of history.  With
> +`--combined-all-paths`, the name of the path in each parent is shown
> +followed by the name of the path in the merge commit.
> +
> +Examples for `-c` and `-cc` without `--combined-all-paths`:

s/-cc/--cc/


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v5 2/2] squash! log,diff-tree: add --combined-all-paths option
  2019-02-08  1:12         ` [PATCH v5 2/2] squash! " Elijah Newren
@ 2019-02-08  4:14           ` Junio C Hamano
  2019-02-08  6:48             ` Elijah Newren
  0 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2019-02-08  4:14 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git, Jeff King, brian m . carlson, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

> +	for (i = 0; i < num_parent; i++) {
> +		switch (elem->parent[i].status) {
> +		case DIFF_STATUS_COPIED:
> +			dump_quoted_path("copy from ", "",
> +					 elem->parent[i].path.buf,
> +					 line_prefix, c_meta, c_reset);
> +			break;
> +		case DIFF_STATUS_RENAMED:
> +			dump_quoted_path("rename from ", "",
> +					 elem->parent[i].path.buf,
> +					 line_prefix, c_meta, c_reset);
> +			break;
> +		}
> +	}

The explanation for this addition was that it is hard to tell from
which side a rename happened in the three-dash lines alone:

    --- a/packages/search/ete/src/test/resources/test-suite.yml
    --- a/packages/search/src/geb/resources/test-suite.yml
    +++ b/packages/search/ete/src/test/resources/test-suite.yml

and your hope was that adding:

    rename from packages/search/src/geb/resources/test-suite.yml

would help especially when the path is overly long.

But I am not sure if that single "rename from" is all that helpful.
You cannot tell relative to which parent the rename happened without
going back to the three-dash lines.  A loop that iterates over all
parents but shows only a line for a parent that actually had copy or
rename loses "the line is talking about the change from this parent"
which is a fairly important piece of information, doesn't it?

If we attempt to clarify it by adding some more information on the
line, e.g.

    rename relative to parent #1 from packages/search/src/geb/...

the line gets even longer, and going back to look at

    --- a/packages/search/src/geb/resources/test-suite.yml

may turn out to be an easier way to learn that information.
So,... I dunno.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v5 2/2] squash! log,diff-tree: add --combined-all-paths option
  2019-02-08  4:14           ` Junio C Hamano
@ 2019-02-08  6:48             ` Elijah Newren
  0 siblings, 0 replies; 39+ messages in thread
From: Elijah Newren @ 2019-02-08  6:48 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, brian m . carlson, Derrick Stolee

On Thu, Feb 7, 2019 at 8:14 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > +     for (i = 0; i < num_parent; i++) {
> > +             switch (elem->parent[i].status) {
> > +             case DIFF_STATUS_COPIED:
> > +                     dump_quoted_path("copy from ", "",
> > +                                      elem->parent[i].path.buf,
> > +                                      line_prefix, c_meta, c_reset);
> > +                     break;
> > +             case DIFF_STATUS_RENAMED:
> > +                     dump_quoted_path("rename from ", "",
> > +                                      elem->parent[i].path.buf,
> > +                                      line_prefix, c_meta, c_reset);
> > +                     break;
> > +             }
> > +     }
>
> The explanation for this addition was that it is hard to tell from
> which side a rename happened in the three-dash lines alone:
>
>     --- a/packages/search/ete/src/test/resources/test-suite.yml
>     --- a/packages/search/src/geb/resources/test-suite.yml
>     +++ b/packages/search/ete/src/test/resources/test-suite.yml
>
> and your hope was that adding:
>
>     rename from packages/search/src/geb/resources/test-suite.yml
>
> would help especially when the path is overly long.
>
> But I am not sure if that single "rename from" is all that helpful.
> You cannot tell relative to which parent the rename happened without
> going back to the three-dash lines.  A loop that iterates over all
> parents but shows only a line for a parent that actually had copy or
> rename loses "the line is talking about the change from this parent"
> which is a fairly important piece of information, doesn't it?
>
> If we attempt to clarify it by adding some more information on the
> line, e.g.
>
>     rename relative to parent #1 from packages/search/src/geb/...
>
> the line gets even longer, and going back to look at
>
>     --- a/packages/search/src/geb/resources/test-suite.yml
>
> may turn out to be an easier way to learn that information.
> So,... I dunno.

Yeah, fair enough.  I'll drop the patch.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v5 1/2] log,diff-tree: add --combined-all-paths option
  2019-02-08  4:00           ` Junio C Hamano
@ 2019-02-08  6:52             ` Elijah Newren
  2019-02-08 17:50               ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Elijah Newren @ 2019-02-08  6:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, brian m . carlson, Derrick Stolee

On Thu, Feb 7, 2019 at 8:00 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > +For `-c` and `--cc`, only the destination or final path is shown even
> > +if the file was renamed on any side of history.  With
> > +`--combined-all-paths`, the name of the path in each parent is shown
> > +followed by the name of the path in the merge commit.
> > +
> > +Examples for `-c` and `-cc` without `--combined-all-paths`:
>
> s/-cc/--cc/

Thanks for catching that.  I was going to go fixup and resubmit, and
drop patch 2 while at it, but it looks like you've already squashed
this fix in to commit d76ce4f73 that you've pushed out to pu.  Do you
want to just drop patch two from that series and call it good?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v5 1/2] log,diff-tree: add --combined-all-paths option
  2019-02-08  6:52             ` Elijah Newren
@ 2019-02-08 17:50               ` Junio C Hamano
  0 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2019-02-08 17:50 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Jeff King, brian m . carlson, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

> On Thu, Feb 7, 2019 at 8:00 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Elijah Newren <newren@gmail.com> writes:
>>
>> > +For `-c` and `--cc`, only the destination or final path is shown even
>> > +if the file was renamed on any side of history.  With
>> > +`--combined-all-paths`, the name of the path in each parent is shown
>> > +followed by the name of the path in the merge commit.
>> > +
>> > +Examples for `-c` and `-cc` without `--combined-all-paths`:
>>
>> s/-cc/--cc/
>
> Thanks for catching that.  I was going to go fixup and resubmit, and
> drop patch 2 while at it, but it looks like you've already squashed
> this fix in to commit d76ce4f73 that you've pushed out to pu.  Do you
> want to just drop patch two from that series and call it good?

Sure.
Thanks.

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2019-02-08 17:50 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-24 16:46 [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames Elijah Newren
2019-01-25  2:19 ` brian m. carlson
2019-01-25 16:27   ` Elijah Newren
2019-01-25 14:45 ` Derrick Stolee
2019-01-25 16:30   ` Elijah Newren
2019-01-25 16:54 ` [PATCH v2] " Elijah Newren
2019-01-25 17:40   ` Derrick Stolee
2019-01-25 17:52     ` Elijah Newren
2019-01-25 19:51       ` Eric Sunshine
2019-01-26 22:07         ` Elijah Newren
2019-01-27  1:52       ` brian m. carlson
2019-01-26 22:18   ` [PATCH v3] log,diff-tree: add --combined-all-names option Elijah Newren
2019-02-04 20:07     ` [PATCH v4] " Elijah Newren
2019-02-04 21:20       ` Junio C Hamano
2019-02-05 15:51         ` Elijah Newren
2019-02-05 20:39           ` Junio C Hamano
2019-02-07 19:50             ` Elijah Newren
2019-02-07 20:25               ` Junio C Hamano
2019-02-07 22:10                 ` Elijah Newren
2019-02-07 23:31                   ` Junio C Hamano
2019-02-07 23:48                     ` Elijah Newren
2019-02-05  9:48       ` Johannes Schindelin
2019-02-05 15:54         ` Elijah Newren
2019-02-05 18:04           ` Junio C Hamano
2019-02-07 22:28       ` Junio C Hamano
2019-02-07 23:48         ` Elijah Newren
2019-02-08  1:12       ` [PATCH v5 0/2] add --combined-all-paths option to log and diff-tree Elijah Newren
2019-02-08  1:12         ` [PATCH v5 1/2] log,diff-tree: add --combined-all-paths option Elijah Newren
2019-02-08  4:00           ` Junio C Hamano
2019-02-08  6:52             ` Elijah Newren
2019-02-08 17:50               ` Junio C Hamano
2019-02-08  1:12         ` [PATCH v5 2/2] squash! " Elijah Newren
2019-02-08  4:14           ` Junio C Hamano
2019-02-08  6:48             ` Elijah Newren
2019-01-25 19:29 ` [PATCH] log,diff-tree: add --combined-with-paths options for merges with renames Junio C Hamano
2019-01-25 20:04   ` Elijah Newren
2019-01-25 22:21     ` Junio C Hamano
2019-01-26 22:12       ` Elijah Newren
2019-01-28  0:19         ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).