git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff
@ 2016-08-09 22:32 Jacob Keller
  2016-08-09 22:40 ` Junio C Hamano
  2016-08-09 22:50 ` Stefan Beller
  0 siblings, 2 replies; 7+ messages in thread
From: Jacob Keller @ 2016-08-09 22:32 UTC (permalink / raw)
  To: git; +Cc: Jacob Keller

From: Jacob Keller <jacob.keller@gmail.com>

For projects which have frequent updates to submodules it is often
useful to be able to see a submodule update commit as a difference.
Teach diff's --submodule= a new "diff" format which will execute a diff
for the submodule between the old and new commit, and display it as
a standard diff. This allows users to easily see what changed in
a submodule without having to dig into the submodule themselves.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
---
There are no tests yet. Stefan suggested the use of child_process, but I
really believe there has to be some way of getting the diff without
having to run a sub process (as this means we also can't do the diff
without a checked out submodule). It doesn't properly handle options
either, so a better solution would be much appreciated.

I also would prefer if the diff actually prefixed the files with
"path/to/submodule" so that it was obvious where the file was located in
the directory.

Suggestions welcome.

 Documentation/diff-config.txt  |  3 +-
 Documentation/diff-options.txt |  6 ++--
 diff.c                         | 21 +++++++++++--
 diff.h                         |  2 +-
 submodule.c                    | 67 ++++++++++++++++++++++++++++++++++++++++++
 submodule.h                    |  5 ++++
 6 files changed, 97 insertions(+), 7 deletions(-)

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index d5a5b17d5088..f5d693afad6c 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -123,7 +123,8 @@ diff.suppressBlankEmpty::
 diff.submodule::
 	Specify the format in which differences in submodules are
 	shown.  The "log" format lists the commits in the range like
-	linkgit:git-submodule[1] `summary` does.  The "short" format
+	linkgit:git-submodule[1] `summary` does.  The "diff" format shows the
+	diff between the beginning and end of the range. The "short" format
 	format just shows the names of the commits at the beginning
 	and end of the range.  Defaults to short.
 
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 705a87394200..b17348407805 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -215,8 +215,10 @@ any of those replacements occurred.
 	the commits in the range like linkgit:git-submodule[1] `summary` does.
 	Omitting the `--submodule` option or specifying `--submodule=short`,
 	uses the 'short' format. This format just shows the names of the commits
-	at the beginning and end of the range.  Can be tweaked via the
-	`diff.submodule` configuration variable.
+	at the beginning and end of the range. When `--submodule=diff` is
+	given, the 'diff' format is used. This format shows the diff between
+	the old and new submodule commmit from the perspective of the
+	submodule.  Can be tweaked via the `diff.submodule` configuration variable.
 
 --color[=<when>]::
 	Show colored diff.
diff --git a/diff.c b/diff.c
index b43d3dd2ecb7..74d43931c8df 100644
--- a/diff.c
+++ b/diff.c
@@ -130,12 +130,18 @@ static int parse_dirstat_params(struct diff_options *options, const char *params
 
 static int parse_submodule_params(struct diff_options *options, const char *value)
 {
-	if (!strcmp(value, "log"))
+	if (!strcmp(value, "log")) {
 		DIFF_OPT_SET(options, SUBMODULE_LOG);
-	else if (!strcmp(value, "short"))
+		DIFF_OPT_CLR(options, SUBMODULE_DIFF);
+	else if (!strcmp(value, "diff")) {
+		DIFF_OPT_SET(options, SUBMODULE_DIFF);
 		DIFF_OPT_CLR(options, SUBMODULE_LOG);
-	else
+	} else if (!strcmp(value, "short")) {
+		DIFF_OPT_CLR(options, SUBMODULE_DIFF);
+		DIFF_OPT_CLR(options, SUBMODULE_LOG);
+	} else {
 		return -1;
+	}
 	return 0;
 }
 
@@ -2310,6 +2316,15 @@ static void builtin_diff(const char *name_a,
 				two->dirty_submodule,
 				meta, del, add, reset);
 		return;
+	} else if (DIFF_OPT_TST(o, SUBMODULE_DIFF) &&
+			(!one->mode || S_ISGITLINK(one->mode)) &&
+			(!two->mode || S_ISGITLINK(two->mode))) {
+		show_submodule_diff(o->file, one->path ? one->path : two->path,
+				line_prefix,
+				one->oid.hash, two->oid.hash,
+				two->dirty_submodule,
+				meta, reset);
+		return;
 	}
 
 	if (DIFF_OPT_TST(o, ALLOW_TEXTCONV)) {
diff --git a/diff.h b/diff.h
index 125447be09eb..a80f3e5bafe9 100644
--- a/diff.h
+++ b/diff.h
@@ -69,7 +69,7 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct diff_options *opt, void *data)
 #define DIFF_OPT_FIND_COPIES_HARDER  (1 <<  6)
 #define DIFF_OPT_FOLLOW_RENAMES      (1 <<  7)
 #define DIFF_OPT_RENAME_EMPTY        (1 <<  8)
-/* (1 <<  9) unused */
+#define DIFF_OPT_SUBMODULE_DIFF      (1 <<  9)
 #define DIFF_OPT_HAS_CHANGES         (1 << 10)
 #define DIFF_OPT_QUICK               (1 << 11)
 #define DIFF_OPT_NO_INDEX            (1 << 12)
diff --git a/submodule.c b/submodule.c
index 1b5cdfb7e784..4a322513d27c 100644
--- a/submodule.c
+++ b/submodule.c
@@ -333,6 +333,73 @@ static void print_submodule_summary(struct rev_info *rev, FILE *f,
 	strbuf_release(&sb);
 }
 
+static int prepare_submodule_diff(struct strbuf *buf, const char *path,
+		unsigned char one[20], unsigned char two[20])
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+	cp.git_cmd = 1;
+	cp.dir = path;
+	cp.out = -1;
+	cp.no_stdin = 1;
+	argv_array_push(&cp.args, "diff");
+	argv_array_push(&cp.args, sha1_to_hex(one));
+	argv_array_push(&cp.args, sha1_to_hex(two));
+
+	if (start_command(&cp))
+		return -1;
+
+	if (strbuf_read(buf, cp.out, 0) < 0)
+		return -1;
+
+	if (finish_command(&cp))
+		return -1;
+
+	return 0;
+}
+
+void show_submodule_diff(FILE *f, const char *path,
+		const char *line_prefix,
+		unsigned char one[20], unsigned char two[20],
+		unsigned dirty_submodule, const char *meta,
+		const char *reset)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct strbuf sb = STRBUF_INIT;
+	const char *message = NULL;
+
+	if (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED)
+		fprintf(f, "%sSubmodule %s contains untracked content\n",
+			line_prefix, path);
+	if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED)
+		fprintf(f, "%sSubmodule %s contains modified content\n",
+			line_prefix, path);
+
+	if (!hashcmp(one, two)) {
+		strbuf_release(&sb);
+		return;
+	}
+
+	if (add_submodule_odb(path))
+		message = "(not checked out)";
+	else if (prepare_submodule_diff(&buf, path, one, two))
+		message = "(diff failed)";
+
+	strbuf_addf(&sb, "%s%sSubmodule %s %s..", line_prefix, meta, path,
+			find_unique_abbrev(one, DEFAULT_ABBREV));
+	strbuf_addf(&sb, "%s", find_unique_abbrev(two, DEFAULT_ABBREV));
+	if (message)
+		strbuf_addf(&sb, " %s%s\n", message, reset);
+	else
+		strbuf_addf(&sb, ":%s\n", reset);
+	fwrite(sb.buf, sb.len, 1, f);
+
+	if (!message) /* only NULL if we succeeded in obtaining a diff */
+		fwrite(buf.buf, buf.len, 1, f);
+
+	strbuf_release(&buf);
+	strbuf_release(&sb);
+}
+
 void show_submodule_summary(FILE *f, const char *path,
 		const char *line_prefix,
 		unsigned char one[20], unsigned char two[20],
diff --git a/submodule.h b/submodule.h
index 2af939099819..f9180712ae2c 100644
--- a/submodule.h
+++ b/submodule.h
@@ -41,6 +41,11 @@ int parse_submodule_update_strategy(const char *value,
 		struct submodule_update_strategy *dst);
 const char *submodule_strategy_to_string(const struct submodule_update_strategy *s);
 void handle_ignore_submodules_arg(struct diff_options *diffopt, const char *);
+void show_submodule_diff(FILE *f, const char *path,
+		const char *line_prefix,
+		unsigned char one[20], unsigned char two[20],
+		unsigned dirty_submodule, const char *meta,
+		const char *reset);
 void show_submodule_summary(FILE *f, const char *path,
 		const char *line_prefix,
 		unsigned char one[20], unsigned char two[20],
-- 
2.9.2.1004.gdad50a3


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff
  2016-08-09 22:32 [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff Jacob Keller
@ 2016-08-09 22:40 ` Junio C Hamano
  2016-08-09 22:50 ` Stefan Beller
  1 sibling, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2016-08-09 22:40 UTC (permalink / raw)
  To: Jacob Keller; +Cc: git, Jacob Keller

Jacob Keller <jacob.e.keller@intel.com> writes:

> +static int prepare_submodule_diff(struct strbuf *buf, const char *path,
> +		unsigned char one[20], unsigned char two[20])
> +{
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +	cp.git_cmd = 1;
> +	cp.dir = path;
> +	cp.out = -1;
> +	cp.no_stdin = 1;
> +	argv_array_push(&cp.args, "diff");
> +	argv_array_push(&cp.args, sha1_to_hex(one));
> +	argv_array_push(&cp.args, sha1_to_hex(two));
> +
> +	if (start_command(&cp))
> +		return -1;
> +
> +	if (strbuf_read(buf, cp.out, 0) < 0)
> +		return -1;
> +
> +	if (finish_command(&cp))
> +		return -1;
> +
> +	return 0;
> +}

It is a good idea to keep the submodule data isolated from the main
process by going through run-command/start-command interface (I
think the call to add_submodule_odb() should be rethought and
removed if possible).

I however wonder if you want to use src/dst-prefix to make the path
to the submodule appear there?  That is, if your superproject has a
submodule at "dir/" and two versions of the submodule changes a file
"doc/README", wouldn't you rather want to see

	diff --git a/dir/doc/README b/dir/doc/README

instead of comparison between a/doc/README and b/doc/README?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff
  2016-08-09 22:32 [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff Jacob Keller
  2016-08-09 22:40 ` Junio C Hamano
@ 2016-08-09 22:50 ` Stefan Beller
  2016-08-09 22:56   ` Jacob Keller
  1 sibling, 1 reply; 7+ messages in thread
From: Stefan Beller @ 2016-08-09 22:50 UTC (permalink / raw)
  To: Jacob Keller; +Cc: git@vger.kernel.org, Jacob Keller

On Tue, Aug 9, 2016 at 3:32 PM, Jacob Keller <jacob.e.keller@intel.com> wrote:
> From: Jacob Keller <jacob.keller@gmail.com>
>
> For projects which have frequent updates to submodules it is often
> useful to be able to see a submodule update commit as a difference.
> Teach diff's --submodule= a new "diff" format which will execute a diff
> for the submodule between the old and new commit, and display it as
> a standard diff. This allows users to easily see what changed in
> a submodule without having to dig into the submodule themselves.
>
> Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
> ---
> There are no tests yet. Stefan suggested the use of child_process,

Well you said "I just want this one command but don't know where to put it "
so it's natural to suggest using child_process.  ;)

> but I
> really believe there has to be some way of getting the diff without
> having to run a sub process (as this means we also can't do the diff
> without a checked out submodule). It doesn't properly handle options
> either, so a better solution would be much appreciated.

Oh right, we would need to codify all options again (and all future options,
too)

>
> I also would prefer if the diff actually prefixed the files with
> "path/to/submodule" so that it was obvious where the file was located in
> the directory.
>
> Suggestions welcome.
>
> +
> +       if (start_command(&cp))

I wonder if we want to stay single-execution here,
i.e. if we rather want to use run_processes_parallel
for all the submodules at the same time?

Though then non deterministic ordering may be an issue.

> +               return -1;
> +
> +       if (strbuf_read(buf, cp.out, 0) < 0)

So we keep the whole diff in memory
I don't know much about the diff machinery, but I thought
the rest of the diff machinery just streams it out?

> +
> +void show_submodule_diff(FILE *f, const char *path,
> +               const char *line_prefix,
> +               unsigned char one[20], unsigned char two[20],
> +               unsigned dirty_submodule, const char *meta,
> +               const char *reset)
> +{
> +       struct strbuf buf = STRBUF_INIT;
> +       struct strbuf sb = STRBUF_INIT;
> +       const char *message = NULL;
> +
> +       if (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED)
> +               fprintf(f, "%sSubmodule %s contains untracked content\n",
> +                       line_prefix, path);
> +       if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED)
> +               fprintf(f, "%sSubmodule %s contains modified content\n",
> +                       line_prefix, path);
> +
> +       if (!hashcmp(one, two)) {
> +               strbuf_release(&sb);
> +               return;
> +       }
> +
> +       if (add_submodule_odb(path))
> +               message = "(not checked out)";

When not checked out, we can invoke the diff command
in .git/modules/<name> as that is the git dir of the submodule,
i.e. operating diff with a bare repo?

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff
  2016-08-09 22:50 ` Stefan Beller
@ 2016-08-09 22:56   ` Jacob Keller
  2016-08-09 23:10     ` Stefan Beller
  2016-08-10 19:41     ` Johannes Sixt
  0 siblings, 2 replies; 7+ messages in thread
From: Jacob Keller @ 2016-08-09 22:56 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Jacob Keller, git@vger.kernel.org

On Tue, Aug 9, 2016 at 3:50 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Aug 9, 2016 at 3:32 PM, Jacob Keller <jacob.e.keller@intel.com> wrote:
>> From: Jacob Keller <jacob.keller@gmail.com>
>>
>> For projects which have frequent updates to submodules it is often
>> useful to be able to see a submodule update commit as a difference.
>> Teach diff's --submodule= a new "diff" format which will execute a diff
>> for the submodule between the old and new commit, and display it as
>> a standard diff. This allows users to easily see what changed in
>> a submodule without having to dig into the submodule themselves.
>>
>> Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
>> ---
>> There are no tests yet. Stefan suggested the use of child_process,
>
> Well you said "I just want this one command but don't know where to put it "
> so it's natural to suggest using child_process.  ;)

Right this is a straight forward way..

>
>> but I
>> really believe there has to be some way of getting the diff without
>> having to run a sub process (as this means we also can't do the diff
>> without a checked out submodule). It doesn't properly handle options
>> either, so a better solution would be much appreciated.
>
> Oh right, we would need to codify all options again (and all future options,
> too)


That's why I'd prefer if we had a way to do this via builtins, because
it would make option passing easier.

>
>>
>> I also would prefer if the diff actually prefixed the files with
>> "path/to/submodule" so that it was obvious where the file was located in
>> the directory.
>>
>> Suggestions welcome.
>>
>> +
>> +       if (start_command(&cp))
>
> I wonder if we want to stay single-execution here,
> i.e. if we rather want to use run_processes_parallel
> for all the submodules at the same time?
>
> Though then non deterministic ordering may be an issue.

There is no easy way to do this, we're given it one module at a time
and it would be a huge re-write to make it go in parallel. I think
that's the wrong way to think about this.

>
>> +               return -1;
>> +
>> +       if (strbuf_read(buf, cp.out, 0) < 0)
>
> So we keep the whole diff in memory
> I don't know much about the diff machinery, but I thought
> the rest of the diff machinery just streams it out?

Yea, but I can't figure out how to do that. Is there an easy way to
stream chunks from the pipe straight into the file?

>
>> +
>> +void show_submodule_diff(FILE *f, const char *path,
>> +               const char *line_prefix,
>> +               unsigned char one[20], unsigned char two[20],
>> +               unsigned dirty_submodule, const char *meta,
>> +               const char *reset)
>> +{
>> +       struct strbuf buf = STRBUF_INIT;
>> +       struct strbuf sb = STRBUF_INIT;
>> +       const char *message = NULL;
>> +
>> +       if (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED)
>> +               fprintf(f, "%sSubmodule %s contains untracked content\n",
>> +                       line_prefix, path);
>> +       if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED)
>> +               fprintf(f, "%sSubmodule %s contains modified content\n",
>> +                       line_prefix, path);
>> +
>> +       if (!hashcmp(one, two)) {
>> +               strbuf_release(&sb);
>> +               return;
>> +       }
>> +
>> +       if (add_submodule_odb(path))
>> +               message = "(not checked out)";
>
> When not checked out, we can invoke the diff command
> in .git/modules/<name> as that is the git dir of the submodule,
> i.e. operating diff with a bare repo?

We can actually do this every time. How would you pass that in a
child_process? I don't think it's "dir" but instead passing
"--git-dir" somehow?

Thanks,
Jake

>
> Thanks,
> Stefan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff
  2016-08-09 22:56   ` Jacob Keller
@ 2016-08-09 23:10     ` Stefan Beller
  2016-08-09 23:15       ` Stefan Beller
  2016-08-10 19:41     ` Johannes Sixt
  1 sibling, 1 reply; 7+ messages in thread
From: Stefan Beller @ 2016-08-09 23:10 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Jacob Keller, git@vger.kernel.org

On Tue, Aug 9, 2016 at 3:56 PM, Jacob Keller <jacob.keller@gmail.com> wrote:

>>> +
>>> +       if (strbuf_read(buf, cp.out, 0) < 0)
>>
>> So we keep the whole diff in memory
>> I don't know much about the diff machinery, but I thought
>> the rest of the diff machinery just streams it out?
>
> Yea, but I can't figure out how to do that. Is there an easy way to
> stream chunks from the pipe straight into the file?

Maybe, roughly:

    cp.stdout = -1;
    start_command(&cp);
    do {
        int r = xread(cp.stdout, buf, MAX_IO_SIZE);
    } while (r >=0);
    finish_command(&cp);

xread does use a poll() for you so it is not active polling,
but only reading when data is available.


>>
>> When not checked out, we can invoke the diff command
>> in .git/modules/<name> as that is the git dir of the submodule,
>> i.e. operating diff with a bare repo?
>
> We can actually do this every time. How would you pass that in a
> child_process? I don't think it's "dir" but instead passing
> "--git-dir" somehow?

git -C $GIT_DIR diff --relative ${superprojects ce->name}

with $GIT_DIR = ${SUPERPROJECTS GITDIR + modules/<name>/ with name to be looked
as submodule_from_path(ce->name) and then taking `name` from the
submodule struct)


Thanks,
Stefan

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff
  2016-08-09 23:10     ` Stefan Beller
@ 2016-08-09 23:15       ` Stefan Beller
  0 siblings, 0 replies; 7+ messages in thread
From: Stefan Beller @ 2016-08-09 23:15 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Jacob Keller, git@vger.kernel.org

On Tue, Aug 9, 2016 at 4:10 PM, Stefan Beller <sbeller@google.com> wrote:

> xread does use a poll() for you so it is not active polling,
> but only reading when data is available.

s/active polling/ spinning/

>
>
>>>
>>> When not checked out, we can invoke the diff command
>>> in .git/modules/<name> as that is the git dir of the submodule,
>>> i.e. operating diff with a bare repo?
>>
>> We can actually do this every time. How would you pass that in a
>> child_process? I don't think it's "dir" but instead passing
>> "--git-dir" somehow?

I think it doesn't matter. You can still use .dir.
(That would be equivalent to `cd $GIT_DIR && git diff sha1..sha1`
which works just as well, even in the submodule case)
Alternatively you could do
    argv_array_pushf(cp.env_array, "GIT_DIR=%s", ...)

So I would drop the -C here and just use the .dir attribute.

> git -C $GIT_DIR diff --relative ${superprojects ce->name}

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff
  2016-08-09 22:56   ` Jacob Keller
  2016-08-09 23:10     ` Stefan Beller
@ 2016-08-10 19:41     ` Johannes Sixt
  1 sibling, 0 replies; 7+ messages in thread
From: Johannes Sixt @ 2016-08-10 19:41 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Stefan Beller, Jacob Keller, git@vger.kernel.org

Am 10.08.2016 um 00:56 schrieb Jacob Keller:
> On Tue, Aug 9, 2016 at 3:50 PM, Stefan Beller <sbeller@google.com> wrote:
>> On Tue, Aug 9, 2016 at 3:32 PM, Jacob Keller <jacob.e.keller@intel.com> wrote:
>>> +       if (strbuf_read(buf, cp.out, 0) < 0)
>>
>> So we keep the whole diff in memory
>> I don't know much about the diff machinery, but I thought
>> the rest of the diff machinery just streams it out?
>
> Yea, but I can't figure out how to do that. Is there an easy way to
> stream chunks from the pipe straight into the file?

You don't stream via a pipe, you let the sub-process write directly to 
the file:

	fflush(f);
	cp.out = dup(fileno(f));

Of course, you no longer "prepare_submodule_diff" anymore (which 
currently runs the child process), but you print the "Submodule" header 
line, then invoke the child process, and if there was a failure, you 
fprintf(f, "(diff failed)").

-- Hannes


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-08-10 19:44 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-09 22:32 [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff Jacob Keller
2016-08-09 22:40 ` Junio C Hamano
2016-08-09 22:50 ` Stefan Beller
2016-08-09 22:56   ` Jacob Keller
2016-08-09 23:10     ` Stefan Beller
2016-08-09 23:15       ` Stefan Beller
2016-08-10 19:41     ` Johannes Sixt

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).