git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jacob Keller <jacob.e.keller@intel.com>
Cc: git@vger.kernel.org, Stefan Beller <stefanbeller@gmail.com>,
	Jeff King <peff@peff.net>, Johannes Sixt <j6t@kdbg.org>,
	Jacob Keller <jacob.keller@gmail.com>
Subject: Re: [PATCH v6 3/3] diff: add SUBMODULE_DIFF format to display submodule diff
Date: Tue, 16 Aug 2016 11:48:14 -0700	[thread overview]
Message-ID: <xmqqh9akczyp.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <20160815230702.30817-4-jacob.e.keller@intel.com> (Jacob Keller's message of "Mon, 15 Aug 2016 16:07:02 -0700")

Jacob Keller <jacob.e.keller@intel.com> writes:

> diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
> index d5a5b17d5088..f5d693afad6c 100644
> --- a/Documentation/diff-config.txt
> +++ b/Documentation/diff-config.txt
> @@ -123,7 +123,8 @@ diff.suppressBlankEmpty::
>  diff.submodule::
>  	Specify the format in which differences in submodules are
>  	shown.  The "log" format lists the commits in the range like
> -	linkgit:git-submodule[1] `summary` does.  The "short" format
> +	linkgit:git-submodule[1] `summary` does.  The "diff" format shows the
> +	diff between the beginning and end of the range. The "short" format
>  	format just shows the names of the commits at the beginning
>  	and end of the range.  Defaults to short.

It would be much better to describe the default one first and then
more involved one next, no?  That would also match what the change
to "diff-options" in this patch does (can be seen below).

> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index cc4342e2034d..d3ca4ad2c2c5 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -215,8 +215,11 @@ any of those replacements occurred.
>  	the commits in the range like linkgit:git-submodule[1] `summary` does.
>  	Omitting the `--submodule` option or specifying `--submodule=short`,
>  	uses the 'short' format. This format just shows the names of the commits
> -	at the beginning and end of the range.  Can be tweaked via the
> -	`diff.submodule` configuration variable.
> +	at the beginning and end of the range. When `--submodule=diff` is
> +	given, the 'diff' format is used. This format shows the diff between
> +	the old and new submodule commmit from the perspective of the
> +	submodule.  Defaults to `diff.submodule` or 'short' if the config
> +	option is unset.

> @@ -2311,6 +2322,15 @@ static void builtin_diff(const char *name_a,
>  				two->dirty_submodule,
>  				meta, del, add, reset);
>  		return;
> +	} else if (o->submodule_format == DIFF_SUBMODULE_DIFF &&
> +		   (!one->mode || S_ISGITLINK(one->mode)) &&
> +		   (!two->mode || S_ISGITLINK(two->mode))) {
> +		show_submodule_diff(o->file, one->path ? one->path : two->path,
> +				line_prefix,
> +				one->oid.hash, two->oid.hash,
> +				two->dirty_submodule,
> +				meta, a_prefix, b_prefix, reset);
> +		return;
>  	}

The "either missing or is submodule" check used here is being
consistent with the existing "submodule=log" case.  Good.

> diff --git a/submodule.c b/submodule.c
> index 1b5cdfb7e784..b1da68dd49c9 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -333,6 +333,136 @@ static void print_submodule_summary(struct rev_info *rev, FILE *f,
>  	strbuf_release(&sb);
>  }
>  
> +void show_submodule_diff(FILE *f, const char *path,
> +		const char *line_prefix,
> +		unsigned char one[20], unsigned char two[20],
> +		unsigned dirty_submodule, const char *meta,
> +		const char *a_prefix, const char *b_prefix,
> +		const char *reset)
> +{
> +	struct strbuf submodule_git_dir = STRBUF_INIT, sb = STRBUF_INIT;
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +	const char *git_dir;
> +
> +	if (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED) {
> +		fprintf(f, "%sSubmodule %s contains untracked content\n",
> +			line_prefix, path);
> +	}
> +	if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED) {
> +		fprintf(f, "%sSubmodule %s contains modified content\n",
> +			line_prefix, path);
> +	}
> +
> +	strbuf_addf(&sb, "%s%sSubmodule %s %s..",
> +		    line_prefix, meta, path,
> +		    find_unique_abbrev(one, DEFAULT_ABBREV));
> +	strbuf_addf(&sb, "%s:%s",
> +		    find_unique_abbrev(two, DEFAULT_ABBREV),
> +		    reset);
> +	fwrite(sb.buf, sb.len, 1, f);
> +
> +	if (is_null_sha1(one))
> +		fprintf(f, " (new submodule)");
> +	if (is_null_sha1(two))
> +		fprintf(f, " (submodule deleted)");

These messages are in sync with show_submodule_summary() that is
used in --submodule=log codepath.  Good.

> +	/*
> +	 * We need to determine the most accurate location to call the sub
> +	 * command, and handle the various corner cases involved. We'll first
> +	 * attempt to use the path directly if the submodule is checked out.
> +	 * Then, if that fails, we'll check the standard module location in
> +	 * the git directory. If even this fails, it means we can't do the
> +	 * lookup because the module has not been initialized.
> +	 */

This is more elaborate than what show_submodule_summary() does,
isn't it?  Would it make the patch series (and the resulting code)
more understandable if you used the same code by refactoring these
two?  If so, I wonder if it makes sense to split 3/3 into a few
separate steps:

 * Update the internal "--submodule=<type>" handling without adding
   the "--submodule=diff" and show_submodule_diff() function.

 * Refactor the determination of the submodule status (i.e. does it
   even have a clone?  where is its repository? etc.) from existing
   show_submodule_summary() into a separate helper function.

 * Make that helper function more elaborate like what you do here,
   and update show_submodule_summary().  I think the state
   show_submodule_summary() calls "not checked out" corresponds to
   what you say "not initialized" below, and they should share the
   same logic to determine that the submodule is in that state, and
   share the same message, for example.

 * Introduce "--submodule=diff", and show_submodule_diff() function;
   the latter would use the helper function prepared in the previous
   step.

perhaps?

> +	strbuf_addf(&submodule_git_dir, "%s/.git", path);
> +	git_dir = resolve_gitdir(submodule_git_dir.buf);
> +	if (git_dir) {
> +		/*
> +		 * If we can resolve a git dir, this means that the submodule
> +		 * has been checked out. In this case, just use the original
> +		 * path since we want access to the work tree
> +		 */
> +		git_dir = path;
> +	} else {
> +		/*
> +		 * If we can't resolve a git dir, this means that the
> +		 * submodule has not been checked out. In this case, try the
> +		 * standard location for modules
> +		 */
> +		strbuf_reset(&submodule_git_dir);
> +		strbuf_addf(&submodule_git_dir, "%s/modules/%s", get_git_dir(), path);
> +		git_dir = resolve_gitdir(submodule_git_dir.buf);
> +		if (!git_dir) {
> +			/*
> +			 * If we failed to find a git directory here, then the
> +			 * submodule must not have been initialized. Without
> +			 * the initialized contents of the submodule, we won't
> +			 * be able to perform the difference.
> +			 */
> +			fprintf(f, " (submodule not initialized)\n");
> +			goto out;
> +		}
> +	}
> +
> +	/*
> +	 * print a newline and flush the file so that the diff output appears
> +	 * starting on its own line
> +	 */
> +	fprintf(f, "\n");
> +	fflush(f);
> +
> +	cp.git_cmd = 1;
> +	cp.dir = git_dir;
> +	cp.out = dup(fileno(f));
> +	cp.no_stdin = 1;
> +
> +	argv_array_push(&cp.args, "diff");
> +	argv_array_pushf(&cp.args, "--line-prefix=%s", line_prefix);
> +	argv_array_pushf(&cp.args, "--src-prefix=%s%s/", a_prefix, path);
> +	argv_array_pushf(&cp.args, "--dst-prefix=%s%s/", b_prefix, path);
> +
> +	if (is_null_sha1(one)) {
> +		/*
> +		 * If the origin commit is null, we want to use the empty tree
> +		 * so that we see a diff of all the new contents added.
> +		 */
> +		argv_array_push(&cp.args, EMPTY_TREE_SHA1_HEX);
> +	} else {
> +		/* Use the old commit as the diff base */
> +		argv_array_push(&cp.args, sha1_to_hex(one));
> +	}
> +
> +	if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED) {
> +		/*
> +		 * If the submodule has modified contents we want to diff
> +		 * against the work tree, so don't add a second parameter.
> +		 * This is essentially equivalent of using diff-index instead.
> +		 * Note that we can't (easily) show the diff of any untracked
> +		 * files.
> +		 */
> +	} else if (is_null_sha1(two)) {

It is safer to have ';' inside the empty if(){} block to make sure
that one empty statement exists there.  It makes the intention of
the code clearer, too.

I am debating myself if this is a good thing to do, though.  The
submodule is a separate project for a reason, and there is a reason
why the changes haven't been committed.  When asking "what's different
between these two superproject states?", should the user really see
these uncommitted changes?

Thanks.

  reply	other threads:[~2016-08-16 18:54 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-15 23:06 [PATCH v6 0/3] Add support for displaying submodules as a proper diff Jacob Keller
2016-08-15 23:07 ` [PATCH v6 1/3] diff.c: remove output_prefix_length field Jacob Keller
2016-08-16 18:03   ` Junio C Hamano
2016-08-16 18:08     ` Jacob Keller
2016-08-15 23:07 ` [PATCH v6 2/3] graph: add support for --line-prefix on all graph-aware output Jacob Keller
2016-08-16 18:22   ` Junio C Hamano
2016-08-16 20:19     ` Jacob Keller
2016-08-15 23:07 ` [PATCH v6 3/3] diff: add SUBMODULE_DIFF format to display submodule diff Jacob Keller
2016-08-16 18:48   ` Junio C Hamano [this message]
2016-08-16 20:25     ` Jacob Keller
2016-08-16 21:14       ` Junio C Hamano
2016-08-16 21:20         ` Jacob Keller
2016-08-16 21:31           ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqh9akczyp.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=j6t@kdbg.org \
    --cc=jacob.e.keller@intel.com \
    --cc=jacob.keller@gmail.com \
    --cc=peff@peff.net \
    --cc=stefanbeller@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).