git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Jakub Narębski" <jnareb@gmail.com>
To: Johannes Schindelin <johannes.schindelin@gmx.de>, git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	David Aguilar <davvid@gmail.com>,
	Dennis Kaarsemaker <dennis@kaarsemaker.net>
Subject: Re: [PATCH v3 2/2] difftool: implement the functionality in the builtin
Date: Fri, 25 Nov 2016 22:24:33 +0100	[thread overview]
Message-ID: <2994b0d6-4b6c-84e7-d0d5-257bcae3be98@gmail.com> (raw)
In-Reply-To: <ac91e4818cfb5c5af6b5874662dbeb61cde1f69d.1480019834.git.johannes.schindelin@gmx.de>

W dniu 24.11.2016 o 21:55, Johannes Schindelin pisze:

> The current version of the builtin difftool does not, however, make full
> use of the internals but instead chooses to spawn a couple of Git
> processes, still, to make for an easier conversion. There remains a lot
> of room for improvement, left for a later date.
[...]

> Sadly, the speedup is more noticable on Linux than on Windows: a quick
> test shows that t7800-difftool.sh runs in (2.183s/0.052s/0.108s)
> (real/user/sys) in a Linux VM, down from  (6.529s/3.112s/0.644s), while
> on Windows, it is (36.064s/2.730s/7.194s), down from
> (47.637s/2.407s/6.863s). The culprit is most likely the overhead
> incurred from *still* having to shell out to mergetool-lib.sh and
> difftool--helper.sh.

Does this mean that our shell-based testsuite is not well suited to be
benchmark suite for comparing performance on MS Windows?

Or does it mean that "builtin-difftool" spawning Git processes is the
problem?
 
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  builtin/difftool.c | 670 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 669 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/difftool.c b/builtin/difftool.c
> index 53870bb..3480920 100644
> --- a/builtin/difftool.c
> +++ b/builtin/difftool.c
> @@ -11,9 +11,608 @@
>   *
>   * Copyright (C) 2016 Johannes Schindelin
>   */
> +#include "cache.h"
>  #include "builtin.h"
>  #include "run-command.h"
>  #include "exec_cmd.h"
> +#include "parse-options.h"
> +#include "argv-array.h"
> +#include "strbuf.h"
> +#include "lockfile.h"
> +#include "dir.h"
> +
> +static char *diff_gui_tool;
> +static int trust_exit_code;
> +
> +static const char *const builtin_difftool_usage[] = {
> +	N_("git difftool [<options>] [<commit> [<commit>]] [--] [<path>...]"),
> +	NULL
> +};
> +
> +static int difftool_config(const char *var, const char *value, void *cb)
> +{
> +	if (!strcmp(var, "diff.guitool")) {

Shouldn't you also read other configuration variables, like "diff.tool",
and it's mergetool fallbacks ("merge.guitool", "merge.tool")?

> +		diff_gui_tool = xstrdup(value);
> +		return 0;
> +	}
> +
> +	if (!strcmp(var, "difftool.trustexitcode")) {
> +		trust_exit_code = git_config_bool(var, value);
> +		return 0;
> +	}

Why you do not need to check "difftool.prompt"?  And "mergetool.*" fallbacks?

> +
> +	return git_default_config(var, value, cb);
> +}
> +
> +static int print_tool_help(void)
> +{
> +	const char *argv[] = { "mergetool", "--tool-help=diff", NULL };
> +	return run_command_v_opt(argv, RUN_GIT_CMD);

This looks a bit strange to me, but I guess this is to avoid recursively
invoking ourself, and { "legacy-difftool", "--tool-help", NULL }; isn't
that much better.

> +}
> +
> +static int parse_index_info(char *p, int *mode1, int *mode2,
> +			    struct object_id *oid1, struct object_id *oid2,
> +			    char *status)

  There are only two hard things in Computer Science:
  cache invalidation and naming things.
    -- Phil Karlton

Why did you name function that parses "diff --raw" output (aka "diff-tree"
output) parse_index_info() instead of parse_raw_diff() or parse_diff_tree()?
I went searching for `update-index --index-info` formats...

This is not that important, because this function is file-static; though
future developers would thank you for more descriptive naming.

ADDED: Disregard this, I see this function is about index (?) part of
raw diff, that is only a part of "git diff --raw -z" output.  Though...

> +{
> +	if (*p != ':')
> +		return error("expected ':', got '%c'", *p);
> +	*mode1 = (int)strtol(p + 1, &p, 8);
> +	if (*p != ' ')
> +		return error("expected ' ', got '%c'", *p);

Nitpicking.

I guess because this error shouldn't really happen, and because current
implementation is transient, we don't need to worry about better error
messages (was it problem with parsing, or was it unexpected character).

For example '10064x', or '10064\n' would fail parse, but it is not
space that we were expecting...

> +	*mode2 = (int)strtol(p + 1, &p, 8);
> +	if (*p != ' ')
> +		return error("expected ' ', got '%c'", *p);
> +	if (get_oid_hex(++p, oid1))
> +		return error("expected object ID, got '%s'", p + 1);
> +	p += GIT_SHA1_HEXSZ;
> +	if (*p != ' ')
> +		return error("expected ' ', got '%c'", *p);
> +	if (get_oid_hex(++p, oid2))
> +		return error("expected object ID, got '%s'", p + 1);
> +	p += GIT_SHA1_HEXSZ;
> +	if (*p != ' ')
> +		return error("expected ' ', got '%c'", *p);
> +	*status = *++p;
> +	if (!status || p[1])
> +		return error("unexpected trailer: '%s'", p);
> +	return 0;
> +}
> +
> +/*
> + * Remove any trailing slash from $workdir
> + * before starting to avoid double slashes in symlink targets.
> + */

Err... that's not what add_path() does, in its current implementation.
It doesn't remove trailing slashes, but it checks if there is trailing
slash, and if there isn't, it adds it as separator before adding path.

Or was it original comment from the Perl implementation?  It look
like this, with '$workdir'...  If it is meant to be straight copy
of comment from legacy-difftool, a note would be nice.

> +static void add_path(struct strbuf *buf, size_t base_len, const char *path)

Naming: I think strbuf_addpath() would be a better name, but I guess
it is a matter of taste.

> +{
> +	strbuf_setlen(buf, base_len);
> +	if (buf->len && buf->buf[buf->len - 1] != '/')
> +		strbuf_addch(buf, '/');
> +	strbuf_addstr(buf, path);
> +}
> +
> +/*
> + * Determine whether we can simply reuse the file in the worktree.
> + */
> +static int use_wt_file(const char *workdir, const char *name,

Should it be 'name' or 'pathname'?

> +		       struct object_id *oid)
> +{
> +	struct strbuf buf = STRBUF_INIT;
> +	struct stat st;
> +	int use = 0;
> +
> +	strbuf_addstr(&buf, workdir);
> +	add_path(&buf, buf.len, name);

With proposed rename, it would IMVVVHO looks better

  +	strbuf_addstr(&buf, workdir);
  +	strbuf_addpath(&buf, buf.len, name);

But that is a matter of taste (again, the function is file-local).

> +
> +	if (!lstat(buf.buf, &st) && !S_ISLNK(st.st_mode)) {
> +		struct object_id wt_oid;
> +		int fd = open(buf.buf, O_RDONLY);
> +
> +		if (!index_fd(wt_oid.hash, fd, &st, OBJ_BLOB, name, 0)) {
> +			if (is_null_oid(oid)) {
> +				oidcpy(oid, &wt_oid);
> +				use = 1;
> +			} else if (!oidcmp(oid, &wt_oid))
> +				use = 1;
> +		}
> +	}
> +
> +	strbuf_release(&buf);
> +
> +	return use;
> +}

[...]

> +static int ensure_leading_directories(char *path)
> +{
> +	switch (safe_create_leading_directories(path)) {
> +		case SCLD_OK:
> +		case SCLD_EXISTS:
> +			return 0;
> +		default:
> +			return error(_("could not create leading directories "
> +				       "of '%s'"), path);
> +	}
> +}

Nice function, I wonder if it would be useful in other places.

> +
> +static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix,
> +			int argc, const char **argv)
> +{
> +	char tmpdir[PATH_MAX];
> +	struct strbuf info = STRBUF_INIT, lpath = STRBUF_INIT;
> +	struct strbuf rpath = STRBUF_INIT, buf = STRBUF_INIT;

Nitpicking.

To be symmetric, it could be reordered like this:

  +	struct strbuf info = STRBUF_INIT, buf = STRBUF_INIT;
  +	struct strbuf lpath = STRBUF_INIT, rpath = STRBUF_INIT;

See: lpath, rpath; ldir, rdir; ldir_len, rdir_len.

> +	struct strbuf ldir = STRBUF_INIT, rdir = STRBUF_INIT;
> +	struct strbuf wtdir = STRBUF_INIT;
> +	size_t ldir_len, rdir_len, wtdir_len;

[...]
> +	argv_array_pushl(&child.args, "diff", "--raw", "--no-abbrev", "-z",
> +			 NULL);
> +	for (i = 0; i < argc; i++)
> +		argv_array_push(&child.args, argv[i]);
> +	if (start_command(&child))
> +		die("could not obtain raw diff");
> +	fp = xfdopen(child.out, "r");
> +
> +	/* Build index info for left and right sides of the diff */
> +	i = 0;
> +	while (!strbuf_getline_nul(&info, fp)) {
> +		int lmode, rmode;
> +		struct object_id loid, roid;
> +		char status;
> +		const char *src_path, *dst_path;
> +		size_t src_path_len, dst_path_len;
> +
> +		if (starts_with(info.buf, "::"))
> +			die(N_("combined diff formats('-c' and '--cc') are "
> +			       "not supported in\n"
> +			       "directory diff mode('-d' and '--dir-diff')."));
> +
> +		if (parse_index_info(info.buf, &lmode, &rmode, &loid, &roid,
> +				     &status))

After rename it would read as:

  +		if (parse_raw_diff(info.buf, &lmode, &rmode, &loid, &roid,
  +				   &status))

Though now I see that you parse here index information of raw diff
(I think)... so disregard my musings.

> +			break;
> +		if (strbuf_getline_nul(&lpath, fp))
> +			break;
> +		src_path = lpath.buf;
> +		src_path_len = lpath.len;
> +
> +		i++;
> +		if (status != 'C' && status != 'R') {
> +			dst_path = src_path;
> +			dst_path_len = src_path_len;
> +		} else {
> +			if (strbuf_getline_nul(&rpath, fp))
> +				break;
> +			dst_path = rpath.buf;
> +			dst_path_len = rpath.len;
> +		}


[...]
> +	/*
> +	 * Changes to submodules require special treatment.This loop writes a

Here and in few other places you are missing space after full stop.

  +	 * Changes to submodules require special treatment. This loop writes a


> +	 * temporary file to both the left and right directories to show the
> +	 * change in the recorded SHA1 for the submodule.
> +	 */
> +	hashmap_iter_init(&submodules, &iter);
> +	while ((entry = hashmap_iter_next(&iter))) {
> +		if (*entry->left) {
> +			add_path(&ldir, ldir_len, entry->path);
> +			ensure_leading_directories(ldir.buf);
> +			write_file(ldir.buf, "%s", entry->left);
> +		}
> +		if (*entry->right) {
> +			add_path(&rdir, rdir_len, entry->path);
> +			ensure_leading_directories(rdir.buf);
> +			write_file(rdir.buf, "%s", entry->right);
> +		}
> +	}
> +
> +	/*
> +	 * Symbolic links require special treatment.The standard "git diff"

Same here.

> +	 * shows only the link itself, not the contents of the link target.
> +	 * This loop replicates that behavior.
> +	 */

Best,
-- 
Jakub Narębski


  reply	other threads:[~2016-11-25 21:26 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-22 17:01 [PATCH 0/2] Show Git Mailing List: a builtin difftool Johannes Schindelin
2016-11-22 17:01 ` [PATCH 1/2] difftool: add the builtin Johannes Schindelin
2016-11-23  8:08   ` David Aguilar
2016-11-23 11:34     ` Johannes Schindelin
2016-11-22 17:01 ` [PATCH 2/2] difftool: add a feature flag for the builtin vs scripted version Johannes Schindelin
2016-11-23 14:51   ` Dennis Kaarsemaker
2016-11-23 17:29     ` Johannes Schindelin
2016-11-23 17:40       ` Junio C Hamano
2016-11-23 18:18         ` Junio C Hamano
2016-11-23 19:55           ` Johannes Schindelin
2016-11-23 20:04             ` Junio C Hamano
2016-11-23 22:01       ` Johannes Schindelin
2016-11-23 22:03 ` [PATCH v2 0/1] Show Git Mailing List: a builtin difftool Johannes Schindelin
2016-11-23 22:03   ` [PATCH v2 1/1] difftool: add the builtin Johannes Schindelin
2016-11-23 22:25     ` Junio C Hamano
2016-11-23 22:30       ` Junio C Hamano
2016-11-24 10:38         ` Johannes Schindelin
2016-11-24 20:55   ` [PATCH v3 0/2] Show Git Mailing List: a builtin difftool Johannes Schindelin
2016-11-24 20:55     ` [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin Johannes Schindelin
2016-11-24 21:08       ` Jeff King
2016-11-24 21:56         ` Johannes Schindelin
2016-11-25  3:18           ` Jeff King
2016-11-25 11:05             ` Johannes Schindelin
2016-11-25 17:19               ` Jeff King
2016-11-25 17:41                 ` Johannes Schindelin
2016-11-25 17:47                   ` Jeff King
2016-11-26 12:22                     ` Johannes Schindelin
2016-11-26 16:19                       ` Jeff King
2016-11-26 13:01                         ` Johannes Schindelin
2016-11-27 16:50                           ` Jeff King
2016-11-28 17:06                             ` Junio C Hamano
2016-11-28 17:34                               ` Johannes Schindelin
2016-11-28 19:27                                 ` Junio C Hamano
2016-11-29 20:36                                   ` Johannes Schindelin
2016-11-29 20:49                                     ` Jeff King
2016-11-30 12:30                                       ` Johannes Schindelin
2016-11-30 12:35                                         ` Jeff King
2016-11-29 20:55                                     ` Junio C Hamano
2016-11-30 12:30                                       ` Johannes Schindelin
2016-12-01 23:33                                         ` Junio C Hamano
2016-12-05 10:36                                           ` Johannes Schindelin
2016-12-05 18:37                                             ` Junio C Hamano
2016-12-06 13:16                                               ` Johannes Schindelin
2016-12-06 13:36                                                 ` Jeff King
2016-12-06 14:48                                                   ` Johannes Schindelin
2016-12-06 15:09                                                     ` Jeff King
2016-12-06 18:22                                                       ` Stefan Beller
2016-12-06 18:35                                                         ` Jeff King
2017-01-18 22:38                                                           ` Brandon Williams
2016-11-30 16:02                                 ` Jakub Narębski
2016-11-30 18:39                                   ` Junio C Hamano
2016-11-24 20:55     ` [PATCH v3 2/2] difftool: implement the functionality in the builtin Johannes Schindelin
2016-11-25 21:24       ` Jakub Narębski [this message]
2016-11-27 11:10         ` Johannes Schindelin
2016-11-27 11:20           ` Jakub Narębski
2017-01-02 16:16     ` [PATCH v4 0/4] Show Git Mailing List: a builtin difftool Johannes Schindelin
2017-01-02 16:22       ` [PATCH v4 1/4] Avoid Coverity warning about unfree()d git_exec_path() Johannes Schindelin
2017-01-03 20:11         ` Stefan Beller
2017-01-03 21:33           ` Johannes Schindelin
2017-01-04 18:09             ` Stefan Beller
2017-01-04  1:13           ` Jeff King
2017-01-09  1:25           ` Junio C Hamano
2017-01-09  6:00             ` Jeff King
2017-01-09  7:49             ` Johannes Schindelin
2017-01-09 19:21             ` Stefan Beller
2017-01-02 16:22       ` [PATCH v4 2/4] difftool: add a skeleton for the upcoming builtin Johannes Schindelin
2017-01-02 16:22       ` [PATCH v4 3/4] difftool: implement the functionality in the builtin Johannes Schindelin
2017-01-02 16:24       ` [PATCH v4 4/4] t7800: run both builtin and scripted difftool, for now Johannes Schindelin
2017-01-09  1:38         ` Junio C Hamano
2017-01-09  7:56           ` Johannes Schindelin
2017-01-09  9:46             ` Junio C Hamano
2017-01-17 15:54       ` [PATCH v5 0/3] Turn the difftool into a builtin Johannes Schindelin
2017-01-17 15:54         ` [PATCH v5 1/3] difftool: add a skeleton for the upcoming builtin Johannes Schindelin
2017-01-17 15:55         ` [PATCH v5 2/3] difftool: implement the functionality in the builtin Johannes Schindelin
2017-01-17 15:55         ` [PATCH v5 3/3] Retire the scripted difftool Johannes Schindelin
2017-01-17 21:46           ` Junio C Hamano
2017-01-18 12:33             ` Johannes Schindelin
2017-01-18 19:15               ` Junio C Hamano
2017-01-19 16:30                 ` Johannes Schindelin
2017-01-19 17:56                   ` Junio C Hamano
2017-01-19 20:32                     ` Johannes Schindelin
2017-01-17 21:31         ` [PATCH v5 0/3] Turn the difftool into a builtin Junio C Hamano
2017-01-19 20:30         ` [PATCH v6 " Johannes Schindelin
2017-01-19 20:30           ` [PATCH v6 1/3] difftool: add a skeleton for the upcoming builtin Johannes Schindelin
2017-01-19 20:30           ` [PATCH v6 2/3] difftool: implement the functionality in the builtin Johannes Schindelin
2017-01-19 20:30           ` [PATCH v6 3/3] Retire the scripted difftool Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2994b0d6-4b6c-84e7-d0d5-257bcae3be98@gmail.com \
    --to=jnareb@gmail.com \
    --cc=davvid@gmail.com \
    --cc=dennis@kaarsemaker.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).