From: Junio C Hamano <gitster@pobox.com>
To: "John Cai via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, "Eric Sunshine" <sunshine@sunshineco.com>,
	"Phillip Wood" <phillip.wood123@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Jeff King" <peff@peff.net>, "Elijah Newren" <newren@gmail.com>,
	"John Cai" <johncai86@gmail.com>
Subject: Re: [PATCH v2 2/2] diff: teach diff to read gitattribute diff-algorithm
Date: Tue, 14 Feb 2023 18:56:40 -0800
Message-ID: <xmqq4jrn3ac7.fsf@gitster.g>
In-Reply-To: <cb0305631496eb4c2d51e5b586ac0ca8580c7dc1.1676410819.git.gitgitgadget@gmail.com> (John Cai via GitGitGadget's message of "Tue, 14 Feb 2023 21:40:19 +0000")

"John Cai via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: John Cai <johncai86@gmail.com>
>
> It can be useful to specify diff algorithms per file type. For example,
> one may want to use the minimal diff algorithm for .json files, another
> for .c files, etc.
>
> Teach the diff machinery to check attributes for a diff driver. Also
> teach the diff driver parser a new type "algorithm" to look for in the
> config, which will be used if a driver has been specified through the
> attributes.
>
> Enforce precedence of diff algorithm by favoring the command line option,
> then looking at the driver attributes & config combination, then finally
> the diff.algorithm config.
>
> To enforce precedence order, use the `xdl_opts_command_line` member
> during options parsing to indicate the diff algorithm was set via command
> line args.
>
> Signed-off-by: John Cai <johncai86@gmail.com>
> ---
>  Documentation/gitattributes.txt | 41 ++++++++++++++++++++++++++++++++-
>  diff.c                          | 25 +++++++++++++-------
>  diff.h                          |  2 ++
>  t/lib-diff-alternative.sh       | 38 +++++++++++++++++++++++++++++-
>  userdiff.c                      |  4 +++-
>  userdiff.h                      |  1 +
>  6 files changed, 100 insertions(+), 11 deletions(-)
>
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index c19e64ea0ef..7e69f509d0a 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -736,7 +736,6 @@ String::
>  	by the configuration variables in the "diff.foo" section of the
>  	Git config file.
>  
> -
>  Defining an external diff driver
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Unrelated change?  The original had a wider gap between two sections
than between paragraphs inside a single section, and I think that is
a reasonable thing to keep.

> @@ -758,6 +757,46 @@ with the above configuration, i.e. `j-c-diff`, with 7
>  parameters, just like `GIT_EXTERNAL_DIFF` program is called.
>  See linkgit:git[1] for details.

In other words, this new section wants another blank line before it
to match.

>  
> +Setting the internal diff algorithm
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +The diff algorithm can be set through the `diff.algorithm` config key, but
> +sometimes it may be helpful to set the diff algorithm by path. For example, one
> +might wish to set a diff algorithm automatically for all `.json` files such that
> +the user would not need to pass in a separate command line `--diff-algorithm` flag each
> +time.

That's an overly wide paragraph.

> +
> +First, in `.gitattributes`, you would assign the `diff` attribute for paths.
> +
> +*Git attributes*

Discard this line (mimic an existing section, like "Defining a
custom hunk-header").

> +------------------------
> +*.json diff=<name>
> +------------------------
> +
> +Then, you would define a "diff.<name>.algorithm" configuration to specify the
> +diff algorithm, choosing from `myers`, `patience`, `minimal`, and `histogram`.
> +
> +*Git config*

Likewise, discard this line (I won't repeat but the next hunk has
the same issue).

> diff --git a/diff.c b/diff.c
> index 92a0eab942e..24da439e56f 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -4456,15 +4456,11 @@ static void run_diff_cmd(const char *pgm,
>  	const char *xfrm_msg = NULL;
>  	int complete_rewrite = (p->status == DIFF_STATUS_MODIFIED) && p->score;
>  	int must_show_header = 0;
> +	struct userdiff_driver *drv = userdiff_find_by_path(o->repo->index, attr_path);

Do we run this look-up unconditionally, even when the .allow_external
bit is not set?  Why?

> -
> -	if (o->flags.allow_external) {
> -		struct userdiff_driver *drv;
> -
> -		drv = userdiff_find_by_path(o->repo->index, attr_path);
> +	if (o->flags.allow_external)
>  		if (drv && drv->external)
>  			pgm = drv->external;
> -	}
>  
>  	if (msg) {
>  		/*
> @@ -4481,12 +4477,17 @@ static void run_diff_cmd(const char *pgm,
>  		run_external_diff(pgm, name, other, one, two, xfrm_msg, o);
>  		return;
>  	}
> -	if (one && two)
> +	if (one && two) {
> +		if (!o->xdl_opts_command_line)
> +			if (drv && drv->algorithm)
> +				set_diff_algorithm(o, drv->algorithm);

The idea here seems to be "if there is no explicit instruction, and
if the diff driver specifies an algorithm, then use that one", which
is very straightforward and sensible.  Can we reliably tell if we
had an explicit instruction to override the driver?  That should
probably appear in other parts of the code, I guess.
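
To make the intended precedence concrete, here is a minimal sketch,
not the actual diff.c code.  The function name and its three
parameters are hypothetical; a NULL argument stands for "this source
did not say anything".  The order follows the commit message: command
line first, then the driver named by the attribute, then the
diff.algorithm config, falling back to "myers", which is Git's
default.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of diff.c: pick the diff algorithm
 * from three optional sources, highest precedence first.
 * A NULL argument means "this source did not specify anything".
 */
const char *resolve_diff_algorithm(const char *from_command_line,
				   const char *from_driver,
				   const char *from_config)
{
	if (from_command_line)
		return from_command_line;	/* --diff-algorithm=<algo> */
	if (from_driver)
		return from_driver;		/* diff.<name>.algorithm */
	if (from_config)
		return from_config;		/* diff.algorithm */
	return "myers";				/* built-in default */
}
```

The patch does not have a single function like this; it spreads the
same decision across option parsing and run_diff_cmd(), which is why
the "explicit instruction" bit has to be recorded somewhere.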

>  		builtin_diff(name, other ? other : name,
>  			     one, two, xfrm_msg, must_show_header,
>  			     o, complete_rewrite);
> -	else
> +	} else {
>  		fprintf(o->file, "* Unmerged path %s\n", name);
> +	}
>  }



> @@ -4583,6 +4584,10 @@ static void run_diffstat(struct diff_filepair *p, struct diff_options *o,
>  	const char *name;
>  	const char *other;
>  
> +	struct userdiff_driver *drv = userdiff_find_by_path(o->repo->index, p->one->path);
> +	if (drv && drv->algorithm)
> +		set_diff_algorithm(o, drv->algorithm);

Interesting.  Does external diff play a role, like in run_diff_cmd()
we saw earlier?

> @@ -5130,6 +5135,8 @@ static int diff_opt_diff_algorithm(const struct option *opt,
>  		return error(_("option diff-algorithm accepts \"myers\", "
>  			       "\"minimal\", \"patience\" and \"histogram\""));
>  
> +	options->xdl_opts_command_line = 1;

OK, calling this member "xdl_" anything is highly misleading, as it
has nothing to do with the xdiff machinery.  How about calling it
after what it does, i.e. allowing the attribute driven diff driver
to specify the algorithm?  options.ignore_driver_algorithm or
something?  The option coming _from_ the command line may happen to
be the condition that triggers this behaviour in the current
implementation, but it does not have to stay that way forever.
Losing "command line" from the name of the flag would make it
clearer what is essential (i.e. this controls if the diff driver is
allowed to affect the choice of the algorithm) and what is not (i.e.
we happen to let it be decided based on the presence or absence of
a command line choice).
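
A rough sketch of how the flag might read when named after its
effect.  Everything here is an assumption made for illustration: the
struct is a stand-in for the relevant part of struct diff_options,
and both function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of struct diff_options. */
struct sketch_diff_options {
	const char *algorithm;			/* currently selected algorithm */
	unsigned ignore_driver_algorithm : 1;	/* driver may not change it */
};

/* What a --diff-algorithm handler could do under the proposed naming. */
void sketch_set_from_command_line(struct sketch_diff_options *o,
				  const char *algorithm)
{
	o->algorithm = algorithm;
	o->ignore_driver_algorithm = 1;
}

/* What the per-path code could do once a driver has been looked up. */
void sketch_apply_driver(struct sketch_diff_options *o,
			 const char *driver_algorithm)
{
	if (!o->ignore_driver_algorithm && driver_algorithm)
		o->algorithm = driver_algorithm;
}
```

The per-path code only asks "may the driver override?", so a future
trigger other than the command line would need no rename.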

Thanks.
