git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Dmitry Nikulin <pastafariant@gmail.com>
Cc: phillip.wood@dunelm.org.uk, Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org
Subject: Re: git-diff passes <rev>:<path> args to GIT_EXTERNAL_DIFF incorrectly?
Date: Fri, 30 Aug 2019 10:27:30 -0400	[thread overview]
Message-ID: <20190830142730.GB16327@sigill.intra.peff.net> (raw)
In-Reply-To: <CAH53SykQWLtjt0gWVrz5KyH-9WyqaQ0GtkhmyLt09QEqcAS_dw@mail.gmail.com>

On Fri, Aug 30, 2019 at 04:23:13PM +0300, Dmitry Nikulin wrote:

> On Fri, 30 Aug 2019 at 13:16, Phillip Wood <phillip.wood123@gmail.com> wrote:
> > I'm not sure why the last argument is being split in
> > your example. It is not split in the example below
> 
> I have replicated the splitting issue on my small demo repo [1]:
> 
> $ env GIT_EXTERNAL_DIFF=./print_argv.py git diff -M origin/branch1
> origin/branch1-mv -- file1.txt file1-mv.txt
> ['./print_argv.py',
>  'file1.txt',
>  '/tmp/EWaCSc_file1.txt',
>  '2bef330804cb3f6962e45a72a12a3071ee9b5888',
>  '100644',
>  '/tmp/mtEiSc_file1-mv.txt',
>  'f8fd6737fbe5a45c97ba9c9de495dc46ff11eccd',
>  '100644',
>  'file1-mv.txt',
>  'similarity index 90%\n'
>  'rename from file1.txt\n'
>  'rename to file1-mv.txt\n'
>  'index 2bef330..f8fd673 100644\n']

Interesting. I _don't_ see that splitting when I run the same command in
your demo repo (nor, looking at Git's code, do I see how it could
happen; we always add the metainfo as a single argument).

> This is, however, tangential to the original problem: documenting the
> external diff CLI interface for diffing two blobs. Here is what I am
> seeing:
> 
> $ env GIT_EXTERNAL_DIFF=./print_argv.py git diff
> origin/branch1:file1.txt origin/branch1-mv:file1-mv.txt
> ['./print_argv.py',
>  'file1.txt',
>  '/tmp/n9USvy_file1.txt',
>  '2bef330804cb3f6962e45a72a12a3071ee9b5888',
>  '100644',
>  '/tmp/Zst0uy_file1-mv.txt',
>  'f8fd6737fbe5a45c97ba9c9de495dc46ff11eccd',
>  '100644',
>  'file1-mv.txt',
>  'index 2bef330..f8fd673 100644\n']
> 
> The meaning and origin of the last arg remains mysterious, and the
> other args do not conform to the published documentation[2], which
> states that the args should be:
> 
>     path old-file old-hex old-mode new-file new-hex new-mode
> 
> Instead the args that are passed are:
> 
>     path old-filename old-file old-hex old-mode new-file new-hex
> new-mode new-filename something

I think you have an extra "old-filename" in the second list. Without
that, the first 7 arguments are as documented.

The last two are:

  - when the destination path differs from the source path, we append
    it. This is generally a sign of a rename/copy, though it triggers
    in the blob case because Git has been manually given a pair of files
    with non-matching names.

  - the final one is metainfo that Git typically prints between the
    "diff" header and the diff itself. This is added only when we add
    the filename, and would generally carry information about the rename
    (but of course since there isn't one, it has only the index line).

These were both added by 427dcb4bca ([PATCH] Diff overhaul, adding half
of copy detection., 2005-05-21). I think the intent was that existing
diff commands would/could ignore the extra arguments.

Certainly the world would be a better place if those were described in
the external-diff documentation (specifically in relation to renames,
which is their intended use).

As for the behavior in the blob-diff case, I think it's _pretty_
reasonable. It's certainly useful to give the new-filename. The metainfo
is mostly useless in this case, and potentially could be suppressed if
the pair is not an actual rename/copy. But I also think it's not hurting
much (and a script can tell that's what's going on by the lack of rename
lines in the metainfo).

-Peff

  parent reply	other threads:[~2019-08-30 14:27 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-27 18:24 git-diff passes <rev>:<path> args to GIT_EXTERNAL_DIFF incorrectly? Dmitry Nikulin
2019-08-27 22:25 ` Dmitry Nikulin
2019-08-29  3:54 ` Junio C Hamano
2019-08-29 14:36   ` Dmitry Nikulin
2019-08-30 10:16     ` Phillip Wood
2019-08-30 13:23       ` Dmitry Nikulin
2019-08-30 14:17         ` Phillip Wood
2019-08-30 14:27         ` Jeff King [this message]
2019-08-30 15:28           ` Dmitry Nikulin
2019-08-30 15:51             ` Dmitry Nikulin
2019-08-30 14:29         ` SZEDER Gábor
2019-08-30 16:26           ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190830142730.GB16327@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pastafariant@gmail.com \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).