git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Subject: [Git BUG] Please do not use "-B -M" in "diff" family for now
Date: Sat, 31 Jan 2015 11:12:34 -0800
Message-ID: <xmqqegqaahnh.fsf@gitster.dls.corp.google.com> (raw)

Please avoid the combination "-B -M" when running "diff" family of
commands, as it can produce incorrect results [*1*] in corner cases.
Use of either "-B" or "-M" by itself is fine, but not both at the
same time.

This problem exists even in Git v1.7.12, and I have no reason to
suspect that it worked correctly in any earlier versions, so I do
not consider it an urgent issue to fix during the pre-release for
upcoming Git v2.3 release.

End of TL;DR.

For a simple reproduction, go to your copy of the kernel tree and do
this:

    $ git diff -B -M v2.6.13 v2.6.12 -- \
        arch/ppc64/kernel/rtas_pci.c arch/ppc64/kernel/pSeries_pci.c >:patch

    $ git reset --hard
    $ git checkout v2.6.13

    $ git apply --cached --whitespace=nowarn :patch
    error: arch/ppc64/kernel/pSeries_pci.c: already exists in index

This is not a bug in "apply", but in "diff".  The resulting patch
looks like this:

    $ git apply --whitespace=nowarn --numstat --summary :patch
    112     5       arch/ppc64/kernel/pSeries_pci.c
     rename arch/ppc64/kernel/{rtas_pci.c => pSeries_pci.c} (81%)

That is, it wants to rename rtas_pci.c to pSeries_pci.c with a bit
of editing.

However, what really happens when going from 2.6.13 to 2.6.12 is
this:

    $ git diff v2.6.13 v2.6.12 -- \
        arch/ppc64/kernel/rtas_pci.c arch/ppc64/kernel/pSeries_pci.c |
        git apply --whitespace=nowarn --numstat --summary
    478     19      arch/ppc64/kernel/pSeries_pci.c
    0       495     arch/ppc64/kernel/rtas_pci.c
     delete mode 100644 arch/ppc64/kernel/rtas_pci.c

That is:

    #1 rtas_pci.c exists in 2.6.13 but not 2.6.12.

    #2 pSeries_pci.c exists in both 2.6.12 and 2.6.13 but is majorly
       rewritten; in fact, the difference between rtas_pci.c in
       2.6.13 and pSeries_pci.c in 2.6.12 is much smaller than the
       difference between pSeries_pci.c from 2.6.13 and 2.6.12.

What seems to happen is that "diff -B" splits the above #2 into
"removal of pSeries_pci.c with contents from 2.6.13" and "creation
of pSeries_pci.c with contents from 2.6.12" and these gets further
combined with the "removal of rtas_pci.c" [*2*].  In the end result,
we incorrectly get "rename rtas_pci.c to create pSeries_pci.c with
some changes".  We shouldn't do this because pSeries_pci.c is not
created and is not a new file.

[Footnotes]

*1* "git apply" refuses to apply the output affected by the bug, so
    at least this will not lead to silent corruption.

*2* The "-B" option was introduced solely to find a possible
    rename/copy source to express this sequence:

    $ cp A B
    $ edit B slightly ;# optional
    $ edit A heavily

as if it was done like this instead:

    $ mv A B
    $ edit B slightly ;# optional
    $ create B from scratch

Without "-B", the change would be expressed as "Heavily edit A,
create B from scratch", and with "-B", we would say "Create B by
copying from A and then edit, and edit A heavily", which would
result in more readable patch (in fact, we would see in the output
of "diff -B -M v2.6.12 v2.6.13" exactly that).

             reply index

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-31 19:12 Junio C Hamano [this message]
2015-02-02  3:18 ` [RFC PATCH] diff: do not use creation-half of -B as a rename target candidate Junio C Hamano
2015-02-02  5:52   ` Stefan Beller
2015-02-02  6:47     ` Yue Lin Ho
2015-02-02 18:25   ` Junio C Hamano

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqegqaahnh.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox