From: Junio C Hamano <gitster@pobox.com>
To: Michael Haggerty <mhagger@alum.mit.edu>
Cc: git@vger.kernel.org, "Stefan Beller" <sbeller@google.com>,
"Jeff King" <peff@peff.net>, "Jakub Narębski" <jnareb@gmail.com>,
"Jacob Keller" <jacob.keller@gmail.com>
Subject: Re: [PATCH v2 7/7] blame: actually use the diff opts parsed from the command line
Date: Tue, 23 Aug 2016 10:01:07 -0700 [thread overview]
Message-ID: <xmqqtwebwhbg.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <8192012a6bf725e0460522f9e67bab83b613127a.1471864378.git.mhagger@alum.mit.edu> (Michael Haggerty's message of "Mon, 22 Aug 2016 13:22:46 +0200")
Michael Haggerty <mhagger@alum.mit.edu> writes:
> Somebody who knows more about how diff operations are configured
> should please review this. I'm not certain that the change as
> implemented won't have other unwanted side-effects, though of course
> I checked that the test suite runs correctly.
Generally, I think this is a bad idea.
We run "diff-tree" internally for multiple purposes:
- When the same path we are drilling down appears, we use diff-tree
to compare that path alone, to obtain textual diff, so that we
can say "what did not appear in the textual diff output are
attributable to the parent commit, we can exonerate this child".
Even if the command line to "git blame" had "-Sfoo", you do not
want to pass it down to this invocation. If the child did not
change the number of occurrences of string "foo" in the path
being drilled down, it will pass down the blame for all lines to
the parent. And copying -R from the command line to this
invocation does not make any sense. Copying -C -C from the
command line will defeat the whole purpose of having find_origin
vs find_rename distinction, I would think, for this step.
- When the path we have been drilling down for no longer appears in
the parent, we use diff-tree with rename detection to find where
the file _could_ have come from.
It is a good idea to allow the similarity threshold to be tweaked
from the command line. Copying -R from the command line to this
invocation may make sense, but I think you would break this step
if you copied -C/-C -C
- When -C/-C -C/-C -C -C is given from the command line, after
running the "pass the blame to our primary suspect", we run
tree-level diff-tree with find-copy option, only to enumerate
paths that appear in the parent. We pick pieces from these paths
by doing blob-level diffs in diff_hunks() ourselves. Copying -C
from the command line would be useful if you are only doing a
single '-C' (because it would allow you to tweak the similarity
threshold), but otherwise it would probably break the command.
The design principle taken in "git blame" is to leave the decision
on what diff options do or do not make sense for these particular
invocations of the internal "diff-tree", and have "blame" give the
selected options to the internal "diff-tree" invocations. "-w"
happens to use xdl_opts that will eventually be passed down to diff
machinery but you should consider it an accidental optimization
(i.e. "blame" designer considered that it makes sense to give
whitespace-ignoring comparison at the leaf-level "diff between
blobs", and found that the most efficient way to do so is to just
take "-w" from the command line and set the ignore-whitespace bit
used to drive the internal "diff-tree" invocation). If we _were_
adding -R<num> to allow the users to affect similarity threshold,
we would parse it in "git blame" command line option processing,
and would give that _ONLY_ to the second invocation above, i.e. the
one done in find_rename() but not in others like in find_origin.
prev parent reply other threads:[~2016-08-23 17:01 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-22 11:22 [PATCH v2 0/7] Better heuristics make prettier diffs Michael Haggerty
2016-08-22 11:22 ` [PATCH v2 1/7] xdl_change_compact(): fix compaction heuristic to adjust ixo Michael Haggerty
2016-08-22 11:22 ` [PATCH v2 2/7] xdl_change_compact(): only use heuristic if group can't be matched Michael Haggerty
2016-08-22 11:22 ` [PATCH v2 3/7] is_blank_line(): take a single xrecord_t as argument Michael Haggerty
2016-08-22 11:22 ` [PATCH v2 4/7] recs_match(): take two xrecord_t pointers as arguments Michael Haggerty
2016-08-22 11:22 ` [PATCH v2 5/7] xdl_change_compact(): introduce the concept of a change group Michael Haggerty
2016-08-23 21:35 ` Junio C Hamano
2016-08-22 11:22 ` [PATCH v2 6/7] diff: improve positioning of add/delete blocks in diffs Michael Haggerty
2016-08-22 11:22 ` [PATCH v2 7/7] blame: actually use the diff opts parsed from the command line Michael Haggerty
2016-08-23 9:56 ` René Scharfe
2016-09-05 4:12 ` Michael Haggerty
2016-09-07 17:58 ` Junio C Hamano
2016-08-23 17:01 ` Junio C Hamano [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqtwebwhbg.fsf@gitster.mtv.corp.google.com \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=jacob.keller@gmail.com \
--cc=jnareb@gmail.com \
--cc=mhagger@alum.mit.edu \
--cc=peff@peff.net \
--cc=sbeller@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).