From: Johan Herland <johan@herland.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
johan@herland.net
Subject: [RFC/PATCH 3/3] Teach --dirstat to not completely ignore rearranged lines
Date: Fri, 8 Apr 2011 16:55:37 +0200
Message-ID: <201104081655.38075.johan@herland.net>
In-Reply-To: <201104081646.35750.johan@herland.net>
Currently, the --dirstat analysis fails to detect some kinds of changes.
For example, rearranging lines in a file causes the "damage" calculated
by show_dirstat() to be 0. However, when we process the diff queue in
show_dirstat(), we already know that there should be at least _some_
damage assigned to each entry, because truly _unchanged_ entries are
simply not present in the diff queue.
This patch teaches show_dirstat() to assign a minimum amount of damage
(== 1) to entries for which the analysis otherwise yields zero damage.
Obviously this is not a complete fix, but it's at least better to
underrepresent these changes, rather than simply pretending that they
don't exist.
Signed-off-by: Johan Herland <johan@herland.net>
---
This is a somewhat quick and ugly solution to make --dirstat at least
show _something_ for changes that consist solely of rearranging lines
in a file. Sure, those changes would be thoroughly underrepresented by
--dirstat (probably falling below the default 3% threshold in many
cases), but I figure it's better to underrepresent them rather than
ignoring them completely.
As with 2/3, this patch also relies on the assumption that the diff
queue never contains entries that should be considered "unchanged" by
--dirstat.
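To illustrate the effect in isolation, here is a minimal standalone sketch of the damage calculation with the new lower bound. The helper name dirstat_damage() and its parameters are hypothetical and exist only for this example; in the patch the logic lives inline in show_dirstat(), with the copied/added figures coming from diffcore_count_changes(). The assert is an assumption of the sketch, not something the patch adds.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the per-entry damage computation in
 * show_dirstat(). The inputs are assumed to be byte counts:
 *   src_size - size of the preimage
 *   copied   - bytes of the preimage found again in the postimage
 *   added    - bytes in the postimage that are new
 */
static unsigned long dirstat_damage(unsigned long src_size,
				    unsigned long copied,
				    unsigned long added)
{
	unsigned long damage;

	/* Sketch-only guard against unsigned underflow below. */
	assert(copied <= src_size);

	/* Removed material plus new material. */
	damage = (src_size - copied) + added;

	/*
	 * An entry that made it into the diff queue has changed in
	 * some way, so never report it as completely undamaged.
	 */
	if (!damage)
		damage = 1;
	return damage;
}
```

Under these assumptions, a pure rearrangement such as dirstat_damage(100, 100, 0) yields 1 instead of 0, while an ordinary edit such as dirstat_damage(100, 80, 30) still yields 50. That one unit of damage is tiny next to real edits, which is why such files will usually stay below the default threshold unless they are the only change in the diff.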
Documentation/diff-options.txt | 4 ++--
diff.c | 8 ++++++++
t/t4013-diff-various.sh | 2 --
t/t4013/diff.diff_--dirstat_initial_rearrange | 1 +
4 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 25e48c4..61a8409 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -75,8 +75,8 @@ endif::git-format-patch[]
+
Note that `--dirstat` does not use the regular diff machinery to calculate
the changes (rather it is based on the rename detection machinery). Therefore,
-`--dirstat` may skip some changes that `--stat` does not skip. For example,
-rearranging the lines in a file will not be detected by `--dirstat`.
+`--dirstat` will count some changes differently than `--stat`. For example,
+rearranged lines in a file will be underrepresented by `--dirstat`.
--dirstat-by-file[=<limit>]::
Same as `--dirstat`, but counts changed files instead of lines.
diff --git a/diff.c b/diff.c
index 28d9293..0d82082 100644
--- a/diff.c
+++ b/diff.c
@@ -1578,8 +1578,16 @@ static void show_dirstat(struct diff_options *options)
* Original minus copied is the removed material,
* added is the new material. They are both damages
* made to the preimage.
+ * If the resulting damage is zero, we know that
+ * diffcore_count_changes() considers the two entries
+ * to be identical, but since they are in the diff
+ * queue at all, we know that there must have been
+ * _some_ kind of change, so we force all entries to
+ * have at least a minimum of damage.
*/
damage = (p->one->size - copied) + added;
+ if (!damage)
+ damage = 1;
found_damage:
ALLOC_GROW(dir.files, dir.nr + 1, dir.alloc);
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index e8240f2..93a6f20 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -300,9 +300,7 @@ diff --no-index --name-status -- dir2 dir
diff --no-index dir dir3
diff master master^ side
diff --dirstat master~1 master~2
-# --dirstat does NOT pick up changes that simply rearrange existing lines
diff --dirstat initial rearrange
-# ...but --dirstat-by-file DOES pick up rearranged lines
diff --dirstat-by-file initial rearrange
EOF
diff --git a/t/t4013/diff.diff_--dirstat_initial_rearrange b/t/t4013/diff.diff_--dirstat_initial_rearrange
index fb2e17d..5fb02c1 100644
--- a/t/t4013/diff.diff_--dirstat_initial_rearrange
+++ b/t/t4013/diff.diff_--dirstat_initial_rearrange
@@ -1,2 +1,3 @@
$ git diff --dirstat initial rearrange
+ 100.0% dir/
$
--
1.7.5.rc1