From: Jeff King <peff@peff.net>
To: Nuthan Munaiah <nm6061@rit.edu>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: `git blame` Line Number Off-by-one
Date: Fri, 7 Aug 2020 17:21:59 -0400 [thread overview]
Message-ID: <20200807212159.GA1871940@coredump.intra.peff.net> (raw)
In-Reply-To: <emc6590292-832a-4a35-8815-d5707731d605@sanctum>
On Fri, Aug 07, 2020 at 01:05:51AM +0000, Nuthan Munaiah wrote:
> * Clone https://github.com/apache/tomcat
> * Run `git blame --root -leftw -L 21,21 -L 23,23
> 51844327d8613448bb0bf9667e1a61e462e2043c^ --
> modules/jdbc-pool/java/org/apache/tomcat/jdbc/pool/PoolProperties.java`
>
> [...]
>
> `git blame` shows the last commit that modified lines 21 and 23 of
> `modules/jdbc-pool/java/org/apache/tomcat/jdbc/pool/PoolProperties.java`
> starting at the parent of `51844327d8613448bb0bf9667e1a61e462e2043c`.
>
> [...]
>
> Line 23 is not shown in the `git blame` output. Instead, line 22 is shown.
Thanks for providing an easy reproduction case.
I think the issue is not in the -L input or in the blame algorithm
itself, but in the hunk-coalescing at the end.
As you note, this shows up even with --porcelain:
$ commit=51844327d8613448bb0bf9667e1a61e462e2043c^
$ fn=modules/jdbc-pool/java/org/apache/tomcat/jdbc/pool/PoolProperties.java
$ git blame --porcelain -L 21,21 -L 23,23 $commit -- $fn |
egrep '^[0-9a-f]{40}'
c65a429f06f4e4a025a306e377211863d9ff2a0c 21 21 2
c65a429f06f4e4a025a306e377211863d9ff2a0c 22 22
but if we try --incremental:
$ git blame --incremental -L 21,21 -L 23,23 $commit -- $fn |
egrep '^[0-9a-f]{40}'
c65a429f06f4e4a025a306e377211863d9ff2a0c 21 21 1
c65a429f06f4e4a025a306e377211863d9ff2a0c 22 23 1
So we do know at the moment we find the line that it was at line 23 in
the final result, but line 22 in the earlier version at c65a429f06.
And indeed, running a non-incremental blame in a debugger, right before
calling blame_coalesce() our entries look like this:
cmd_blame (argc=3, argv=0x7fffffffe458, prefix=0x0) at builtin/blame.c:1146
1146 blame_coalesce(&sb);
(gdb) print *sb->ent
$44 = {next = 0x55555596eda0, lno = 20, num_lines = 1, suspect = 0x555555999a30, s_lno = 20, score = 0, ignored = 0,
unblamable = 0}
(gdb) print *sb->ent->next
$45 = {next = 0x0, lno = 22, num_lines = 1, suspect = 0x555555999a30, s_lno = 21, score = 0, ignored = 0,
unblamable = 0}
So we have two one-line entries at lines 21 and 23 ("lno"; note that
internally we zero-index the lines), and we know that the second one is
actually from 22 ("s_lno").
But then after blame_coalesce() returns, we have only one entry with
both lines:
(gdb) n
1148 if (!(output_option & (OUTPUT_COLOR_LINE | OUTPUT_SHOW_AGE_WITH_COLOR)))
(gdb) print *sb->ent
$46 = {next = 0x0, lno = 20, num_lines = 2, suspect = 0x555555999a30, s_lno = 20, score = 0, ignored = 0,
unblamable = 0}
Presumably it saw the adjacent lines in the _source_ file and coalesced
them, but that's not the right thing to do. They're distinct hunks in
the output we're going to show, so they have to remain such.
-Peff
next prev parent reply other threads:[~2020-08-07 21:22 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-07 1:05 `git blame` Line Number Off-by-one Nuthan Munaiah
2020-08-07 21:21 ` Jeff King [this message]
2020-08-07 21:33 ` Jeff King
2020-08-07 21:45 ` Jeff King
2020-08-07 22:05 ` Junio C Hamano
2020-08-07 22:26 ` Jeff King
2020-08-07 22:35 ` Jeff King
2020-08-13 5:20 ` [PATCH 0/3] blame: fix bug in coalescing non-adjacent "-L" chunks Jeff King
2020-08-13 5:23 ` [PATCH 1/3] t8003: check output of coalesced blame Jeff King
2020-08-13 17:04 ` Junio C Hamano
2020-08-13 17:22 ` Jeff King
2020-08-13 5:23 ` [PATCH 2/3] t8003: factor setup out of coalesce test Jeff King
2020-08-13 5:25 ` [PATCH 3/3] blame: only coalesce lines that are adjacent in result Jeff King
2020-08-13 16:59 ` Junio C Hamano
2020-08-13 18:41 ` Junio C Hamano
2020-08-13 12:25 ` [PATCH 0/3] blame: fix bug in coalescing non-adjacent "-L" chunks Barret Rhoden
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200807212159.GA1871940@coredump.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=nm6061@rit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).