git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Nuthan Munaiah <nm6061@rit.edu>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>
Subject: Re: `git blame` Line Number Off-by-one
Date: Fri, 7 Aug 2020 17:21:59 -0400	[thread overview]
Message-ID: <20200807212159.GA1871940@coredump.intra.peff.net> (raw)
In-Reply-To: <emc6590292-832a-4a35-8815-d5707731d605@sanctum>

On Fri, Aug 07, 2020 at 01:05:51AM +0000, Nuthan Munaiah wrote:

>  * Clone https://github.com/apache/tomcat
>  * Run `git blame --root -leftw -L 21,21 -L 23,23
> 51844327d8613448bb0bf9667e1a61e462e2043c^ --
> modules/jdbc-pool/java/org/apache/tomcat/jdbc/pool/PoolProperties.java`
>
> [...]
> 
> `git blame` shows the last commit that modified lines 21 and 23 of
> `modules/jdbc-pool/java/org/apache/tomcat/jdbc/pool/PoolProperties.java`
> starting at the parent of `51844327d8613448bb0bf9667e1a61e462e2043c`.
>
> [...]
>
> Line 23 is not shown in the `git blame` output. Instead, line 22 is shown.

Thanks for providing an easy reproduction case.

I think the issue is not in the -L input or in the blame algorithm
itself, but in the hunk-coalescing at the end.

As you note, this shows up even with --porcelain:

  $ commit=51844327d8613448bb0bf9667e1a61e462e2043c^
  $ fn=modules/jdbc-pool/java/org/apache/tomcat/jdbc/pool/PoolProperties.java
  $ git blame --porcelain -L 21,21 -L 23,23 $commit -- $fn |
    egrep '^[0-9a-f]{40}'
  c65a429f06f4e4a025a306e377211863d9ff2a0c 21 21 2
  c65a429f06f4e4a025a306e377211863d9ff2a0c 22 22

but if we try --incremental:

  $ git blame --incremental -L 21,21 -L 23,23 $commit -- $fn |
    egrep '^[0-9a-f]{40}'
  c65a429f06f4e4a025a306e377211863d9ff2a0c 21 21 1
  c65a429f06f4e4a025a306e377211863d9ff2a0c 22 23 1

So we do know at the moment we find the line that it was at line 23 in
the final result, but line 22 in the earlier version at c65a429f06.

And indeed, running a non-incremental blame in a debugger, right before
calling blame_coalesce() our entries look like this:

  cmd_blame (argc=3, argv=0x7fffffffe458, prefix=0x0) at builtin/blame.c:1146
  1146		blame_coalesce(&sb);
  (gdb) print *sb->ent 
  $44 = {next = 0x55555596eda0, lno = 20, num_lines = 1, suspect = 0x555555999a30, s_lno = 20, score = 0, ignored = 0, 
    unblamable = 0}
  (gdb) print *sb->ent->next
  $45 = {next = 0x0, lno = 22, num_lines = 1, suspect = 0x555555999a30, s_lno = 21, score = 0, ignored = 0, 
    unblamable = 0}

So we have two one-line entries at lines 21 and 23 ("lno"; note that
internally we zero-index the lines), and we know that the second one is
actually from 22 ("s_lno").

But then after blame_coalesce() returns, we have only one entry with
both lines:

  (gdb) n
  1148		if (!(output_option & (OUTPUT_COLOR_LINE | OUTPUT_SHOW_AGE_WITH_COLOR)))
  (gdb) print *sb->ent
  $46 = {next = 0x0, lno = 20, num_lines = 2, suspect = 0x555555999a30, s_lno = 20, score = 0, ignored = 0, 
    unblamable = 0}

Presumably it saw the adjacent lines in the _source_ file and coalesced
them, but that's not the right thing to do. They're distinct hunks in
the output we're going to show, so they have to remain such.

-Peff

  reply	other threads:[~2020-08-07 21:22 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-07  1:05 `git blame` Line Number Off-by-one Nuthan Munaiah
2020-08-07 21:21 ` Jeff King [this message]
2020-08-07 21:33   ` Jeff King
2020-08-07 21:45     ` Jeff King
2020-08-07 22:05     ` Junio C Hamano
2020-08-07 22:26       ` Jeff King
2020-08-07 22:35         ` Jeff King
2020-08-13  5:20           ` [PATCH 0/3] blame: fix bug in coalescing non-adjacent "-L" chunks Jeff King
2020-08-13  5:23             ` [PATCH 1/3] t8003: check output of coalesced blame Jeff King
2020-08-13 17:04               ` Junio C Hamano
2020-08-13 17:22                 ` Jeff King
2020-08-13  5:23             ` [PATCH 2/3] t8003: factor setup out of coalesce test Jeff King
2020-08-13  5:25             ` [PATCH 3/3] blame: only coalesce lines that are adjacent in result Jeff King
2020-08-13 16:59               ` Junio C Hamano
2020-08-13 18:41               ` Junio C Hamano
2020-08-13 12:25             ` [PATCH 0/3] blame: fix bug in coalescing non-adjacent "-L" chunks Barret Rhoden

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200807212159.GA1871940@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=nm6061@rit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).