git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Barret Rhoden <brho@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Stefan Beller <stefanbeller@gmail.com>,
	Jeff Smith <whydoubt@gmail.com>, Jeff King <peff@peff.net>
Subject: Re: [PATCH] blame: add the ability to ignore commits
Date: Tue, 8 Jan 2019 11:27:42 -0500	[thread overview]
Message-ID: <20190108112742.7878d4cb@gnomeregan.cam.corp.google.com> (raw)
In-Reply-To: <xmqqbm4s6ozl.fsf@gitster-ct.c.googlers.com>

On 2019-01-07 at 15:13 Junio C Hamano <gitster@pobox.com> wrote:
> If I read it correctly, this gives a very limited form of -S, in the
> sense that anything this can do can be expressed by using -S but the
> reverse is not true, but is designed to be easier to use, in the
> sense that unlike -S, this does not have to describe the part of the
> history you do not have to lie about.  The documentation should say
> something about these pros-and-cons to help readers decide which
> feature to use.

Yeah, -S lists the revs to use, this lists the revs to *not* use.  I'll
add a note.

> I somehow feel that this is rare enough that it should not squat on
> short-and-sweet '-i'.  We would want to reserve it to something like
> "--ignore-case", for example.

Can do.  I'll change the interface to your suggestion from down below.

> > The file .git-blame-ignore-revs is checked by default.  
> 
> Giving the projects a way to easily help participants to share the
> same distorted view of the history is a good idea, but I do not
> think we should allow projects to do so to those who merely clone it
> without their consent.  IOW, "by default" is a terrible idea.
> 
> Perhaps paying attention to blame.skipRevsFile configuration
> variable that is set by the end user would be sufficient----the
> project can ship .blame-skip-revs (or any random filename of their
> choice) in-tree as tracked contents and tell its users that they can
> optionally use it.

A blame config option works for me.  I'd like the users/cloners of a
repo to not need to do anything extravagant, but a one-time config
would be fine.

> > It's useful to be alerted to the presence of an ignored commit in the
> > history of a line.  Those lines will be marked with '*' in the
> > non-porcelain output.  The '*' is attached to the line number to keep
> > from breaking tools that rely on the whitespace between columns.  
> 
> A policy decision like the above two shouldn't be hardcoded in the
> feature like this, but should be done as a separate option.  By
> default, these shouldn't be marked with '*', as the same tools you
> said you are afraid of breaking would be expecting a word with only
> digits and no asterisk in the column where line numbers appear and
> will get broken by this change if done unconditionally.

Since users are already opting-in to the blame-ignore, do you also want
them to opt-in to the annotation?  I can make a separate config option
to turn on the annotation.  Any preference for how it is marked?

> In general, I find this patch trying to change too many things at
> the same time, without sufficient justification.  Perhaps do these
> different things as separate steps in a single series?
> 
> > A blame_entry attributed to an ignored commit will get passed to its
> > parent.  
> 
> Obviously, an interesting consideration is what happens when a merge
> commit is skipped.  Is it sufficient to change this description to
> "...will get passed to its parentS", or would the code do completely
> nonsensical things without further tweaks (a possible simple tweak
> could be to refuse skipping merges)?

If we skip a merge commit, it might pick the wrong parent.  For
example, this line was brought in via a merge:

$ ~/src/git/git-blame include/linux/mm.h | grep VM_SYNC
b6fb293f2497a (Jan Kara 2017-11-01 16:36:41 +0100  204) #define VM_SYNC

It's from merge: a3841f94c7ec ("Merge tag 'libnvdimm-for-4.15', and if
we ignore it:

$ ~/src/git/git-blame -i a3841f94c7ecb include/linux/mm.h | grep VM_SYNC
cc2383ec06be0 (Konstantin Khlebnikov 2012-10-08 16:28:37 -0700  204*) #define VM_SYNC

The wrong commit is blamed.

I can put a note in the doc about it and die if we're given a merge
commit.  Is there a convenient helper for detecting a merge, or can I
just check for multiple parents?  (something like commit->parents &&
commit->parents->next)
 
> > If an ignored commit changed a line, an ancestor that changed
> > the line will get blamed.  However, if an ignored commit added lines, a
> > commit changing a nearby line may get blamed.  If no commit is found,
> > the original commit for the file will get blamed.  
> 
> The above somehow does not read as describing a carefully designed
> behaviour; rather, it sounds as if it is saying "the code does
> whatever random things it happens to do".  For example, when there
> is a newly added line how is "A" commit changing a nearby line
> chosen (a line has lines before it and after it---they may be
> touched by different commits, and before and after that skipped
> commit, so there are multiple commits to choose from)?

This was more of a commentary about its behavior.  If you ignore a
commit that added lines, it'd be nice to get a hint of what might have
caused it, and picking a commit that affected an adjacent line seemed
fine.  But yeah, it's not doing anything crazy.

> > diff --git a/Documentation/git-blame.txt b/Documentation/git-blame.txt
> > index 16323eb80e31..e41375374892 100644
> > --- a/Documentation/git-blame.txt
> > +++ b/Documentation/git-blame.txt
> > @@ -10,6 +10,7 @@ SYNOPSIS
> >  [verse]
> >  'git blame' [-c] [-b] [-l] [--root] [-t] [-f] [-n] [-s] [-e] [-p] [-w] [--incremental]
> >  	    [-L <range>] [-S <revs-file>] [-M] [-C] [-C] [-C] [--since=<date>]
> > +	    [-i <rev>] [--no-default-ignores] [--ignore-file=<file>]
> >  	    [--progress] [--abbrev=<n>] [<rev> | --contents <file> | --reverse <rev>..<rev>]
> >  	    [--] <file>
> >  
> > @@ -84,6 +85,20 @@ include::blame-options.txt[]
> >  	Ignore whitespace when comparing the parent's version and
> >  	the child's to find where the lines came from.
> >  
> > +-i <rev>::
> > +	Ignore revision when assigning blame.  Lines that were changed by an
> > +	ignored commit will be marked with a `*` in the blame output.  Lines
> > +	that were added by an ignored commit may be attributed commits making
> > +	nearby changes or to the first commit touching the file.  
> 
> It probably deserves to be told that this option can be given
> multiple times and used cumulatively (unlike usual "last one wins"
> rule).
> 
> > +--no-default-ignores::
> > +	Do not automatically ignore revisions in the file
> > +	`.git-blame-ignore-revs`.  
> 
> This should not be "opt-out" like this.
> 
> > +--ignore-file=<file>::
> > +	Ignore revisions listed in `file`, one revision per line.  Whitespace
> > +	and comments beginning with `#` are ignored.  
> 
> Should it be capable to take two or more ignore-files?  Or should we
> use the usual "the last one wins" rule?
> 
> I think we should support blame.skipRevFile configuration variable
> so that the users do not have to constantly give the option from the
> command line.  And with that, there is no need to have a hardcoded
> filename .git-blame-ignore-revs or anything like that.
> 
> If we are to use configuration variable, however, we'd need a way to
> override its use from the command line, too.  Perhaps a sane
> arrangement would be
> 
>     - if one or more --skip-revs-file=<file> are given from the
>       command line, use all of them and ignore blame.skipRevsFile
>       configuration variable.
> 
>     - if no --skip-revs-file=<file> is given from the command line,
>       use blame.skipRevsFile configuration variable.
> 
>     - regardless of the above two, pay attention to --skip-rev=<rev>
>       command line option(s).

Sounds fine to me.

> Somehow the damage to blame.c codebase looks way too noisy for what
> the code wants to do.  If all we want is to pretend in a history,
> e.g.
> 
>     ---O---A---B---C---D
> 
> that commit B does not exist, i.e. use a distorted view of the
> history
> 
>     ---O---A-------C---D
> 
> wouldn't it be sufficient to modify pass_blame(), i.e. the only the
> caller of the pass_blame_to_parent(), where we find the parent
> commits of "C" and dig the history to pass blame to "B", and have it
> pretend as if "B" does not exist and pass blame directly to "A"?

I originally tried to skip 'B' in pass_blame() when B popped up as a
scapegoat.  That broke the offsets of the blame_entries in the
parent.  By running diff_hunks(), we get the opportunity to adjust the
offsets.  Also, when it comes to marking the blame_entries for marking
the output, we want to know the specific diffs and to break them up at
the boundaries of [tlno,same) in blame_chunk().

> Thanks.  I am personally not all that interested in this yet.

Thanks for taking a look.

Barret


  reply	other threads:[~2019-01-08 16:27 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-07 21:30 [PATCH] blame: add the ability to ignore commits Barret Rhoden
2019-01-07 23:13 ` Junio C Hamano
2019-01-08 16:27   ` Barret Rhoden [this message]
2019-01-08 18:26     ` Junio C Hamano
2019-01-09 20:48       ` Barret Rhoden
2019-01-10 22:29         ` Junio C Hamano
2019-01-14 15:19           ` Barret Rhoden
2019-01-14 17:45             ` Junio C Hamano
2019-01-14 18:26               ` Randall S. Becker
2019-01-14 19:03                 ` Barret Rhoden
2019-01-15  6:08                 ` Junio C Hamano
2019-01-09 21:21       ` Stefan Beller
2019-01-08 13:12 ` Ævar Arnfjörð Bjarmason
2019-01-08 16:41   ` Barret Rhoden
2019-01-08 18:04     ` Barret Rhoden
2019-01-17 20:29 ` [PATCH v2 0/3] " Barret Rhoden
2019-01-17 20:29   ` [PATCH v2 1/3] Move init_skiplist() outside of fsck Barret Rhoden
2019-01-18  9:25     ` Johannes Schindelin
2019-01-18  9:45     ` Ævar Arnfjörð Bjarmason
2019-01-18 17:36       ` Junio C Hamano
2019-01-18 20:59         ` Johannes Schindelin
2019-01-18 21:30           ` Jeff King
2019-01-18 22:26             ` Ævar Arnfjörð Bjarmason
2019-01-22  7:12               ` Jeff King
2019-01-22  9:46                 ` Ævar Arnfjörð Bjarmason
2019-01-22 17:54                   ` Junio C Hamano
2019-01-22 18:28                   ` Jeff King
2019-01-17 20:29   ` [PATCH v2 2/3] blame: add the ability to ignore commits and their changes Barret Rhoden
2019-01-18  9:47     ` Johannes Schindelin
2019-01-18 17:33       ` Barret Rhoden
2019-01-20 18:19     ` René Scharfe
2019-01-22 15:26       ` Barret Rhoden
2019-01-22 18:14       ` Junio C Hamano
2019-01-22 19:35         ` Barret Rhoden
2019-01-23 19:26           ` Junio C Hamano
2019-02-01 20:37             ` Barret Rhoden
2019-01-17 20:29   ` [PATCH v2 3/3] blame: add a config option to mark ignored lines Barret Rhoden
2019-01-18 10:03     ` Johannes Schindelin
2019-01-18  9:52   ` [PATCH v2 0/3] blame: add the ability to ignore commits Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190108112742.7878d4cb@gnomeregan.cam.corp.google.com \
    --to=brho@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=stefanbeller@gmail.com \
    --cc=whydoubt@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).