From: Junio C Hamano <gitster@pobox.com>
To: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Git List <git@vger.kernel.org>
Subject: Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly
Date: Fri, 24 May 2013 10:31:18 -0700
Message-ID: <7v38tcb7yx.fsf@alter.siamese.dyndns.org>
In-Reply-To: <1369391635-13056-3-git-send-email-artagnon@gmail.com> (Ramkumar Ramachandra's message of "Fri, 24 May 2013 16:03:55 +0530")

Ramkumar Ramachandra <artagnon@gmail.com> writes:

> The documentation of -S and -G is very sketchy.  Completely rewrite the
> sections in Documentation/diff-options.txt and
> Documentation/gitdiffcore.txt.
>
> References:
> 52e9578 ([PATCH] Introducing software archaeologist's tool "pickaxe".)
> f506b8e (git log/diff: add -G<regexp> that greps in the patch text)
>
> Inputs-from: Phil Hord <phil.hord@gmail.com>
> Co-authored-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
> ---
>  Documentation/diff-options.txt | 38 +++++++++++++++++++++++++++++--------
>  Documentation/gitdiffcore.txt  | 43 ++++++++++++++++++++++++------------------
>  2 files changed, 55 insertions(+), 26 deletions(-)
>
> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index 104579d..2835eef 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -383,14 +383,36 @@ ifndef::git-format-patch[]
>  	that matches other criteria, nothing is selected.
>  
>  -S<string>::
> -	Look for differences that introduce or remove an instance of
> -	<string>. Note that this is different than the string simply
> -	appearing in diff output; see the 'pickaxe' entry in
> -	linkgit:gitdiffcore[7] for more details.
> +	Look for differences that change the number of occurrences of
> +	the specified string (i.e. addition/deletion) in a file.
> +	Intended for the scripter's use.
> ++
> +It is especially useful when you're looking for an exact block of code
> +(like a struct), and want to know the history of that block since it
> +first came into being: use the feature iteratively to feed the
> +interesting block in the preimage back into `-S`, and keep going until
> +you get the very first version of the block.

OK, even though I would not say "especially" nor "useful" if I were
writing it, as it is the only use case it was designed for.

>  -G<regex>::
> +	Look for differences whose patch text contains added/removed
> +	lines that match <regex>.
> ++
> +To illustrate the difference between `-S<regex> --pickaxe-regex` and
> +`-G<regex>`, consider a commit with the following diff in the same
> +file:
> ++
> +----
> ++    return !regexec(regexp, two->ptr, 1, &regmatch, 0);
> +...
> +-    hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);
> +----
> ++
> +While `git log -G"regexec\(regexp"` will show this commit, `git log
> +-S"regexec\(regexp" --pickaxe-regex` will not (because the number of
> +occurrences of that string did not change).
> ++
> +See the 'pickaxe' entry in linkgit:gitdiffcore[7] for more
> +information.

OK.
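
As a rough illustration of the distinction drawn in that example, here
is a minimal Python sketch.  It is not how Git implements either option:
the helper names are made up for illustration, Python's `re` and
`difflib` stand in for POSIX regexes and Git's diff machinery, and the
"file" is reduced to the single changed line.

```python
import difflib
import re


def count_changed(pre: str, post: str, pattern: str) -> bool:
    """Rough stand-in for -S with --pickaxe-regex: did the match count change?"""
    rx = re.compile(pattern)
    return len(rx.findall(pre)) != len(rx.findall(post))


def diff_line_matches(pre: str, post: str, pattern: str) -> bool:
    """Rough stand-in for -G: does any added or removed line match?"""
    rx = re.compile(pattern)
    a, b = pre.splitlines(), post.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # a[i1:i2] are removed lines, b[j1:j2] are added lines.
        if any(rx.search(line) for line in a[i1:i2] + b[j1:j2]):
            return True
    return False


# Simplified preimage/postimage built from the two lines quoted above.
PRE = "    hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);\n"
POST = "    return !regexec(regexp, two->ptr, 1, &regmatch, 0);\n"
PATTERN = r"regexec\(regexp"

print(count_changed(PRE, POST, PATTERN))      # False: one match on each side
print(diff_line_matches(PRE, POST, PATTERN))  # True: changed lines match
```

The count-based check sees one occurrence before and one after, so it
stays silent, while the diff-based check fires because both the removed
and the added line match.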

>  --pickaxe-regex::
> -	Make the <string> not a plain string but an extended POSIX
> -	regex to match.
> +	Treat the <string> given to `-S` as an extended POSIX regular
> +	expression to match.

OK.

> diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt
> index 568d757..ef4c04a 100644
> --- a/Documentation/gitdiffcore.txt
> +++ b/Documentation/gitdiffcore.txt
> @@ -222,26 +222,33 @@ version prefixed with '+'.
>  diffcore-pickaxe: For Detecting Addition/Deletion of Specified String
>  ---------------------------------------------------------------------
>  
> -This transformation is used to find filepairs that represent
> -changes that touch a specified string, and is controlled by the
> --S option and the `--pickaxe-all` option to the 'git diff-*'
> -commands.
> -
> -When diffcore-pickaxe is in use, it checks if there are
> -filepairs whose "result" side and whose "origin" side have
> -different number of specified string.  Such a filepair represents
> -"the string appeared in this changeset".  It also checks for the
> -opposite case that loses the specified string.
> -
> -When `--pickaxe-all` is not in effect, diffcore-pickaxe leaves
> -only such filepairs that touch the specified string in its
> -output.  When `--pickaxe-all` is used, diffcore-pickaxe leaves all
> -filepairs intact if there is such a filepair, or makes the
> -output empty otherwise.  The latter behaviour is designed to
> -make reviewing of the changes in the context of the whole


> +There are two kinds of pickaxe: the S kind (corresponding to 'git log
> +-S') and the G kind (mnemonic: grep; corresponding to 'git log -G').

This is good as the beginning of a second paragraph or the second
sentence of the first paragraph.  This patch loses the description
of the general purpose of this machinery that should come at the
very beginning of the section (the original had a very good one, but valid
only back when we had only -S; my "how about this" text did not have
a good one).

For example, the "rename" is about taking one set of filepairs and
expressing (some of) them as renames or copies by merging a deletion
filepair and a creation filepair into a rename-modify filepair, or
turning a creation filepair into a copy-modify filepair by finding a
preimage.  What does this transformation do?

Again here is my attempt for that missing first paragraph:

	This transformation limits the set of filepairs to those
	that change specified strings between the preimage and the
	postimage in a certain way.

	-S<block of text> and -G<regex> options are used to specify
	different ways these strings are sought.  Without
	--pickaxe-all, only the filepairs matching the given
	criterion are left in the output; all filepairs are left in
	the output when --pickaxe-all is used and at least one
	filepair matches the given criterion.

but I do not have enough time now to condense the above down to a
readable paragraph of reasonable length (I expect that the ideal
final form would be like 5-6 lines at most).
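
The filtering rule in the draft paragraph above could be sketched in
Python roughly as follows.  This is only an illustration under stated
assumptions: a "filepair" is modelled as a hypothetical (preimage,
postimage) tuple of strings, and `pickaxe_filter` and `touches_foo` are
invented names, not anything in Git.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a diffcore filepair: (preimage text, postimage text).
Filepair = Tuple[str, str]


def pickaxe_filter(
    pairs: List[Filepair],
    matches: Callable[[Filepair], bool],
    pickaxe_all: bool = False,
) -> List[Filepair]:
    """Keep matching filepairs; with pickaxe_all, keep all or nothing."""
    hits = [pair for pair in pairs if matches(pair)]
    if not pickaxe_all:
        return hits
    # With --pickaxe-all: the whole set survives if at least one pair matched.
    return list(pairs) if hits else []


def touches_foo(pair: Filepair) -> bool:
    """Example criterion in the spirit of -S: occurrence count of 'foo' changed."""
    pre, post = pair
    return pre.count("foo") != post.count("foo")


PAIRS = [("foo\n", "bar\n"), ("x\n", "y\n")]

print(pickaxe_filter(PAIRS, touches_foo))
print(pickaxe_filter(PAIRS, touches_foo, pickaxe_all=True))
```

Passing the criterion in as a callable mirrors the point that -S and -G
differ only in how a match is decided, not in how --pickaxe-all treats
the result.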

> +"-S<block of text>" detects filepairs whose preimage and postimage
> +have different number of occurrences of the specified block of text.
> +By definition, it will not detect in-file moves.  Also, when a
> +changeset moves a file wholesale without affecting the interesting
> +string, rename detection kicks in as usual, and `-S` omits the
> +filepair (since the number of occurrences of that string didn't change
> +in that rename-detected filepair).

I am not sure why it is necessary to say anything about what the
previous step (diffcore-rename) might have done.  The input of this
(or any other) step in the diffcore pipeline is a set of
preimage-postimage filepairs, and to this transformation the
filename does not matter.  Whether a file was moved (either
"wholesale", implying nothing changed, or renamed with modification
at the same time) without touching the block of text, or a file did
not get involved in any renaming, the only thing that matters is
what the preimage and the postimage in a filepair have (or do not
have).

> + The implementation essentially
> +runs a count, and is significantly cheaper than the G kind.  When used
> +with `--pickaxe-regex`, treat the <block of text> as an extended POSIX
> +regular expression to match, instead of a literal string.

Sure.  Is "essentially runs a count" needed, though?  The reader has
just read "number of occurrences of the specified block of text" so
it would be obvious that the implementation counts.  It may be true
that it is significantly cheaper, but because they serve different
purposes, I am not sure it is worth saying.  It is like saying that
a hammer drives a nail into wood significantly faster than a
screwdriver drives a screw, without saying "nail" and "screw".  It
only invites readers to use a hammer to drive a screw.

> +"-G<regular expression>" detects filepairs whose textual diff has an
> +added or a deleted line that matches the given regular expression.
> +This means that it can detect in-file (or what rename-detection
> +considers the same file) moves.

"it can" sounds as if it is always a merit, which is probably not
what you wanted to imply.

When you are trying to see how a particular line came into its
current shape, you would want to know what its previous shape was,
but a literal move will also be shown, which is noise for the
purpose of digging.
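
To make the "literal move" case concrete, here is a small Python sketch.
It assumes a line-based diff from `difflib` as a stand-in for Git's diff,
uses a literal substring where -G would take a regex, and the function
names and sample file contents are invented for illustration.

```python
import difflib


def s_hit(pre: str, post: str, needle: str) -> bool:
    """-S style: the number of occurrences of needle differs."""
    return pre.count(needle) != post.count(needle)


def g_hit(pre: str, post: str, needle: str) -> bool:
    """-G style (literal substring here): an added or removed line contains needle."""
    a, b = pre.splitlines(), post.splitlines()
    ops = difflib.SequenceMatcher(None, a, b, autojunk=False).get_opcodes()
    touched = [
        line
        for tag, i1, i2, j1, j2 in ops
        if tag != "equal"
        for line in a[i1:i2] + b[j1:j2]
    ]
    return any(needle in line for line in touched)


# The init() call moves from the top of the file to the bottom, unchanged.
PRE = "init();\nstep_one();\nstep_two();\nstep_three();\n"
POST = "step_one();\nstep_two();\nstep_three();\ninit();\n"

print(s_hit(PRE, POST, "init("))  # False: still exactly one occurrence
print(g_hit(PRE, POST, "init("))  # True: the line is both removed and added
```

The moved line shows up as a deletion plus an addition in the diff, so
the diff-based check reports it even though its text never changed.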

> +The implementation runs diff twice
> +and greps, and this can be quite expensive.

Unlike the "count" one above which was obvious, the "runs diff and
greps hence expensive" part is worth saying.

> +When `-S` or `-G` are used without `--pickaxe-all`, only filepairs
> +that match their respective criterion are kept in the output.  When
> +`--pickaxe-all` is used, if even one filepair matches their respective
> +criterion in a changeset, the entire changeset is kept.  This behavior
> +is designed to make reviewing changes in the context of the whole
>  changeset easier.

OK.

>  
> -
>  diffcore-order: For Sorting the Output Based on Filenames
>  ---------------------------------------------------------
