git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Git List <git@vger.kernel.org>, Phil Hord <phil.hord@gmail.com>
Subject: Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly
Date: Fri, 17 May 2013 18:24:22 -0700	[thread overview]
Message-ID: <7vfvxlw055.fsf@alter.siamese.dyndns.org> (raw)
In-Reply-To: <1368793403-4642-3-git-send-email-artagnon@gmail.com> (Ramkumar Ramachandra's message of "Fri, 17 May 2013 17:53:23 +0530")

Ramkumar Ramachandra <artagnon@gmail.com> writes:

> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
> index 104579d..b61a666 100644
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -383,14 +383,35 @@ ifndef::git-format-patch[]
>  	that matches other criteria, nothing is selected.
>  
>  -S<string>::
> -	Look for differences that introduce or remove an instance of
> -	<string>. Note that this is different than the string simply
> -	appearing in diff output; see the 'pickaxe' entry in
> -	linkgit:gitdiffcore[7] for more details.
> +	Look for commits that change the number of occurrences of the

The first part of the change is misguided.  First of all, this text
also appears in the documentation of "git diff" and -S limits the
output to those filepairs that match its criteria.

"git log A..B -- foo" looks for commits that has changes in paths
that matches 'foo'.  "git log" that selects commits is affected by
the outcome of what "diff" shows.  This "looks for *commits*" is a
characteristic of "log", not specific to the -S option.

The aspect of "diff" behaviour that is affected by -S is what is
considered as difference.  Usually "diff" says "preimage and
postimage are different" for any change, but -S changes that.  The
preimage and postimage has to have different number of specified
block of text to be considered different.

And that is why the original says "look for differences".

> +	specified string (i.e. addition/ deletion) in a file.
> +	Intended for the scripter's use.
> ++
> +It is especially useful when you're looking for an exact block of code
> +(like a struct), and want to know the history of that block since it
> +first came into being.

I am not sure this half-way description is a good idea.  If you want
to use it to discover what happend "since it first came into being",
you need to use this (and the feature is designed for such a use
pattern) iteratively, find the first commit that changes the block
of the text that appear in the latest version, find the corresponding
block of interest in the preimage and then feed that to -S and start
digging from that first-discovered commit.  If the text describes
that iteration fully, I think that is fine, but if you read the
above literally, it looks as if you feed one version of -S and it
would do the necessary adjustment on its own, which is misleading.

>  -G<regex>::
> -	Look for differences whose added or removed line matches
> -	the given <regex>.
> +	Grep through the patch text of commits for added/removed lines
> +	that match <regex>.  `--pickaxe-regex` is implied in this
> +	mode.

The same comment on differences vs commits apply to this.

If there were any behaviour change --pickaxe-regex introduces to -S
that also applies to -G "other than the pattern is used as a regular
expression", I would agree it is a good idea to say that other
behaviour is implied, but as far as I know, there is no such
implication.  Drop --pickaxe-regex, as it is not even implied in the
code:

	} else if ((argcount = short_opt('G', av, &optarg))) {
		options->pickaxe = optarg;
		options->pickaxe_opts |= DIFF_PICKAXE_KIND_G;
		return argcount;
	}

so even if there were some code added in the future that does

	if (options->pickaxe_opts & DIFF_PICKAXE_REGEX) {
		do this thing only when the user said
                --pickaxe-regex from the command line
	}

it will _not_ apply to users of -G.

"grep through", if the reader knows "grep", with "match <regex>", it
is crystal clear that this expects a regular expression.  And that
is the only thing that makes -G and --pickaxe-regex superficially
related.

> ++
> +To illustrate the difference between `-S<regex> --pickaxe-regex` and
> +`-G<regex>`, consider a commit with the following diff in the same
> +file:
> ++
> +----
> ++    return !regexec(regexp, two->ptr, 1, &regmatch, 0);
> +...
> +-    hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);
> +----
> ++
> +While `git log -G"regexec\(regexp"` will show this commit, `git log
> +-S"regexec\(regexp" --pickaxe-regex` will not (because the number of
> +occurrences of that string didn't change).
> ++
> +See the 'pickaxe' entry in linkgit:gitdiffcore[7] for more
> +information.

This is a readable example (I think we tend not to use contraction,
so if I were writing it, I wouldn't have written "didn't change",
though).

>  --pickaxe-regex::
> -	Make the <string> not a plain string but an extended POSIX
> -	regex to match.
> +	Treat the <string> not as a plain string, but an extended
> +	POSIX regex to match.  It is implied when `-G` is used.

Ditto.  Rather, 

	Treat the <string> given to the -S option as an extended
	POSIX regular expression to match.

> diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt
> index 568d757..d0f2b91 100644
> --- a/Documentation/gitdiffcore.txt
> +++ b/Documentation/gitdiffcore.txt
> @@ -222,25 +222,34 @@ version prefixed with '+'.
>  diffcore-pickaxe: For Detecting Addition/Deletion of Specified String
>  ---------------------------------------------------------------------
>  
> -This transformation is used to find filepairs that represent
> -changes that touch a specified string, and is controlled by the
> --S option and the `--pickaxe-all` option to the 'git diff-*'
> -commands.
> -
> -When diffcore-pickaxe is in use, it checks if there are
> -filepairs whose "result" side and whose "origin" side have
> -different number of specified string.  Such a filepair represents
> -"the string appeared in this changeset".  It also checks for the
> -opposite case that loses the specified string.
> -
> -When `--pickaxe-all` is not in effect, diffcore-pickaxe leaves
> -only such filepairs that touch the specified string in its
> -output.  When `--pickaxe-all` is used, diffcore-pickaxe leaves all
> -filepairs intact if there is such a filepair, or makes the
> -output empty otherwise.  The latter behaviour is designed to
> -make reviewing of the changes in the context of the whole
> -changeset easier.

This part is impossible to review on a bus, so I won't comment in
this message.

Why did you even have to touch the paragraph for --pickaxe-all?
That applies to both -S and -G.  I thought it would be just the
matter of slightly tweaking the introductory paragraph (which was
written back when there was only -S), keeping the second paragraph
for -S as-is, and insert an additional paragraph for -G before
--pickaxe-all.

> +There are two kinds of pickaxe: the S kind (corresponding to 'git log
> +-S') and the G kind (mnemonic: grep; corresponding to 'git log -G').
> +
> +The S kind detects filepairs whose "result" side and "origin" side
> +have different number of occurrences of specified string.  By
> +definition, it will not detect in-file moves.  Also, when a commit
> +moves a file wholesale without affecting the string being looked at,
> +rename detection kicks in as usual, and 'git log -S' omits the commit
> +(since the number of occurrences of that string didn't change in that
> +rename-detected filepair).  The implementation essentially runs a
> +count, and is significantly cheaper than the G kind.
> +
> +The G kind detects filepairs whose patch text has an added or a
> +deleted line that matches the given regexp.  This means that it can
> +detect in-file (or what rename-detection considers the same file)
> +moves.  The implementation of 'git log -G' runs diff twice and greps,
> +and this can be quite expensive.
> +
> +When `--pickaxe-regex` is used with `-S`, treat the <string> not as a
> +plain string, but an extended POSIX regex to match.  It is implied
> +when `-G` is used.
> +
> +When `--pickaxe-all` is not in effect, diffcore-pickaxe leaves only
> +the filepairs that touch the specified string in its output.  When in
> +effect, diffcore-pickaxe leaves all filepairs intact if there is such
> +a filepair, or makes the output empty otherwise.  The latter behavior
> +is designed to make reviewing of the changes in the context of the
> +whole changeset easier.
>  
>  diffcore-order: For Sorting the Output Based on Filenames
>  ---------------------------------------------------------

  reply	other threads:[~2013-05-18  1:24 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-17 12:23 [PATCH v2 0/2] Improve diffcore-pickaxe documentation Ramkumar Ramachandra
2013-05-17 12:23 ` [PATCH 1/2] diffcore-pickaxe: make error messages more consistent Ramkumar Ramachandra
2013-05-18  1:25   ` Junio C Hamano
2013-05-17 12:23 ` [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly Ramkumar Ramachandra
2013-05-18  1:24   ` Junio C Hamano [this message]
2013-05-19  7:33     ` Junio C Hamano
2013-05-24  9:37       ` Ramkumar Ramachandra
2013-05-24 14:58         ` Phil Hord
2013-05-24 16:01           ` Ramkumar Ramachandra
2013-05-24 16:54           ` Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2013-05-24 10:33 [PATCH v3 0/2] Improve diffcore-pickaxe documentation Ramkumar Ramachandra
2013-05-24 10:33 ` [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly Ramkumar Ramachandra
2013-05-24 17:31   ` Junio C Hamano
2013-05-31 12:04     ` Ramkumar Ramachandra
2013-06-02 19:56       ` Junio C Hamano
2013-06-02 20:28         ` Ramkumar Ramachandra
2013-05-31 12:12 [PATCH v4 0/2] Improve diffcore-pickaxe documentation Ramkumar Ramachandra
2013-05-31 12:12 ` [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly Ramkumar Ramachandra
2013-06-03 17:54   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vfvxlw055.fsf@alter.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=artagnon@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=phil.hord@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).