git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* git log -S or -G
@ 2018-10-06 15:14 Julia Lawall
  2018-10-06 16:22 ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 11+ messages in thread
From: Julia Lawall @ 2018-10-06 15:14 UTC (permalink / raw)
  To: git

Hello,

Git log -S or -G make it possible to find commits that have particular
words in the changed lines.  Sometimes it would be helpful to search for
words in the removed lines or in the added lines specifically.  From the
implementation, I had the impression that this would be easy to implement.
The main question would be how to allow the user to specify what is
wanted.

thanks,
julia

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: git log -S or -G
  2018-10-06 15:14 git log -S or -G Julia Lawall
@ 2018-10-06 16:22 ` Ævar Arnfjörð Bjarmason
  2018-10-07  0:48   ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-06 16:22 UTC (permalink / raw)
  To: Julia Lawall; +Cc: Git Mailing List

On Sat, Oct 6, 2018 at 5:16 PM Julia Lawall <julia.lawall@lip6.fr> wrote:
> Git log -S or -G make it possible to find commits that have particular
> words in the changed lines.  Sometimes it would be helpful to search for
> words in the removed lines or in the added lines specifically.  From the
> implementation, I had the impression that this would be easy to implement.
> The main question would be how to allow the user to specify what is
> wanted.

As far as I know this isn't possible. The --diff-filter option is
similar in spirit, but e.g. adding "foo" and then removing it from an
existing file will both be covered under --diff-filter=M, so that
isn't what you're looking for.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: git log -S or -G
  2018-10-06 16:22 ` Ævar Arnfjörð Bjarmason
@ 2018-10-07  0:48   ` Junio C Hamano
  2018-10-07  5:21     ` Julia Lawall
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2018-10-07  0:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Julia Lawall, Git Mailing List

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Sat, Oct 6, 2018 at 5:16 PM Julia Lawall <julia.lawall@lip6.fr> wrote:
>> Git log -S or -G make it possible to find commits that have particular
>> words in the changed lines.  Sometimes it would be helpful to search for
>> words in the removed lines or in the added lines specifically.  From the
>> implementation, I had the impression that this would be easy to implement.
>> The main question would be how to allow the user to specify what is
>> wanted.
>
> As far as I know this isn't possible. The --diff-filter option is
> similar in spirit, but e.g. adding "foo" and then removing it from an
> existing file will both be covered under --diff-filter=M, so that
> isn't what you're looking for.

I agree with Julia that UI to the feature is harder than the
machinery to implement the feature to add "I am interested in seeing
a patch that contains a hunk that adds 'foo' but am not interested
in removal" (or vice versa) for -G.  You tweak
diffcore-pickaxe.c::diffgrep_consume() and you'are done.

Doing the same for -S is much harder at the machinery level, as it
performs its thing without internally running "diff" twice, but just
counts the number of occurrences of 'foo'---that is sufficient for
its intended use, and more efficient.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: git log -S or -G
  2018-10-07  0:48   ` Junio C Hamano
@ 2018-10-07  5:21     ` Julia Lawall
  2018-10-08 23:09       ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Julia Lawall @ 2018-10-07  5:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 1499 bytes --]



On Sun, 7 Oct 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
> > On Sat, Oct 6, 2018 at 5:16 PM Julia Lawall <julia.lawall@lip6.fr> wrote:
> >> Git log -S or -G make it possible to find commits that have particular
> >> words in the changed lines.  Sometimes it would be helpful to search for
> >> words in the removed lines or in the added lines specifically.  From the
> >> implementation, I had the impression that this would be easy to implement.
> >> The main question would be how to allow the user to specify what is
> >> wanted.
> >
> > As far as I know this isn't possible. The --diff-filter option is
> > similar in spirit, but e.g. adding "foo" and then removing it from an
> > existing file will both be covered under --diff-filter=M, so that
> > isn't what you're looking for.
>
> I agree with Julia that UI to the feature is harder than the
> machinery to implement the feature to add "I am interested in seeing
> a patch that contains a hunk that adds 'foo' but am not interested
> in removal" (or vice versa) for -G.  You tweak
> diffcore-pickaxe.c::diffgrep_consume() and you'are done.
>
> Doing the same for -S is much harder at the machinery level, as it
> performs its thing without internally running "diff" twice, but just
> counts the number of occurrences of 'foo'---that is sufficient for
> its intended use, and more efficient.

There is still the question of whether the number of occurrences of foo
decreases or increases.

julia

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: git log -S or -G
  2018-10-07  5:21     ` Julia Lawall
@ 2018-10-08 23:09       ` Junio C Hamano
  2018-10-09  3:21         ` Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2018-10-08 23:09 UTC (permalink / raw)
  To: Julia Lawall; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

Julia Lawall <julia.lawall@lip6.fr> writes:

>> Doing the same for -S is much harder at the machinery level, as it
>> performs its thing without internally running "diff" twice, but just
>> counts the number of occurrences of 'foo'---that is sufficient for
>> its intended use, and more efficient.
>
> There is still the question of whether the number of occurrences of foo
> decreases or increases.

Hmph, taking the changes that makes the number of hits decrease
would catch a subset of "changes that removes 'foo' only---I am not
interested in the ones that adds 'foo'".  It will avoid getting
confused by a change that moves an existing 'foo' to another place
in the same file (as the number of hits does not change), but at the
same time, it will miss a change that genuinely removes an existing
'foo' and happens to add a 'foo' at a different place in the same
file that is unrelated to the original 'foo'.  Depending on the
definition of "I am only interested in removed ones", that may or
may not be acceptable.




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: git log -S or -G
  2018-10-08 23:09       ` Junio C Hamano
@ 2018-10-09  3:21         ` Jeff King
  2018-10-09  3:58           ` Jacob Keller
  2018-10-09  5:21           ` Junio C Hamano
  0 siblings, 2 replies; 11+ messages in thread
From: Jeff King @ 2018-10-09  3:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Julia Lawall, Ævar Arnfjörð Bjarmason,
	Git Mailing List

On Tue, Oct 09, 2018 at 08:09:32AM +0900, Junio C Hamano wrote:

> Julia Lawall <julia.lawall@lip6.fr> writes:
> 
> >> Doing the same for -S is much harder at the machinery level, as it
> >> performs its thing without internally running "diff" twice, but just
> >> counts the number of occurrences of 'foo'---that is sufficient for
> >> its intended use, and more efficient.
> >
> > There is still the question of whether the number of occurrences of foo
> > decreases or increases.
> 
> Hmph, taking the changes that makes the number of hits decrease
> would catch a subset of "changes that removes 'foo' only---I am not
> interested in the ones that adds 'foo'".  It will avoid getting
> confused by a change that moves an existing 'foo' to another place
> in the same file (as the number of hits does not change), but at the
> same time, it will miss a change that genuinely removes an existing
> 'foo' and happens to add a 'foo' at a different place in the same
> file that is unrelated to the original 'foo'.  Depending on the
> definition of "I am only interested in removed ones", that may or
> may not be acceptable.

I think that is the best we could do for "-S", though, which is
inherently about counting hits.

For "-G", we are literally grepping the diff. It does not seem
unreasonable to add the ability to grep only "-" or "+" lines, and the
interface for that should be pretty straightforward (a tri-state flag to
look in remove, added, or both lines).

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: git log -S or -G
  2018-10-09  3:21         ` Jeff King
@ 2018-10-09  3:58           ` Jacob Keller
  2018-10-09  6:39             ` Julia Lawall
  2018-10-09  5:21           ` Junio C Hamano
  1 sibling, 1 reply; 11+ messages in thread
From: Jacob Keller @ 2018-10-09  3:58 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Julia Lawall,
	Ævar Arnfjörð Bjarmason, Git mailing list

On Mon, Oct 8, 2018 at 8:22 PM Jeff King <peff@peff.net> wrote:
>
> On Tue, Oct 09, 2018 at 08:09:32AM +0900, Junio C Hamano wrote:
>
> > Julia Lawall <julia.lawall@lip6.fr> writes:
> >
> > >> Doing the same for -S is much harder at the machinery level, as it
> > >> performs its thing without internally running "diff" twice, but just
> > >> counts the number of occurrences of 'foo'---that is sufficient for
> > >> its intended use, and more efficient.
> > >
> > > There is still the question of whether the number of occurrences of foo
> > > decreases or increases.
> >
> > Hmph, taking the changes that makes the number of hits decrease
> > would catch a subset of "changes that removes 'foo' only---I am not
> > interested in the ones that adds 'foo'".  It will avoid getting
> > confused by a change that moves an existing 'foo' to another place
> > in the same file (as the number of hits does not change), but at the
> > same time, it will miss a change that genuinely removes an existing
> > 'foo' and happens to add a 'foo' at a different place in the same
> > file that is unrelated to the original 'foo'.  Depending on the
> > definition of "I am only interested in removed ones", that may or
> > may not be acceptable.
>
> I think that is the best we could do for "-S", though, which is
> inherently about counting hits.
>
> For "-G", we are literally grepping the diff. It does not seem
> unreasonable to add the ability to grep only "-" or "+" lines, and the
> interface for that should be pretty straightforward (a tri-state flag to
> look in remove, added, or both lines).
>
> -Peff

Yea. I know I've wanted something like this in the past.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: git log -S or -G
  2018-10-09  3:21         ` Jeff King
  2018-10-09  3:58           ` Jacob Keller
@ 2018-10-09  5:21           ` Junio C Hamano
  2018-10-09 12:45             ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2018-10-09  5:21 UTC (permalink / raw)
  To: Jeff King
  Cc: Julia Lawall, Ævar Arnfjörð Bjarmason,
	Git Mailing List

Jeff King <peff@peff.net> writes:

> I think that is the best we could do for "-S", though, which is
> inherently about counting hits.
>
> For "-G", we are literally grepping the diff. It does not seem
> unreasonable to add the ability to grep only "-" or "+" lines, and the
> interface for that should be pretty straightforward (a tri-state flag to
> look in remove, added, or both lines).

Yeah, here is a lunchtime hack that hasn't even been compile tested.

 diff.c             |  4 ++++
 diff.h             |  2 ++
 diffcore-pickaxe.c | 22 ++++++++++++++++++++--
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index f0c7557b40..d1f2780844 100644
--- a/diff.c
+++ b/diff.c
@@ -5068,6 +5068,10 @@ int diff_opt_parse(struct diff_options *options,
 	}
 	else if (!strcmp(arg, "--pickaxe-all"))
 		options->pickaxe_opts |= DIFF_PICKAXE_ALL;
+	else if (!strcmp(arg, "--pickaxe-ignore-add"))
+		options->pickaxe_opts |= DIFF_PICKAXE_IGNORE_ADD;
+	else if (!strcmp(arg, "--pickaxe-ignore-del"))
+		options->pickaxe_opts |= DIFF_PICKAXE_IGNORE_DEL;
 	else if (!strcmp(arg, "--pickaxe-regex"))
 		options->pickaxe_opts |= DIFF_PICKAXE_REGEX;
 	else if ((argcount = short_opt('O', av, &optarg))) {
diff --git a/diff.h b/diff.h
index a30cc35ec3..147c47ace7 100644
--- a/diff.h
+++ b/diff.h
@@ -358,6 +358,8 @@ int git_config_rename(const char *var, const char *value);
 				 DIFF_PICKAXE_KIND_OBJFIND)
 
 #define DIFF_PICKAXE_IGNORE_CASE	32
+#define DIFF_PICKAXE_IGNORE_ADD	64
+#define DIFF_PICKAXE_IGNORE_DEL 128
 
 void diffcore_std(struct diff_options *);
 void diffcore_fix_diff_index(struct diff_options *);
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 800a899c86..826dde6bd4 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -16,6 +16,7 @@ typedef int (*pickaxe_fn)(mmfile_t *one, mmfile_t *two,
 
 struct diffgrep_cb {
 	regex_t *regexp;
+	struct diff_options *diff_options;
 	int hit;
 };
 
@@ -23,9 +24,14 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len)
 {
 	struct diffgrep_cb *data = priv;
 	regmatch_t regmatch;
+	unsigned pickaxe_opts = data->diff_options->pickaxe_opts;
 
 	if (line[0] != '+' && line[0] != '-')
 		return;
+	if ((pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD) &&	line[0] == '+')
+		return;
+	if ((pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL) &&	line[0] == '-')
+		return;
 	if (data->hit)
 		/*
 		 * NEEDSWORK: we should have a way to terminate the
@@ -45,13 +51,20 @@ static int diff_grep(mmfile_t *one, mmfile_t *two,
 	xpparam_t xpp;
 	xdemitconf_t xecfg;
 
-	if (!one)
+	if (!one) {
+		if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD)
+			return 0;
 		return !regexec_buf(regexp, two->ptr, two->size,
 				    1, &regmatch, 0);
-	if (!two)
+	}
+	if (!two) {
+		if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL)
+			return 0;
 		return !regexec_buf(regexp, one->ptr, one->size,
 				    1, &regmatch, 0);
+	}
 
+	ecbdata.diff_options = o;
 	/*
 	 * We have both sides; need to run textual diff and see if
 	 * the pattern appears on added/deleted lines.
@@ -113,6 +126,11 @@ static int has_changes(mmfile_t *one, mmfile_t *two,
 {
 	unsigned int one_contains = one ? contains(one, regexp, kws) : 0;
 	unsigned int two_contains = two ? contains(two, regexp, kws) : 0;
+
+	if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD)
+		return one_contains > two_contains;
+	if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL)
+		return one_contains < two_contains;
 	return one_contains != two_contains;
 }
 

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: git log -S or -G
  2018-10-09  3:58           ` Jacob Keller
@ 2018-10-09  6:39             ` Julia Lawall
  0 siblings, 0 replies; 11+ messages in thread
From: Julia Lawall @ 2018-10-09  6:39 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Jeff King, Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Git mailing list



On Mon, 8 Oct 2018, Jacob Keller wrote:

> On Mon, Oct 8, 2018 at 8:22 PM Jeff King <peff@peff.net> wrote:
> >
> > On Tue, Oct 09, 2018 at 08:09:32AM +0900, Junio C Hamano wrote:
> >
> > > Julia Lawall <julia.lawall@lip6.fr> writes:
> > >
> > > >> Doing the same for -S is much harder at the machinery level, as it
> > > >> performs its thing without internally running "diff" twice, but just
> > > >> counts the number of occurrences of 'foo'---that is sufficient for
> > > >> its intended use, and more efficient.
> > > >
> > > > There is still the question of whether the number of occurrences of foo
> > > > decreases or increases.
> > >
> > > Hmph, taking the changes that makes the number of hits decrease
> > > would catch a subset of "changes that removes 'foo' only---I am not
> > > interested in the ones that adds 'foo'".  It will avoid getting
> > > confused by a change that moves an existing 'foo' to another place
> > > in the same file (as the number of hits does not change), but at the
> > > same time, it will miss a change that genuinely removes an existing
> > > 'foo' and happens to add a 'foo' at a different place in the same
> > > file that is unrelated to the original 'foo'.  Depending on the
> > > definition of "I am only interested in removed ones", that may or
> > > may not be acceptable.
> >
> > I think that is the best we could do for "-S", though, which is
> > inherently about counting hits.
> >
> > For "-G", we are literally grepping the diff. It does not seem
> > unreasonable to add the ability to grep only "-" or "+" lines, and the
> > interface for that should be pretty straightforward (a tri-state flag to
> > look in remove, added, or both lines).
> >
> > -Peff
>
> Yea. I know I've wanted something like this in the past.

It could also be nice to be able to specify multiple patterns, with and
and or between them.  So -A&-B would be remove A somewhere and remove B
somewhere.

julia

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: git log -S or -G
  2018-10-09  5:21           ` Junio C Hamano
@ 2018-10-09 12:45             ` Ævar Arnfjörð Bjarmason
  2018-10-09 13:51               ` Julia Lawall
  0 siblings, 1 reply; 11+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-10-09 12:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Julia Lawall, Git Mailing List


On Tue, Oct 09 2018, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
>
>> I think that is the best we could do for "-S", though, which is
>> inherently about counting hits.
>>
>> For "-G", we are literally grepping the diff. It does not seem
>> unreasonable to add the ability to grep only "-" or "+" lines, and the
>> interface for that should be pretty straightforward (a tri-state flag to
>> look in remove, added, or both lines).
>
> Yeah, here is a lunchtime hack that hasn't even been compile tested.
>
>  diff.c             |  4 ++++
>  diff.h             |  2 ++
>  diffcore-pickaxe.c | 22 ++++++++++++++++++++--
>  3 files changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index f0c7557b40..d1f2780844 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -5068,6 +5068,10 @@ int diff_opt_parse(struct diff_options *options,
>  	}
>  	else if (!strcmp(arg, "--pickaxe-all"))
>  		options->pickaxe_opts |= DIFF_PICKAXE_ALL;
> +	else if (!strcmp(arg, "--pickaxe-ignore-add"))
> +		options->pickaxe_opts |= DIFF_PICKAXE_IGNORE_ADD;
> +	else if (!strcmp(arg, "--pickaxe-ignore-del"))
> +		options->pickaxe_opts |= DIFF_PICKAXE_IGNORE_DEL;
>  	else if (!strcmp(arg, "--pickaxe-regex"))
>  		options->pickaxe_opts |= DIFF_PICKAXE_REGEX;
>  	else if ((argcount = short_opt('O', av, &optarg))) {
> diff --git a/diff.h b/diff.h
> index a30cc35ec3..147c47ace7 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -358,6 +358,8 @@ int git_config_rename(const char *var, const char *value);
>  				 DIFF_PICKAXE_KIND_OBJFIND)
>
>  #define DIFF_PICKAXE_IGNORE_CASE	32
> +#define DIFF_PICKAXE_IGNORE_ADD	64
> +#define DIFF_PICKAXE_IGNORE_DEL 128
>
>  void diffcore_std(struct diff_options *);
>  void diffcore_fix_diff_index(struct diff_options *);
> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> index 800a899c86..826dde6bd4 100644
> --- a/diffcore-pickaxe.c
> +++ b/diffcore-pickaxe.c
> @@ -16,6 +16,7 @@ typedef int (*pickaxe_fn)(mmfile_t *one, mmfile_t *two,
>
>  struct diffgrep_cb {
>  	regex_t *regexp;
> +	struct diff_options *diff_options;
>  	int hit;
>  };
>
> @@ -23,9 +24,14 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len)
>  {
>  	struct diffgrep_cb *data = priv;
>  	regmatch_t regmatch;
> +	unsigned pickaxe_opts = data->diff_options->pickaxe_opts;
>
>  	if (line[0] != '+' && line[0] != '-')
>  		return;
> +	if ((pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD) &&	line[0] == '+')
> +		return;
> +	if ((pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL) &&	line[0] == '-')
> +		return;
>  	if (data->hit)
>  		/*

Looks good, but I wonder if a more general version of this couldn't be
that instead of returning early if the line doesn't start with +/-
above, we have an option to skip that early return.

Then you can simply specify a regex that starts by matching a + or - at
the start of the line, but you also get the poweruser tool of matching
lines around those lines, as tweaked by the -U option. I.e. this:

diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 800a899c86..90625a110c 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -24,15 +24,13 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len)
        struct diffgrep_cb *data = priv;
        regmatch_t regmatch;

-       if (line[0] != '+' && line[0] != '-')
-               return;
        if (data->hit)
                /*
                 * NEEDSWORK: we should have a way to terminate the
                 * caller early.
                 */
                return;
-       data->hit = !regexec_buf(data->regexp, line + 1, len - 1, 1,
+       data->hit = !regexec_buf(data->regexp, line, len, 1,
                                 &regmatch, 0);
 }

That patch obviously breaks existing -G, so it would need to be
optional, but allows me e.g. on git.git to do:

    ~/g/git/git --exec-path=/home/avar/g/git log -G'^ .*marc\.info' -p -U2 -- README.md

To find a change whose first line of context is a line mentioning
marc.info, and then I can use -G'^\+<rx>' to find added lines matching
<rx> etc.

Then the --pickaxe-ignore-add and --pickaxe-ignore-del options in your
patch could just be implemented in terms of that feature, i.e. by
implicitly adding a "^-" or "^\+" to the beginning of the regex,
respectively, and implicitly turning on a new --pickaxe-raw-lines or
whatever we'd call it.

>  		 * NEEDSWORK: we should have a way to terminate the
> @@ -45,13 +51,20 @@ static int diff_grep(mmfile_t *one, mmfile_t *two,
>  	xpparam_t xpp;
>  	xdemitconf_t xecfg;
>
> -	if (!one)
> +	if (!one) {
> +		if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD)
> +			return 0;
>  		return !regexec_buf(regexp, two->ptr, two->size,
>  				    1, &regmatch, 0);
> -	if (!two)
> +	}
> +	if (!two) {
> +		if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL)
> +			return 0;
>  		return !regexec_buf(regexp, one->ptr, one->size,
>  				    1, &regmatch, 0);
> +	}
>
> +	ecbdata.diff_options = o;
>  	/*
>  	 * We have both sides; need to run textual diff and see if
>  	 * the pattern appears on added/deleted lines.
> @@ -113,6 +126,11 @@ static int has_changes(mmfile_t *one, mmfile_t *two,
>  {
>  	unsigned int one_contains = one ? contains(one, regexp, kws) : 0;
>  	unsigned int two_contains = two ? contains(two, regexp, kws) : 0;
> +
> +	if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD)
> +		return one_contains > two_contains;
> +	if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL)
> +		return one_contains < two_contains;
>  	return one_contains != two_contains;
>  }
>

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: git log -S or -G
  2018-10-09 12:45             ` Ævar Arnfjörð Bjarmason
@ 2018-10-09 13:51               ` Julia Lawall
  0 siblings, 0 replies; 11+ messages in thread
From: Julia Lawall @ 2018-10-09 13:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Jeff King, Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 5993 bytes --]



On Tue, 9 Oct 2018, Ævar Arnfjörð Bjarmason wrote:

>
> On Tue, Oct 09 2018, Junio C Hamano wrote:
>
> > Jeff King <peff@peff.net> writes:
> >
> >> I think that is the best we could do for "-S", though, which is
> >> inherently about counting hits.
> >>
> >> For "-G", we are literally grepping the diff. It does not seem
> >> unreasonable to add the ability to grep only "-" or "+" lines, and the
> >> interface for that should be pretty straightforward (a tri-state flag to
> >> look in remove, added, or both lines).
> >
> > Yeah, here is a lunchtime hack that hasn't even been compile tested.
> >
> >  diff.c             |  4 ++++
> >  diff.h             |  2 ++
> >  diffcore-pickaxe.c | 22 ++++++++++++++++++++--
> >  3 files changed, 26 insertions(+), 2 deletions(-)
> >
> > diff --git a/diff.c b/diff.c
> > index f0c7557b40..d1f2780844 100644
> > --- a/diff.c
> > +++ b/diff.c
> > @@ -5068,6 +5068,10 @@ int diff_opt_parse(struct diff_options *options,
> >  	}
> >  	else if (!strcmp(arg, "--pickaxe-all"))
> >  		options->pickaxe_opts |= DIFF_PICKAXE_ALL;
> > +	else if (!strcmp(arg, "--pickaxe-ignore-add"))
> > +		options->pickaxe_opts |= DIFF_PICKAXE_IGNORE_ADD;
> > +	else if (!strcmp(arg, "--pickaxe-ignore-del"))
> > +		options->pickaxe_opts |= DIFF_PICKAXE_IGNORE_DEL;
> >  	else if (!strcmp(arg, "--pickaxe-regex"))
> >  		options->pickaxe_opts |= DIFF_PICKAXE_REGEX;
> >  	else if ((argcount = short_opt('O', av, &optarg))) {
> > diff --git a/diff.h b/diff.h
> > index a30cc35ec3..147c47ace7 100644
> > --- a/diff.h
> > +++ b/diff.h
> > @@ -358,6 +358,8 @@ int git_config_rename(const char *var, const char *value);
> >  				 DIFF_PICKAXE_KIND_OBJFIND)
> >
> >  #define DIFF_PICKAXE_IGNORE_CASE	32
> > +#define DIFF_PICKAXE_IGNORE_ADD	64
> > +#define DIFF_PICKAXE_IGNORE_DEL 128
> >
> >  void diffcore_std(struct diff_options *);
> >  void diffcore_fix_diff_index(struct diff_options *);
> > diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> > index 800a899c86..826dde6bd4 100644
> > --- a/diffcore-pickaxe.c
> > +++ b/diffcore-pickaxe.c
> > @@ -16,6 +16,7 @@ typedef int (*pickaxe_fn)(mmfile_t *one, mmfile_t *two,
> >
> >  struct diffgrep_cb {
> >  	regex_t *regexp;
> > +	struct diff_options *diff_options;
> >  	int hit;
> >  };
> >
> > @@ -23,9 +24,14 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len)
> >  {
> >  	struct diffgrep_cb *data = priv;
> >  	regmatch_t regmatch;
> > +	unsigned pickaxe_opts = data->diff_options->pickaxe_opts;
> >
> >  	if (line[0] != '+' && line[0] != '-')
> >  		return;
> > +	if ((pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD) &&	line[0] == '+')
> > +		return;
> > +	if ((pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL) &&	line[0] == '-')
> > +		return;
> >  	if (data->hit)
> >  		/*
>
> Looks good, but I wonder if a more general version of this couldn't be
> that instead of returning early if the line doesn't start with +/-
> above, we have an option to skip that early return.
>
> Then you can simply specify a regex that starts by matching a + or - at
> the start of the line, but you also get the poweruser tool of matching
> lines around those lines, as tweaked by the -U option. I.e. this:
>
> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> index 800a899c86..90625a110c 100644
> --- a/diffcore-pickaxe.c
> +++ b/diffcore-pickaxe.c
> @@ -24,15 +24,13 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len)
>         struct diffgrep_cb *data = priv;
>         regmatch_t regmatch;
>
> -       if (line[0] != '+' && line[0] != '-')
> -               return;
>         if (data->hit)
>                 /*
>                  * NEEDSWORK: we should have a way to terminate the
>                  * caller early.
>                  */
>                 return;
> -       data->hit = !regexec_buf(data->regexp, line + 1, len - 1, 1,
> +       data->hit = !regexec_buf(data->regexp, line, len, 1,
>                                  &regmatch, 0);
>  }
>
> That patch obviously breaks existing -G, so it would need to be
> optional, but allows me e.g. on git.git to do:
>
>     ~/g/git/git --exec-path=/home/avar/g/git log -G'^ .*marc\.info' -p -U2 -- README.md
>
> To find a change whose first line of context is a line mentioning
> marc.info, and then I can use -G'^\+<rx>' to find added lines matching
> <rx> etc.

Do -G's accumulate?  I had the impression that only the last one was taken
into account, but I didn't check the code on that.

julia

>
> Then the --pickaxe-ignore-add and --pickaxe-ignore-del options in your
> patch could just be implemented in terms of that feature, i.e. by
> implicitly adding a "^-" or "^\+" to the beginning of the regex,
> respectively, and implicitly turning on a new --pickaxe-raw-lines or
> whatever we'd call it.
>
> >  		 * NEEDSWORK: we should have a way to terminate the
> > @@ -45,13 +51,20 @@ static int diff_grep(mmfile_t *one, mmfile_t *two,
> >  	xpparam_t xpp;
> >  	xdemitconf_t xecfg;
> >
> > -	if (!one)
> > +	if (!one) {
> > +		if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD)
> > +			return 0;
> >  		return !regexec_buf(regexp, two->ptr, two->size,
> >  				    1, &regmatch, 0);
> > -	if (!two)
> > +	}
> > +	if (!two) {
> > +		if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL)
> > +			return 0;
> >  		return !regexec_buf(regexp, one->ptr, one->size,
> >  				    1, &regmatch, 0);
> > +	}
> >
> > +	ecbdata.diff_options = o;
> >  	/*
> >  	 * We have both sides; need to run textual diff and see if
> >  	 * the pattern appears on added/deleted lines.
> > @@ -113,6 +126,11 @@ static int has_changes(mmfile_t *one, mmfile_t *two,
> >  {
> >  	unsigned int one_contains = one ? contains(one, regexp, kws) : 0;
> >  	unsigned int two_contains = two ? contains(two, regexp, kws) : 0;
> > +
> > +	if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD)
> > +		return one_contains > two_contains;
> > +	if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL)
> > +		return one_contains < two_contains;
> >  	return one_contains != two_contains;
> >  }
> >
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-10-09 13:51 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-06 15:14 git log -S or -G Julia Lawall
2018-10-06 16:22 ` Ævar Arnfjörð Bjarmason
2018-10-07  0:48   ` Junio C Hamano
2018-10-07  5:21     ` Julia Lawall
2018-10-08 23:09       ` Junio C Hamano
2018-10-09  3:21         ` Jeff King
2018-10-09  3:58           ` Jacob Keller
2018-10-09  6:39             ` Julia Lawall
2018-10-09  5:21           ` Junio C Hamano
2018-10-09 12:45             ` Ævar Arnfjörð Bjarmason
2018-10-09 13:51               ` Julia Lawall

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).