* Git bug: Filter ignored when "--invert-grep" option is used.
From: Dotan Cohen @ 2021-12-15 9:50 UTC
  To: git

What did you do before the bug happened?
$ git log -8 --author=Shachar --grep=Revert --invert-grep

What did you expect to happen?
I expected to see the last 8 commits from Shachar that did not have
the string "Revert" in the commit message.

What happened instead?
The list of commits included commits by authors other than Shachar.

What's different between what you expected and what actually happened?
The "--author" filter seems to be ignored when the "--invert-grep"
option is used.
I also tried to change the order of the options, but the results
remained the same.

[System Info]
git version:
git version 2.34.1
cpu: x86_64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 5.11.0-41-generic #45~20.04.1-Ubuntu SMP Wed Nov 10 10:20:10 UTC 2021 x86_64
compiler info: gnuc: 9.3
libc info: glibc: 2.31
$SHELL (typically, interactive shell): /bin/bash

[Enabled Hooks]
pre-commit
pre-push

* Re: Git bug: Filter ignored when "--invert-grep" option is used.
From: Junio C Hamano @ 2021-12-15 22:08 UTC
  To: Dotan Cohen; +Cc: git

Dotan Cohen <dotancohen@gmail.com> writes:

> What did you do before the bug happened?
> $ git log -8 --author=Shachar --grep=Revert --invert-grep
>
> What did you expect to happen?
> I expected to see the last 8 commits from Shachar that did not have
> the string "Revert" in the commit message.
>
> What happened instead?
> The list of commits included commits by authors other than Shachar.
>
> What's different between what you expected and what actually happened?
> The "--author" filter seems to be ignored when the "--invert-grep"
> option is used.
> I also tried to change the order of the options, but the results
> remained the same.

I think --author and --grep use the same internal pattern matching
engine, so with --invert-grep, I would not be surprised if the
command looks for commits that do not have Revert and (or is that
or? I dunno) not authored by Shachar.

* Re: Git bug: Filter ignored when "--invert-grep" option is used.
From: Dotan Cohen @ 2021-12-16 14:54 UTC
  To: Junio C Hamano; +Cc: git

> I think --author and --grep use the same internal pattern matching
> engine, so with --invert-grep, I would not be surprised if the
> command looks for commits that do not have Revert and (or is that
> or? I dunno) not authored by Shachar.

Possibly, but the flag is called --invert-grep not --invert-matches so
one would expect it to invert grep only. Though behaviour contrary to
user expectations is not an unusual property of git :)

Other than piping to e.g. awk or worse, how would one get the commits
by a particular author that do not have a specific string in the
commit message? Prettying to oneline would make the piping easier to
at least get the commit ids, but I'd like to see the whole commit
message and affected files.

Thanks, Junio.

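One way to answer the question above without depending on how --invert-grep treats --author is to compute the two sets of commits separately and subtract one from the other. The sketch below does that in a throwaway repository. The repository, the author names, the commit subjects, the `mk` helper and the file names `by_author` and `matching` are all made up for illustration, and it still relies on a small pipeline rather than a single `git log` invocation.

```sh
#!/bin/sh
# Sketch only: scratch repository with invented authors and subjects.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Hypothetical helper: create an empty commit with a given author and subject.
mk() {
	git -c user.name=Committer -c user.email=committer@example.com \
		commit -q --allow-empty --author="$1" -m "$2"
}
mk "Shachar <shachar@example.com>" "Add parser"
mk "Someone <someone@example.com>" "Add lexer"
mk "Shachar <shachar@example.com>" "Revert \"Add parser\""
mk "Shachar <shachar@example.com>" "Fix typo"

# Step 1: commits by the author. Step 2: commits whose message matches.
git rev-list --author=Shachar HEAD >by_author
git rev-list --grep=Revert HEAD >matching

# Step 3: keep the author's commits that are not in the matching set.
# "|| true" covers the case where grep prints nothing and exits non-zero.
wanted=$(grep -v -x -F -f matching by_author || true)

# Show the subject of each remaining commit, newest first.
subjects=$(for c in $wanted; do git log -1 --format=%s "$c"; done)
echo "$subjects"
```

With the four commits above this prints the subjects "Fix typo" and "Add parser". Replacing the `git log -1 --format=%s` call in the loop with `git show --stat "$c"` would print the whole message and the affected files for each remaining commit.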
* Re: Git bug: Filter ignored when "--invert-grep" option is used.
From: Junio C Hamano @ 2021-12-16 19:42 UTC
  To: Dotan Cohen; +Cc: git

Dotan Cohen <dotancohen@gmail.com> writes:

>> I think --author and --grep use the same internal pattern matching
>> engine, so with --invert-grep, I would not be surprised if the
>> command looks for commits that do not have Revert and (or is that
>> or? I dunno) not authored by Shachar.
>
> Possibly, but the flag is called --invert-grep not --invert-matches so
> one would expect it to invert grep only.

That is an actionable improvement idea to introduce a synonym ;-)

But in general, the way the internal "git grep" machinery is exposed
to the commands in the "git log" family is very limited. With "git
grep", it is quite straight-forward to say "report hits for lines
that have this but not that"

    $ git grep -e this --and --not -e that

but because the commands in the "log" family already use
"--not" for a quite different purpose, "git log --grep" cannot even
express something similar, even to find hits on a single line, let
alone finding hits on two different lines (i.e. one on the "author"
header, the other in the message part, of the commit object).

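The line-level combination described above can be tried in a scratch repository. This is only an illustration: the file name `sample.txt` and its three lines are invented, and the result follows from `git grep` evaluating the whole expression against each line separately.

```sh
#!/bin/sh
# Sketch only: scratch repository with one invented file.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Three lines: only the first has "this" without "that".
printf '%s\n' 'this only' 'this and that' 'that only' >sample.txt
git add sample.txt   # git grep searches tracked files

# Report lines that contain "this" and do not contain "that".
hits=$(git grep -e this --and --not -e that)
echo "$hits"         # prints: sample.txt:this only
```

Because the expression is evaluated per line, it has no way to say "this on one line and not that on another", which is the limitation the message points out for commit headers versus message bodies.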
* Re: Git bug: Filter ignored when "--invert-grep" option is used.
From: René Scharfe @ 2021-12-17 16:48 UTC
  To: Junio C Hamano, Dotan Cohen; +Cc: git, Christoph Junghans

On 16.12.21 at 20:42, Junio C Hamano wrote:
> Dotan Cohen <dotancohen@gmail.com> writes:
>
>>> I think --author and --grep use the same internal pattern matching
>>> engine, so with --invert-grep, I would not be surprised if the
>>> command looks for commits that do not have Revert and (or is that
>>> or? I dunno) not authored by Shachar.
>>
>> Possibly, but the flag is called --invert-grep not --invert-matches so
>> one would expect it to invert grep only.
>
> That is an actionable improvement idea to introduce a synonym ;-)

Documentation/rev-list-options.txt says about --invert-grep:

	Limit the commits output to ones with log message that do not
	match the pattern specified with `--grep=<pattern>`.

Both the option name and this sentence suggest that it only should
invert --grep, which makes sense to me.

> But in general, the way the internal "git grep" machinery is exposed
> to the commands in the "git log" family is very limited. With "git
> grep", it is quite straight-forward to say "report hits for lines
> that have this but not that"
>
>     $ git grep -e this --and --not -e that
>
> but because the commands in the "log" family already use
> "--not" for a quite different purpose, "git log --grep" cannot even
> express something similar, even to find hits on a single line, let
> alone finding hits on two different lines (i.e. one on the "author"
> header, the other in the message part, of the commit object).

Right, but we can pass in the necessary bit via struct grep_opt.
22dfa8a23d (log: teach --invert-grep option, 2015-01-12) even mentions
that this was done in an earlier iteration of that feature.

Representing buffer-level operations like --all-match and this one as
expression nodes would be nice. At least I suspect that would make
changing the behavior easier, without having to touch as many places.

Anyway, here's a patch that is intended to bring the code in line with
its documentation. The multiple negations hurt my head, so I may have
snuck in some logic errors, though. :-/

--- >8 ---
Subject: [PATCH] log: let --invert-grep only invert --grep

The option --invert-grep is documented to filter out commits whose
messages match the --grep filters. However, it also affects the
header matches (--author, --committer), which is not intended.

Move the handling of that option to grep.c, as only the code there can
distinguish between matches in the header from those in the message
body. If --invert-grep is given then enable extended expressions (not
the regex type, we just need git grep's --not to work), negate the body
patterns and check if any of them match by piggy-backing on the
collect_hits mechanism of grep_source_1().

Collecting the matches in struct grep_opt is a bit iffy, but with
"last_shown" we have a precedent for writing state information to that
struct.

Reported-by: Dotan Cohen <dotancohen@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 grep.c         | 22 +++++++++++++++++++---
 grep.h         |  2 ++
 revision.c     |  4 ++--
 revision.h     |  2 --
 t/t4202-log.sh | 19 +++++++++++++++++++
 5 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/grep.c b/grep.c
index fe847a0111..beef5fe47e 100644
--- a/grep.c
+++ b/grep.c
@@ -699,6 +699,14 @@ static struct grep_expr *compile_pattern_expr(struct grep_pat **list)
 	return compile_pattern_or(list);
 }
 
+static struct grep_expr *grep_not_expr(struct grep_expr *expr)
+{
+	struct grep_expr *z = xcalloc(1, sizeof(*z));
+	z->node = GREP_NODE_NOT;
+	z->u.unary = expr;
+	return z;
+}
+
 static struct grep_expr *grep_true_expr(void)
 {
 	struct grep_expr *z = xcalloc(1, sizeof(*z));
@@ -797,7 +805,7 @@ void compile_grep_patterns(struct grep_opt *opt)
 		}
 	}
 
-	if (opt->all_match || header_expr)
+	if (opt->all_match || opt->no_body_match || header_expr)
 		opt->extended = 1;
 	else if (!opt->extended)
 		return;
@@ -808,6 +816,9 @@ void compile_grep_patterns(struct grep_opt *opt)
 	if (p)
 		die("incomplete pattern expression: %s", p->pattern);
 
+	if (opt->no_body_match && opt->pattern_expression)
+		opt->pattern_expression = grep_not_expr(opt->pattern_expression);
+
 	if (!header_expr)
 		return;
 
@@ -1057,6 +1068,8 @@ static int match_expr_eval(struct grep_opt *opt, struct grep_expr *x,
 			if (h && (*col < 0 || tmp.rm_so < *col))
 				*col = tmp.rm_so;
 		}
+		if (x->u.atom->token == GREP_PATTERN_BODY)
+			opt->body_hit |= h;
 		break;
 	case GREP_NODE_NOT:
 		/*
@@ -1825,16 +1838,19 @@ int grep_source(struct grep_opt *opt, struct grep_source *gs)
 	 * we do not have to do the two-pass grep when we do not check
 	 * buffer-wide "all-match".
 	 */
-	if (!opt->all_match)
+	if (!opt->all_match && !opt->no_body_match)
 		return grep_source_1(opt, gs, 0);
 
 	/* Otherwise the toplevel "or" terms hit a bit differently.
 	 * We first clear hit markers from them.
 	 */
 	clr_hit_marker(opt->pattern_expression);
+	opt->body_hit = 0;
 	grep_source_1(opt, gs, 1);
 
-	if (!chk_hit_marker(opt->pattern_expression))
+	if (opt->all_match && !chk_hit_marker(opt->pattern_expression))
+		return 0;
+	if (opt->no_body_match && opt->body_hit)
 		return 0;
 
 	return grep_source_1(opt, gs, 0);
diff --git a/grep.h b/grep.h
index 3e8815c347..6a1f0ab017 100644
--- a/grep.h
+++ b/grep.h
@@ -148,6 +148,8 @@ struct grep_opt {
 	int word_regexp;
 	int fixed;
 	int all_match;
+	int no_body_match;
+	int body_hit;
 #define GREP_BINARY_DEFAULT 0
 #define GREP_BINARY_NOMATCH 1
 #define GREP_BINARY_TEXT 2
diff --git a/revision.c b/revision.c
index 1981a0859f..97a06bc8fe 100644
--- a/revision.c
+++ b/revision.c
@@ -2493,7 +2493,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--all-match")) {
 		revs->grep_filter.all_match = 1;
 	} else if (!strcmp(arg, "--invert-grep")) {
-		revs->invert_grep = 1;
+		revs->grep_filter.no_body_match = 1;
 	} else if ((argcount = parse_long_opt("encoding", argv, &optarg))) {
 		if (strcmp(optarg, "none"))
 			git_log_output_encoding = xstrdup(optarg);
@@ -3778,7 +3778,7 @@ static int commit_match(struct commit *commit, struct rev_info *opt)
 				(char *)message, strlen(message));
 	strbuf_release(&buf);
 	unuse_commit_buffer(commit, message);
-	return opt->invert_grep ? !retval : retval;
+	return retval;
 }
 
 static inline int want_ancestry(const struct rev_info *revs)
diff --git a/revision.h b/revision.h
index 5578bb4720..3f66147bfd 100644
--- a/revision.h
+++ b/revision.h
@@ -246,8 +246,6 @@ struct rev_info {
 
 	/* Filter by commit log message */
 	struct grep_opt grep_filter;
-	/* Negate the match of grep_filter */
-	int invert_grep;
 
 	/* Display history graph */
 	struct git_graph *graph;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 7884e3d46b..765742fdbc 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -2010,4 +2010,23 @@ test_expect_success 'log --end-of-options' '
 	test_cmp expect actual
 '
 
+test_expect_success 'set up commits with different authors' '
+	git checkout --orphan authors &&
+	test_commit --author "Jim <jim@example.com>" jim_1 &&
+	test_commit --author "Val <val@example.com>" val_1 &&
+	test_commit --author "Val <val@example.com>" val_2 &&
+	test_commit --author "Jim <jim@example.com>" jim_2 &&
+	test_commit --author "Val <val@example.com>" val_3 &&
+	test_commit --author "Jim <jim@example.com>" jim_3
+'
+
+test_expect_success 'log --invert-grep --grep --author' '
+	cat >expect <<-\EOF &&
+	val_3
+	val_1
+	EOF
+	git log --format=%s --author=Val --grep 2 --invert-grep >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.34.0

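To see whether an installed git already behaves the way the new test expects, the scenario can be approximated outside the test suite. This is a rough sketch and not a substitute for t4202: it uses empty commits instead of test_commit, and the scratch repository, the committer identity, the e-mail addresses and the `verdict` variable are assumptions made for illustration.

```sh
#!/bin/sh
# Sketch only: approximate the patch's test scenario in a scratch repository.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Same author/subject sequence as in the test, as empty commits.
for spec in Jim:jim_1 Val:val_1 Val:val_2 Jim:jim_2 Val:val_3 Jim:jim_3
do
	name=${spec%%:*}      # part before the colon
	subject=${spec#*:}    # part after the colon
	git -c user.name=Committer -c user.email=committer@example.com \
		commit -q --allow-empty \
		--author="$name <$name@example.com>" -m "$subject"
done

# The command from the test, and the output the test expects.
actual=$(git log --format=%s --author=Val --grep 2 --invert-grep)
expect=$(printf 'val_3\nval_1')

# "fixed": only Val's commits without "2" are listed.
# "affected": anything else, e.g. commits by Jim show up as well.
if [ "$actual" = "$expect" ]; then verdict=fixed; else verdict=affected; fi
echo "$verdict"
```

Which of the two words is printed depends on the git version that runs the script, so no particular output is claimed here.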
* Re: Git bug: Filter ignored when "--invert-grep" option is used.
From: Junio C Hamano @ 2021-12-17 18:16 UTC
  To: René Scharfe; +Cc: Dotan Cohen, git, Christoph Junghans

René Scharfe <l.s.r@web.de> writes:

> Subject: [PATCH] log: let --invert-grep only invert --grep
>
> The option --invert-grep is documented to filter out commits whose
> messages match the --grep filters. However, it also affects the
> header matches (--author, --committer), which is not intended.

I re-read the log message that introduced this feature, and I agree
with the "not intended" part. I do not think the change itself was
even done with awareness that the header matches may also be
affected, and there is no test for it to see the interaction.

> Move the handling of that option to grep.c, as only the code there can
> distinguish between matches in the header from those in the message
> body. If --invert-grep is given then enable extended expressions (not
> the regex type, we just need git grep's --not to work), negate the body
> patterns and check if any of them match by piggy-backing on the
> collect_hits mechanism of grep_source_1().

Nice. The original says that --files-without-matches being a
negation of --files-with-matches was what triggered them to have the
bit in the revisions, not in grep_opt, by the way.

> Collecting the matches in struct grep_opt is a bit iffy, but with
> "last_shown" we have a precedent for writing state information to that
> struct.

I think this is perfectly fine. apply_state, grep_opt,
diff_options, and rev_info are used the same way within their
subsystems to carry in options that affect behaviour, carry around
the state of the machinery, and carry out the result. The word
"option" does make it sound like it is an input-only thing, but
others are not much better ;-).

> diff --git a/grep.c b/grep.c
> index fe847a0111..beef5fe47e 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -699,6 +699,14 @@ static struct grep_expr *compile_pattern_expr(struct grep_pat **list)
>  	return compile_pattern_or(list);
>  }
>  
> +static struct grep_expr *grep_not_expr(struct grep_expr *expr)
> +{
> +	struct grep_expr *z = xcalloc(1, sizeof(*z));
> +	z->node = GREP_NODE_NOT;
> +	z->u.unary = expr;
> +	return z;
> +}

A bit surprising to see that we already had GREP_NODE_NOT without a
helper to create a node. Not updating compile_pattern_not() to use
this new helper does make this patch simpler to read by allowing
readers to focus on what matters, which is very much appreciated.

The rest of the patch looks good to me, too.