* bug: `git log --grep ... --invert-grep --author=...` negates / ignores --author @ 2017-05-31 15:45 Anthony Sottile 2017-05-31 18:08 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 7+ messages in thread From: Anthony Sottile @ 2017-05-31 15:45 UTC (permalink / raw) To: git Given the following commits: ``` asottile@asottile-VirtualBox:/tmp$ git init test Initialized empty Git repository in /tmp/test/.git/ asottile@asottile-VirtualBox:/tmp$ cd test/ asottile@asottile-VirtualBox:/tmp/test$ GIT_COMMITTER_EMAIL=foo@example.com GIT_AUTHOR_EMAIL=foo@example.com git commit --allow-empty -m "foo" [master (root-commit) c9df62b] foo asottile@asottile-VirtualBox:/tmp/test$ git commit -m "blah" --allow-empty [master 9e3ee9b] blah asottile@asottile-VirtualBox:/tmp/test$ git log commit 9e3ee9bc1adab2ae8eb1884a8f6237da18dfd27b Author: Anthony Sottile <asottile@umich.edu> Date: Wed May 31 08:40:59 2017 -0700 blah commit c9df62b93298a247fcfbe24ed4282ccf95448f47 Author: Anthony Sottile <foo@example.com> Date: Wed May 31 08:40:49 2017 -0700 foo asottile@asottile-VirtualBox:/tmp/test$ git log --grep bar --invert-grep --author=foo commit 9e3ee9bc1adab2ae8eb1884a8f6237da18dfd27b Author: Anthony Sottile <asottile@umich.edu> Date: Wed May 31 08:40:59 2017 -0700 blah commit c9df62b93298a247fcfbe24ed4282ccf95448f47 Author: Anthony Sottile <foo@example.com> Date: Wed May 31 08:40:49 2017 -0700 foo asottile@asottile-VirtualBox:/tmp/test$ git log --author=foocommit c9df62b93298a247fcfbe24ed4282ccf95448f47 Author: Anthony Sottile <foo@example.com> Date: Wed May 31 08:40:49 2017 -0700 foo ``` I expect the same output from the last two commands, but the `--invert-grep` one seems to match _all_ the commits. I can try and dig into this if I have time, just trying to get a count using this as a workaround git log --grep ... --invert-grep --format=%ce | grep ... | wc -l Anthony ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: bug: `git log --grep ... --invert-grep --author=...` negates / ignores --author 2017-05-31 15:45 bug: `git log --grep ... --invert-grep --author=...` negates / ignores --author Anthony Sottile @ 2017-05-31 18:08 ` Ævar Arnfjörð Bjarmason 2017-05-31 21:40 ` Jeff King 2017-06-01 2:20 ` Junio C Hamano 0 siblings, 2 replies; 7+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2017-05-31 18:08 UTC (permalink / raw) To: Anthony Sottile; +Cc: Git Mailing List On Wed, May 31, 2017 at 5:45 PM, Anthony Sottile <asottile@umich.edu> wrote: > Given the following commits: > ``` > asottile@asottile-VirtualBox:/tmp$ git init test > Initialized empty Git repository in /tmp/test/.git/ > asottile@asottile-VirtualBox:/tmp$ cd test/ > asottile@asottile-VirtualBox:/tmp/test$ > GIT_COMMITTER_EMAIL=foo@example.com GIT_AUTHOR_EMAIL=foo@example.com > git commit --allow-empty -m "foo" > [master (root-commit) c9df62b] foo > asottile@asottile-VirtualBox:/tmp/test$ git commit -m "blah" --allow-empty > [master 9e3ee9b] blah > asottile@asottile-VirtualBox:/tmp/test$ git log > commit 9e3ee9bc1adab2ae8eb1884a8f6237da18dfd27b > Author: Anthony Sottile <asottile@umich.edu> > Date: Wed May 31 08:40:59 2017 -0700 > > blah > > commit c9df62b93298a247fcfbe24ed4282ccf95448f47 > Author: Anthony Sottile <foo@example.com> > Date: Wed May 31 08:40:49 2017 -0700 > > foo > asottile@asottile-VirtualBox:/tmp/test$ git log --grep bar > --invert-grep --author=foo > commit 9e3ee9bc1adab2ae8eb1884a8f6237da18dfd27b > Author: Anthony Sottile <asottile@umich.edu> > Date: Wed May 31 08:40:59 2017 -0700 > > blah > > commit c9df62b93298a247fcfbe24ed4282ccf95448f47 > Author: Anthony Sottile <foo@example.com> > Date: Wed May 31 08:40:49 2017 -0700 > > foo > asottile@asottile-VirtualBox:/tmp/test$ git log --author=foocommit > c9df62b93298a247fcfbe24ed4282ccf95448f47 > Author: Anthony Sottile <foo@example.com> > Date: Wed May 31 08:40:49 2017 -0700 > > foo > ``` > > I expect the same output from the last two commands, but the > `--invert-grep` one seems to match _all_ the commits. > > I can try and dig into this if I have time, just trying to get a count > using this as a workaround > > git log --grep ... --invert-grep --format=%ce | grep ... | wc -l I had to squint a bit to see what you were getting at here, which is pretty simple: When you provide --invert-grep the --author filter is completely discarded. This is a bug. I.e. on git.git: OK: $ git log --grep=bar --author=Ævar --pretty=format:%an -100 origin/pu |sort|uniq -c|sort -nr 5 Ævar Arnfjörð Bjarmason $ git log --author=Ævar --pretty=format:%an -100 origin/pu |sort|uniq -c|sort -nr 100 Ævar Arnfjörð Bjarmason $ git log --grep=bar --invert-grep --author=Ævar --pretty=format:%an -100 origin/pu |sort|uniq -c|sort -nr 78 Junio C Hamano 14 Jeff King 2 Andreas Heiduk 1 Sahil Dua 1 Rikard Falkeborn 1 Johannes Sixt 1 Johannes Schindelin 1 Ben Peart 1 Ævar Arnfjörð Bjarmason That last command should only find my commits, but instead --author is discarded. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: bug: `git log --grep ... --invert-grep --author=...` negates / ignores --author 2017-05-31 18:08 ` Ævar Arnfjörð Bjarmason @ 2017-05-31 21:40 ` Jeff King 2017-06-01 19:45 ` Ævar Arnfjörð Bjarmason 2017-06-01 2:20 ` Junio C Hamano 1 sibling, 1 reply; 7+ messages in thread From: Jeff King @ 2017-05-31 21:40 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Anthony Sottile, Git Mailing List On Wed, May 31, 2017 at 08:08:54PM +0200, Ævar Arnfjörð Bjarmason wrote: > $ git log --grep=bar --author=Ævar --pretty=format:%an -100 origin/pu > |sort|uniq -c|sort -nr > 5 Ævar Arnfjörð Bjarmason > > $ git log --author=Ævar --pretty=format:%an -100 origin/pu |sort|uniq > -c|sort -nr > 100 Ævar Arnfjörð Bjarmason > > $ git log --grep=bar --invert-grep --author=Ævar --pretty=format:%an > -100 origin/pu |sort|uniq -c|sort -nr > 78 Junio C Hamano > 14 Jeff King > 2 Andreas Heiduk > 1 Sahil Dua > 1 Rikard Falkeborn > 1 Johannes Sixt > 1 Johannes Schindelin > 1 Ben Peart > 1 Ævar Arnfjörð Bjarmason > > That last command should only find my commits, but instead --author is > discarded. I would have thought that the last command wouldn't find _any_ of your commits. Don't we consider --author a greppable pattern and invert it? By itself: $ git log --author=Ævar --invert-grep --format=%an -100 origin/pu | sort | uniq -c | sort -rn 79 Junio C Hamano 8 Stefan Beller 8 Jeff King 1 Sahil Dua 1 Rikard Falkeborn 1 Johannes Sixt 1 Johannes Schindelin 1 Andreas Heiduk So that makes sense from the "--author is invertable" world-view. But I'm puzzled that you show up at all when --grep=bar is added (and moreover, that we get _one_ commit from you, not the 5 found in your initial command). That seems like a bug. -Peff ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: bug: `git log --grep ... --invert-grep --author=...` negates / ignores --author 2017-05-31 21:40 ` Jeff King @ 2017-06-01 19:45 ` Ævar Arnfjörð Bjarmason 2017-06-01 21:05 ` Anthony Sottile 2017-06-01 21:38 ` Junio C Hamano 0 siblings, 2 replies; 7+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2017-06-01 19:45 UTC (permalink / raw) To: Jeff King; +Cc: Anthony Sottile, Git Mailing List On Wed, May 31, 2017 at 11:40 PM, Jeff King <peff@peff.net> wrote: > On Wed, May 31, 2017 at 08:08:54PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> $ git log --grep=bar --author=Ævar --pretty=format:%an -100 origin/pu >> |sort|uniq -c|sort -nr >> 5 Ævar Arnfjörð Bjarmason >> >> $ git log --author=Ævar --pretty=format:%an -100 origin/pu |sort|uniq >> -c|sort -nr >> 100 Ævar Arnfjörð Bjarmason >> >> $ git log --grep=bar --invert-grep --author=Ævar --pretty=format:%an >> -100 origin/pu |sort|uniq -c|sort -nr >> 78 Junio C Hamano >> 14 Jeff King >> 2 Andreas Heiduk >> 1 Sahil Dua >> 1 Rikard Falkeborn >> 1 Johannes Sixt >> 1 Johannes Schindelin >> 1 Ben Peart >> 1 Ævar Arnfjörð Bjarmason >> >> That last command should only find my commits, but instead --author is >> discarded. > > I would have thought that the last command wouldn't find _any_ of your > commits. Don't we consider --author a greppable pattern and invert it? There's two unrelated things going on here AFAICT: * Anthony expected --author to be inverted by --invert-grep, but this is not how it's documented, it's *only* for inverting the --grep pattern: "Limit the commits output to ones with log message that do not match the pattern specified with --grep=<pattern>." * The --author filter should be applied un-inverted, but isn't, it's seemingly just ignored in the presence of --grep, actually mostly ignored, this is bizarre: $ diff -ru <(git log --grep=bar --invert-grep --pretty=format:%an -100 origin/pu |sort|uniq -c|sort -nr) <(git log --grep=bar --invert-grep --pretty=format:%an -100 --author=WeMostlyIgnoreThisButNotReally origin/pu |sort|uniq -c|sort -nr) --- /dev/fd/63 2017-06-01 21:44:08.952583804 +0200 +++ /dev/fd/62 2017-06-01 21:44:08.952583804 +0200 @@ -1,6 +1,6 @@ - 66 Junio C Hamano + 65 Junio C Hamano 10 Jeff King - 8 Stefan Beller + 9 Stefan Beller 5 Lars Schneider 2 SZEDER Gábor 1 Tyler Brazier > By itself: > > $ git log --author=Ævar --invert-grep --format=%an -100 origin/pu | > sort | uniq -c | sort -rn > 79 Junio C Hamano > 8 Stefan Beller > 8 Jeff King > 1 Sahil Dua > 1 Rikard Falkeborn > 1 Johannes Sixt > 1 Johannes Schindelin > 1 Andreas Heiduk > > So that makes sense from the "--author is invertable" world-view. > > But I'm puzzled that you show up at all when --grep=bar is added (and > moreover, that we get _one_ commit from you, not the 5 found in your > initial command). That seems like a bug. I think that's the only thing that's not a bug, i.e. we just find 1 match in the first 100 (note -100), sorry for the confusing example. Anyway, much of the above may be incorrect, I haven't dug deeply beyond just finding that something's funny going on and we definitely have *some* bugs here. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: bug: `git log --grep ... --invert-grep --author=...` negates / ignores --author 2017-06-01 19:45 ` Ævar Arnfjörð Bjarmason @ 2017-06-01 21:05 ` Anthony Sottile 2017-06-01 21:38 ` Junio C Hamano 1 sibling, 0 replies; 7+ messages in thread From: Anthony Sottile @ 2017-06-01 21:05 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Jeff King, Git Mailing List I actually only expected the --grep to be inverted -- I think I'm on the same page with what's documented. I'd be happy to dig into the code and investigate this some more but I am not familiar with the git codebase, any code hints on where to get bootstrapped? Anthony On Thu, Jun 1, 2017 at 12:45 PM, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > On Wed, May 31, 2017 at 11:40 PM, Jeff King <peff@peff.net> wrote: >> On Wed, May 31, 2017 at 08:08:54PM +0200, Ævar Arnfjörð Bjarmason wrote: >> >>> $ git log --grep=bar --author=Ævar --pretty=format:%an -100 origin/pu >>> |sort|uniq -c|sort -nr >>> 5 Ævar Arnfjörð Bjarmason >>> >>> $ git log --author=Ævar --pretty=format:%an -100 origin/pu |sort|uniq >>> -c|sort -nr >>> 100 Ævar Arnfjörð Bjarmason >>> >>> $ git log --grep=bar --invert-grep --author=Ævar --pretty=format:%an >>> -100 origin/pu |sort|uniq -c|sort -nr >>> 78 Junio C Hamano >>> 14 Jeff King >>> 2 Andreas Heiduk >>> 1 Sahil Dua >>> 1 Rikard Falkeborn >>> 1 Johannes Sixt >>> 1 Johannes Schindelin >>> 1 Ben Peart >>> 1 Ævar Arnfjörð Bjarmason >>> >>> That last command should only find my commits, but instead --author is >>> discarded. >> >> I would have thought that the last command wouldn't find _any_ of your >> commits. Don't we consider --author a greppable pattern and invert it? > > There's two unrelated things going on here AFAICT: > > * Anthony expected --author to be inverted by --invert-grep, but this > is not how it's documented, it's *only* for inverting the --grep > pattern: "Limit the commits output to ones with log message that do > not match the pattern specified with --grep=<pattern>." > > * The --author filter should be applied un-inverted, but isn't, it's > seemingly just ignored in the presence of --grep, actually mostly > ignored, this is bizarre: > > $ diff -ru <(git log --grep=bar --invert-grep --pretty=format:%an -100 > origin/pu |sort|uniq -c|sort -nr) <(git log --grep=bar --invert-grep > --pretty=format:%an -100 --author=WeMostlyIgnoreThisButNotReally > origin/pu |sort|uniq -c|sort -nr) > --- /dev/fd/63 2017-06-01 21:44:08.952583804 +0200 > +++ /dev/fd/62 2017-06-01 21:44:08.952583804 +0200 > @@ -1,6 +1,6 @@ > - 66 Junio C Hamano > + 65 Junio C Hamano > 10 Jeff King > - 8 Stefan Beller > + 9 Stefan Beller > 5 Lars Schneider > 2 SZEDER Gábor > 1 Tyler Brazier > > >> By itself: >> >> $ git log --author=Ævar --invert-grep --format=%an -100 origin/pu | >> sort | uniq -c | sort -rn >> 79 Junio C Hamano >> 8 Stefan Beller >> 8 Jeff King >> 1 Sahil Dua >> 1 Rikard Falkeborn >> 1 Johannes Sixt >> 1 Johannes Schindelin >> 1 Andreas Heiduk >> >> So that makes sense from the "--author is invertable" world-view. >> >> But I'm puzzled that you show up at all when --grep=bar is added (and >> moreover, that we get _one_ commit from you, not the 5 found in your >> initial command). That seems like a bug. > > I think that's the only thing that's not a bug, i.e. we just find 1 > match in the first 100 (note -100), sorry for the confusing example. > > Anyway, much of the above may be incorrect, I haven't dug deeply > beyond just finding that something's funny going on and we definitely > have *some* bugs here. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: bug: `git log --grep ... --invert-grep --author=...` negates / ignores --author 2017-06-01 19:45 ` Ævar Arnfjörð Bjarmason 2017-06-01 21:05 ` Anthony Sottile @ 2017-06-01 21:38 ` Junio C Hamano 1 sibling, 0 replies; 7+ messages in thread From: Junio C Hamano @ 2017-06-01 21:38 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Jeff King, Anthony Sottile, Git Mailing List Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Anyway, much of the above may be incorrect, I haven't dug deeply > beyond just finding that something's funny going on and we definitely > have *some* bugs here. One thing that is very correct in what you said is that the documentation of "--invert-grep" is wrong to mention only "--grep". It probably is because the author of "--invert-grep" did not realize that the "--author" thing internally is considered a part of "grep" processing done in revision.c::commit_match() that chooses which commit to show. As you may have noticed, the integration to the grep infrastructure with "git log" and friends does not expose the power of combination (e.g. "--and", "--not", etc.), unlike "git grep", to the command line. This is purely a technical limitation (e.g. "--not" is taken to mean a completely different thing in "git log" family) and there is no fundamental reason why we shouldn't, but the fact that the end user cannot combine the "--grep" like terms in a flexible way remains from the beginning to this day. Within that constraint, --grep and things like --author are defined to interact in a certain hardcoded way (the former searches substring in the body part of commit objects while the latter searches substrings in the "header" part of them), simply because defining them to be all ORed together does not give us a very useful semantics. In general, two or more of these searches always "OR" together, e.g. "git log --grep=foo --grep=bar" finds commits that mention foo or commits that mention bar. However, searches in body and searches in header "AND" together, e.g. "git log --author=Ævar --grep=foo" finds commits that mention foo and written by you. "git log --author=Ævar --author=gitster@ --grep=foo --grep=bar" finds commits that mention either foo or bar and written by either you or me. IIRC, its parse tree would look like (AND (OR (author Ævar) (author gitster@)) (OR (grep foo) (grep bar)) The "--invert-grep" merely adds (NOT ...) around it. I vaguely recall there was impliations to "all match" when the header search and the body search is used together. Without header search, the usual "git log --grep=foo --grep=bar" becomes (OR (grep foo) (grep bar)) and "--all-match" is applied to this "OR" (i.e. by the time the matcher finishes a single commit's body, all of the terms ORed together, i.e. (grep foo) and (grep bar), must have matched at least once for the commit to be considered a match. Obviously, a commit can be authored and commited by only a single person, and applying all-match naively to "git log --author=Ævar --author=gitster@" would not make any sense (by definition the result would be an empty set), and IIRC there is a code that copes with this to avoid requiring a commit that is written by two people (or committed by two people) when "--all-match" is in effect. So taken all together, your example and its output makes all sense to me. As you mentioned, the examples by the OP was written a bit too dense for me to grok quickly and I didn't look too carefully at them to comment. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: bug: `git log --grep ... --invert-grep --author=...` negates / ignores --author 2017-05-31 18:08 ` Ævar Arnfjörð Bjarmason 2017-05-31 21:40 ` Jeff King @ 2017-06-01 2:20 ` Junio C Hamano 1 sibling, 0 replies; 7+ messages in thread From: Junio C Hamano @ 2017-06-01 2:20 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason; +Cc: Anthony Sottile, Git Mailing List Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > $ git log --grep=bar --invert-grep --author=Ævar --pretty=format:%an > -100 origin/pu |sort|uniq -c|sort -nr > 78 Junio C Hamano > 14 Jeff King > 2 Andreas Heiduk > 1 Sahil Dua > 1 Rikard Falkeborn > 1 Johannes Sixt > 1 Johannes Schindelin > 1 Ben Peart > 1 Ævar Arnfjörð Bjarmason > > That last command should only find my commits, but instead --author is > discarded. Hmph, the way I read revision.c::commit_match() is that we check with "--grep=bar --author=Ævar" and then emit the commit if that check fails when --invert-grep is in effect (instead of emitting the commit when the check succeeds, which is the normal case). So there is one commit that didn't pass "has 'bar' and written by Ævar" check that was written by you in the recent past (iow, recently you were writing bar all over the place). Changes by other people by definition does not pass "has 'bar' and written by Ævar" check, on the other hand. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2017-06-01 21:38 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-05-31 15:45 bug: `git log --grep ... --invert-grep --author=...` negates / ignores --author Anthony Sottile 2017-05-31 18:08 ` Ævar Arnfjörð Bjarmason 2017-05-31 21:40 ` Jeff King 2017-06-01 19:45 ` Ævar Arnfjörð Bjarmason 2017-06-01 21:05 ` Anthony Sottile 2017-06-01 21:38 ` Junio C Hamano 2017-06-01 2:20 ` Junio C Hamano
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).