git@vger.kernel.org mailing list mirror (one of many)
* [bug] git log --invert-grep --grep=[sufficiently complicated regex] prints nothing
From: Zack Weinberg @ 2022-11-23 20:17 UTC (permalink / raw)
  To: git

I’m attempting to have my blog, which is a static site generated from
a bunch of Markdown files stored in git, automatically pull the
most recent modification date for each page out of the git history.
The idea is to use ‘git log --follow --pretty=tformat:'%ct' <file>’ on
each file and then use the oldest reported timestamp as the creation
date and the newest reported timestamp as the last modification date.
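
A minimal sketch of that idea in shell, assuming it is run from inside
the repository. The helper names first_last and page_dates are
hypothetical, chosen only for illustration, and no filtering of
unwanted commits is applied yet.

```shell
# first_last: read Unix timestamps (one per line) on stdin and print
# "<oldest> <newest>". Prints nothing if the input is empty.
first_last() {
    sort -n | awk 'NR == 1 { first = $1 } { last = $1 } END { if (NR > 0) print first, last }'
}

# page_dates FILE: creation and last-modification timestamps for FILE,
# taken from every commit that touched it (following renames).
# Assumes the current directory is inside the git work tree.
page_dates() {
    git log --follow --pretty=tformat:'%ct' -- "$1" | first_last
}
```

Sorting numerically means the result does not depend on the order in
which git log happens to list the commits.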

But there’s a catch: there are commits I want to ignore in this
calculation, such as mechanical changes applied to the entire site.
And this brings me to the bug: --invert-grep doesn’t work correctly
when the --grep regex is sufficiently complicated.

Here's the complete set of commits that modified an example file:

$ git log --follow --pretty=tformat:'%ct %s' \
  src/posts/uncat/unearthed-arcana-music-division.md
1668545053 Begin restoring the site structure.
1668545051 Reorganize directory tree prior to setting up Metalsmith
1668545032 Mechanically convert Pandoc to standard YAML metadata delimiters.
1610735533 Mechanically convert to properly delimited YAML metadata.
1417101173 Correct slug for Uncategorized.
1417050416 The Great Dead and Moved Link Cleanup of 2014.
1416938128 Use category_meta plugin to fix category slugs.
1416763607 Initial import of content and Pelican skeleton.

And here's an application of --grep that prints only the commits I _don't_ want:

$ git log --follow -E --pretty=tformat:'%ct %s' \
  --grep='^(Mechanically convert|Begin restoring the site structure|Reorganize directory tree)' \
 src/posts/uncat/unearthed-arcana-music-division.md
1668545053 Begin restoring the site structure.
1668545051 Reorganize directory tree prior to setting up Metalsmith
1668545032 Mechanically convert Pandoc to standard YAML metadata delimiters.
1610735533 Mechanically convert to properly delimited YAML metadata.

Theoretically, adding --invert-grep to that should make it print the
commits I do want, but instead it prints nothing at all:

$ git log --follow -E --pretty=tformat:'%ct %s' --invert-grep \
  --grep='^(Mechanically convert|Begin restoring the site structure|Reorganize directory tree)' \
 src/posts/uncat/unearthed-arcana-music-division.md
[no output]

For clarity, I expected that to print

1417101173 Correct slug for Uncategorized.
1417050416 The Great Dead and Moved Link Cleanup of 2014.
1416938128 Use category_meta plugin to fix category slugs.
1416763607 Initial import of content and Pelican skeleton.
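
For comparison, the same selection can be made outside git by
filtering the '%ct %s' lines with grep -Ev. This is only a sketch of
the expected behaviour, not a fix for --invert-grep. The function name
drop_mechanical is hypothetical, and unlike --grep it looks only at
the subject line rather than the whole commit message.

```shell
# drop_mechanical: read "<timestamp> <subject>" lines on stdin and drop
# those whose subject starts with one of the unwanted prefixes.
drop_mechanical() {
    grep -Ev '^[0-9]+ (Mechanically convert|Begin restoring the site structure|Reorganize directory tree)'
}

# Intended use (not run here; FILE is a placeholder):
#   git log --follow --pretty=tformat:'%ct %s' -- FILE | drop_mechanical
```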

The problem seems to be related to the complexity of the regex, e.g.
(all examples with -E --invert-grep)

--grep='^(Mechanically convert|Begin restoring)'               # works correctly
--grep='^(Mechanically convert|Begin restoring|Reorganize)'    # prints nothing
--grep='^(Mechanically convert|Begin restoring|Correct slug)'  # prints incorrect subset

Incidentally, if there is a better way to query the first and last
commit (with filtering) that touched a particular file — or even
a way to do the query for many files at once — please tell me about
it.

Thanks for your attention,
zw

[System Info]
git version:
git version 2.37.2
cpu: x86_64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 6.0.0-4-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.0.8-1 (2022-11-11) x86_64
compiler info: gnuc: 12.1
libc info: glibc: 2.36
$SHELL (typically, interactive shell): /bin/bash


[Enabled Hooks]


end of thread, other threads:[~2022-11-24 16:35 UTC | newest]

Thread overview: 5+ messages
2022-11-23 20:17 [bug] git log --invert-grep --grep=[sufficiently complicated regex] prints nothing Zack Weinberg
2022-11-24 10:31 ` Phillip Wood
2022-11-24 13:36   ` Zack Weinberg
2022-11-24 15:53     ` Ævar Arnfjörð Bjarmason
2022-11-24 16:35       ` Jeff King
