git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "Martin Ågren" <martin.agren@gmail.com>
Cc: git@vger.kernel.org, Victoria Dye <vdye@github.com>
Subject: Re: [PATCH v2] cmd-list.perl: fix identifying man sections
Date: Fri, 23 Sep 2022 10:37:33 +0200	[thread overview]
Message-ID: <220923.86fsgi4gl6.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <20220923080733.1995862-1-martin.agren@gmail.com>


On Fri, Sep 23 2022, Martin Ågren wrote:

> We attribute each documentation text file to a man section by finding a
> line in the file that looks like "gitfoo(<digit>)". Commit cc75e556a9
> ("scalar: add to 'git help -a' command list", 2022-09-02) updated this
> logic to look not only for "gitfoo" but also "scalarfoo". In doing so,
> it forgot to account for the fact that after the updated regex has found
> a match, the man section is no longer to be found in `$1` but now lives
> in `$2`.
>
> This makes our git(1) manpage look as follows:
>
>   Main porcelain commands
>        git-add(git)
>            Add file contents to the index.
>
>   [...]
>
>        gitk(git)
>            The Git repository browser.
>
>        scalar(scalar)
>            A tool for managing large Git repositories.
>
> Restore the man sections by not capturing the (git|scalar) part of the
> match into `$1`.
>
> As noted by Ævar [1], we could even match any "foo" rather than just
> "gitfoo" and "scalarfoo", but that's a larger change. For now, just fix
> the regression in cc75e556a9.

Thanks for the quick turn-around, this looks good to me in this form.

> [1] https://lore.kernel.org/git/220923.86wn9u4joo.gmgdl@evledraar.gmail.com/#t
>
> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
>  Thanks Ævar for having a look at v1.
>
>  Documentation/cmd-list.perl | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
> index 9515a499a3..755a110bc4 100755
> --- a/Documentation/cmd-list.perl
> +++ b/Documentation/cmd-list.perl
> @@ -10,7 +10,7 @@ sub format_one {
>  	$state = 0;
>  	open I, '<', "$name.txt" or die "No such file $name.txt";
>  	while (<I>) {
> -		if (/^(git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
> +		if (/^(?:git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
>  			$mansection = $1;
>  			next;
>  		}

Just for anyone doing future archaeology / digging into the "larger
change": The reason I felt safe in removing the "git" matching entirely
is because the larger context here is that we're parsing only the very
start of the *.txt. I.e. we would not want this:
	
	git(1)
	======
	
	NAME
	----
	git - the stupid content tracker
	

        foo(2)

To report Documentation/git.txt as being a "foo" in man section 2, but
that will not happen because as soon as we see a "NAME" line followed by
a "----" line we'll extract that one line of description, so even if our
regex would have eagerly matched that "foo(2)" we won't see it.

Now, having written and looked at this with fresh eyes this would be an
even better & more logical thing to do:
	
	diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
	index 755a110bc48..0bc4c803a10 100755
	--- a/Documentation/cmd-list.perl
	+++ b/Documentation/cmd-list.perl
	@@ -1,38 +1,38 @@
	 #!/usr/bin/perl -w
	 
	 use File::Compare qw(compare);
	 
	 sub format_one {
	 	my ($out, $nameattr) = @_;
	 	my ($name, $attr) = @$nameattr;
	 	my ($state, $description);
	 	my $mansection;
	 	$state = 0;
	 	open I, '<', "$name.txt" or die "No such file $name.txt";
	 	while (<I>) {
	-		if (/^(?:git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
	+		if (/^\Q$name\E\(([0-9])\)$/) {
	 			$mansection = $1;
	 			next;
	 		}
	 		if (/^NAME$/) {
	 			$state = 1;
	 			next;
	 		}
	 		if ($state == 1 && /^----$/) {
	 			$state = 2;
	 			next;
	 		}
	 		next if ($state != 2);
	 		chomp;
	 		$description = $_;
	 		last;
	 	}
	 	close I;
	 	if (!defined $description) {
	 		die "No description found in $name.txt";
	 	}
	 	if (my ($verify_name, $text) = ($description =~ /^($name) - (.*)/)) {
	 		print $out "linkgit:$name\[$mansection\]::\n\t";
	 		if ($attr =~ / deprecated /) {
	 			print $out "(deprecated) ";
	 		}

It yields the exact same result as Martin's patch above according to the
doc-diff, but as the -U25 context shows we already have a hard
dependency on the "scalar -" part of the description line matching the
name of the file ("scalar.txt") is something we should be doing.

Anyway, this is more than good enough for now, thanks! There's also much
bigger issues with the script, and we can leave that all aside from now
(e.g. if it dies the Makefile doesn't report an error, ouch!).

  reply	other threads:[~2022-09-23  8:53 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-23  7:03 [PATCH] cmd-list.perl: fix identifying man sections Martin Ågren
2022-09-23  7:38 ` Ævar Arnfjörð Bjarmason
2022-09-23  8:07   ` [PATCH v2] " Martin Ågren
2022-09-23  8:37     ` Ævar Arnfjörð Bjarmason [this message]
2022-09-23 11:42       ` Martin Ågren
2022-09-23 17:01     ` Junio C Hamano
2022-09-23 22:07 ` [PATCH] " Jeff King
2022-09-26  7:16   ` Martin Ågren
2022-09-26 17:06     ` Junio C Hamano
2022-09-26 13:35   ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=220923.86fsgi4gl6.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=martin.agren@gmail.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).