git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / code / Atom feed
* [PATCH] cmd-list.perl: fix identifying man sections
@ 2022-09-23  7:03 Martin Ågren
  2022-09-23  7:38 ` Ævar Arnfjörð Bjarmason
  2022-09-23 22:07 ` [PATCH] " Jeff King
  0 siblings, 2 replies; 7+ messages in thread
From: Martin Ågren @ 2022-09-23  7:03 UTC (permalink / raw)
  To: git; +Cc: Victoria Dye, Ævar Arnfjörð Bjarmason

We attribute each documentation text file to a man section by finding a
line in the file that looks like "gitfoo(<digit>)". Commit cc75e556a9
("scalar: add to 'git help -a' command list", 2022-09-02) updated this
logic to look not only for "gitfoo" but also "scalarfoo". In doing so,
it forgot to account for the fact that after the updated regex has found
a match, the man section is no longer to be found in `$1` but now lives
in `$2`.

This makes our git(1) manpage look as follows:

  Main porcelain commands
       git-add(git)
           Add file contents to the index.

  [...]

       gitk(git)
           The Git repository browser.

       scalar(scalar)
           A tool for managing large Git repositories.

Restore the man sections by grabbing the correct value out of the regex
match.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 This is a v2.38.0-rc1 regression.

 Documentation/cmd-list.perl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
index 9515a499a3..16451d81b8 100755
--- a/Documentation/cmd-list.perl
+++ b/Documentation/cmd-list.perl
@@ -11,7 +11,7 @@ sub format_one {
 	open I, '<', "$name.txt" or die "No such file $name.txt";
 	while (<I>) {
 		if (/^(git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
-			$mansection = $1;
+			$mansection = $2;
 			next;
 		}
 		if (/^NAME$/) {
-- 
@@ -11,7 +11,7 @@ sub format_one {
 	open I, '<', "$name.txt" or die "No such file $name.txt";
 	while (<I>) {
 		if (/^(git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
-			$mansection = $1;
+			$mansection = $2;
 			next;
 		}
 		if (/^NAME$/) {
-- 
2.38.0.rc1.355.g7147cddc23


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] cmd-list.perl: fix identifying man sections
  2022-09-23  7:03 [PATCH] cmd-list.perl: fix identifying man sections Martin Ågren
@ 2022-09-23  7:38 ` Ævar Arnfjörð Bjarmason
  2022-09-23  8:07   ` [PATCH v2] " Martin Ågren
  2022-09-23 22:07 ` [PATCH] " Jeff King
  1 sibling, 1 reply; 7+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-09-23  7:38 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Victoria Dye


On Fri, Sep 23 2022, Martin Ågren wrote:

> We attribute each documentation text file to a man section by finding a
> line in the file that looks like "gitfoo(<digit>)". Commit cc75e556a9
> ("scalar: add to 'git help -a' command list", 2022-09-02) updated this
> logic to look not only for "gitfoo" but also "scalarfoo". In doing so,
> it forgot to account for the fact that after the updated regex has found
> a match, the man section is no longer to be found in `$1` but now lives
> in `$2`.
>
> This makes our git(1) manpage look as follows:
>
>   Main porcelain commands
>        git-add(git)
>            Add file contents to the index.
>
>   [...]
>
>        gitk(git)
>            The Git repository browser.
>
>        scalar(scalar)
>            A tool for managing large Git repositories.
>
> Restore the man sections by grabbing the correct value out of the regex
> match.
>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
>  This is a v2.38.0-rc1 regression.
>
>  Documentation/cmd-list.perl | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
> index 9515a499a3..16451d81b8 100755
> --- a/Documentation/cmd-list.perl
> +++ b/Documentation/cmd-list.perl
> @@ -11,7 +11,7 @@ sub format_one {
>  	open I, '<', "$name.txt" or die "No such file $name.txt";
>  	while (<I>) {
>  		if (/^(git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
> -			$mansection = $1;
> +			$mansection = $2;
>  			next;
>  		}
>  		if (/^NAME$/) {

Ouch, good catch!

In the v1 review for scalar[1] I pointed out that I had a local patch
where this is instead:

	if (/^[a-z0-9-]*\(([0-9])\)$/) {
		$mansection = $1;

The v2 at [2] which I didn't get around to reviewing (sorry!) then
introduced this logic error.

Victoria: Did you find some reason to not just take the version I had?
The doc-diff with Martin's above is empty, i.e. it's the same in
practice, but if we don't need to hard-code our top-level commands for
no reason I don't see why we'd maintain this list of them here.

On the proposed rc patch: FWIW we can also just use (?:) groupings
instead of a capture:

	if (/^(?:git|scalar)[a-z0-9-]*\(([0-9])\)$/) {

It has the same effect, but arguably makes more sense. I.e. if we don't
care about $1 let's not capture the thing we don't care about into $1 in
the first place.

1. https://lore.kernel.org/git/220831.86y1v48h2x.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/070f195f027e5601b88ca6a0d4c9991b472ad9ab.1662134210.git.gitgitgadget@gmail.com/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2] cmd-list.perl: fix identifying man sections
  2022-09-23  7:38 ` Ævar Arnfjörð Bjarmason
@ 2022-09-23  8:07   ` Martin Ågren
  2022-09-23  8:37     ` Ævar Arnfjörð Bjarmason
  2022-09-23 17:01     ` Junio C Hamano
  0 siblings, 2 replies; 7+ messages in thread
From: Martin Ågren @ 2022-09-23  8:07 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Victoria Dye

We attribute each documentation text file to a man section by finding a
line in the file that looks like "gitfoo(<digit>)". Commit cc75e556a9
("scalar: add to 'git help -a' command list", 2022-09-02) updated this
logic to look not only for "gitfoo" but also "scalarfoo". In doing so,
it forgot to account for the fact that after the updated regex has found
a match, the man section is no longer to be found in `$1` but now lives
in `$2`.

This makes our git(1) manpage look as follows:

  Main porcelain commands
       git-add(git)
           Add file contents to the index.

  [...]

       gitk(git)
           The Git repository browser.

       scalar(scalar)
           A tool for managing large Git repositories.

Restore the man sections by not capturing the (git|scalar) part of the
match into `$1`.

As noted by Ævar [1], we could even match any "foo" rather than just
"gitfoo" and "scalarfoo", but that's a larger change. For now, just fix
the regression in cc75e556a9.

[1] https://lore.kernel.org/git/220923.86wn9u4joo.gmgdl@evledraar.gmail.com/#t

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 Thanks Ævar for having a look at v1.

 Documentation/cmd-list.perl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
index 9515a499a3..755a110bc4 100755
--- a/Documentation/cmd-list.perl
+++ b/Documentation/cmd-list.perl
@@ -10,7 +10,7 @@ sub format_one {
 	$state = 0;
 	open I, '<', "$name.txt" or die "No such file $name.txt";
 	while (<I>) {
-		if (/^(git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
+		if (/^(?:git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
 			$mansection = $1;
 			next;
 		}
-- 
@@ -10,7 +10,7 @@ sub format_one {
 	$state = 0;
 	open I, '<', "$name.txt" or die "No such file $name.txt";
 	while (<I>) {
-		if (/^(git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
+		if (/^(?:git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
 			$mansection = $1;
 			next;
 		}
-- 
2.38.0.rc1.355.g7147cddc23


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] cmd-list.perl: fix identifying man sections
  2022-09-23  8:07   ` [PATCH v2] " Martin Ågren
@ 2022-09-23  8:37     ` Ævar Arnfjörð Bjarmason
  2022-09-23 11:42       ` Martin Ågren
  2022-09-23 17:01     ` Junio C Hamano
  1 sibling, 1 reply; 7+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-09-23  8:37 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Victoria Dye


On Fri, Sep 23 2022, Martin Ågren wrote:

> We attribute each documentation text file to a man section by finding a
> line in the file that looks like "gitfoo(<digit>)". Commit cc75e556a9
> ("scalar: add to 'git help -a' command list", 2022-09-02) updated this
> logic to look not only for "gitfoo" but also "scalarfoo". In doing so,
> it forgot to account for the fact that after the updated regex has found
> a match, the man section is no longer to be found in `$1` but now lives
> in `$2`.
>
> This makes our git(1) manpage look as follows:
>
>   Main porcelain commands
>        git-add(git)
>            Add file contents to the index.
>
>   [...]
>
>        gitk(git)
>            The Git repository browser.
>
>        scalar(scalar)
>            A tool for managing large Git repositories.
>
> Restore the man sections by not capturing the (git|scalar) part of the
> match into `$1`.
>
> As noted by Ævar [1], we could even match any "foo" rather than just
> "gitfoo" and "scalarfoo", but that's a larger change. For now, just fix
> the regression in cc75e556a9.

Thanks for the quick turn-around, this looks good to me in this form.

> [1] https://lore.kernel.org/git/220923.86wn9u4joo.gmgdl@evledraar.gmail.com/#t
>
> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
>  Thanks Ævar for having a look at v1.
>
>  Documentation/cmd-list.perl | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
> index 9515a499a3..755a110bc4 100755
> --- a/Documentation/cmd-list.perl
> +++ b/Documentation/cmd-list.perl
> @@ -10,7 +10,7 @@ sub format_one {
>  	$state = 0;
>  	open I, '<', "$name.txt" or die "No such file $name.txt";
>  	while (<I>) {
> -		if (/^(git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
> +		if (/^(?:git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
>  			$mansection = $1;
>  			next;
>  		}

Just for anyone doing future archaeology / digging into the "larger
change": The reason I felt safe in removing the "git" matching entirely
is because the larger context here is that we're parsing only the very
start of the *.txt. I.e. we would not want this:
	
	git(1)
	======
	
	NAME
	----
	git - the stupid content tracker
	

        foo(2)

To report Documentation/git.txt as being a "foo" in man section 2, but
that will not happen because as soon as we see a "NAME" line followed by
a "----" line we'll extract that one line of description, so even if our
regex would have eagerly matched that "foo(2)" we won't see it.

Now, having written and looked at this with fresh eyes this would be an
even better & more logical thing to do:
	
	diff --git a/Documentation/cmd-list.perl b/Documentation/cmd-list.perl
	index 755a110bc48..0bc4c803a10 100755
	--- a/Documentation/cmd-list.perl
	+++ b/Documentation/cmd-list.perl
	@@ -1,38 +1,38 @@
	 #!/usr/bin/perl -w
	 
	 use File::Compare qw(compare);
	 
	 sub format_one {
	 	my ($out, $nameattr) = @_;
	 	my ($name, $attr) = @$nameattr;
	 	my ($state, $description);
	 	my $mansection;
	 	$state = 0;
	 	open I, '<', "$name.txt" or die "No such file $name.txt";
	 	while (<I>) {
	-		if (/^(?:git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
	+		if (/^\Q$name\E\(([0-9])\)$/) {
	 			$mansection = $1;
	 			next;
	 		}
	 		if (/^NAME$/) {
	 			$state = 1;
	 			next;
	 		}
	 		if ($state == 1 && /^----$/) {
	 			$state = 2;
	 			next;
	 		}
	 		next if ($state != 2);
	 		chomp;
	 		$description = $_;
	 		last;
	 	}
	 	close I;
	 	if (!defined $description) {
	 		die "No description found in $name.txt";
	 	}
	 	if (my ($verify_name, $text) = ($description =~ /^($name) - (.*)/)) {
	 		print $out "linkgit:$name\[$mansection\]::\n\t";
	 		if ($attr =~ / deprecated /) {
	 			print $out "(deprecated) ";
	 		}

It yields the exact same result as Martin's patch above according to the
doc-diff, but as the -U25 context shows we already have a hard
dependency on the "scalar -" part of the description line matching the
name of the file ("scalar.txt") is something we should be doing.

Anyway, this is more than good enough for now, thanks! There's also much
bigger issues with the script, and we can leave that all aside from now
(e.g. if it dies the Makefile doesn't report an error, ouch!).

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] cmd-list.perl: fix identifying man sections
  2022-09-23  8:37     ` Ævar Arnfjörð Bjarmason
@ 2022-09-23 11:42       ` Martin Ågren
  0 siblings, 0 replies; 7+ messages in thread
From: Martin Ågren @ 2022-09-23 11:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Victoria Dye

On Fri, 23 Sept 2022 at 10:53, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> Thanks for the quick turn-around, this looks good to me in this form.

Thanks for reviewing.

> Now, having written and looked at this with fresh eyes this would be an
> even better & more logical thing to do:

>                 open I, '<', "$name.txt" or die "No such file $name.txt";
>                 while (<I>) {
>         -               if (/^(?:git|scalar)[a-z0-9-]*\(([0-9])\)$/) {
>         +               if (/^\Q$name\E\(([0-9])\)$/) {
>                                 $mansection = $1;
>                                 next;
>                         }

Right, we know what we want there, so this does look reasonable.

> Anyway, this is more than good enough for now, thanks! There's also much
> bigger issues with the script, and we can leave that all aside from now
> (e.g. if it dies the Makefile doesn't report an error, ouch!).

Martin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] cmd-list.perl: fix identifying man sections
  2022-09-23  8:07   ` [PATCH v2] " Martin Ågren
  2022-09-23  8:37     ` Ævar Arnfjörð Bjarmason
@ 2022-09-23 17:01     ` Junio C Hamano
  1 sibling, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2022-09-23 17:01 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Ævar Arnfjörð Bjarmason, git, Victoria Dye

Martin Ågren <martin.agren@gmail.com> writes:

> We attribute each documentation text file to a man section by finding a
> line in the file that looks like "gitfoo(<digit>)". Commit cc75e556a9
> ("scalar: add to 'git help -a' command list", 2022-09-02) updated this
> logic to look not only for "gitfoo" but also "scalarfoo". In doing so,
> it forgot to account for the fact that after the updated regex has found
> a match, the man section is no longer to be found in `$1` but now lives
> in `$2`.
>
> This makes our git(1) manpage look as follows:
>
>   Main porcelain commands
>        git-add(git)
>            Add file contents to the index.

Thanks.  Will queue.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] cmd-list.perl: fix identifying man sections
  2022-09-23  7:03 [PATCH] cmd-list.perl: fix identifying man sections Martin Ågren
  2022-09-23  7:38 ` Ævar Arnfjörð Bjarmason
@ 2022-09-23 22:07 ` Jeff King
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff King @ 2022-09-23 22:07 UTC (permalink / raw)
  To: Martin Ågren
  Cc: git, Victoria Dye, Ævar Arnfjörð Bjarmason

On Fri, Sep 23, 2022 at 09:03:34AM +0200, Martin Ågren wrote:

> This makes our git(1) manpage look as follows:
> 
>   Main porcelain commands
>        git-add(git)
>            Add file contents to the index.
> 
>   [...]
> 
>        gitk(git)
>            The Git repository browser.
> 
>        scalar(scalar)
>            A tool for managing large Git repositories.

Good catch. The patch looks good (I was going to suggest ?:, but I see
Ævar beat me to it).

I wondered if we might have caught this in a more automatic way. The
output of:

  cd Documentation
  ./doc-diff cc75e556a9^ cc75e556a9

makes the problem apparent, but I don't fault reviewers for not running
it. I rarely remember to do so. And in general you need a human looking
at doc-diff output to know if it was the intended change or not.

I wondered if it might be worth running

  ./doc-diff v2.37.0 v2.38.0-rc1

near a release to scan over all of the changes. But the diff is over
8000 lines, and I admit my eyes glazed over before I got to the
problematic hunks (even though I knew I was looking for them!). You can
limit it a bit with --diff-filter=a, which drops new entries (which
can't have regressed!), but it's still over 4000 lines.

So I dunno. I think doc-diff is a potentially useful tool, but I'm not
sure how to point the human attention at the right spot to find a bug.
Maybe "given enough eyeballs, all bugs are shallow" is our best bet
here. After all, it did find this bug before the release. :)

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-09-23 22:07 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-23  7:03 [PATCH] cmd-list.perl: fix identifying man sections Martin Ågren
2022-09-23  7:38 ` Ævar Arnfjörð Bjarmason
2022-09-23  8:07   ` [PATCH v2] " Martin Ågren
2022-09-23  8:37     ` Ævar Arnfjörð Bjarmason
2022-09-23 11:42       ` Martin Ågren
2022-09-23 17:01     ` Junio C Hamano
2022-09-23 22:07 ` [PATCH] " Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).