git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] parallel-checkout: use grep -c to count workers in tests
@ 2021-06-05 12:27 René Scharfe
  2021-06-05 14:31 ` Matheus Tavares Bernardino
  2021-06-05 18:11 ` [PATCH v2] parallel-checkout: avoid dash local bug " René Scharfe
  0 siblings, 2 replies; 14+ messages in thread
From: René Scharfe @ 2021-06-05 12:27 UTC (permalink / raw)
  To: Git List; +Cc: Matheus Tavares, Junio C Hamano

The parallel checkout tests fail when run with /bin/dash on MacOS 11.4,
reporting the following error:

   ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name

This seems to be caused by a bug in dash, which doesn't like the pipe
into wc for some reason.  We can work around it and make the test
slightly shorter and faster by having grep do the counting, though, so
let's do that.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
Reduced test case for underlying dash issue:

   $ dash -c 'foo () { local bar=$(echo baz | wc); }; foo'
   dash: 1: local: 1: bad variable name


The pipe is not even required to trigger the issue:

   $ dash -c 'foo () { local bar=$(wc /dev/null); }; foo'
   dash: 1: local: 0: bad variable name

Turning wc into a function calms the shell:

   $ dash -c 'foo () { local bar=$(echo baz | wc); }; wc () { :; }; foo'

Setting a global variable also works fine:

   $ dash -c 'foo () { bar=$(echo baz | wc); }; foo'

 t/lib-parallel-checkout.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
index 21f5759732..145276eb4c 100644
--- a/t/lib-parallel-checkout.sh
+++ b/t/lib-parallel-checkout.sh
@@ -27,7 +27,7 @@ test_checkout_workers () {
 	rm -f "$trace_file" &&
 	GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 &&

-	local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) &&
+	local workers=$(grep -c "child_start\[..*\] git checkout--worker" <"$trace_file") &&
 	test $workers -eq $expected_workers &&
 	rm "$trace_file"
 } 8>&2 2>&4
--
2.31.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] parallel-checkout: use grep -c to count workers in tests
  2021-06-05 12:27 [PATCH] parallel-checkout: use grep -c to count workers in tests René Scharfe
@ 2021-06-05 14:31 ` Matheus Tavares Bernardino
  2021-06-05 15:20   ` René Scharfe
  2021-06-05 18:11 ` [PATCH v2] parallel-checkout: avoid dash local bug " René Scharfe
  1 sibling, 1 reply; 14+ messages in thread
From: Matheus Tavares Bernardino @ 2021-06-05 14:31 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List, Junio C Hamano

On Sat, Jun 5, 2021 at 9:27 AM René Scharfe <l.s.r@web.de> wrote:
>
> The parallel checkout tests fail when run with /bin/dash on MacOS 11.4,
> reporting the following error:
>
>    ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name

That's interesting. It seems that dash is trying to use wc's output
(in this case "0") as a local variable name which, of course, is not
valid.

This reply [1] to this bug report [2] mentions that dash expands a
local assignment like the following:

    x="1 2 3"
    local y=$x ---expands-to---> local y=1 2 3

So, in this case, dash thinks we are trying to declare three local
variables, y, 2, and 3, which is an error. In bash, the above commands
would result in $y getting the value "1 2 3", even though we didn't
quote $x in the assignment. (BTW, the reply mentions that quoting the
right side of the assignment should make this work in dash as well.)

I wonder if that's what's happening here. Maybe "wc -l" is outputting
a space before the number, and that makes dash parse this line as
something like `local workers="" 0="" `? If that's really the case (I
can't confirm because the bug seems to have been fixed in the dash
version I have), maybe we could mention that in the commit message.

[1]: https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097/comments/6
[2]: https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097

> This seems to be caused by a bug in dash, which doesn't like the pipe
> into wc for some reason.  We can work around it and make the test
> slightly shorter and faster by having grep do the counting, though, so
> let's do that.
>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Reduced test case for underlying dash issue:
>
>    $ dash -c 'foo () { local bar=$(echo baz | wc); }; foo'
>    dash: 1: local: 1: bad variable name
>
>
> The pipe is not even required to trigger the issue:
>
>    $ dash -c 'foo () { local bar=$(wc /dev/null); }; foo'
>    dash: 1: local: 0: bad variable name
>
> Turning wc into a function calms the shell:
>
>    $ dash -c 'foo () { local bar=$(echo baz | wc); }; wc () { :; }; foo'
>
> Setting a global variable also works fine:
>
>    $ dash -c 'foo () { bar=$(echo baz | wc); }; foo'
>
>  t/lib-parallel-checkout.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
> index 21f5759732..145276eb4c 100644
> --- a/t/lib-parallel-checkout.sh
> +++ b/t/lib-parallel-checkout.sh
> @@ -27,7 +27,7 @@ test_checkout_workers () {
>         rm -f "$trace_file" &&
>         GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 &&
>
> -       local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) &&
> +       local workers=$(grep -c "child_start\[..*\] git checkout--worker" <"$trace_file") &&

Nice, and the result is both cleaner and more efficient :) Just one
minor nit: I think we could drop the redirection as grep can take the
file name as an argument.

>         test $workers -eq $expected_workers &&
>         rm "$trace_file"
>  } 8>&2 2>&4
> --
> 2.31.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] parallel-checkout: use grep -c to count workers in tests
  2021-06-05 14:31 ` Matheus Tavares Bernardino
@ 2021-06-05 15:20   ` René Scharfe
  2021-06-05 15:30     ` René Scharfe
  0 siblings, 1 reply; 14+ messages in thread
From: René Scharfe @ 2021-06-05 15:20 UTC (permalink / raw)
  To: Matheus Tavares Bernardino; +Cc: Git List, Junio C Hamano

Am 05.06.21 um 16:31 schrieb Matheus Tavares Bernardino:
> On Sat, Jun 5, 2021 at 9:27 AM René Scharfe <l.s.r@web.de> wrote:
>>
>> The parallel checkout tests fail when run with /bin/dash on MacOS 11.4,
>> reporting the following error:
>>
>>    ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name
>
> That's interesting. It seems that dash is trying to use wc's output
> (in this case "0") as a local variable name which, of course, is not
> valid.
>
> This reply [1] to this bug report [2] mentions that dash expands a
> local assignment like the following:
>
>     x="1 2 3"
>     local y=$x ---expands-to---> local y=1 2 3
>
> So, in this case, dash thinks we are trying to declare three local
> variables, y, 2, and 3, which is an error. In bash, the above commands
> would result in $y getting the value "1 2 3", even though we didn't
> quote $x in the assignment. (BTW, the reply mentions that quoting the
> right side of the assignment should make this work in dash as well.)
>
> I wonder if that's what's happening here. Maybe "wc -l" is outputting
> a space before the number, and that makes dash parse this line as
> something like `local workers="" 0="" `? If that's really the case (I
> can't confirm because the bug seems to have been fixed in the dash
> version I have), maybe we could mention that in the commit message.
>
> [1]: https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097/comments/6
> [2]: https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097

Ah, indeed:

   $ dash -c 'foo () { local bar=$(echo "1"); }; foo'

   $ dash -c 'foo () { local bar=$(echo " 1"); }; foo'
   dash: 1: local: 1: bad variable name

   $ wc -l </dev/null | tr ' ' s
   sssssss0

>>  t/lib-parallel-checkout.sh | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
>> index 21f5759732..145276eb4c 100644
>> --- a/t/lib-parallel-checkout.sh
>> +++ b/t/lib-parallel-checkout.sh
>> @@ -27,7 +27,7 @@ test_checkout_workers () {
>>         rm -f "$trace_file" &&
>>         GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 &&
>>
>> -       local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) &&
>> +       local workers=$(grep -c "child_start\[..*\] git checkout--worker" <"$trace_file") &&
>
> Nice, and the result is both cleaner and more efficient :) Just one
> minor nit: I think we could drop the redirection as grep can take the
> file name as an argument.

I'm not sure if there's a grep out there that prints the filename before
the count even if it deals with a single file.  git grep does that, at
least.  POSIX[3] implies the lack of filename prefix for the single file
case, but I don't know if we can rely on that everywhere.

[3] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html

>
>>         test $workers -eq $expected_workers &&
>>         rm "$trace_file"
>>  } 8>&2 2>&4
>> --
>> 2.31.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] parallel-checkout: use grep -c to count workers in tests
  2021-06-05 15:20   ` René Scharfe
@ 2021-06-05 15:30     ` René Scharfe
  0 siblings, 0 replies; 14+ messages in thread
From: René Scharfe @ 2021-06-05 15:30 UTC (permalink / raw)
  To: Matheus Tavares Bernardino; +Cc: Git List, Junio C Hamano

Am 05.06.21 um 17:20 schrieb René Scharfe:
> Am 05.06.21 um 16:31 schrieb Matheus Tavares Bernardino:
>> On Sat, Jun 5, 2021 at 9:27 AM René Scharfe <l.s.r@web.de> wrote:
>> Just one
>> minor nit: I think we could drop the redirection as grep can take the
>> file name as an argument.
>
> I'm not sure if there's a grep out there that prints the filename before
> the count even if it deals with a single file.  git grep does that, at
> least.  POSIX[3] implies the lack of filename prefix for the single file
> case, but I don't know if we can rely on that everywhere.
>
> [3] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html

Except we got plenty of "grep -c" calls with a single filename and no
redirection in t/ already.  Adding one more should be fine.

René

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2] parallel-checkout: avoid dash local bug in tests
  2021-06-05 12:27 [PATCH] parallel-checkout: use grep -c to count workers in tests René Scharfe
  2021-06-05 14:31 ` Matheus Tavares Bernardino
@ 2021-06-05 18:11 ` René Scharfe
  2021-06-05 19:09   ` SZEDER Gábor
                     ` (2 more replies)
  1 sibling, 3 replies; 14+ messages in thread
From: René Scharfe @ 2021-06-05 18:11 UTC (permalink / raw)
  To: Git List; +Cc: Matheus Tavares, Junio C Hamano

The parallel checkout tests fail when run with /bin/dash on MacOS 11.4,
reporting the following error:

   ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name

That's because wc's output contains leading spaces and this version of
dash erroneously expands the variable declaration as "local workers= 0",
i.e. it tries to set the "workers" variable to the empty string and also
declare a variable named "0", which not a valid name.  This is a known
dash bug (https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097).

Work around it by passing the command output directly to test instead of
storing it in a variable first.  While at it, let grep count the number
of lines instead of piping its output to wc, which is a bit shorter and
more efficient.

Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
Changes since v1:
- Explain the root cause.
- Get rid of the local variable "workers".
- Adjust title accordingly.
- Still use grep -c, though.
- Remove input redirection.

 t/lib-parallel-checkout.sh | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
index 21f5759732..66350d5207 100644
--- a/t/lib-parallel-checkout.sh
+++ b/t/lib-parallel-checkout.sh
@@ -27,8 +27,7 @@ test_checkout_workers () {
 	rm -f "$trace_file" &&
 	GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 &&

-	local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) &&
-	test $workers -eq $expected_workers &&
+	test $(grep -c "child_start\[..*\] git checkout--worker" "$trace_file") -eq $expected_workers &&
 	rm "$trace_file"
 } 8>&2 2>&4

--
2.31.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] parallel-checkout: avoid dash local bug in tests
  2021-06-05 18:11 ` [PATCH v2] parallel-checkout: avoid dash local bug " René Scharfe
@ 2021-06-05 19:09   ` SZEDER Gábor
  2021-06-06  1:01     ` René Scharfe
  2021-06-05 19:56   ` Ævar Arnfjörð Bjarmason
  2021-06-06  1:01   ` [PATCH v3] " René Scharfe
  2 siblings, 1 reply; 14+ messages in thread
From: SZEDER Gábor @ 2021-06-05 19:09 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List, Matheus Tavares, Junio C Hamano

On Sat, Jun 05, 2021 at 08:11:24PM +0200, René Scharfe wrote:
> The parallel checkout tests fail when run with /bin/dash on MacOS 11.4,
> reporting the following error:
> 
>    ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name
> 
> That's because wc's output contains leading spaces and this version of
> dash erroneously expands the variable declaration as "local workers= 0",
> i.e. it tries to set the "workers" variable to the empty string and also
> declare a variable named "0", which not a valid name.  This is a known
> dash bug (https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097).

Perhaps a more accurate wording for this bug would be:

  ... and even fairly recent versions of dash erroneously perform
  field splitting on the expansion of the command substitution before
  assigning it to a local variable.

I think the relevant part of POSIX is section 2.9.1 Simple Commands:

  4. Each variable assignment shall be expanded for tilde expansion,
     parameter expansion, command substitution, arithmetic expansion,
     and quote removal prior to assigning the value.

  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01

Note that it didn't mention field splitting; though POSIX doesn't
specifies local variables in the first place, so...

Anyway, this bug has been fixed in v0.5.11 (2020-06-01).
This is an old bug, it was already present in v0.5.5 (2009-01-13); I
didn't check earlier versions.

> Work around it by passing the command output directly to test instead of
> storing it in a variable first.  While at it, let grep count the number
> of lines instead of piping its output to wc, which is a bit shorter and
> more efficient.

A more debug-friendly alternative would be to save 'grep's output to a
temporary file and use 'test_line_count = $expected_workers'.

> Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Changes since v1:
> - Explain the root cause.
> - Get rid of the local variable "workers".
> - Adjust title accordingly.
> - Still use grep -c, though.
> - Remove input redirection.
> 
>  t/lib-parallel-checkout.sh | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
> index 21f5759732..66350d5207 100644
> --- a/t/lib-parallel-checkout.sh
> +++ b/t/lib-parallel-checkout.sh
> @@ -27,8 +27,7 @@ test_checkout_workers () {
>  	rm -f "$trace_file" &&
>  	GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 &&
> 
> -	local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) &&
> -	test $workers -eq $expected_workers &&
> +	test $(grep -c "child_start\[..*\] git checkout--worker" "$trace_file") -eq $expected_workers &&
>  	rm "$trace_file"
>  } 8>&2 2>&4
> 
> --
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] parallel-checkout: avoid dash local bug in tests
  2021-06-05 18:11 ` [PATCH v2] parallel-checkout: avoid dash local bug " René Scharfe
  2021-06-05 19:09   ` SZEDER Gábor
@ 2021-06-05 19:56   ` Ævar Arnfjörð Bjarmason
  2021-06-05 22:17     ` Matheus Tavares Bernardino
  2021-06-06  1:01     ` René Scharfe
  2021-06-06  1:01   ` [PATCH v3] " René Scharfe
  2 siblings, 2 replies; 14+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-05 19:56 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List, Matheus Tavares, Junio C Hamano


On Sat, Jun 05 2021, René Scharfe wrote:

> The parallel checkout tests fail when run with /bin/dash on MacOS 11.4,
> reporting the following error:
>
>    ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name
>
> That's because wc's output contains leading spaces and this version of
> dash erroneously expands the variable declaration as "local workers= 0",
> i.e. it tries to set the "workers" variable to the empty string and also
> declare a variable named "0", which not a valid name.  This is a known
> dash bug (https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097).
>
> Work around it by passing the command output directly to test instead of
> storing it in a variable first.  While at it, let grep count the number
> of lines instead of piping its output to wc, which is a bit shorter and
> more efficient.
>
> Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Changes since v1:
> - Explain the root cause.
> - Get rid of the local variable "workers".
> - Adjust title accordingly.
> - Still use grep -c, though.
> - Remove input redirection.
>
>  t/lib-parallel-checkout.sh | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
> index 21f5759732..66350d5207 100644
> --- a/t/lib-parallel-checkout.sh
> +++ b/t/lib-parallel-checkout.sh
> @@ -27,8 +27,7 @@ test_checkout_workers () {
>  	rm -f "$trace_file" &&
>  	GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 &&
>
> -	local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) &&
> -	test $workers -eq $expected_workers &&
> +	test $(grep -c "child_start\[..*\] git checkout--worker" "$trace_file") -eq $expected_workers &&
>  	rm "$trace_file"
>  } 8>&2 2>&4

I'd find this thing much clearer if the v2 just narrowly focused on
avoiding the "local", and thus demonstrated the non-portable shell
issue, and perhaps with something like:
	
	diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl
	index fd3303552be..aad6f3e2bf1 100755
	--- a/t/check-non-portable-shell.pl
	+++ b/t/check-non-portable-shell.pl
	@@ -48,6 +48,7 @@ sub err {
	 	/\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)';
	 	/^\s*([A-Z0-9_]+=(\w*|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and
	 		err '"FOO=bar shell_func" assignment extends beyond "shell_func"';
	+	/\blocal\b \S+=\$\(.*?\|\s*\bwc -l\)/ and err 'whitespace handling in local=$(... | wc -l) differs in some dash versions';
	 	$line = '';
	 	# this resets our $. for each file
	 	close ARGV if eof;


The let's do grep -c while we're at it part of this IMO just adds
confusion while skimming future portability issues with --grep=dash or
--grep=POSIX in the future, and looking at the history in v1 it's just
there because in v1 the root cause wasn't fully understood.

If we're doing a general cleanup of that pattern it would seem to be
better to search-replace this with the rest of them in another commit:

    $ git grep '\$\(grep.*\| wc -l' -- t | wc -l
    27

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] parallel-checkout: avoid dash local bug in tests
  2021-06-05 19:56   ` Ævar Arnfjörð Bjarmason
@ 2021-06-05 22:17     ` Matheus Tavares Bernardino
  2021-06-05 22:21       ` Matheus Tavares Bernardino
  2021-06-06  1:01     ` René Scharfe
  1 sibling, 1 reply; 14+ messages in thread
From: Matheus Tavares Bernardino @ 2021-06-05 22:17 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: René Scharfe, Git List, Junio C Hamano

On Sat, Jun 5, 2021 at 5:03 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
>
> On Sat, Jun 05 2021, René Scharfe wrote:
>
> > The parallel checkout tests fail when run with /bin/dash on MacOS 11.4,
> > reporting the following error:
> >
> >    ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name
> >
> > That's because wc's output contains leading spaces and this version of
> > dash erroneously expands the variable declaration as "local workers= 0",
> > i.e. it tries to set the "workers" variable to the empty string and also
> > declare a variable named "0", which not a valid name.  This is a known
> > dash bug (https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097).
> >
> > Work around it by passing the command output directly to test instead of
> > storing it in a variable first.  While at it, let grep count the number
> > of lines instead of piping its output to wc, which is a bit shorter and
> > more efficient.
> >
> > Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
> > Signed-off-by: René Scharfe <l.s.r@web.de>
> > ---
> > Changes since v1:
> > - Explain the root cause.
> > - Get rid of the local variable "workers".
> > - Adjust title accordingly.
> > - Still use grep -c, though.
> > - Remove input redirection.
> >
> >  t/lib-parallel-checkout.sh | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
> > index 21f5759732..66350d5207 100644
> > --- a/t/lib-parallel-checkout.sh
> > +++ b/t/lib-parallel-checkout.sh
> > @@ -27,8 +27,7 @@ test_checkout_workers () {
> >       rm -f "$trace_file" &&
> >       GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 &&
> >
> > -     local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) &&
> > -     test $workers -eq $expected_workers &&
> > +     test $(grep -c "child_start\[..*\] git checkout--worker" "$trace_file") -eq $expected_workers &&
> >       rm "$trace_file"
> >  } 8>&2 2>&4
>
> I'd find this thing much clearer if the v2 just narrowly focused on
> avoiding the "local", and thus demonstrated the non-portable shell
> issue,

I don't have any strong preference, but if we are leaving the "grep |
wc -l" -> "grep -c" conversion to a followup patch, perhaps the
simplest change focusing on the dash issue would be to quote the
right-hand side of the "local" assignment:

-     local workers=$(grep "child_start\[..*\] git checkout--worker"
"$trace_file" | wc -l) &&
+     local workers="$(grep "child_start\[..*\] git checkout--worker"
"$trace_file" | wc -l)" &&

(René, could you confirm if this works to make the test pass on dash?)

Alternatively, we could use `test_line_count` as SZEDER Gábor
suggested in a parallel reply.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] parallel-checkout: avoid dash local bug in tests
  2021-06-05 22:17     ` Matheus Tavares Bernardino
@ 2021-06-05 22:21       ` Matheus Tavares Bernardino
  0 siblings, 0 replies; 14+ messages in thread
From: Matheus Tavares Bernardino @ 2021-06-05 22:21 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: René Scharfe, Git List, Junio C Hamano

On Sat, Jun 5, 2021 at 7:17 PM Matheus Tavares Bernardino
<matheus.bernardino@usp.br> wrote:
>
> I don't have any strong preference, but if we are leaving the "grep |
> wc -l" -> "grep -c" conversion to a followup patch, perhaps the
> simplest change focusing on the dash issue would be to quote the
> right-hand side of the "local" assignment:
>
> -     local workers=$(grep "child_start\[..*\] git checkout--worker"
> "$trace_file" | wc -l) &&
> +     local workers="$(grep "child_start\[..*\] git checkout--worker"
> "$trace_file" | wc -l)" &&

Oops, sorry for the odd line breaking. I forgot that Gmail's web
client would break these lines.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] parallel-checkout: avoid dash local bug in tests
  2021-06-05 19:56   ` Ævar Arnfjörð Bjarmason
  2021-06-05 22:17     ` Matheus Tavares Bernardino
@ 2021-06-06  1:01     ` René Scharfe
  2021-06-06  1:28       ` Junio C Hamano
  1 sibling, 1 reply; 14+ messages in thread
From: René Scharfe @ 2021-06-06  1:01 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Matheus Tavares, Junio C Hamano

Am 05.06.21 um 21:56 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sat, Jun 05 2021, René Scharfe wrote:
>
>> The parallel checkout tests fail when run with /bin/dash on MacOS 11.4,
>> reporting the following error:
>>
>>    ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name
>>
>> That's because wc's output contains leading spaces and this version of
>> dash erroneously expands the variable declaration as "local workers= 0",
>> i.e. it tries to set the "workers" variable to the empty string and also
>> declare a variable named "0", which not a valid name.  This is a known
>> dash bug (https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097).
>>
>> Work around it by passing the command output directly to test instead of
>> storing it in a variable first.  While at it, let grep count the number
>> of lines instead of piping its output to wc, which is a bit shorter and
>> more efficient.
>>
>> Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
>> Signed-off-by: René Scharfe <l.s.r@web.de>
>> ---
>> Changes since v1:
>> - Explain the root cause.
>> - Get rid of the local variable "workers".
>> - Adjust title accordingly.
>> - Still use grep -c, though.
>> - Remove input redirection.
>>
>>  t/lib-parallel-checkout.sh | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
>> index 21f5759732..66350d5207 100644
>> --- a/t/lib-parallel-checkout.sh
>> +++ b/t/lib-parallel-checkout.sh
>> @@ -27,8 +27,7 @@ test_checkout_workers () {
>>  	rm -f "$trace_file" &&
>>  	GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 &&
>>
>> -	local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) &&
>> -	test $workers -eq $expected_workers &&
>> +	test $(grep -c "child_start\[..*\] git checkout--worker" "$trace_file") -eq $expected_workers &&
>>  	rm "$trace_file"
>>  } 8>&2 2>&4
>
> I'd find this thing much clearer if the v2 just narrowly focused on
> avoiding the "local", and thus demonstrated the non-portable shell
> issue,

I was not aiming for a minimal fix and I don't think the patch above is
too complex, but you're right that at this point in the release cycle a
duct-tape-style patch would be better.

> and perhaps with something like:
>
> 	diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl
> 	index fd3303552be..aad6f3e2bf1 100755
> 	--- a/t/check-non-portable-shell.pl
> 	+++ b/t/check-non-portable-shell.pl
> 	@@ -48,6 +48,7 @@ sub err {
> 	 	/\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)';
> 	 	/^\s*([A-Z0-9_]+=(\w*|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and
> 	 		err '"FOO=bar shell_func" assignment extends beyond "shell_func"';
> 	+	/\blocal\b \S+=\$\(.*?\|\s*\bwc -l\)/ and err 'whitespace handling in local=$(... | wc -l) differs in some dash versions';

Any command can output whitespace, it's not limited to wc -l.  So a
better rule might be to always quote command substitutions in local
variable declarations (local foo="$(...)").  Or to disallow assignments
with local altogether, like we already do for export, but that might be
a bit much.

> 	 	$line = '';
> 	 	# this resets our $. for each file
> 	 	close ARGV if eof;
>
>
> The let's do grep -c while we're at it part of this IMO just adds
> confusion while skimming future portability issues with --grep=dash or
> --grep=POSIX in the future, and looking at the history in v1 it's just
> there because in v1 the root cause wasn't fully understood.

True, I was still gripping the "use grep -c instead of grep | wc -l"
hammer.  I better rewatch https://www.youtube.com/watch?v=Yv4tI6939q0

>
> If we're doing a general cleanup of that pattern it would seem to be
> better to search-replace this with the rest of them in another commit:
>
>     $ git grep '\$\(grep.*\| wc -l' -- t | wc -l
>     27
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] parallel-checkout: avoid dash local bug in tests
  2021-06-05 19:09   ` SZEDER Gábor
@ 2021-06-06  1:01     ` René Scharfe
  0 siblings, 0 replies; 14+ messages in thread
From: René Scharfe @ 2021-06-06  1:01 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Git List, Matheus Tavares, Junio C Hamano

Am 05.06.21 um 21:09 schrieb SZEDER Gábor:
> On Sat, Jun 05, 2021 at 08:11:24PM +0200, René Scharfe wrote:
>> The parallel checkout tests fail when run with /bin/dash on MacOS 11.4,
>> reporting the following error:
>>
>>    ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name
>>
>> That's because wc's output contains leading spaces and this version of
>> dash erroneously expands the variable declaration as "local workers= 0",
>> i.e. it tries to set the "workers" variable to the empty string and also
>> declare a variable named "0", which not a valid name.  This is a known
>> dash bug (https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097).
>
> Perhaps a more accurate wording for this bug would be:
>
>   ... and even fairly recent versions of dash erroneously perform
>   field splitting on the expansion of the command substitution before
>   assigning it to a local variable.

OK.

>
> I think the relevant part of POSIX is section 2.9.1 Simple Commands:
>
>   4. Each variable assignment shall be expanded for tilde expansion,
>      parameter expansion, command substitution, arithmetic expansion,
>      and quote removal prior to assigning the value.
>
>   https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01
>
> Note that it didn't mention field splitting; though POSIX doesn't
> specifies local variables in the first place, so...
>
> Anyway, this bug has been fixed in v0.5.11 (2020-06-01).
> This is an old bug, it was already present in v0.5.5 (2009-01-13); I
> didn't check earlier versions.
>
>> Work around it by passing the command output directly to test instead of
>> storing it in a variable first.  While at it, let grep count the number
>> of lines instead of piping its output to wc, which is a bit shorter and
>> more efficient.
>
> A more debug-friendly alternative would be to save 'grep's output to a
> temporary file and use 'test_line_count = $expected_workers'.

Yes, but that would cement the use of grep and wc -l and I still can't
let go of the idea that grep -c would be slightly quicker.  And it would
add file I/O.

Something like this would be more efficient in the expected case:

	if test $expected_count -ne $(grep -c -e "$pattern" "$file")
	then
		echo "Expected $expected_count lines matching $patter, but got:"
		grep -e "$pattern" "$file"
		return 1
	fi

I have no performance numbers to show, just the vague feeling that the
test suite takes way too long already.

Anyway, for now a minimal fix should do.  Debug features can be added to
this case and several similar ones later.

>
>> Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
>> Signed-off-by: René Scharfe <l.s.r@web.de>
>> ---
>> Changes since v1:
>> - Explain the root cause.
>> - Get rid of the local variable "workers".
>> - Adjust title accordingly.
>> - Still use grep -c, though.
>> - Remove input redirection.
>>
>>  t/lib-parallel-checkout.sh | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
>> index 21f5759732..66350d5207 100644
>> --- a/t/lib-parallel-checkout.sh
>> +++ b/t/lib-parallel-checkout.sh
>> @@ -27,8 +27,7 @@ test_checkout_workers () {
>>  	rm -f "$trace_file" &&
>>  	GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 &&
>>
>> -	local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) &&
>> -	test $workers -eq $expected_workers &&
>> +	test $(grep -c "child_start\[..*\] git checkout--worker" "$trace_file") -eq $expected_workers &&
>>  	rm "$trace_file"
>>  } 8>&2 2>&4
>>
>> --
>> 2.31.1
>>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v3] parallel-checkout: avoid dash local bug in tests
  2021-06-05 18:11 ` [PATCH v2] parallel-checkout: avoid dash local bug " René Scharfe
  2021-06-05 19:09   ` SZEDER Gábor
  2021-06-05 19:56   ` Ævar Arnfjörð Bjarmason
@ 2021-06-06  1:01   ` René Scharfe
  2021-06-06  1:41     ` Junio C Hamano
  2 siblings, 1 reply; 14+ messages in thread
From: René Scharfe @ 2021-06-06  1:01 UTC (permalink / raw)
  To: Git List
  Cc: Matheus Tavares, Junio C Hamano, SZEDER Gábor,
	Ævar Arnfjörð Bjarmason

Dash bug https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097
lets the shell erroneously perform field splitting on the expansion of a
command substitution during declaration of a local variable.  It causes
the parallel-checkout tests to fail e.g. when running them with
/bin/dash on MacOS 11.4, where they error out like this:

   ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name

That's because the output of wc -l contains leading spaces and the
returned number of lines is treated as another variable to declare, i.e.
as in "local workers= 0".

Work around it by enclosing the command substitution in quotes.

Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
Changes since v2:
- Use minimal fix.
- New commit message.

 t/lib-parallel-checkout.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
index 21f5759732..83b279a846 100644
--- a/t/lib-parallel-checkout.sh
+++ b/t/lib-parallel-checkout.sh
@@ -27,7 +27,7 @@ test_checkout_workers () {
 	rm -f "$trace_file" &&
 	GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 &&

-	local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) &&
+	local workers="$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l)" &&
 	test $workers -eq $expected_workers &&
 	rm "$trace_file"
 } 8>&2 2>&4
--
2.31.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] parallel-checkout: avoid dash local bug in tests
  2021-06-06  1:01     ` René Scharfe
@ 2021-06-06  1:28       ` Junio C Hamano
  0 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2021-06-06  1:28 UTC (permalink / raw)
  To: René Scharfe
  Cc: Ævar Arnfjörð Bjarmason, Git List, Matheus Tavares

René Scharfe <l.s.r@web.de> writes:

> Am 05.06.21 um 21:56 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Sat, Jun 05 2021, René Scharfe wrote:
>>
>>> -	local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) &&
>>> -	test $workers -eq $expected_workers &&
>>> +	test $(grep -c "child_start\[..*\] git checkout--worker" "$trace_file") -eq $expected_workers &&
>>>  	rm "$trace_file"
>>>  } 8>&2 2>&4
>>
>> I'd find this thing much clearer if the v2 just narrowly focused on
>> avoiding the "local", and thus demonstrated the non-portable shell
>> issue,
>
> I was not aiming for a minimal fix and I don't think the patch above is
> too complex, but you're right that at this point in the release cycle a
> duct-tape-style patch would be better.

Sorry for being late to the party but I tend to disagree.  "grep -c"
is used elsewhere in the tests, we know it is safe.

IOW, the above change is quite straight-forward.  It is much more
obvious and close to "duct-tape-style" fix, than some altermatives
like enclosing the whole command substitution in a pair of
double-quotes and rely on $workers always getting used without
double-quotes around it.  To me, that looks more subtle and brittle.

>> and perhaps with something like:
>>
>> 	diff --git a/t/check-non-portable-shell.pl b/t/check-non-portable-shell.pl
>> 	index fd3303552be..aad6f3e2bf1 100755
>> 	--- a/t/check-non-portable-shell.pl
>> 	+++ b/t/check-non-portable-shell.pl
>> 	@@ -48,6 +48,7 @@ sub err {
>> 	 	/\bexport\s+[A-Za-z0-9_]*=/ and err '"export FOO=bar" is not portable (use FOO=bar && export FOO)';
>> 	 	/^\s*([A-Z0-9_]+=(\w*|(["']).*?\3)\s+)+(\w+)/ and exists($func{$4}) and
>> 	 		err '"FOO=bar shell_func" assignment extends beyond "shell_func"';
>> 	+	/\blocal\b \S+=\$\(.*?\|\s*\bwc -l\)/ and err 'whitespace handling in local=$(... | wc -l) differs in some dash versions';
>
> Any command can output whitespace, it's not limited to wc -l.  So a
> better rule might be to always quote command substitutions in local
> variable declarations (local foo="$(...)").  Or to disallow assignments
> with local altogether, like we already do for export, but that might be
> a bit much.

I actually do not mind "no assignments with local decl" that much.
I agree that is outside the scope of this particular fix.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3] parallel-checkout: avoid dash local bug in tests
  2021-06-06  1:01   ` [PATCH v3] " René Scharfe
@ 2021-06-06  1:41     ` Junio C Hamano
  0 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2021-06-06  1:41 UTC (permalink / raw)
  To: René Scharfe
  Cc: Git List, Matheus Tavares, SZEDER Gábor,
	Ævar Arnfjörð Bjarmason

René Scharfe <l.s.r@web.de> writes:

> Dash bug https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097
> lets the shell erroneously perform field splitting on the expansion of a
> command substitution during declaration of a local variable.  It causes
> the parallel-checkout tests to fail e.g. when running them with
> /bin/dash on MacOS 11.4, where they error out like this:
>
>    ./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name
>
> That's because the output of wc -l contains leading spaces and the
> returned number of lines is treated as another variable to declare, i.e.
> as in "local workers= 0".
>
> Work around it by enclosing the command substitution in quotes.
>
> Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
> Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Changes since v2:
> - Use minimal fix.
> - New commit message.

Thanks.  As I said, I do not necessarily think this is conceptually
"minimal", but we use workers only once and without surrounding dq
so it is easy to see that this also is correct.

Will queue.

>
>  t/lib-parallel-checkout.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
> index 21f5759732..83b279a846 100644
> --- a/t/lib-parallel-checkout.sh
> +++ b/t/lib-parallel-checkout.sh
> @@ -27,7 +27,7 @@ test_checkout_workers () {
>  	rm -f "$trace_file" &&
>  	GIT_TRACE2="$(pwd)/$trace_file" "$@" 2>&8 &&
>
> -	local workers=$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l) &&
> +	local workers="$(grep "child_start\[..*\] git checkout--worker" "$trace_file" | wc -l)" &&
>  	test $workers -eq $expected_workers &&
>  	rm "$trace_file"
>  } 8>&2 2>&4
> --
> 2.31.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-06-06  1:43 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-05 12:27 [PATCH] parallel-checkout: use grep -c to count workers in tests René Scharfe
2021-06-05 14:31 ` Matheus Tavares Bernardino
2021-06-05 15:20   ` René Scharfe
2021-06-05 15:30     ` René Scharfe
2021-06-05 18:11 ` [PATCH v2] parallel-checkout: avoid dash local bug " René Scharfe
2021-06-05 19:09   ` SZEDER Gábor
2021-06-06  1:01     ` René Scharfe
2021-06-05 19:56   ` Ævar Arnfjörð Bjarmason
2021-06-05 22:17     ` Matheus Tavares Bernardino
2021-06-05 22:21       ` Matheus Tavares Bernardino
2021-06-06  1:01     ` René Scharfe
2021-06-06  1:28       ` Junio C Hamano
2021-06-06  1:01   ` [PATCH v3] " René Scharfe
2021-06-06  1:41     ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).