git@vger.kernel.org mailing list mirror (one of many)
* [PATCH] t5534: fix misleading grep invocation
@ 2017-07-05 11:37 Johannes Schindelin
  2017-07-05 16:26 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Johannes Schindelin @ 2017-07-05 11:37 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Michael J Gruber

It seems to be a little-known feature of `grep` (and it certainly came
as a surprise to this here developer, who believed he knew the Unix tools
pretty well) that multiple patterns can be passed in the same
command-line argument simply by separating them with newlines. Watch, and
learn:

	$ printf '1\n2\n3\n' | grep "$(printf '1\n3\n')"
	1
	3

That behavior also extends to patterns passed via `-e`, and it is not
modified by passing the option `-E` (but trying this with -P issues the
error "grep: the -P option only supports a single pattern").
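
A minimal sketch of the equivalence described above, assuming a
POSIX-conforming grep such as GNU grep (the variable names are only
illustrative). It compares one newline-separated pattern argument,
passed bare, via `-e`, and with `-E`, against two separate `-e` options:

```shell
# Three input lines and a single argument holding two patterns
# separated by a newline (the trailing newline is stripped by $(...)).
input=$(printf '1\n2\n3\n')
patterns=$(printf '1\n3\n')

# One argument with an embedded newline, in three variants.
one_arg=$(printf '%s\n' "$input" | grep "$patterns")
via_e=$(printf '%s\n' "$input" | grep -e "$patterns")
extended=$(printf '%s\n' "$input" | grep -E "$patterns")

# Two explicit patterns, for comparison.
two_e=$(printf '%s\n' "$input" | grep -e 1 -e 3)

# Prints the lines "1" and "3", as in the transcript above.
printf '%s\n' "$one_arg"
```

With such a grep, all four variables end up holding the same two lines.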

It seems that there are more old Unix hands who are surprised by this
behavior, as grep invocations of the form

	grep "$(git rev-parse A B) C" file

were introduced in a85b377d041 (push: the beginning of "git push
--signed", 2014-09-12), and later faithfully copy-edited in b9459019bbb
(push: heed user.signingkey for signed pushes, 2014-10-22).

Please note that the output of `git rev-parse A B` separates the object
IDs via *newlines*, not via spaces, and those newlines are preserved
because the interpolation is enclosed in double quotes.

As a consequence, these tests only validate that the file contains a
line with A's object ID, or a line with B's object ID followed by C, or
both. What the tests wanted to see, however, is a single line that
contains all of them.

The grep invocations in question therefore match too many lines.

Fix the test by avoiding the newlines in the patterns.
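
The difference can be reproduced without a repository. The sketch below
assumes a POSIX-conforming grep; `fake_rev_parse` and the `id-of-*`
strings are hypothetical stand-ins for `git rev-parse` and real object
IDs, and `cert` merely imitates two lines of a push certificate:

```shell
# Hypothetical stand-in for `git rev-parse A B`: one made-up ID per line.
fake_rev_parse () {
	for name in "$@"
	do
		printf 'id-of-%s\n' "$name"
	done
}

# Two lines of the form "<old> <new> <ref>".
cert='id-of-noop id-of-ff refs/heads/ff
id-of-noop id-of-noff refs/heads/noff'

# Old form: the newline between the two IDs survives inside the double
# quotes, so grep sees two alternative patterns:
#   "id-of-noop"  and  "id-of-ff refs/heads/ff"
old_count=$(printf '%s\n' "$cert" |
	grep -c "$(fake_rev_parse noop ff) refs/heads/ff")

# Fixed form: each ID is looked up separately and joined by a space,
# giving one pattern that has to match within a single line.
noop=$(fake_rev_parse noop) &&
ff=$(fake_rev_parse ff) &&
new_count=$(printf '%s\n' "$cert" |
	grep -c "$noop $ff refs/heads/ff")

printf 'old: %s, new: %s\n' "$old_count" "$new_count"
```

With these stand-ins the old form counts both lines (the second one
matches only because it contains the first ID), while the fixed form
counts just the intended line.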

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5534-push-signed.sh | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/t/t5534-push-signed.sh b/t/t5534-push-signed.sh
index 5bcb288f5c4..464ffdd147a 100755
--- a/t/t5534-push-signed.sh
+++ b/t/t5534-push-signed.sh
@@ -119,8 +119,11 @@ test_expect_success GPG 'signed push sends push certificate' '
 		sed -n -e "s/^nonce /NONCE=/p" -e "/^$/q" dst/push-cert
 	) >expect &&
 
-	grep "$(git rev-parse noop ff) refs/heads/ff" dst/push-cert &&
-	grep "$(git rev-parse noop noff) refs/heads/noff" dst/push-cert &&
+	noop=$(git rev-parse noop) &&
+	ff=$(git rev-parse ff) &&
+	noff=$(git rev-parse noff) &&
+	grep "$noop $ff refs/heads/ff" dst/push-cert &&
+	grep "$noop $noff refs/heads/noff" dst/push-cert &&
 	test_cmp expect dst/push-cert-status
 '
 
@@ -200,8 +203,11 @@ test_expect_success GPG 'fail without key and heed user.signingkey' '
 		sed -n -e "s/^nonce /NONCE=/p" -e "/^$/q" dst/push-cert
 	) >expect &&
 
-	grep "$(git rev-parse noop ff) refs/heads/ff" dst/push-cert &&
-	grep "$(git rev-parse noop noff) refs/heads/noff" dst/push-cert &&
+	noop=$(git rev-parse noop) &&
+	ff=$(git rev-parse ff) &&
+	noff=$(git rev-parse noff) &&
+	grep "$noop $ff refs/heads/ff" dst/push-cert &&
+	grep "$noop $noff refs/heads/noff" dst/push-cert &&
 	test_cmp expect dst/push-cert-status
 '
 

base-commit: 5116f791c12dda6b6c22fa85b600a8e30dfa168a
-- 
2.13.2.windows.1

Published-As: https://github.com/dscho/git/releases/tag/t5534-fix-grep-pattern-v1
Fetch-It-Via: git fetch https://github.com/dscho/git t5534-fix-grep-pattern-v1


* Re: [PATCH] t5534: fix misleading grep invocation
  2017-07-05 11:37 [PATCH] t5534: fix misleading grep invocation Johannes Schindelin
@ 2017-07-05 16:26 ` Junio C Hamano
  2017-07-06  9:20   ` Michael J Gruber
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2017-07-05 16:26 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Michael J Gruber

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> It seems to be a little-known feature of `grep` (and it certainly came
> as a surprise to this here developer who believed to know the Unix tools
> pretty well) that multiple patterns can be passed in the same
> command-line argument simply by separating them by newlines. Watch, and
> learn:
>
> 	$ printf '1\n2\n3\n' | grep "$(printf '1\n3\n')"
> 	1
> 	3
>
> That behavior also extends to patterns passed via `-e`, and it is not
> modified by passing the option `-E` (but trying this with -P issues the
> error "grep: the -P option only supports a single pattern").
>
> It seems that there are more old Unix hands who are surprised by this
> behavior, as grep invocations of the form
>
> 	grep "$(git rev-parse A B) C" file
>
> were introduced in a85b377d041 (push: the beginning of "git push
> --signed", 2014-09-12), and later faithfully copy-edited in b9459019bbb
> (push: heed user.signingkey for signed pushes, 2014-10-22).
>
> Please note that the output of `git rev-parse A B` separates the object
> IDs via *newlines*, not via spaces, and those newlines are preserved
> because the interpolation is enclosed in double quotes.
>
> As a consequence, these tests try to validate that the file contains
> either A's object ID, or B's object ID followed by C, or both. Clearly,
> however, what the test wanted to see is that there is a line that
> contains all of them.
>
> This is clearly unintended, and the grep invocations in question really
> match too many lines.
>
> Fix the test by avoiding the newlines in the patterns.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---

The invocation this fixes is not just misleading but simply wrong.
Nicely spotted.

Thanks, will queue.

>  t/t5534-push-signed.sh | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/t/t5534-push-signed.sh b/t/t5534-push-signed.sh
> index 5bcb288f5c4..464ffdd147a 100755
> --- a/t/t5534-push-signed.sh
> +++ b/t/t5534-push-signed.sh
> @@ -119,8 +119,11 @@ test_expect_success GPG 'signed push sends push certificate' '
>  		sed -n -e "s/^nonce /NONCE=/p" -e "/^$/q" dst/push-cert
>  	) >expect &&
>  
> -	grep "$(git rev-parse noop ff) refs/heads/ff" dst/push-cert &&
> -	grep "$(git rev-parse noop noff) refs/heads/noff" dst/push-cert &&
> +	noop=$(git rev-parse noop) &&
> +	ff=$(git rev-parse ff) &&
> +	noff=$(git rev-parse noff) &&
> +	grep "$noop $ff refs/heads/ff" dst/push-cert &&
> +	grep "$noop $noff refs/heads/noff" dst/push-cert &&
>  	test_cmp expect dst/push-cert-status
>  '
>  
> @@ -200,8 +203,11 @@ test_expect_success GPG 'fail without key and heed user.signingkey' '
>  		sed -n -e "s/^nonce /NONCE=/p" -e "/^$/q" dst/push-cert
>  	) >expect &&
>  
> -	grep "$(git rev-parse noop ff) refs/heads/ff" dst/push-cert &&
> -	grep "$(git rev-parse noop noff) refs/heads/noff" dst/push-cert &&
> +	noop=$(git rev-parse noop) &&
> +	ff=$(git rev-parse ff) &&
> +	noff=$(git rev-parse noff) &&
> +	grep "$noop $ff refs/heads/ff" dst/push-cert &&
> +	grep "$noop $noff refs/heads/noff" dst/push-cert &&
>  	test_cmp expect dst/push-cert-status
>  '
>  
>
> base-commit: 5116f791c12dda6b6c22fa85b600a8e30dfa168a


* Re: [PATCH] t5534: fix misleading grep invocation
  2017-07-05 16:26 ` Junio C Hamano
@ 2017-07-06  9:20   ` Michael J Gruber
  2017-07-06 16:23     ` Junio C Hamano
  2017-07-07 11:13     ` Johannes Schindelin
  0 siblings, 2 replies; 6+ messages in thread
From: Michael J Gruber @ 2017-07-06  9:20 UTC (permalink / raw)
  To: Junio C Hamano, Johannes Schindelin; +Cc: git

Junio C Hamano venit, vidit, dixit 05.07.2017 18:26:
> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
>> It seems to be a little-known feature of `grep` (and it certainly came
>> as a surprise to this here developer who believed to know the Unix tools
>> pretty well) that multiple patterns can be passed in the same
>> command-line argument simply by separating them by newlines. Watch, and
>> learn:
>>
>> 	$ printf '1\n2\n3\n' | grep "$(printf '1\n3\n')"
>> 	1
>> 	3
>>
>> That behavior also extends to patterns passed via `-e`, and it is not
>> modified by passing the option `-E` (but trying this with -P issues the
>> error "grep: the -P option only supports a single pattern").
>>
>> It seems that there are more old Unix hands who are surprised by this
>> behavior, as grep invocations of the form
>>
>> 	grep "$(git rev-parse A B) C" file
>>
>> were introduced in a85b377d041 (push: the beginning of "git push
>> --signed", 2014-09-12), and later faithfully copy-edited in b9459019bbb
>> (push: heed user.signingkey for signed pushes, 2014-10-22).
>>
>> Please note that the output of `git rev-parse A B` separates the object
>> IDs via *newlines*, not via spaces, and those newlines are preserved
>> because the interpolation is enclosed in double quotes.
>>
>> As a consequence, these tests try to validate that the file contains
>> either A's object ID, or B's object ID followed by C, or both. Clearly,
>> however, what the test wanted to see is that there is a line that
>> contains all of them.
>>
>> This is clearly unintended, and the grep invocations in question really
>> match too many lines.
>>
>> Fix the test by avoiding the newlines in the patterns.
>>
>> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>> ---
> 
> The invocation this fixes is not just misleading but simply wrong.
> Nicely spotted.

In addition, the patch makes sure to catch any rev-parse failures which
the original invocation shoved under the rug.
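
A small sketch of why that is, using a hypothetical `failing_lookup`
function in place of a `git rev-parse` call that fails. When a command
substitution is embedded in an argument, only the exit status of the
outer command counts; when it is the right-hand side of a plain
assignment, the assignment takes the substitution's exit status, so an
`&&` chain stops there:

```shell
# Hypothetical helper that fails the way a bad revision lookup would.
failing_lookup () {
	echo "fatal: bad revision" >&2
	return 1
}

# Embedded in an argument: the failure is lost, because the exit
# status is that of printf, which succeeds.
if printf '%s\n' "$(failing_lookup 2>/dev/null) refs/heads/ff" >/dev/null
then
	embedded=ignored
else
	embedded=caught
fi

# Plain assignment: the exit status is that of the substitution.
if value=$(failing_lookup 2>/dev/null)
then
	assigned=ignored
else
	assigned=caught
fi

printf 'embedded: %s, assigned: %s\n' "$embedded" "$assigned"
```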

> Thanks, will queue.

Thanks from the faithful copy-editor ;)

How did you spot this? Are there grep versions that behave differently?

>>  t/t5534-push-signed.sh | 14 ++++++++++----
>>  1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/t/t5534-push-signed.sh b/t/t5534-push-signed.sh
>> index 5bcb288f5c4..464ffdd147a 100755
>> --- a/t/t5534-push-signed.sh
>> +++ b/t/t5534-push-signed.sh
>> @@ -119,8 +119,11 @@ test_expect_success GPG 'signed push sends push certificate' '
>>  		sed -n -e "s/^nonce /NONCE=/p" -e "/^$/q" dst/push-cert
>>  	) >expect &&
>>  
>> -	grep "$(git rev-parse noop ff) refs/heads/ff" dst/push-cert &&
>> -	grep "$(git rev-parse noop noff) refs/heads/noff" dst/push-cert &&
>> +	noop=$(git rev-parse noop) &&
>> +	ff=$(git rev-parse ff) &&
>> +	noff=$(git rev-parse noff) &&
>> +	grep "$noop $ff refs/heads/ff" dst/push-cert &&
>> +	grep "$noop $noff refs/heads/noff" dst/push-cert &&
>>  	test_cmp expect dst/push-cert-status
>>  '
>>  
>> @@ -200,8 +203,11 @@ test_expect_success GPG 'fail without key and heed user.signingkey' '
>>  		sed -n -e "s/^nonce /NONCE=/p" -e "/^$/q" dst/push-cert
>>  	) >expect &&
>>  
>> -	grep "$(git rev-parse noop ff) refs/heads/ff" dst/push-cert &&
>> -	grep "$(git rev-parse noop noff) refs/heads/noff" dst/push-cert &&
>> +	noop=$(git rev-parse noop) &&
>> +	ff=$(git rev-parse ff) &&
>> +	noff=$(git rev-parse noff) &&
>> +	grep "$noop $ff refs/heads/ff" dst/push-cert &&
>> +	grep "$noop $noff refs/heads/noff" dst/push-cert &&
>>  	test_cmp expect dst/push-cert-status
>>  '
>>  
>>
>> base-commit: 5116f791c12dda6b6c22fa85b600a8e30dfa168a


* Re: [PATCH] t5534: fix misleading grep invocation
  2017-07-06  9:20   ` Michael J Gruber
@ 2017-07-06 16:23     ` Junio C Hamano
  2017-07-07 11:13     ` Johannes Schindelin
  1 sibling, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2017-07-06 16:23 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: Johannes Schindelin, git

Michael J Gruber <git@grubix.eu> writes:

> Junio C Hamano venit, vidit, dixit 05.07.2017 18:26:
>
>> The invocation this fixes is not just misleading but simply wrong.
>> Nicely spotted.
>
> In addition, the patch makes sure to catch any rev-parse failures which
> the original invocation shoved under the rug.

Yeah, good thing that this got fixed ;-)



* Re: [PATCH] t5534: fix misleading grep invocation
  2017-07-06  9:20   ` Michael J Gruber
  2017-07-06 16:23     ` Junio C Hamano
@ 2017-07-07 11:13     ` Johannes Schindelin
  2017-07-07 16:41       ` Junio C Hamano
  1 sibling, 1 reply; 6+ messages in thread
From: Johannes Schindelin @ 2017-07-07 11:13 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: Junio C Hamano, git


Hi Michael,

On Thu, 6 Jul 2017, Michael J Gruber wrote:

> Junio C Hamano venit, vidit, dixit 05.07.2017 18:26:
> > Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> > 
> >> It seems to be a little-known feature of `grep` (and it certainly came
> >> as a surprise to this here developer who believed to know the Unix tools
> >> pretty well) that multiple patterns can be passed in the same
> >> command-line argument simply by separating them by newlines. Watch, and
> >> learn:
> >>
> >> 	$ printf '1\n2\n3\n' | grep "$(printf '1\n3\n')"
> >> 	1
> >> 	3
> >>
> >> That behavior also extends to patterns passed via `-e`, and it is not
> >> modified by passing the option `-E` (but trying this with -P issues the
> >> error "grep: the -P option only supports a single pattern").
> >>
> >> It seems that there are more old Unix hands who are surprised by this
> >> behavior, as grep invocations of the form
> >>
> >> 	grep "$(git rev-parse A B) C" file
> >>
> >> were introduced in a85b377d041 (push: the beginning of "git push
> >> --signed", 2014-09-12), and later faithfully copy-edited in b9459019bbb
> >> (push: heed user.signingkey for signed pushes, 2014-10-22).
> >>
> >> Please note that the output of `git rev-parse A B` separates the object
> >> IDs via *newlines*, not via spaces, and those newlines are preserved
> >> because the interpolation is enclosed in double quotes.
> >>
> >> As a consequence, these tests try to validate that the file contains
> >> either A's object ID, or B's object ID followed by C, or both. Clearly,
> >> however, what the test wanted to see is that there is a line that
> >> contains all of them.
> >>
> >> This is clearly unintended, and the grep invocations in question really
> >> match too many lines.
>
> [...]
>
> How did you spot this? Are there grep versions that behave differently?

Yes, there are grep versions that behave differently... how did you guess?

I am in the middle of an extended investigation trying to assess how
feasible it would be to use a native Win32 port of BusyBox (started by
long-time Git contributor Nguyễn Thái Ngọc Duy) in Git for Windows to
execute the many, many remaining Unix shell scripts that are a core part
of Git (including crucial functionality such as bisect, rebase, stash and
submodule, for which we suffer portability and performance problems).

And it is BusyBox' grep that does not handle newlines in the pattern
argument to split it into two alternative patterns.

I first considered patching BusyBox to adhere to the expected behavior,
but then I looked closer and saw that the test's grep invocations actually
matched two lines instead of what I expected. An even closer look made me
suspect that the original intention was different from what the script
actually does, and for once I tried to be nice in my commit message.

Ciao,
Dscho


* Re: [PATCH] t5534: fix misleading grep invocation
  2017-07-07 11:13     ` Johannes Schindelin
@ 2017-07-07 16:41       ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2017-07-07 16:41 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Michael J Gruber, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Yes, there are grep versions that behave differently... how did you guess?
>
> I am in the middle of an extended investigation trying to assess how
> feasible it would be to use a native Win32 port of BusyBox (started by
> long-time Git contributor Nguyễn Thái Ngọc Duy) in Git for Windows to
> execute the many, many remaining Unix shell scripts that are a core part
> of Git (including crucial functionality such as bisect, rebase, stash and
> submodule, for which we suffer portability and performance problems).

I've long thought that BusyBox was primarily about size and not
about performance, but I can imagine that it would be a big win to
be able to run things like "mkdir" and "rm" without fork/exec, as
fork/exec is likely to be far more expensive than preparing and
actually making the system calls mkdir(2), unlink(2), etc.

Interesting.  I learned a new thing today, but apparently that
FEATURE_SH_NOFORK was not a very new development.  I do not think
anybody is crazy enough to attempt making Git a nofork applet,
though ;-)





end of thread

Thread overview: 6+ messages
2017-07-05 11:37 [PATCH] t5534: fix misleading grep invocation Johannes Schindelin
2017-07-05 16:26 ` Junio C Hamano
2017-07-06  9:20   ` Michael J Gruber
2017-07-06 16:23     ` Junio C Hamano
2017-07-07 11:13     ` Johannes Schindelin
2017-07-07 16:41       ` Junio C Hamano
