[PATCH] tests: make 'test_oid' print trailing newline

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH] tests: make 'test_oid' print trailing newline
@ 2022-12-18 16:29 SZEDER Gábor
  2022-12-19  0:48 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: SZEDER Gábor @ 2022-12-18 16:29 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, brian m. carlson, SZEDER Gábor

Unlike other test helper functions, 'test_oid' doesn't terminate its
output with a LF, but, alas, the reason for this, if any, is not
mentioned in 2c02b110da (t: add test functions to translate
hash-related values, 2018-09-13)).

Now, in the vast majority of cases 'test_oid' is invoked in a command
substitution that is part of a heredoc or supplies an argument to a
command or the value to a variable, and the command substitution would
chop off any trailing LFs, so in these cases the lack or presence of a
trailing LF in its output doesn't matter.  However:

  - There appear to be only three cases where 'test_oid' is not
    invoked in a command substitution:

      $ git grep '\stest_oid ' -- ':/t/*.sh'
      t0000-basic.sh:  test_oid zero >actual &&
      t0000-basic.sh:  test_oid zero >actual &&
      t0000-basic.sh:  test_oid zero >actual &&

    These are all in test cases checking that 'test_oid' actually
    works, and that the size of its output matches the size of the
    corresponding hash function with conditions like

      test $(wc -c <actual) -eq 40

    In these cases the lack of trailing LF does actually matter,
    though they could be trivially updated to account for the presence
    of a trailing LF.

  - There are also a few cases where the lack of trailing LF in
    'test_oid's output actually hurts, because tests need to compare
    its output with LF terminated file contents, forcing developers to
    invoke it as 'echo $(test_oid ...)' to append the missing LF:

      $ git grep 'echo "\?$(test_oid ' -- ':/t/*.sh'
      t1302-repo-version.sh:  echo $(test_oid version) >expect &&
      t1500-rev-parse.sh:     echo "$(test_oid algo)" >expect &&
      t4044-diff-index-unique-abbrev.sh:      echo "$(test_oid val1)" > foo &&
      t4044-diff-index-unique-abbrev.sh:      echo "$(test_oid val2)" > foo &&
      t5313-pack-bounds-checks.sh:    echo $(test_oid oidfff) >file &&

    And there is yet another similar case in an in-flight topic at:

      https://public-inbox.org/git/813e81a058227bd373cec802e443fcd677042fb4.1670862677.git.gitgitgadget@gmail.com/

Arguably we would be better off if 'test_oid' terminated its output
with a LF.  So let's update 'test_oid' accordingly, update its tests
in t0000 to account for the extra character in those size tests, and
remove the now unnecessary 'echo $(...)' command substitutions around
'test_oid' invocations as well.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 t/t0000-basic.sh                    | 7 ++++---
 t/t1302-repo-version.sh             | 2 +-
 t/t1500-rev-parse.sh                | 2 +-
 t/t4044-diff-index-unique-abbrev.sh | 4 ++--
 t/t5313-pack-bounds-checks.sh       | 2 +-
 t/test-lib-functions.sh             | 2 +-
 6 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index 502b4bcf9e..8ea31d187a 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -815,7 +815,8 @@ test_expect_success 'test_oid provides sane info by default' '
 	grep "^00*\$" actual &&
 	rawsz="$(test_oid rawsz)" &&
 	hexsz="$(test_oid hexsz)" &&
-	test "$hexsz" -eq $(wc -c <actual) &&
+	# +1 accounts for the trailing newline
+	test $(( $hexsz + 1)) -eq $(wc -c <actual) &&
 	test $(( $rawsz * 2)) -eq "$hexsz"
 '
 
@@ -826,7 +827,7 @@ test_expect_success 'test_oid can look up data for SHA-1' '
 	grep "^00*\$" actual &&
 	rawsz="$(test_oid rawsz)" &&
 	hexsz="$(test_oid hexsz)" &&
-	test $(wc -c <actual) -eq 40 &&
+	test $(wc -c <actual) -eq 41 &&
 	test "$rawsz" -eq 20 &&
 	test "$hexsz" -eq 40
 '
@@ -838,7 +839,7 @@ test_expect_success 'test_oid can look up data for SHA-256' '
 	grep "^00*\$" actual &&
 	rawsz="$(test_oid rawsz)" &&
 	hexsz="$(test_oid hexsz)" &&
-	test $(wc -c <actual) -eq 64 &&
+	test $(wc -c <actual) -eq 65 &&
 	test "$rawsz" -eq 32 &&
 	test "$hexsz" -eq 64
 '
diff --git a/t/t1302-repo-version.sh b/t/t1302-repo-version.sh
index 0acabb6d11..7cf80bf66a 100755
--- a/t/t1302-repo-version.sh
+++ b/t/t1302-repo-version.sh
@@ -27,7 +27,7 @@ test_expect_success 'setup' '
 '
 
 test_expect_success 'gitdir selection on normal repos' '
-	echo $(test_oid version) >expect &&
+	test_oid version >expect &&
 	git config core.repositoryformatversion >actual &&
 	git -C test config core.repositoryformatversion >actual2 &&
 	test_cmp expect actual &&
diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
index 81de584ea2..37ee5091b5 100755
--- a/t/t1500-rev-parse.sh
+++ b/t/t1500-rev-parse.sh
@@ -195,7 +195,7 @@ test_expect_success 'rev-parse --is-shallow-repository in non-shallow repo' '
 '
 
 test_expect_success 'rev-parse --show-object-format in repo' '
-	echo "$(test_oid algo)" >expect &&
+	test_oid algo >expect &&
 	git rev-parse --show-object-format >actual &&
 	test_cmp expect actual &&
 	git rev-parse --show-object-format=storage >actual &&
diff --git a/t/t4044-diff-index-unique-abbrev.sh b/t/t4044-diff-index-unique-abbrev.sh
index 29e49d2290..9f6043daba 100755
--- a/t/t4044-diff-index-unique-abbrev.sh
+++ b/t/t4044-diff-index-unique-abbrev.sh
@@ -34,12 +34,12 @@ test_expect_success 'setup' '
 	100644 blob $(test_oid hash2)	foo
 	EOF
 
-	echo "$(test_oid val1)" > foo &&
+	test_oid val1 > foo &&
 	git add foo &&
 	git commit -m "initial" &&
 	git cat-file -p HEAD: > actual &&
 	test_cmp expect_initial actual &&
-	echo "$(test_oid val2)" > foo &&
+	test_oid val2 > foo &&
 	git commit -a -m "update" &&
 	git cat-file -p HEAD: > actual &&
 	test_cmp expect_update actual
diff --git a/t/t5313-pack-bounds-checks.sh b/t/t5313-pack-bounds-checks.sh
index cc4cfaa9d3..ceaa6700a2 100755
--- a/t/t5313-pack-bounds-checks.sh
+++ b/t/t5313-pack-bounds-checks.sh
@@ -59,7 +59,7 @@ test_expect_success 'setup' '
 test_expect_success 'set up base packfile and variables' '
 	# the hash of this content starts with ff, which
 	# makes some later computations much simpler
-	echo $(test_oid oidfff) >file &&
+	test_oid oidfff >file &&
 	git add file &&
 	git commit -m base &&
 	git repack -ad &&
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 796093a7b3..f51b97663f 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1682,7 +1682,7 @@ test_oid () {
 	then
 		BUG "undefined key '$1'"
 	fi &&
-	eval "printf '%s' \"\${$var}\""
+	eval "printf '%s\n' \"\${$var}\""
 }
 
 # Insert a slash into an object ID so it can be used to reference a location
-- 
2.39.0.269.g5ff869c7c0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] tests: make 'test_oid' print trailing newline
  2022-12-18 16:29 [PATCH] tests: make 'test_oid' print trailing newline SZEDER Gábor
@ 2022-12-19  0:48 ` Junio C Hamano
  2022-12-22 18:58   ` SZEDER Gábor
  2022-12-19 14:03 ` brian m. carlson
  2022-12-19 15:27 ` Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2022-12-19  0:48 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, brian m. carlson

SZEDER Gábor <szeder.dev@gmail.com> writes:

> Unlike other test helper functions, 'test_oid' doesn't terminate its
> output with a LF, but, alas, the reason for this, if any, is not
> mentioned in 2c02b110da (t: add test functions to translate
> hash-related values, 2018-09-13)).

I (obviously) agree with the analysis in the proposed log message.
Having to touch the sanity checking tests in 0-basic is a bit
annoying, but having to do an artificial echo is even more annoying,
so these changes may probably be a net win, I would say.

Having said that.

>       $ git grep '\stest_oid ' -- ':/t/*.sh'
>       $ git grep 'echo "\?$(test_oid ' -- ':/t/*.sh'

I found these examples in the log message a bit annoying to see, as
both invite an undefined behaviour by having an ordinary character
('s' or '?')  preceded by an unescaped backslash in a POSIXly
correct implementation of BRE.  GNU libc seems to be OK with it (I
double checked by adding "-G" on the command line to make sure my
experiments are not affected by any grep.patterntype), but they may
fail for folks on stricter platforms.

Thanks.  Will queue.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tests: make 'test_oid' print trailing newline
  2022-12-18 16:29 [PATCH] tests: make 'test_oid' print trailing newline SZEDER Gábor
  2022-12-19  0:48 ` Junio C Hamano
@ 2022-12-19 14:03 ` brian m. carlson
  2022-12-19 15:27 ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 7+ messages in thread
From: brian m. carlson @ 2022-12-19 14:03 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, Junio C Hamano

[-- Attachment #1: Type: text/plain, Size: 825 bytes --]

On 2022-12-18 at 16:29:05, SZEDER Gábor wrote:
> Arguably we would be better off if 'test_oid' terminated its output
> with a LF.  So let's update 'test_oid' accordingly, update its tests
> in t0000 to account for the extra character in those size tests, and
> remove the now unnecessary 'echo $(...)' command substitutions around
> 'test_oid' invocations as well.

I don't recall that there was a particular reason for me to do it the
way that it was, and if the commit message doesn't mention it, I think
it's fine to replace it.  Perhaps I intended to allow writing the binary
form as well as the text form, in which case a newline would be
undesirable, but I simply don't recall.

All that to say, I think this patch is fine as it stands.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tests: make 'test_oid' print trailing newline
  2022-12-18 16:29 [PATCH] tests: make 'test_oid' print trailing newline SZEDER Gábor
  2022-12-19  0:48 ` Junio C Hamano
  2022-12-19 14:03 ` brian m. carlson
@ 2022-12-19 15:27 ` Ævar Arnfjörð Bjarmason
  2022-12-19 23:58   ` Junio C Hamano
  2 siblings, 1 reply; 7+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-19 15:27 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, Junio C Hamano, brian m. carlson


On Sun, Dec 18 2022, SZEDER Gábor wrote:

> Unlike other test helper functions, 'test_oid' doesn't terminate its
> output with a LF, but, alas, the reason for this, if any, is not
> mentioned in 2c02b110da (t: add test functions to translate
> hash-related values, 2018-09-13)).
>
> Now, in the vast majority of cases 'test_oid' is invoked in a command
> substitution that is part of a heredoc or supplies an argument to a
> command or the value to a variable, and the command substitution would
> chop off any trailing LFs, so in these cases the lack or presence of a
> trailing LF in its output doesn't matter.  However:
>
>   - There appear to be only three cases where 'test_oid' is not
>     invoked in a command substitution:
>
>       $ git grep '\stest_oid ' -- ':/t/*.sh'
>       t0000-basic.sh:  test_oid zero >actual &&
>       t0000-basic.sh:  test_oid zero >actual &&
>       t0000-basic.sh:  test_oid zero >actual &&
>
>     These are all in test cases checking that 'test_oid' actually
>     works, and that the size of its output matches the size of the
>     corresponding hash function with conditions like
>
>       test $(wc -c <actual) -eq 40
>
>     In these cases the lack of trailing LF does actually matter,
>     though they could be trivially updated to account for the presence
>     of a trailing LF.
>
>   - There are also a few cases where the lack of trailing LF in
>     'test_oid's output actually hurts, because tests need to compare
>     its output with LF terminated file contents, forcing developers to
>     invoke it as 'echo $(test_oid ...)' to append the missing LF:
>
>       $ git grep 'echo "\?$(test_oid ' -- ':/t/*.sh'
>       t1302-repo-version.sh:  echo $(test_oid version) >expect &&
>       t1500-rev-parse.sh:     echo "$(test_oid algo)" >expect &&
>       t4044-diff-index-unique-abbrev.sh:      echo "$(test_oid val1)" > foo &&
>       t4044-diff-index-unique-abbrev.sh:      echo "$(test_oid val2)" > foo &&
>       t5313-pack-bounds-checks.sh:    echo $(test_oid oidfff) >file &&
>
>     And there is yet another similar case in an in-flight topic at:
>
>       https://public-inbox.org/git/813e81a058227bd373cec802e443fcd677042fb4.1670862677.git.gitgitgadget@gmail.com/
>
> Arguably we would be better off if 'test_oid' terminated its output
> with a LF.  So let's update 'test_oid' accordingly, update its tests
> in t0000 to account for the extra character in those size tests, and
> remove the now unnecessary 'echo $(...)' command substitutions around
> 'test_oid' invocations as well.

I'm inclined to like this, as it certainly makes some examples better,
but e.g. here:

> @@ -826,7 +827,7 @@ test_expect_success 'test_oid can look up data for SHA-1' '
>  	grep "^00*\$" actual &&
>  	rawsz="$(test_oid rawsz)" &&
>  	hexsz="$(test_oid hexsz)" &&
> -	test $(wc -c <actual) -eq 40 &&
> +	test $(wc -c <actual) -eq 41 &&
>  	test "$rawsz" -eq 20 &&
>  	test "$hexsz" -eq 40
> [...]
>  '
> @@ -838,7 +839,7 @@ test_expect_success 'test_oid can look up data for SHA-256' '
>  	grep "^00*\$" actual &&
>  	rawsz="$(test_oid rawsz)" &&
>  	hexsz="$(test_oid hexsz)" &&
> -	test $(wc -c <actual) -eq 64 &&
> +	test $(wc -c <actual) -eq 65 &&
>  	test "$rawsz" -eq 32 &&
>  	test "$hexsz" -eq 64


If we have sibling tests we really should try to make these
consistent. These are still understandable, but it's rather annoying
that we aren't consistent here. I.e. we have 64 changed to 65, but not
32 to 33 etc.

I also vaguely recall (although probably nobody worries about such a
platform anymore) that POSIX utilities left themselves room to not work
on things that weren't \n-terminated.

>  test_expect_success 'gitdir selection on normal repos' '
> -	echo $(test_oid version) >expect &&
> +	test_oid version >expect &&
>  	git config core.repositoryformatversion >actual &&
>  	git -C test config core.repositoryformatversion >actual2 &&
>  	test_cmp expect actual &&
> diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
> index 81de584ea2..37ee5091b5 100755
> --- a/t/t1500-rev-parse.sh
> +++ b/t/t1500-rev-parse.sh
> @@ -195,7 +195,7 @@ test_expect_success 'rev-parse --is-shallow-repository in non-shallow repo' '
>  '
>  
>  test_expect_success 'rev-parse --show-object-format in repo' '
> -	echo "$(test_oid algo)" >expect &&
> +	test_oid algo >expect &&
>  	git rev-parse --show-object-format >actual &&
>  	test_cmp expect actual &&
>  	git rev-parse --show-object-format=storage >actual &&

This sort of thing is much nicer though, so maybe it's all worth it...

I wonder though if we shouldn't just have a test_cmp_oid, which would
abstract this away, and not care if it's \n-terminated or not...

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tests: make 'test_oid' print trailing newline
  2022-12-19 15:27 ` Ævar Arnfjörð Bjarmason
@ 2022-12-19 23:58   ` Junio C Hamano
  0 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2022-12-19 23:58 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: SZEDER Gábor, git, brian m. carlson

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> @@ -826,7 +827,7 @@ test_expect_success 'test_oid can look up data for SHA-1' '
>>  	grep "^00*\$" actual &&
> ...
> I also vaguely recall (although probably nobody worries about such a
> platform anymore) that POSIX utilities left themselves room to not work
> on things that weren't \n-terminated.

[jc: totally irrelevant curiosity-hunt]

I think you have in mind the combination of these.

 * "3.195 Incomplete Line" defines an incomple line as "A sequence
   of one or more non-<newline> characters at the end of the file."

 * "3.403 Text File" defines a text file to be "A file that contains
   characters organized into zero or more lines. ... many utilities
   only produce predictable or meaningful output when operating on
   text files".

 * "INPUT FILES" section of "grep" for example says "The input files
   shall be text files".

It may look unclear if "an incomplete line" is supposed to be a
"line", and if it is not, then the output from "test_oid zero" we
are grepping in in the above snippet is not a "text file".

The "INPUT FILES" section of "sort" states something interesting.

    The input files shall be text files, except that the sort
    utility shall add a <newline> to the end of a file ending with
    an incomplete last line.

Why is this interesting?  Because it smells like it is clarifying
whether it makes file a text to end in an incomplete line, but it
does not do any such thing ;-)  You can read it in two ways:

 * You must feed text files to "sort", but if you did feed a file
   that ends with an incomplete line, the utility adds <newline> at
   the end, which makes it a text, so the inputs to the utilities
   all becomes "text".

   Under this reading, you can as an exception feed a non-text file
   to the utility, as long as its non-text-ness is limited to ending
   with an incomplete line.  So, a file that ends with an incomplete
   line is *not* text.

 * You must feed text files to "sort", and an text file that ends
   with an incomplete line gains terminating <newline> at the end of
   that last line, so a hit on that line will be shown, terminated
   with <newline>, just like a hit on any other line.

   Under this reading, a text file may or may not end with an
   incomplete line, so a file that ends with an incomplete line is
   text.

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html

So I dunno.

In any case, I think it is a good practice to avoid having to worry
about how the standard utilities would behave by making sure our
lines are complete.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tests: make 'test_oid' print trailing newline
  2022-12-19  0:48 ` Junio C Hamano
@ 2022-12-22 18:58   ` SZEDER Gábor
  2022-12-23  0:56     ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: SZEDER Gábor @ 2022-12-22 18:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, brian m. carlson

On Mon, Dec 19, 2022 at 09:48:49AM +0900, Junio C Hamano wrote:
> SZEDER Gábor <szeder.dev@gmail.com> writes:
> >       $ git grep '\stest_oid ' -- ':/t/*.sh'
> >       $ git grep 'echo "\?$(test_oid ' -- ':/t/*.sh'
> 
> I found these examples in the log message a bit annoying to see, as
> both invite an undefined behaviour by having an ordinary character
> ('s' or '?')  preceded by an unescaped backslash in a POSIXly
> correct implementation of BRE.  GNU libc seems to be OK with it (I
> double checked by adding "-G" on the command line to make sure my
> experiments are not affected by any grep.patterntype), but they may
> fail for folks on stricter platforms.

Please feel free to amend the commit message as you see fit.  Usually
I would do that myself as I'm rather picky of my commit messages, but,
alas, I'm not versed in portability issues of regexes, so I'm not sure
what the right regexes would be.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tests: make 'test_oid' print trailing newline
  2022-12-22 18:58   ` SZEDER Gábor
@ 2022-12-23  0:56     ` Junio C Hamano
  0 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2022-12-23  0:56 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: git, brian m. carlson

SZEDER Gábor <szeder.dev@gmail.com> writes:

> On Mon, Dec 19, 2022 at 09:48:49AM +0900, Junio C Hamano wrote:
>> SZEDER Gábor <szeder.dev@gmail.com> writes:
>> >       $ git grep '\stest_oid ' -- ':/t/*.sh'
>> >       $ git grep 'echo "\?$(test_oid ' -- ':/t/*.sh'
>>  ...
>> I found these examples in the log message a bit annoying to see, as
>> experiments are not affected by any grep.patterntype), but they may
>> fail for folks on stricter platforms.
>
> Please feel free to amend the commit message as you see fit.  Usually
> I would do that myself as I'm rather picky of my commit messages, but,
> alas, I'm not versed in portability issues of regexes, so I'm not sure
> what the right regexes would be.

I guess ERE would give us enough expressiveness to say "zero or one"
without relying on GNU extension.  Saying "Any whitespace" concisely
as "\s" would require PCRE (i.e. "grep -P") but because use of it is
optional, the best we could do is "[ ]" (in the [bracket]), one is
TAB and the other is SPACE).

But reading the message again, they are what the author of the patch
did to observe the current codebase, so I think being faithful to
what you did would be fine ;-)

In any case, thanks for cleaning it up.



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-12-23  0:56 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-18 16:29 [PATCH] tests: make 'test_oid' print trailing newline SZEDER Gábor
2022-12-19  0:48 ` Junio C Hamano
2022-12-22 18:58   ` SZEDER Gábor
2022-12-23  0:56     ` Junio C Hamano
2022-12-19 14:03 ` brian m. carlson
2022-12-19 15:27 ` Ævar Arnfjörð Bjarmason
2022-12-19 23:58   ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).