[PATCH 00/10] Hash-independent tests (part 3)

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH 00/10] Hash-independent tests (part 3)
@ 2018-06-04 23:52 brian m. carlson
  2018-06-04 23:52 ` [PATCH 01/10] t: add tool to translate hash-related values brian m. carlson
                   ` (9 more replies)
  0 siblings, 10 replies; 26+ messages in thread
From: brian m. carlson @ 2018-06-04 23:52 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, Torsten Bögershausen,
	Jeff King, René Scharfe

This is next in the series of improvements to make tests
hash-independent.

Introduced here is a test helper, test_translate, that allows lookups in
tests based on the hash algorithm in use.  Alternatives are either
specified inline (the former for SHA-1, the latter for NewHash), or
(more commonly) from a file based on a key.  Some basic examples of
translations are provided and used throughout the tests.

The ultimate idea is that tests can simply drop a file into the
translation directory and use their own test-specific translations if
convenient.

This series (and subsequent series) attempt to standardize slightly on
our use of invalid object IDs in tests.  Since the actual invalid IDs
aren't usually very important, the translations I've made using the
tables aren't necessarily entirely faithful: we will sometimes use
different SHA-1 object IDs than before, but we substitute only invalid
object IDs for different invalid ones.

It's likely in the future that test_translate will support additional
options depending on whether we want input, output, or internal formats.

I had mentioned in a previous comment that a given test would be
included in "the next series" (this one), but I redid the series and
decided to split it into smaller pieces, so it isn't included.  Sorry.

Comments on any aspect of the series are welcome, but thoughts on design
and naming would be especially valuable.

brian m. carlson (10):
  t: add tool to translate hash-related values
  t0000: use hash translation table
  t0002: abstract away SHA-1-specific constants
  t0027: use $ZERO_OID
  t0064: make hash size independent
  t1006: make hash size independent
  t1400: switch hard-coded object ID to variable
  t1405: make hash size independent
  t1406: make hash-size independent
  t1407: make hash size independent

 t/t0000-basic.sh               | 13 ++++-----
 t/t0002-gitfile.sh             | 26 +++++++++---------
 t/t0027-auto-crlf.sh           | 14 +++++-----
 t/t0064-sha1-array.sh          | 49 +++++++++++++++++++---------------
 t/t1006-cat-file.sh            |  4 +--
 t/t1400-update-ref.sh          |  2 +-
 t/t1405-main-ref-store.sh      |  4 +--
 t/t1406-submodule-ref-store.sh |  6 ++---
 t/t1407-worktree-ref-store.sh  |  4 +--
 t/test-lib-functions.sh        | 40 +++++++++++++++++++++++++++
 t/translate/hash-info          |  9 +++++++
 t/translate/oid                | 15 +++++++++++
 12 files changed, 129 insertions(+), 57 deletions(-)
 create mode 100644 t/translate/hash-info
 create mode 100644 t/translate/oid

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 01/10] t: add tool to translate hash-related values
  2018-06-04 23:52 [PATCH 00/10] Hash-independent tests (part 3) brian m. carlson
@ 2018-06-04 23:52 ` brian m. carlson
  2018-06-06  6:19   ` Torsten Bögershausen
  2018-06-11  7:47   ` Eric Sunshine
  2018-06-04 23:52 ` [PATCH 02/10] t0000: use hash translation table brian m. carlson
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 26+ messages in thread
From: brian m. carlson @ 2018-06-04 23:52 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, Torsten Bögershausen,
	Jeff King, René Scharfe

Add a test function helper, test_translate, that will produce its first
argument if the hash in use is SHA-1 and the second if its argument is
NewHash.  Implement a mode that can read entries from a file as well for
reusability across tests.

For the moment, use the length of the empty blob to determine the hash
in use.  In the future, we can change this code so that it can use the
configuration and learn about the difference in input, output, and
on-disk formats.

Implement two basic lookup charts, one for common invalid or synthesized
object IDs, and one for various facts about the hash function in use.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 t/test-lib-functions.sh | 40 ++++++++++++++++++++++++++++++++++++++++
 t/translate/hash-info   |  9 +++++++++
 t/translate/oid         | 15 +++++++++++++++
 3 files changed, 64 insertions(+)
 create mode 100644 t/translate/hash-info
 create mode 100644 t/translate/oid

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 2b2181dca0..0e7067460b 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1147,3 +1147,43 @@ depacketize () {
 		}
 	'
 }
+
+test_translate_f_ () {
+	local file="$TEST_DIRECTORY/translate/$2" &&
+	perl -e '
+		$delim = "\t";
+		($hoidlen, $file, $arg) = @ARGV;
+		open($fh, "<", $file) or die "open: $!";
+		while (<$fh>) {
+			# Allow specifying other delimiters.
+			$delim = $1 if /^#!\sdelimiter\s(.)/;
+			next if /^#/;
+			@fields = split /$delim/, $_, 3;
+			if ($fields[0] eq $arg) {
+				print($hoidlen == 40 ? $fields[1] : $fields[2]);
+				last;
+			}
+		}
+	' "$1" "$file" "$3"
+}
+
+# Without -f, print the first argument if we are using SHA-1 and the second if
+# we're using NewHash.
+# With -f FILE ARG, read the (by default) tab-delimited file from
+# t/translate/FILE, finding the first field matching ARG and printing either the
+# second or third field depending on the hash in use.
+test_translate () {
+	local hoidlen=$(printf "%s" "$EMPTY_BLOB" | wc -c) &&
+	if [ "$1" = "-f" ]
+	then
+		shift &&
+		test_translate_f_ "$hoidlen" "$@"
+	else
+		if [ "$hoidlen" -eq 40 ]
+		then
+			printf "%s" "$1"
+		else
+			printf "%s" "$2"
+		fi
+	fi
+}
diff --git a/t/translate/hash-info b/t/translate/hash-info
new file mode 100644
index 0000000000..36cbd9a8eb
--- /dev/null
+++ b/t/translate/hash-info
@@ -0,0 +1,9 @@
+# Various facts about the hash algorithm in use for easy access in tests.
+#
+# Several aliases are provided for easy recall.
+rawsz	20	32
+oidlen	20	32
+hexsz	40	64
+hexoidlen	40	64
+zero	0000000000000000000000000000000000000000	0000000000000000000000000000000000000000000000000000000000000000
+zero-oid	0000000000000000000000000000000000000000	0000000000000000000000000000000000000000000000000000000000000000
diff --git a/t/translate/oid b/t/translate/oid
new file mode 100644
index 0000000000..8de0fd64af
--- /dev/null
+++ b/t/translate/oid
@@ -0,0 +1,15 @@
+# These are some common invalid and partial object IDs used in tests.
+001	0000000000000000000000000000000000000001	0000000000000000000000000000000000000000000000000000000000000001
+002	0000000000000000000000000000000000000002	0000000000000000000000000000000000000000000000000000000000000002
+003	0000000000000000000000000000000000000003	0000000000000000000000000000000000000000000000000000000000000003
+004	0000000000000000000000000000000000000004	0000000000000000000000000000000000000000000000000000000000000004
+005	0000000000000000000000000000000000000005	0000000000000000000000000000000000000000000000000000000000000005
+006	0000000000000000000000000000000000000006	0000000000000000000000000000000000000000000000000000000000000006
+007	0000000000000000000000000000000000000007	0000000000000000000000000000000000000000000000000000000000000007
+# All zeros or Fs missing one or two hex segments.
+zero-1	000000000000000000000000000000000000000	000000000000000000000000000000000000000000000000000000000000000
+zero-2	00000000000000000000000000000000000000	00000000000000000000000000000000000000000000000000000000000000
+ff-1	fffffffffffffffffffffffffffffffffffffff	fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+ff-2	ffffffffffffffffffffffffffffffffffffff	ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
+numeric	0123456789012345678901234567890123456789	0123456789012345678901234567890123456789012345678901234567890123
+deadbeef	deadbeefdeadbeefdeadbeefdeadbeefdeadbeef	deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 02/10] t0000: use hash translation table
  2018-06-04 23:52 [PATCH 00/10] Hash-independent tests (part 3) brian m. carlson
  2018-06-04 23:52 ` [PATCH 01/10] t: add tool to translate hash-related values brian m. carlson
@ 2018-06-04 23:52 ` brian m. carlson
  2018-06-04 23:52 ` [PATCH 03/10] t0002: abstract away SHA-1-specific constants brian m. carlson
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: brian m. carlson @ 2018-06-04 23:52 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, Torsten Bögershausen,
	Jeff King, René Scharfe

If the hash we're using is 32 bytes in size, attempting to insert a
20-byte object name won't work.  Since these are synthesized objects
that are almost all zeros, look them up in a translation table.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 t/t0000-basic.sh | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index af61d083b4..27ef9ecab2 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -978,12 +978,13 @@ test_expect_success SHA1 'validate object ID for a known tree' '
 
 test_expect_success 'put invalid objects into the index' '
 	rm -f .git/index &&
-	cat >badobjects <<-\EOF &&
-	100644 blob 1000000000000000000000000000000000000000	dir/file1
-	100644 blob 2000000000000000000000000000000000000000	dir/file2
-	100644 blob 3000000000000000000000000000000000000000	dir/file3
-	100644 blob 4000000000000000000000000000000000000000	dir/file4
-	100644 blob 5000000000000000000000000000000000000000	dir/file5
+	suffix=$(echo $ZERO_OID | sed -e "s/^.//") &&
+	cat >badobjects <<-EOF &&
+	100644 blob $(test_translate -f oid 001)	dir/file1
+	100644 blob $(test_translate -f oid 002)	dir/file2
+	100644 blob $(test_translate -f oid 003)	dir/file3
+	100644 blob $(test_translate -f oid 004)	dir/file4
+	100644 blob $(test_translate -f oid 005)	dir/file5
 	EOF
 	git update-index --index-info <badobjects
 '

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 03/10] t0002: abstract away SHA-1-specific constants
  2018-06-04 23:52 [PATCH 00/10] Hash-independent tests (part 3) brian m. carlson
  2018-06-04 23:52 ` [PATCH 01/10] t: add tool to translate hash-related values brian m. carlson
  2018-06-04 23:52 ` [PATCH 02/10] t0000: use hash translation table brian m. carlson
@ 2018-06-04 23:52 ` brian m. carlson
  2018-06-11  7:59   ` Eric Sunshine
  2018-06-04 23:52 ` [PATCH 04/10] t0027: use $ZERO_OID brian m. carlson
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: brian m. carlson @ 2018-06-04 23:52 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, Torsten Bögershausen,
	Jeff King, René Scharfe

Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 t/t0002-gitfile.sh | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/t/t0002-gitfile.sh b/t/t0002-gitfile.sh
index 3691023d51..020af4c53c 100755
--- a/t/t0002-gitfile.sh
+++ b/t/t0002-gitfile.sh
@@ -89,14 +89,16 @@ test_expect_success 'enter_repo non-strict mode' '
 		cd enter_repo &&
 		test_tick &&
 		test_commit foo &&
+		git rev-parse HEAD >head-revision &&
 		mv .git .realgit &&
 		echo "gitdir: .realgit" >.git
 	) &&
+	head=$(cat enter_repo/head-revision) &&
 	git ls-remote enter_repo >actual &&
-	cat >expected <<-\EOF &&
-	946e985ab20de757ca5b872b16d64e92ff3803a9	HEAD
-	946e985ab20de757ca5b872b16d64e92ff3803a9	refs/heads/master
-	946e985ab20de757ca5b872b16d64e92ff3803a9	refs/tags/foo
+	cat >expected <<-EOF &&
+	$head	HEAD
+	$head	refs/heads/master
+	$head	refs/tags/foo
 	EOF
 	test_cmp expected actual
 '
@@ -107,20 +109,20 @@ test_expect_success 'enter_repo linked checkout' '
 		git worktree add  ../foo refs/tags/foo
 	) &&
 	git ls-remote foo >actual &&
-	cat >expected <<-\EOF &&
-	946e985ab20de757ca5b872b16d64e92ff3803a9	HEAD
-	946e985ab20de757ca5b872b16d64e92ff3803a9	refs/heads/master
-	946e985ab20de757ca5b872b16d64e92ff3803a9	refs/tags/foo
+	cat >expected <<-EOF &&
+	$head	HEAD
+	$head	refs/heads/master
+	$head	refs/tags/foo
 	EOF
 	test_cmp expected actual
 '
 
 test_expect_success 'enter_repo strict mode' '
 	git ls-remote --upload-pack="git upload-pack --strict" foo/.git >actual &&
-	cat >expected <<-\EOF &&
-	946e985ab20de757ca5b872b16d64e92ff3803a9	HEAD
-	946e985ab20de757ca5b872b16d64e92ff3803a9	refs/heads/master
-	946e985ab20de757ca5b872b16d64e92ff3803a9	refs/tags/foo
+	cat >expected <<-EOF &&
+	$head	HEAD
+	$head	refs/heads/master
+	$head	refs/tags/foo
 	EOF
 	test_cmp expected actual
 '

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 04/10] t0027: use $ZERO_OID
  2018-06-04 23:52 [PATCH 00/10] Hash-independent tests (part 3) brian m. carlson
                   ` (2 preceding siblings ...)
  2018-06-04 23:52 ` [PATCH 03/10] t0002: abstract away SHA-1-specific constants brian m. carlson
@ 2018-06-04 23:52 ` brian m. carlson
  2018-06-06  7:02   ` Torsten Bögershausen
  2018-06-04 23:52 ` [PATCH 05/10] t0064: make hash size independent brian m. carlson
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: brian m. carlson @ 2018-06-04 23:52 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, Torsten Bögershausen,
	Jeff King, René Scharfe

Use the ZERO_OID variable to express the all-zeros object ID so that it
works with hash algorithms of all sizes.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 t/t0027-auto-crlf.sh | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
index beb5927f77..14fcd3f49f 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -371,13 +371,13 @@ test_expect_success 'setup master' '
 	git checkout -b master &&
 	git add .gitattributes &&
 	git commit -m "add .gitattributes" . &&
-	printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONE\nLINETWO\nLINETHREE"     >LF &&
-	printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONE\r\nLINETWO\r\nLINETHREE" >CRLF &&
-	printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONE\r\nLINETWO\nLINETHREE"   >CRLF_mix_LF &&
-	printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONE\nLINETWO\rLINETHREE"     >LF_mix_CR &&
-	printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONE\r\nLINETWO\rLINETHREE"   >CRLF_mix_CR &&
-	printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONEQ\r\nLINETWO\r\nLINETHREE" | q_to_nul >CRLF_nul &&
-	printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONEQ\nLINETWO\nLINETHREE" | q_to_nul >LF_nul &&
+	printf "\$Id: $ZERO_OID \$\nLINEONE\nLINETWO\nLINETHREE"     >LF &&
+	printf "\$Id: $ZERO_OID \$\r\nLINEONE\r\nLINETWO\r\nLINETHREE" >CRLF &&
+	printf "\$Id: $ZERO_OID \$\nLINEONE\r\nLINETWO\nLINETHREE"   >CRLF_mix_LF &&
+	printf "\$Id: $ZERO_OID \$\nLINEONE\nLINETWO\rLINETHREE"     >LF_mix_CR &&
+	printf "\$Id: $ZERO_OID \$\r\nLINEONE\r\nLINETWO\rLINETHREE"   >CRLF_mix_CR &&
+	printf "\$Id: $ZERO_OID \$\r\nLINEONEQ\r\nLINETWO\r\nLINETHREE" | q_to_nul >CRLF_nul &&
+	printf "\$Id: $ZERO_OID \$\nLINEONEQ\nLINETWO\nLINETHREE" | q_to_nul >LF_nul &&
 	create_NNO_MIX_files &&
 	git -c core.autocrlf=false add NNO_*.txt MIX_*.txt &&
 	git commit -m "mixed line endings" &&

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 05/10] t0064: make hash size independent
  2018-06-04 23:52 [PATCH 00/10] Hash-independent tests (part 3) brian m. carlson
                   ` (3 preceding siblings ...)
  2018-06-04 23:52 ` [PATCH 04/10] t0027: use $ZERO_OID brian m. carlson
@ 2018-06-04 23:52 ` brian m. carlson
  2018-06-11  8:09   ` Eric Sunshine
  2018-06-04 23:52 ` [PATCH 06/10] t1006: " brian m. carlson
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: brian m. carlson @ 2018-06-04 23:52 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, Torsten Bögershausen,
	Jeff King, René Scharfe

Compute test values of the appropriate size instead of hard-coding
40-character values.  Rename the echo20 function to echoid, since the
values may be of varying sizes.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 t/t0064-sha1-array.sh | 49 ++++++++++++++++++++++++-------------------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/t/t0064-sha1-array.sh b/t/t0064-sha1-array.sh
index 67484502a0..ab230179d2 100755
--- a/t/t0064-sha1-array.sh
+++ b/t/t0064-sha1-array.sh
@@ -3,30 +3,30 @@
 test_description='basic tests for the SHA1 array implementation'
 . ./test-lib.sh
 
-echo20 () {
+echoid () {
 	prefix="${1:+$1 }"
 	shift
 	while test $# -gt 0
 	do
-		echo "$prefix$1$1$1$1$1$1$1$1$1$1$1$1$1$1$1$1$1$1$1$1"
+		echo "$prefix$ZERO_OID" | sed -e "s/00/$1/g"
 		shift
 	done
 }
 
 test_expect_success 'ordered enumeration' '
-	echo20 "" 44 55 88 aa >expect &&
+	echoid "" 44 55 88 aa >expect &&
 	{
-		echo20 append 88 44 aa 55 &&
+		echoid append 88 44 aa 55 &&
 		echo for_each_unique
 	} | test-tool sha1-array >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'ordered enumeration with duplicate suppression' '
-	echo20 "" 44 55 88 aa >expect &&
+	echoid "" 44 55 88 aa >expect &&
 	{
-		echo20 append 88 44 aa 55 &&
-		echo20 append 88 44 aa 55 &&
+		echoid append 88 44 aa 55 &&
+		echoid append 88 44 aa 55 &&
 		echo for_each_unique
 	} | test-tool sha1-array >actual &&
 	test_cmp expect actual
@@ -34,8 +34,8 @@ test_expect_success 'ordered enumeration with duplicate suppression' '
 
 test_expect_success 'lookup' '
 	{
-		echo20 append 88 44 aa 55 &&
-		echo20 lookup 55
+		echoid append 88 44 aa 55 &&
+		echoid lookup 55
 	} | test-tool sha1-array >actual &&
 	n=$(cat actual) &&
 	test "$n" -eq 1
@@ -43,8 +43,8 @@ test_expect_success 'lookup' '
 
 test_expect_success 'lookup non-existing entry' '
 	{
-		echo20 append 88 44 aa 55 &&
-		echo20 lookup 33
+		echoid append 88 44 aa 55 &&
+		echoid lookup 33
 	} | test-tool sha1-array >actual &&
 	n=$(cat actual) &&
 	test "$n" -lt 0
@@ -52,9 +52,9 @@ test_expect_success 'lookup non-existing entry' '
 
 test_expect_success 'lookup with duplicates' '
 	{
-		echo20 append 88 44 aa 55 &&
-		echo20 append 88 44 aa 55 &&
-		echo20 lookup 55
+		echoid append 88 44 aa 55 &&
+		echoid append 88 44 aa 55 &&
+		echoid lookup 55
 	} | test-tool sha1-array >actual &&
 	n=$(cat actual) &&
 	test "$n" -ge 2 &&
@@ -63,19 +63,24 @@ test_expect_success 'lookup with duplicates' '
 
 test_expect_success 'lookup non-existing entry with duplicates' '
 	{
-		echo20 append 88 44 aa 55 &&
-		echo20 append 88 44 aa 55 &&
-		echo20 lookup 66
+		echoid append 88 44 aa 55 &&
+		echoid append 88 44 aa 55 &&
+		echoid lookup 66
 	} | test-tool sha1-array >actual &&
 	n=$(cat actual) &&
 	test "$n" -lt 0
 '
 
 test_expect_success 'lookup with almost duplicate values' '
+	# n-1 5s
+	root=$(test_translate 555555555555555555555555555555555555555 \
+		555555555555555555555555555555555555555555555555555555555555555) &&
 	{
-		echo "append 5555555555555555555555555555555555555555" &&
-		echo "append 555555555555555555555555555555555555555f" &&
-		echo20 lookup 55
+		id1="${root}5" &&
+		id2="${root}f" &&
+		echo "append $id1" &&
+		echo "append $id2" &&
+		echoid lookup 55
 	} | test-tool sha1-array >actual &&
 	n=$(cat actual) &&
 	test "$n" -eq 0
@@ -83,8 +88,8 @@ test_expect_success 'lookup with almost duplicate values' '
 
 test_expect_success 'lookup with single duplicate value' '
 	{
-		echo20 append 55 55 &&
-		echo20 lookup 55
+		echoid append 55 55 &&
+		echoid lookup 55
 	} | test-tool sha1-array >actual &&
 	n=$(cat actual) &&
 	test "$n" -ge 0 &&

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 06/10] t1006: make hash size independent
  2018-06-04 23:52 [PATCH 00/10] Hash-independent tests (part 3) brian m. carlson
                   ` (4 preceding siblings ...)
  2018-06-04 23:52 ` [PATCH 05/10] t0064: make hash size independent brian m. carlson
@ 2018-06-04 23:52 ` brian m. carlson
  2018-06-04 23:52 ` [PATCH 07/10] t1400: switch hard-coded object ID to variable brian m. carlson
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: brian m. carlson @ 2018-06-04 23:52 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, Torsten Bögershausen,
	Jeff King, René Scharfe

Compute the size of the tree and commit objects we're creating by
checking for the size of an object ID and computing the resulting sizes
accordingly.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 t/t1006-cat-file.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 13dd510b2e..3b82051e2b 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -141,14 +141,14 @@ test_expect_success '--batch-check without %(rest) considers whole line' '
 '
 
 tree_sha1=$(git write-tree)
-tree_size=33
+tree_size=$(($(test_translate -f hash-info rawsz) + 13))
 tree_pretty_content="100644 blob $hello_sha1	hello"
 
 run_tests 'tree' $tree_sha1 $tree_size "" "$tree_pretty_content"
 
 commit_message="Initial commit"
 commit_sha1=$(echo_without_newline "$commit_message" | git commit-tree $tree_sha1)
-commit_size=177
+commit_size=$(($(test_translate -f hash-info hexsz) + 137))
 commit_content="tree $tree_sha1
 author $GIT_AUTHOR_NAME <$GIT_AUTHOR_EMAIL> 0000000000 +0000
 committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> 0000000000 +0000

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 07/10] t1400: switch hard-coded object ID to variable
  2018-06-04 23:52 [PATCH 00/10] Hash-independent tests (part 3) brian m. carlson
                   ` (5 preceding siblings ...)
  2018-06-04 23:52 ` [PATCH 06/10] t1006: " brian m. carlson
@ 2018-06-04 23:52 ` brian m. carlson
  2018-06-04 23:52 ` [PATCH 08/10] t1405: make hash size independent brian m. carlson
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: brian m. carlson @ 2018-06-04 23:52 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, Torsten Bögershausen,
	Jeff King, René Scharfe

Switch a hard-coded all-zeros object ID to use a variable instead.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 t/t1400-update-ref.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
index e1fd0f0ca8..ffaadf5f2d 100755
--- a/t/t1400-update-ref.sh
+++ b/t/t1400-update-ref.sh
@@ -346,7 +346,7 @@ test_expect_success "verifying $m's log (logged by config)" '
 
 git update-ref $m $D
 cat >.git/logs/$m <<EOF
-0000000000000000000000000000000000000000 $C $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> 1117150320 -0500
+$Z $C $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> 1117150320 -0500
 $C $A $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> 1117150350 -0500
 $A $B $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> 1117150380 -0500
 $F $Z $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> 1117150680 -0500

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 08/10] t1405: make hash size independent
  2018-06-04 23:52 [PATCH 00/10] Hash-independent tests (part 3) brian m. carlson
                   ` (6 preceding siblings ...)
  2018-06-04 23:52 ` [PATCH 07/10] t1400: switch hard-coded object ID to variable brian m. carlson
@ 2018-06-04 23:52 ` brian m. carlson
  2018-06-04 23:52 ` [PATCH 09/10] t1406: make hash-size independent brian m. carlson
  2018-06-04 23:52 ` [PATCH 10/10] t1407: make hash size independent brian m. carlson
  9 siblings, 0 replies; 26+ messages in thread
From: brian m. carlson @ 2018-06-04 23:52 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, Torsten Bögershausen,
	Jeff King, René Scharfe

Instead of hard-coding a 40-based constant, split the output of
for-each-ref and for-each-reflog by field.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 t/t1405-main-ref-store.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t1405-main-ref-store.sh b/t/t1405-main-ref-store.sh
index a74c38b5fb..331899ddc4 100755
--- a/t/t1405-main-ref-store.sh
+++ b/t/t1405-main-ref-store.sh
@@ -54,7 +54,7 @@ test_expect_success 'for_each_ref(refs/heads/)' '
 '
 
 test_expect_success 'for_each_ref() is sorted' '
-	$RUN for-each-ref refs/heads/ | cut -c 42- >actual &&
+	$RUN for-each-ref refs/heads/ | cut -d" " -f 2- >actual &&
 	sort actual > expected &&
 	test_cmp expected actual
 '
@@ -71,7 +71,7 @@ test_expect_success 'verify_ref(new-master)' '
 '
 
 test_expect_success 'for_each_reflog()' '
-	$RUN for-each-reflog | sort -k2 | cut -c 42- >actual &&
+	$RUN for-each-reflog | sort -k2 | cut -d" " -f 2- >actual &&
 	cat >expected <<-\EOF &&
 	HEAD 0x1
 	refs/heads/master 0x0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 09/10] t1406: make hash-size independent
  2018-06-04 23:52 [PATCH 00/10] Hash-independent tests (part 3) brian m. carlson
                   ` (7 preceding siblings ...)
  2018-06-04 23:52 ` [PATCH 08/10] t1405: make hash size independent brian m. carlson
@ 2018-06-04 23:52 ` brian m. carlson
  2018-06-04 23:52 ` [PATCH 10/10] t1407: make hash size independent brian m. carlson
  9 siblings, 0 replies; 26+ messages in thread
From: brian m. carlson @ 2018-06-04 23:52 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, Torsten Bögershausen,
	Jeff King, René Scharfe

Instead of hard-coding a 40-based constant, split the output of
for-each-ref and for-each-reflog by field.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 t/t1406-submodule-ref-store.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/t/t1406-submodule-ref-store.sh b/t/t1406-submodule-ref-store.sh
index e093782cc3..d199d872fb 100755
--- a/t/t1406-submodule-ref-store.sh
+++ b/t/t1406-submodule-ref-store.sh
@@ -39,7 +39,7 @@ test_expect_success 'rename_refs() not allowed' '
 '
 
 test_expect_success 'for_each_ref(refs/heads/)' '
-	$RUN for-each-ref refs/heads/ | cut -c 42- >actual &&
+	$RUN for-each-ref refs/heads/ | cut -d" " -f 2- >actual &&
 	cat >expected <<-\EOF &&
 	master 0x0
 	new-master 0x0
@@ -48,7 +48,7 @@ test_expect_success 'for_each_ref(refs/heads/)' '
 '
 
 test_expect_success 'for_each_ref() is sorted' '
-	$RUN for-each-ref refs/heads/ | cut -c 42- >actual &&
+	$RUN for-each-ref refs/heads/ | cut -d" " -f 2- >actual &&
 	sort actual > expected &&
 	test_cmp expected actual
 '
@@ -65,7 +65,7 @@ test_expect_success 'verify_ref(new-master)' '
 '
 
 test_expect_success 'for_each_reflog()' '
-	$RUN for-each-reflog | sort | cut -c 42- >actual &&
+	$RUN for-each-reflog | sort | cut -d" " -f 2- >actual &&
 	cat >expected <<-\EOF &&
 	HEAD 0x1
 	refs/heads/master 0x0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 10/10] t1407: make hash size independent
  2018-06-04 23:52 [PATCH 00/10] Hash-independent tests (part 3) brian m. carlson
                   ` (8 preceding siblings ...)
  2018-06-04 23:52 ` [PATCH 09/10] t1406: make hash-size independent brian m. carlson
@ 2018-06-04 23:52 ` brian m. carlson
  9 siblings, 0 replies; 26+ messages in thread
From: brian m. carlson @ 2018-06-04 23:52 UTC (permalink / raw)
  To: git
  Cc: Nguyễn Thái Ngọc Duy, Torsten Bögershausen,
	Jeff King, René Scharfe

Instead of hard-coding a 40-based constant, split the output of
for-each-ref and for-each-reflog by field.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 t/t1407-worktree-ref-store.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t1407-worktree-ref-store.sh b/t/t1407-worktree-ref-store.sh
index 4623ae15c4..9a84858118 100755
--- a/t/t1407-worktree-ref-store.sh
+++ b/t/t1407-worktree-ref-store.sh
@@ -58,7 +58,7 @@ test_expect_success 'for_each_reflog()' '
 	mkdir -p     .git/worktrees/wt/logs/refs/bisect &&
 	echo $ZERO_OID > .git/worktrees/wt/logs/refs/bisect/wt-random &&
 
-	$RWT for-each-reflog | cut -c 42- | sort >actual &&
+	$RWT for-each-reflog | cut -d" " -f 2- | sort >actual &&
 	cat >expected <<-\EOF &&
 	HEAD 0x1
 	PSEUDO-WT 0x0
@@ -68,7 +68,7 @@ test_expect_success 'for_each_reflog()' '
 	EOF
 	test_cmp expected actual &&
 
-	$RMAIN for-each-reflog | cut -c 42- | sort >actual &&
+	$RMAIN for-each-reflog | cut -d" " -f 2- | sort >actual &&
 	cat >expected <<-\EOF &&
 	HEAD 0x1
 	PSEUDO-MAIN 0x0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/10] t: add tool to translate hash-related values
  2018-06-04 23:52 ` [PATCH 01/10] t: add tool to translate hash-related values brian m. carlson
@ 2018-06-06  6:19   ` Torsten Bögershausen
  2018-06-06 20:58     ` Jeff King
  2018-06-11  7:47   ` Eric Sunshine
  1 sibling, 1 reply; 26+ messages in thread
From: Torsten Bögershausen @ 2018-06-06  6:19 UTC (permalink / raw)
  To: brian m. carlson
  Cc: git, Nguyễn Thái Ngọc Duy, Jeff King,
	René Scharfe

Some style nite inline

On Mon, Jun 04, 2018 at 11:52:20PM +0000, brian m. carlson wrote:
> Add a test function helper, test_translate, that will produce its first
> argument if the hash in use is SHA-1 and the second if its argument is
> NewHash.  Implement a mode that can read entries from a file as well for
> reusability across tests.
> 
> For the moment, use the length of the empty blob to determine the hash
> in use.  In the future, we can change this code so that it can use the
> configuration and learn about the difference in input, output, and
> on-disk formats.
> 
> Implement two basic lookup charts, one for common invalid or synthesized
> object IDs, and one for various facts about the hash function in use.
> 
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
>  t/test-lib-functions.sh | 40 ++++++++++++++++++++++++++++++++++++++++
>  t/translate/hash-info   |  9 +++++++++
>  t/translate/oid         | 15 +++++++++++++++
>  3 files changed, 64 insertions(+)
>  create mode 100644 t/translate/hash-info
>  create mode 100644 t/translate/oid
> 
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index 2b2181dca0..0e7067460b 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1147,3 +1147,43 @@ depacketize () {
>  		}
>  	'
>  }
> +
> +test_translate_f_ () {
> +	local file="$TEST_DIRECTORY/translate/$2" &&

Unless I'm wrong, we don't use the "local" keyword ?

> +	perl -e '

The bare "perl" is better spelled as "$PERL_PATH"


> +		$delim = "\t";
> +		($hoidlen, $file, $arg) = @ARGV;
> +		open($fh, "<", $file) or die "open: $!";
> +		while (<$fh>) {
> +			# Allow specifying other delimiters.
> +			$delim = $1 if /^#!\sdelimiter\s(.)/;
> +			next if /^#/;
> +			@fields = split /$delim/, $_, 3;
> +			if ($fields[0] eq $arg) {
> +				print($hoidlen == 40 ? $fields[1] : $fields[2]);
> +				last;
> +			}
> +		}
> +	' "$1" "$file" "$3"
> +}
> +
> +# Without -f, print the first argument if we are using SHA-1 and the second if
> +# we're using NewHash.
> +# With -f FILE ARG, read the (by default) tab-delimited file from
> +# t/translate/FILE, finding the first field matching ARG and printing either the
> +# second or third field depending on the hash in use.
> +test_translate () {
> +	local hoidlen=$(printf "%s" "$EMPTY_BLOB" | wc -c) &&
> +	if [ "$1" = "-f" ]

Style nit, please avoid [] and use test:
	if test "$1" = "-f"

And more [] below

> +	then
> +		shift &&
> +		test_translate_f_ "$hoidlen" "$@"
> +	else
> +		if [ "$hoidlen" -eq 40 ]
> +		then
> +			printf "%s" "$1"
> +		else
> +			printf "%s" "$2"
> +		fi
> +	fi
> +}
> diff --git a/t/translate/hash-info b/t/translate/hash-info
> new file mode 100644
> index 0000000000..36cbd9a8eb
> --- /dev/null
> +++ b/t/translate/hash-info
> @@ -0,0 +1,9 @@
> +# Various facts about the hash algorithm in use for easy access in tests.
> +#
> +# Several aliases are provided for easy recall.
> +rawsz	20	32
> +oidlen	20	32
> +hexsz	40	64
> +hexoidlen	40	64
> +zero	0000000000000000000000000000000000000000	0000000000000000000000000000000000000000000000000000000000000000
> +zero-oid	0000000000000000000000000000000000000000	0000000000000000000000000000000000000000000000000000000000000000
> diff --git a/t/translate/oid b/t/translate/oid
> new file mode 100644
> index 0000000000..8de0fd64af
> --- /dev/null
> +++ b/t/translate/oid
> @@ -0,0 +1,15 @@
> +# These are some common invalid and partial object IDs used in tests.
> +001	0000000000000000000000000000000000000001	0000000000000000000000000000000000000000000000000000000000000001
> +002	0000000000000000000000000000000000000002	0000000000000000000000000000000000000000000000000000000000000002
> +003	0000000000000000000000000000000000000003	0000000000000000000000000000000000000000000000000000000000000003
> +004	0000000000000000000000000000000000000004	0000000000000000000000000000000000000000000000000000000000000004
> +005	0000000000000000000000000000000000000005	0000000000000000000000000000000000000000000000000000000000000005
> +006	0000000000000000000000000000000000000006	0000000000000000000000000000000000000000000000000000000000000006
> +007	0000000000000000000000000000000000000007	0000000000000000000000000000000000000000000000000000000000000007
> +# All zeros or Fs missing one or two hex segments.
> +zero-1	000000000000000000000000000000000000000	000000000000000000000000000000000000000000000000000000000000000
> +zero-2	00000000000000000000000000000000000000	00000000000000000000000000000000000000000000000000000000000000
> +ff-1	fffffffffffffffffffffffffffffffffffffff	fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
> +ff-2	ffffffffffffffffffffffffffffffffffffff	ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
> +numeric	0123456789012345678901234567890123456789	0123456789012345678901234567890123456789012345678901234567890123
> +deadbeef	deadbeefdeadbeefdeadbeefdeadbeefdeadbeef	deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 04/10] t0027: use $ZERO_OID
  2018-06-04 23:52 ` [PATCH 04/10] t0027: use $ZERO_OID brian m. carlson
@ 2018-06-06  7:02   ` Torsten Bögershausen
  2018-06-07  1:25     ` brian m. carlson
  0 siblings, 1 reply; 26+ messages in thread
From: Torsten Bögershausen @ 2018-06-06  7:02 UTC (permalink / raw)
  To: brian m. carlson
  Cc: git, Nguyễn Thái Ngọc Duy, Jeff King,
	René Scharfe

On Mon, Jun 04, 2018 at 11:52:23PM +0000, brian m. carlson wrote:
> Use the ZERO_OID variable to express the all-zeros object ID so that it
> works with hash algorithms of all sizes.
> 
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
>  t/t0027-auto-crlf.sh | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
> index beb5927f77..14fcd3f49f 100755
> --- a/t/t0027-auto-crlf.sh
> +++ b/t/t0027-auto-crlf.sh
> @@ -371,13 +371,13 @@ test_expect_success 'setup master' '
>  	git checkout -b master &&
>  	git add .gitattributes &&
>  	git commit -m "add .gitattributes" . &&
> -	printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONE\nLINETWO\nLINETHREE"     >LF &&
> -	printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONE\r\nLINETWO\r\nLINETHREE" >CRLF &&
> -	printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONE\r\nLINETWO\nLINETHREE"   >CRLF_mix_LF &&
> -	printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONE\nLINETWO\rLINETHREE"     >LF_mix_CR &&
> -	printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONE\r\nLINETWO\rLINETHREE"   >CRLF_mix_CR &&
> -	printf "\$Id: 0000000000000000000000000000000000000000 \$\r\nLINEONEQ\r\nLINETWO\r\nLINETHREE" | q_to_nul >CRLF_nul &&
> -	printf "\$Id: 0000000000000000000000000000000000000000 \$\nLINEONEQ\nLINETWO\nLINETHREE" | q_to_nul >LF_nul &&
> +	printf "\$Id: $ZERO_OID \$\nLINEONE\nLINETWO\nLINETHREE"     >LF &&
> +	printf "\$Id: $ZERO_OID \$\r\nLINEONE\r\nLINETWO\r\nLINETHREE" >CRLF &&
> +	printf "\$Id: $ZERO_OID \$\nLINEONE\r\nLINETWO\nLINETHREE"   >CRLF_mix_LF &&
> +	printf "\$Id: $ZERO_OID \$\nLINEONE\nLINETWO\rLINETHREE"     >LF_mix_CR &&
> +	printf "\$Id: $ZERO_OID \$\r\nLINEONE\r\nLINETWO\rLINETHREE"   >CRLF_mix_CR &&
> +	printf "\$Id: $ZERO_OID \$\r\nLINEONEQ\r\nLINETWO\r\nLINETHREE" | q_to_nul >CRLF_nul &&
> +	printf "\$Id: $ZERO_OID \$\nLINEONEQ\nLINETWO\nLINETHREE" | q_to_nul >LF_nul &&
>  	create_NNO_MIX_files &&
>  	git -c core.autocrlf=false add NNO_*.txt MIX_*.txt &&
>  	git commit -m "mixed line endings" &&

Nothing wrong with the patch.
There is, however, a trick in t0027 to transform the different IDs back to a bunch of '0'.
The content of the file use only uppercase letters, and all lowercase ad digits
are converted like this:

compare_ws_file () {
	pfx=$1
	exp=$2.expect
	act=$pfx.actual.$3
	tr '\015\000abcdef0123456789' QN00000000000000000 <"$2" >"$exp" &&
	tr '\015\000abcdef0123456789' QN00000000000000000 <"$3" >"$act" &&
	test_cmp "$exp" "$act" &&
	rm "$exp" "$act"
}

In the long term the 'tr' may need an additional 'sed' expression.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/10] t: add tool to translate hash-related values
  2018-06-06  6:19   ` Torsten Bögershausen
@ 2018-06-06 20:58     ` Jeff King
  2018-06-07  0:57       ` brian m. carlson
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2018-06-06 20:58 UTC (permalink / raw)
  To: Torsten Bögershausen
  Cc: brian m. carlson, git, Nguyễn Thái Ngọc Duy,
	René Scharfe

On Wed, Jun 06, 2018 at 08:19:27AM +0200, Torsten Bögershausen wrote:

> > +test_translate_f_ () {
> > +	local file="$TEST_DIRECTORY/translate/$2" &&
> 
> Unless I'm wrong, we don't use the "local" keyword ?

We've got a test balloon out; see 01d3a526ad (t0000: check whether the
shell supports the "local" keyword, 2017-10-26). I think it's reasonable
to consider starting its use.

> > +	perl -e '
> 
> The bare "perl" is better spelled as "$PERL_PATH"

This use is OK. Since a0e0ec9f7d (t: provide a perl() function which
uses $PERL_PATH, 2013-10-28), most common uses handle this automatically
(there are some exceptions, covered in t/README).

> > +	if [ "$1" = "-f" ]
> 
> Style nit, please avoid [] and use test:
> 	if test "$1" = "-f"

This I agree with. :)

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/10] t: add tool to translate hash-related values
  2018-06-06 20:58     ` Jeff King
@ 2018-06-07  0:57       ` brian m. carlson
  2018-06-07  2:40         ` Jeff King
  0 siblings, 1 reply; 26+ messages in thread
From: brian m. carlson @ 2018-06-07  0:57 UTC (permalink / raw)
  To: Jeff King
  Cc: Torsten Bögershausen, git,
	Nguyễn Thái Ngọc Duy, René Scharfe

[-- Attachment #1: Type: text/plain, Size: 1393 bytes --]

On Wed, Jun 06, 2018 at 04:58:46PM -0400, Jeff King wrote:
> On Wed, Jun 06, 2018 at 08:19:27AM +0200, Torsten Bögershausen wrote:
> 
> > > +test_translate_f_ () {
> > > +	local file="$TEST_DIRECTORY/translate/$2" &&
> > 
> > Unless I'm wrong, we don't use the "local" keyword ?
> 
> We've got a test balloon out; see 01d3a526ad (t0000: check whether the
> shell supports the "local" keyword, 2017-10-26). I think it's reasonable
> to consider starting its use.

I used it because it's already in use earlier in the file in some of the
mingw_* functions.  Perhaps we happen to know that our mingw systems
will always have a suitable /bin/sh, but I suppose some less capable
shells would still have choked on it by now.

I can remove it if necessary, but it didn't seem necessary.

> > > +	perl -e '
> > 
> > The bare "perl" is better spelled as "$PERL_PATH"
> 
> This use is OK. Since a0e0ec9f7d (t: provide a perl() function which
> uses $PERL_PATH, 2013-10-28), most common uses handle this automatically
> (there are some exceptions, covered in t/README).

This was exactly my reasoning.

> > > +	if [ "$1" = "-f" ]
> > 
> > Style nit, please avoid [] and use test:
> > 	if test "$1" = "-f"
> 
> This I agree with. :)

Yeah, I forgot that that's our style.  I'll fix that.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 867 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 04/10] t0027: use $ZERO_OID
  2018-06-06  7:02   ` Torsten Bögershausen
@ 2018-06-07  1:25     ` brian m. carlson
  0 siblings, 0 replies; 26+ messages in thread
From: brian m. carlson @ 2018-06-07  1:25 UTC (permalink / raw)
  To: Torsten Bögershausen
  Cc: git, Nguyễn Thái Ngọc Duy, Jeff King,
	René Scharfe

[-- Attachment #1: Type: text/plain, Size: 809 bytes --]

On Wed, Jun 06, 2018 at 09:02:23AM +0200, Torsten Bögershausen wrote:
> Nothing wrong with the patch.
> There is, however, a trick in t0027 to transform the different IDs back to a bunch of '0'.
> The content of the file use only uppercase letters, and all lowercase ad digits
> are converted like this:
> 
> compare_ws_file () {
> 	pfx=$1
> 	exp=$2.expect
> 	act=$pfx.actual.$3
> 	tr '\015\000abcdef0123456789' QN00000000000000000 <"$2" >"$exp" &&
> 	tr '\015\000abcdef0123456789' QN00000000000000000 <"$3" >"$act" &&
> 	test_cmp "$exp" "$act" &&
> 	rm "$exp" "$act"
> }
> 
> In the long term the 'tr' may need an additional 'sed' expression.

I'll take a look.  That may end up being a more robust solution.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 867 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/10] t: add tool to translate hash-related values
  2018-06-07  0:57       ` brian m. carlson
@ 2018-06-07  2:40         ` Jeff King
  0 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2018-06-07  2:40 UTC (permalink / raw)
  To: brian m. carlson, Torsten Bögershausen, git,
	Nguyễn Thái Ngọc Duy, René Scharfe

On Thu, Jun 07, 2018 at 12:57:04AM +0000, brian m. carlson wrote:

> > > Unless I'm wrong, we don't use the "local" keyword ?
> > 
> > We've got a test balloon out; see 01d3a526ad (t0000: check whether the
> > shell supports the "local" keyword, 2017-10-26). I think it's reasonable
> > to consider starting its use.
> 
> I used it because it's already in use earlier in the file in some of the
> mingw_* functions.  Perhaps we happen to know that our mingw systems
> will always have a suitable /bin/sh, but I suppose some less capable
> shells would still have choked on it by now.

We do know in that case; it's always bash under mingw.

That said...

> I can remove it if necessary, but it didn't seem necessary.

I feel OK about starting to use it, with the knowledge that we may get a
late-comer who hasn't even tested v2.16.0 yet and says "no, wait! My
shell doesn't support local!". And then we'd have to deal with it then.
But I suspect that won't happen, or it will turn out that the shell in
question is unusable for some other reason anyway.

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/10] t: add tool to translate hash-related values
  2018-06-04 23:52 ` [PATCH 01/10] t: add tool to translate hash-related values brian m. carlson
  2018-06-06  6:19   ` Torsten Bögershausen
@ 2018-06-11  7:47   ` Eric Sunshine
  2018-06-12  1:05     ` brian m. carlson
  2018-07-24 22:17     ` brian m. carlson
  1 sibling, 2 replies; 26+ messages in thread
From: Eric Sunshine @ 2018-06-11  7:47 UTC (permalink / raw)
  To: brian m. carlson
  Cc: git, Nguyễn Thái Ngọc Duy,
	Torsten Bögershausen, Jeff King, René Scharfe

On Mon, Jun 04, 2018 at 11:52:20PM +0000, brian m. carlson wrote:
> Add a test function helper, test_translate, that will produce its first
> argument if the hash in use is SHA-1 and the second if its argument is
> NewHash.  Implement a mode that can read entries from a file as well for
> reusability across tests.

The word "translate" is very generic and is (at least in my mind)
strongly associated with i18n/l10n, so the name test_translate() may
be confusing for readers. Perhaps test_oid_lookup() or test_oid_get()
or even just test_oid()?

> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> @@ -1147,3 +1147,43 @@ depacketize () {
> +test_translate_f_ () {
> +	local file="$TEST_DIRECTORY/translate/$2" &&
> +	perl -e '
> +		$delim = "\t";
> +		($hoidlen, $file, $arg) = @ARGV;
> +		open($fh, "<", $file) or die "open: $!";
> +		while (<$fh>) {
> +			# Allow specifying other delimiters.
> +			$delim = $1 if /^#!\sdelimiter\s(.)/;
> +			next if /^#/;
> +			@fields = split /$delim/, $_, 3;
> +			if ($fields[0] eq $arg) {
> +				print($hoidlen == 40 ? $fields[1] : $fields[2]);
> +				last;
> +			}
> +		}
> +	' "$1" "$file" "$3"
> +}

This is a very expensive lookup since it invokes a heavyweight command
(perl, in this case) for *every* OID it needs to retrieve from the
file. Windows users, especially, will likely not be happy about this.
See below for an alternative.

> +# Without -f, print the first argument if we are using SHA-1 and the second if
> +# we're using NewHash.
> +# With -f FILE ARG, read the (by default) tab-delimited file from
> +# t/translate/FILE, finding the first field matching ARG and printing either the
> +# second or third field depending on the hash in use.
> +test_translate () {
> +	local hoidlen=$(printf "%s" "$EMPTY_BLOB" | wc -c) &&
> +	if [ "$1" = "-f" ]
> +	then
> +		shift &&
> +		test_translate_f_ "$hoidlen" "$@"
> +	else
> +		if [ "$hoidlen" -eq 40 ]
> +		then
> +			printf "%s" "$1"
> +		else
> +			printf "%s" "$2"
> +		fi
> +	fi
> +}

This is less flexible than I had expected, allowing for only SHA1 and
NewHash. When you had written about OID lookup table functionality in
email previously, my impression was that the tables would allow values
for arbitrary hash algorithms. Such flexibility would allow people to
experiment with hash algorithms without having to once again retrofit
the test suite machinery.

Here's what I had envisioned when reading your emails about OID lookup
table functionality:

--- >8 ---
test_detect_hash () {
    test_hash_algo=...
}

test_oid_cache () {
    while read tag rest
    do
        case $tag in \#*) continue ;; esac

        for x in $rest
        do
            k=${x%:*}
            v=${x#*:}
            if test "$k" = $test_hash_algo
            then
                eval "test_oid_$tag=$v"
                break
            fi
        done
    done
}

test_oid () {
    if test $# -gt 1
    then
        test_oid_cache <<-EOF
        $*
        EOF
    fi
    eval "echo \$test_oid_$1"
}
--- >8 ---

test_detect_hash() would detect the hash algorithm and record it
instead of having to determine it each time an OID needs to be
"translated". It probably would be called by test-lib.sh.

test_oid_cache() reads a table of OID's and caches them so that
subsequent look-ups are fast. It would be called (possibly multiple
times) by any script needing to "translate" OID's. Each line of the
table has form "tag algo:value ...". Lines starting with '#' are
comments (as in your implementation). Since it reads stdin, it works
equally well reading OID tables from files and here-docs, which
provides extra flexibility for test authors. For instance:

    test_oid_cache <translate/hash-info

or:

    test_oid_cache <<-\EOF
    rawsz sha1:20 NewHash:32
    hexsz sha1:40 NewHash:64
    EOF

test_oid() does the actual OID lookup/translation. It looks up a
pre-cached value or, for convenience (as per your implementation), can
choose between values specified at invocation time. For example, the
simple case:

    $(test_oid hexsz)

And, when specifying values from which to choose based upon hash
algorithm:

    $(test_oid bored sha1:deadbeef NewHash:feedface)

A nice property of how this implementation caches values is that you
don't need test_oid() for really simple cases. You can just access the
variable directly. For instance: $test_oid_hexsz

Another nice property of how caching is implemented is that someone
testing a new hash algorithm doesn't have edit the existing tables to
tack the value for the new algorithm onto the end of each line. It
works equally well to place those values in a new file or new here-doc
or simply append new lines to existing files or here-docs. For
instance, someone testing algorithm "NewShiny" can just add those
lines without having to modify existing lines:

    test_oid_cache <<-\EOF
    rawsz sha1:20 NewHash:32
    hexsz sha1:40 NewHash:64
    # testing NewShiny algorithm
    rawsz: NewShiny:42
    hexsz: NewShiny:84
    EOF

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 03/10] t0002: abstract away SHA-1-specific constants
  2018-06-04 23:52 ` [PATCH 03/10] t0002: abstract away SHA-1-specific constants brian m. carlson
@ 2018-06-11  7:59   ` Eric Sunshine
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Sunshine @ 2018-06-11  7:59 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Git List, Nguyễn Thái Ngọc Duy,
	Torsten Bögershausen, Jeff King, René Scharfe

On Mon, Jun 4, 2018 at 7:52 PM, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
> Adjust the test so that it computes variables for object IDs instead of
> using hard-coded hashes.
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
> diff --git a/t/t0002-gitfile.sh b/t/t0002-gitfile.sh
> @@ -89,14 +89,16 @@ test_expect_success 'enter_repo non-strict mode' '
>                 cd enter_repo &&
>                 test_tick &&
>                 test_commit foo &&
> +               git rev-parse HEAD >head-revision &&
>                 mv .git .realgit &&
>                 echo "gitdir: .realgit" >.git
>         ) &&
> +       head=$(cat enter_repo/head-revision) &&

Stashing the value temporarily in a file ("head-revision") is
unnecessary and somewhat ugly. Just grab it directly after the
subshell exits:

    (
        cd enter_repo &&
        ...
    ) &&
    head=$(git -C enter_repo rev-parse HEAD) &&
    ...

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 05/10] t0064: make hash size independent
  2018-06-04 23:52 ` [PATCH 05/10] t0064: make hash size independent brian m. carlson
@ 2018-06-11  8:09   ` Eric Sunshine
  2018-06-12  1:08     ` brian m. carlson
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Sunshine @ 2018-06-11  8:09 UTC (permalink / raw)
  To: brian m. carlson
  Cc: Git List, Nguyễn Thái Ngọc Duy,
	Torsten Bögershausen, Jeff King, René Scharfe

On Mon, Jun 4, 2018 at 7:52 PM, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
> Compute test values of the appropriate size instead of hard-coding
> 40-character values.  Rename the echo20 function to echoid, since the
> values may be of varying sizes.
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
> diff --git a/t/t0064-sha1-array.sh b/t/t0064-sha1-array.sh
> @@ -3,30 +3,30 @@
> -echo20 () {
> +echoid () {
>         prefix="${1:+$1 }"
>         shift
>         while test $# -gt 0
>         do
> -               echo "$prefix$1$1$1$1$1$1$1$1$1$1$1$1$1$1$1$1$1$1$1$1"
> +               echo "$prefix$ZERO_OID" | sed -e "s/00/$1/g"
>                 shift
>         done
>  }
>
>
>  test_expect_success 'lookup with almost duplicate values' '
> +       # n-1 5s
> +       root=$(test_translate 555555555555555555555555555555555555555 \
> +               555555555555555555555555555555555555555555555555555555555555555) &&

This is rather unwieldy and ugly. How about simply re-using echoid()
to compute the value, like this:

    root=$(echoid "" 55) &&
    root=${root%5} &&

That is, use echoid() to generate the all-5's OID of correct length
and then chop the last '5' off the end.

>         {
> -               echo "append 5555555555555555555555555555555555555555" &&
> -               echo "append 555555555555555555555555555555555555555f" &&
> -               echo20 lookup 55
> +               id1="${root}5" &&
> +               id2="${root}f" &&
> +               echo "append $id1" &&
> +               echo "append $id2" &&
> +               echoid lookup 55

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/10] t: add tool to translate hash-related values
  2018-06-11  7:47   ` Eric Sunshine
@ 2018-06-12  1:05     ` brian m. carlson
  2018-06-12  7:29       ` Eric Sunshine
  2018-07-24 22:17     ` brian m. carlson
  1 sibling, 1 reply; 26+ messages in thread
From: brian m. carlson @ 2018-06-12  1:05 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: git, Nguyễn Thái Ngọc Duy,
	Torsten Bögershausen, Jeff King, René Scharfe

[-- Attachment #1: Type: text/plain, Size: 4638 bytes --]

On Mon, Jun 11, 2018 at 03:47:43AM -0400, Eric Sunshine wrote:
> On Mon, Jun 04, 2018 at 11:52:20PM +0000, brian m. carlson wrote:
> > Add a test function helper, test_translate, that will produce its first
> > argument if the hash in use is SHA-1 and the second if its argument is
> > NewHash.  Implement a mode that can read entries from a file as well for
> > reusability across tests.
> 
> The word "translate" is very generic and is (at least in my mind)
> strongly associated with i18n/l10n, so the name test_translate() may
> be confusing for readers. Perhaps test_oid_lookup() or test_oid_get()
> or even just test_oid()?

test_oid would be fine.  One note is that this doesn't always produce
OIDs; sometimes it will produce other values, but as long as you don't
think that's too confusing, I'm fine with it.

> This is a very expensive lookup since it invokes a heavyweight command
> (perl, in this case) for *every* OID it needs to retrieve from the
> file. Windows users, especially, will likely not be happy about this.
> See below for an alternative.

I agree perl would be expensive if it were invoked frequently, but
excepting SHA1-prerequisite tests, this function is invoked 32 times in
the entire testsuite.

One of the reasons I chose perl was because we have a variety of cases
where we'll need spaces in values, and those tend to be complex in
shell.

> This is less flexible than I had expected, allowing for only SHA1 and
> NewHash. When you had written about OID lookup table functionality in
> email previously, my impression was that the tables would allow values
> for arbitrary hash algorithms. Such flexibility would allow people to
> experiment with hash algorithms without having to once again retrofit
> the test suite machinery.

I wasn't thinking of that as a goal, but that would be a nice
improvement.

> Here's what I had envisioned when reading your emails about OID lookup
> table functionality:
> 
> --- >8 ---
> test_detect_hash () {
>     test_hash_algo=...
> }
>     
> test_oid_cache () {
>     while read tag rest
>     do
>         case $tag in \#*) continue ;; esac
> 
>         for x in $rest
>         do
>             k=${x%:*}
>             v=${x#*:}
>             if test "$k" = $test_hash_algo
>             then
>                 eval "test_oid_$tag=$v"
>                 break
>             fi
>         done
>     done
> }
> 
> test_oid () {
>     if test $# -gt 1
>     then
>         test_oid_cache <<-EOF
>         $*
>         EOF
>     fi
>     eval "echo \$test_oid_$1"
> }

Using shell variables like this does have the downside that we're
restricted to only characters allowed in shell variables.  That was
something I was trying to avoid, but it certainly isn't fatal.

> test_detect_hash() would detect the hash algorithm and record it
> instead of having to determine it each time an OID needs to be
> "translated". It probably would be called by test-lib.sh.

We'll probably have to deal with multiple hashes in the future,
including for input and output, but this could probably be coerced to
handle that case.

> And, when specifying values from which to choose based upon hash
> algorithm:
> 
>     $(test_oid bored sha1:deadbeef NewHash:feedface)

This syntax won't exactly be usable, because we have to deal with spaces
in values.  It shouldn't be too much of a problem to just use
test_oid_cache at the top of the file, though.

> A nice property of how this implementation caches values is that you
> don't need test_oid() for really simple cases. You can just access the
> variable directly. For instance: $test_oid_hexsz

Because we're going to need multiple hash support in the future, for
input, output, and on-disk, I feel like this is not a good direction for
us to go in the testsuite.  Internally, those variable names are likely
to change.

> Another nice property of how caching is implemented is that someone
> testing a new hash algorithm doesn't have edit the existing tables to
> tack the value for the new algorithm onto the end of each line. It
> works equally well to place those values in a new file or new here-doc
> or simply append new lines to existing files or here-docs. For
> instance, someone testing algorithm "NewShiny" can just add those
> lines without having to modify existing lines:

That would be convenient.

I like a lot of things about your implementation.  This has been helpful
feedback; let me think about it some and reroll.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 867 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 05/10] t0064: make hash size independent
  2018-06-11  8:09   ` Eric Sunshine
@ 2018-06-12  1:08     ` brian m. carlson
  0 siblings, 0 replies; 26+ messages in thread
From: brian m. carlson @ 2018-06-12  1:08 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Git List, Nguyễn Thái Ngọc Duy,
	Torsten Bögershausen, Jeff King, René Scharfe

[-- Attachment #1: Type: text/plain, Size: 810 bytes --]

On Mon, Jun 11, 2018 at 04:09:05AM -0400, Eric Sunshine wrote:
> On Mon, Jun 4, 2018 at 7:52 PM, brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
> >  test_expect_success 'lookup with almost duplicate values' '
> > +       # n-1 5s
> > +       root=$(test_translate 555555555555555555555555555555555555555 \
> > +               555555555555555555555555555555555555555555555555555555555555555) &&
> 
> This is rather unwieldy and ugly. How about simply re-using echoid()
> to compute the value, like this:
> 
>     root=$(echoid "" 55) &&
>     root=${root%5} &&
> 
> That is, use echoid() to generate the all-5's OID of correct length
> and then chop the last '5' off the end.

Sure, that would be nicer.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 867 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/10] t: add tool to translate hash-related values
  2018-06-12  1:05     ` brian m. carlson
@ 2018-06-12  7:29       ` Eric Sunshine
  2018-06-14  0:22         ` brian m. carlson
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Sunshine @ 2018-06-12  7:29 UTC (permalink / raw)
  To: brian m. carlson, Eric Sunshine, Git List,
	Nguyễn Thái Ngọc Duy, Torsten Bögershausen,
	Jeff King, René Scharfe

On Mon, Jun 11, 2018 at 9:05 PM, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
> On Mon, Jun 11, 2018 at 03:47:43AM -0400, Eric Sunshine wrote:
>> The word "translate" is very generic and is (at least in my mind)
>> strongly associated with i18n/l10n, so the name test_translate() may
>> be confusing for readers. Perhaps test_oid_lookup() or test_oid_get()
>> or even just test_oid()?
>
> test_oid would be fine.  One note is that this doesn't always produce
> OIDs; sometimes it will produce other values, but as long as you don't
> think that's too confusing, I'm fine with it.

It was surprising to see it used for non-OID's (such as hash
characteristics), but not hard to deal with.

One could also view this as a generic key/value cache (not specific to
OID's) with overriding super-key (the hash algorithm, in this case),
which would allow for more generic name than test_oid(), but we don't
have to go there presently.

>> This is a very expensive lookup since it invokes a heavyweight command
>> (perl, in this case) for *every* OID it needs to retrieve from the
>> file. Windows users, especially, will likely not be happy about this.
>> See below for an alternative.
>
> I agree perl would be expensive if it were invoked frequently, but
> excepting SHA1-prerequisite tests, this function is invoked 32 times in
> the entire testsuite.
>
> One of the reasons I chose perl was because we have a variety of cases
> where we'll need spaces in values, and those tend to be complex in
> shell.

Can you give examples of cases in which values will contain spaces? It
wasn't obvious from this patch series that such a need would arise.

Are these values totally free-form? If not, some character (such as
"_", "-", ".", etc.) could act as a stand-in for space. That shouldn't
be too hard to handle.

>> Here's what I had envisioned when reading your emails about OID lookup
>> table functionality:
>>
>> --- >8 ---
>> test_oid_cache () {
>>     while read tag rest
>>     do
>>         case $tag in \#*) continue ;; esac
>>
>>         for x in $rest
>>         do
>>             k=${x%:*}
>>             v=${x#*:}
>>             if test "$k" = $test_hash_algo
>>             then
>>                 eval "test_oid_$tag=$v"
>>                 break
>>             fi
>>         done
>>     done
>> }
>
> Using shell variables like this does have the downside that we're
> restricted to only characters allowed in shell variables.  That was
> something I was trying to avoid, but it certainly isn't fatal.

Is that just a general concern or do you have specific "weird" keys in mind?

>> test_detect_hash() would detect the hash algorithm and record it
>> instead of having to determine it each time an OID needs to be
>> "translated". It probably would be called by test-lib.sh.
>
> We'll probably have to deal with multiple hashes in the future,
> including for input and output, but this could probably be coerced to
> handle that case.

My original version of test_oid_cache() actually allowed for that by
caching _all_ information from the tables rather than only values
corresponding to $test_hash_algo. It looked like this:

--- >8 ---
test_oid_cache () {
    while read tag rest
    do
        case $tag in \#*) continue ;; esac

        for x in $rest
        do
            eval "test_oid_${tag}_${x%:*}=${x#*:}"
        done
    done
}
--- >8 ---

The hash algorithm is incorporated into the cache variable name like
this: "test_oid_hexsz_sha256"

>> And, when specifying values from which to choose based upon hash
>> algorithm:
>>
>>     $(test_oid bored sha1:deadbeef NewHash:feedface)
>
> This syntax won't exactly be usable, because we have to deal with spaces
> in values.  It shouldn't be too much of a problem to just use
> test_oid_cache at the top of the file, though.

See above about possibly using a stand-in character for space.

>> A nice property of how this implementation caches values is that you
>> don't need test_oid() for really simple cases. You can just access the
>> variable directly. For instance: $test_oid_hexsz
>
> Because we're going to need multiple hash support in the future, for
> input, output, and on-disk, I feel like this is not a good direction for
> us to go in the testsuite.  Internally, those variable names are likely
> to change.

Indeed, this isn't a real selling point; I only just thought of it
while composing the mail. Going through the API is more robust.
(Although, see above how the revised test_oid_cache() incorporates the
hash algorithm into the cache variable name.)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/10] t: add tool to translate hash-related values
  2018-06-12  7:29       ` Eric Sunshine
@ 2018-06-14  0:22         ` brian m. carlson
  0 siblings, 0 replies; 26+ messages in thread
From: brian m. carlson @ 2018-06-14  0:22 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Git List, Nguyễn Thái Ngọc Duy,
	Torsten Bögershausen, Jeff King, René Scharfe

[-- Attachment #1: Type: text/plain, Size: 3821 bytes --]

On Tue, Jun 12, 2018 at 03:29:47AM -0400, Eric Sunshine wrote:
> On Mon, Jun 11, 2018 at 9:05 PM, brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
> > test_oid would be fine.  One note is that this doesn't always produce
> > OIDs; sometimes it will produce other values, but as long as you don't
> > think that's too confusing, I'm fine with it.
> 
> It was surprising to see it used for non-OID's (such as hash
> characteristics), but not hard to deal with.
> 
> One could also view this as a generic key/value cache (not specific to
> OID's) with overriding super-key (the hash algorithm, in this case),
> which would allow for more generic name than test_oid(), but we don't
> have to go there presently.

It is essentially that.  I'm happy with the test_oid name provided
others are.  My only concern is that it would be confusing.

I opted to use the same mechanism for hash characteristics because it
seemed better than creating a lot of one-off functions that might have
duplicative implementations.  But I'm open to arguments that having
test_oid_rawsz, test_oid_hexsz, etc. is better.

> > I agree perl would be expensive if it were invoked frequently, but
> > excepting SHA1-prerequisite tests, this function is invoked 32 times in
> > the entire testsuite.
> >
> > One of the reasons I chose perl was because we have a variety of cases
> > where we'll need spaces in values, and those tend to be complex in
> > shell.
> 
> Can you give examples of cases in which values will contain spaces? It
> wasn't obvious from this patch series that such a need would arise.
> 
> Are these values totally free-form? If not, some character (such as
> "_", "-", ".", etc.) could act as a stand-in for space. That shouldn't
> be too hard to handle.

t6030, which tests the bisect porcelain, is sensitive to the hash
algorithm because hash values are used as a secondary sort for the
closest commit.  Without totally gutting the test and redoing it, some
solution to produce something resembling a sane commit message would be
helpful.  We will also have cases where we need to provide strings to
printf(1), such as in some of the pack tests.

I have a minor modification to your code which handles that at the cost
of restricting us to one hash-key-value tuple on a line, which I think
is an acceptable tradeoff.

> > Using shell variables like this does have the downside that we're
> > restricted to only characters allowed in shell variables.  That was
> > something I was trying to avoid, but it certainly isn't fatal.
> 
> Is that just a general concern or do you have specific "weird" keys in mind?

I had originally thought of providing only the SHA-1 value instead of a
named key, which would have necessitated allowing arbitrary inputs, but
I ultimately decided that named keys were better.  I also tend to prefer
dashes in inputs over underscores, since it's a bit easier to type, but
that's really a secondary concern.

I think the benefits of an implementation closer to your outweigh the
downsides.

> My original version of test_oid_cache() actually allowed for that by
> caching _all_ information from the tables rather than only values
> corresponding to $test_hash_algo. It looked like this:
> 
> --- >8 ---
> test_oid_cache () {
>     while read tag rest
>     do
>         case $tag in \#*) continue ;; esac
> 
>         for x in $rest
>         do
>             eval "test_oid_${tag}_${x%:*}=${x#*:}"
>         done
>     done
> }
> --- >8 ---
> 
> The hash algorithm is incorporated into the cache variable name like
> this: "test_oid_hexsz_sha256"

Yeah, I basically reimplemented something similar to that based off your
code.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 867 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/10] t: add tool to translate hash-related values
  2018-06-11  7:47   ` Eric Sunshine
  2018-06-12  1:05     ` brian m. carlson
@ 2018-07-24 22:17     ` brian m. carlson
  2018-07-25  8:33       ` Eric Sunshine
  1 sibling, 1 reply; 26+ messages in thread
From: brian m. carlson @ 2018-07-24 22:17 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: git, Nguyễn Thái Ngọc Duy,
	Torsten Bögershausen, Jeff King, René Scharfe

[-- Attachment #1: Type: text/plain, Size: 1005 bytes --]

On Mon, Jun 11, 2018 at 03:47:43AM -0400, Eric Sunshine wrote:
> Here's what I had envisioned when reading your emails about OID lookup
> table functionality:
> 
> --- >8 ---
> test_detect_hash () {
>     test_hash_algo=...
> }
> 
> test_oid_cache () {
>     while read tag rest
>     do
>         case $tag in \#*) continue ;; esac
> 
>         for x in $rest
>         do
>             k=${x%:*}
>             v=${x#*:}
>             if test "$k" = $test_hash_algo
>             then
>                 eval "test_oid_$tag=$v"
>                 break
>             fi
>         done
>     done
> }
> 
> test_oid () {
>     if test $# -gt 1
>     then
>         test_oid_cache <<-EOF
>         $*
>         EOF
>     fi
>     eval "echo \$test_oid_$1"
> }
> --- >8 ---

I'd like to use this as a basis for my v2, but I would need your
sign-off in order to do that.  Would that be okay?
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 867 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 01/10] t: add tool to translate hash-related values
  2018-07-24 22:17     ` brian m. carlson
@ 2018-07-25  8:33       ` Eric Sunshine
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Sunshine @ 2018-07-25  8:33 UTC (permalink / raw)
  To: brian m. carlson, Git List, Nguyễn Thái Ngọc Duy,
	Torsten Bögershausen, Jeff King, René Scharfe

On Tue, Jul 24, 2018 at 6:17 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
> On Mon, Jun 11, 2018 at 03:47:43AM -0400, Eric Sunshine wrote:
> > Here's what I had envisioned when reading your emails about OID lookup
> > table functionality:
> >
> > --- >8 ---
> > test_oid_cache () {
> >     while read tag rest
> >     do
> >         case $tag in \#*) continue ;; esac
> >
> >         for x in $rest
> >         do
> >             k=${x%:*}
> >             v=${x#*:}
> >             if test "$k" = $test_hash_algo
> >             then
> >                 eval "test_oid_$tag=$v"
> >                 break
> >             fi
> >         done
> >     done
> > }
> >
> > test_oid () {
> >     if test $# -gt 1
> >     then
> >         test_oid_cache <<-EOF
> >         $*
> >         EOF
> >     fi
> >     eval "echo \$test_oid_$1"
> > }
> > --- >8 ---
>
> I'd like to use this as a basis for my v2, but I would need your
> sign-off in order to do that.  Would that be okay?

You can have my sign-off for the code above.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2018-07-25  8:34 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-04 23:52 [PATCH 00/10] Hash-independent tests (part 3) brian m. carlson
2018-06-04 23:52 ` [PATCH 01/10] t: add tool to translate hash-related values brian m. carlson
2018-06-06  6:19   ` Torsten Bögershausen
2018-06-06 20:58     ` Jeff King
2018-06-07  0:57       ` brian m. carlson
2018-06-07  2:40         ` Jeff King
2018-06-11  7:47   ` Eric Sunshine
2018-06-12  1:05     ` brian m. carlson
2018-06-12  7:29       ` Eric Sunshine
2018-06-14  0:22         ` brian m. carlson
2018-07-24 22:17     ` brian m. carlson
2018-07-25  8:33       ` Eric Sunshine
2018-06-04 23:52 ` [PATCH 02/10] t0000: use hash translation table brian m. carlson
2018-06-04 23:52 ` [PATCH 03/10] t0002: abstract away SHA-1-specific constants brian m. carlson
2018-06-11  7:59   ` Eric Sunshine
2018-06-04 23:52 ` [PATCH 04/10] t0027: use $ZERO_OID brian m. carlson
2018-06-06  7:02   ` Torsten Bögershausen
2018-06-07  1:25     ` brian m. carlson
2018-06-04 23:52 ` [PATCH 05/10] t0064: make hash size independent brian m. carlson
2018-06-11  8:09   ` Eric Sunshine
2018-06-12  1:08     ` brian m. carlson
2018-06-04 23:52 ` [PATCH 06/10] t1006: " brian m. carlson
2018-06-04 23:52 ` [PATCH 07/10] t1400: switch hard-coded object ID to variable brian m. carlson
2018-06-04 23:52 ` [PATCH 08/10] t1405: make hash size independent brian m. carlson
2018-06-04 23:52 ` [PATCH 09/10] t1406: make hash-size independent brian m. carlson
2018-06-04 23:52 ` [PATCH 10/10] t1407: make hash size independent brian m. carlson

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).