[PATCH 0/3] [RFC] tests: add test_todo() for known failures

git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed

* [PATCH 0/3] [RFC] tests: add test_todo() for known failures
@ 2022-10-06 15:01 Phillip Wood via GitGitGadget
  2022-10-06 15:01 ` [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages Phillip Wood via GitGitGadget
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Phillip Wood via GitGitGadget @ 2022-10-06 15:01 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Phillip Wood

test_todo() is intended as a fine grained alternative to
test_expect_failure(). Rather than marking the whole test as failing
test_todo() is used to mark individual failing commands within a test. This
approach to writing failing tests allows us to detect unexpected failures
that are hidden by test_expect_failure().

This series attempts to keep most of the benefits test_expect_todo()
previously proposed by Ævar[1] while being simpler to use.

[1]
https://lore.kernel.org/git/cover-0.7-00000000000-20220318T002951Z-avarab@gmail.com/

Phillip Wood (3):
  [RFC] tests: add test_todo() to mark known breakages
  [RFC] test_todo: allow [!] grep as the command
  [RFC] test_todo: allow [verbose] test as the command

 t/README                        |  12 ++
 t/t0000-basic.sh                |  92 +++++++++++++++
 t/t0050-filesystem.sh           |   4 +-
 t/t3401-rebase-and-am-rename.sh |  12 +-
 t/t3424-rebase-empty.sh         |   6 +-
 t/t3510-cherry-pick-sequence.sh |  12 +-
 t/t3600-rm.sh                   |   8 +-
 t/t3903-stash.sh                |  12 +-
 t/t4014-format-patch.sh         |  20 ++--
 t/test-lib-functions.sh         | 194 +++++++++++++++++++++++++-------
 10 files changed, 293 insertions(+), 79 deletions(-)


base-commit: bcd6bc478adc4951d57ec597c44b12ee74bc88fb
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1374%2Fphillipwood%2Ftest-todo-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1374/phillipwood/test-todo-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1374
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages
  2022-10-06 15:01 [PATCH 0/3] [RFC] tests: add test_todo() for known failures Phillip Wood via GitGitGadget
@ 2022-10-06 15:01 ` Phillip Wood via GitGitGadget
  2022-10-06 15:36   ` Ævar Arnfjörð Bjarmason
  2022-12-06 22:37   ` Victoria Dye
  2022-10-06 15:01 ` [PATCH 2/3] [RFC] test_todo: allow [!] grep as the command Phillip Wood via GitGitGadget
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 20+ messages in thread
From: Phillip Wood via GitGitGadget @ 2022-10-06 15:01 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Phillip Wood, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

test_todo() is intended as a fine grained replacement for
test_expect_failure(). Rather than marking the whole test as failing
test_todo() is used to mark individual failing commands within a
test. This approach to writing failing tests allows us to detect
unexpected failures that are hidden by test_expect_failure().

Failing commands are reported by the test harness in the same way as
test_expect_failure() so there is no change in output when migrating
from test_expect_failure() to test_todo(). If a command marked with
test_todo() succeeds then the test will fail. This is designed to make
it easier to see when a command starts succeeding in our CI compared
to using test_expect_failure() where it is easy to fix a failing test
case and not realize it.

test_todo() is built upon test_expect_failure() but accepts commands
starting with test_* in addition to git. As our test_* assertions use
BUG() to signal usage errors any such error will not be hidden by
test_todo().

This commit coverts a few tests to show the intended use of
test_todo().  A limitation of test_todo() as it is currently
implemented is that it cannot be used in a subshell.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 t/README                        |  12 +++
 t/t0000-basic.sh                |  64 ++++++++++++++
 t/t3401-rebase-and-am-rename.sh |  12 +--
 t/t3424-rebase-empty.sh         |   6 +-
 t/t3600-rm.sh                   |   8 +-
 t/test-lib-functions.sh         | 147 +++++++++++++++++++++++---------
 6 files changed, 194 insertions(+), 55 deletions(-)

diff --git a/t/README b/t/README
index 979b2d4833d..642aeab80b4 100644
--- a/t/README
+++ b/t/README
@@ -892,6 +892,10 @@ see test-lib-functions.sh for the full list and their options.
 
  - test_expect_failure [<prereq>] <message> <script>
 
+   Where possible new tests should use test_expect_success and mark
+   the individual failing commands with test_todo (see below) rather
+   than using test_expect_failure.
+
    This is NOT the opposite of test_expect_success, but is used
    to mark a test that demonstrates a known breakage.  Unlike
    the usual test_expect_success tests, which say "ok" on
@@ -902,6 +906,14 @@ see test-lib-functions.sh for the full list and their options.
    Like test_expect_success this function can optionally use a three
    argument invocation with a prerequisite as the first argument.
 
+ - test_todo <command>
+
+   This is used to mark commands that should succeed but do not due to
+   a known issue. The test will be reported as a known failure if the
+   issue still exists and won’t cause -i (immediate) to stop. If the
+   command unexpectedly succeeds then the test will be reported as a
+   failing. test_todo cannot be used in a subshell.
+
  - test_debug <script>
 
    This takes a single argument, <script>, and evaluates it only
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index 502b4bcf9ea..52362ad3dd3 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -141,6 +141,70 @@ test_expect_success 'subtest: a passing TODO test' '
 	EOF
 '
 
+test_expect_success 'subtest: a failing test_todo' '
+	write_and_run_sub_test_lib_test failing-test-todo <<-\EOF &&
+	test_false () {
+		false
+	}
+	test_expect_success "passing test" "true"
+	test_expect_success "known todo" "test_todo test_false"
+	test_done
+	EOF
+	check_sub_test_lib_test failing-test-todo <<-\EOF
+	> ok 1 - passing test
+	> not ok 2 - known todo # TODO known breakage
+	> # still have 1 known breakage(s)
+	> # passed all remaining 1 test(s)
+	> 1..2
+	EOF
+'
+
+test_expect_success 'subtest: a passing test_todo' '
+	write_and_run_sub_test_lib_test_err passing-test-todo <<-\EOF &&
+	test_true () {
+		true
+	}
+	test_expect_success "pretend we have fixed a test_todo breakage" \
+		"test_todo test_true"
+	test_done
+	EOF
+	check_sub_test_lib_test passing-test-todo <<-\EOF
+	> not ok 1 - pretend we have fixed a test_todo breakage
+	> #	test_todo test_true
+	> # failed 1 among 1 test(s)
+	> 1..1
+	EOF
+'
+
+test_expect_success 'subtest: test_todo allowed arguments' '
+	write_and_run_sub_test_lib_test_err acceptable-test-todo <<-\EOF &&
+	# This an acceptable command for test_todo but not test_must_fail
+	test_true () {
+		  return 0
+	}
+	test_expect_success "test_todo skips env and accepts good command" \
+		"test_todo env Var=Value git --invalid-option"
+	test_expect_success "test_todo skips env and rejects bad command" \
+		"test_todo env Var=Value false"
+	test_expect_success "test_todo test_must_fail accepts good command" \
+		"test_todo test_must_fail git --version"
+	test_expect_success "test_todo test_must_fail rejects bad command" \
+		"test_todo test_must_fail test_true"
+	test_done
+	EOF
+	check_sub_test_lib_test acceptable-test-todo <<-\EOF
+	> not ok 1 - test_todo skips env and accepts good command # TODO known breakage
+	> not ok 2 - test_todo skips env and rejects bad command
+	> #	test_todo env Var=Value false
+	> not ok 3 - test_todo test_must_fail accepts good command # TODO known breakage
+	> not ok 4 - test_todo test_must_fail rejects bad command
+	> #	test_todo test_must_fail test_true
+	> # still have 2 known breakage(s)
+	> # failed 2 among remaining 2 test(s)
+	> 1..4
+	EOF
+'
+
 test_expect_success 'subtest: 2 TODO tests, one passin' '
 	write_and_run_sub_test_lib_test partially-passing-todos <<-\EOF &&
 	test_expect_failure "pretend we have a known breakage" "false"
diff --git a/t/t3401-rebase-and-am-rename.sh b/t/t3401-rebase-and-am-rename.sh
index f18bae94507..cc5da9f5afe 100755
--- a/t/t3401-rebase-and-am-rename.sh
+++ b/t/t3401-rebase-and-am-rename.sh
@@ -52,7 +52,7 @@ test_expect_success 'rebase --interactive: directory rename detected' '
 	)
 '
 
-test_expect_failure 'rebase --apply: directory rename detected' '
+test_expect_success 'rebase --apply: directory rename detected' '
 	(
 		cd dir-rename &&
 
@@ -63,8 +63,8 @@ test_expect_failure 'rebase --apply: directory rename detected' '
 		git ls-files -s >out &&
 		test_line_count = 5 out &&
 
-		test_path_is_file y/d &&
-		test_path_is_missing x/d
+		test_todo test_path_is_file y/d &&
+		test_todo test_path_is_missing x/d
 	)
 '
 
@@ -84,7 +84,7 @@ test_expect_success 'rebase --merge: directory rename detected' '
 	)
 '
 
-test_expect_failure 'am: directory rename detected' '
+test_expect_success 'am: directory rename detected' '
 	(
 		cd dir-rename &&
 
@@ -97,8 +97,8 @@ test_expect_failure 'am: directory rename detected' '
 		git ls-files -s >out &&
 		test_line_count = 5 out &&
 
-		test_path_is_file y/d &&
-		test_path_is_missing x/d
+		test_todo test_path_is_file y/d &&
+		test_todo test_path_is_missing x/d
 	)
 '
 
diff --git a/t/t3424-rebase-empty.sh b/t/t3424-rebase-empty.sh
index 5e1045a0afc..b7cae260759 100755
--- a/t/t3424-rebase-empty.sh
+++ b/t/t3424-rebase-empty.sh
@@ -34,15 +34,15 @@ test_expect_success 'setup test repository' '
 	git commit -m "Five letters ought to be enough for anybody"
 '
 
-test_expect_failure 'rebase (apply-backend)' '
-	test_when_finished "git rebase --abort" &&
+test_expect_success 'rebase (apply-backend)' '
+	test_when_finished "test_might_fail git rebase --abort" &&
 	git checkout -B testing localmods &&
 	# rebase (--apply) should not drop commits that start empty
 	git rebase --apply upstream &&
 
 	test_write_lines D C B A >expect &&
 	git log --format=%s >actual &&
-	test_cmp expect actual
+	test_todo test_cmp expect actual
 '
 
 test_expect_success 'rebase --merge --empty=drop' '
diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index e74a318ac33..fa7831c0674 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -790,7 +790,7 @@ test_expect_success SYMLINKS 'rm across a symlinked leading path (no index)' '
 	test_path_is_file e/f
 '
 
-test_expect_failure SYMLINKS 'rm across a symlinked leading path (w/ index)' '
+test_expect_success SYMLINKS 'rm across a symlinked leading path (w/ index)' '
 	rm -rf d e &&
 	mkdir d &&
 	echo content >d/f &&
@@ -798,10 +798,10 @@ test_expect_failure SYMLINKS 'rm across a symlinked leading path (w/ index)' '
 	git commit -m "d/f exists" &&
 	mv d e &&
 	ln -s e d &&
-	test_must_fail git rm d/f &&
-	git rev-parse --verify :d/f &&
+	test_todo test_must_fail git rm d/f &&
+	test_todo git rev-parse --verify :d/f &&
 	test -h d &&
-	test_path_is_file e/f
+	test_todo test_path_is_file e/f
 '
 
 test_expect_success 'setup for testing rm messages' '
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index c6479f24eb5..8978709b231 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -802,6 +802,7 @@ test_expect_failure () {
 	export test_prereq
 	if ! test_skip "$@"
 	then
+		test_todo_=test_expect_failure
 		test -n "$test_skip_test_preamble" ||
 		say >&3 "checking known breakage of $TEST_NUMBER.$test_count '$1': $2"
 		if test_run_ "$2" expecting_failure
@@ -825,9 +826,15 @@ test_expect_success () {
 	then
 		test -n "$test_skip_test_preamble" ||
 		say >&3 "expecting success of $TEST_NUMBER.$test_count '$1': $2"
+		test_todo_=test_expect_success
 		if test_run_ "$2"
 		then
-			test_ok_ "$1"
+			if test "$test_todo_" = "todo"
+			then
+				test_known_broken_failure_ "$1"
+			else
+				test_ok_ "$1"
+			fi
 		else
 			test_failure_ "$@"
 		fi
@@ -999,10 +1006,18 @@ list_contains () {
 }
 
 # Returns success if the arguments indicate that a command should be
-# accepted by test_must_fail(). If the command is run with env, the env
-# and its corresponding variable settings will be stripped before we
-# test the command being run.
+# accepted by test_must_fail() or test_todo(). If the command is run
+# with env, the env and its corresponding variable settings will be
+# stripped before we we test the command being run.
+#
+# test_todo() allows any of the assertions beginning test_ such as
+# test_cmp in addition the commands allowed by test_must_fail().
+
 test_must_fail_acceptable () {
+	local name
+	name="$1"
+	shift
+
 	if test "$1" = "env"
 	then
 		shift
@@ -1023,12 +1038,96 @@ test_must_fail_acceptable () {
 	git|__git*|test-tool|test_terminal)
 		return 0
 		;;
+	test_might_fail|test_todo|test_when_finished)
+		return 1
+		;;
+	test_must_fail)
+		if test "$name" = "todo"
+		then
+			shift
+			test_must_fail_acceptable must_fail "$@"
+			return $?
+		fi
+		return 1
+		;;
+	test_*)
+		test "$name" = "todo"
+		return $?
+		;;
 	*)
 		return 1
 		;;
 	esac
 }
 
+test_must_fail_helper () {
+	test_must_fail_name_="$1"
+	shift
+	case "$1" in
+	ok=*)
+		_test_ok=${1#ok=}
+		shift
+		;;
+	*)
+		_test_ok=
+		;;
+	esac
+	if ! test_must_fail_acceptable $test_must_fail_name_ "$@"
+	then
+		echo >&7 "test_$test_must_fail_name_: only 'git' is allowed: $*"
+		return 1
+	fi
+	"$@" 2>&7
+	exit_code=$?
+	if test $exit_code -eq 0 && ! list_contains "$_test_ok" success
+	then
+		echo >&4 "test_$test_must_fail_name_: command succeeded: $*"
+		return 1
+	elif test_match_signal 13 $exit_code && list_contains "$_test_ok" sigpipe
+	then
+		return 0
+	elif test $exit_code -gt 129 && test $exit_code -le 192
+	then
+		echo >&4 "test_$test_must_fail_name_: died by signal $(($exit_code - 128)): $*"
+		return 1
+	elif test $exit_code -eq 127
+	then
+		echo >&4 "test_$test_must_fail_name_: command not found: $*"
+		return 1
+	elif test $exit_code -eq 126
+	then
+		echo >&4 "test_$test_must_fail_name_: valgrind error: $*"
+		return 1
+	fi
+
+	return 0
+} 7>&2 2>&4
+
+# This is used to mark commands that should succeed but do not due to
+# a known issue. Marking the individual commands that fail rather than
+# using test_expect_failure allows us to detect any unexpected
+# failures. As with test_must_fail if the command is killed by a
+# signal the test will fail. If the command unexpectedly succeeds then
+# the test will also fail. For example:
+#
+#	test_expect_success 'test a known failure' '
+#		git foo 2>err &&
+#		test_todo test_must_be_empty err
+#	'
+#
+# This test will fail if "git foo" fails or err is unexpectedly empty.
+# test_todo can be used with "git" or any of the "test_*" assertions
+# such as test_cmp().
+
+test_todo () {
+	if test "$test_todo_" = "test_expect_failure"
+	then
+		BUG "test_todo_ cannot be used inside test_expect_failure"
+	fi
+	test_todo_=todo
+	test_must_fail_helper todo "$@" 2>&7
+} 7>&2 2>&4
+
 # This is not among top-level (test_expect_success | test_expect_failure)
 # but is a prefix that can be used in the test script, like:
 #
@@ -1061,43 +1160,7 @@ test_must_fail_acceptable () {
 #    ! grep pattern output
 
 test_must_fail () {
-	case "$1" in
-	ok=*)
-		_test_ok=${1#ok=}
-		shift
-		;;
-	*)
-		_test_ok=
-		;;
-	esac
-	if ! test_must_fail_acceptable "$@"
-	then
-		echo >&7 "test_must_fail: only 'git' is allowed: $*"
-		return 1
-	fi
-	"$@" 2>&7
-	exit_code=$?
-	if test $exit_code -eq 0 && ! list_contains "$_test_ok" success
-	then
-		echo >&4 "test_must_fail: command succeeded: $*"
-		return 1
-	elif test_match_signal 13 $exit_code && list_contains "$_test_ok" sigpipe
-	then
-		return 0
-	elif test $exit_code -gt 129 && test $exit_code -le 192
-	then
-		echo >&4 "test_must_fail: died by signal $(($exit_code - 128)): $*"
-		return 1
-	elif test $exit_code -eq 127
-	then
-		echo >&4 "test_must_fail: command not found: $*"
-		return 1
-	elif test $exit_code -eq 126
-	then
-		echo >&4 "test_must_fail: valgrind error: $*"
-		return 1
-	fi
-	return 0
+	test_must_fail_helper must_fail "$@" 2>&7
 } 7>&2 2>&4
 
 # Similar to test_must_fail, but tolerates success, too.  This is
@@ -1114,7 +1177,7 @@ test_must_fail () {
 # Accepts the same options as test_must_fail.
 
 test_might_fail () {
-	test_must_fail ok=success "$@" 2>&7
+	test_must_fail_helper might_fail ok=success "$@" 2>&7
 } 7>&2 2>&4
 
 # Similar to test_must_fail and test_might_fail, but check that a
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/3] [RFC] test_todo: allow [!] grep as the command
  2022-10-06 15:01 [PATCH 0/3] [RFC] tests: add test_todo() for known failures Phillip Wood via GitGitGadget
  2022-10-06 15:01 ` [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages Phillip Wood via GitGitGadget
@ 2022-10-06 15:01 ` Phillip Wood via GitGitGadget
  2022-10-06 15:56   ` Ævar Arnfjörð Bjarmason
  2022-10-06 15:01 ` [PATCH 3/3] [RFC] test_todo: allow [verbose] test " Phillip Wood via GitGitGadget
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 20+ messages in thread
From: Phillip Wood via GitGitGadget @ 2022-10-06 15:01 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Phillip Wood, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

Many failing tests use grep, this commit converts a sample to use
test_todo(). As POSIX specifies that a return code greater than one
indicates an error rather than a failing match we take care not the
hide that.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 t/t0000-basic.sh                | 20 ++++++++++--
 t/t3510-cherry-pick-sequence.sh | 12 +++----
 t/t4014-format-patch.sh         | 20 ++++++------
 t/test-lib-functions.sh         | 56 ++++++++++++++++++++++++++++-----
 4 files changed, 82 insertions(+), 26 deletions(-)

diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index 52362ad3dd3..db5c2059eb5 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -190,6 +190,14 @@ test_expect_success 'subtest: test_todo allowed arguments' '
 		"test_todo test_must_fail git --version"
 	test_expect_success "test_todo test_must_fail rejects bad command" \
 		"test_todo test_must_fail test_true"
+	test_expect_success "test_todo accepts grep" \
+		"echo a >a && test_todo grep b <a"
+	test_expect_success "test_todo accepts ! grep" \
+		"echo a >a && test_todo ! grep -v b <a"
+	test_expect_success "test_todo detects grep errors" \
+		"echo a >a && test_todo grep --invalid-option b <a"
+	test_expect_success "test_todo detects ! grep errors" \
+		"echo a >a && test_todo ! grep --invalid-option -v b <a"
 	test_done
 	EOF
 	check_sub_test_lib_test acceptable-test-todo <<-\EOF
@@ -199,9 +207,15 @@ test_expect_success 'subtest: test_todo allowed arguments' '
 	> not ok 3 - test_todo test_must_fail accepts good command # TODO known breakage
 	> not ok 4 - test_todo test_must_fail rejects bad command
 	> #	test_todo test_must_fail test_true
-	> # still have 2 known breakage(s)
-	> # failed 2 among remaining 2 test(s)
-	> 1..4
+	> not ok 5 - test_todo accepts grep # TODO known breakage
+	> not ok 6 - test_todo accepts ! grep # TODO known breakage
+	> not ok 7 - test_todo detects grep errors
+	> #	echo a >a && test_todo grep --invalid-option b <a
+	> not ok 8 - test_todo detects ! grep errors
+	> #	echo a >a && test_todo ! grep --invalid-option -v b <a
+	> # still have 4 known breakage(s)
+	> # failed 4 among remaining 4 test(s)
+	> 1..8
 	EOF
 '
 
diff --git a/t/t3510-cherry-pick-sequence.sh b/t/t3510-cherry-pick-sequence.sh
index 3b0fa66c33d..cd869255a2c 100755
--- a/t/t3510-cherry-pick-sequence.sh
+++ b/t/t3510-cherry-pick-sequence.sh
@@ -577,7 +577,7 @@ test_expect_success '--continue respects -x in first commit in multi-pick' '
 	grep "cherry picked from.*$picked" msg
 '
 
-test_expect_failure '--signoff is automatically propagated to resolved conflict' '
+test_expect_success '--signoff is automatically propagated to resolved conflict' '
 	pristine_detach initial &&
 	test_expect_code 1 git cherry-pick --signoff base..anotherpick &&
 	echo "c" >foo &&
@@ -591,11 +591,11 @@ test_expect_failure '--signoff is automatically propagated to resolved conflict'
 	git cat-file commit HEAD~3 >initial_msg &&
 	! grep "Signed-off-by:" initial_msg &&
 	grep "Signed-off-by:" unrelatedpick_msg &&
-	! grep "Signed-off-by:" picked_msg &&
+	test_todo ! grep "Signed-off-by:" picked_msg &&
 	grep "Signed-off-by:" anotherpick_msg
 '
 
-test_expect_failure '--signoff dropped for implicit commit of resolution, multi-pick case' '
+test_expect_success '--signoff dropped for implicit commit of resolution, multi-pick case' '
 	pristine_detach initial &&
 	test_must_fail git cherry-pick -s picked anotherpick &&
 	echo c >foo &&
@@ -605,10 +605,10 @@ test_expect_failure '--signoff dropped for implicit commit of resolution, multi-
 	git diff --exit-code HEAD &&
 	test_cmp_rev initial HEAD^^ &&
 	git cat-file commit HEAD^ >msg &&
-	! grep Signed-off-by: msg
+	test_todo ! grep Signed-off-by: msg
 '
 
-test_expect_failure 'sign-off needs to be reaffirmed after conflict resolution, single-pick case' '
+test_expect_success 'sign-off needs to be reaffirmed after conflict resolution, single-pick case' '
 	pristine_detach initial &&
 	test_must_fail git cherry-pick -s picked &&
 	echo c >foo &&
@@ -618,7 +618,7 @@ test_expect_failure 'sign-off needs to be reaffirmed after conflict resolution,
 	git diff --exit-code HEAD &&
 	test_cmp_rev initial HEAD^ &&
 	git cat-file commit HEAD >msg &&
-	! grep Signed-off-by: msg
+	test_todo ! grep Signed-off-by: msg
 '
 
 test_expect_success 'malformed instruction sheet 1' '
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index ad5c0292794..ab649bf27e5 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -165,12 +165,12 @@ test_expect_success 'additional command line cc (ascii)' '
 	grep "^ *S E Cipient <scipient@example.com>\$" hdrs5
 '
 
-test_expect_failure 'additional command line cc (rfc822)' '
+test_expect_success 'additional command line cc (rfc822)' '
 	git config --replace-all format.headers "Cc: R E Cipient <rcipient@example.com>" &&
 	git format-patch --cc="S. E. Cipient <scipient@example.com>" --stdout main..side >patch5 &&
 	sed -e "/^\$/q" patch5 >hdrs5 &&
 	grep "^Cc: R E Cipient <rcipient@example.com>,\$" hdrs5 &&
-	grep "^ *\"S. E. Cipient\" <scipient@example.com>\$" hdrs5
+	test_todo grep "^ *\"S. E. Cipient\" <scipient@example.com>\$" hdrs5
 '
 
 test_expect_success 'command line headers' '
@@ -195,16 +195,16 @@ test_expect_success 'command line To: header (ascii)' '
 	grep "^To: R E Cipient <rcipient@example.com>\$" hdrs8
 '
 
-test_expect_failure 'command line To: header (rfc822)' '
+test_expect_success 'command line To: header (rfc822)' '
 	git format-patch --to="R. E. Cipient <rcipient@example.com>" --stdout main..side >patch8 &&
 	sed -e "/^\$/q" patch8 >hdrs8 &&
-	grep "^To: \"R. E. Cipient\" <rcipient@example.com>\$" hdrs8
+	test_todo grep "^To: \"R. E. Cipient\" <rcipient@example.com>\$" hdrs8
 '
 
-test_expect_failure 'command line To: header (rfc2047)' '
+test_expect_success 'command line To: header (rfc2047)' '
 	git format-patch --to="R Ä Cipient <rcipient@example.com>" --stdout main..side >patch8 &&
 	sed -e "/^\$/q" patch8 >hdrs8 &&
-	grep "^To: =?UTF-8?q?R=20=C3=84=20Cipient?= <rcipient@example.com>\$" hdrs8
+	test_todo grep "^To: =?UTF-8?q?R=20=C3=84=20Cipient?= <rcipient@example.com>\$" hdrs8
 '
 
 test_expect_success 'configuration To: header (ascii)' '
@@ -214,18 +214,18 @@ test_expect_success 'configuration To: header (ascii)' '
 	grep "^To: R E Cipient <rcipient@example.com>\$" hdrs9
 '
 
-test_expect_failure 'configuration To: header (rfc822)' '
+test_expect_success 'configuration To: header (rfc822)' '
 	git config format.to "R. E. Cipient <rcipient@example.com>" &&
 	git format-patch --stdout main..side >patch9 &&
 	sed -e "/^\$/q" patch9 >hdrs9 &&
-	grep "^To: \"R. E. Cipient\" <rcipient@example.com>\$" hdrs9
+	test_todo grep "^To: \"R. E. Cipient\" <rcipient@example.com>\$" hdrs9
 '
 
-test_expect_failure 'configuration To: header (rfc2047)' '
+test_expect_success 'configuration To: header (rfc2047)' '
 	git config format.to "R Ä Cipient <rcipient@example.com>" &&
 	git format-patch --stdout main..side >patch9 &&
 	sed -e "/^\$/q" patch9 >hdrs9 &&
-	grep "^To: =?UTF-8?q?R=20=C3=84=20Cipient?= <rcipient@example.com>\$" hdrs9
+	test_todo grep "^To: =?UTF-8?q?R=20=C3=84=20Cipient?= <rcipient@example.com>\$" hdrs9
 '
 
 # check_patch <patch>: Verify that <patch> looks like a half-sane
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 8978709b231..aee10bd0706 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1010,8 +1010,9 @@ list_contains () {
 # with env, the env and its corresponding variable settings will be
 # stripped before we we test the command being run.
 #
-# test_todo() allows any of the assertions beginning test_ such as
-# test_cmp in addition the commands allowed by test_must_fail().
+# test_todo() allows "grep" or any of the assertions beginning test_
+# such as test_cmp in addition the commands allowed by
+# test_must_fail().
 
 test_must_fail_acceptable () {
 	local name
@@ -1050,9 +1051,21 @@ test_must_fail_acceptable () {
 		fi
 		return 1
 		;;
-	test_*)
-		test "$name" = "todo"
-		return $?
+	grep|test_*)
+		if test "$name" = "todo"
+		then
+			test_todo_command_="$1"
+			return 0
+		fi
+		return 1
+		;;
+	'!')
+		if test "$name" = "todo" && test "$2" = "grep"
+		then
+			test_todo_command_=grep
+			return 0
+		fi
+		return 1
 		;;
 	*)
 		return 1
@@ -1061,6 +1074,7 @@ test_must_fail_acceptable () {
 }
 
 test_must_fail_helper () {
+	test_todo_command_=
 	test_must_fail_name_="$1"
 	shift
 	case "$1" in
@@ -1077,8 +1091,27 @@ test_must_fail_helper () {
 		echo >&7 "test_$test_must_fail_name_: only 'git' is allowed: $*"
 		return 1
 	fi
+	test_invert_exit_code_=
+	if test "$1" = "!"
+	then
+		test_invert_exit_code_=1
+		shift
+	fi
 	"$@" 2>&7
 	exit_code=$?
+	if test -n "$test_invert_exit_code_"
+	# We only invert the exit code of grep. An exit code greater
+	# than 1 indicates an error rather than a failed match so
+	# we only want to swap zero and one.
+	then
+		if test $exit_code -eq 0
+		then
+			exit_code=1
+		elif test $exit_code -eq 1
+		then
+			exit_code=0
+		fi
+	fi
 	if test $exit_code -eq 0 && ! list_contains "$_test_ok" success
 	then
 		echo >&4 "test_$test_must_fail_name_: command succeeded: $*"
@@ -1098,6 +1131,15 @@ test_must_fail_helper () {
 	then
 		echo >&4 "test_$test_must_fail_name_: valgrind error: $*"
 		return 1
+	else
+		case "$test_todo_command_" in
+		grep)
+			if test $exit_code -ne 1
+			then
+			       echo >&4 "test_todo: $test_todo_command_ error: $*"
+			       return 1
+			fi;;
+		esac
 	fi
 
 	return 0
@@ -1116,8 +1158,8 @@ test_must_fail_helper () {
 #	'
 #
 # This test will fail if "git foo" fails or err is unexpectedly empty.
-# test_todo can be used with "git" or any of the "test_*" assertions
-# such as test_cmp().
+# test_todo can be used with "git", "grep" or any of the "test_*"
+# assertions such as test_cmp().
 
 test_todo () {
 	if test "$test_todo_" = "test_expect_failure"
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/3] [RFC] test_todo: allow [verbose] test as the command
  2022-10-06 15:01 [PATCH 0/3] [RFC] tests: add test_todo() for known failures Phillip Wood via GitGitGadget
  2022-10-06 15:01 ` [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages Phillip Wood via GitGitGadget
  2022-10-06 15:01 ` [PATCH 2/3] [RFC] test_todo: allow [!] grep as the command Phillip Wood via GitGitGadget
@ 2022-10-06 15:01 ` Phillip Wood via GitGitGadget
  2022-10-06 16:02   ` Ævar Arnfjörð Bjarmason
  2022-10-06 17:05 ` [PATCH 0/3] [RFC] tests: add test_todo() for known failures Junio C Hamano
  2022-10-06 19:28 ` Ævar Arnfjörð Bjarmason
  4 siblings, 1 reply; 20+ messages in thread
From: Phillip Wood via GitGitGadget @ 2022-10-06 15:01 UTC (permalink / raw)
  To: git
  Cc: Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Phillip Wood, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

For simplicity test_todo() allows verbose to precede any valid
command.  As POSIX specifies that a return code greater than one is an
error rather than a failed test we take care not to hide that.

I'm in two minds about this patch. Generally it is better to use one
of our test helpers such as test_cmp() rather than calling test
directly.  There are so few instances of test being used within
test_expect_failure() (the conversions here are not exhaustive but
there are not many more) that it would probably be better to convert
the tests by using the appropriate helper rather than supporting
calling test as the command to test_todo().

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 t/t0000-basic.sh        | 20 +++++++++++++++++---
 t/t0050-filesystem.sh   |  4 ++--
 t/t3903-stash.sh        | 12 ++++++------
 t/test-lib-functions.sh | 17 +++++++++++------
 4 files changed, 36 insertions(+), 17 deletions(-)

diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index db5c2059eb5..93d3930d9f6 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -198,6 +198,14 @@ test_expect_success 'subtest: test_todo allowed arguments' '
 		"echo a >a && test_todo grep --invalid-option b <a"
 	test_expect_success "test_todo detects ! grep errors" \
 		"echo a >a && test_todo ! grep --invalid-option -v b <a"
+	test_expect_success "test_todo accepts test" \
+		"test_todo test -z a"
+	test_expect_success "test_todo detects test errors" \
+		"test_todo test a xxx b"
+	test_expect_success "test_todo skips verbose and accepts good command" \
+		"test_todo verbose test -z a"
+	test_expect_success "test_todo skips verbose and rejects bad command" \
+		"test_todo verbose false"
 	test_done
 	EOF
 	check_sub_test_lib_test acceptable-test-todo <<-\EOF
@@ -213,9 +221,15 @@ test_expect_success 'subtest: test_todo allowed arguments' '
 	> #	echo a >a && test_todo grep --invalid-option b <a
 	> not ok 8 - test_todo detects ! grep errors
 	> #	echo a >a && test_todo ! grep --invalid-option -v b <a
-	> # still have 4 known breakage(s)
-	> # failed 4 among remaining 4 test(s)
-	> 1..8
+	> not ok 9 - test_todo accepts test # TODO known breakage
+	> not ok 10 - test_todo detects test errors
+	> #	test_todo test a xxx b
+	> not ok 11 - test_todo skips verbose and accepts good command # TODO known breakage
+	> not ok 12 - test_todo skips verbose and rejects bad command
+	> #	test_todo verbose false
+	> # still have 6 known breakage(s)
+	> # failed 6 among remaining 6 test(s)
+	> 1..12
 	EOF
 '
 
diff --git a/t/t0050-filesystem.sh b/t/t0050-filesystem.sh
index 325eb1c3cd0..16b933a4c8d 100755
--- a/t/t0050-filesystem.sh
+++ b/t/t0050-filesystem.sh
@@ -100,7 +100,7 @@ test_expect_success CASE_INSENSITIVE_FS 'add directory (with different case)' '
 	test_cmp expected actual
 '
 
-test_expect_failure CASE_INSENSITIVE_FS 'add (with different case)' '
+test_expect_success CASE_INSENSITIVE_FS 'add (with different case)' '
 	git reset --hard initial &&
 	rm camelcase &&
 	echo 1 >CamelCase &&
@@ -108,7 +108,7 @@ test_expect_failure CASE_INSENSITIVE_FS 'add (with different case)' '
 	git ls-files >tmp &&
 	camel=$(grep -i camelcase tmp) &&
 	test $(echo "$camel" | wc -l) = 1 &&
-	test "z$(git cat-file blob :$camel)" = z1
+	test_todo test "z$(git cat-file blob :$camel)" = z1
 '
 
 test_expect_success "setup unicode normalization tests" '
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 376cc8f4ab8..afc5b4126f3 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -556,7 +556,7 @@ test_expect_success 'unstash must re-create the file' '
 	test bar = "$(cat file)"
 '
 
-test_expect_failure 'stash directory to file' '
+test_expect_success 'stash directory to file' '
 	git reset --hard &&
 	mkdir dir &&
 	echo foo >dir/file &&
@@ -567,12 +567,12 @@ test_expect_failure 'stash directory to file' '
 	git stash save "directory to file" &&
 	test_path_is_dir dir &&
 	test foo = "$(cat dir/file)" &&
-	test_must_fail git stash apply &&
-	test bar = "$(cat dir)" &&
+	test_todo test_must_fail git stash apply &&
+	test_todo test bar = "$(cat dir)" &&
 	git reset --soft HEAD^
 '
 
-test_expect_failure 'stash file to directory' '
+test_expect_success 'stash file to directory' '
 	git reset --hard &&
 	rm file &&
 	mkdir file &&
@@ -581,8 +581,8 @@ test_expect_failure 'stash file to directory' '
 	test_path_is_file file &&
 	test bar = "$(cat file)" &&
 	git stash apply &&
-	test_path_is_file file/file &&
-	test foo = "$(cat file/file)"
+	test_todo test_path_is_file file/file &&
+	test_todo test foo = "$(cat file/file)"
 '
 
 test_expect_success 'giving too many ref arguments does not modify files' '
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index aee10bd0706..068a0702809 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1010,8 +1010,8 @@ list_contains () {
 # with env, the env and its corresponding variable settings will be
 # stripped before we we test the command being run.
 #
-# test_todo() allows "grep" or any of the assertions beginning test_
-# such as test_cmp in addition the commands allowed by
+# test_todo() allows "grep", "test" or any of the assertions beginning
+# test_ such as test_cmp in addition the commands allowed by
 # test_must_fail().
 
 test_must_fail_acceptable () {
@@ -1019,6 +1019,11 @@ test_must_fail_acceptable () {
 	name="$1"
 	shift
 
+	if test "$name" = "todo" && test "$1" = "verbose"
+	then
+		shift
+	fi
+
 	if test "$1" = "env"
 	then
 		shift
@@ -1051,7 +1056,7 @@ test_must_fail_acceptable () {
 		fi
 		return 1
 		;;
-	grep|test_*)
+	grep|test|test_*)
 		if test "$name" = "todo"
 		then
 			test_todo_command_="$1"
@@ -1133,7 +1138,7 @@ test_must_fail_helper () {
 		return 1
 	else
 		case "$test_todo_command_" in
-		grep)
+		grep|test)
 			if test $exit_code -ne 1
 			then
 			       echo >&4 "test_todo: $test_todo_command_ error: $*"
@@ -1158,8 +1163,8 @@ test_must_fail_helper () {
 #	'
 #
 # This test will fail if "git foo" fails or err is unexpectedly empty.
-# test_todo can be used with "git", "grep" or any of the "test_*"
-# assertions such as test_cmp().
+# test_todo can be used with "git", "grep", "test" or any of the
+# "test_*" assertions such as test_cmp().
 
 test_todo () {
 	if test "$test_todo_" = "test_expect_failure"
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages
  2022-10-06 15:01 ` [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages Phillip Wood via GitGitGadget
@ 2022-10-06 15:36   ` Ævar Arnfjörð Bjarmason
  2022-10-06 16:10     ` Phillip Wood
  2022-12-06 22:37   ` Victoria Dye
  1 sibling, 1 reply; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-06 15:36 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget; +Cc: git, Derrick Stolee, Phillip Wood


On Thu, Oct 06 2022, Phillip Wood via GitGitGadget wrote:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> test_todo() is intended as a fine grained replacement for
> test_expect_failure(). Rather than marking the whole test as failing
> test_todo() is used to mark individual failing commands within a
> test. This approach to writing failing tests allows us to detect
> unexpected failures that are hidden by test_expect_failure().
>
> Failing commands are reported by the test harness in the same way as
> test_expect_failure() so there is no change in output when migrating
> from test_expect_failure() to test_todo(). If a command marked with
> test_todo() succeeds then the test will fail. This is designed to make
> it easier to see when a command starts succeeding in our CI compared
> to using test_expect_failure() where it is easy to fix a failing test
> case and not realize it.
>
> test_todo() is built upon test_expect_failure() but accepts commands
> starting with test_* in addition to git. As our test_* assertions use
> BUG() to signal usage errors any such error will not be hidden by
> test_todo().

I think they will, unless I'm missing something. E.g. try out:
	
	diff --git a/t/t0210-trace2-normal.sh b/t/t0210-trace2-normal.sh
	index 80e76a4695e..1be895abba6 100755
	--- a/t/t0210-trace2-normal.sh
	+++ b/t/t0210-trace2-normal.sh
	@@ -170,7 +170,7 @@ test_expect_success 'BUG messages are written to trace2' '
	 
	 test_expect_success 'bug messages with BUG_if_bug() are written to trace2' '
	 	test_when_finished "rm trace.normal actual expect" &&
	-	test_expect_code 99 env GIT_TRACE2="$(pwd)/trace.normal" \
	+	test_todo env GIT_TRACE2="$(pwd)/trace.normal" \
	 		test-tool trace2 008bug 2>err &&
	 	cat >expect <<-\EOF &&
	 	a bug message

I.e. in our tests you need to look out for exit code 99, not the usual
abort().

I have local patches to fix this, previously submitted as an RFC here:
https://lore.kernel.org/git/RFC-cover-0.3-00000000000-20220525T234908Z-avarab@gmail.com/

I just rebased that & CI is currently running, I'll see how it does:
https://github.com/avar/git/tree/avar/usage-do-not-abort-on-BUG-to-get-trace2-event-2

When I merged your patches here with that topic yours started doing the
right thing in this case, i.e. a "test_todo" that would get a BUG()"
would be reported as a failure.

> +	test_true () {
> +		true
> +	}
> +	test_expect_success "pretend we have fixed a test_todo breakage" \
> +		"test_todo test_true"

"Why the indirection", until I realized that it's because you're working
around the whitelist of commands that we have, i.e. we allow 'git' and
'test-tool', but not 'grep' or whatever.

I'm of the opinion that we should just drop that limitation altogether,
which is shown to be pretty silly in this case. I.e. at some point we
listed "test_*" as "this invokes a git binary", but a lot of our test_*
commands don't, including this one.

So in general I think we should just allow any command in
"test_must_fail" et al, and just catch in code review if someone uses
"test_must_fail grep" as opposed to "! grep".

But for the particular case of "test_todo" doing so seems like pointless
work, if we think we're going to miss this sort of thing in review in
general, then surely that's not a concern if we're going to very
prominently mark tests as TODO tests, given how the test of the output
shows them?

> +test_must_fail_helper () {
> +	test_must_fail_name_="$1"
> +	shift
> +	case "$1" in
> +	ok=*)
> +		_test_ok=${1#ok=}
> +		shift
> +		;;
> +	*)
> +		_test_ok=
> +		;;
> +	esac
> +	if ! test_must_fail_acceptable $test_must_fail_name_ "$@"
> +	then
> +		echo >&7 "test_$test_must_fail_name_: only 'git' is allowed: $*"
> +		return 1
> +	fi
> +	"$@" 2>&7
> +	exit_code=$?
> +	if test $exit_code -eq 0 && ! list_contains "$_test_ok" success
> +	then
> +		echo >&4 "test_$test_must_fail_name_: command succeeded: $*"
> +		return 1
> +	elif test_match_signal 13 $exit_code && list_contains "$_test_ok" sigpipe
> +	then
> +		return 0
> +	elif test $exit_code -gt 129 && test $exit_code -le 192
> +	then
> +		echo >&4 "test_$test_must_fail_name_: died by signal $(($exit_code - 128)): $*"
> +		return 1
> +	elif test $exit_code -eq 127
> +	then
> +		echo >&4 "test_$test_must_fail_name_: command not found: $*"
> +		return 1
> +	elif test $exit_code -eq 126
> +	then
> +		echo >&4 "test_$test_must_fail_name_: valgrind error: $*"
> +		return 1
> +	fi
> +
> +	return 0
> +} 7>&2 2>&4
> +
> +# This is used to mark commands that should succeed but do not due to
> +# a known issue. Marking the individual commands that fail rather than
> +# using test_expect_failure allows us to detect any unexpected
> +# failures. As with test_must_fail if the command is killed by a
> +# signal the test will fail. If the command unexpectedly succeeds then
> +# the test will also fail. For example:
> +#
> +#	test_expect_success 'test a known failure' '
> +#		git foo 2>err &&
> +#		test_todo test_must_be_empty err
> +#	'
> +#
> +# This test will fail if "git foo" fails or err is unexpectedly empty.
> +# test_todo can be used with "git" or any of the "test_*" assertions
> +# such as test_cmp().
> +
> +test_todo () {
> +	if test "$test_todo_" = "test_expect_failure"
> +	then
> +		BUG "test_todo_ cannot be used inside test_expect_failure"
> +	fi
> +	test_todo_=todo
> +	test_must_fail_helper todo "$@" 2>&7
> +} 7>&2 2>&4
> +
>  # This is not among top-level (test_expect_success | test_expect_failure)
>  # but is a prefix that can be used in the test script, like:
>  #
> @@ -1061,43 +1160,7 @@ test_must_fail_acceptable () {
>  #    ! grep pattern output
>  
>  test_must_fail () {
> -	case "$1" in
> -	ok=*)
> -		_test_ok=${1#ok=}
> -		shift
> -		;;
> -	*)
> -		_test_ok=
> -		;;
> -	esac
> -	if ! test_must_fail_acceptable "$@"
> -	then
> -		echo >&7 "test_must_fail: only 'git' is allowed: $*"
> -		return 1
> -	fi
> -	"$@" 2>&7
> -	exit_code=$?
> -	if test $exit_code -eq 0 && ! list_contains "$_test_ok" success
> -	then
> -		echo >&4 "test_must_fail: command succeeded: $*"
> -		return 1
> -	elif test_match_signal 13 $exit_code && list_contains "$_test_ok" sigpipe
> -	then
> -		return 0
> -	elif test $exit_code -gt 129 && test $exit_code -le 192
> -	then
> -		echo >&4 "test_must_fail: died by signal $(($exit_code - 128)): $*"
> -		return 1
> -	elif test $exit_code -eq 127
> -	then
> -		echo >&4 "test_must_fail: command not found: $*"
> -		return 1
> -	elif test $exit_code -eq 126
> -	then
> -		echo >&4 "test_must_fail: valgrind error: $*"
> -		return 1
> -	fi
> -	return 0
> +	test_must_fail_helper must_fail "$@" 2>&7
>  } 7>&2 2>&4
>  
>  # Similar to test_must_fail, but tolerates success, too.  This is
> @@ -1114,7 +1177,7 @@ test_must_fail () {
>  # Accepts the same options as test_must_fail.
>  
>  test_might_fail () {
> -	test_must_fail ok=success "$@" 2>&7
> +	test_must_fail_helper might_fail ok=success "$@" 2>&7
>  } 7>&2 2>&4
>  
>  # Similar to test_must_fail and test_might_fail, but check that a

I remember finding it annoying that "test_might_fail" is misreported
from test_must_fail_acceptable as being called "test_must_fail", so this
refactoring is very welcome.

But can you please split this into its own commit? I.e. that improvement
can stand on its own, i.e. just a change that has "test_must_fail" and
"test_might_fail" reporting their correct name.

Then this commit can follow, and then you'll just need to add (for this part):

	test_must_fail_helper todo "$@" 2>&7

And it'll make the resulting diff much smaller & easier to follow.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] [RFC] test_todo: allow [!] grep as the command
  2022-10-06 15:01 ` [PATCH 2/3] [RFC] test_todo: allow [!] grep as the command Phillip Wood via GitGitGadget
@ 2022-10-06 15:56   ` Ævar Arnfjörð Bjarmason
  2022-10-06 16:42     ` Phillip Wood
  0 siblings, 1 reply; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-06 15:56 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget; +Cc: git, Derrick Stolee, Phillip Wood

On Thu, Oct 06 2022, Phillip Wood via GitGitGadget wrote:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> Many failing tests use grep, this commit converts a sample to use
> test_todo(). As POSIX specifies that a return code greater than one
> indicates an error rather than a failing match we take care not the
> hide that.

Ah, so on the one hand this gives me second thoughts about my stance
in[1], i.e. if we just allowed any command we wouldn't be forced to add
these sorts of special-cases.

Although, we could also allow any command, and just add smartness for
ones we know about, e.g. "grep".

But I do find doing this to be weirdly inconsistent, i.e. caring about
the difference between these two:

	$ grep blah README.md; echo $?
	1
	$ grep blah README.mdx; echo $?
	grep: README.mdx: No such file or directory
	2

Is basically why I took the approach I did in my [2], i.e. to force us
to positively assert *what* the bad behavior should be.

Which is why I ended up doing my verison of this sort of thing as [3],
i.e. you'd need to assert what specific exit code you expected, or
equivalent.

But at this point in the series:

	diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
	index fa7831c0674..086eaf91351 100755
	--- a/t/t3600-rm.sh
	+++ b/t/t3600-rm.sh
	@@ -801,7 +801,7 @@ test_expect_success SYMLINKS 'rm across a symlinked leading path (w/ index)' '
	 	test_todo test_must_fail git rm d/f &&
	 	test_todo git rev-parse --verify :d/f &&
	 	test -h d &&
	-	test_todo test_path_is_file e/f
	+	test_todo test_path_is_file blah
	 '

	 test_expect_success 'setup for testing rm messages' '

So, for our own test_path_* helpers we're not going to care at all, and
any failure will do (including a missing file), but we will care for
grep?

I'm obviously more on the "let's care" side, I just find it odd that
you've picked this halfway point here, but not for other things you're
wrapping.

1. https://lore.kernel.org/git/221006.868rltrltu.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/cover-0.7-00000000000-20220318T002951Z-avarab@gmail.com/
3. https://lore.kernel.org/git/patch-5.7-553670da8a9-20220318T002951Z-avarab@gmail.com/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] [RFC] test_todo: allow [verbose] test as the command
  2022-10-06 15:01 ` [PATCH 3/3] [RFC] test_todo: allow [verbose] test " Phillip Wood via GitGitGadget
@ 2022-10-06 16:02   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-06 16:02 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget; +Cc: git, Derrick Stolee, Phillip Wood


On Thu, Oct 06 2022, Phillip Wood via GitGitGadget wrote:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> For simplicity test_todo() allows verbose to precede any valid
> command.  As POSIX specifies that a return code greater than one is an
> error rather than a failed test we take care not to hide that.
>
> I'm in two minds about this patch. Generally it is better to use one
> of our test helpers such as test_cmp() rather than calling test
> directly.  There are so few instances of test being used within
> test_expect_failure() (the conversions here are not exhaustive but
> there are not many more) that it would probably be better to convert
> the tests by using the appropriate helper rather than supporting
> calling test as the command to test_todo().

I think that we might want to salvage parts of this, but we really
shouldn't be spending review time on carrying forward a bad pattern that
hides segfaults.

I.e. whatever we do about "test_todo"'s interaction with "test" let's
first change things like...

> -test_expect_failure CASE_INSENSITIVE_FS 'add (with different case)' '
> +test_expect_success CASE_INSENSITIVE_FS 'add (with different case)' '
>  	git reset --hard initial &&
>  	rm camelcase &&
>  	echo 1 >CamelCase &&
> @@ -108,7 +108,7 @@ test_expect_failure CASE_INSENSITIVE_FS 'add (with different case)' '
>  	git ls-files >tmp &&
>  	camel=$(grep -i camelcase tmp) &&
>  	test $(echo "$camel" | wc -l) = 1 &&
> -	test "z$(git cat-file blob :$camel)" = z1
> +	test_todo test "z$(git cat-file blob :$camel)" = z1

...this to e.g.:

	echo z1 >expect &&
	git cat-file blob :$camel >actual &&
	test_cmp expect actual

Or whatever, then let's see if migrating "verbose" is worthwhile, in the
post-image you end up with no real users of it, only your tests.

I've wanted to just remove it for a while, all its users seem to be
either bad uses like that, or we'd get much better bang for the buck out
of it by having a t/verbose-bin/ or whatever, which would just wrap
arbitrary commands like "grep" and the like (i.e. ones where we could
provide useful context).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages
  2022-10-06 15:36   ` Ævar Arnfjörð Bjarmason
@ 2022-10-06 16:10     ` Phillip Wood
  2022-10-06 20:33       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 20+ messages in thread
From: Phillip Wood @ 2022-10-06 16:10 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason,
	Phillip Wood via GitGitGadget
  Cc: git, Derrick Stolee

Hi Ævar

On 06/10/2022 16:36, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Oct 06 2022, Phillip Wood via GitGitGadget wrote:
> 
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
 >>
>> test_todo() is built upon test_expect_failure() but accepts commands
>> starting with test_* in addition to git. As our test_* assertions use
>> BUG() to signal usage errors any such error will not be hidden by
>> test_todo().
> 
> I think they will, unless I'm missing something. E.g. try out:

It's talking about BUG() in test-lib.sh, not the C function. For example

test_path_is_file () {
	test "$#" -ne 1 && BUG "1 param"
	if ! test -f "$1"
	then
		echo "File $1 doesn't exist"
		false
	fi
}

So a test containing "test_todo test_path_is_file a b" should fail as 
BUG calls exit rather than returning non-zero (I should probably test 
that in 0000-basic.sh)

> 	diff --git a/t/t0210-trace2-normal.sh b/t/t0210-trace2-normal.sh
> 	index 80e76a4695e..1be895abba6 100755
> 	--- a/t/t0210-trace2-normal.sh
> 	+++ b/t/t0210-trace2-normal.sh
> 	@@ -170,7 +170,7 @@ test_expect_success 'BUG messages are written to trace2' '
> 	
> 	 test_expect_success 'bug messages with BUG_if_bug() are written to trace2' '
> 	 	test_when_finished "rm trace.normal actual expect" &&
> 	-	test_expect_code 99 env GIT_TRACE2="$(pwd)/trace.normal" \
> 	+	test_todo env GIT_TRACE2="$(pwd)/trace.normal" \
> 	 		test-tool trace2 008bug 2>err &&
> 	 	cat >expect <<-\EOF &&
> 	 	a bug message
> 
> I.e. in our tests you need to look out for exit code 99, not the usual
> abort().
> 
> I have local patches to fix this, previously submitted as an RFC here:
> https://lore.kernel.org/git/RFC-cover-0.3-00000000000-20220525T234908Z-avarab@gmail.com/
> 
> I just rebased that & CI is currently running, I'll see how it does:
> https://github.com/avar/git/tree/avar/usage-do-not-abort-on-BUG-to-get-trace2-event-2
> 
> When I merged your patches here with that topic yours started doing the
> right thing in this case, i.e. a "test_todo" that would get a BUG()"
> would be reported as a failure.
> 
>> +	test_true () {
>> +		true
>> +	}
>> +	test_expect_success "pretend we have fixed a test_todo breakage" \
>> +		"test_todo test_true"
> 
> "Why the indirection", until I realized that it's because you're working
> around the whitelist of commands that we have, i.e. we allow 'git' and
> 'test-tool', but not 'grep' or whatever.
> 
> I'm of the opinion that we should just drop that limitation altogether,
> which is shown to be pretty silly in this case. I.e. at some point we
> listed "test_*" as "this invokes a git binary", but a lot of our test_*
> commands don't, including this one.

test_expect_failure does not allow test_* are you thinking of test-tool?

> So in general I think we should just allow any command in
> "test_must_fail" et al, and just catch in code review if someone uses
> "test_must_fail grep" as opposed to "! grep".

That is not going to scale well

> But for the particular case of "test_todo" doing so seems like pointless
> work, if we think we're going to miss this sort of thing in review in
> general, then surely that's not a concern if we're going to very
> prominently mark tests as TODO tests, given how the test of the output
> shows them?

 >[...]
>>   test_might_fail () {
>> -	test_must_fail ok=success "$@" 2>&7
>> +	test_must_fail_helper might_fail ok=success "$@" 2>&7
>>   } 7>&2 2>&4
>>   
>>   # Similar to test_must_fail and test_might_fail, but check that a
> 
> I remember finding it annoying that "test_might_fail" is misreported
> from test_must_fail_acceptable as being called "test_must_fail", so this
> refactoring is very welcome.
> 
> But can you please split this into its own commit? I.e. that improvement
> can stand on its own, i.e. just a change that has "test_must_fail" and
> "test_might_fail" reporting their correct name.
 >
> Then this commit can follow, and then you'll just need to add (for this part) >
> 	test_must_fail_helper todo "$@" 2>&7
> 
> And it'll make the resulting diff much smaller & easier to follow.

Sure, I should also improve the error message in

 >> +		echo >&7 "test_$test_must_fail_name_: only 'git' is allowed: $*"

for test_todo.

Best Wishes

Phillip

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] [RFC] test_todo: allow [!] grep as the command
  2022-10-06 15:56   ` Ævar Arnfjörð Bjarmason
@ 2022-10-06 16:42     ` Phillip Wood
  2022-10-06 20:26       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 20+ messages in thread
From: Phillip Wood @ 2022-10-06 16:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason,
	Phillip Wood via GitGitGadget
  Cc: git, Derrick Stolee

On 06/10/2022 16:56, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Oct 06 2022, Phillip Wood via GitGitGadget wrote:
> 
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> Many failing tests use grep, this commit converts a sample to use
>> test_todo(). As POSIX specifies that a return code greater than one
>> indicates an error rather than a failing match we take care not the
>> hide that.
> 
> Ah, so on the one hand this gives me second thoughts about my stance
> in[1], i.e. if we just allowed any command we wouldn't be forced to add
> these sorts of special-cases.
> 
> Although, we could also allow any command, and just add smartness for
> ones we know about, e.g. "grep".
> 
> But I do find doing this to be weirdly inconsistent, i.e. caring about
> the difference between these two:
> 
> 	$ grep blah README.md; echo $?
> 	1
> 	$ grep blah README.mdx; echo $?
> 	grep: README.mdx: No such file or directory
> 	2

The intent was to catch bad options, not missing files (i.e. we don't 
want test_todo to hide a failure from "grep --invalid-option"). We could 
check the file exists and skip running grep if it does not (hopefully 
the test wont be grepping multiple files in a single command)

> Is basically why I took the approach I did in my [2], i.e. to force us
> to positively assert *what* the bad behavior should be.

That is what made the end result so hard to use though

	test_todo \
		--want "test_must_fail git" \
		--reset "git reset --hard" \
		--expect git \
		-- \
		rm d/f &&

is not exactly readable.

Best Wishes

Phillip

> Which is why I ended up doing my verison of this sort of thing as [3],
> i.e. you'd need to assert what specific exit code you expected, or
> equivalent.
> 
> But at this point in the series:
> 	
> 	diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
> 	index fa7831c0674..086eaf91351 100755
> 	--- a/t/t3600-rm.sh
> 	+++ b/t/t3600-rm.sh
> 	@@ -801,7 +801,7 @@ test_expect_success SYMLINKS 'rm across a symlinked leading path (w/ index)' '
> 	 	test_todo test_must_fail git rm d/f &&
> 	 	test_todo git rev-parse --verify :d/f &&
> 	 	test -h d &&
> 	-	test_todo test_path_is_file e/f
> 	+	test_todo test_path_is_file blah
> 	 '
> 	
> 	 test_expect_success 'setup for testing rm messages' '
> 
> So, for our own test_path_* helpers we're not going to care at all, and
> any failure will do (including a missing file), but we will care for
> grep?
> 
> I'm obviously more on the "let's care" side, I just find it odd that
> you've picked this halfway point here, but not for other things you're
> wrapping.
> 
> 1. https://lore.kernel.org/git/221006.868rltrltu.gmgdl@evledraar.gmail.com/
> 2. https://lore.kernel.org/git/cover-0.7-00000000000-20220318T002951Z-avarab@gmail.com/
> 3. https://lore.kernel.org/git/patch-5.7-553670da8a9-20220318T002951Z-avarab@gmail.com/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/3] [RFC] tests: add test_todo() for known failures
  2022-10-06 15:01 [PATCH 0/3] [RFC] tests: add test_todo() for known failures Phillip Wood via GitGitGadget
                   ` (2 preceding siblings ...)
  2022-10-06 15:01 ` [PATCH 3/3] [RFC] test_todo: allow [verbose] test " Phillip Wood via GitGitGadget
@ 2022-10-06 17:05 ` Junio C Hamano
  2022-10-06 19:28 ` Ævar Arnfjörð Bjarmason
  4 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2022-10-06 17:05 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget
  Cc: git, Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Phillip Wood

"Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:

> test_todo() is intended as a fine grained alternative to
> test_expect_failure(). Rather than marking the whole test as failing
> test_todo() is used to mark individual failing commands within a test. This
> approach to writing failing tests allows us to detect unexpected failures
> that are hidden by test_expect_failure().
>
> This series attempts to keep most of the benefits test_expect_todo()
> previously proposed by Ævar[1] while being simpler to use.

Great.  We discussed this some time ago and I am happy to see the
work re-ignited.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/3] [RFC] tests: add test_todo() for known failures
  2022-10-06 15:01 [PATCH 0/3] [RFC] tests: add test_todo() for known failures Phillip Wood via GitGitGadget
                   ` (3 preceding siblings ...)
  2022-10-06 17:05 ` [PATCH 0/3] [RFC] tests: add test_todo() for known failures Junio C Hamano
@ 2022-10-06 19:28 ` Ævar Arnfjörð Bjarmason
  2022-10-07 13:26   ` Phillip Wood
  4 siblings, 1 reply; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-06 19:28 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget; +Cc: git, Derrick Stolee, Phillip Wood


On Thu, Oct 06 2022, Phillip Wood via GitGitGadget wrote:

> test_todo() is intended as a fine grained alternative to
> test_expect_failure(). Rather than marking the whole test as failing
> test_todo() is used to mark individual failing commands within a test. This
> approach to writing failing tests allows us to detect unexpected failures
> that are hidden by test_expect_failure().
>
> This series attempts to keep most of the benefits test_expect_todo()
> previously proposed by Ævar[1] while being simpler to use.
>
> [1]
> https://lore.kernel.org/git/cover-0.7-00000000000-20220318T002951Z-avarab@gmail.com/

I like the interface you've got here much better than the one I
submitted in [1], so much that it's what I tried to write at first :)

But as you noted in 1/3:

	test_todo cannot be used in a subshell.

So when we do this:
	
	diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
	index 93d3930d9f6..75b84a09592 100755
	--- a/t/t0000-basic.sh
	+++ b/t/t0000-basic.sh
	@@ -147,7 +147,7 @@ test_expect_success 'subtest: a failing test_todo' '
	 		false
	 	}
	 	test_expect_success "passing test" "true"
	-	test_expect_success "known todo" "test_todo test_false"
	+	test_expect_success "known todo" "(test_todo test_false)"
	 	test_done
	 	EOF
	 	check_sub_test_lib_test failing-test-todo <<-\EOF

We'll get:
	
	+ diff -u failing-test-todo/expect.out failing-test-todo/out
	--- failing-test-todo/expect.out        2022-10-06 19:30:14.093338392 +0000
	+++ failing-test-todo/out       2022-10-06 19:30:14.093338392 +0000
	@@ -1,5 +1,4 @@
	 ok 1 - passing test
	-not ok 2 - known todo # TODO known breakage
	-# still have 1 known breakage(s)
	-# passed all remaining 1 test(s)
	+ok 2 - known todo
	+# passed all 2 test(s)
	 1..2

What I was initially trying to do when I tried this approach was to make
the "test_todo" be the equivalent of a sub-test, i.e. when we encounter
one we'd say "ok N - DESC" for the current test so far, and then an "ok
N+1 - DESC # TODO: $cmd" for the "test_todo" command.

I think I lost the code for that, but I tried hacking something rough up
on top of your series. I don't think it's a viable approach but it works
as long as we don't have a subshell (the "remaining N tests" count is
off, but it's fixable, I just couldn't be bothered for a WIP hack):
	
	diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
	index 93d3930d9f6..7e8e0a54558 100755
	--- a/t/t0000-basic.sh
	+++ b/t/t0000-basic.sh
	@@ -148,14 +148,17 @@ test_expect_success 'subtest: a failing test_todo' '
	 	}
	 	test_expect_success "passing test" "true"
	 	test_expect_success "known todo" "test_todo test_false"
	+	test_expect_success "passing test 2" "true"
	 	test_done
	 	EOF
	 	check_sub_test_lib_test failing-test-todo <<-\EOF
	 	> ok 1 - passing test
	-	> not ok 2 - known todo # TODO known breakage
	+	> not ok 2 - known todo: test_false # TODO known breakage
	+	> ok 3 - known todo (post-test_todo)
	+	> ok 4 - passing test 2
	 	> # still have 1 known breakage(s)
	-	> # passed all remaining 1 test(s)
	-	> 1..2
	+	> # passed all remaining 3 test(s)
	+	> 1..4
	 	EOF
	 '
	 
	@@ -171,7 +174,8 @@ test_expect_success 'subtest: a passing test_todo' '
	 	check_sub_test_lib_test passing-test-todo <<-\EOF
	 	> not ok 1 - pretend we have fixed a test_todo breakage
	 	> #	test_todo test_true
	-	> # failed 1 among 1 test(s)
	+	> # 1 known breakage(s) vanished; please update test(s)
	+	> # failed 1 among remaining 0 test(s)
	 	> 1..1
	 	EOF
	 '
	diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
	index 068a0702809..54365fe202f 100644
	--- a/t/test-lib-functions.sh
	+++ b/t/test-lib-functions.sh
	@@ -826,15 +826,12 @@ test_expect_success () {
	 	then
	 		test -n "$test_skip_test_preamble" ||
	 		say >&3 "expecting success of $TEST_NUMBER.$test_count '$1': $2"
	-		test_todo_=test_expect_success
	+		test_todo_title_="$1"
	+		test_had_todo_=
	 		if test_run_ "$2"
	 		then
	-			if test "$test_todo_" = "todo"
	-			then
	-				test_known_broken_failure_ "$1"
	-			else
	-				test_ok_ "$1"
	-			fi
	+			test_ok_ "$1${test_had_todo_:+ (post-test_todo)}"
	+			test_had_todo_=
	 		else
	 			test_failure_ "$@"
	 		fi
	@@ -1167,12 +1164,26 @@ test_must_fail_helper () {
	 # "test_*" assertions such as test_cmp().
	 
	 test_todo () {
	+	if test -z "$test_todo_title_"
	+	then
	+		BUG 'test_todo: expected a $test_todo_title_'
	+	fi &&
	 	if test "$test_todo_" = "test_expect_failure"
	 	then
	 		BUG "test_todo_ cannot be used inside test_expect_failure"
	+	fi &&
	+	# Tell "test_expect_success" it had a "test_todo"
	+	test_had_todo_=1 &&
	+	# We say that the test up until this point is OK, and emit an "ok .." for it.
	+	test_ok_ "$test_todo_title_" &&
	+	if test_must_fail_helper todo "$@" 2>&7
	+	then
	+		test_known_broken_failure_ "$test_todo_title_: $*" 1>&5 2>&6 &&
	+		test_count=$(($test_count+1))
	+	else
	+		test_known_broken_ok_ "$test_todo_title_: $*" &&
	+		return 1
	 	fi
	-	test_todo_=todo
	-	test_must_fail_helper todo "$@" 2>&7
	 } 7>&2 2>&4
	 
	 # This is not among top-level (test_expect_success | test_expect_failure)

Anyway, the core difference between the APIs we proposed for this is
that you'd do:

	test_expect_success 'desc' 'test_todo false'

Whereas I suggested:

	test_expect_todo 'desc' '! false'

Now, let's pick apart the differences:

 1. With "test_expect_todo" we're declaring "this is a TODO test" for
    the test as a whole.

 2. With your "test_todo" we're not doing that, instead we proceed as
    normal, and then we might note "we had a TODO" midway through, then
    at the end we'll spot that we had a TODO test (but this approach
    won't work with subshells).

 3. Your "test_todo" is basically a "let's let this pass", whereas mine
    was a helper which exhaustively declared *what* the bad behavior
    was.

    (Although some of yours seems to be midway between the two,
    i.e. https://lore.kernel.org/git/c3f4a79c-2dc6-fbf4-fc61-591ebf417682@dunelm.org.uk/)

I think the main critique you and Junio had of my series was to do with
#3, i.e. that it was a hassle to exhaustively declare what the behavior
is & should be, as you note in:
https://lore.kernel.org/git/c3f4a79c-2dc6-fbf4-fc61-591ebf417682@dunelm.org.uk/

	test_todo \
		--want "test_must_fail git" \
		--reset "git reset --hard" \
		--expect git \
		-- \
		rm d/f &&

That's fair enough, maybe that's not worth the effort. The reason I
initially hacked this up was because I'd noticed a behavior difference
in a command that was only revealed in a test_expect_failure block, but
because we didn't assert *what* the behavior was we didn't notice.

My version (if fully used) would spot that, but that's because of how I
wrote the "tes_todo", it's orthagonal to #1 and #2 above.

So I don't see why we wouldn't instead have a "test_expect_todo" and
just write the helper differently, or have a mode where it's less
strict, and (if we find it worthwhile) one where it's more strict.

I rebased my
https://lore.kernel.org/git/patch-1.7-4624abc2591-20220318T002951Z-avarab@gmail.com/
just now and applied the below on top, which seems to me to give you
pretty much the end result you want, the only difference is that my
version will also work in subshells (see the t2500 one):

diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index de1ec89007d..fe47e503bd1 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -468,7 +468,7 @@ test_expect_success 'sparse-checkout (init|set|disable) warns with unmerged stat
 	git -C unmerged sparse-checkout disable
 '
 
-test_expect_failure 'sparse-checkout reapply' '
+test_expect_todo 'sparse-checkout reapply' '
 	git clone repo tweak &&
 
 	echo dirty >tweak/deep/deeper2/a &&
@@ -502,11 +502,11 @@ test_expect_failure 'sparse-checkout reapply' '
 
 	# NEEDSWORK: We are asking to update a file outside of the
 	# sparse-checkout cone, but this is no longer allowed.
-	git -C tweak add folder1/a &&
+	test_todo git -C tweak add folder1/a &&
 	git -C tweak sparse-checkout reapply 2>err &&
-	test_must_be_empty err &&
+	test_todo test_must_be_empty err &&
 	test_path_is_missing tweak/deep/deeper2/a &&
-	test_path_is_missing tweak/folder1/a &&
+	test_todo test_path_is_missing tweak/folder1/a &&
 
 	git -C tweak sparse-checkout disable
 '
diff --git a/t/t2500-untracked-overwriting.sh b/t/t2500-untracked-overwriting.sh
index 5c0bf4d21fc..db7c72d38d8 100755
--- a/t/t2500-untracked-overwriting.sh
+++ b/t/t2500-untracked-overwriting.sh
@@ -167,7 +167,7 @@ test_expect_success 'git rebase fast forwarding and untracked files' '
 	)
 '
 
-test_expect_failure 'git rebase --autostash and untracked files' '
+test_expect_todo 'git rebase --autostash and untracked files' '
 	test_setup_sequencing rebase_autostash_and_untracked &&
 	(
 		cd sequencing_rebase_autostash_and_untracked &&
@@ -176,7 +176,7 @@ test_expect_failure 'git rebase --autostash and untracked files' '
 		mkdir filler &&
 		echo precious >filler/file &&
 		cp filler/file expect &&
-		git rebase --autostash init &&
+		test_todo git rebase --autostash init &&
 		test_path_is_file filler/file
 	)
 '
diff --git a/t/t3510-cherry-pick-sequence.sh b/t/t3510-cherry-pick-sequence.sh
index 3b0fa66c33d..b31b6b0f7a0 100755
--- a/t/t3510-cherry-pick-sequence.sh
+++ b/t/t3510-cherry-pick-sequence.sh
@@ -577,7 +577,7 @@ test_expect_success '--continue respects -x in first commit in multi-pick' '
 	grep "cherry picked from.*$picked" msg
 '
 
-test_expect_failure '--signoff is automatically propagated to resolved conflict' '
+test_expect_todo '--signoff is automatically propagated to resolved conflict' '
 	pristine_detach initial &&
 	test_expect_code 1 git cherry-pick --signoff base..anotherpick &&
 	echo "c" >foo &&
@@ -591,7 +591,7 @@ test_expect_failure '--signoff is automatically propagated to resolved conflict'
 	git cat-file commit HEAD~3 >initial_msg &&
 	! grep "Signed-off-by:" initial_msg &&
 	grep "Signed-off-by:" unrelatedpick_msg &&
-	! grep "Signed-off-by:" picked_msg &&
+	test_todo ! grep "Signed-off-by:" picked_msg &&
 	grep "Signed-off-by:" anotherpick_msg
 '
 
diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index e74a318ac33..6c7929f5557 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -790,7 +790,7 @@ test_expect_success SYMLINKS 'rm across a symlinked leading path (no index)' '
 	test_path_is_file e/f
 '
 
-test_expect_failure SYMLINKS 'rm across a symlinked leading path (w/ index)' '
+test_expect_todo SYMLINKS 'rm across a symlinked leading path (w/ index)' '
 	rm -rf d e &&
 	mkdir d &&
 	echo content >d/f &&
@@ -798,10 +798,10 @@ test_expect_failure SYMLINKS 'rm across a symlinked leading path (w/ index)' '
 	git commit -m "d/f exists" &&
 	mv d e &&
 	ln -s e d &&
-	test_must_fail git rm d/f &&
-	git rev-parse --verify :d/f &&
+	test_todo test_must_fail git rm d/f &&
+	test_todo git rev-parse --verify :d/f &&
 	test -h d &&
-	test_path_is_file e/f
+	test_todo test_path_is_file e/f
 '
 
 test_expect_success 'setup for testing rm messages' '
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index ad5c0292794..a6a5a330180 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -165,12 +165,12 @@ test_expect_success 'additional command line cc (ascii)' '
 	grep "^ *S E Cipient <scipient@example.com>\$" hdrs5
 '
 
-test_expect_failure 'additional command line cc (rfc822)' '
+test_expect_todo 'additional command line cc (rfc822)' '
 	git config --replace-all format.headers "Cc: R E Cipient <rcipient@example.com>" &&
 	git format-patch --cc="S. E. Cipient <scipient@example.com>" --stdout main..side >patch5 &&
 	sed -e "/^\$/q" patch5 >hdrs5 &&
 	grep "^Cc: R E Cipient <rcipient@example.com>,\$" hdrs5 &&
-	grep "^ *\"S. E. Cipient\" <scipient@example.com>\$" hdrs5
+	test_todo grep "^ *\"S. E. Cipient\" <scipient@example.com>\$" hdrs5
 '
 
 test_expect_success 'command line headers' '
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index f342954de11..9d5706454a5 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1049,6 +1049,21 @@ test_must_fail_acceptable () {
 	esac
 }
 
+test_todo () {
+	local negate=-ne
+	local cmp_op=-ne
+	if test "$1" = "!"
+	then
+		negate=t &&
+		cmp_op=-eq
+		shift
+	fi &&
+	"$@" 2>&7
+	exit_code=$?
+	say "test_todo: got $exit_code ${negate:+negated!} from $*"
+	test "$exit_code" "$cmp_op" 0
+}
+
 # This is not among top-level (test_expect_success | test_expect_failure)
 # but is a prefix that can be used in the test script, like:
 #



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] [RFC] test_todo: allow [!] grep as the command
  2022-10-06 16:42     ` Phillip Wood
@ 2022-10-06 20:26       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-06 20:26 UTC (permalink / raw)
  To: phillip.wood; +Cc: Phillip Wood via GitGitGadget, git, Derrick Stolee


On Thu, Oct 06 2022, Phillip Wood wrote:

> On 06/10/2022 16:56, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Oct 06 2022, Phillip Wood via GitGitGadget wrote:
>> 
>>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>>
>>> Many failing tests use grep, this commit converts a sample to use
>>> test_todo(). As POSIX specifies that a return code greater than one
>>> indicates an error rather than a failing match we take care not the
>>> hide that.
>> Ah, so on the one hand this gives me second thoughts about my stance
>> in[1], i.e. if we just allowed any command we wouldn't be forced to add
>> these sorts of special-cases.
>> Although, we could also allow any command, and just add smartness
>> for
>> ones we know about, e.g. "grep".
>> But I do find doing this to be weirdly inconsistent, i.e. caring
>> about
>> the difference between these two:
>> 	$ grep blah README.md; echo $?
>> 	1
>> 	$ grep blah README.mdx; echo $?
>> 	grep: README.mdx: No such file or directory
>> 	2
>
> The intent was to catch bad options, not missing files (i.e. we don't
> want test_todo to hide a failure from "grep --invalid-option"). We
> could check the file exists and skip running grep if it does not
> (hopefully the test wont be grepping multiple files in a single
> command)

It returns the same exit code for missing files and bad options, so I
don't think this plan will work.

I.e. I (in my initial series) wanted to have something where we declared
what the behavior was right now, *and* what it should be.

But some of our tests are wishy-washing "I wish this worked", so:

	test_todo git some-new-cmd && # should write "unicorn" to a new foo.txt
	test_todo grep unicorn foo.txt

Won't do what you expect?

>> Is basically why I took the approach I did in my [2], i.e. to force us
>> to positively assert *what* the bad behavior should be.
>
> That is what made the end result so hard to use though
>
> 	test_todo \
> 		--want "test_must_fail git" \
> 		--reset "git reset --hard" \
> 		--expect git \
> 		-- \
> 		rm d/f &&
>
> is not exactly readable.

Yes, indeed:) Anyway, my just-sent
https://lore.kernel.org/git/221006.86v8owr986.gmgdl@evledraar.gmail.com/
goes into that.

I think a "test_todo" should either be a "strictly declare stuff", or a
"YOLO this" where we just detect segfaults.

But per the above having it be some mix of the two is just confusing,
i.e. to extend the example above:

	test_todo git some-new-cmd &&
	test_todo test_path_exists foo.txt &&
	test_todo grep unicorn foo.txt

Won't "work", because the "test_path_exists" isn't strict, but your
"grep" is.

So I think whatever "test_todo" does it should be picking one or the
other.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages
  2022-10-06 16:10     ` Phillip Wood
@ 2022-10-06 20:33       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-10-06 20:33 UTC (permalink / raw)
  To: phillip.wood; +Cc: Phillip Wood via GitGitGadget, git, Derrick Stolee


On Thu, Oct 06 2022, Phillip Wood wrote:

> Hi Ævar
>
> On 06/10/2022 16:36, Ævar Arnfjörð Bjarmason wrote:
>> On Thu, Oct 06 2022, Phillip Wood via GitGitGadget wrote:
>> 
>>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>>
>>> test_todo() is built upon test_expect_failure() but accepts commands
>>> starting with test_* in addition to git. As our test_* assertions use
>>> BUG() to signal usage errors any such error will not be hidden by
>>> test_todo().
>> I think they will, unless I'm missing something. E.g. try out:
>
> It's talking about BUG() in test-lib.sh, not the C function. For example
>
> test_path_is_file () {
> 	test "$#" -ne 1 && BUG "1 param"
> 	if ! test -f "$1"
> 	then
> 		echo "File $1 doesn't exist"
> 		false
> 	fi
> }
>
> So a test containing "test_todo test_path_is_file a b" should fail as
> BUG calls exit rather than returning non-zero (I should probably test 
> that in 0000-basic.sh)

Ah, anyway, I think getting that to behave correctly is *the* thing any
less sucky test_expect_failure replacement needs to get right, i.e. to
handle BUG() (in C code), abort(), SANITIZE=undefined (and friends, all
of whom abort()), segfaults etc.

>> 	diff --git a/t/t0210-trace2-normal.sh b/t/t0210-trace2-normal.sh
>> 	index 80e76a4695e..1be895abba6 100755
>> 	--- a/t/t0210-trace2-normal.sh
>> 	+++ b/t/t0210-trace2-normal.sh
>> 	@@ -170,7 +170,7 @@ test_expect_success 'BUG messages are written to trace2' '
>> 	
>> 	 test_expect_success 'bug messages with BUG_if_bug() are written to trace2' '
>> 	 	test_when_finished "rm trace.normal actual expect" &&
>> 	-	test_expect_code 99 env GIT_TRACE2="$(pwd)/trace.normal" \
>> 	+	test_todo env GIT_TRACE2="$(pwd)/trace.normal" \
>> 	 		test-tool trace2 008bug 2>err &&
>> 	 	cat >expect <<-\EOF &&
>> 	 	a bug message
>> I.e. in our tests you need to look out for exit code 99, not the
>> usual
>> abort().
>> I have local patches to fix this, previously submitted as an RFC
>> here:
>> https://lore.kernel.org/git/RFC-cover-0.3-00000000000-20220525T234908Z-avarab@gmail.com/
>> I just rebased that & CI is currently running, I'll see how it does:
>> https://github.com/avar/git/tree/avar/usage-do-not-abort-on-BUG-to-get-trace2-event-2
>> When I merged your patches here with that topic yours started doing
>> the
>> right thing in this case, i.e. a "test_todo" that would get a BUG()"
>> would be reported as a failure.
>> 
>>> +	test_true () {
>>> +		true
>>> +	}
>>> +	test_expect_success "pretend we have fixed a test_todo breakage" \
>>> +		"test_todo test_true"
>> "Why the indirection", until I realized that it's because you're
>> working
>> around the whitelist of commands that we have, i.e. we allow 'git' and
>> 'test-tool', but not 'grep' or whatever.
>> I'm of the opinion that we should just drop that limitation
>> altogether,
>> which is shown to be pretty silly in this case. I.e. at some point we
>> listed "test_*" as "this invokes a git binary", but a lot of our test_*
>> commands don't, including this one.
>
> test_expect_failure does not allow test_* are you thinking of test-tool?

I'm talking about test_todo, and the reason for that "test_true" being
needed is because test_must_fail_acceptable is picky, but we could also
(I just adjusted that one test):
	
	diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
	index 52362ad3dd3..921e0401eb5 100755
	--- a/t/t0000-basic.sh
	+++ b/t/t0000-basic.sh
	@@ -143,11 +143,8 @@ test_expect_success 'subtest: a passing TODO test' '
	 
	 test_expect_success 'subtest: a failing test_todo' '
	 	write_and_run_sub_test_lib_test failing-test-todo <<-\EOF &&
	-	test_false () {
	-		false
	-	}
	 	test_expect_success "passing test" "true"
	-	test_expect_success "known todo" "test_todo test_false"
	+	test_expect_success "known todo" "test_todo false"
	 	test_done
	 	EOF
	 	check_sub_test_lib_test failing-test-todo <<-\EOF
	diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
	index 8978709b231..9300eaa2c62 100644
	--- a/t/test-lib-functions.sh
	+++ b/t/test-lib-functions.sh
	@@ -1034,6 +1034,11 @@ test_must_fail_acceptable () {
	 		done
	 	fi
	 
	+	if test "$name" = "todo"
	+	then
	+		return 0
	+	fi
	+
	 	case "$1" in
	 	git|__git*|test-tool|test_terminal)
	 		return 0
	@@ -1050,10 +1055,6 @@ test_must_fail_acceptable () {
	 		fi
	 		return 1
	 		;;
	-	test_*)
	-		test "$name" = "todo"
	-		return $?
	-		;;
	 	*)
	 		return 1
	 		;;
	

>> So in general I think we should just allow any command in
>> "test_must_fail" et al, and just catch in code review if someone uses
>> "test_must_fail grep" as opposed to "! grep".
>
> That is not going to scale well

Well, you're throwing the scaling out the window by whitelisting test_*
in your 1/3 :)

I don't think we really need it, but *the* reason it's there is to
distinguish things that run our own C code from things that don't, and a
lot of test_* helpers don't.

So if you want to pursue making this correct per the current intent it
should really use the current whitelist to decide whether or not to pass
things through the "should we change the exit code if it's a signal,
segfault etc?" part.

Otherwise you should just negate it or whatever, as the "test_todo" I
showed in
https://lore.kernel.org/git/221006.86v8owr986.gmgdl@evledraar.gmail.com/
does. I.e. we shouldn't be detecting an abort() in /bin/false and the
like.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/3] [RFC] tests: add test_todo() for known failures
  2022-10-06 19:28 ` Ævar Arnfjörð Bjarmason
@ 2022-10-07 13:26   ` Phillip Wood
  2022-10-07 17:08     ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Phillip Wood @ 2022-10-07 13:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason,
	Phillip Wood via GitGitGadget
  Cc: git, Derrick Stolee, Junio C Hamano

Hi Ævar

On 06/10/2022 20:28, Ævar Arnfjörð Bjarmason wrote:
> 
> On Thu, Oct 06 2022, Phillip Wood via GitGitGadget wrote:
> 
>> test_todo() is intended as a fine grained alternative to
>> test_expect_failure(). Rather than marking the whole test as failing
>> test_todo() is used to mark individual failing commands within a test. This
>> approach to writing failing tests allows us to detect unexpected failures
>> that are hidden by test_expect_failure().
>>
>> This series attempts to keep most of the benefits test_expect_todo()
>> previously proposed by Ævar[1] while being simpler to use.
>>
>> [1]
>> https://lore.kernel.org/git/cover-0.7-00000000000-20220318T002951Z-avarab@gmail.com/
> 
> I like the interface you've got here much better than the one I
> submitted in [1], so much that it's what I tried to write at first :)
> 
> But as you noted in 1/3:
> 
> 	test_todo cannot be used in a subshell.
> 
> Anyway, the core difference between the APIs we proposed for this is
> that you'd do:
> 
> 	test_expect_success 'desc' 'test_todo false'
> 
> Whereas I suggested:
> 
> 	test_expect_todo 'desc' '! false'
> 
> Now, let's pick apart the differences:
> 
>   1. With "test_expect_todo" we're declaring "this is a TODO test" for
>      the test as a whole.
 >
>   2. With your "test_todo" we're not doing that, instead we proceed as
>      normal, and then we might note "we had a TODO" midway through, then
>      at the end we'll spot that we had a TODO test (but this approach
>      won't work with subshells).

Yes, this series avoids adding test_expect_todo and reuses 
test_expect_success as Junio suggested [1]. By using a new toplevel form 
your series was able to handle test_todo in a subshell because it did 
not need any global state related to whether a test_todo had passed/failed.

The series here uses a variable to check if any test_todo statements 
were present in a test that ran successfully and that does not work if 
the test_todo happens in a subshell because the variable in the parent 
is not updated. First we need to consider whether there is any need for 
supporting test_todo in a subshell. If there is then a possible 
alternative is to store the state in a file. That would add the cost of 
"test -f" to test_expect_success but our tests do so much i/o that a 
single extra stat should not be noticeable. In particular I believe a 
file based approach can be implemented without adding any new processes 
to tests that do not use test_todo.

>   3. Your "test_todo" is basically a "let's let this pass", whereas mine
>      was a helper which exhaustively declared *what* the bad behavior
>      was.
> 
>      (Although some of yours seems to be midway between the two,
>      i.e. https://lore.kernel.org/git/c3f4a79c-2dc6-fbf4-fc61-591ebf417682@dunelm.org.uk/)

The only thing my series tries to assert is that a test_todo command is 
not failing due to a usage error so that it can catch buggy tests. 
Beyond that it does not care what the reason for failure is.

> I think the main critique you and Junio had of my series was to do with
> #3, i.e. that it was a hassle to exhaustively declare what the behavior
> is & should be, as you note in:

That was certainly my main objection, but Junio was not that keen on 
adding test_expect_todo [1]

> https://lore.kernel.org/git/c3f4a79c-2dc6-fbf4-fc61-591ebf417682@dunelm.org.uk/
> 
> 	test_todo \
> 		--want "test_must_fail git" \
> 		--reset "git reset --hard" \
> 		--expect git \
> 		-- \
> 		rm d/f &&
> 
> That's fair enough, maybe that's not worth the effort. The reason I
> initially hacked this up was because I'd noticed a behavior difference
> in a command that was only revealed in a test_expect_failure block, but
> because we didn't assert *what* the behavior was we didn't notice.
> 
> My version (if fully used) would spot that, but that's because of how I
> wrote the "tes_todo", it's orthagonal to #1 and #2 above.
> 
> So I don't see why we wouldn't instead have a "test_expect_todo" and
> just write the helper differently, or have a mode where it's less
> strict, and (if we find it worthwhile) one where it's more strict.

I think there is a question of whether we need a new toplevel 
test_expect_todo - why would we add it if we can just reuse 
test_expect_success? That way when a test failure is fixed all that 
needs to be done is to remove the test_todo calls.

Best Wishes

Phillip

[1] https://lore.kernel.org/git/xmqq8rt77zp7.fsf@gitster.g/


> I rebased my
> https://lore.kernel.org/git/patch-1.7-4624abc2591-20220318T002951Z-avarab@gmail.com/
> just now and applied the below on top, which seems to me to give you
> pretty much the end result you want, the only difference is that my
> version will also work in subshells (see the t2500 one):
> 
> diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
> index de1ec89007d..fe47e503bd1 100755
> --- a/t/t1091-sparse-checkout-builtin.sh
> +++ b/t/t1091-sparse-checkout-builtin.sh
> @@ -468,7 +468,7 @@ test_expect_success 'sparse-checkout (init|set|disable) warns with unmerged stat
>   	git -C unmerged sparse-checkout disable
>   '
>   
> -test_expect_failure 'sparse-checkout reapply' '
> +test_expect_todo 'sparse-checkout reapply' '
>   	git clone repo tweak &&
>   
>   	echo dirty >tweak/deep/deeper2/a &&
> @@ -502,11 +502,11 @@ test_expect_failure 'sparse-checkout reapply' '
>   
>   	# NEEDSWORK: We are asking to update a file outside of the
>   	# sparse-checkout cone, but this is no longer allowed.
> -	git -C tweak add folder1/a &&
> +	test_todo git -C tweak add folder1/a &&
>   	git -C tweak sparse-checkout reapply 2>err &&
> -	test_must_be_empty err &&
> +	test_todo test_must_be_empty err &&
>   	test_path_is_missing tweak/deep/deeper2/a &&
> -	test_path_is_missing tweak/folder1/a &&
> +	test_todo test_path_is_missing tweak/folder1/a &&
>   
>   	git -C tweak sparse-checkout disable
>   '
> diff --git a/t/t2500-untracked-overwriting.sh b/t/t2500-untracked-overwriting.sh
> index 5c0bf4d21fc..db7c72d38d8 100755
> --- a/t/t2500-untracked-overwriting.sh
> +++ b/t/t2500-untracked-overwriting.sh
> @@ -167,7 +167,7 @@ test_expect_success 'git rebase fast forwarding and untracked files' '
>   	)
>   '
>   
> -test_expect_failure 'git rebase --autostash and untracked files' '
> +test_expect_todo 'git rebase --autostash and untracked files' '
>   	test_setup_sequencing rebase_autostash_and_untracked &&
>   	(
>   		cd sequencing_rebase_autostash_and_untracked &&
> @@ -176,7 +176,7 @@ test_expect_failure 'git rebase --autostash and untracked files' '
>   		mkdir filler &&
>   		echo precious >filler/file &&
>   		cp filler/file expect &&
> -		git rebase --autostash init &&
> +		test_todo git rebase --autostash init &&
>   		test_path_is_file filler/file
>   	)
>   '
> diff --git a/t/t3510-cherry-pick-sequence.sh b/t/t3510-cherry-pick-sequence.sh
> index 3b0fa66c33d..b31b6b0f7a0 100755
> --- a/t/t3510-cherry-pick-sequence.sh
> +++ b/t/t3510-cherry-pick-sequence.sh
> @@ -577,7 +577,7 @@ test_expect_success '--continue respects -x in first commit in multi-pick' '
>   	grep "cherry picked from.*$picked" msg
>   '
>   
> -test_expect_failure '--signoff is automatically propagated to resolved conflict' '
> +test_expect_todo '--signoff is automatically propagated to resolved conflict' '
>   	pristine_detach initial &&
>   	test_expect_code 1 git cherry-pick --signoff base..anotherpick &&
>   	echo "c" >foo &&
> @@ -591,7 +591,7 @@ test_expect_failure '--signoff is automatically propagated to resolved conflict'
>   	git cat-file commit HEAD~3 >initial_msg &&
>   	! grep "Signed-off-by:" initial_msg &&
>   	grep "Signed-off-by:" unrelatedpick_msg &&
> -	! grep "Signed-off-by:" picked_msg &&
> +	test_todo ! grep "Signed-off-by:" picked_msg &&
>   	grep "Signed-off-by:" anotherpick_msg
>   '
>   
> diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
> index e74a318ac33..6c7929f5557 100755
> --- a/t/t3600-rm.sh
> +++ b/t/t3600-rm.sh
> @@ -790,7 +790,7 @@ test_expect_success SYMLINKS 'rm across a symlinked leading path (no index)' '
>   	test_path_is_file e/f
>   '
>   
> -test_expect_failure SYMLINKS 'rm across a symlinked leading path (w/ index)' '
> +test_expect_todo SYMLINKS 'rm across a symlinked leading path (w/ index)' '
>   	rm -rf d e &&
>   	mkdir d &&
>   	echo content >d/f &&
> @@ -798,10 +798,10 @@ test_expect_failure SYMLINKS 'rm across a symlinked leading path (w/ index)' '
>   	git commit -m "d/f exists" &&
>   	mv d e &&
>   	ln -s e d &&
> -	test_must_fail git rm d/f &&
> -	git rev-parse --verify :d/f &&
> +	test_todo test_must_fail git rm d/f &&
> +	test_todo git rev-parse --verify :d/f &&
>   	test -h d &&
> -	test_path_is_file e/f
> +	test_todo test_path_is_file e/f
>   '
>   
>   test_expect_success 'setup for testing rm messages' '
> diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
> index ad5c0292794..a6a5a330180 100755
> --- a/t/t4014-format-patch.sh
> +++ b/t/t4014-format-patch.sh
> @@ -165,12 +165,12 @@ test_expect_success 'additional command line cc (ascii)' '
>   	grep "^ *S E Cipient <scipient@example.com>\$" hdrs5
>   '
>   
> -test_expect_failure 'additional command line cc (rfc822)' '
> +test_expect_todo 'additional command line cc (rfc822)' '
>   	git config --replace-all format.headers "Cc: R E Cipient <rcipient@example.com>" &&
>   	git format-patch --cc="S. E. Cipient <scipient@example.com>" --stdout main..side >patch5 &&
>   	sed -e "/^\$/q" patch5 >hdrs5 &&
>   	grep "^Cc: R E Cipient <rcipient@example.com>,\$" hdrs5 &&
> -	grep "^ *\"S. E. Cipient\" <scipient@example.com>\$" hdrs5
> +	test_todo grep "^ *\"S. E. Cipient\" <scipient@example.com>\$" hdrs5
>   '
>   
>   test_expect_success 'command line headers' '
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index f342954de11..9d5706454a5 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1049,6 +1049,21 @@ test_must_fail_acceptable () {
>   	esac
>   }
>   
> +test_todo () {
> +	local negate=-ne
> +	local cmp_op=-ne
> +	if test "$1" = "!"
> +	then
> +		negate=t &&
> +		cmp_op=-eq
> +		shift
> +	fi &&
> +	"$@" 2>&7
> +	exit_code=$?
> +	say "test_todo: got $exit_code ${negate:+negated!} from $*"
> +	test "$exit_code" "$cmp_op" 0
> +}
> +
>   # This is not among top-level (test_expect_success | test_expect_failure)
>   # but is a prefix that can be used in the test script, like:
>   #
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/3] [RFC] tests: add test_todo() for known failures
  2022-10-07 13:26   ` Phillip Wood
@ 2022-10-07 17:08     ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2022-10-07 17:08 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Ævar Arnfjörð Bjarmason,
	Phillip Wood via GitGitGadget, git, Derrick Stolee

Phillip Wood <phillip.wood123@gmail.com> writes:

> I think there is a question of whether we need a new toplevel
> test_expect_todo - why would we add it if we can just reuse
> test_expect_success? That way when a test failure is fixed all that
> needs to be done is to remove the test_todo calls.

Yup, that is one of the reasons why I favor test_todo inside the
normal test_expect_success.  A patch that fixes a breakage would
come with a change to the tests that flips test_expect_failure to
test_expect_success often had the step that were expected to fail
outside the post context and did not make it immediately obvious
what was fixed, and use of a more focused test_todo would make it
more clear.  Unless we gain a clear advantage by using a different
top-level (e.g. some of the limitations like "not in a subshell" can
be lifted?), I do not think we want to add one.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages
  2022-10-06 15:01 ` [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages Phillip Wood via GitGitGadget
  2022-10-06 15:36   ` Ævar Arnfjörð Bjarmason
@ 2022-12-06 22:37   ` Victoria Dye
  2022-12-07 12:08     ` Ævar Arnfjörð Bjarmason
  2022-12-08 15:06     ` Phillip Wood
  1 sibling, 2 replies; 20+ messages in thread
From: Victoria Dye @ 2022-12-06 22:37 UTC (permalink / raw)
  To: Phillip Wood via GitGitGadget, git
  Cc: Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Phillip Wood

Phillip Wood via GitGitGadget wrote:
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
> 
> test_todo() is intended as a fine grained replacement for
> test_expect_failure(). Rather than marking the whole test as failing
> test_todo() is used to mark individual failing commands within a
> test. This approach to writing failing tests allows us to detect
> unexpected failures that are hidden by test_expect_failure().

I love this idea! I've nearly been burned a couple of times by the wrong
line in a 'test_expect_failure' triggering the error (e.g., due to bad
syntax earlier in the test). The added specificity of 'test_todo' will help
both reviewers and people fixing the underlying issues demonstrated by
expected-failing tests.

> 
> Failing commands are reported by the test harness in the same way as
> test_expect_failure() so there is no change in output when migrating
> from test_expect_failure() to test_todo(). If a command marked with
> test_todo() succeeds then the test will fail. This is designed to make
> it easier to see when a command starts succeeding in our CI compared
> to using test_expect_failure() where it is easy to fix a failing test
> case and not realize it.
> 
> test_todo() is built upon test_expect_failure() but accepts commands
> starting with test_* in addition to git. As our test_* assertions use
> BUG() to signal usage errors any such error will not be hidden by
> test_todo().

Should this be so restrictive? I think 'test_todo' would need to handle any
arbitrary command (mostly because of custom functions like
'ensure_not_expanded' in 't1092') to be an easy-to-use drop-in replacement
for 'test_expect_failure'. 

I see there's some related discussion in another subthread [1], but I don't
necessarily think removing restrictions (i.e. that the tested command must
be 'git', 'test_*', etc.) on 'test_todo' requires doing the same for
'test_must_fail' et al. to be internally consistent. On one hand,
'test_todo' could be interpreted as an assertion (like 'test_must_fail'),
where we only want to assert on our code - hence the restrictions. From that
perspective, it would make sense to ease restrictions uniformly on all of
our assertion helpers. 

On the other hand, I'm interpreting 'test_todo' as
'test_expect_failure_on_line_N' - more of a "post-test result interpreter"
than an assertion helper. So because 'test_expect_failure' doesn't require
the failing line to come from a particular command, I don't think
'test_todo' needs to either. That leaves assertion helpers like
'test_must_fail' out of the scope of this change, avoiding any hairiness of
allowing them to assert on arbitrary code.

What do you think?

[1] https://lore.kernel.org/git/221006.86mta8r860.gmgdl@evledraar.gmail.com/

> 
> This commit coverts a few tests to show the intended use of
> test_todo().  A limitation of test_todo() as it is currently
> implemented is that it cannot be used in a subshell.
> 
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages
  2022-12-06 22:37   ` Victoria Dye
@ 2022-12-07 12:08     ` Ævar Arnfjörð Bjarmason
  2022-12-08 15:06     ` Phillip Wood
  1 sibling, 0 replies; 20+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-07 12:08 UTC (permalink / raw)
  To: Victoria Dye
  Cc: Phillip Wood via GitGitGadget, git, Derrick Stolee, Phillip Wood


On Tue, Dec 06 2022, Victoria Dye wrote:

> Phillip Wood via GitGitGadget wrote:
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>> 
>> test_todo() is intended as a fine grained replacement for
>> test_expect_failure(). Rather than marking the whole test as failing
>> test_todo() is used to mark individual failing commands within a
>> test. This approach to writing failing tests allows us to detect
>> unexpected failures that are hidden by test_expect_failure().
>
> I love this idea! I've nearly been burned a couple of times by the wrong
> line in a 'test_expect_failure' triggering the error (e.g., due to bad
> syntax earlier in the test). The added specificity of 'test_todo' will help
> both reviewers and people fixing the underlying issues demonstrated by
> expected-failing tests.
>
>> 
>> Failing commands are reported by the test harness in the same way as
>> test_expect_failure() so there is no change in output when migrating
>> from test_expect_failure() to test_todo(). If a command marked with
>> test_todo() succeeds then the test will fail. This is designed to make
>> it easier to see when a command starts succeeding in our CI compared
>> to using test_expect_failure() where it is easy to fix a failing test
>> case and not realize it.
>> 
>> test_todo() is built upon test_expect_failure() but accepts commands
>> starting with test_* in addition to git. As our test_* assertions use
>> BUG() to signal usage errors any such error will not be hidden by
>> test_todo().
>
> Should this be so restrictive? I think 'test_todo' would need to handle any
> arbitrary command (mostly because of custom functions like
> 'ensure_not_expanded' in 't1092') to be an easy-to-use drop-in replacement
> for 'test_expect_failure'. 
>
> I see there's some related discussion in another subthread [1], but I don't
> necessarily think removing restrictions (i.e. that the tested command must
> be 'git', 'test_*', etc.) on 'test_todo' requires doing the same for
> 'test_must_fail' et al. to be internally consistent. On one hand,
> 'test_todo' could be interpreted as an assertion (like 'test_must_fail'),
> where we only want to assert on our code - hence the restrictions. From that
> perspective, it would make sense to ease restrictions uniformly on all of
> our assertion helpers. 
>
> On the other hand, I'm interpreting 'test_todo' as
> 'test_expect_failure_on_line_N' - more of a "post-test result interpreter"
> than an assertion helper. So because 'test_expect_failure' doesn't require
> the failing line to come from a particular command, I don't think
> 'test_todo' needs to either. That leaves assertion helpers like
> 'test_must_fail' out of the scope of this change, avoiding any hairiness of
> allowing them to assert on arbitrary code.
>
> What do you think?

Are you saying that for the "test_todo" we shouldn't care whether it
exits with a "normal" non-zero or a segfault, abort() (e.g. BUG()) etc?
That's what the "test_must_fail" v.s. "!" is about.

Even if we erased tat distinction I think such a thing would be a
marginal improvement on "test_expect_failure", as we'd at least mark
what line fails, but like "test_expect_failure" we'd accept segfaults as
failures.

but as noted in the upthread discussions I think we should do better and
still check for segfaults etc. I think we have a couple of
"test_expect_failure" now where we expect a segfault, but for the rest
we'd like to know if they start segfaulting.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages
  2022-12-06 22:37   ` Victoria Dye
  2022-12-07 12:08     ` Ævar Arnfjörð Bjarmason
@ 2022-12-08 15:06     ` Phillip Wood
  2022-12-09  1:09       ` Junio C Hamano
  1 sibling, 1 reply; 20+ messages in thread
From: Phillip Wood @ 2022-12-08 15:06 UTC (permalink / raw)
  To: Victoria Dye, Phillip Wood via GitGitGadget, git
  Cc: Ævar Arnfjörð Bjarmason, Derrick Stolee

Hi Victoria

On 06/12/2022 22:37, Victoria Dye wrote:
> Phillip Wood via GitGitGadget wrote:
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
 >>
>> Failing commands are reported by the test harness in the same way as
>> test_expect_failure() so there is no change in output when migrating
>> from test_expect_failure() to test_todo(). If a command marked with
>> test_todo() succeeds then the test will fail. This is designed to make
>> it easier to see when a command starts succeeding in our CI compared
>> to using test_expect_failure() where it is easy to fix a failing test
>> case and not realize it.
>>
>> test_todo() is built upon test_expect_failure() but accepts commands
>> starting with test_* in addition to git. As our test_* assertions use
>> BUG() to signal usage errors any such error will not be hidden by
>> test_todo().
> 
> Should this be so restrictive? I think 'test_todo' would need to handle any
> arbitrary command (mostly because of custom functions like
> 'ensure_not_expanded' in 't1092') to be an easy-to-use drop-in replacement
> for 'test_expect_failure'.
> 
> I see there's some related discussion in another subthread [1], but I don't
> necessarily think removing restrictions (i.e. that the tested command must
> be 'git', 'test_*', etc.) on 'test_todo' requires doing the same for
> 'test_must_fail' et al. to be internally consistent. On one hand,
> 'test_todo' could be interpreted as an assertion (like 'test_must_fail'),
> where we only want to assert on our code - hence the restrictions. From that
> perspective, it would make sense to ease restrictions uniformly on all of
> our assertion helpers.
> 
> On the other hand, I'm interpreting 'test_todo' as
> 'test_expect_failure_on_line_N' - more of a "post-test result interpreter"
> than an assertion helper. So because 'test_expect_failure' doesn't require
> the failing line to come from a particular command, I don't think
> 'test_todo' needs to either. That leaves assertion helpers like
> 'test_must_fail' out of the scope of this change, avoiding any hairiness of
> allowing them to assert on arbitrary code.
> 
> What do you think?

I don't think we need to remove the restrictions on 'test_must_fail', 
they seem to be there for a good reason and I'm not aware of anyone 
complaining about being inconvenienced by them. I think of 'test_todo' 
and 'test_must_fail' as being distinct, 'test_todo' only reuses the 
implementation of 'test_must_fail' for convenience rather than any other 
deep reason.

I added the restrictions to 'test_todo' to try and stop it being misused 
but I'm happy to relax them if needed. I'm keen that test_todo is able 
to distinguish between an expected failure and a failure due to the 
wrapped command being misused e.g. 'test_todo grep --invalid-option' 
should report an error. Restricting the commands makes it easier to 
guarantee that but we can always just add checks for other commands as 
we use them. In a way the existing restrictions are kind of pointless 
because test authors can always name their helper functions test_... to 
get round them.

I think you've convinced be to remove the restrictions on what can be 
wrapped by 'test_todo' when I re-roll.

Thanks for your thoughtful comments

Phillip

> [1] https://lore.kernel.org/git/221006.86mta8r860.gmgdl@evledraar.gmail.com/
> 
>>
>> This commit coverts a few tests to show the intended use of
>> test_todo().  A limitation of test_todo() as it is currently
>> implemented is that it cannot be used in a subshell.
>>
>> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages
  2022-12-08 15:06     ` Phillip Wood
@ 2022-12-09  1:09       ` Junio C Hamano
  2022-12-09  9:04         ` Phillip Wood
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2022-12-09  1:09 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Victoria Dye, Phillip Wood via GitGitGadget, git,
	Ævar Arnfjörð Bjarmason, Derrick Stolee

Phillip Wood <phillip.wood123@gmail.com> writes:

> I added the restrictions to 'test_todo' to try and stop it being
> misused but I'm happy to relax them if needed. I'm keen that test_todo
> is able to distinguish between an expected failure and a failure due
> to the wrapped command being misused e.g. 'test_todo grep
> --invalid-option' should report an error.

Hmm, but it is not useful if the failure is from "you cannot use the
system command grep with test_todo", as the (implicitly) encouraged
"fix" the developer who wrote the test would pick would be to use
"! grep --invalid-option" which would still fail for a wrong reason.

If a "git" command is expected to run to a completion but is
currently broken and produces a wrong output, it would be very
useful to be able to write

	git command --args >actual &&
	test_todo grep -e "$expected_token" actual

to say "when 'git command' is fixed, the output should contains
this, but we know it currently is broken". 

> I think you've convinced be to remove the restrictions on what can be
> wrapped by 'test_todo' when I re-roll.
>
> Thanks for your thoughtful comments

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages
  2022-12-09  1:09       ` Junio C Hamano
@ 2022-12-09  9:04         ` Phillip Wood
  0 siblings, 0 replies; 20+ messages in thread
From: Phillip Wood @ 2022-12-09  9:04 UTC (permalink / raw)
  To: Junio C Hamano, Phillip Wood
  Cc: Victoria Dye, Phillip Wood via GitGitGadget, git,
	Ævar Arnfjörð Bjarmason, Derrick Stolee

Hi Junio

On 09/12/2022 01:09, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> 
>> I added the restrictions to 'test_todo' to try and stop it being
>> misused but I'm happy to relax them if needed. I'm keen that test_todo
>> is able to distinguish between an expected failure and a failure due
>> to the wrapped command being misused e.g. 'test_todo grep
>> --invalid-option' should report an error.
> 
> Hmm, but it is not useful if the failure is from "you cannot use the
> system command grep with test_todo", as the (implicitly) encouraged
> "fix" the developer who wrote the test would pick would be to use
> "! grep --invalid-option" which would still fail for a wrong reason.
> 
> If a "git" command is expected to run to a completion but is
> currently broken and produces a wrong output, it would be very
> useful to be able to write
> 
> 	git command --args >actual &&
> 	test_todo grep -e "$expected_token" actual
> 
> to say "when 'git command' is fixed, the output should contains
> this, but we know it currently is broken".

You can do that now, the next patch adds support for "test_todo [!] 
grep" but you currently cannot pass arbitrary commands/helper functions.

Best Wishes

Phillip

>> I think you've convinced be to remove the restrictions on what can be
>> wrapped by 'test_todo' when I re-roll.
>>
>> Thanks for your thoughtful comments
> 
> Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2022-12-09  9:04 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-06 15:01 [PATCH 0/3] [RFC] tests: add test_todo() for known failures Phillip Wood via GitGitGadget
2022-10-06 15:01 ` [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages Phillip Wood via GitGitGadget
2022-10-06 15:36   ` Ævar Arnfjörð Bjarmason
2022-10-06 16:10     ` Phillip Wood
2022-10-06 20:33       ` Ævar Arnfjörð Bjarmason
2022-12-06 22:37   ` Victoria Dye
2022-12-07 12:08     ` Ævar Arnfjörð Bjarmason
2022-12-08 15:06     ` Phillip Wood
2022-12-09  1:09       ` Junio C Hamano
2022-12-09  9:04         ` Phillip Wood
2022-10-06 15:01 ` [PATCH 2/3] [RFC] test_todo: allow [!] grep as the command Phillip Wood via GitGitGadget
2022-10-06 15:56   ` Ævar Arnfjörð Bjarmason
2022-10-06 16:42     ` Phillip Wood
2022-10-06 20:26       ` Ævar Arnfjörð Bjarmason
2022-10-06 15:01 ` [PATCH 3/3] [RFC] test_todo: allow [verbose] test " Phillip Wood via GitGitGadget
2022-10-06 16:02   ` Ævar Arnfjörð Bjarmason
2022-10-06 17:05 ` [PATCH 0/3] [RFC] tests: add test_todo() for known failures Junio C Hamano
2022-10-06 19:28 ` Ævar Arnfjörð Bjarmason
2022-10-07 13:26   ` Phillip Wood
2022-10-07 17:08     ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).