From: Eric Sunshine <sunshine@sunshineco.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Jonathan Nieder <jrnieder@gmail.com>,
Junio C Hamano <gitster@pobox.com>,
Eric Sunshine <sunshine@sunshineco.com>
Subject: [PATCH v3 1/6] chainlint: match arbitrary here-docs tags rather than hard-coded names
Date: Wed, 15 Aug 2018 14:45:47 -0400
Message-ID: <20180815184552.8418-2-sunshine@sunshineco.com>
In-Reply-To: <20180815184552.8418-1-sunshine@sunshineco.com>
chainlint.sed swallows top-level here-docs to avoid being fooled by
content which might look like start-of-subshell. It likewise swallows
here-docs in subshells to avoid marking content lines as breaking the
&&-chain, and to avoid being fooled by content which might look like
end-of-subshell, start-of-nested-subshell, or other specially-recognized
constructs.

At the time of implementation, it was believed that it was not possible
to support arbitrary here-doc tag names since 'sed' provides no way to
stash the opening tag name in a variable for later comparison against a
line signaling end-of-here-doc. Consequently, tag names are hard-coded,
with "EOF" being the only tag recognized at the top-level, and only
"EOF", "EOT", and "INPUT_END" being recognized within subshells. Also,
special care was taken to avoid being confused by here-docs nested
within other here-docs.

In practice, this limited number of hard-coded tag names has been "good
enough" for the 13000+ existing Git tests, despite many of those tests
using tags other than the recognized ones, since the bodies of those
here-docs do not contain content which would fool the linter.

Nevertheless, the situation is not ideal since someone writing new
tests and choosing a tag name not in the "blessed" set could
trigger a false positive.

To address this shortcoming, upgrade chainlint.sed to handle arbitrary
here-doc tag names, both at the top-level and within subshells.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
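For readers who want to try the sentinel technique in isolation, here is
a minimal sketch that is not part of the patch. It wraps only the
top-level here-doc rule from the diff below in a shell function and feeds
it a here-doc that contains another here-doc. The function name
swallow_heredoc and the sample input are hypothetical, chosen only for
illustration, and the bracket expressions here match spaces only, whereas
the real script may also need to match tabs.

```shell
#!/bin/sh
# Illustrative sketch only: swallow_heredoc is a hypothetical wrapper,
# not something provided by Git's test suite.
swallow_heredoc () {
	sed -e '
/<<[ ]*[-\\]*[A-Za-z0-9_]/ {
	# move the tag to the front as a "<TAG>" sentinel, then drop "<<"
	s/^\(.*\)<<[ ]*[-\\]*\([A-Za-z0-9_][A-Za-z0-9_]*\)/<\2>\1<</
	s/[ ]*<<//
	:hereslurp
	N
	# if the newly-read line is not the sentinel tag, discard it and loop
	/^<\([^>]*\)>.*\n[ ]*\1[ ]*$/!{
		s/\n.*$//
		bhereslurp
	}
	# closing tag found: strip the sentinel and the closing tag line
	s/^<[^>]*>//
	s/\n.*$//
}'
}

# Sample input: an inner "EOF" here-doc nested in an "ARBITRARY" one.
printf '%s\n' \
	'cat <<ARBITRARY >foop &&' \
	'naddle' \
	'fub <<EOF' \
	'nozzle' \
	'EOF' \
	'formp' \
	'ARBITRARY' \
	'echo done' |
swallow_heredoc
# prints:
# cat >foop &&
# echo done
```

The inner "EOF" line is discarded like any other body line because it
does not equal the text captured inside "<...>", so only the outer
closing tag ends the slurp loop.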
t/chainlint.sed | 57 +++++++++++++++++-----------
t/chainlint/here-doc.expect | 2 +
t/chainlint/here-doc.test | 7 ++++
t/chainlint/nested-here-doc.expect | 2 +
t/chainlint/nested-here-doc.test | 10 +++++
t/chainlint/subshell-here-doc.expect | 4 ++
t/chainlint/subshell-here-doc.test | 8 ++++
7 files changed, 67 insertions(+), 23 deletions(-)
diff --git a/t/chainlint.sed b/t/chainlint.sed
index 5f0882cb38..2af1a687f8 100644
--- a/t/chainlint.sed
+++ b/t/chainlint.sed
@@ -61,6 +61,22 @@
# "else", and "fi" in if-then-else likewise must not end with "&&", thus
# receives similar treatment.
#
+# Swallowing here-docs with arbitrary tags requires a bit of finesse. When a
+# line such as "cat <<EOF >out" is seen, the here-doc tag is moved to the front
+# of the line enclosed in angle brackets as a sentinel, giving "<EOF>cat >out".
+# As each subsequent line is read, it is appended to the target line and a
+# (whitespace-loose) back-reference match /^<(.*)>\n\1$/ is attempted to see if
+# the content inside "<...>" matches the entirety of the newly-read line. For
+# instance, if the next line read is "some data", when concatenated with the
+# target line, it becomes "<EOF>cat >out\nsome data", and a match is attempted
+# to see if "EOF" matches "some data". Since it doesn't, the next line is
+# attempted. When a line consisting of only "EOF" (and possible whitespace) is
+# encountered, it is appended to the target line giving "<EOF>cat >out\nEOF",
+# in which case the "EOF" inside "<...>" does match the text following the
+# newline, thus the closing here-doc tag has been found. The closing tag line
+# and the "<...>" prefix on the target line are then discarded, leaving just
+# the target line "cat >out".
+#
# To facilitate regression testing (and manual debugging), a ">" annotation is
# applied to the line containing ")" which closes a subshell, ">>" to a line
# closing a nested subshell, and ">>>" to a line closing both at once. This
@@ -78,14 +94,17 @@
# here-doc -- swallow it to avoid false hits within its body (but keep the
# command to which it was attached)
-/<<[ ]*[-\\]*EOF[ ]*/ {
- s/[ ]*<<[ ]*[-\\]*EOF//
- h
+/<<[ ]*[-\\]*[A-Za-z0-9_]/ {
+ s/^\(.*\)<<[ ]*[-\\]*\([A-Za-z0-9_][A-Za-z0-9_]*\)/<\2>\1<</
+ s/[ ]*<<//
:hereslurp
N
- s/.*\n//
- /^[ ]*EOF[ ]*$/!bhereslurp
- x
+ /^<\([^>]*\)>.*\n[ ]*\1[ ]*$/!{
+ s/\n.*$//
+ bhereslurp
+ }
+ s/^<[^>]*>//
+ s/\n.*$//
}
# one-liner "(...) &&"
@@ -139,9 +158,7 @@ s/.*\n//
/"[^'"]*'[^'"]*"/!bsqstring
}
# here-doc -- swallow it
-/<<[ ]*[-\\]*EOF/bheredoc
-/<<[ ]*[-\\]*EOT/bheredoc
-/<<[ ]*[-\\]*INPUT_END/bheredoc
+/<<[ ]*[-\\]*[A-Za-z0-9_]/bheredoc
# comment or empty line -- discard since final non-comment, non-empty line
# before closing ")", "done", "elsif", "else", or "fi" will need to be
# re-visited to drop "suspect" marking since final line of those constructs
@@ -249,23 +266,17 @@ s/\n//
bcheckchain
# found here-doc -- swallow it to avoid false hits within its body (but keep
-# the command to which it was attached); take care to handle here-docs nested
-# within here-docs by only recognizing closing tag matching outer here-doc
-# opening tag
+# the command to which it was attached)
:heredoc
-/EOF/{ s/[ ]*<<[ ]*[-\\]*EOF//; s/^/EOF/; }
-/EOT/{ s/[ ]*<<[ ]*[-\\]*EOT//; s/^/EOT/; }
-/INPUT_END/{ s/[ ]*<<[ ]*[-\\]*INPUT_END//; s/^/INPUT_END/; }
+s/^\(.*\)<<[ ]*[-\\]*\([A-Za-z0-9_][A-Za-z0-9_]*\)/<\2>\1<</
+s/[ ]*<<//
:hereslurpsub
N
-/^EOF.*\n[ ]*EOF[ ]*$/bhereclose
-/^EOT.*\n[ ]*EOT[ ]*$/bhereclose
-/^INPUT_END.*\n[ ]*INPUT_END[ ]*$/bhereclose
-bhereslurpsub
-:hereclose
-s/^EOF//
-s/^EOT//
-s/^INPUT_END//
+/^<\([^>]*\)>.*\n[ ]*\1[ ]*$/!{
+ s/\n.*$//
+ bhereslurpsub
+}
+s/^<[^>]*>//
s/\n.*$//
bcheckchain
diff --git a/t/chainlint/here-doc.expect b/t/chainlint/here-doc.expect
index 2328fe7753..33bc3cc0b4 100644
--- a/t/chainlint/here-doc.expect
+++ b/t/chainlint/here-doc.expect
@@ -1,3 +1,5 @@
boodle wobba gorgo snoot wafta snurb &&
+cat >foo &&
+
horticulture
diff --git a/t/chainlint/here-doc.test b/t/chainlint/here-doc.test
index bd36f6e1d3..8986eefe74 100644
--- a/t/chainlint/here-doc.test
+++ b/t/chainlint/here-doc.test
@@ -7,6 +7,13 @@ quoth the raven,
nevermore...
EOF
+# LINT: swallow here-doc with arbitrary tag
+cat <<-Arbitrary_Tag_42 >foo &&
+snoz
+boz
+woz
+Arbitrary_Tag_42
+
# LINT: swallow here-doc (EOF is last line of test)
horticulture <<\EOF
gomez
diff --git a/t/chainlint/nested-here-doc.expect b/t/chainlint/nested-here-doc.expect
index 559301e005..0c9ef1cfc6 100644
--- a/t/chainlint/nested-here-doc.expect
+++ b/t/chainlint/nested-here-doc.expect
@@ -1,3 +1,5 @@
+cat >foop &&
+
(
cat &&
?!AMP?! cat
diff --git a/t/chainlint/nested-here-doc.test b/t/chainlint/nested-here-doc.test
index 027e0bb3ff..f35404bf0f 100644
--- a/t/chainlint/nested-here-doc.test
+++ b/t/chainlint/nested-here-doc.test
@@ -1,3 +1,13 @@
+# LINT: inner "EOF" not misintrepreted as closing ARBITRARY here-doc
+cat <<ARBITRARY >foop &&
+naddle
+fub <<EOF
+ nozzle
+ noodle
+EOF
+formp
+ARBITRARY
+
(
# LINT: inner "EOF" not misintrepreted as closing INPUT_END here-doc
cat <<-\INPUT_END &&
diff --git a/t/chainlint/subshell-here-doc.expect b/t/chainlint/subshell-here-doc.expect
index 19d5aff233..7c2da63bc7 100644
--- a/t/chainlint/subshell-here-doc.expect
+++ b/t/chainlint/subshell-here-doc.expect
@@ -2,4 +2,8 @@
echo wobba gorgo snoot wafta snurb &&
?!AMP?! cat >bip
echo >bop
+>) &&
+(
+ cat >bup &&
+ meep
>)
diff --git a/t/chainlint/subshell-here-doc.test b/t/chainlint/subshell-here-doc.test
index 9c3564c247..05139af0b5 100644
--- a/t/chainlint/subshell-here-doc.test
+++ b/t/chainlint/subshell-here-doc.test
@@ -20,4 +20,12 @@
wednesday
pugsly
EOF
+) &&
+(
+# LINT: swallow here-doc with arbitrary tag
+ cat <<-\ARBITRARY >bup &&
+ glink
+ FIZZ
+ ARBITRARY
+ meep
)
--
2.18.0.267.gbc8be36ecb