git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Jeffrey Walton" <noloader@gmail.com>,
	"Michał Kiedrowicz" <michal.kiedrowicz@gmail.com>,
	"J Smith" <dark.panda@gmail.com>,
	"Victor Leschuk" <vleschuk@gmail.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Fredrik Kuivinen" <frekui@gmail.com>,
	"Brandon Williams" <bmwill@google.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v4 13/31] grep: prepare for testing binary regexes containing rx metacharacters
Date: Thu, 25 May 2017 19:45:17 +0000
Message-ID: <20170525194535.9324-14-avarab@gmail.com>
In-Reply-To: <20170525194535.9324-1-avarab@gmail.com>

Add setup code needed for testing regexes that contain both binary
data and regex metacharacters.

The POSIX regcomp() function inherently can't support that, because it
takes a \0-delimited char *. Other regex engine APIs, such as PCRE v2,
take a pattern/length pair and can thus handle \0s in patterns like
any other character.

When kwset was imported in commit 9eceddeec6 ("Use kwset in grep",
2011-08-21) this limitation was fixed, but at the expense of
introducing the undocumented limitation that any pattern containing \0
implicitly becomes a fixed match (equivalent to -F having been
provided).

That's not something we'd like to keep in the future. The inability to
match patterns containing \0 as regexes is a leaky implementation
detail.

So add tests as a first step towards changing that. To test that
\0-patterns can properly match as regexes, the test string needs to
contain some regex metacharacters.

There were other blind spots in the tests. The code around kwset
specially handles case-insensitive & non-ASCII data, but there were no
tests for this.

Fix all of that by amending the text being matched to contain both
regex metacharacters & non-ASCII data.
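
As a rough illustration of what the amended fixture contains, the
sketch below builds the same string outside the test suite and counts
its NUL bytes. The q_to_nul function here is a hypothetical stand-in
for the helper of the same name that the test calls; the real
definition lives in git's test library and may differ.

```shell
# Hypothetical stand-in for the test suite's q_to_nul helper:
# turn every "Q" placeholder into a NUL byte.
q_to_nul () {
	LC_ALL=C tr 'Q' '\000'
}

# Build the amended fixture text and count the NUL bytes in it.
# The text mixes NULs, regex metacharacters ("[*]", "*") and
# non-ASCII characters ("æ", "ð").
nul_count=$(printf '%s\n' 'binaryQfileQm[*]cQ*æQð' |
	q_to_nul |
	LC_ALL=C tr -cd '\000' |
	wc -c |
	tr -d ' ')

echo "NUL bytes in fixture: $nul_count"
```

Each of the four Q placeholders becomes one NUL byte, so the fixture
splits into five NUL-separated chunks, two of which contain regex
metacharacters and two of which contain non-ASCII characters.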

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7008-grep-binary.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index df93d8e44c..20370d6e0c 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -28,7 +28,7 @@ nul_match () {
 }
 
 test_expect_success 'setup' "
-	echo 'binaryQfile' | q_to_nul >a &&
+	echo 'binaryQfileQm[*]cQ*æQð' | q_to_nul >a &&
 	git add a &&
 	git commit -m.
 "
@@ -162,7 +162,7 @@ test_expect_success 'grep does not honor textconv' '
 '
 
 test_expect_success 'grep --textconv honors textconv' '
-	echo "a:binaryQfile" >expect &&
+	echo "a:binaryQfileQm[*]cQ*æQð" >expect &&
 	git grep --textconv Qfile >actual &&
 	test_cmp expect actual
 '
@@ -172,7 +172,7 @@ test_expect_success 'grep --no-textconv does not honor textconv' '
 '
 
 test_expect_success 'grep --textconv blob honors textconv' '
-	echo "HEAD:a:binaryQfile" >expect &&
+	echo "HEAD:a:binaryQfileQm[*]cQ*æQð" >expect &&
 	git grep --textconv Qfile HEAD:a >actual &&
 	test_cmp expect actual
 '
-- 
2.13.0.303.g4ebf302169


