git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/8] grep: simplify & delete code by changing obscure cfg variable behavior
@ 2021-11-06 21:10 Ævar Arnfjörð Bjarmason
  2021-11-06 21:10 ` [PATCH 1/8] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
                   ` (8 more replies)
  0 siblings, 9 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-06 21:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, J Smith, Ævar Arnfjörð Bjarmason

I'd like to use the grep API for more things in some upcoming
optimization patches, and to do that I need to use grep_init(),
grep_config() etc.

These functions have a very unusual API, and we've got quite a bit of code
in grep.c to support an obscure edge case in how "grep.extendedRegexp"
behaves when clashing with a "grep.patternType" variable present in
the same config space.

This series is an opinionated change of that behavior, and resulting
large deletion of code.

The series starts out by deleting some unused code in grep.c, and
moving a bit of related builtin/grep.c-specific code to that file, and
out of grep.c.

Ævar Arnfjörð Bjarmason (8):
  grep.h: remove unused "regex_t regexp" from grep_opt
  git.c & grep.c: assert that "prefix" is NULL or non-zero string
  grep: remove unused "prefix_length" member
  grep.c: move "prefix" out of "struct grep_opt"
  log tests: check if grep_config() is called by "log"-like cmds
  grep API: call grep_config() after grep_init()
  grep: simplify config parsing, change grep.<rx config> interaction
  grep: make "extendedRegexp=true" the same as "patternType=extended"

 Documentation/config/grep.txt |   4 +-
 Documentation/git-grep.txt    |   4 +-
 builtin/grep.c                |  31 +++++----
 builtin/log.c                 |  13 +++-
 git.c                         |   4 +-
 grep.c                        | 118 ++++------------------------------
 grep.h                        |  34 ++++++----
 revision.c                    |   4 +-
 t/t4202-log.sh                |  16 +++++
 t/t7810-grep.sh               |   4 +-
 10 files changed, 89 insertions(+), 143 deletions(-)

-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH 1/8] grep.h: remove unused "regex_t regexp" from grep_opt
  2021-11-06 21:10 [PATCH 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
@ 2021-11-06 21:10 ` Ævar Arnfjörð Bjarmason
  2021-11-06 21:10 ` [PATCH 2/8] git.c & grep.c: assert that "prefix" is NULL or non-zero string Ævar Arnfjörð Bjarmason
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-06 21:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, J Smith, Ævar Arnfjörð Bjarmason

This "regex_t" in grep_opt has not been used since
f9b9faf6f8a (builtin-grep: allow more than one patterns., 2006-05-02).
We still use a "regex_t" for compiling regexes, but that's in the
"grep_pat" struct.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.h b/grep.h
index 3e8815c347b..95cccb670f9 100644
--- a/grep.h
+++ b/grep.h
@@ -136,7 +136,6 @@ struct grep_opt {
 
 	const char *prefix;
 	int prefix_length;
-	regex_t regexp;
 	int linenum;
 	int columnnum;
 	int invert;
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH 2/8] git.c & grep.c: assert that "prefix" is NULL or non-zero string
  2021-11-06 21:10 [PATCH 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
  2021-11-06 21:10 ` [PATCH 1/8] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
@ 2021-11-06 21:10 ` Ævar Arnfjörð Bjarmason
  2021-11-08 20:37   ` Taylor Blau
  2021-11-06 21:10 ` [PATCH 3/8] grep: remove unused "prefix_length" member Ævar Arnfjörð Bjarmason
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-06 21:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, J Smith, Ævar Arnfjörð Bjarmason

The "prefix" we get from setup.c is either going to be NULL or a
string of length >0, never "". So let's drop the "prefix && *prefix"
check in grep.c added in 0d042fecf2f (git-grep: show pathnames
relative to the current directory, 2006-08-11).

As seen in code in revision.c that was added in cd676a51367 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that makes do without this
"*prefix" check.

This makes it easier to reason about a change to the "prefix_length"
code in grep.c in a subsequent commit, and since we're going to the
trouble of doing that let's leave behind an assert() to promise this
to any future callers.
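
The invariant above can be sketched roughly as follows. This is an
illustration only, not the code in git.c or grep.c; the helper name
checked_prefix_length() is made up for the example. It assumes callers
pass either NULL or a non-empty string, which is what the assert()
documents.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: returns the length of
 * "prefix", relying on the promise that it is NULL or non-empty.
 */
size_t checked_prefix_length(const char *prefix)
{
	/* Document the promise: "" is never passed, only NULL or e.g. "t/". */
	assert(!prefix || *prefix);

	/* No "prefix && *prefix" needed here, a NULL check is enough. */
	return prefix ? strlen(prefix) : 0;
}
```

With that promise in place a result of 0 can only mean "no prefix",
so a plain pointer check carries the same information as the length.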

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 git.c  | 4 ++--
 grep.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/git.c b/git.c
index 5ff21be21f3..aa4f0d77c4b 100644
--- a/git.c
+++ b/git.c
@@ -420,9 +420,8 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 {
 	int status, help;
 	struct stat st;
-	const char *prefix;
+	const char *prefix = NULL;
 
-	prefix = NULL;
 	help = argc == 2 && !strcmp(argv[1], "-h");
 	if (!help) {
 		if (p->option & RUN_SETUP)
@@ -431,6 +430,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 			int nongit_ok;
 			prefix = setup_git_directory_gently(&nongit_ok);
 		}
+		assert(!prefix || (prefix && *prefix));
 		precompose_argv_prefix(argc, argv, NULL);
 		if (use_pager == -1 && p->option & (RUN_SETUP | RUN_SETUP_GENTLY) &&
 		    !(p->option & DELAY_PAGER_CONFIG))
diff --git a/grep.c b/grep.c
index f6e113e9f0f..88ebc504630 100644
--- a/grep.c
+++ b/grep.c
@@ -145,7 +145,7 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
 
 	opt->repo = repo;
 	opt->prefix = prefix;
-	opt->prefix_length = (prefix && *prefix) ? strlen(prefix) : 0;
+	opt->prefix_length = prefix ? strlen(prefix) : 0;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH 3/8] grep: remove unused "prefix_length" member
  2021-11-06 21:10 [PATCH 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
  2021-11-06 21:10 ` [PATCH 1/8] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
  2021-11-06 21:10 ` [PATCH 2/8] git.c & grep.c: assert that "prefix" is NULL or non-zero string Ævar Arnfjörð Bjarmason
@ 2021-11-06 21:10 ` Ævar Arnfjörð Bjarmason
  2021-11-08 20:42   ` Taylor Blau
  2021-11-06 21:10 ` [PATCH 4/8] grep.c: move "prefix" out of "struct grep_opt" Ævar Arnfjörð Bjarmason
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-06 21:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, J Smith, Ævar Arnfjörð Bjarmason

Remove the "prefix_length" member, which we compute with a strlen() on
the "prefix" argument to grep_init(), but whose value hasn't been
used as a length since 493b7a08d80 (grep: accept relative paths
outside current working directory, 2009-09-05).

When this code was added in 0d042fecf2f (git-grep: show pathnames
relative to the current directory, 2006-08-11) we used the length, but
since 493b7a08d80 we haven't used it for anything except a boolean
check that we could have done on the "prefix" member itself.

Before a preceding commit we also used to guard the strlen() with
"prefix && *prefix", but as that commit notes the RHS of that && chain
was also redundant.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 4 ++--
 grep.c         | 1 -
 grep.h         | 1 -
 3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 9e34a820ad4..bd4d2107351 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -315,7 +315,7 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 	strbuf_reset(out);
 
 	if (opt->null_following_name) {
-		if (opt->relative && opt->prefix_length) {
+		if (opt->relative && opt->prefix) {
 			struct strbuf rel_buf = STRBUF_INIT;
 			const char *rel_name =
 				relative_path(filename + tree_name_len,
@@ -332,7 +332,7 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 		return;
 	}
 
-	if (opt->relative && opt->prefix_length)
+	if (opt->relative && opt->prefix)
 		quote_path(filename + tree_name_len, opt->prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
diff --git a/grep.c b/grep.c
index 88ebc504630..755afb5f96d 100644
--- a/grep.c
+++ b/grep.c
@@ -145,7 +145,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
 
 	opt->repo = repo;
 	opt->prefix = prefix;
-	opt->prefix_length = prefix ? strlen(prefix) : 0;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
diff --git a/grep.h b/grep.h
index 95cccb670f9..467d775b5a9 100644
--- a/grep.h
+++ b/grep.h
@@ -135,7 +135,6 @@ struct grep_opt {
 	struct repository *repo;
 
 	const char *prefix;
-	int prefix_length;
 	int linenum;
 	int columnnum;
 	int invert;
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH 4/8] grep.c: move "prefix" out of "struct grep_opt"
  2021-11-06 21:10 [PATCH 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                   ` (2 preceding siblings ...)
  2021-11-06 21:10 ` [PATCH 3/8] grep: remove unused "prefix_length" member Ævar Arnfjörð Bjarmason
@ 2021-11-06 21:10 ` Ævar Arnfjörð Bjarmason
  2021-11-08 20:56   ` Taylor Blau
  2021-11-06 21:10 ` [PATCH 5/8] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-06 21:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, J Smith, Ævar Arnfjörð Bjarmason

The "struct grep_opt" is a mixture of things that would be needed by
all callers of the grep.c API, and quite a few things that only
builtin/grep.c needs.

Since we got rid of "prefix_length" in the previous commit, let's move
the "prefix" variable over to "builtin/grep.c" where it's used. To do
this let's create a "struct grep_cmd_opt", which we'll have a pointer
to in a new "caller_priv" member in "struct grep_opt" (the existing
"priv" is used by the top-level "grep.c" itself).

We might eventually need to have grep_opt_dup() learn about this new
member, but since the prefix can be a "const char *const" (i.e. we
never change it in any way) let's leave that aside for now. In any
case "builtin/grep.c" is the only user of grep_opt_dup(), even though
it lives in the top-level "grep.c".
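
As a rough sketch of the "caller_priv" idea described above: the
generic struct carries an opaque pointer, and only the caller knows
what it points to. The names "struct api_opt", "struct cmd_opt",
cmd_prefix() and wants_relative_path() are stand-ins invented for this
example, they are not the real grep.h or builtin/grep.c definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the generic API struct; the real one has many more members. */
struct api_opt {
	int relative;
	void *caller_priv; /* set and interpreted by the caller only */
};

/* Stand-in for the command-specific options kept out of the API struct. */
struct cmd_opt {
	const char *const prefix;
};

/* The caller converts "caller_priv" back to its own type to reach "prefix". */
const char *cmd_prefix(const struct api_opt *opt)
{
	const struct cmd_opt *cmd = opt->caller_priv;

	assert(cmd);
	return cmd->prefix;
}

/* Same shape as the "opt->relative && prefix" checks in the command. */
int wants_relative_path(const struct api_opt *opt)
{
	return opt->relative && cmd_prefix(opt) != NULL;
}
```

The library side never dereferences "caller_priv", so it doesn't need
to know about "struct cmd_opt" at all. The cost is that the caller
must remember to set the pointer before any callback uses it.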

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 17 ++++++++++++-----
 grep.c         |  3 +--
 grep.h         |  4 ++--
 revision.c     |  2 +-
 4 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index bd4d2107351..960c7aac123 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -37,6 +37,10 @@ static int num_threads;
 
 static pthread_t *threads;
 
+struct grep_cmd_opt {
+	const char *const prefix;
+};
+
 /* We use one producer thread and THREADS consumer
  * threads. The producer adds struct work_items to 'todo' and the
  * consumers pick work items from the same array.
@@ -312,14 +316,15 @@ static int grep_cmd_config(const char *var, const char *value, void *cb)
 static void grep_source_name(struct grep_opt *opt, const char *filename,
 			     int tree_name_len, struct strbuf *out)
 {
+	struct grep_cmd_opt *opt_cmd = opt->caller_priv;
 	strbuf_reset(out);
 
 	if (opt->null_following_name) {
-		if (opt->relative && opt->prefix) {
+		if (opt->relative && opt_cmd->prefix) {
 			struct strbuf rel_buf = STRBUF_INIT;
 			const char *rel_name =
 				relative_path(filename + tree_name_len,
-					      opt->prefix, &rel_buf);
+					      opt_cmd->prefix, &rel_buf);
 
 			if (tree_name_len)
 				strbuf_add(out, filename, tree_name_len);
@@ -332,8 +337,8 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 		return;
 	}
 
-	if (opt->relative && opt->prefix)
-		quote_path(filename + tree_name_len, opt->prefix, out, 0);
+	if (opt->relative && opt_cmd->prefix)
+		quote_path(filename + tree_name_len, opt_cmd->prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
 
@@ -837,6 +842,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int external_grep_allowed__ignored;
 	const char *show_in_pager = NULL, *default_pager = "dummy";
 	struct grep_opt opt;
+	struct grep_cmd_opt opt_cmd = { .prefix = prefix };
 	struct object_array list = OBJECT_ARRAY_INIT;
 	struct pathspec pathspec;
 	struct string_list path_list = STRING_LIST_INIT_DUP;
@@ -964,7 +970,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	};
 
 	git_config(grep_cmd_config, NULL);
-	grep_init(&opt, the_repository, prefix);
+	grep_init(&opt, the_repository);
+	opt.caller_priv = &opt_cmd;
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/grep.c b/grep.c
index 755afb5f96d..c9065254aeb 100644
--- a/grep.c
+++ b/grep.c
@@ -139,12 +139,11 @@ int grep_config(const char *var, const char *value, void *cb)
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
  */
-void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
+void grep_init(struct grep_opt *opt, struct repository *repo)
 {
 	*opt = grep_defaults;
 
 	opt->repo = repo;
-	opt->prefix = prefix;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
diff --git a/grep.h b/grep.h
index 467d775b5a9..6b923d8599c 100644
--- a/grep.h
+++ b/grep.h
@@ -134,7 +134,6 @@ struct grep_opt {
 	 */
 	struct repository *repo;
 
-	const char *prefix;
 	int linenum;
 	int columnnum;
 	int invert;
@@ -172,6 +171,7 @@ struct grep_opt {
 	int show_hunk_mark;
 	int file_break;
 	int heading;
+	void *caller_priv;
 	void *priv;
 
 	void (*output)(struct grep_opt *opt, const void *data, size_t size);
@@ -179,7 +179,7 @@ struct grep_opt {
 };
 
 int grep_config(const char *var, const char *value, void *);
-void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index ab7c1358042..9f9b0d2429e 100644
--- a/revision.c
+++ b/revision.c
@@ -1833,7 +1833,7 @@ void repo_init_revisions(struct repository *r,
 	revs->commit_format = CMIT_FMT_DEFAULT;
 	revs->expand_tabs_in_log_default = 8;
 
-	grep_init(&revs->grep_filter, revs->repo, prefix);
+	grep_init(&revs->grep_filter, revs->repo);
 	revs->grep_filter.status_only = 1;
 
 	repo_diff_setup(revs->repo, &revs->diffopt);
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH 5/8] log tests: check if grep_config() is called by "log"-like cmds
  2021-11-06 21:10 [PATCH 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                   ` (3 preceding siblings ...)
  2021-11-06 21:10 ` [PATCH 4/8] grep.c: move "prefix" out of "struct grep_opt" Ævar Arnfjörð Bjarmason
@ 2021-11-06 21:10 ` Ævar Arnfjörð Bjarmason
  2021-11-06 21:10 ` [PATCH 6/8] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-06 21:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, J Smith, Ævar Arnfjörð Bjarmason

Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
for pattern style options & config, 2017-05-20) to check not only
whether "git log" handles "grep.patternType", but also "git show"
etc.

It's sufficient to check whether a PCRE regex matches for the purposes
of this test. We otherwise assume that it's running the same code as
"git log", whose behavior is tested more exhaustively by the tests
added in 9df46763ef1.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 7884e3d46b3..a114c49ef27 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -449,6 +449,22 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
 	)
 '
 
+for cmd in show whatchanged reflog format-patch
+do
+	myarg=
+	if test "$cmd" = "format-patch"
+	then
+		myarg="HEAD~.."
+	fi
+
+	test_expect_success PCRE "$cmd: understands grep.patternType=perl, like 'log'" '
+		git -c grep.patternType=fixed -C pattern-type $cmd --grep="1(?=\|2)" $myarg >actual &&
+		test_must_be_empty actual &&
+		git -c grep.patternType=perl -C pattern-type $cmd --grep="1(?=\|2)" $myarg >actual &&
+		test_file_not_empty actual
+	'
+done
+
 test_expect_success 'log --author' '
 	cat >expect <<-\EOF &&
 	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH 6/8] grep API: call grep_config() after grep_init()
  2021-11-06 21:10 [PATCH 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                   ` (4 preceding siblings ...)
  2021-11-06 21:10 ` [PATCH 5/8] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2021-11-06 21:10 ` Ævar Arnfjörð Bjarmason
  2021-11-08 21:49   ` Taylor Blau
  2021-11-06 21:10 ` [PATCH 7/8] grep: simplify config parsing, change grep.<rx config> interaction Ævar Arnfjörð Bjarmason
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-06 21:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, J Smith, Ævar Arnfjörð Bjarmason

The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().

So we effectively (b) read the config, (a) then set the defaults, (c)
followed by user options. Usually those are ordered as "a", "b" and
"c" instead.

As the comments being removed here show, the previous behavior needed
to be carefully explained, as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.
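
The more usual "defaults, then config, then user options" ordering
can be sketched like this. It's a simplified illustration under made-up
names ("struct init_demo_opt", INIT_DEMO_OPT_INIT, init_demo_init(),
init_demo_config() and the "demo.color" key are all hypothetical), not
the grep.c code itself.

```c
#include <assert.h>
#include <string.h>

/* Illustrative stand-in, not the real "struct grep_opt". */
struct init_demo_opt {
	int max_depth;
	int color;
};

#define INIT_DEMO_OPT_INIT { .max_depth = -1, .color = -1 }

/* (a) defaults: every instance starts from the same blank template. */
void init_demo_init(struct init_demo_opt *opt)
{
	struct init_demo_opt blank = INIT_DEMO_OPT_INIT;

	assert(opt);
	memcpy(opt, &blank, sizeof(*opt));
}

/* (b) config: the callback modifies the instance passed via "cb", not a global. */
int init_demo_config(const char *var, const char *value, void *cb)
{
	struct init_demo_opt *opt = cb;

	if (!strcmp(var, "demo.color"))
		opt->color = !strcmp(value, "true");
	return 0;
}
```

Step (c) is then whatever the command-line parser writes into the
same instance afterwards. Since nothing is stored in a static
template, two instances initialized this way can't see each other's
configuration.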

See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e0 (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  4 ++--
 builtin/log.c  | 13 +++++++++++--
 grep.c         | 39 +++------------------------------------
 grep.h         | 21 +++++++++++++++++++++
 4 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 960c7aac123..7f95f44e948 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -288,7 +288,7 @@ static int wait_all(void)
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
 	int st = grep_config(var, value, cb);
-	if (git_color_default_config(var, value, cb) < 0)
+	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
 	if (!strcmp(var, "grep.threads")) {
@@ -969,8 +969,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		OPT_END()
 	};
 
-	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, the_repository);
+	git_config(grep_cmd_config, &opt);
 	opt.caller_priv = &opt_cmd;
 
 	/*
diff --git a/builtin/log.c b/builtin/log.c
index f75d87e8d7f..bfddacdfa6c 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -505,8 +505,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (grep_config(var, value, cb) < 0)
-		return -1;
 	if (git_gpg_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
@@ -521,6 +519,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.simplify_history = 0;
 	memset(&opt, 0, sizeof(opt));
@@ -635,6 +635,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 
 	memset(&match_all, 0, sizeof(match_all));
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.always_show_header = 1;
 	rev.no_walk = 1;
@@ -718,6 +720,8 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix)
 
 	repo_init_revisions(the_repository, &rev, prefix);
 	init_reflog_walk(&rev.reflog_info);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.verbose_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -751,6 +755,8 @@ int cmd_log(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.always_show_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -1833,10 +1839,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	extra_hdr.strdup_strings = 1;
 	extra_to.strdup_strings = 1;
 	extra_cc.strdup_strings = 1;
+
 	init_log_defaults();
 	init_display_notes(&notes_opt);
 	git_config(git_format_config, NULL);
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.show_notes = show_notes;
 	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
 	rev.commit_format = CMIT_FMT_EMAIL;
diff --git a/grep.c b/grep.c
index c9065254aeb..fb3f63c63ef 100644
--- a/grep.c
+++ b/grep.c
@@ -19,27 +19,6 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
-static struct grep_opt grep_defaults = {
-	.relative = 1,
-	.pathname = 1,
-	.max_depth = -1,
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
-	.colors = {
-		[GREP_COLOR_CONTEXT] = "",
-		[GREP_COLOR_FILENAME] = "",
-		[GREP_COLOR_FUNCTION] = "",
-		[GREP_COLOR_LINENO] = "",
-		[GREP_COLOR_COLUMNNO] = "",
-		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_SELECTED] = "",
-		[GREP_COLOR_SEP] = GIT_COLOR_CYAN,
-	},
-	.only_matching = 0,
-	.color = -1,
-	.output = std_output,
-};
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -75,20 +54,12 @@ define_list_config_array_extra(color_grep_slots, {"match"});
  */
 int grep_config(const char *var, const char *value, void *cb)
 {
-	struct grep_opt *opt = &grep_defaults;
+	struct grep_opt *opt = cb;
 	const char *slot;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	/*
-	 * The instance of grep_opt that we set up here is copied by
-	 * grep_init() to be used by each individual invocation.
-	 * When populating a new field of this structure here, be
-	 * sure to think about ownership -- e.g., you might need to
-	 * override the shallow copy in grep_init() with a deep copy.
-	 */
-
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
 		return 0;
@@ -134,14 +105,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/*
- * Initialize one instance of grep_opt and copy the
- * default values from the template we read the configuration
- * information in an earlier call to git_config(grep_config).
- */
 void grep_init(struct grep_opt *opt, struct repository *repo)
 {
-	*opt = grep_defaults;
+	struct grep_opt blank = GREP_OPT_INIT;
+	memcpy(opt, &blank, sizeof(*opt));
 
 	opt->repo = repo;
 	opt->pattern_tail = &opt->pattern_list;
diff --git a/grep.h b/grep.h
index 6b923d8599c..30a7dfd3294 100644
--- a/grep.h
+++ b/grep.h
@@ -178,6 +178,27 @@ struct grep_opt {
 	void *output_priv;
 };
 
+#define GREP_OPT_INIT { \
+	.relative = 1, \
+	.pathname = 1, \
+	.max_depth = -1, \
+	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
+	.colors = { \
+		[GREP_COLOR_CONTEXT] = "", \
+		[GREP_COLOR_FILENAME] = "", \
+		[GREP_COLOR_FUNCTION] = "", \
+		[GREP_COLOR_LINENO] = "", \
+		[GREP_COLOR_COLUMNNO] = "", \
+		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_SELECTED] = "", \
+		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
+	}, \
+	.only_matching = 0, \
+	.color = -1, \
+	.output = std_output, \
+}
+
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH 7/8] grep: simplify config parsing, change grep.<rx config> interaction
  2021-11-06 21:10 [PATCH 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                   ` (5 preceding siblings ...)
  2021-11-06 21:10 ` [PATCH 6/8] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2021-11-06 21:10 ` Ævar Arnfjörð Bjarmason
  2021-11-08 23:04   ` Taylor Blau
  2021-11-06 21:10 ` [PATCH 8/8] grep: make "extendedRegexp=true" the same as "patternType=extended" Ævar Arnfjörð Bjarmason
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
  8 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-06 21:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, J Smith, Ævar Arnfjörð Bjarmason

Change the interaction between "grep.patternType=default" and
"grep.extendedRegexp=true" to make setting "grep.extendedRegexp=true"
synonymous with setting "grep.patternType=extended".

This changes our existing config parsing behavior as detailed below,
but in a way that's consistent with how we parse other
configuration.

Pedantically speaking we're probably breaking past promises here, but
I doubt that this will impact anyone in practice. The reduction in
complexity and resulting consistency with other default config
behavior is worth it.

When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
grep.patternType configuration setting, 2012-08-03) we made two
seemingly contradictory promises:

 1. You can set "grep.patternType", and "[setting it to] 'default'
    will return to the default matching behavior".

 2. Support the existing "grep.extendedRegexp" option, but ignore it
    when the new "grep.patternType" is set, *except* "when the
    `grep.patternType` option is set to a value other than 'default'".

I think that 84befcd0a4a probably didn't intend this behavior, but
instead ended up conflating our internal "unspecified" state with a
user's explicit desire to set the configuration back to the
default.

I.e. a user would correctly expect this to keep working:

    # ERE grep
    git -c grep.extendedRegexp=true grep <pattern>

And likewise for "grep.patternType=default" to take precedence over
the disfavored "grep.extendedRegexp" option, i.e. the usual "last set
wins" semantics.

    # BRE grep
    git -c grep.extendedRegexp=true -c grep.patternType=basic grep <pattern>

But probably not for this to ignore the new "grep.patternType" option
entirely, say if /etc/gitconfig was still setting
"grep.extendedRegexp", but "~/.gitconfig" used the new
"grep.patternType" (and wanted to use the "default" value):

    # Was ERE, now BRE
    git -c grep.extendedRegexp=true -c grep.patternType=default grep <pattern>

I think that in practice nobody or almost nobody is going to be
relying on this obscure interaction, and as shown here it makes the
config parsing much simpler. We no longer have to carry a complex
state machine in "grep_commit_pattern_type()" and
"grep_set_pattern_type_option()".

We can also do away with the "int fixed" and "int pcre2" members in
favor of using "pattern_type_option" directly in "grep.c", as well as
dropping the "pattern_type_arg" variable in "builtin/grep.c" in favor
of using the "pattern_type_option" member directly.
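
To illustrate what tracking only a single pattern type field looks
like, here's a minimal sketch. It is loosely modeled on the
grep_config() hunk in this patch, but everything in it is a
hypothetical stand-in: the "rx_demo" names, the "demo.*" keys and the
simplified boolean parsing. How a value of "default" should be handled
is deliberately left out.

```c
#include <assert.h>
#include <string.h>

enum rx_demo_type {
	RX_DEMO_UNSPECIFIED = 0,
	RX_DEMO_BRE,
	RX_DEMO_ERE,
	RX_DEMO_FIXED,
	RX_DEMO_PCRE
};

struct rx_demo_opt {
	enum rx_demo_type pattern_type;
};

/* Simplified boolean parser; a real config parser accepts more spellings. */
static int rx_demo_bool(const char *value)
{
	return !strcmp(value, "true");
}

int rx_demo_config(const char *var, const char *value, void *cb)
{
	struct rx_demo_opt *opt = cb;

	assert(opt && var && value);

	/* The legacy key only picks ERE while nothing else has chosen a type. */
	if (!strcmp(var, "demo.extendedregexp")) {
		if (opt->pattern_type == RX_DEMO_UNSPECIFIED && rx_demo_bool(value))
			opt->pattern_type = RX_DEMO_ERE;
		return 0;
	}

	/* The newer key always overwrites whatever was there before. */
	if (!strcmp(var, "demo.patterntype")) {
		if (!strcmp(value, "basic"))
			opt->pattern_type = RX_DEMO_BRE;
		else if (!strcmp(value, "extended"))
			opt->pattern_type = RX_DEMO_ERE;
		else if (!strcmp(value, "fixed"))
			opt->pattern_type = RX_DEMO_FIXED;
		else if (!strcmp(value, "perl"))
			opt->pattern_type = RX_DEMO_PCRE;
		return 0;
	}
	return 0;
}
```

There's no separate "extended_regexp" flag to keep in sync here, so
no later "commit" step is needed to reconcile two pieces of state.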

See my 07a3d411739 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config/grep.txt |  3 +-
 Documentation/git-grep.txt    |  3 +-
 builtin/grep.c                | 10 ++---
 grep.c                        | 71 +++++------------------------------
 grep.h                        |  6 +--
 revision.c                    |  2 -
 t/t7810-grep.sh               |  2 +-
 7 files changed, 17 insertions(+), 80 deletions(-)

diff --git a/Documentation/config/grep.txt b/Documentation/config/grep.txt
index 44abe45a7ca..2669b1757d3 100644
--- a/Documentation/config/grep.txt
+++ b/Documentation/config/grep.txt
@@ -12,8 +12,7 @@ grep.patternType::
 
 grep.extendedRegexp::
 	If set to true, enable `--extended-regexp` option by default. This
-	option is ignored when the `grep.patternType` option is set to a value
-	other than 'default'.
+	option is ignored when the `grep.patternType` option is set.
 
 grep.threads::
 	Number of grep worker threads to use.
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 3d393fbac1b..078dfeadf50 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -348,8 +348,7 @@ grep.patternType::
 
 grep.extendedRegexp::
 	If set to true, enable `--extended-regexp` option by default. This
-	option is ignored when the `grep.patternType` option is set to a value
-	other than 'default'.
+	option is ignored when the `grep.patternType` option is set.
 
 grep.threads::
 	Number of grep worker threads to use. If unset (or set to 0), Git will
diff --git a/builtin/grep.c b/builtin/grep.c
index 7f95f44e948..a4964baf9c0 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -849,7 +849,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
 	int allow_revs;
 
 	struct option options[] = {
@@ -883,16 +882,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
 		OPT_GROUP(""),
-		OPT_SET_INT('E', "extended-regexp", &pattern_type_arg,
+		OPT_SET_INT('E', "extended-regexp", &opt.pattern_type_option,
 			    N_("use extended POSIX regular expressions"),
 			    GREP_PATTERN_TYPE_ERE),
-		OPT_SET_INT('G', "basic-regexp", &pattern_type_arg,
+		OPT_SET_INT('G', "basic-regexp", &opt.pattern_type_option,
 			    N_("use basic POSIX regular expressions (default)"),
 			    GREP_PATTERN_TYPE_BRE),
-		OPT_SET_INT('F', "fixed-strings", &pattern_type_arg,
+		OPT_SET_INT('F', "fixed-strings", &opt.pattern_type_option,
 			    N_("interpret patterns as fixed strings"),
 			    GREP_PATTERN_TYPE_FIXED),
-		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
+		OPT_SET_INT('P', "perl-regexp", &opt.pattern_type_option,
 			    N_("use Perl-compatible regular expressions"),
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
@@ -986,7 +985,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, grep_usage,
 			     PARSE_OPT_KEEP_DASHDASH |
 			     PARSE_OPT_STOP_AT_NON_OPTION);
-	grep_commit_pattern_type(pattern_type_arg, &opt);
 
 	if (use_index && !startup_info->have_repository) {
 		int fallback = 0;
diff --git a/grep.c b/grep.c
index fb3f63c63ef..dda8e536fe3 100644
--- a/grep.c
+++ b/grep.c
@@ -60,8 +60,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	if (!strcmp(var, "grep.extendedregexp")) {
-		opt->extended_regexp_option = git_config_bool(var, value);
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_UNSPECIFIED &&
+	    !strcmp(var, "grep.extendedregexp") &&
+	    git_config_bool(var, value)) {
+		opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
 		return 0;
 	}
 
@@ -115,62 +117,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo)
 	opt->header_tail = &opt->header_list;
 }
 
-static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	/*
-	 * When committing to the pattern type by setting the relevant
-	 * fields in grep_opt it's generally not necessary to zero out
-	 * the fields we're not choosing, since they won't have been
-	 * set by anything. The extended_regexp_option field is the
-	 * only exception to this.
-	 *
-	 * This is because in the process of parsing grep.patternType
-	 * & grep.extendedRegexp we set opt->pattern_type_option and
-	 * opt->extended_regexp_option, respectively. We then
-	 * internally use opt->extended_regexp_option to see if we're
-	 * compiling an ERE. It must be unset if that's not actually
-	 * the case.
-	 */
-	if (pattern_type != GREP_PATTERN_TYPE_ERE &&
-	    opt->extended_regexp_option)
-		opt->extended_regexp_option = 0;
-
-	switch (pattern_type) {
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		/* fall through */
-
-	case GREP_PATTERN_TYPE_BRE:
-		break;
-
-	case GREP_PATTERN_TYPE_ERE:
-		opt->extended_regexp_option = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_FIXED:
-		opt->fixed = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_PCRE:
-		opt->pcre2 = 1;
-		break;
-	}
-}
-
-void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(pattern_type, opt);
-	else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(opt->pattern_type_option, opt);
-	else if (opt->extended_regexp_option)
-		/*
-		 * This branch *must* happen after setting from the
-		 * opt->pattern_type_option above, we don't want
-		 * grep.extendedRegexp to override grep.patternType!
-		 */
-		grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
-}
-
 static struct grep_pat *create_grep_pat(const char *pat, size_t patlen,
 					const char *origin, int no,
 					enum grep_pat_token t,
@@ -492,9 +438,10 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 
 	p->word_regexp = opt->word_regexp;
 	p->ignore_case = opt->ignore_case;
-	p->fixed = opt->fixed;
+	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
 
-	if (memchr(p->pattern, 0, p->patternlen) && !opt->pcre2)
+	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
+	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
 	p->is_fixed = is_fixed(p->pattern, p->patternlen);
@@ -545,14 +492,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre2) {
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
 		compile_pcre2_pattern(p, opt);
 		return;
 	}
 
 	if (p->ignore_case)
 		regflags |= REG_ICASE;
-	if (opt->extended_regexp_option)
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
 		regflags |= REG_EXTENDED;
 	err = regcomp(&p->regexp, p->pattern, regflags);
 	if (err) {
diff --git a/grep.h b/grep.h
index 30a7dfd3294..e4e548aed90 100644
--- a/grep.h
+++ b/grep.h
@@ -143,7 +143,6 @@ struct grep_opt {
 	int unmatch_name_only;
 	int count;
 	int word_regexp;
-	int fixed;
 	int all_match;
 #define GREP_BINARY_DEFAULT	0
 #define GREP_BINARY_NOMATCH	1
@@ -152,7 +151,6 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre2;
 	int relative;
 	int pathname;
 	int null_following_name;
@@ -161,8 +159,7 @@ struct grep_opt {
 	int max_depth;
 	int funcname;
 	int funcbody;
-	int extended_regexp_option;
-	int pattern_type_option;
+	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
 	unsigned pre_context;
@@ -201,7 +198,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
-void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
 void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index 9f9b0d2429e..ed29d245c89 100644
--- a/revision.c
+++ b/revision.c
@@ -2864,8 +2864,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	diff_setup_done(&revs->diffopt);
 
-	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
-				 &revs->grep_filter);
 	if (!is_encoding_utf8(get_log_output_encoding()))
 		revs->grep_filter.ignore_locale = 1;
 	compile_grep_patterns(&revs->grep_filter);
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 6b6423a07c3..a59a9726357 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -443,7 +443,7 @@ do
 	'
 
 	test_expect_success "grep $L with grep.extendedRegexp=true and grep.patternType=default" '
-		echo "${HC}ab:abc" >expected &&
+		echo "${HC}ab:a+bc" >expected &&
 		git \
 			-c grep.extendedRegexp=true \
 			-c grep.patternType=default \
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH 8/8] grep: make "extendedRegexp=true" the same as "patternType=extended"
  2021-11-06 21:10 [PATCH 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                   ` (6 preceding siblings ...)
  2021-11-06 21:10 ` [PATCH 7/8] grep: simplify config parsing, change grep.<rx config> interaction Ævar Arnfjörð Bjarmason
@ 2021-11-06 21:10 ` Ævar Arnfjörð Bjarmason
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-06 21:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, J Smith, Ævar Arnfjörð Bjarmason

In the preceding commit we changed how a "grep.patternType=default"
set after "grep.extendedRegexp=true" would be handled so that the last
set would win, but a "grep.extendedRegexp=true" would only be used if
"grep.patternType" was unset or set to "default".

Thus a user who had old config and set "grep.extendedRegexp=true" in
their ~/.gitconfig expecting ERE behavior would be opted-in to say
"perl" regexes if a system "/etc/gitconfig" started setting
"grep.patternType=perl".

These funny semantics of only paying attention to a set if another key
is not set to a given value aren't how we treat other config keys, so
let's do away with this caveat for consistency.

The new semantics are simple: a "grep.extendedRegexp=true" is an exact
synonym for specifying "grep.patternType=extended" in the
config. We'll keep ignoring "grep.extendedRegexp=false", although
arguably we could treat it as a "grep.patternType=basic".
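
As a rough illustration of these "last set wins" semantics, here is a
minimal sketch. It is not the code in grep.c: the names sketch_bool(),
sketch_config() and the PT_* enum are made up for this example, and the
boolean parsing is a simplification of what git_config_bool() accepts.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the pattern type enum after this patch. */
enum pattern_type { PT_BRE = 0, PT_ERE, PT_FIXED, PT_PCRE };

/* Simplified boolean parser (assumption: only these spellings). */
static int sketch_bool(const char *value)
{
	return !strcmp(value, "true") || !strcmp(value, "yes") ||
	       !strcmp(value, "on") || !strcmp(value, "1");
}

/*
 * Apply one config entry to *type. Keys are assumed to arrive
 * lowercased, as they do in a Git config callback.
 */
static void sketch_config(enum pattern_type *type, const char *var,
			  const char *value)
{
	assert(type && var && value);

	if (!strcmp(var, "grep.extendedregexp")) {
		if (sketch_bool(value))
			*type = PT_ERE; /* same as grep.patternType=extended */
		return; /* a false value is ignored */
	}
	if (strcmp(var, "grep.patterntype"))
		return; /* some other key, not our business */

	if (!strcmp(value, "basic") || !strcmp(value, "default"))
		*type = PT_BRE;
	else if (!strcmp(value, "extended"))
		*type = PT_ERE;
	else if (!strcmp(value, "fixed"))
		*type = PT_FIXED;
	else if (!strcmp(value, "perl"))
		*type = PT_PCRE;
	/* Unknown values are left alone in this sketch. */
}
```

Feeding the entries in config order, "grep.patternType=basic" followed
by "grep.extendedRegexp=true" ends up as ERE, which is the case the
t7810 change in this patch covers.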

As argued in the preceding commit I think this behavior came about
because we were conflating the state of our code's own internal
"default" value with what we found in explicit user config. See
84befcd0a4a (grep: add a grep.patternType configuration setting,
2012-08-03) for that past behavior.

Let's further change the documentation to note that
"grep.extendedRegexp" is a deprecated synonym, perhaps we'll be able
to remove it at some point in the future and do away with this
special-case entirely.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config/grep.txt | 3 +--
 Documentation/git-grep.txt    | 3 +--
 grep.c                        | 8 +++-----
 grep.h                        | 5 +----
 t/t7810-grep.sh               | 2 +-
 5 files changed, 7 insertions(+), 14 deletions(-)

diff --git a/Documentation/config/grep.txt b/Documentation/config/grep.txt
index 2669b1757d3..9868ea8a061 100644
--- a/Documentation/config/grep.txt
+++ b/Documentation/config/grep.txt
@@ -11,8 +11,7 @@ grep.patternType::
 	value 'default' will return to the default matching behavior.
 
 grep.extendedRegexp::
-	If set to true, enable `--extended-regexp` option by default. This
-	option is ignored when the `grep.patternType` option is set.
+	Deprecated synonym for `grep.patternType=extended`.
 
 grep.threads::
 	Number of grep worker threads to use.
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 078dfeadf50..211ba6801b0 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -347,8 +347,7 @@ grep.patternType::
 	value 'default' will return to the default matching behavior.
 
 grep.extendedRegexp::
-	If set to true, enable `--extended-regexp` option by default. This
-	option is ignored when the `grep.patternType` option is set.
+	Deprecated synonym for `grep.patternType=extended`.
 
 grep.threads::
 	Number of grep worker threads to use. If unset (or set to 0), Git will
diff --git a/grep.c b/grep.c
index dda8e536fe3..ef8746d85f0 100644
--- a/grep.c
+++ b/grep.c
@@ -33,9 +33,8 @@ static const char *color_grep_slots[] = {
 
 static int parse_pattern_type_arg(const char *opt, const char *arg)
 {
-	if (!strcmp(arg, "default"))
-		return GREP_PATTERN_TYPE_UNSPECIFIED;
-	else if (!strcmp(arg, "basic"))
+	if (!strcmp(arg, "basic") ||
+	    !strcmp(arg, "default"))
 		return GREP_PATTERN_TYPE_BRE;
 	else if (!strcmp(arg, "extended"))
 		return GREP_PATTERN_TYPE_ERE;
@@ -60,8 +59,7 @@ int grep_config(const char *var, const char *value, void *cb)
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	if (opt->pattern_type_option == GREP_PATTERN_TYPE_UNSPECIFIED &&
-	    !strcmp(var, "grep.extendedregexp") &&
+	if (!strcmp(var, "grep.extendedregexp") &&
 	    git_config_bool(var, value)) {
 		opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
 		return 0;
diff --git a/grep.h b/grep.h
index e4e548aed90..55b23d045e0 100644
--- a/grep.h
+++ b/grep.h
@@ -94,8 +94,7 @@ enum grep_expr_node {
 };
 
 enum grep_pattern_type {
-	GREP_PATTERN_TYPE_UNSPECIFIED = 0,
-	GREP_PATTERN_TYPE_BRE,
+	GREP_PATTERN_TYPE_BRE = 0,
 	GREP_PATTERN_TYPE_ERE,
 	GREP_PATTERN_TYPE_FIXED,
 	GREP_PATTERN_TYPE_PCRE
@@ -179,7 +178,6 @@ struct grep_opt {
 	.relative = 1, \
 	.pathname = 1, \
 	.max_depth = -1, \
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
 	.colors = { \
 		[GREP_COLOR_CONTEXT] = "", \
 		[GREP_COLOR_FILENAME] = "", \
@@ -191,7 +189,6 @@ struct grep_opt {
 		[GREP_COLOR_SELECTED] = "", \
 		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
 	}, \
-	.only_matching = 0, \
 	.color = -1, \
 	.output = std_output, \
 }
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index a59a9726357..afca938a4d0 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -461,7 +461,7 @@ do
 	'
 
 	test_expect_success "grep $L with grep.patternType=basic and grep.extendedRegexp=true" '
-		echo "${HC}ab:a+bc" >expected &&
+		echo "${HC}ab:abc" >expected &&
 		git \
 			-c grep.patternType=basic \
 			-c grep.extendedRegexp=true \
-- 
2.34.0.rc1.741.gab7bfd97031



* Re: [PATCH 2/8] git.c & grep.c: assert that "prefix" is NULL or non-zero string
  2021-11-06 21:10 ` [PATCH 2/8] git.c & grep.c: assert that "prefix" is NULL or non-zero string Ævar Arnfjörð Bjarmason
@ 2021-11-08 20:37   ` Taylor Blau
  0 siblings, 0 replies; 151+ messages in thread
From: Taylor Blau @ 2021-11-08 20:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, J Smith

On Sat, Nov 06, 2021 at 10:10:48PM +0100, Ævar Arnfjörð Bjarmason wrote:
> @@ -431,6 +430,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
>  			int nongit_ok;
>  			prefix = setup_git_directory_gently(&nongit_ok);
>  		}
> +		assert(!prefix || (prefix && *prefix));

Small nit, but the check to `prefix` (in `prefix && *prefix`) is
redundant with the left-hand side of the or.

>  		precompose_argv_prefix(argc, argv, NULL);
>  		if (use_pager == -1 && p->option & (RUN_SETUP | RUN_SETUP_GENTLY) &&
>  		    !(p->option & DELAY_PAGER_CONFIG))
> diff --git a/grep.c b/grep.c
> index f6e113e9f0f..88ebc504630 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -145,7 +145,7 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
>
>  	opt->repo = repo;
>  	opt->prefix = prefix;
> -	opt->prefix_length = (prefix && *prefix) ? strlen(prefix) : 0;
> +	opt->prefix_length = prefix ? strlen(prefix) : 0;

Looking around, ls-tree's initialization includes a conditional of the
form:

    if (prefix && *prefix)
      chomp_prefix = strlen(prefix);

So that could be cleaned up too. But honestly, the pre-image of this
patch (and the spot in ls-tree) doesn't make a lot of sense to me to
begin with.

Even if prefix were the empty string, calling strlen() on it will just
give us zero. So there is no difference between assigning `str && *str ?
strlen(str) : 0` and `str ? strlen(str) : 0`.
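
To make the two equivalences concrete, here is a small sketch. The
helper names are invented for illustration and do not exist in git.c,
grep.c or ls-tree; only the expressions inside them come from the
patch and the surrounding discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The assertion condition as written in the patch. */
static int cond_patch(const char *prefix)
{
	return !prefix || (prefix && *prefix);
}

/* The same condition without the redundant "prefix &&". */
static int cond_short(const char *prefix)
{
	return !prefix || *prefix;
}

/* Length computation from the pre-image of the grep.c hunk. */
static size_t len_guarded(const char *str)
{
	return (str && *str) ? strlen(str) : 0;
}

/* Length computation from the post-image. */
static size_t len_plain(const char *str)
{
	return str ? strlen(str) : 0;
}

/* Both pairs agree for NULL, empty and non-empty input. */
static void check_equivalent(const char *s)
{
	assert(cond_patch(s) == cond_short(s));
	assert(len_guarded(s) == len_plain(s));
}
```

Calling check_equivalent() with NULL, "" and a non-empty string such as
"t/" covers the three cases that matter here. Note that the condition is
false for "", so the assertion does reject an empty prefix, while the
strlen() result is 0 either way.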

So I am confused why this needs hardening with an assertion when it
seems like the checks before calling strlen() were overly restrictive to
begin with. In other words: why not only include this hunk (either in
this patch, or squashed into another patch later on)?

Thanks,
Taylor


* Re: [PATCH 3/8] grep: remove unused "prefix_length" member
  2021-11-06 21:10 ` [PATCH 3/8] grep: remove unused "prefix_length" member Ævar Arnfjörð Bjarmason
@ 2021-11-08 20:42   ` Taylor Blau
  0 siblings, 0 replies; 151+ messages in thread
From: Taylor Blau @ 2021-11-08 20:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, J Smith

On Sat, Nov 06, 2021 at 10:10:49PM +0100, Ævar Arnfjörð Bjarmason wrote:
> Remove the "prefix_length" member, which we compute with a strlen() on
> the "prefix" argument to grep_init(), but whose strlen() hasn't been
> used since 493b7a08d80 (grep: accept relative paths outside current
> working directory, 2009-09-05).

OK, so now we *are* relying on the assumption that prefix is either NULL
or a non-empty string.

I assume that the last patch was along the lines of "let's clean up this
redundant check before calling strlen()" and "prepare to not call
strlen() at all and just check the string itself for NULL". To be
honest, I imagine that it would have been much easier to review if these
two had been squashed into one, since I was a little surprised to see
the line I had just been commenting on in the previous patch removed.

Perhaps I should have looked a little further in the series before
commenting there, but I think it would have been even easier for
reviewers to see these two patches together.

> When this code was added in 0d042fecf2f (git-grep: show pathnames
> relative to the current directory, 2006-08-11) we used the length, but
> since 493b7a08d80 we haven't used it for anything except a boolean
> check that we could have done on the "prefix" member itself.
>
> Before a preceding commit we also used to guard the strlen() with
> "prefix && *prefix", but as that commit notes the RHS of that && chain
> was also redundant.

Everything in this patch looks fine to me, assuming that prefix is
indeed always NULL or non-empty (which I haven't verified myself).

Thanks,
Taylor


* Re: [PATCH 4/8] grep.c: move "prefix" out of "struct grep_opt"
  2021-11-06 21:10 ` [PATCH 4/8] grep.c: move "prefix" out of "struct grep_opt" Ævar Arnfjörð Bjarmason
@ 2021-11-08 20:56   ` Taylor Blau
  2021-11-09  2:10     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 151+ messages in thread
From: Taylor Blau @ 2021-11-08 20:56 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, J Smith

On Sat, Nov 06, 2021 at 10:10:50PM +0100, Ævar Arnfjörð Bjarmason wrote:
> The "struct grep_opt" is a mixture of things that would be needed by
> all callers of the grep.c API, and quite a few things that only the
> builtin/grep.c needs.
>
> Since we got rid of "prefix_length" in the previous commit, let's move
> the "prefix" variable over to "builtin/grep.c" where it's used. To do
> this let's create a "struct grep_cmd_opt", which we'll have a pointer
> to in a new "caller_priv" member in "struct grep_opt" (the existing
> "priv" is used by the top-level "grep.c" itself).

I'm definitely in favor of removing specialized, caller-specific bits
from an internal API. But I'm not sure why grep.c needs to keep track of
this new "caller_priv" field at all.

Among the uses of `prefix` in builtin/grep.c, I see grep_source_name,
and the call to run_pager(). Would it be more straightforward to pass
down prefix from cmd_grep down to its use in grep_source_name?

There are quite a few intermediary functions that we go through to get
from cmd_grep() down to grep_source_name(). For instance, we could reach
it through:

    cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
    grep_source_name

But passing prefix from cmd_grep down to grep_source_name without
relying on the internals of grep.c seems like a good direction to me. I
could even buy that (ab)using a static variable in builtin/grep.c to
keep track of a constant prefix value would save you some plumbing
(though I'd rather see the usage spelled out more explicitly).

All of that is to say that I share your motivation for this patch and
think that the direction is good, but I would have preferred to do it
without the caller_priv variable (unless there is something that I am
missing here).

Thanks,
Taylor


* Re: [PATCH 6/8] grep API: call grep_config() after grep_init()
  2021-11-06 21:10 ` [PATCH 6/8] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2021-11-08 21:49   ` Taylor Blau
  2021-11-09  2:06     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 151+ messages in thread
From: Taylor Blau @ 2021-11-08 21:49 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, J Smith

On Sat, Nov 06, 2021 at 10:10:52PM +0100, Ævar Arnfjörð Bjarmason wrote:
> The grep_init() function used the odd pattern of initializing the
> passed-in "struct grep_opt" with a statically defined "grep_defaults"
> struct, which would be modified in-place when we invoked
> grep_config().
>
> So we effectively (b) initialized config, (a) then defaults, (c)
> followed by user options. Usually those are ordered as "a", "b" and
> "c" instead.

Do we risk changing any user-visible behavior here? Based on my reading
of grep.c before and after this patch, I think the answer is "no", but I
wasn't sure if you had done a similar analysis.

In any case, I think the "bring your own structure" instead of getting
one copied around is much easier to reason about. Even if we weren't
accidentally stomping on ownership of the struct before, not having to
reason about it is a nice benefit.
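
For reference, the "a", "b", "c" ordering with a caller-owned struct
might look roughly like the sketch below. Everything in it is
hypothetical: "struct sketch_opt", the "sketch.maxdepth" key and the
three functions are stand-ins chosen for illustration, not the grep.c
API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical options struct; not Git's struct grep_opt. */
struct sketch_opt {
	int max_depth;
	int ignore_case;
};

/* (a) defaults: written straight into the caller's struct. */
static void sketch_init(struct sketch_opt *opt)
{
	assert(opt);
	opt->max_depth = -1;
	opt->ignore_case = 0;
}

/* (b) config: the callback reaches the same struct through "cb". */
static int sketch_config(const char *var, const char *value, void *cb)
{
	struct sketch_opt *opt = cb;

	assert(var && value && opt);
	if (!strcmp(var, "sketch.maxdepth"))
		opt->max_depth = atoi(value);
	return 0;
}

/* (c) command line: applied last, so it overrides both of the above. */
static void sketch_cmdline(struct sketch_opt *opt, int max_depth)
{
	assert(opt);
	opt->max_depth = max_depth;
}
```

Because there is no shared static template in this arrangement, each
caller only ever touches the struct it passed in.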

> As the comments being removed here show the previous behavior needed
> to be carefully explained as we'd potentially share the populated
> configuration among different instances of grep_init(). In practice we
> didn't do that, but now that it can't be a concern anymore let's
> remove those comments.

Makes sense, I agree.

> diff --git a/builtin/grep.c b/builtin/grep.c
> index 960c7aac123..7f95f44e948 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -288,7 +288,7 @@ static int wait_all(void)
>  static int grep_cmd_config(const char *var, const char *value, void *cb)
>  {
>  	int st = grep_config(var, value, cb);
> -	if (git_color_default_config(var, value, cb) < 0)
> +	if (git_color_default_config(var, value, NULL) < 0)

This doesn't appear strictly related to the rest of your changes, but
only serves to prevent the caller-provided data from being sent down to
git_color_default_config().

It didn't matter before because (a) the caller didn't specify any data
to begin with, and (b) git_color_default_config() (or the functions that
it calls) doesn't do anything with the extra pointer. Now cmd_grep() is
going to start passing around a pointer to a struct grep_opt.

But git_color_default_config() still doesn't do anything with the
pointer it receives, and passing that pointer around is standard
practice among config.c code. So I don't think that this hunk is
strictly necessary, and it's somewhat different than the pattern
established within config.c.

I wouldn't be sad to see this hunk dropped (and in fact have a slight
preference leaning this way), but I don't mind keeping it around,
either.

>  		st = -1;
>
>  	if (!strcmp(var, "grep.threads")) {
> @@ -969,8 +969,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>  		OPT_END()
>  	};
>
> -	git_config(grep_cmd_config, NULL);
>  	grep_init(&opt, the_repository);
> +	git_config(grep_cmd_config, &opt);
>  	opt.caller_priv = &opt_cmd;
>
>  	/*
> diff --git a/builtin/log.c b/builtin/log.c
> index f75d87e8d7f..bfddacdfa6c 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -505,8 +505,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
>  		return 0;
>  	}
>
> -	if (grep_config(var, value, cb) < 0)
> -		return -1;
>  	if (git_gpg_config(var, value, cb) < 0)
>  		return -1;
>  	return git_diff_ui_config(var, value, cb);
> @@ -521,6 +519,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
>  	git_config(git_log_config, NULL);
>
>  	repo_init_revisions(the_repository, &rev, prefix);
> +	git_config(grep_config, &rev.grep_filter);
> +
>  	rev.diff = 1;
>  	rev.simplify_history = 0;
>  	memset(&opt, 0, sizeof(opt));
> @@ -635,6 +635,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
>
>  	memset(&match_all, 0, sizeof(match_all));
>  	repo_init_revisions(the_repository, &rev, prefix);
> +	git_config(grep_config, &rev.grep_filter);
> +
>  	rev.diff = 1;
>  	rev.always_show_header = 1;
>  	rev.no_walk = 1;
> @@ -718,6 +720,8 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix)
>
>  	repo_init_revisions(the_repository, &rev, prefix);
>  	init_reflog_walk(&rev.reflog_info);
> +	git_config(grep_config, &rev.grep_filter);
> +
>  	rev.verbose_header = 1;
>  	memset(&opt, 0, sizeof(opt));
>  	opt.def = "HEAD";
> @@ -751,6 +755,8 @@ int cmd_log(int argc, const char **argv, const char *prefix)
>  	git_config(git_log_config, NULL);
>
>  	repo_init_revisions(the_repository, &rev, prefix);
> +	git_config(grep_config, &rev.grep_filter);
> +
>  	rev.always_show_header = 1;
>  	memset(&opt, 0, sizeof(opt));
>  	opt.def = "HEAD";
> @@ -1833,10 +1839,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
>  	extra_hdr.strdup_strings = 1;
>  	extra_to.strdup_strings = 1;
>  	extra_cc.strdup_strings = 1;
> +
>  	init_log_defaults();
>  	init_display_notes(&notes_opt);
>  	git_config(git_format_config, NULL);
>  	repo_init_revisions(the_repository, &rev, prefix);
> +	git_config(grep_config, &rev.grep_filter);
> +
>  	rev.show_notes = show_notes;
>  	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
>  	rev.commit_format = CMIT_FMT_EMAIL;
> diff --git a/grep.c b/grep.c
> index c9065254aeb..fb3f63c63ef 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -19,27 +19,6 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
>  	fwrite(buf, size, 1, stdout);
>  }
>
> -static struct grep_opt grep_defaults = {
> -	.relative = 1,
> -	.pathname = 1,
> -	.max_depth = -1,
> -	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
> -	.colors = {
> -		[GREP_COLOR_CONTEXT] = "",
> -		[GREP_COLOR_FILENAME] = "",
> -		[GREP_COLOR_FUNCTION] = "",
> -		[GREP_COLOR_LINENO] = "",
> -		[GREP_COLOR_COLUMNNO] = "",
> -		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
> -		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
> -		[GREP_COLOR_SELECTED] = "",
> -		[GREP_COLOR_SEP] = GIT_COLOR_CYAN,
> -	},
> -	.only_matching = 0,
> -	.color = -1,
> -	.output = std_output,
> -};
> -
>  static const char *color_grep_slots[] = {
>  	[GREP_COLOR_CONTEXT]	    = "context",
>  	[GREP_COLOR_FILENAME]	    = "filename",
> @@ -75,20 +54,12 @@ define_list_config_array_extra(color_grep_slots, {"match"});
>   */
>  int grep_config(const char *var, const char *value, void *cb)
>  {
> -	struct grep_opt *opt = &grep_defaults;
> +	struct grep_opt *opt = cb;
>  	const char *slot;
>
>  	if (userdiff_config(var, value) < 0)
>  		return -1;
>
> -	/*
> -	 * The instance of grep_opt that we set up here is copied by
> -	 * grep_init() to be used by each individual invocation.
> -	 * When populating a new field of this structure here, be
> -	 * sure to think about ownership -- e.g., you might need to
> -	 * override the shallow copy in grep_init() with a deep copy.
> -	 */
> -
>  	if (!strcmp(var, "grep.extendedregexp")) {
>  		opt->extended_regexp_option = git_config_bool(var, value);
>  		return 0;
> @@ -134,14 +105,10 @@ int grep_config(const char *var, const char *value, void *cb)
>  	return 0;
>  }
>
> -/*
> - * Initialize one instance of grep_opt and copy the
> - * default values from the template we read the configuration
> - * information in an earlier call to git_config(grep_config).
> - */
>  void grep_init(struct grep_opt *opt, struct repository *repo)
>  {
> -	*opt = grep_defaults;
> +	struct grep_opt blank = GREP_OPT_INIT;
> +	memcpy(opt, &blank, sizeof(*opt));

I'm nit-picking, but creating a throwaway struct for the convenience of
using designated initialization (at the cost of having to memcpy an
entire struct around) seems like overkill.

Especially since we're just going to write into the other fields of the
target struct anyway, I'd probably rather have seen everything
written out explicitly without the throwaway or memcpy.
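
A sketch of the two styles side by side, assuming a made-up "struct
sketch_opt" and "SKETCH_OPT_INIT" in place of struct grep_opt and
GREP_OPT_INIT (the real struct has many more members):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, much smaller stand-in for struct grep_opt. */
struct sketch_opt {
	int relative;
	int pathname;
	int max_depth;
	int color;
	int count;
};

#define SKETCH_OPT_INIT { \
	.relative = 1, \
	.pathname = 1, \
	.max_depth = -1, \
	.color = -1, \
}

/* Style used in the patch: throwaway struct plus memcpy(). */
static void init_via_blank(struct sketch_opt *opt)
{
	struct sketch_opt blank = SKETCH_OPT_INIT;

	assert(opt);
	memcpy(opt, &blank, sizeof(*opt));
}

/* Alternative: zero the target, then write each default explicitly. */
static void init_explicit(struct sketch_opt *opt)
{
	assert(opt);
	memset(opt, 0, sizeof(*opt));
	opt->relative = 1;
	opt->pathname = 1;
	opt->max_depth = -1;
	opt->color = -1;
}
```

Both leave the struct in the same state. The trade-off is that the
explicit version has to be kept in sync with the initializer macro by
hand if both exist.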

Thanks,
Taylor


* Re: [PATCH 7/8] grep: simplify config parsing, change grep.<rx config> interaction
  2021-11-06 21:10 ` [PATCH 7/8] grep: simplify config parsing, change grep.<rx config> interaction Ævar Arnfjörð Bjarmason
@ 2021-11-08 23:04   ` Taylor Blau
  2021-11-09  2:01     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 151+ messages in thread
From: Taylor Blau @ 2021-11-08 23:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, J Smith

On Sat, Nov 06, 2021 at 10:10:53PM +0100, Ævar Arnfjörð Bjarmason wrote:
> I.e. a user would correctly expect this to keep working:
>
>     # ERE grep
>     git -c grep.extendedRegexp=true grep <pattern>
>
> And likewise for "grep.patternType=default" to take precedence over
> the disfavored "grep.extendedRegexp" option, i.e. the usual "last set
> wins" semantics.
>
>     # BRE grep
>     git -c grep.extendedRegexp=true -c grep.patternType=basic grep <pattern>
>
> But probably not for this to ignore the new "grep.patternType" option
> entirely, say if /etc/gitconfig was still setting
> "grep.extendedRegexp", but "~/.gitconfig" used the new
> "grep.patternType" (and wanted to use the "default" value):
>
>     # Was ERE, now BRE
>     git -c grep.extendedRegexp=true -c grep.patternType=default grep <pattern>

OK, so this is the case that we'd be "breaking". And I think that the
new behavior you're outlining here (where a higher-precedence
grep.patternType=default overrides a lower-precedence
grep.extendedRegexp=true, resulting in using BRE over ERE) makes more
sense.

At least, it makes more sense if your expectation of "default" is "the
default matching behavior", not "fallthrough to grep.extendedRegexp".

In any case, I am sensitive to breaking existing user workflows, but
this seems so obscure to me that I have a hard time expecting that
m(any?) users will even notice this at all.

The situation I'm most concerned about is having grep.extendedRegexp set
in, say, /etc/gitconfig and grep.patternType=default set at a
higher-precedence level.

> ---
>  Documentation/config/grep.txt |  3 +-
>  Documentation/git-grep.txt    |  3 +-

Not the fault of your patch, but these two are annoyingly (and subtly)
different from one another. Could we clean this up and put everything in
Documentation/config/grep.txt (and then include that in the
CONFIGURATION section of Documentation/git-grep.txt)?

>  builtin/grep.c                | 10 ++---
>  grep.c                        | 71 +++++------------------------------
>  grep.h                        |  6 +--
>  revision.c                    |  2 -
>  t/t7810-grep.sh               |  2 +-
>  7 files changed, 17 insertions(+), 80 deletions(-)
>
> diff --git a/Documentation/config/grep.txt b/Documentation/config/grep.txt
> index 44abe45a7ca..2669b1757d3 100644
> --- a/Documentation/config/grep.txt
> +++ b/Documentation/config/grep.txt
> @@ -12,8 +12,7 @@ grep.patternType::
>
>  grep.extendedRegexp::
>  	If set to true, enable `--extended-regexp` option by default. This
> -	option is ignored when the `grep.patternType` option is set to a value
> -	other than 'default'.
> +	option is ignored when the `grep.patternType` option is set.
>
>  grep.threads::
>  	Number of grep worker threads to use.
> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> index 3d393fbac1b..078dfeadf50 100644
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -348,8 +348,7 @@ grep.patternType::
>
>  grep.extendedRegexp::
>  	If set to true, enable `--extended-regexp` option by default. This
> -	option is ignored when the `grep.patternType` option is set to a value
> -	other than 'default'.
> +	option is ignored when the `grep.patternType` option is set.

Makes sense, and matches your description.

> diff --git a/grep.c b/grep.c
> index fb3f63c63ef..dda8e536fe3 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -60,8 +60,10 @@ int grep_config(const char *var, const char *value, void *cb)
>  	if (userdiff_config(var, value) < 0)
>  		return -1;
>
> -	if (!strcmp(var, "grep.extendedregexp")) {
> -		opt->extended_regexp_option = git_config_bool(var, value);
> +	if (opt->pattern_type_option == GREP_PATTERN_TYPE_UNSPECIFIED &&
> +	    !strcmp(var, "grep.extendedregexp") &&
> +	    git_config_bool(var, value)) {
> +		opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
>  		return 0;
>  	}

And here's our "as long as we haven't set the pattern type already via
grep.patternType, allow grep.extendedRegexp to set it". But the same
"only set when unspecified" condition *isn't* in place for
grep.patternType, which is what makes us prefer values from that
configuration over the other. Makes sense.
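
The asymmetry can be modelled in a few lines. This is only a sketch of
the behavior as of this patch, with invented names (p7_config(),
p7_is_ere(), the P7_* enum) and a simplified boolean check; it is not
the actual grep_config().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in; "default" still maps to "unspecified" here. */
enum p7_type { P7_UNSPECIFIED = 0, P7_BRE, P7_ERE, P7_FIXED, P7_PCRE };

static void p7_config(enum p7_type *type, const char *var, const char *value)
{
	assert(type && var && value);

	/* grep.extendedRegexp=true only fills in a still-unspecified type. */
	if (*type == P7_UNSPECIFIED &&
	    !strcmp(var, "grep.extendedregexp") &&
	    !strcmp(value, "true")) { /* simplification of git_config_bool() */
		*type = P7_ERE;
		return;
	}

	/* grep.patternType has no such guard, so it always overwrites. */
	if (!strcmp(var, "grep.patterntype")) {
		if (!strcmp(value, "default"))
			*type = P7_UNSPECIFIED;
		else if (!strcmp(value, "basic"))
			*type = P7_BRE;
		else if (!strcmp(value, "extended"))
			*type = P7_ERE;
		else if (!strcmp(value, "fixed"))
			*type = P7_FIXED;
		else if (!strcmp(value, "perl"))
			*type = P7_PCRE;
	}
}

/* Only an explicit ERE type turns on extended matching. */
static int p7_is_ere(enum p7_type type)
{
	return type == P7_ERE;
}
```

With this model, "grep.extendedRegexp=true" followed by
"grep.patternType=default" ends up as BRE (the "Was ERE, now BRE" case),
and "grep.patternType=basic" followed by "grep.extendedRegexp=true" also
stays BRE.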

Everything else looks good here, too.

Thanks,
Taylor


* Re: [PATCH 7/8] grep: simplify config parsing, change grep.<rx config> interaction
  2021-11-08 23:04   ` Taylor Blau
@ 2021-11-09  2:01     ` Ævar Arnfjörð Bjarmason
  2021-11-10  0:16       ` Taylor Blau
  0 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-09  2:01 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, J Smith


On Mon, Nov 08 2021, Taylor Blau wrote:

> On Sat, Nov 06, 2021 at 10:10:53PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> I.e. a user would correctly expect this to keep working:
>>
>>     # ERE grep
>>     git -c grep.extendedRegexp=true grep <pattern>
>>
>> And likewise for "grep.patternType=default" to take precedence over
>> the disfavored "grep.extendedRegexp" option, i.e. the usual "last set
>> wins" semantics.
>>
>>     # BRE grep
>>     git -c grep.extendedRegexp=true -c grep.patternType=basic grep <pattern>
>>
>> But probably not for this to ignore the new "grep.patternType" option
>> entirely, say if /etc/gitconfig was still setting
>> "grep.extendedRegexp", but "~/.gitconfig" used the new
>> "grep.patternType" (and wanted to use the "default" value):
>>
>>     # Was ERE, now BRE
>>     git -c grep.extendedRegexp=true -c grep.patternType=default grep <pattern>
>
> OK, so this is the case that we'd be "breaking". And I think that the
> new behavior you're outlining here (where a higher-precedence
> grep.patternType=default overrides a lower-precedence
> grep.extendedRegexp=true, resulting in using BRE over ERE) makes more
> sense.
>
> At least, it makes more sense if your expectation of "default" is "the
> default matching behavior", not "fallthrough to grep.extendedRegexp".
>
> In any case, I am sensitive to breaking existing user workflows, but
> this seems so obscure to me that I have a hard time expecting that
> m(any?) users will even notice this at all.
>
> The situation I'm most concerned about is having grep.extendedRegexp set
> in, say, /etc/gitconfig and grep.patternType=default set at a
> higher-precedence level.

*nod*, but the only user who'd end up with that is someone who's trying
to override grep.extendedRegexp but failing to do it, so this would
help.

Or someone who'd read the docs, understood that we promised that would
do nothing, and inserted that just to test us, but that seems unlikely
:)

Or, I suppose someone who's entirely confused, and will continue being
even more confused now that behavior changes on a git upgrade from ERE
to BRE.

I'm hoping the last two paragraphs describe no-one & that this is safe
to do.

>> ---
>>  Documentation/config/grep.txt |  3 +-
>>  Documentation/git-grep.txt    |  3 +-
>
> Not the fault of your patch, but these two are annoyingly (and subtly)
> different from one another. Could we clean this up and put everything in
> Documentation/config/grep.txt (and then include that in the
> CONFIGURATION section of Documentation/git-grep.txt)?

I've got a large series to do that for all of these, but opted to skip
that particular digression here (even for just grep.txt it's a bit
distracting).


* Re: [PATCH 6/8] grep API: call grep_config() after grep_init()
  2021-11-08 21:49   ` Taylor Blau
@ 2021-11-09  2:06     ` Ævar Arnfjörð Bjarmason
  2021-11-10  0:18       ` Taylor Blau
  0 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-09  2:06 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, J Smith, Jeff King


On Mon, Nov 08 2021, Taylor Blau wrote:

> On Sat, Nov 06, 2021 at 10:10:52PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> The grep_init() function used the odd pattern of initializing the
>> passed-in "struct grep_opt" with a statically defined "grep_defaults"
>> struct, which would be modified in-place when we invoked
>> grep_config().
>>
>> So we effectively (b) initialized config, (a) then defaults, (c)
>> followed by user options. Usually those are ordered as "a", "b" and
>> "c" instead.
>
> Do we risk changing any user-visible behavior here? Based on my reading
> of grep.c before and after this patch, I think the answer is "no", but I
> wasn't sure if you had done a similar analysis.
>
> In any case, I think the "bring your own structure" instead of getting
> one copied around is much easier to reason about. Even if we weren't
> accidentally stomping on ownership of the struct before, not having to
> reason about it is a nice benefit.

I don't think we're changing any behavior except the one noted in this
series.

We only set a few config variables, so I thought that was fairly easy to
trace...
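
For readers following along, the intended ordering of (a) defaults,
(b) config and (c) user options can be sketched as below. Everything
here is hypothetical: "struct sketch_opt", SKETCH_OPT_INIT, the
"sketch.level" key and the sketch_*() functions are made up for
illustration and are not part of the grep API.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical option struct; not the real "struct grep_opt". */
struct sketch_opt {
	int level;
};

#define SKETCH_OPT_INIT { .level = 1 }

/* (a) defaults: start from a known blank state. */
static void sketch_init(struct sketch_opt *opt)
{
	struct sketch_opt blank = SKETCH_OPT_INIT;

	assert(opt);
	memcpy(opt, &blank, sizeof(*opt));
}

/* (b) config: the callback receives the already-initialized struct. */
static int sketch_config(const char *var, int value, void *cb)
{
	struct sketch_opt *opt = cb;

	if (!strcmp(var, "sketch.level"))
		opt->level = value;
	return 0;
}

/* (c) user options: a negative value means "not given on the command line". */
static void sketch_cli(struct sketch_opt *opt, int level)
{
	if (level >= 0)
		opt->level = level;
}
```

Calling these in the order sketch_init(), sketch_config(), sketch_cli()
means each later stage can only override what an earlier one set, which
is the property the "bring your own structure" approach relies on.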

> [...]
>> diff --git a/builtin/grep.c b/builtin/grep.c
>> index 960c7aac123..7f95f44e948 100644
>> --- a/builtin/grep.c
>> +++ b/builtin/grep.c
>> @@ -288,7 +288,7 @@ static int wait_all(void)
>>  static int grep_cmd_config(const char *var, const char *value, void *cb)
>>  {
>>  	int st = grep_config(var, value, cb);
>> -	if (git_color_default_config(var, value, cb) < 0)
>> +	if (git_color_default_config(var, value, NULL) < 0)
>
> This doesn't appear strictly related to the rest of your changes, but
> only serves to prevent the caller-provided data from being sent down to
> git_color_default_config().
>
> It didn't matter before because (a) the caller doesn't specify any data
> to begin with, and (b) git_color_default_config() (or the functions that it
> calls) don't do anything with the extra pointer. Now cmd_grep() is going
> to start passing around a pointer to a struct grep_opt.
>
> But git_color_default_config() still doesn't do anything with the
> pointer it receives, and passing that pointer around is standard
> practice among config.c code. So I don't think that this hunk is
> strictly necessary, and it's somewhat different than the pattern
> established within config.c.
>
> I wouldn't be sad to see this hunk dropped (and in fact have a slight
> preference leaning this way), but I don't mind keeping it around,
> either.

Will either split it up or drop it.

> [...]
>> -/*
>> - * Initialize one instance of grep_opt and copy the
>> - * default values from the template we read the configuration
>> - * information in an earlier call to git_config(grep_config).
>> - */
>>  void grep_init(struct grep_opt *opt, struct repository *repo)
>>  {
>> -	*opt = grep_defaults;
>> +	struct grep_opt blank = GREP_OPT_INIT;
>> +	memcpy(opt, &blank, sizeof(*opt));
>
> I'm nit-picking, but creating a throwaway struct for the convenience of
> using designated initialization (at the cost of having to memcpy an
> entire struct around) seems like overkill.
>
> Especially since we're just going to write into the other fields of the
> target struct anyway, I'd probably rather have seen everything
> written out explicitly without the throwaway or memcpy.

It's a widely used pattern in the codebase at this point, see
5726a6b4012 (*.c *_init(): define in terms of corresponding *_INIT
macro, 2021-07-01) (mine, but I stole it from Jeff King).

As his linked-to compiler test shows the memcpy() is optimized away, so
modern compilers will treat these idioms the same way.

There was a suggestion somewhere that we should probably move to that
"*<x> = <y>" or whatever it was briefer C99 (I think) syntax across the
board, it would be less verbose. But I haven't tested if it's as widely
supported, so I've just been sticking with that blank/memcpy() pattern
for "do init in terms of macro".
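
As a side-by-side illustration of the two idioms being compared, here
is a small sketch. The struct, the SKETCH_OPT_INIT macro and both
functions are hypothetical, and reading the "briefer C99 syntax" as
assignment from a compound literal is an assumption about what was
meant, not something confirmed in this thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct and macro, standing in for an *_INIT style API. */
struct sketch_opt {
	int color;
	int max_depth;
	const char *label;
};

#define SKETCH_OPT_INIT { \
	.color = -1, \
	.max_depth = -1, \
}

/* Idiom used in the patch: a throwaway initialized struct, then memcpy(). */
static void sketch_init_memcpy(struct sketch_opt *opt)
{
	struct sketch_opt blank = SKETCH_OPT_INIT;

	assert(opt);
	memcpy(opt, &blank, sizeof(*opt));
}

/* Possible briefer form: assign a C99 compound literal built from the macro. */
static void sketch_init_assign(struct sketch_opt *opt)
{
	assert(opt);
	*opt = (struct sketch_opt)SKETCH_OPT_INIT;
}
```

Both functions leave the named members at their macro values and the
unnamed "label" member zero-initialized, so which one to use is a
question of style and compiler support rather than behavior.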


* Re: [PATCH 4/8] grep.c: move "prefix" out of "struct grep_opt"
  2021-11-08 20:56   ` Taylor Blau
@ 2021-11-09  2:10     ` Ævar Arnfjörð Bjarmason
  2021-11-10  0:18       ` Taylor Blau
  0 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-09  2:10 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano, J Smith


On Mon, Nov 08 2021, Taylor Blau wrote:

> On Sat, Nov 06, 2021 at 10:10:50PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> The "struct grep_opt" is a mixture of things that would be needed by
>> all callers of the grep.c API, and quite a few things that only the
>> builtin/grep.c needs.
>>
>> Since we got rid of "prefix_length" in the previous commit, let's move
>> the "prefix" variable over to "builtin/grep.c" where it's used. To do
>> this let's create a "struct grep_cmd_opt", which we'll have a pointer
>> to in a new "caller_priv" member in "struct grep_opt" (the existing
>> "priv" is used by the top-level "grep.c" itself).
>
> I'm definitely in favor of removing specialized, caller-specific bits
> from an internal API. But I'm not sure why grep.c needs to keep track of
> this new "caller_priv" field at all.
>
> Among the uses of `prefix` in builtin/grep.c, I see grep_source_name,
> and the call to run_pager(). Would it be more straightforward to pass
> down prefix from cmd_grep down to its use in grep_source_name?
>
> There are quite a few intermediary functions that we go through to get
> from cmd_grep() down to grep_source_name(). For instance, we could reach
> it through:
>
>     cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
>     grep_source_name
>
> But passing prefix from cmd_grep down to grep_source_name without
> relying on the internals of grep.c seems like a good direction to me. I
> could even buy that (ab)using a static variable in builtin/grep.c to
> keep track of a constant prefix value would save you some plumbing
> (though I'd rather see the usage spelled out more explicitly).
>
> All of that is to say that I share your motivation for this patch and
> think that the direction is good, but I would have preferred to do it
> without the caller_priv variable (unless there is something that I am
> missing here).

Yes, that would make much more sense. I just had tunnel vision while
writing this, evidently. Yeah, just either passing it or using a static
variable in grep.c would make more sense.

Arguably we could even have it be read only (const char *const) and have
a global "the_prefix" or something, but it's probably not widely used
enough for that to make sense.
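
The static-variable option discussed above might look roughly like
this. It is only a sketch: "sketch_prefix", sketch_cmd_setup() and
sketch_source_name() are hypothetical names, and the plain prefix
stripping stands in for whatever path handling the real code does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* File-scope prefix, set once by the command entry point (hypothetical). */
static const char *sketch_prefix;

/* Entry point stand-in: the prefix is either NULL or a non-empty string. */
static void sketch_cmd_setup(const char *prefix)
{
	assert(!prefix || *prefix);
	sketch_prefix = prefix;
}

/*
 * Deeply nested helper stand-in: strips the prefix from "path" when it
 * matches, without the prefix being passed down the call chain.
 */
static const char *sketch_source_name(const char *path)
{
	size_t len;

	if (!sketch_prefix)
		return path;
	len = strlen(sketch_prefix);
	if (!strncmp(path, sketch_prefix, len))
		return path + len;
	return path;
}
```

The trade-off is the usual one for file-scope state: it saves plumbing
through every intermediate function, at the cost of the helper no
longer being usable with two different prefixes at once.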






* Re: [PATCH 7/8] grep: simplify config parsing, change grep.<rx config> interaction
  2021-11-09  2:01     ` Ævar Arnfjörð Bjarmason
@ 2021-11-10  0:16       ` Taylor Blau
  0 siblings, 0 replies; 151+ messages in thread
From: Taylor Blau @ 2021-11-10  0:16 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Taylor Blau, git, Junio C Hamano, J Smith

On Tue, Nov 09, 2021 at 03:01:12AM +0100, Ævar Arnfjörð Bjarmason wrote:
> Or someone who'd read the docs, understood that we promised that would
> do nothing, and inserted that just to test us, but that seems unlikely
> :)
>
> Or, I suppose someone who's entirely confused, and will continue being
> even more confused now that behavior changes on a git upgrade from ERE
> to BRE.
>
> I'm hoping the last two paragraphs describe no-one & that this is safe
> to do.

Yeah, I agree that it seems very unlikely that anybody would actually be
affected here, so I'm comfortable with the change.

> >> ---
> >>  Documentation/config/grep.txt |  3 +-
> >>  Documentation/git-grep.txt    |  3 +-
> >
> > Not the fault of your patch, but these two are annoyingly (and subtly)
> > different from one another. Could we clean this up and put everything in
> > Documentation/config/grep.txt (and then include that in the
> > CONFIGURATION section of Documentation/git-grep.txt)?
>
> I've got a large series to do that for all of these, but opted to skip
> that particular digression here (even for just grep.txt it's a bit
> distracting).

I think that you could include a preparatory clean-up in this series to
make the Documentation/config/grep.txt included by
Documentation/git-grep.txt without being too distracting. But if punting
on it now works better for you, I don't think either makes a reviewer's
job substantially different.

Thanks,
Taylor


* Re: [PATCH 6/8] grep API: call grep_config() after grep_init()
  2021-11-09  2:06     ` Ævar Arnfjörð Bjarmason
@ 2021-11-10  0:18       ` Taylor Blau
  0 siblings, 0 replies; 151+ messages in thread
From: Taylor Blau @ 2021-11-10  0:18 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Taylor Blau, git, Junio C Hamano, J Smith, Jeff King

On Tue, Nov 09, 2021 at 03:06:22AM +0100, Ævar Arnfjörð Bjarmason wrote:
> >> -/*
> >> - * Initialize one instance of grep_opt and copy the
> >> - * default values from the template we read the configuration
> >> - * information in an earlier call to git_config(grep_config).
> >> - */
> >>  void grep_init(struct grep_opt *opt, struct repository *repo)
> >>  {
> >> -	*opt = grep_defaults;
> >> +	struct grep_opt blank = GREP_OPT_INIT;
> >> +	memcpy(opt, &blank, sizeof(*opt));
> >
> > I'm nit-picking, but creating a throwaway struct for the convenience of
> > using designated initialization (at the cost of having to memcpy an
> > entire struct around) seems like overkill.
> >
> > Especially since we're just going to write into the other fields of the
> > target struct anyway, I'd probably rather have seen everything
> > written out explicitly without the throwaway or memcpy.
>
> It's a widely used pattern in the codebase at this point, see
> 5726a6b4012 (*.c *_init(): define in terms of corresponding *_INIT
> macro, 2021-07-01) (mine, but I stole it from Jeff King).
>
> As his linked-to compiler test shows the memcpy() is optimized away, so
> modern compilers will treat these idioms the same way.
>
> There was a suggestion somewhere that we should probably move to that
> "*<x> = <y>" or whatever it was briefer C99 (I think) syntax across the
> board, it would be less verbose. But I haven't tested if it's as widely
> supported, so I've just been sticking with that blank/memcpy() pattern
> for "do init in terms of macro".

I do at least prefer memcpy() over *<x> = <y> when x and y are
structures. But I wasn't aware that this was common in our codebase.
Anyway, my suggestion was only along the lines of "you're already
writing individual fields below, so why not just do that throughout
instead of memcpy()-ing some of them via a macro which expands to a
designated initializer?"

But this is a cosmetic point, so whatever you feel fits in most with the
surrounding style (so long as the pattern we're propagating isn't
terrible, which is the case here) then I'm OK with it.

Thanks,
Taylor


* Re: [PATCH 4/8] grep.c: move "prefix" out of "struct grep_opt"
  2021-11-09  2:10     ` Ævar Arnfjörð Bjarmason
@ 2021-11-10  0:18       ` Taylor Blau
  0 siblings, 0 replies; 151+ messages in thread
From: Taylor Blau @ 2021-11-10  0:18 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Taylor Blau, git, Junio C Hamano, J Smith

On Tue, Nov 09, 2021 at 03:10:31AM +0100, Ævar Arnfjörð Bjarmason wrote:
> > All of that is to say that I share your motivation for this patch and
> > think that the direction is good, but I would have preferred to do it
> > without the caller_priv variable (unless there is something that I am
> > missing here).
>
> Yes, that would make much more sense. I just had tunnel vision while
> writing this, evidently. Yeah, just either passing it or using a static
> variable in grep.c would make more sense.
>
> Arguably we could even have it be read only (const char *const) and have
> a global "the_prefix" or something, but it's probably not widely used
> enough for that to make sense.

Either of the first two seems fine to me, but introducing `the_prefix`
seems both over-specified, and too big a step for this series alone.

Thanks,
Taylor


* [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior
  2021-11-06 21:10 [PATCH 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                   ` (7 preceding siblings ...)
  2021-11-06 21:10 ` [PATCH 8/8] grep: make "extendedRegexp=true" the same as "patternType=extended" Ævar Arnfjörð Bjarmason
@ 2021-11-10  1:43 ` Ævar Arnfjörð Bjarmason
  2021-11-10  1:43   ` [PATCH v2 1/8] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
                     ` (10 more replies)
  8 siblings, 11 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-10  1:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This series changes the behavior of an obscure interaction between
"grep.extendedRegexp=true" and "grep.patternType=default". See
7-8/8. Along the way we can delete a lot of code that was needed to
support the previous behavior.

For v1 and a more extensive summary see [1]. Thanks a lot Taylor for
the detailed review on v1!

Hopefully this v2 addresses all the feedback in one way or another;
it's still 8 patches, but much of the early part of v1 is squashed
together & re-done as suggested.

Then there's a mid-series de-duplication of the grep config
documentation. I ended up keeping the change to not needlessly pass
"cb" around in grep_cmd_config(), but that's now also in its own
patch.

1. https://lore.kernel.org/git/cover-0.8-00000000000-20211106T210711Z-avarab@gmail.com/

Ævar Arnfjörð Bjarmason (8):
  grep.h: remove unused "regex_t regexp" from grep_opt
  built-ins: trust the "prefix" from run_builtin()
  log tests: check if grep_config() is called by "log"-like cmds
  grep docs: de-duplicate configuration sections
  grep.c: don't pass along NULL callback value
  grep API: call grep_config() after grep_init()
  grep: simplify config parsing, change grep.<rx config> interaction
  grep: make "extendedRegexp=true" the same as "patternType=extended"

 Documentation/config/grep.txt |  11 ++--
 Documentation/git-grep.txt    |  30 +--------
 builtin/grep.c                |  27 ++++----
 builtin/log.c                 |  13 +++-
 builtin/ls-tree.c             |   9 ++-
 git.c                         |   4 +-
 grep.c                        | 118 ++++------------------------------
 grep.h                        |  35 ++++++----
 revision.c                    |   4 +-
 t/t4202-log.sh                |  16 +++++
 t/t7810-grep.sh               |   4 +-
 11 files changed, 97 insertions(+), 174 deletions(-)

Range-diff against v1:
1:  412b8b65266 = 1:  1435db727ef grep.h: remove unused "regex_t regexp" from grep_opt
2:  244715e3497 < -:  ----------- git.c & grep.c: assert that "prefix" is NULL or non-zero string
3:  3338cc95b81 < -:  ----------- grep: remove unused "prefix_length" member
4:  78298657d69 < -:  ----------- grep.c: move "prefix" out of "struct grep_opt"
-:  ----------- > 2:  63cf2fe266d built-ins: trust the "prefix" from run_builtin()
5:  ba9be0b9283 = 3:  41e38ebb32c log tests: check if grep_config() is called by "log"-like cmds
-:  ----------- > 4:  efe95397d72 grep docs: de-duplicate configuration sections
-:  ----------- > 5:  d0f0ac6c7ae grep.c: don't pass along NULL callback value
6:  933ac853bca ! 6:  917944f79a5 grep API: call grep_config() after grep_init()
    @@ Commit message
         didn't do that, but now that it can't be a concern anymore let's
         remove those comments.
     
    +    This does not change the behavior of any of the configuration
    +    variables or options. That would have been the case if we didn't move
    +    around the grep_config() call in "builtin/log.c". But now that we call
    +    "grep_config" after "git_log_config" and "git_format_config" we'll
    +    need to pass in the already initialized "struct grep_opt *".
    +
         See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and
         7687a0541e0 (grep: move the configuration parsing logic to grep.[ch],
         2012-10-09) for the commits that added the comments.
     
    +    The memcpy() pattern here will be optimized away and follows the
    +    convention of other *_init() functions. See 5726a6b4012 (*.c *_init():
    +    define in terms of corresponding *_INIT macro, 2021-07-01).
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/grep.c ##
     @@ builtin/grep.c: static int wait_all(void)
    + 
      static int grep_cmd_config(const char *var, const char *value, void *cb)
      {
    - 	int st = grep_config(var, value, cb);
    --	if (git_color_default_config(var, value, cb) < 0)
    -+	if (git_color_default_config(var, value, NULL) < 0)
    +-	int st = grep_config(var, value, NULL);
    ++	int st = grep_config(var, value, cb);
    + 	if (git_color_default_config(var, value, NULL) < 0)
      		st = -1;
      
    - 	if (!strcmp(var, "grep.threads")) {
     @@ builtin/grep.c: int cmd_grep(int argc, const char **argv, const char *prefix)
    - 		OPT_END()
      	};
    + 	grep_prefix = prefix;
      
     -	git_config(grep_cmd_config, NULL);
      	grep_init(&opt, the_repository);
     +	git_config(grep_cmd_config, &opt);
    - 	opt.caller_priv = &opt_cmd;
      
      	/*
    + 	 * If there is no -- then the paths must exist in the working
     
      ## builtin/log.c ##
     @@ builtin/log.c: static int git_log_config(const char *var, const char *value, void *cb)
    @@ grep.c: int grep_config(const char *var, const char *value, void *cb)
     
      ## grep.h ##
     @@ grep.h: struct grep_opt {
    + 	int show_hunk_mark;
    + 	int file_break;
    + 	int heading;
    ++	void *caller_priv;
    + 	void *priv;
    + 
    + 	void (*output)(struct grep_opt *opt, const void *data, size_t size);
      	void *output_priv;
      };
      
7:  677a8f8520f ! 7:  140a7416223 grep: simplify config parsing, change grep.<rx config> interaction
    @@ Commit message
         but in a way that's consistent with how we parse other
         configuration.
     
    -    Pedantically speaking we're probably breaking past promises here, but
    -    I doubt that this will impact anyone in practice. The reduction in
    -    complexity and resulting consistency with other default config
    -    behavior is worth it.
    +    We are breaking past promises here, but I doubt that this will impact
    +    anyone in practice. The reduction in complexity and resulting
    +    consistency with other default config behavior is worth it.
     
         When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
         grep.patternType configuration setting, 2012-08-03) we made two
    @@ Commit message
             # BRE grep
             git -c grep.extendedRegexp=true -c grep.patternType=basic grep <pattern>
     
    -    But probably not for this to ignore the new "grep.patternType" option
    -    entirely, say if /etc/gitconfig was still setting
    +    But probably not for this to ignore the favored "grep.patternType"
    +    option entirely, say if /etc/gitconfig was still setting
         "grep.extendedRegexp", but "~/.gitconfig" used the new
         "grep.patternType" (and wanted to use the "default" value):
     
    @@ Documentation/config/grep.txt: grep.patternType::
      	If set to true, enable `--extended-regexp` option by default. This
     -	option is ignored when the `grep.patternType` option is set to a value
     -	other than 'default'.
    -+	option is ignored when the `grep.patternType` option is set.
    - 
    - grep.threads::
    - 	Number of grep worker threads to use.
    -
    - ## Documentation/git-grep.txt ##
    -@@ Documentation/git-grep.txt: grep.patternType::
    - 
    - grep.extendedRegexp::
    - 	If set to true, enable `--extended-regexp` option by default. This
    --	option is ignored when the `grep.patternType` option is set to a value
    --	other than 'default'.
     +	option is ignored when the `grep.patternType` option is set.
      
      grep.threads::
8:  dadd5dff77a ! 8:  cc904d93b26 grep: make "extendedRegexp=true" the same as "patternType=extended"
    @@ Documentation/config/grep.txt: grep.patternType::
      grep.extendedRegexp::
     -	If set to true, enable `--extended-regexp` option by default. This
     -	option is ignored when the `grep.patternType` option is set.
    -+	Deprecated synonym for 'grep.patternType=extended`.
    - 
    - grep.threads::
    - 	Number of grep worker threads to use.
    -
    - ## Documentation/git-grep.txt ##
    -@@ Documentation/git-grep.txt: grep.patternType::
    - 	value 'default' will return to the default matching behavior.
    - 
    - grep.extendedRegexp::
    --	If set to true, enable `--extended-regexp` option by default. This
    --	option is ignored when the `grep.patternType` option is set.
     +	Deprecated synonym for 'grep.patternType=extended`.
      
      grep.threads::
    @@ grep.h: struct grep_opt {
      	.colors = { \
      		[GREP_COLOR_CONTEXT] = "", \
      		[GREP_COLOR_FILENAME] = "", \
    -@@ grep.h: struct grep_opt {
    - 		[GREP_COLOR_SELECTED] = "", \
    - 		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
    - 	}, \
    --	.only_matching = 0, \
    - 	.color = -1, \
    - 	.output = std_output, \
    - }
     
      ## t/t7810-grep.sh ##
     @@ t/t7810-grep.sh: do
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH v2 1/8] grep.h: remove unused "regex_t regexp" from grep_opt
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
@ 2021-11-10  1:43   ` Ævar Arnfjörð Bjarmason
  2021-11-12 16:11     ` Junio C Hamano
  2021-11-10  1:43   ` [PATCH v2 2/8] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
                     ` (9 subsequent siblings)
  10 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-10  1:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This "regex_t" in grep_opt has not been used since
f9b9faf6f8a (builtin-grep: allow more than one patterns., 2006-05-02).
We still use a "regex_t" for compiling regexes, but that's in the
"grep_pat" struct.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.h b/grep.h
index 3e8815c347b..95cccb670f9 100644
--- a/grep.h
+++ b/grep.h
@@ -136,7 +136,6 @@ struct grep_opt {
 
 	const char *prefix;
 	int prefix_length;
-	regex_t regexp;
 	int linenum;
 	int columnnum;
 	int invert;
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH v2 2/8] built-ins: trust the "prefix" from run_builtin()
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
  2021-11-10  1:43   ` [PATCH v2 1/8] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
@ 2021-11-10  1:43   ` Ævar Arnfjörð Bjarmason
  2021-11-12 16:38     ` Junio C Hamano
  2021-11-10  1:43   ` [PATCH v2 3/8] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
                     ` (8 subsequent siblings)
  10 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-10  1:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the
"prefix" passed from "run_builtin()". The "prefix" we get from setup.c
is either going to be NULL or a string of length >0, never "".

So we can drop the "prefix && *prefix" checks added for
"builtin/grep.c" in 0d042fecf2f (git-grep: show pathnames relative to
the current directory, 2006-08-11), and for "builtin/ls-tree.c" in
a69dd585fca (ls-tree: chomp leading directories when run from a
subdirectory, 2005-12-23).

As seen in code in revision.c that was added in cd676a51367 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that does away with this
assertion.

This makes it easier to reason about a change to the
"prefix_length" code in grep.c in a subsequent commit, and since we're
going to the trouble of doing that let's leave behind an assert() to
promise this to any future callers.

For "builtin/grep.c" it would be painful to pass the "prefix" down the
callchain of:

    cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
    grep_source_name

So for the code that needs it in grep_source_name() let's add a
"grep_prefix" variable similar to the existing "ls_tree_prefix".

While at it let's move the code in cmd_ls_tree() around so that we
assign to the "ls_tree_prefix" right after declaring the variables,
and stop assigning to "prefix". We only used that variable later in
the function, after clobbering it. Let's just use our own
"ls_tree_prefix" instead.

Let's also add an assert() in git.c, so that we'll make this promise
about the "prefix" to any current and future callers, as well as to
any readers of the code.

Code history:

 * The strlen() in "grep.c" hasn't been used since 493b7a08d80 (grep:
   accept relative paths outside current working directory, 2009-09-05).

   When that code was added in 0d042fecf2f (git-grep: show pathnames
   relative to the current directory, 2006-08-11) we used the length.

   But since 493b7a08d80 we haven't used it for anything except a
   boolean check that we could have done on the "prefix" member
   itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c    | 13 ++++++++-----
 builtin/ls-tree.c |  9 ++++-----
 git.c             |  4 ++--
 grep.c            |  4 +---
 grep.h            |  4 +---
 revision.c        |  2 +-
 6 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 9e34a820ad4..d85cbabea67 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,6 +26,8 @@
 #include "object-store.h"
 #include "packfile.h"
 
+static const char *grep_prefix;
+
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
 	NULL
@@ -315,11 +317,11 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 	strbuf_reset(out);
 
 	if (opt->null_following_name) {
-		if (opt->relative && opt->prefix_length) {
+		if (opt->relative && grep_prefix) {
 			struct strbuf rel_buf = STRBUF_INIT;
 			const char *rel_name =
 				relative_path(filename + tree_name_len,
-					      opt->prefix, &rel_buf);
+					      grep_prefix, &rel_buf);
 
 			if (tree_name_len)
 				strbuf_add(out, filename, tree_name_len);
@@ -332,8 +334,8 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 		return;
 	}
 
-	if (opt->relative && opt->prefix_length)
-		quote_path(filename + tree_name_len, opt->prefix, out, 0);
+	if (opt->relative && grep_prefix)
+		quote_path(filename + tree_name_len, grep_prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
 
@@ -962,9 +964,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_END()
 	};
+	grep_prefix = prefix;
 
 	git_config(grep_cmd_config, NULL);
-	grep_init(&opt, the_repository, prefix);
+	grep_init(&opt, the_repository);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c71..84bed6d5612 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -147,16 +147,15 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 		OPT__ABBREV(&abbrev),
 		OPT_END()
 	};
-
-	git_config(git_default_config, NULL);
 	ls_tree_prefix = prefix;
-	if (prefix && *prefix)
+	if (prefix)
 		chomp_prefix = strlen(prefix);
 
+	git_config(git_default_config, NULL);
 	argc = parse_options(argc, argv, prefix, ls_tree_options,
 			     ls_tree_usage, 0);
 	if (full_tree) {
-		ls_tree_prefix = prefix = NULL;
+		ls_tree_prefix = NULL;
 		chomp_prefix = 0;
 	}
 	/* -d -r should imply -t, but -d by itself should not have to. */
@@ -178,7 +177,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 	parse_pathspec(&pathspec, PATHSPEC_ALL_MAGIC &
 				  ~(PATHSPEC_FROMTOP | PATHSPEC_LITERAL),
 		       PATHSPEC_PREFER_CWD,
-		       prefix, argv + 1);
+		       ls_tree_prefix, argv + 1);
 	for (i = 0; i < pathspec.nr; i++)
 		pathspec.items[i].nowildcard_len = pathspec.items[i].len;
 	pathspec.has_wildcard = 0;
diff --git a/git.c b/git.c
index 5ff21be21f3..611bf2f63eb 100644
--- a/git.c
+++ b/git.c
@@ -420,9 +420,8 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 {
 	int status, help;
 	struct stat st;
-	const char *prefix;
+	const char *prefix = NULL;
 
-	prefix = NULL;
 	help = argc == 2 && !strcmp(argv[1], "-h");
 	if (!help) {
 		if (p->option & RUN_SETUP)
@@ -431,6 +430,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 			int nongit_ok;
 			prefix = setup_git_directory_gently(&nongit_ok);
 		}
+		assert(!prefix || *prefix);
 		precompose_argv_prefix(argc, argv, NULL);
 		if (use_pager == -1 && p->option & (RUN_SETUP | RUN_SETUP_GENTLY) &&
 		    !(p->option & DELAY_PAGER_CONFIG))
diff --git a/grep.c b/grep.c
index f6e113e9f0f..c9065254aeb 100644
--- a/grep.c
+++ b/grep.c
@@ -139,13 +139,11 @@ int grep_config(const char *var, const char *value, void *cb)
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
  */
-void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
+void grep_init(struct grep_opt *opt, struct repository *repo)
 {
 	*opt = grep_defaults;
 
 	opt->repo = repo;
-	opt->prefix = prefix;
-	opt->prefix_length = (prefix && *prefix) ? strlen(prefix) : 0;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
diff --git a/grep.h b/grep.h
index 95cccb670f9..62deadb885f 100644
--- a/grep.h
+++ b/grep.h
@@ -134,8 +134,6 @@ struct grep_opt {
 	 */
 	struct repository *repo;
 
-	const char *prefix;
-	int prefix_length;
 	int linenum;
 	int columnnum;
 	int invert;
@@ -180,7 +178,7 @@ struct grep_opt {
 };
 
 int grep_config(const char *var, const char *value, void *);
-void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index ab7c1358042..9f9b0d2429e 100644
--- a/revision.c
+++ b/revision.c
@@ -1833,7 +1833,7 @@ void repo_init_revisions(struct repository *r,
 	revs->commit_format = CMIT_FMT_DEFAULT;
 	revs->expand_tabs_in_log_default = 8;
 
-	grep_init(&revs->grep_filter, revs->repo, prefix);
+	grep_init(&revs->grep_filter, revs->repo);
 	revs->grep_filter.status_only = 1;
 
 	repo_diff_setup(revs->repo, &revs->diffopt);
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH v2 3/8] log tests: check if grep_config() is called by "log"-like cmds
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
  2021-11-10  1:43   ` [PATCH v2 1/8] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
  2021-11-10  1:43   ` [PATCH v2 2/8] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
@ 2021-11-10  1:43   ` Ævar Arnfjörð Bjarmason
  2021-11-12 17:09     ` Junio C Hamano
  2021-11-10  1:43   ` [PATCH v2 4/8] grep docs: de-duplicate configuration sections Ævar Arnfjörð Bjarmason
                     ` (7 subsequent siblings)
  10 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-10  1:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
for pattern style options & config, 2017-05-20) to check not only
whether "git log" handles "grep.patternType", but also "git show"
etc.

It's sufficient to check whether a PCRE regex matches for the purposes
of this test; we otherwise assume that these commands run the same code
as "git log", whose behavior is tested more exhaustively by the tests
added in 9df46763ef1e.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 7884e3d46b3..a114c49ef27 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -449,6 +449,22 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
 	)
 '
 
+for cmd in show whatchanged reflog format-patch
+do
+	myarg=
+	if test "$cmd" = "format-patch"
+	then
+		myarg="HEAD~.."
+	fi
+
+	test_expect_success PCRE "$cmd: understands grep.patternType=perl, like 'log'" '
+		git -c grep.patternType=fixed -C pattern-type $cmd --grep="1(?=\|2)" $myarg >actual &&
+		test_must_be_empty actual &&
+		git -c grep.patternType=perl -C pattern-type $cmd --grep="1(?=\|2)" $myarg >actual &&
+		test_file_not_empty actual
+	'
+done
+
 test_expect_success 'log --author' '
 	cat >expect <<-\EOF &&
 	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH v2 4/8] grep docs: de-duplicate configuration sections
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                     ` (2 preceding siblings ...)
  2021-11-10  1:43   ` [PATCH v2 3/8] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2021-11-10  1:43   ` Ævar Arnfjörð Bjarmason
  2021-11-12 17:15     ` Junio C Hamano
  2021-11-10  1:43   ` [PATCH v2 5/8] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
                     ` (6 subsequent siblings)
  10 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-10  1:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Include the "config/grep.txt" file in "git-grep.txt", instead of
repeating an almost identical description of the "grep" configuration
variables in two places. In a subsequent commit we'll amend this
documentation, and can now do so in one place instead of two.

Let's also add a short blurb at the top indicating that this is
included documentation, so users won't think that they need to read
the two versions and compare them.

That wording is copy/pasted from the change I made in b6a8d09f6d8 (gc
docs: include the "gc.*" section from "config" in "gc", 2019-04-07).
Eventually we'll want to include this via a template, and indeed this
change is extracted from a WIP series which does that for all of these
"CONFIGURATION" includes. But doing that would require build system
changes, so let's punt on it for now.

There is no loss of information here that isn't shown in the addition
to "grep.txt". This change was made by copying the contents of
"git-grep.txt"'s version over the "grep.txt" version. Aside from the
change to "grep.txt" being made here, the two were identical.

This documentation started being copy/pasted around in
b22520a37c8 (grep: allow -E and -n to be turned on by default via
configuration, 2011-03-30). After that in e.g. 6453f7b3486 (grep: add
grep.fullName config variable, 2014-03-17) they started drifting
apart, with only grep.fullName being described in the command
documentation.

In 434e6e753fe (config.txt: move grep.* to a separate file,
2018-10-27) we gained the include, but didn't do this next step. Let's
do it now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config/grep.txt |  7 +++++--
 Documentation/git-grep.txt    | 30 +++---------------------------
 2 files changed, 8 insertions(+), 29 deletions(-)

diff --git a/Documentation/config/grep.txt b/Documentation/config/grep.txt
index 44abe45a7ca..ae51f2d91c8 100644
--- a/Documentation/config/grep.txt
+++ b/Documentation/config/grep.txt
@@ -16,8 +16,11 @@ grep.extendedRegexp::
 	other than 'default'.
 
 grep.threads::
-	Number of grep worker threads to use.
-	See `grep.threads` in linkgit:git-grep[1] for more information.
+	Number of grep worker threads to use. If unset (or set to 0), Git will
+	use as many threads as the number of logical cores available.
+
+grep.fullName::
+	If set to true, enable `--full-name` option by default.
 
 grep.fallbackToNoIndex::
 	If set to true, fall back to git grep --no-index if git grep
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 3d393fbac1b..29d5ce04f5a 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -334,34 +334,10 @@ performance in this case, it might be desirable to use `--threads=1`.
 CONFIGURATION
 -------------
 
-grep.lineNumber::
-	If set to true, enable `-n` option by default.
-
-grep.column::
-	If set to true, enable the `--column` option by default.
-
-grep.patternType::
-	Set the default matching behavior. Using a value of 'basic', 'extended',
-	'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
-	`--fixed-strings`, or `--perl-regexp` option accordingly, while the
-	value 'default' will return to the default matching behavior.
-
-grep.extendedRegexp::
-	If set to true, enable `--extended-regexp` option by default. This
-	option is ignored when the `grep.patternType` option is set to a value
-	other than 'default'.
-
-grep.threads::
-	Number of grep worker threads to use. If unset (or set to 0), Git will
-	use as many threads as the number of logical cores available.
-
-grep.fullName::
-	If set to true, enable `--full-name` option by default.
-
-grep.fallbackToNoIndex::
-	If set to true, fall back to git grep --no-index if git grep
-	is executed outside of a git repository.  Defaults to false.
+The below documentation is the same as what's found in
+linkgit:git-config[1]:
 
+include::config/grep.txt[]
 
 GIT
 ---
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH v2 5/8] grep.c: don't pass along NULL callback value
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                     ` (3 preceding siblings ...)
  2021-11-10  1:43   ` [PATCH v2 4/8] grep docs: de-duplicate configuration sections Ævar Arnfjörð Bjarmason
@ 2021-11-10  1:43   ` Ævar Arnfjörð Bjarmason
  2021-11-12 17:18     ` Junio C Hamano
  2021-11-10  1:43   ` [PATCH v2 6/8] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
                     ` (5 subsequent siblings)
  10 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-10  1:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change grep_cmd_config() to stop passing around the always-NULL "cb"
value. When this code was added in 7e8f59d577e (grep: color patterns
in output, 2009-03-07) it was non-NULL, but when that changed in
15fabd1bbd4 (builtin/grep.c: make configuration callback more
reusable, 2012-10-09) this code was left behind.

In a subsequent change I'll start using the "cb" value. This change
makes it clear which of the functions we call need it, and which don't.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index d85cbabea67..5ec4cecae45 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,8 +285,8 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, cb);
-	if (git_color_default_config(var, value, cb) < 0)
+	int st = grep_config(var, value, NULL);
+	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
 	if (!strcmp(var, "grep.threads")) {
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH v2 6/8] grep API: call grep_config() after grep_init()
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                     ` (4 preceding siblings ...)
  2021-11-10  1:43   ` [PATCH v2 5/8] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
@ 2021-11-10  1:43   ` Ævar Arnfjörð Bjarmason
  2021-11-12 17:32     ` Junio C Hamano
  2021-11-10  1:43   ` [PATCH v2 7/8] grep: simplify config parsing, change grep.<rx config> interaction Ævar Arnfjörð Bjarmason
                     ` (4 subsequent siblings)
  10 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-10  1:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().

So we effectively (b) initialized config, then (a) defaults, followed
by (c) user options. Usually those are ordered as "a", "b" and "c"
instead.

As the comments being removed here show, the previous behavior needed
to be carefully explained, as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.

This does not change the behavior of any of the configuration
variables or options. That would have been the case if we didn't move
around the grep_config() call in "builtin/log.c". But now that we call
"grep_config" after "git_log_config" and "git_format_config" we'll
need to pass in the already initialized "struct grep_opt *".

See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e0 (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.

The memcpy() pattern here will be optimized away and follows the
convention of other *_init() functions. See 5726a6b4012 (*.c *_init():
define in terms of corresponding *_INIT macro, 2021-07-01).
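
As a rough illustration of the "a", "b", "c" ordering discussed above
(this is not git's actual code), here is a minimal self-contained
sketch. The names "struct toy_opt", "TOY_OPT_INIT", "toy_init()",
"toy_config()" and "toy_option_max_depth()" are all made up for this
example; they merely stand in for "struct grep_opt", "GREP_OPT_INIT",
grep_init(), a config callback and an option handler:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "struct grep_opt"; not git's real layout. */
struct toy_opt {
	int linenum;
	int max_depth;
	int color;
};

/* (a) Compile-time defaults, in the style of a *_INIT macro. */
#define TOY_OPT_INIT { \
	.max_depth = -1, \
	.color = -1, \
}

/* Reset "opt" to the defaults by copying a blank, initialized instance. */
void toy_init(struct toy_opt *opt)
{
	struct toy_opt blank = TOY_OPT_INIT;

	assert(opt);
	memcpy(opt, &blank, sizeof(*opt));
}

/*
 * (b) Config callback: "cb" points at the struct already set up by
 * toy_init(), so config values overwrite the defaults in place.
 * The "toy.*" keys are invented for this sketch. Returns 0 on success.
 */
int toy_config(const char *var, const char *value, void *cb)
{
	struct toy_opt *opt = cb;

	assert(opt && var && value);
	if (!strcmp(var, "toy.linenumber"))
		opt->linenum = !strcmp(value, "true");
	else if (!strcmp(var, "toy.maxdepth"))
		opt->max_depth = atoi(value);
	return 0;
}

/* (c) A command-line option is applied last, so it wins over config. */
void toy_option_max_depth(struct toy_opt *opt, int depth)
{
	assert(opt);
	opt->max_depth = depth;
}
```

A caller in this sketch would run toy_init(), then feed each config
key/value pair to toy_config() with "&opt" as the "cb" argument, and
only then apply command-line options, so no state is shared through a
static template struct.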

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  4 ++--
 builtin/log.c  | 13 +++++++++++--
 grep.c         | 39 +++------------------------------------
 grep.h         | 22 ++++++++++++++++++++++
 4 files changed, 38 insertions(+), 40 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5ec4cecae45..0ea124321b6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,7 +285,7 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, NULL);
+	int st = grep_config(var, value, cb);
 	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
@@ -966,8 +966,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	};
 	grep_prefix = prefix;
 
-	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, the_repository);
+	git_config(grep_cmd_config, &opt);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/log.c b/builtin/log.c
index f75d87e8d7f..bfddacdfa6c 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -505,8 +505,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (grep_config(var, value, cb) < 0)
-		return -1;
 	if (git_gpg_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
@@ -521,6 +519,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.simplify_history = 0;
 	memset(&opt, 0, sizeof(opt));
@@ -635,6 +635,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 
 	memset(&match_all, 0, sizeof(match_all));
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.always_show_header = 1;
 	rev.no_walk = 1;
@@ -718,6 +720,8 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix)
 
 	repo_init_revisions(the_repository, &rev, prefix);
 	init_reflog_walk(&rev.reflog_info);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.verbose_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -751,6 +755,8 @@ int cmd_log(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.always_show_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -1833,10 +1839,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	extra_hdr.strdup_strings = 1;
 	extra_to.strdup_strings = 1;
 	extra_cc.strdup_strings = 1;
+
 	init_log_defaults();
 	init_display_notes(&notes_opt);
 	git_config(git_format_config, NULL);
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.show_notes = show_notes;
 	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
 	rev.commit_format = CMIT_FMT_EMAIL;
diff --git a/grep.c b/grep.c
index c9065254aeb..fb3f63c63ef 100644
--- a/grep.c
+++ b/grep.c
@@ -19,27 +19,6 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
-static struct grep_opt grep_defaults = {
-	.relative = 1,
-	.pathname = 1,
-	.max_depth = -1,
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
-	.colors = {
-		[GREP_COLOR_CONTEXT] = "",
-		[GREP_COLOR_FILENAME] = "",
-		[GREP_COLOR_FUNCTION] = "",
-		[GREP_COLOR_LINENO] = "",
-		[GREP_COLOR_COLUMNNO] = "",
-		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_SELECTED] = "",
-		[GREP_COLOR_SEP] = GIT_COLOR_CYAN,
-	},
-	.only_matching = 0,
-	.color = -1,
-	.output = std_output,
-};
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -75,20 +54,12 @@ define_list_config_array_extra(color_grep_slots, {"match"});
  */
 int grep_config(const char *var, const char *value, void *cb)
 {
-	struct grep_opt *opt = &grep_defaults;
+	struct grep_opt *opt = cb;
 	const char *slot;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	/*
-	 * The instance of grep_opt that we set up here is copied by
-	 * grep_init() to be used by each individual invocation.
-	 * When populating a new field of this structure here, be
-	 * sure to think about ownership -- e.g., you might need to
-	 * override the shallow copy in grep_init() with a deep copy.
-	 */
-
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
 		return 0;
@@ -134,14 +105,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/*
- * Initialize one instance of grep_opt and copy the
- * default values from the template we read the configuration
- * information in an earlier call to git_config(grep_config).
- */
 void grep_init(struct grep_opt *opt, struct repository *repo)
 {
-	*opt = grep_defaults;
+	struct grep_opt blank = GREP_OPT_INIT;
+	memcpy(opt, &blank, sizeof(*opt));
 
 	opt->repo = repo;
 	opt->pattern_tail = &opt->pattern_list;
diff --git a/grep.h b/grep.h
index 62deadb885f..30a7dfd3294 100644
--- a/grep.h
+++ b/grep.h
@@ -171,12 +171,34 @@ struct grep_opt {
 	int show_hunk_mark;
 	int file_break;
 	int heading;
+	void *caller_priv;
 	void *priv;
 
 	void (*output)(struct grep_opt *opt, const void *data, size_t size);
 	void *output_priv;
 };
 
+#define GREP_OPT_INIT { \
+	.relative = 1, \
+	.pathname = 1, \
+	.max_depth = -1, \
+	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
+	.colors = { \
+		[GREP_COLOR_CONTEXT] = "", \
+		[GREP_COLOR_FILENAME] = "", \
+		[GREP_COLOR_FUNCTION] = "", \
+		[GREP_COLOR_LINENO] = "", \
+		[GREP_COLOR_COLUMNNO] = "", \
+		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_SELECTED] = "", \
+		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
+	}, \
+	.only_matching = 0, \
+	.color = -1, \
+	.output = std_output, \
+}
+
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
-- 
2.34.0.rc1.741.gab7bfd97031



* [PATCH v2 7/8] grep: simplify config parsing, change grep.<rx config> interaction
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                     ` (5 preceding siblings ...)
  2021-11-10  1:43   ` [PATCH v2 6/8] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2021-11-10  1:43   ` Ævar Arnfjörð Bjarmason
  2021-11-12 19:19     ` Junio C Hamano
  2021-11-10  1:43   ` [PATCH v2 8/8] grep: make "extendedRegexp=true" the same as "patternType=extended" Ævar Arnfjörð Bjarmason
                     ` (3 subsequent siblings)
  10 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-10  1:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change the interaction between "grep.patternType=default" and
"grep.extendedRegexp=true" to make setting "grep.extendedRegexp=true"
synonymous with setting "grep.patternType=extended".

This changes our existing config parsing behavior as detailed below,
but in a way that's consistent with how we parse other
configuration.

We are breaking past promises here, but I doubt that this will impact
anyone in practice. The reduction in complexity and resulting
consistency with other default config behavior is worth it.

When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
grep.patternType configuration setting, 2012-08-03) we made two
seemingly contradictory promises:

 1. You can set "grep.patternType", and "[setting it to] 'default'
    will return to the default matching behavior".

 2. Support the existing "grep.extendedRegexp" option, but ignore it
    when the new "grep.patternType" is set, *except* "when the
    `grep.patternType` option is set to a value other than 'default'".

I think that 84befcd0a4a probably didn't intend this behavior, but
instead ended up conflating our internal "unspecified" state with a
user's explicit desire to set the configuration back to the
default.

I.e. a user would correctly expect this to keep working:

    # ERE grep
    git -c grep.extendedRegexp=true grep <pattern>

And likewise for "grep.patternType=default" to take precedence over
the disfavored "grep.extendedRegexp" option, i.e. the usual "last set
wins" semantics.

    # BRE grep
    git -c grep.extendedRegexp=true -c grep.patternType=basic grep <pattern>
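
To make the practical difference between the ERE and BRE invocations
above concrete, here is a small sketch using the POSIX regcomp() API
directly rather than git's grep machinery. The helper name
"toy_line_matches()" and the pattern "a+bc" are chosen purely for
illustration:

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Return 1 if "pattern" matches somewhere in "line", 0 if it does not,
 * and -1 if the pattern fails to compile. A non-zero "extended" compiles
 * the pattern as an ERE (REG_EXTENDED); otherwise it is compiled as a BRE.
 */
int toy_line_matches(const char *pattern, const char *line, int extended)
{
	regex_t re;
	int flags = REG_NOSUB; /* we only need a yes/no answer */
	int ret;

	assert(pattern && line);
	if (extended)
		flags |= REG_EXTENDED;
	if (regcomp(&re, pattern, flags))
		return -1;
	ret = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return ret;
}
```

With this helper "a+bc" compiled as an ERE matches the line "abc",
since "+" is a repetition operator there. Compiled as a BRE the "+" is
an ordinary character, so the same pattern only matches a line
containing the literal text "a+bc".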

But probably not for this to ignore the favored "grep.patternType"
option entirely, say if /etc/gitconfig was still setting
"grep.extendedRegexp", but "~/.gitconfig" used the new
"grep.patternType" (and wanted to use the "default" value):

    # Was ERE, now BRE
    git -c grep.extendedRegexp=true -c grep.patternType=default grep <pattern>

I think that in practice nobody or almost nobody is going to be
relying on this obscure interaction, and as shown here it makes the
config parsing much simpler. We no longer have to carry a complex
state machine in "grep_commit_pattern_type()" and
"grep_set_pattern_type_option()".

We can also do away with the "int fixed" and "int pcre2" members in
favor of using "pattern_type_option" directly in "grep.c", as well as
dropping the "pattern_type_arg" variable in "builtin/grep.c" in favor
of using the "pattern_type_option" member directly.
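
The simplified parsing can be modelled with a standalone toy, sketched
here under some assumptions. The "toy_*" names are invented for this
example. The "grep.extendedregexp" branch follows the hunk in this
patch, but the handling of "grep.patternType" (in particular that
"default" maps back to the unspecified state) is an assumption, as that
part of grep_config() is not touched by this diff:

```c
#include <assert.h>
#include <string.h>

/* Modelled on git's pattern type enum, but this is a standalone toy. */
enum toy_pattern_type {
	TOY_PATTERN_TYPE_UNSPECIFIED = 0,
	TOY_PATTERN_TYPE_BRE,
	TOY_PATTERN_TYPE_ERE,
	TOY_PATTERN_TYPE_FIXED,
	TOY_PATTERN_TYPE_PCRE
};

struct toy_grep_opt {
	enum toy_pattern_type pattern_type_option;
};

/*
 * Assumed mapping for "grep.patternType" values. "default" (and, in
 * this toy only, any unknown value) maps to the unspecified state.
 */
enum toy_pattern_type toy_parse_pattern_type(const char *value)
{
	assert(value);
	if (!strcmp(value, "basic"))
		return TOY_PATTERN_TYPE_BRE;
	if (!strcmp(value, "extended"))
		return TOY_PATTERN_TYPE_ERE;
	if (!strcmp(value, "fixed"))
		return TOY_PATTERN_TYPE_FIXED;
	if (!strcmp(value, "perl"))
		return TOY_PATTERN_TYPE_PCRE;
	return TOY_PATTERN_TYPE_UNSPECIFIED;
}

/* Process one config key/value pair, in the order the config is read. */
void toy_grep_config(struct toy_grep_opt *opt, const char *var,
		     const char *value)
{
	assert(opt && var && value);
	/* extendedRegexp=true only applies while no type has been chosen. */
	if (opt->pattern_type_option == TOY_PATTERN_TYPE_UNSPECIFIED &&
	    !strcmp(var, "grep.extendedregexp") &&
	    !strcmp(value, "true")) {
		opt->pattern_type_option = TOY_PATTERN_TYPE_ERE;
		return;
	}
	if (!strcmp(var, "grep.patterntype"))
		opt->pattern_type_option = toy_parse_pattern_type(value);
}

/* Only an explicit ERE type selects extended syntax; unspecified is BRE. */
int toy_uses_extended(const struct toy_grep_opt *opt)
{
	assert(opt);
	return opt->pattern_type_option == TOY_PATTERN_TYPE_ERE;
}
```

Feeding this toy "grep.extendedregexp=true" alone selects ERE, while
following it with "grep.patterntype=default" or "=basic" ends up with
BRE, matching the "Was ERE, now BRE" case above.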

See my 07a3d411739 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config/grep.txt |  3 +-
 builtin/grep.c                | 10 ++---
 grep.c                        | 71 +++++------------------------------
 grep.h                        |  6 +--
 revision.c                    |  2 -
 t/t7810-grep.sh               |  2 +-
 6 files changed, 16 insertions(+), 78 deletions(-)

diff --git a/Documentation/config/grep.txt b/Documentation/config/grep.txt
index ae51f2d91c8..f4b7d3041fb 100644
--- a/Documentation/config/grep.txt
+++ b/Documentation/config/grep.txt
@@ -12,8 +12,7 @@ grep.patternType::
 
 grep.extendedRegexp::
 	If set to true, enable `--extended-regexp` option by default. This
-	option is ignored when the `grep.patternType` option is set to a value
-	other than 'default'.
+	option is ignored when the `grep.patternType` option is set.
 
 grep.threads::
 	Number of grep worker threads to use. If unset (or set to 0), Git will
diff --git a/builtin/grep.c b/builtin/grep.c
index 0ea124321b6..942c4b25077 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -845,7 +845,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
 	int allow_revs;
 
 	struct option options[] = {
@@ -879,16 +878,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
 		OPT_GROUP(""),
-		OPT_SET_INT('E', "extended-regexp", &pattern_type_arg,
+		OPT_SET_INT('E', "extended-regexp", &opt.pattern_type_option,
 			    N_("use extended POSIX regular expressions"),
 			    GREP_PATTERN_TYPE_ERE),
-		OPT_SET_INT('G', "basic-regexp", &pattern_type_arg,
+		OPT_SET_INT('G', "basic-regexp", &opt.pattern_type_option,
 			    N_("use basic POSIX regular expressions (default)"),
 			    GREP_PATTERN_TYPE_BRE),
-		OPT_SET_INT('F', "fixed-strings", &pattern_type_arg,
+		OPT_SET_INT('F', "fixed-strings", &opt.pattern_type_option,
 			    N_("interpret patterns as fixed strings"),
 			    GREP_PATTERN_TYPE_FIXED),
-		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
+		OPT_SET_INT('P', "perl-regexp", &opt.pattern_type_option,
 			    N_("use Perl-compatible regular expressions"),
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
@@ -982,7 +981,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, grep_usage,
 			     PARSE_OPT_KEEP_DASHDASH |
 			     PARSE_OPT_STOP_AT_NON_OPTION);
-	grep_commit_pattern_type(pattern_type_arg, &opt);
 
 	if (use_index && !startup_info->have_repository) {
 		int fallback = 0;
diff --git a/grep.c b/grep.c
index fb3f63c63ef..dda8e536fe3 100644
--- a/grep.c
+++ b/grep.c
@@ -60,8 +60,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	if (!strcmp(var, "grep.extendedregexp")) {
-		opt->extended_regexp_option = git_config_bool(var, value);
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_UNSPECIFIED &&
+	    !strcmp(var, "grep.extendedregexp") &&
+	    git_config_bool(var, value)) {
+		opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
 		return 0;
 	}
 
@@ -115,62 +117,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo)
 	opt->header_tail = &opt->header_list;
 }
 
-static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	/*
-	 * When committing to the pattern type by setting the relevant
-	 * fields in grep_opt it's generally not necessary to zero out
-	 * the fields we're not choosing, since they won't have been
-	 * set by anything. The extended_regexp_option field is the
-	 * only exception to this.
-	 *
-	 * This is because in the process of parsing grep.patternType
-	 * & grep.extendedRegexp we set opt->pattern_type_option and
-	 * opt->extended_regexp_option, respectively. We then
-	 * internally use opt->extended_regexp_option to see if we're
-	 * compiling an ERE. It must be unset if that's not actually
-	 * the case.
-	 */
-	if (pattern_type != GREP_PATTERN_TYPE_ERE &&
-	    opt->extended_regexp_option)
-		opt->extended_regexp_option = 0;
-
-	switch (pattern_type) {
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		/* fall through */
-
-	case GREP_PATTERN_TYPE_BRE:
-		break;
-
-	case GREP_PATTERN_TYPE_ERE:
-		opt->extended_regexp_option = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_FIXED:
-		opt->fixed = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_PCRE:
-		opt->pcre2 = 1;
-		break;
-	}
-}
-
-void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(pattern_type, opt);
-	else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(opt->pattern_type_option, opt);
-	else if (opt->extended_regexp_option)
-		/*
-		 * This branch *must* happen after setting from the
-		 * opt->pattern_type_option above, we don't want
-		 * grep.extendedRegexp to override grep.patternType!
-		 */
-		grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
-}
-
 static struct grep_pat *create_grep_pat(const char *pat, size_t patlen,
 					const char *origin, int no,
 					enum grep_pat_token t,
@@ -492,9 +438,10 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 
 	p->word_regexp = opt->word_regexp;
 	p->ignore_case = opt->ignore_case;
-	p->fixed = opt->fixed;
+	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
 
-	if (memchr(p->pattern, 0, p->patternlen) && !opt->pcre2)
+	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
+	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
 	p->is_fixed = is_fixed(p->pattern, p->patternlen);
@@ -545,14 +492,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre2) {
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
 		compile_pcre2_pattern(p, opt);
 		return;
 	}
 
 	if (p->ignore_case)
 		regflags |= REG_ICASE;
-	if (opt->extended_regexp_option)
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
 		regflags |= REG_EXTENDED;
 	err = regcomp(&p->regexp, p->pattern, regflags);
 	if (err) {
diff --git a/grep.h b/grep.h
index 30a7dfd3294..e4e548aed90 100644
--- a/grep.h
+++ b/grep.h
@@ -143,7 +143,6 @@ struct grep_opt {
 	int unmatch_name_only;
 	int count;
 	int word_regexp;
-	int fixed;
 	int all_match;
 #define GREP_BINARY_DEFAULT	0
 #define GREP_BINARY_NOMATCH	1
@@ -152,7 +151,6 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre2;
 	int relative;
 	int pathname;
 	int null_following_name;
@@ -161,8 +159,7 @@ struct grep_opt {
 	int max_depth;
 	int funcname;
 	int funcbody;
-	int extended_regexp_option;
-	int pattern_type_option;
+	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
 	unsigned pre_context;
@@ -201,7 +198,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
-void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
 void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index 9f9b0d2429e..ed29d245c89 100644
--- a/revision.c
+++ b/revision.c
@@ -2864,8 +2864,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	diff_setup_done(&revs->diffopt);
 
-	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
-				 &revs->grep_filter);
 	if (!is_encoding_utf8(get_log_output_encoding()))
 		revs->grep_filter.ignore_locale = 1;
 	compile_grep_patterns(&revs->grep_filter);
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 6b6423a07c3..a59a9726357 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -443,7 +443,7 @@ do
 	'
 
 	test_expect_success "grep $L with grep.extendedRegexp=true and grep.patternType=default" '
-		echo "${HC}ab:abc" >expected &&
+		echo "${HC}ab:a+bc" >expected &&
 		git \
 			-c grep.extendedRegexp=true \
 			-c grep.patternType=default \
-- 
2.34.0.rc1.741.gab7bfd97031


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v2 8/8] grep: make "extendedRegexp=true" the same as "patternType=extended"
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                     ` (6 preceding siblings ...)
  2021-11-10  1:43   ` [PATCH v2 7/8] grep: simplify config parsing, change grep.<rx config> interaction Ævar Arnfjörð Bjarmason
@ 2021-11-10  1:43   ` Ævar Arnfjörð Bjarmason
  2021-11-12 19:32     ` Junio C Hamano
  2021-11-10  2:23   ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Taylor Blau
                     ` (2 subsequent siblings)
  10 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-10  1:43 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

In the preceding commit we changed how a "grep.patternType=default"
set after "grep.extendedRegexp=true" would be handled so that the last
set would win, but a "grep.extendedRegexp=true" would only be used if
"grep.patternType" was set to a value other than "default".

Thus a user who had old config and set "grep.extendedRegexp=true" in
their ~/.gitconfig expecting ERE behavior would be opted-in to say
"perl" regexes if a system "/etc/gitconfig" started setting
"grep.patternType=perl".

These funny semantics of only paying attention to a set if another key
is not set to a given value aren't how we treat other config keys, so
let's do away with this caveat for consistency.

The new semantics are simple: a "grep.extendedRegexp=true" is an exact
synonym for specifying "grep.patternType=extended" in the
config. We'll keep ignoring "grep.extendedRegexp=false", although
arguably we could treat it as a "grep.patternType=basic".
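
A minimal sketch of the "last set wins" rule described above. This is
not the code in grep.c: the names "toy_grep_config", "struct opts" and
the TYPE_* enum are hypothetical, and boolean parsing is reduced to the
literal string "true" (real config parsing accepts more spellings).

```c
#include <string.h>

/* Hypothetical, simplified stand-ins for the types in grep.h. */
enum pattern_type { TYPE_BRE = 0, TYPE_ERE, TYPE_FIXED, TYPE_PCRE };

struct opts {
	enum pattern_type pattern_type;
};

/*
 * Toy config callback. Keys are fed in the order they are read, and
 * every recognized key simply overwrites the field, so the last one
 * set wins. Returns -1 on an unknown grep.patternType value.
 */
static int toy_grep_config(struct opts *o, const char *var, const char *value)
{
	if (!strcmp(var, "grep.extendedregexp")) {
		if (!strcmp(value, "true"))
			o->pattern_type = TYPE_ERE; /* same as patternType=extended */
		return 0; /* a "false" value is ignored */
	}
	if (!strcmp(var, "grep.patterntype")) {
		if (!strcmp(value, "basic") || !strcmp(value, "default"))
			o->pattern_type = TYPE_BRE;
		else if (!strcmp(value, "extended"))
			o->pattern_type = TYPE_ERE;
		else if (!strcmp(value, "fixed"))
			o->pattern_type = TYPE_FIXED;
		else if (!strcmp(value, "perl"))
			o->pattern_type = TYPE_PCRE;
		else
			return -1;
	}
	return 0;
}
```

Feeding this model "grep.patterntype=basic" followed by
"grep.extendedregexp=true" leaves the type at ERE, while the reverse
order leaves it at BRE.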

As argued in the preceding commit I think this behavior came about
because we were conflating the state of our code's own internal
"default" value with what we found in explicit user config. See
84befcd0a4a (grep: add a grep.patternType configuration setting,
2012-08-03) for that past behavior.

Let's further change the documentation to note that
"grep.extendedRegexp" is a deprecated synonym; perhaps we'll be able
to remove it at some point in the future and do away with this
special case entirely.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/config/grep.txt | 3 +--
 grep.c                        | 8 +++-----
 grep.h                        | 4 +---
 t/t7810-grep.sh               | 2 +-
 4 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/Documentation/config/grep.txt b/Documentation/config/grep.txt
index f4b7d3041fb..33e5f3827bc 100644
--- a/Documentation/config/grep.txt
+++ b/Documentation/config/grep.txt
@@ -11,8 +11,7 @@ grep.patternType::
 	value 'default' will return to the default matching behavior.
 
 grep.extendedRegexp::
-	If set to true, enable `--extended-regexp` option by default. This
-	option is ignored when the `grep.patternType` option is set.
+	Deprecated synonym for `grep.patternType=extended`.
 
 grep.threads::
 	Number of grep worker threads to use. If unset (or set to 0), Git will
diff --git a/grep.c b/grep.c
index dda8e536fe3..ef8746d85f0 100644
--- a/grep.c
+++ b/grep.c
@@ -33,9 +33,8 @@ static const char *color_grep_slots[] = {
 
 static int parse_pattern_type_arg(const char *opt, const char *arg)
 {
-	if (!strcmp(arg, "default"))
-		return GREP_PATTERN_TYPE_UNSPECIFIED;
-	else if (!strcmp(arg, "basic"))
+	if (!strcmp(arg, "basic") ||
+	    !strcmp(arg, "default"))
 		return GREP_PATTERN_TYPE_BRE;
 	else if (!strcmp(arg, "extended"))
 		return GREP_PATTERN_TYPE_ERE;
@@ -60,8 +59,7 @@ int grep_config(const char *var, const char *value, void *cb)
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	if (opt->pattern_type_option == GREP_PATTERN_TYPE_UNSPECIFIED &&
-	    !strcmp(var, "grep.extendedregexp") &&
+	if (!strcmp(var, "grep.extendedregexp") &&
 	    git_config_bool(var, value)) {
 		opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
 		return 0;
diff --git a/grep.h b/grep.h
index e4e548aed90..8ef70d125ff 100644
--- a/grep.h
+++ b/grep.h
@@ -94,8 +94,7 @@ enum grep_expr_node {
 };
 
 enum grep_pattern_type {
-	GREP_PATTERN_TYPE_UNSPECIFIED = 0,
-	GREP_PATTERN_TYPE_BRE,
+	GREP_PATTERN_TYPE_BRE = 0,
 	GREP_PATTERN_TYPE_ERE,
 	GREP_PATTERN_TYPE_FIXED,
 	GREP_PATTERN_TYPE_PCRE
@@ -179,7 +178,6 @@ struct grep_opt {
 	.relative = 1, \
 	.pathname = 1, \
 	.max_depth = -1, \
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
 	.colors = { \
 		[GREP_COLOR_CONTEXT] = "", \
 		[GREP_COLOR_FILENAME] = "", \
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index a59a9726357..afca938a4d0 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -461,7 +461,7 @@ do
 	'
 
 	test_expect_success "grep $L with grep.patternType=basic and grep.extendedRegexp=true" '
-		echo "${HC}ab:a+bc" >expected &&
+		echo "${HC}ab:abc" >expected &&
 		git \
 			-c grep.patternType=basic \
 			-c grep.extendedRegexp=true \
-- 
2.34.0.rc1.741.gab7bfd97031



* Re: [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                     ` (7 preceding siblings ...)
  2021-11-10  1:43   ` [PATCH v2 8/8] grep: make "extendedRegexp=true" the same as "patternType=extended" Ævar Arnfjörð Bjarmason
@ 2021-11-10  2:23   ` Taylor Blau
  2021-11-29 14:50   ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2021-12-03 10:19   ` [PATCH v4 " Ævar Arnfjörð Bjarmason
  10 siblings, 0 replies; 151+ messages in thread
From: Taylor Blau @ 2021-11-10  2:23 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, J Smith, Taylor Blau

On Wed, Nov 10, 2021 at 02:43:42AM +0100, Ævar Arnfjörð Bjarmason wrote:
> For v1 and a more extensive summary see [1]. Thanks a lot Taylor for
> the detailed review on v1!

Thanks. I'm going to put this version on my review backlog for now if
you don't mind, since I think it's more pertinent for us to focus on the
upcoming release.

But I look forward to getting back to this once the release has
stabilized a little bit.


Thanks,
Taylor


* Re: [PATCH v2 1/8] grep.h: remove unused "regex_t regexp" from grep_opt
  2021-11-10  1:43   ` [PATCH v2 1/8] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
@ 2021-11-12 16:11     ` Junio C Hamano
  0 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-11-12 16:11 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> This "regex_t" in grep_opt has not been used since
> f9b9faf6f8a (builtin-grep: allow more than one patterns., 2006-05-02).
> We still use a "regex_t" for compiling regexes, but that's in the
> "grep_pat" struct.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  grep.h | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/grep.h b/grep.h
> index 3e8815c347b..95cccb670f9 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -136,7 +136,6 @@ struct grep_opt {
>  
>  	const char *prefix;
>  	int prefix_length;
> -	regex_t regexp;
>  	int linenum;
>  	int columnnum;
>  	int invert;

I would have expected "this used to be used but no longer; only
initialization of and assignment to it remain"; I am somewhat
surprised to see there is no mention of it anywhere in the code ;-)

Good find.  


* Re: [PATCH v2 2/8] built-ins: trust the "prefix" from run_builtin()
  2021-11-10  1:43   ` [PATCH v2 2/8] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
@ 2021-11-12 16:38     ` Junio C Hamano
  0 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-11-12 16:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 3a442631c71..84bed6d5612 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -147,16 +147,15 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  		OPT__ABBREV(&abbrev),
>  		OPT_END()
>  	};
> -
> -	git_config(git_default_config, NULL);
>  	ls_tree_prefix = prefix;

See below.

> -	if (prefix && *prefix)
> +	if (prefix)
>  		chomp_prefix = strlen(prefix);

We now assume a non-NULL prefix means a non-NUL *prefix, so this
change is understandable.
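
A minimal sketch of what that invariant buys a caller. The helper name
"prefix_len" is hypothetical (it is not a function in git); the
assertion mirrors the "NULL or non-empty string" contract under
discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: with the contract "prefix is either NULL or a
 * non-empty string", a plain NULL check is enough and the extra
 * "*prefix" test at each call site becomes redundant.
 */
static size_t prefix_len(const char *prefix)
{
	assert(!prefix || *prefix); /* never an empty string */
	return prefix ? strlen(prefix) : 0;
}
```

Under that contract prefix_len(NULL) is 0 and prefix_len("sub/") is 4;
an empty string would trip the assertion instead of being silently
treated like NULL.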

> +	git_config(git_default_config, NULL);

This moving down of git_config() call is not.  A necessary change,
or an unrelated churn?  If necessary, why?  By not checking if prefix[0]
is NUL, we now need to delay reading the configuration, because ...?

>  	argc = parse_options(argc, argv, prefix, ls_tree_options,
>  			     ls_tree_usage, 0);
>  	if (full_tree) {
> -		ls_tree_prefix = prefix = NULL;
> +		ls_tree_prefix = NULL;
>  		chomp_prefix = 0;
>  	}
>  	/* -d -r should imply -t, but -d by itself should not have to. */
> @@ -178,7 +177,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>  	parse_pathspec(&pathspec, PATHSPEC_ALL_MAGIC &
>  				  ~(PATHSPEC_FROMTOP | PATHSPEC_LITERAL),
>  		       PATHSPEC_PREFER_CWD,
> -		       prefix, argv + 1);
> +		       ls_tree_prefix, argv + 1);

The above two are unnecessary changes.

It was not like the introduction of ls_tree_prefix was made in order
to get rid of "prefix" altogether.  We still have and use prefix,
but we have ls_tree_prefix to expose the value of it to other
functions as a way to "cheat", without having to pass it through as
a parameter (and cheating is OK as its scope is limited to the file).

Perhaps make a conscious effort to refrain from making such
unnecessary changes, especially when working on a multi-patch
series, which may avoid wearing down reviewers?

> diff --git a/git.c b/git.c
> index 5ff21be21f3..611bf2f63eb 100644
> --- a/git.c
> +++ b/git.c
> @@ -420,9 +420,8 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
>  {
>  	int status, help;
>  	struct stat st;
> -	const char *prefix;
> +	const char *prefix = NULL;
>  
> -	prefix = NULL;
>  	help = argc == 2 && !strcmp(argv[1], "-h");
>  	if (!help) {
>  		if (p->option & RUN_SETUP)

Likewise.  Little cuts accumulate.

> @@ -431,6 +430,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
>  			int nongit_ok;
>  			prefix = setup_git_directory_gently(&nongit_ok);
>  		}
> +		assert(!prefix || *prefix);

Good.

> diff --git a/grep.h b/grep.h
> index 95cccb670f9..62deadb885f 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -134,8 +134,6 @@ struct grep_opt {
>  	 */
>  	struct repository *repo;
>  
> -	const char *prefix;
> -	int prefix_length;

So, builtin/grep.c is the only user of the low-level grep machinery
that needs to touch these two, and we can lose these members now
that builtin/grep.c relies on the file-scope global instead.  OK.

> diff --git a/revision.c b/revision.c
> index ab7c1358042..9f9b0d2429e 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1833,7 +1833,7 @@ void repo_init_revisions(struct repository *r,
>  	revs->commit_format = CMIT_FMT_DEFAULT;
>  	revs->expand_tabs_in_log_default = 8;
>  
> -	grep_init(&revs->grep_filter, revs->repo, prefix);
> +	grep_init(&revs->grep_filter, revs->repo);
>  	revs->grep_filter.status_only = 1;
>  
>  	repo_diff_setup(revs->repo, &revs->diffopt);


* Re: [PATCH v2 3/8] log tests: check if grep_config() is called by "log"-like cmds
  2021-11-10  1:43   ` [PATCH v2 3/8] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2021-11-12 17:09     ` Junio C Hamano
  0 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-11-12 17:09 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
> for pattern style options & config, 2017-05-20) to check not only
> whether "git log" handles "grep.patternType", but also "git show"
> etc.
>
> It's sufficient to check whether a PCRE regex matches for the purposes
> of this test; we otherwise assume that it's running the same code as
> "git log", whose behavior is tested more exhaustively by the tests
> added in 9df46763ef1e.

I do agree with the reasoning that it is sufficient to check with
only one kind, but it is unclear if PCRE is so special and if so
why.  Wouldn't testing with, say grep.patternType=extended, alone
also be sufficient?  The reason I am asking it is mostly because the
above description is insufficient to answer these questions, but
also because we can lose the prerequisite and gain a bit wider test
coverage, if we use a pattern that is not PCRE.

Knowing that "show", "whatchanged", "reflog", and "format-patch" all
share the same backend in builtin/log.c, I am not sure how much
value the new tests add.  Testing the "--grep-reflog" option and the
"git rev-list" command may exercise a bit different code path,
though.

> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  t/t4202-log.sh | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/t/t4202-log.sh b/t/t4202-log.sh
> index 7884e3d46b3..a114c49ef27 100755
> --- a/t/t4202-log.sh
> +++ b/t/t4202-log.sh
> @@ -449,6 +449,22 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
>  	)
>  '
>  
> +for cmd in show whatchanged reflog format-patch
> +do
> +	myarg=
> +	if test "$cmd" = "format-patch"
> +	then
> +		myarg="HEAD~.."
> +	fi

I prefer to send the output that ought to contain the grep hits
consistently to "actual", i.e.

	case "$cmd" in
	format-patch) myarg="--stdout -1" ;;
	*) myarg= ;;
	esac	

instead of sending only the filenames.

I also prefer "case/esac" as it would be more concise when we need
to tweak myarg per command later.

> +	test_expect_success PCRE "$cmd: understands grep.patternType=perl, like 'log'" '
> +		git -c grep.patternType=fixed -C pattern-type $cmd --grep="1(?=\|2)" $myarg >actual &&
> +		test_must_be_empty actual &&
> +		git -c grep.patternType=perl -C pattern-type $cmd --grep="1(?=\|2)" $myarg >actual &&
> +		test_file_not_empty actual
> +	'
> +done
> +
>  test_expect_success 'log --author' '
>  	cat >expect <<-\EOF &&
>  	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>

Thanks.


* Re: [PATCH v2 4/8] grep docs: de-duplicate configuration sections
  2021-11-10  1:43   ` [PATCH v2 4/8] grep docs: de-duplicate configuration sections Ævar Arnfjörð Bjarmason
@ 2021-11-12 17:15     ` Junio C Hamano
  0 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-11-12 17:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Include the "config/grep.txt" file in "git-grep.txt", instead of
> repeating an almost identical description of the "grep" configuration
> variables in two places. In a subsequent commit we'll amend this
> documentation, and can now do so in one place instead of two.

Good find.  They are indeed almost identical.  I am not sure about
the value of ...

> +The below documentation is the same as what's found in
> +linkgit:git-config[1]:

... when everybody becomes consistent, but in the meantime, while
some documentation pages are consistent while others are not, I can
see how it might help.

The patch looks good.

Thanks.


* Re: [PATCH v2 5/8] grep.c: don't pass along NULL callback value
  2021-11-10  1:43   ` [PATCH v2 5/8] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
@ 2021-11-12 17:18     ` Junio C Hamano
  0 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-11-12 17:18 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Change grep_cmd_config() top stop passing around the always-NULL "cb"

"Change X top stop passing"?  I cannot guess so I will not say "I'll
fix it to X, no need to resend".

The change itself does seem sensible.

Thanks.

> value. When this code was added in 7e8f59d577e (grep: color patterns
> in output, 2009-03-07) it was non-NULL, but when that changed in
> 15fabd1bbd4 (builtin/grep.c: make configuration callback more
> reusable, 2012-10-09) this code was left behind.
>
> In a subsequent change I'll start using the "cb" value, this will make
> it clear which functions we call need it, and which don't.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/grep.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/grep.c b/builtin/grep.c
> index d85cbabea67..5ec4cecae45 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -285,8 +285,8 @@ static int wait_all(void)
>  
>  static int grep_cmd_config(const char *var, const char *value, void *cb)
>  {
> -	int st = grep_config(var, value, cb);
> -	if (git_color_default_config(var, value, cb) < 0)
> +	int st = grep_config(var, value, NULL);
> +	if (git_color_default_config(var, value, NULL) < 0)
>  		st = -1;
>  
>  	if (!strcmp(var, "grep.threads")) {


* Re: [PATCH v2 6/8] grep API: call grep_config() after grep_init()
  2021-11-10  1:43   ` [PATCH v2 6/8] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2021-11-12 17:32     ` Junio C Hamano
  0 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-11-12 17:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> The grep_init() function used the odd pattern of initializing the
> passed-in "struct grep_opt" with a statically defined "grep_defaults"
> struct, which would be modified in-place when we invoked
> grep_config().
>
> So we effectively (b) initialized config, (a) then defaults, (c)
> followed by user options. Usually those are ordered as "a", "b" and
> "c" instead.
>
> As the comments being removed here show the previous behavior needed
> to be carefully explained as we'd potentially share the populated
> configuration among different instances of grep_init(). In practice we
> didn't do that, but now that it can't be a concern anymore let's
> remove those comments.

OK, so we did this because we wanted to be able to

   1. call grep_config() only once to populate the template;

   2. call grep_init() more than once, and match the grep_opt to
      what the config wanted, without having to call grep_config()
      once per grep_init() invocation.

   3. each invocation of grep_init() in 2. may be followed by
      parse_options() to further tweak grep_opt.

And now we instead have to do

   1. call grep_init()
   2. call grep_config()
   3. parse_options() to tweak

for each instance of grep_opt, which is much more common.

OK.
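
The per-instance ordering above (defaults, then config, then
command-line options) can be sketched as follows. Everything here is a
hypothetical toy: "struct toy_opt", the "toy.ignorecase" key and the
hand-rolled option loop only stand in for grep_init(), grep_config()
and parse_options(), whose real signatures differ.

```c
#include <string.h>

/* Hypothetical, pared-down option struct; not the real "struct grep_opt". */
struct toy_opt {
	int max_depth;
	int ignore_case;
};

/* Compile-time defaults, in the spirit of an *_INIT macro. */
#define TOY_OPT_INIT { .max_depth = -1, .ignore_case = 0 }

/* Step 1: every instance starts from the same pristine defaults. */
static void toy_init(struct toy_opt *opt)
{
	struct toy_opt blank = TOY_OPT_INIT;

	memcpy(opt, &blank, sizeof(*opt));
}

/* Step 2: config overrides defaults; "cb" is the instance being set up. */
static int toy_config(const char *var, const char *value, void *cb)
{
	struct toy_opt *opt = cb;

	if (!strcmp(var, "toy.ignorecase"))
		opt->ignore_case = !strcmp(value, "true");
	return 0;
}

/* Step 3: command-line options override config. */
static void toy_parse_options(struct toy_opt *opt, int argc, const char **argv)
{
	for (int i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "-i"))
			opt->ignore_case = 1;
		else if (!strcmp(argv[i], "--no-ignore-case"))
			opt->ignore_case = 0;
	}
}
```

Because the config callback writes into the instance passed as "cb",
no shared template is modified, and a second toy_init()'d instance is
unaffected by config applied to the first.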

> diff --git a/builtin/log.c b/builtin/log.c
> index f75d87e8d7f..bfddacdfa6c 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -505,8 +505,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
>  		return 0;
>  	}
>  
> -	if (grep_config(var, value, cb) < 0)
> -		return -1;

This used to tweak the "default template", which we no longer use,
so can go?  And in its place ...

>  	if (git_gpg_config(var, value, cb) < 0)
>  		return -1;
>  	return git_diff_ui_config(var, value, cb);
> @@ -521,6 +519,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
>  	git_config(git_log_config, NULL);
>  
>  	repo_init_revisions(the_repository, &rev, prefix);
> +	git_config(grep_config, &rev.grep_filter);
> +

... each command in the "log" family tweaks the grep_opt used for
real from the configuration.

>  	rev.diff = 1;
>  	rev.simplify_history = 0;
>  	memset(&opt, 0, sizeof(opt));
> @@ -635,6 +635,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
>  
>  	memset(&match_all, 0, sizeof(match_all));
>  	repo_init_revisions(the_repository, &rev, prefix);
> +	git_config(grep_config, &rev.grep_filter);
> +

Ditto.  OK, the new pattern makes sense.

> diff --git a/grep.h b/grep.h
> index 62deadb885f..30a7dfd3294 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -171,12 +171,34 @@ struct grep_opt {
>  	int show_hunk_mark;
>  	int file_break;
>  	int heading;
> +	void *caller_priv;

This is unrelated and unexplained change, isn't it?

>  	void *priv;
>  
>  	void (*output)(struct grep_opt *opt, const void *data, size_t size);
>  	void *output_priv;
>  };
>  
> +#define GREP_OPT_INIT { \
> +	.relative = 1, \
> +	.pathname = 1, \
> +	.max_depth = -1, \
> +	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
> +	.colors = { \
> +		[GREP_COLOR_CONTEXT] = "", \
> +		[GREP_COLOR_FILENAME] = "", \
> +		[GREP_COLOR_FUNCTION] = "", \
> +		[GREP_COLOR_LINENO] = "", \
> +		[GREP_COLOR_COLUMNNO] = "", \
> +		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
> +		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
> +		[GREP_COLOR_SELECTED] = "", \
> +		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
> +	}, \
> +	.only_matching = 0, \
> +	.color = -1, \
> +	.output = std_output, \
> +}

Other than the mysterious caller_priv bit, the change makes sense to
me.

Thanks.


* Re: [PATCH v2 7/8] grep: simplify config parsing, change grep.<rx config> interaction
  2021-11-10  1:43   ` [PATCH v2 7/8] grep: simplify config parsing, change grep.<rx config> interaction Ævar Arnfjörð Bjarmason
@ 2021-11-12 19:19     ` Junio C Hamano
  2021-11-13  9:55       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 151+ messages in thread
From: Junio C Hamano @ 2021-11-12 19:19 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Change the interaction between "grep.patternType=default" and
> "grep.extendedRegexp=true" to make setting "grep.extendedRegexp=true"
> synonymous with setting "grep.patternType=extended".

This description alone is not quite understandable.  It is not
saying much more than the single-line title, and its presence does
not seem to improve the readers' understanding.

> When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
> grep.patternType configuration setting, 2012-08-03) we made two
> seemingly contradictory promises:
>
>  1. You can set "grep.patternType", and "[setting it to] 'default'
>     will return to the default matching behavior".
>
>  2. Support the existing "grep.extendedRegexp" option, but ignore it
>     when the new "grep.patternType" is set, *except* "when the
>     `grep.patternType` option is set to a value other than 'default'".

OK, so setting grep.patternType=default makes grep.extendedRegexp
be taken into account.  By setting grep.patternType to something else,
the other one is ignored.  2. is a very explicit way to say so.  Where
did you get 1. from?  If you have this paragraph in the log message
in mind, I agree that it is less than ideally phrased, but ...

    Rather than adding an additional setting for grep.fooRegexp for
    current and future pattern matching options, add a
    grep.patternType setting that can accept appropriate values for
    modifying the default grep pattern matching behavior. The
    current values are "basic", "extended", "fixed", "perl" and
    "default" for setting -G, -E, -F, -P and the default behavior
    respectively.

... with the understanding of 2. (which is in what the commit adds
to Documentation/config.txt), it is reasonable to understand that
"the default behaviour" is "use BRE or ERE, depending on the setting
of grep.extendedRegexp".

Doesn't the code behave that way?  I think the above is exactly how
the commit wanted to make the code behave.

> I think that 84befcd0a4a probably didn't intend this behavior, but
> instead ended up conflating our internal "unspecified" state with a
> user's explicit desire to set the configuration back to the
> default.

I am not sure where that comes from, but if I imagine somebody
confuses between "default" and "basic" and considers "default" a
synonym for "basic", I can sort-of understand it.  Is it what is
happening here?

But it is not what the original .patternType patch wanted to do back
then, and it is not what we want to see now.

> I.e. a user would correctly expect this to keep working:
>
>     # ERE grep
>     git -c grep.extendedRegexp=true grep <pattern>

This makes sense.

> And likewise for "grep.patternType=default" to take precedence over
> the disfavored "grep.extendedRegexp" option, i.e. the usual "last set
> wins" semantics.
>
>     # BRE grep
>     git -c grep.extendedRegexp=true -c grep.patternType=basic grep <pattern>

This makes sense, too.

Do either of the above two not work as you expect (i.e. the first
use ERE and the second use BRE)?
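
For reference, the practical difference between the two modes is what
regcomp() does with and without REG_EXTENDED. A minimal sketch using
only the POSIX <regex.h> interface; the helper name "posix_match" is
hypothetical and this is not how grep.c structures its matching.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Returns 1 if "pattern" matches somewhere in "text", 0 if it does
 * not, and -1 if the pattern fails to compile. "extended" selects
 * ERE (REG_EXTENDED) over the default BRE.
 */
static int posix_match(const char *pattern, const char *text, int extended)
{
	regex_t re;
	int flags = REG_NOSUB | (extended ? REG_EXTENDED : 0);
	int hit;

	if (regcomp(&re, pattern, flags))
		return -1;
	hit = !regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return hit;
}
```

With the pattern "a+bc", BRE treats "+" as a literal character, so it
matches the text "a+bc" but not "abc"; ERE treats "+" as "one or more",
so it matches "abc" but not "a+bc".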

What I have trouble with is that it is unclear if you are describing
what should happen (in the above, I said "makes sense", to show my
agreement, assuming that it is the case), or if you are describing
what does happen that you disagree with. 

Another thing I have trouble with is your mention of "keep working".
Are you proposing to deliberately break what is working as users
correctly expect?  Why?

> But probably not for this to ignore the favored "grep.patternType"
> option entirely, say if /etc/gitconfig was still setting
> "grep.extendedRegexp", but "~/.gitconfig" used the new
> "grep.patternType" (and wanted to use the "default" value):
>
>     # Was ERE, now BRE
>     git -c grep.extendedRegexp=true -c grep.patternType=default grep <pattern>

I do not quite get your "Was X, now Y" label.  What did you want to
say with that?

Also I am not sure what you exactly mean when you say "and wanted to
use the 'default' value".  There is no single "THE" default value.
If patternType=default is the last patterntype (it may be set in
many places, but the last one should win), the user is telling that
the last-one-wins setting of extendedRegexp is to be honored.  So,
if grep.extendedRegexp in /etc/gitconfig is the last one defined, we
would choose between BRE and ERE depending on that setting.

Isn't that what is happening in the current code?

Or is all of the above explanation the result of a simple
misunderstanding that setting to "default" means setting to "basic"?



* Re: [PATCH v2 8/8] grep: make "extendedRegexp=true" the same as "patternType=extended"
  2021-11-10  1:43   ` [PATCH v2 8/8] grep: make "extendedRegexp=true" the same as "patternType=extended" Ævar Arnfjörð Bjarmason
@ 2021-11-12 19:32     ` Junio C Hamano
  0 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-11-12 19:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> In the preceding commit we changed how a "grep.patternType=default"
> set after "grep.extendedRegexp=true" would be handled so that the last
> set would win, but a "grep.extendedRegexp=true" would only be used if
> "grep.patternType" was set to a value other than "default".

I am getting the feeling that my suspicion about your confusion in
7/8 is right.

The grep.patternType=default was merely a backward compatibility
measure for those who were used to grep.extendedRegexp=true/false
way of doing things, and "default" never meant to mean "basic".  It
merely was an instruction to "honor the old extendedRegexp variable
for this old timer".  The intention all along was that patternType
was a more flexible single true way to set the type to supersede
extendedRegexp but we couldn't discontinue the support for the
latter all of a sudden, and the "default" was invented as the
transition mechanism, but it ended up as the mechanism to choose
either new (setting patternType to some other value) or old
(patternType is set to default, making the value given to
extendedRegexp the only thing that matters) world to live in.
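
A simplified model of the pre-series semantics described above. The
names are hypothetical and this is a sketch, not the code in grep.c:
each key is remembered separately (last one wins per key), and the
effective type is resolved only once all config has been read.

```c
/* Hypothetical names; a simplified model, not the code in grep.c. */
enum old_type { OLD_UNSPECIFIED = 0, OLD_BRE, OLD_ERE, OLD_FIXED, OLD_PCRE };

struct old_cfg {
	/* last grep.patternType seen; "default" is stored as OLD_UNSPECIFIED */
	enum old_type pattern_type;
	/* last grep.extendedRegexp seen, as a boolean */
	int extended_regexp;
};

/* Resolve the effective type after all config has been read. */
static enum old_type old_resolve(const struct old_cfg *cfg)
{
	if (cfg->pattern_type != OLD_UNSPECIFIED)
		return cfg->pattern_type; /* new world: extendedRegexp ignored */
	/* old world: only extendedRegexp matters */
	return cfg->extended_regexp ? OLD_ERE : OLD_BRE;
}
```

In this model the relative order of the two keys does not matter: a
"default" pattern type always defers to extendedRegexp, and any other
pattern type always overrides it.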

If you want to change anything in this area, the right thing to do
is rather to _deprecate_ extendedRegexp and eventually remove
extendedRegexp together with the "default" setting to patternType, I
would think, not to change the semantics of extendedRegexp in any
way.

Thanks.


* Re: [PATCH v2 7/8] grep: simplify config parsing, change grep.<rx config> interaction
  2021-11-12 19:19     ` Junio C Hamano
@ 2021-11-13  9:55       ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-13  9:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, J Smith, Taylor Blau


On Fri, Nov 12 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
> [...]

I'll try to reply to all the rest of the feedback, just really quick on
this, because I think it might represent a bit of a Gordian knot.

> Another thing I have trouble with is your mention of "keep working".
> Are you proposing to deliberately break what is working as users
> correctly expect?  Why?

Yes, I'd like to change the behavior, because it makes the grep API much
easier to deal with, and because I think it impacts nobody in practice.

The real goal for this series is that I've got pending patches to
speed up diffcore-pickaxe massively by moving it over to PCRE and
dropping the kwset.c code. An old perf test I dug up for that is in [1].

To do that I needed to re-use the bits of grep.c machinery that deal
with setting up patterns, dealing with BRE,ERE,PCRE etc. elsewhere.

I *can* do that in a different way, but it's going to be much easier if
we can gradually evolve the already working grep API to become an
internal textual pattern matching API. Eventually I'd like to move all
of regcomp()/regexec() over to such a thing, because for any other
random thing we use regexes for we can then get speedups by using PCRE
(and optionally use its interface to understand BRE/ERE syntax).

The alternative is to split that part off from grep.c, which is a bit
more painful, or to have the init bits etc. take some "no config doesn't
go first", "no it goes first" flags just to support this one API user.

So it would be generally useful to know if you're at all open to
that. Reading between the lines in some other comments I fear that it
may be a "no" except if we mark it as deprecated, wait some years, maybe
remove/change it then etc.

1.

    GIT_TEST_LONG= GIT_PERF_REPEAT_COUNT=10 GIT_PERF_MAKE_OPTS='-j8 USE_LIBPCRE=1 CFLAGS=-O3 LIBPCREDIR=/home/avar/g/pcre2/inst' ./run origin/next HEAD -- p4209-pickaxe.sh
    Test                                                                      origin/next       HEAD
    ------------------------------------------------------------------------------------------------------------------
    4209.1: git log -S'int main' <limit-rev>..                                0.38(0.36+0.01)   0.37(0.33+0.04) -2.6%
    4209.2: git log -S'æ' <limit-rev>..                                       0.51(0.47+0.04)   0.32(0.27+0.05) -37.3%
    4209.3: git log --pickaxe-regex -S'(int|void|null)' <limit-rev>..         0.72(0.68+0.03)   0.57(0.54+0.03) -20.8%
    4209.4: git log --pickaxe-regex -S'if *\([^ ]+ & ' <limit-rev>..          0.60(0.55+0.02)   0.39(0.34+0.05) -35.0%
    4209.5: git log --pickaxe-regex -S'[àáâãäåæñøùúûüýþ]' <limit-rev>..       0.43(0.40+0.03)   0.50(0.44+0.06) +16.3%
    4209.6: git log -G'(int|void|null)' <limit-rev>..                         0.64(0.55+0.09)   0.63(0.56+0.05) -1.6%
    4209.7: git log -G'if *\([^ ]+ & ' <limit-rev>..                          0.64(0.59+0.05)   0.63(0.56+0.06) -1.6%
    4209.8: git log -G'[àáâãäåæñøùúûüýþ]' <limit-rev>..                       0.63(0.54+0.08)   0.62(0.55+0.06) -1.6%
    4209.9: git log -i -S'int main' <limit-rev>..                             0.39(0.35+0.03)   0.38(0.35+0.02) -2.6%
    4209.10: git log -i -S'æ' <limit-rev>..                                   0.39(0.33+0.06)   0.32(0.28+0.04) -17.9%
    4209.11: git log -i --pickaxe-regex -S'(int|void|null)' <limit-rev>..     0.90(0.84+0.05)   0.58(0.53+0.04) -35.6%
    4209.12: git log -i --pickaxe-regex -S'if *\([^ ]+ & ' <limit-rev>..      0.71(0.64+0.06)   0.40(0.37+0.03) -43.7%
    4209.13: git log -i --pickaxe-regex -S'[àáâãäåæñøùúûüýþ]' <limit-rev>..   0.43(0.40+0.03)   0.50(0.46+0.04) +16.3%
    4209.14: git log -i -G'(int|void|null)' <limit-rev>..                     0.64(0.57+0.06)   0.62(0.56+0.05) -3.1%
    4209.15: git log -i -G'if *\([^ ]+ & ' <limit-rev>..                      0.65(0.59+0.06)   0.63(0.54+0.08) -3.1%
    4209.16: git log -i -G'[àáâãäåæñøùúûüýþ]' <limit-rev>..                   0.63(0.55+0.08)   0.62(0.56+0.05) -1.6%


* [PATCH v3 0/7] grep: simplify & delete "init" & "config" code
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                     ` (8 preceding siblings ...)
  2021-11-10  2:23   ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Taylor Blau
@ 2021-11-29 14:50   ` Ævar Arnfjörð Bjarmason
  2021-11-29 14:50     ` [PATCH v3 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
                       ` (8 more replies)
  2021-12-03 10:19   ` [PATCH v4 " Ævar Arnfjörð Bjarmason
  10 siblings, 9 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-29 14:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

In v1 and v2[1] of this series more code in grep.c was deleted by
changing what I think is a really obscure interaction between
"grep.extendedRegexp=true" and "grep.patternType".

Junio preferred having a deprecation period[2], so here's a re-roll
that preserves all existing behavior, at the cost of a bit less code
deletion & simplification (from "97 insertions(+), 174 deletions(-)"
to "106 insertions(+), 131 deletions(-)").

Notes on individual patches below. This re-roll should address all
outstanding feedback on v2.

1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20211110T013632Z-avarab@gmail.com/
2. https://lore.kernel.org/git/xmqqk0hdqifg.fsf@gitster.g/

Ævar Arnfjörð Bjarmason (7):
  grep.h: remove unused "regex_t regexp" from grep_opt

No change.

  log tests: check if grep_config() is called by "log"-like cmds

Simplified the test, and it no longer depends (optionally) on PCRE. We
just test BRE vs. ERE instead.

  grep tests: add missing "grep.patternType" config test

A new test to assert existing behavior. We had a blind spot in this
area (it could have hidden a possible regression).

  built-ins: trust the "prefix" from run_builtin()

Substantially the same, but I made some edits as requested to minimize
the diff size / skip any while-at-it cleanups.

  grep.c: don't pass along NULL callback value

Trivial commit message typo fix.

  grep API: call grep_config() after grep_init()

Removed stray leftover WIP code (an unused "caller_priv" struct
member), oops.

  grep: simplify config parsing and option parsing

Mostly new. It replaces the last two commits of v2 (now ejected), which
changed obscure behavior. We still delete most of the code involved in
this part of grep initialization and config handling, but now with no
changes in existing behavior.
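
As a rough illustration of the behavior being preserved, here is a
minimal standalone model of the config handling, loosely following the
grep_config() hunk in the range-diff for this patch. It is a sketch, not
git's code: "struct model_opt", "model_config()" and "parse_bool()" are
hypothetical names, the boolean parser only understands "true", and
unknown "grep.patternType" values fall back to BRE instead of dying.

```c
#include <string.h>

enum pattern_type { TYPE_BRE = 0, TYPE_ERE, TYPE_FIXED, TYPE_PCRE };

struct model_opt {
	int extended_regexp;	/* 0 = unset or false, 1 = true, -1 = ignore */
	enum pattern_type type;
};

/* Simplification: the real boolean parser accepts more spellings. */
static int parse_bool(const char *value)
{
	return !strcmp(value, "true");
}

/* Feed one (already lowercased) config key and its value into the model. */
static void model_config(struct model_opt *opt, const char *var,
			 const char *value)
{
	if (!strcmp(var, "grep.extendedregexp")) {
		/* Already true, or a non-"default" patternType was seen. */
		if (opt->extended_regexp)
			return;
		opt->extended_regexp = parse_bool(value);
		if (opt->extended_regexp)
			opt->type = TYPE_ERE;
		return;
	}

	if (strcmp(var, "grep.patterntype"))
		return;

	if (!strcmp(value, "default")) {
		/* "default" means BRE, unless extendedRegexp=true was seen. */
		opt->type = opt->extended_regexp == 1 ? TYPE_ERE : TYPE_BRE;
		return;
	}

	/* Any other patternType makes us ignore grep.extendedRegexp. */
	opt->extended_regexp = -1;
	if (!strcmp(value, "extended"))
		opt->type = TYPE_ERE;
	else if (!strcmp(value, "fixed"))
		opt->type = TYPE_FIXED;
	else if (!strcmp(value, "perl"))
		opt->type = TYPE_PCRE;
	else
		opt->type = TYPE_BRE;	/* "basic"; also unknown values here */
}
```

Feeding this model "grep.extendedregexp=true" followed by
"grep.patterntype=default" leaves it at ERE, while
"grep.patterntype=extended" followed by "grep.patterntype=default"
leaves it at BRE, which is what the t7810 tests in this series expect.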

This is now also ejected:

  grep docs: de-duplicate configuration sections
  (https://lore.kernel.org/git/patch-v2-4.8-efe95397d72-20211110T013632Z-avarab@gmail.com/)

Since I don't need to change any documentation anymore, moving around
the grep docs to live in one place isn't within the scope of this
series.

 builtin/grep.c    |  27 +++++-----
 builtin/log.c     |  13 ++++-
 builtin/ls-tree.c |   2 +-
 git.c             |   1 +
 grep.c            | 124 ++++++++--------------------------------------
 grep.h            |  33 ++++++++----
 revision.c        |   4 +-
 t/t4202-log.sh    |  24 +++++++++
 t/t7810-grep.sh   |   9 ++++
 9 files changed, 106 insertions(+), 131 deletions(-)

Range-diff against v2:
1:  1435db727ef = 1:  71ff51cb3c9 grep.h: remove unused "regex_t regexp" from grep_opt
3:  41e38ebb32c ! 2:  ec8e42ced1a log tests: check if grep_config() is called by "log"-like cmds
    @@ Commit message
         whether "git log" handles "grep.patternType", but also "git show"
         etc.
     
    -    It's sufficient to check whether a PCRE regex matches for the purposes
    -    of this test, we otherwise assume that it's running the same code as
    -    "git log", whose behavior is tested more exhaustively by test added in
    -    9df46763ef1e.
    +    It's sufficient to check whether we match a "fixed" or a "basic" regex
    +    here to see if these codepaths correctly invoked grep_config(). We
    +    don't need to check the details of their regular expression matching
    +    as the "log" test does.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    @@ t/t4202-log.sh: test_expect_success !FAIL_PREREQS 'log with various grep.pattern
      
     +for cmd in show whatchanged reflog format-patch
     +do
    -+	myarg=
    -+	if test "$cmd" = "format-patch"
    -+	then
    -+		myarg="HEAD~.."
    -+	fi
    ++	case "$cmd" in
    ++	format-patch) myarg="HEAD~.." ;;
    ++	*) myarg= ;;
    ++	esac
     +
    -+	test_expect_success PCRE "$cmd: understands grep.patternType=perl, like 'log'" '
    -+		git -c grep.patternType=fixed -C pattern-type $cmd --grep="1(?=\|2)" $myarg >actual &&
    -+		test_must_be_empty actual &&
    -+		git -c grep.patternType=perl -C pattern-type $cmd --grep="1(?=\|2)" $myarg >actual &&
    -+		test_file_not_empty actual
    ++	test_expect_success "$cmd: understands grep.patternType, like 'log'" '
    ++		git init "pattern-type-$cmd" &&
    ++		(
    ++			cd "pattern-type-$cmd" &&
    ++			test_commit 1 file A &&
    ++			test_commit "(1|2)" file B 2 &&
    ++
    ++			git -c grep.patternType=fixed $cmd --grep="..." $myarg >actual &&
    ++			test_must_be_empty actual &&
    ++
    ++			git -c grep.patternType=basic $cmd --grep="..." $myarg >actual &&
    ++			test_file_not_empty actual
    ++		)
     +	'
     +done
    ++test_done
     +
      test_expect_success 'log --author' '
      	cat >expect <<-\EOF &&
8:  cc904d93b26 ! 3:  fcad1b1664b grep: make "extendedRegexp=true" the same as "patternType=extended"
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    grep: make "extendedRegexp=true" the same as "patternType=extended"
    +    grep tests: add missing "grep.patternType" config test
     
    -    In the preceding commit we changed how a "grep.patternType=default"
    -    set after "grep.extendedRegexp=true" would be handled so that the last
    -    set would win, but a "grep.extendedRegexp=true" would only be used if
    -    "grep.patternType" was set to a value other than "default".
    -
    -    Thus a user who had old config and set "grep.extendedRegexp=true" in
    -    their ~/.gitconfig expecting ERE behavior would be opted-in to say
    -    "perl" regexes if a system "/etc/gitconfig" started setting
    -    "grep.patternType=perl".
    -
    -    These funny semantics of only paying attention to a set if another key
    -    is not set to a given value aren't how we treat other config keys, so
    -    let's do away with this caveat for consistency.
    -
    -    The new semantics are simple, a "grep.extendedRegexp=true" is an exact
    -    synonym for specifying "grep.patternType=extended" in the
    -    config. We'll keep ignoring ""grep.extendedRegexp=false", although
    -    arguably we could treat it as a "grep.patternType=basic".
    -
    -    As argued in the preceding commit I think this behavior came about
    -    because we were conflating the state of our code's own internal
    -    "default" value with what we found in explicit user config. See
    -    84befcd0a4a (grep: add a grep.patternType configuration setting,
    -    2012-08-03) for that past behavior.
    -
    -    Let's further change the documentation to note that
    -    "grep.extendedRegexp" is a deprecated synonym, perhaps we'll be able
    -    to remove it at some point in the future and do away with this
    -    special-case entirely.
    +    Extend the grep tests to assert that setting
    +    "grep.patternType=extended" followed by "grep.patternType=default"
    +    will behave as if "--basic-regexp" was provided, and not as
    +    "--extended-regexp". In a subsequent commit we'll need to treat
    +    "grep.patternType=default" as a special-case, but let's make sure we
    +    don't ignore it if "grep.patternType" was set to a non-"default" value
    +    before.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    - ## Documentation/config/grep.txt ##
    -@@ Documentation/config/grep.txt: grep.patternType::
    - 	value 'default' will return to the default matching behavior.
    - 
    - grep.extendedRegexp::
    --	If set to true, enable `--extended-regexp` option by default. This
    --	option is ignored when the `grep.patternType` option is set.
    -+	Deprecated synonym for 'grep.patternType=extended`.
    - 
    - grep.threads::
    - 	Number of grep worker threads to use. If unset (or set to 0), Git will
    -
    - ## grep.c ##
    -@@ grep.c: static const char *color_grep_slots[] = {
    - 
    - static int parse_pattern_type_arg(const char *opt, const char *arg)
    - {
    --	if (!strcmp(arg, "default"))
    --		return GREP_PATTERN_TYPE_UNSPECIFIED;
    --	else if (!strcmp(arg, "basic"))
    -+	if (!strcmp(arg, "basic") ||
    -+	    !strcmp(arg, "default"))
    - 		return GREP_PATTERN_TYPE_BRE;
    - 	else if (!strcmp(arg, "extended"))
    - 		return GREP_PATTERN_TYPE_ERE;
    -@@ grep.c: int grep_config(const char *var, const char *value, void *cb)
    - 	if (userdiff_config(var, value) < 0)
    - 		return -1;
    - 
    --	if (opt->pattern_type_option == GREP_PATTERN_TYPE_UNSPECIFIED &&
    --	    !strcmp(var, "grep.extendedregexp") &&
    -+	if (!strcmp(var, "grep.extendedregexp") &&
    - 	    git_config_bool(var, value)) {
    - 		opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
    - 		return 0;
    -
    - ## grep.h ##
    -@@ grep.h: enum grep_expr_node {
    - };
    - 
    - enum grep_pattern_type {
    --	GREP_PATTERN_TYPE_UNSPECIFIED = 0,
    --	GREP_PATTERN_TYPE_BRE,
    -+	GREP_PATTERN_TYPE_BRE = 0,
    - 	GREP_PATTERN_TYPE_ERE,
    - 	GREP_PATTERN_TYPE_FIXED,
    - 	GREP_PATTERN_TYPE_PCRE
    -@@ grep.h: struct grep_opt {
    - 	.relative = 1, \
    - 	.pathname = 1, \
    - 	.max_depth = -1, \
    --	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
    - 	.colors = { \
    - 		[GREP_COLOR_CONTEXT] = "", \
    - 		[GREP_COLOR_FILENAME] = "", \
    -
      ## t/t7810-grep.sh ##
     @@ t/t7810-grep.sh: do
    + 		test_cmp expected actual
      	'
      
    - 	test_expect_success "grep $L with grep.patternType=basic and grep.extendedRegexp=true" '
    --		echo "${HC}ab:a+bc" >expected &&
    -+		echo "${HC}ab:abc" >expected &&
    ++	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
    ++		echo "${HC}ab:a+bc" >expected &&
    ++		git \
    ++			-c grep.patternType=extended \
    ++			-c grep.patternType=default \
    ++			grep "a+b*c" $H ab >actual &&
    ++		test_cmp expected actual
    ++	'
    ++
    + 	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
    + 		echo "${HC}ab:abc" >expected &&
      		git \
    - 			-c grep.patternType=basic \
    - 			-c grep.extendedRegexp=true \
2:  63cf2fe266d ! 4:  854ffe8d0b9 built-ins: trust the "prefix" from run_builtin()
    @@ builtin/grep.c: int cmd_grep(int argc, const char **argv, const char *prefix)
     
      ## builtin/ls-tree.c ##
     @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *prefix)
    - 		OPT__ABBREV(&abbrev),
    - 		OPT_END()
    - 	};
    --
    --	git_config(git_default_config, NULL);
    + 
    + 	git_config(git_default_config, NULL);
      	ls_tree_prefix = prefix;
     -	if (prefix && *prefix)
     +	if (prefix)
      		chomp_prefix = strlen(prefix);
      
    -+	git_config(git_default_config, NULL);
      	argc = parse_options(argc, argv, prefix, ls_tree_options,
    - 			     ls_tree_usage, 0);
    - 	if (full_tree) {
    --		ls_tree_prefix = prefix = NULL;
    -+		ls_tree_prefix = NULL;
    - 		chomp_prefix = 0;
    - 	}
    - 	/* -d -r should imply -t, but -d by itself should not have to. */
    -@@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *prefix)
    - 	parse_pathspec(&pathspec, PATHSPEC_ALL_MAGIC &
    - 				  ~(PATHSPEC_FROMTOP | PATHSPEC_LITERAL),
    - 		       PATHSPEC_PREFER_CWD,
    --		       prefix, argv + 1);
    -+		       ls_tree_prefix, argv + 1);
    - 	for (i = 0; i < pathspec.nr; i++)
    - 		pathspec.items[i].nowildcard_len = pathspec.items[i].len;
    - 	pathspec.has_wildcard = 0;
     
      ## git.c ##
    -@@ git.c: static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
    - {
    - 	int status, help;
    - 	struct stat st;
    --	const char *prefix;
    -+	const char *prefix = NULL;
    - 
    --	prefix = NULL;
    - 	help = argc == 2 && !strcmp(argv[1], "-h");
    - 	if (!help) {
    - 		if (p->option & RUN_SETUP)
     @@ git.c: static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
      			int nongit_ok;
      			prefix = setup_git_directory_gently(&nongit_ok);
4:  efe95397d72 < -:  ----------- grep docs: de-duplicate configuration sections
5:  d0f0ac6c7ae ! 5:  2536eae2c32 grep.c: don't pass along NULL callback value
    @@ Metadata
      ## Commit message ##
         grep.c: don't pass along NULL callback value
     
    -    Change grep_cmd_config() top stop passing around the always-NULL "cb"
    +    Change grep_cmd_config() to stop passing around the always-NULL "cb"
         value. When this code was added in 7e8f59d577e (grep: color patterns
         in output, 2009-03-07) it was non-NULL, but when that changed in
         15fabd1bbd4 (builtin/grep.c: make configuration callback more
6:  917944f79a5 ! 6:  4e1be7c165b grep API: call grep_config() after grep_init()
    @@ grep.c: int grep_config(const char *var, const char *value, void *cb)
     
      ## grep.h ##
     @@ grep.h: struct grep_opt {
    - 	int show_hunk_mark;
    - 	int file_break;
    - 	int heading;
    -+	void *caller_priv;
    - 	void *priv;
    - 
    - 	void (*output)(struct grep_opt *opt, const void *data, size_t size);
      	void *output_priv;
      };
      
7:  140a7416223 ! 7:  f40ab932cb1 grep: simplify config parsing, change grep.<rx config> interaction
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    grep: simplify config parsing, change grep.<rx config> interaction
    +    grep: simplify config parsing and option parsing
     
    -    Change the interaction between "grep.patternType=default" and
    -    "grep.extendedRegexp=true" to make setting "grep.extendedRegexp=true"
    -    synonymous with setting "grep.patternType=extended".
    -
    -    This changes our existing config parsing behavior as detailed below,
    -    but in a way that's consistent with how we parse other
    -    configuration.
    -
    -    We are breaking past promises here, but I doubt that this will impact
    -    anyone in practice. The reduction in complexity and resulting
    -    consistency with other default config behavior is worth it.
    +    Simplify the parsing of "grep.patternType" and
    +    "grep.extendedRegexp". This changes no behavior, but gets rid of
    +    complex parsing logic that isn't needed anymore.
     
         When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
    -    grep.patternType configuration setting, 2012-08-03) we made two
    -    seemingly contradictory promises:
    +    grep.patternType configuration setting, 2012-08-03) we promised that:
     
          1. You can set "grep.patternType", and "[setting it to] 'default'
             will return to the default matching behavior".
     
    -     2. Support the existing "grep.extendedRegexp" option, but ignore it
    -        when the new "grep.patternType" is set, *except* "when the
    -        `grep.patternType` option is set. to a value other than 'default'".
    -
    -    I think that 84befcd0a4a probably didn't intend this behavior, but
    -    instead ended up conflating our internal "unspecified" state with a
    -    user's explicit desire to set the configuration back to the
    -    default.
    -
    -    I.e. a user would correctly expect this to keep working:
    -
    -        # ERE grep
    -        git -c grep.extendedRegexp=true grep <pattern>
    -
    -    And likewise for "grep.patternType=default" to take precedence over
    -    the disfavored "grep.extendedRegexp" option, i.e. the usual "last set
    -    wins" semantics.
    +     2. We'd support the existing "grep.extendedRegexp" option, but ignore
    +        it when the new "grep.patternType" option is set. We said we'd
    +        only ignore the older "grep.extendedRegexp" option "when the
    +        `grep.patternType` option is set. to a value other than
    +        'default'".
     
    -        # BRE grep
    -        git -c grep.extendedRegexp=true -c grep.patternType=basic grep <pattern>
    +    In a preceding commit we changed grep_config() to be called after
    +    grep_init(), which means that much of the complexity here can go
    +    away.
     
    -    But probably not for this to ignore the favored "grep.patternType"
    -    option entirely, say if /etc/gitconfig was still setting
    -    "grep.extendedRegexp", but "~/.gitconfig" used the new
    -    "grep.patternType" (and wanted to use the "default" value):
    -
    -        # Was ERE, now BRE
    -        git -c grep.extendedRegexp=true grep.patternType=default grep <pattern>
    -
    -    I think that in practice nobody or almost nobody is going to be
    -    relying on this obscure interaction, and as shown here it makes the
    -    config parsing much simpler. We no longer have to carry a complex
    -    state machine in "grep_commit_pattern_type()" and
    -    "grep_set_pattern_type_option()".
    -
    -    We can also do away with the "int fixed" and "int pcre2" members in
    -    favor of using "pattern_type_option" directly in "grep.c", as well as
    -    dropping the "pattern_type_arg" variable in "builtin/grep.c" in favor
    -    of using the "pattern_type_option" member directly.
    +    Now as before when we only understand a "grep.extendedRegexp" setting
    +    of "true", and if "grep.patterntype=default" is set we'll interpret it
    +    as "grep.patterntype=basic", except if we previously saw a
    +    "grep.extendedRegexp", then it's interpreted as
    +    "grep.patterntype=extended".
     
         See my 07a3d411739 (grep: remove regflags from the public grep_opt
         API, 2017-06-29) for addition of the two comments being removed here,
         i.e. the complexity noted in that commit is now going away.
     
    -    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    +    We don't need grep_commit_pattern_type() anymore, we can instead have
    +    OPT_SET_INT() in "builtin/grep.c" manipulate the "pattern_type_option"
    +    member in "struct grep_opt" directly.
     
    - ## Documentation/config/grep.txt ##
    -@@ Documentation/config/grep.txt: grep.patternType::
    - 
    - grep.extendedRegexp::
    - 	If set to true, enable `--extended-regexp` option by default. This
    --	option is ignored when the `grep.patternType` option is set to a value
    --	other than 'default'.
    -+	option is ignored when the `grep.patternType` option is set.
    - 
    - grep.threads::
    - 	Number of grep worker threads to use. If unset (or set to 0), Git will
    +    We can also do away with the indirection of the "int fixed" and "int
    +    pcre2" members in favor of using "pattern_type_option" directly in
    +    "grep.c".
    +
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/grep.c ##
     @@ builtin/grep.c: int cmd_grep(int argc, const char **argv, const char *prefix)
    @@ builtin/grep.c: int cmd_grep(int argc, const char **argv, const char *prefix)
      		int fallback = 0;
     
      ## grep.c ##
    +@@ grep.c: static const char *color_grep_slots[] = {
    + 
    + static int parse_pattern_type_arg(const char *opt, const char *arg)
    + {
    +-	if (!strcmp(arg, "default"))
    +-		return GREP_PATTERN_TYPE_UNSPECIFIED;
    +-	else if (!strcmp(arg, "basic"))
    ++	if (!strcmp(arg, "basic"))
    + 		return GREP_PATTERN_TYPE_BRE;
    + 	else if (!strcmp(arg, "extended"))
    + 		return GREP_PATTERN_TYPE_ERE;
     @@ grep.c: int grep_config(const char *var, const char *value, void *cb)
    - 	if (userdiff_config(var, value) < 0)
      		return -1;
      
    --	if (!strcmp(var, "grep.extendedregexp")) {
    --		opt->extended_regexp_option = git_config_bool(var, value);
    -+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_UNSPECIFIED &&
    -+	    !strcmp(var, "grep.extendedregexp") &&
    -+	    git_config_bool(var, value)) {
    -+		opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
    + 	if (!strcmp(var, "grep.extendedregexp")) {
    ++		if (opt->extended_regexp_option)
    ++			return 0;
    + 		opt->extended_regexp_option = git_config_bool(var, value);
    ++		if (opt->extended_regexp_option)
    ++			opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
    ++		return 0;
    ++	}
    ++
    ++	if (!strcmp(var, "grep.patterntype") &&
    ++	    !strcmp(value, "default")) {
    ++		opt->pattern_type_option = opt->extended_regexp_option == 1
    ++			? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
      		return 0;
      	}
      
    + 	if (!strcmp(var, "grep.patterntype")) {
    ++		opt->extended_regexp_option = -1; /* ignore */
    + 		opt->pattern_type_option = parse_pattern_type_arg(var, value);
    + 		return 0;
    + 	}
     @@ grep.c: void grep_init(struct grep_opt *opt, struct repository *repo)
      	opt->header_tail = &opt->header_list;
      }
    @@ grep.c: static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
      	if (err) {
     
      ## grep.h ##
    +@@ grep.h: enum grep_expr_node {
    + };
    + 
    + enum grep_pattern_type {
    +-	GREP_PATTERN_TYPE_UNSPECIFIED = 0,
    +-	GREP_PATTERN_TYPE_BRE,
    ++	GREP_PATTERN_TYPE_BRE = 0,
    + 	GREP_PATTERN_TYPE_ERE,
    + 	GREP_PATTERN_TYPE_FIXED,
    + 	GREP_PATTERN_TYPE_PCRE
     @@ grep.h: struct grep_opt {
      	int unmatch_name_only;
      	int count;
    @@ grep.h: struct grep_opt {
      	int pathname;
      	int null_following_name;
     @@ grep.h: struct grep_opt {
    - 	int max_depth;
      	int funcname;
      	int funcbody;
    --	int extended_regexp_option;
    + 	int extended_regexp_option;
     -	int pattern_type_option;
     +	enum grep_pattern_type pattern_type_option;
      	int ignore_locale;
      	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
      	unsigned pre_context;
     @@ grep.h: struct grep_opt {
    + 	.relative = 1, \
    + 	.pathname = 1, \
    + 	.max_depth = -1, \
    +-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
    + 	.colors = { \
    + 		[GREP_COLOR_CONTEXT] = "", \
    + 		[GREP_COLOR_FILENAME] = "", \
    +@@ grep.h: struct grep_opt {
      
      int grep_config(const char *var, const char *value, void *);
      void grep_init(struct grep_opt *, struct repository *repo);
    @@ revision.c: int setup_revisions(int argc, const char **argv, struct rev_info *re
      	if (!is_encoding_utf8(get_log_output_encoding()))
      		revs->grep_filter.ignore_locale = 1;
      	compile_grep_patterns(&revs->grep_filter);
    -
    - ## t/t7810-grep.sh ##
    -@@ t/t7810-grep.sh: do
    - 	'
    - 
    - 	test_expect_success "grep $L with grep.extendedRegexp=true and grep.patternType=default" '
    --		echo "${HC}ab:abc" >expected &&
    -+		echo "${HC}ab:a+bc" >expected &&
    - 		git \
    - 			-c grep.extendedRegexp=true \
    - 			-c grep.patternType=default \
-- 
2.34.1.841.gf15fb7e6f34



* [PATCH v3 1/7] grep.h: remove unused "regex_t regexp" from grep_opt
  2021-11-29 14:50   ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
@ 2021-11-29 14:50     ` Ævar Arnfjörð Bjarmason
  2021-11-29 14:50     ` [PATCH v3 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
                       ` (7 subsequent siblings)
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-29 14:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This "regex_t" in grep_opt has not been used since
f9b9faf6f8a (builtin-grep: allow more than one patterns., 2006-05-02).
We still use a "regex_t" for compiling regexes, but that's in the
"grep_pat" struct.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.h b/grep.h
index 3e8815c347b..95cccb670f9 100644
--- a/grep.h
+++ b/grep.h
@@ -136,7 +136,6 @@ struct grep_opt {
 
 	const char *prefix;
 	int prefix_length;
-	regex_t regexp;
 	int linenum;
 	int columnnum;
 	int invert;
-- 
2.34.1.841.gf15fb7e6f34



* [PATCH v3 2/7] log tests: check if grep_config() is called by "log"-like cmds
  2021-11-29 14:50   ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2021-11-29 14:50     ` [PATCH v3 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
@ 2021-11-29 14:50     ` Ævar Arnfjörð Bjarmason
  2022-03-04  8:57       ` Tests in t4202 are aborted early, was: Re: [PATCH v3 2/7] log Fabian Stelzer
  2021-11-29 14:50     ` [PATCH v3 3/7] grep tests: add missing "grep.patternType" config test Ævar Arnfjörð Bjarmason
                       ` (6 subsequent siblings)
  8 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-29 14:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
for pattern style options & config, 2017-05-20) to check not only
whether "git log" handles "grep.patternType", but also "git show"
etc.

It's sufficient to check whether we match a "fixed" or a "basic" regex
here to see if these codepaths correctly invoked grep_config(). We
don't need to check the details of their regular expression matching
as the "log" test does.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 7884e3d46b3..11bb25440b0 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -449,6 +449,30 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
 	)
 '
 
+for cmd in show whatchanged reflog format-patch
+do
+	case "$cmd" in
+	format-patch) myarg="HEAD~.." ;;
+	*) myarg= ;;
+	esac
+
+	test_expect_success "$cmd: understands grep.patternType, like 'log'" '
+		git init "pattern-type-$cmd" &&
+		(
+			cd "pattern-type-$cmd" &&
+			test_commit 1 file A &&
+			test_commit "(1|2)" file B 2 &&
+
+			git -c grep.patternType=fixed $cmd --grep="..." $myarg >actual &&
+			test_must_be_empty actual &&
+
+			git -c grep.patternType=basic $cmd --grep="..." $myarg >actual &&
+			test_file_not_empty actual
+		)
+	'
+done
+test_done
+
 test_expect_success 'log --author' '
 	cat >expect <<-\EOF &&
 	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>
-- 
2.34.1.841.gf15fb7e6f34



* [PATCH v3 3/7] grep tests: add missing "grep.patternType" config test
  2021-11-29 14:50   ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2021-11-29 14:50     ` [PATCH v3 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
  2021-11-29 14:50     ` [PATCH v3 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2021-11-29 14:50     ` Ævar Arnfjörð Bjarmason
  2021-11-29 21:52       ` Junio C Hamano
  2021-11-29 14:50     ` [PATCH v3 4/7] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
                       ` (5 subsequent siblings)
  8 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-29 14:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the grep tests to assert that setting
"grep.patternType=extended" followed by "grep.patternType=default"
will behave as if "--basic-regexp" was provided, and not as
"--extended-regexp". In a subsequent commit we'll need to treat
"grep.patternType=default" as a special-case, but let's make sure we
don't ignore it if "grep.patternType" was set to a non-"default" value
before.
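
For readers unfamiliar with why the pattern "a+b*c" tells the two
syntaxes apart: as a BRE the "+" is a literal character, so the pattern
matches the line "a+bc" but not "abc"; as an ERE the "+" is a
quantifier, so it matches "abc" but not "a+bc". A minimal sketch using
plain POSIX regcomp()/regexec() follows. The helper name
"posix_match()" is made up for this illustration, and this is not how
grep.c itself compiles patterns.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Returns 1 if "line" matches "pattern", 0 if it does not, and -1 if
 * the pattern fails to compile. "extended" selects ERE over BRE.
 */
static int posix_match(const char *pattern, const char *line, int extended)
{
	regex_t re;
	int flags = REG_NOSUB | (extended ? REG_EXTENDED : 0);
	int matched;

	if (regcomp(&re, pattern, flags))
		return -1;
	matched = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return matched;
}
```

With this helper, posix_match("a+b*c", "a+bc", 0) reports a match and
posix_match("a+b*c", "abc", 0) does not; passing 1 for "extended" flips
both results.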

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7810-grep.sh | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 6b6423a07c3..724b1bbbc1c 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -451,6 +451,15 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
 		echo "${HC}ab:abc" >expected &&
 		git \
-- 
2.34.1.841.gf15fb7e6f34



* [PATCH v3 4/7] built-ins: trust the "prefix" from run_builtin()
  2021-11-29 14:50   ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                       ` (2 preceding siblings ...)
  2021-11-29 14:50     ` [PATCH v3 3/7] grep tests: add missing "grep.patternType" config test Ævar Arnfjörð Bjarmason
@ 2021-11-29 14:50     ` Ævar Arnfjörð Bjarmason
  2021-11-29 14:50     ` [PATCH v3 5/7] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
                       ` (4 subsequent siblings)
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-29 14:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the
"prefix" passed from "run_builtin()". The "prefix" we get from setup.c
is either going to be NULL or a string of length >0, never "".

So we can drop the "prefix && *prefix" checks added for
"builtin/grep.c" in 0d042fecf2f (git-grep: show pathnames relative to
the current directory, 2006-08-11), and for "builtin/ls-tree.c" in
a69dd585fca (ls-tree: chomp leading directories when run from a
subdirectory, 2005-12-23).

As seen in code in revision.c that was added in cd676a51367 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that does away with this
assertion.

This makes it easier to reason about a change to the "prefix_length"
code in grep.c in a subsequent commit, and since we're going to the
trouble of doing that, let's leave behind an assert() to promise this
to any future callers.

For "builtin/grep.c" it would be painful to pass the "prefix" down the
callchain of:

    cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
    grep_source_name

So for the code that needs it in grep_source_name() let's add a
"grep_prefix" variable similar to the existing "ls_tree_prefix".

While at it let's move the code in cmd_ls_tree() around so that we
assign to the "ls_tree_prefix" right after declaring the variables,
and stop assigning to "prefix". We only subsequently used that
variable later in the function after clobbering it. Let's just use our
own "grep_prefix" instead.

Let's also add an assert() in git.c, so that we'll make this promise
about the "prefix" to any current and future callers, as well as to
any readers of the code.
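
As a minimal sketch of what that promise buys a caller (the helper
names chomp_len_old() and chomp_len_new() are hypothetical and exist
only for this illustration, they are not in git), the two-part check
collapses to a plain NULL check once the invariant is asserted:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-ins, not functions from git: they model how a
 * caller derives a length from "prefix" before and after this change.
 */

/* Old style: defend against both NULL and "" */
static size_t chomp_len_old(const char *prefix)
{
	if (prefix && *prefix)
		return strlen(prefix);
	return 0;
}

/* New style: rely on the caller's promise, and state it once */
static size_t chomp_len_new(const char *prefix)
{
	assert(!prefix || *prefix); /* NULL or non-empty, never "" */
	if (prefix)
		return strlen(prefix);
	return 0;
}
```

For any "prefix" that honors the invariant both helpers return the
same length; the new style documents the promise instead of silently
tolerating an empty string.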

Code history:

 * The strlen() in "grep.c" hasn't been used since 493b7a08d80 (grep:
   accept relative paths outside current working directory, 2009-09-05).

   When that code was added in 0d042fecf2f (git-grep: show pathnames
   relative to the current directory, 2006-08-11) we used the length.

   But since 493b7a08d80 we haven't used it for anything except a
   boolean check that we could have done on the "prefix" member
   itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c    | 13 ++++++++-----
 builtin/ls-tree.c |  2 +-
 git.c             |  1 +
 grep.c            |  4 +---
 grep.h            |  4 +---
 revision.c        |  2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 9e34a820ad4..d85cbabea67 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,6 +26,8 @@
 #include "object-store.h"
 #include "packfile.h"
 
+static const char *grep_prefix;
+
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
 	NULL
@@ -315,11 +317,11 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 	strbuf_reset(out);
 
 	if (opt->null_following_name) {
-		if (opt->relative && opt->prefix_length) {
+		if (opt->relative && grep_prefix) {
 			struct strbuf rel_buf = STRBUF_INIT;
 			const char *rel_name =
 				relative_path(filename + tree_name_len,
-					      opt->prefix, &rel_buf);
+					      grep_prefix, &rel_buf);
 
 			if (tree_name_len)
 				strbuf_add(out, filename, tree_name_len);
@@ -332,8 +334,8 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 		return;
 	}
 
-	if (opt->relative && opt->prefix_length)
-		quote_path(filename + tree_name_len, opt->prefix, out, 0);
+	if (opt->relative && grep_prefix)
+		quote_path(filename + tree_name_len, grep_prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
 
@@ -962,9 +964,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_END()
 	};
+	grep_prefix = prefix;
 
 	git_config(grep_cmd_config, NULL);
-	grep_init(&opt, the_repository, prefix);
+	grep_init(&opt, the_repository);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c71..6cb554cbb0a 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -150,7 +150,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 	ls_tree_prefix = prefix;
-	if (prefix && *prefix)
+	if (prefix)
 		chomp_prefix = strlen(prefix);
 
 	argc = parse_options(argc, argv, prefix, ls_tree_options,
diff --git a/git.c b/git.c
index 5ff21be21f3..6cf48adeb1f 100644
--- a/git.c
+++ b/git.c
@@ -431,6 +431,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 			int nongit_ok;
 			prefix = setup_git_directory_gently(&nongit_ok);
 		}
+		assert(!prefix || *prefix);
 		precompose_argv_prefix(argc, argv, NULL);
 		if (use_pager == -1 && p->option & (RUN_SETUP | RUN_SETUP_GENTLY) &&
 		    !(p->option & DELAY_PAGER_CONFIG))
diff --git a/grep.c b/grep.c
index fe847a0111a..12b202598a9 100644
--- a/grep.c
+++ b/grep.c
@@ -139,13 +139,11 @@ int grep_config(const char *var, const char *value, void *cb)
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
  */
-void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
+void grep_init(struct grep_opt *opt, struct repository *repo)
 {
 	*opt = grep_defaults;
 
 	opt->repo = repo;
-	opt->prefix = prefix;
-	opt->prefix_length = (prefix && *prefix) ? strlen(prefix) : 0;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
diff --git a/grep.h b/grep.h
index 95cccb670f9..62deadb885f 100644
--- a/grep.h
+++ b/grep.h
@@ -134,8 +134,6 @@ struct grep_opt {
 	 */
 	struct repository *repo;
 
-	const char *prefix;
-	int prefix_length;
 	int linenum;
 	int columnnum;
 	int invert;
@@ -180,7 +178,7 @@ struct grep_opt {
 };
 
 int grep_config(const char *var, const char *value, void *);
-void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index 1981a0859f0..a82e31df869 100644
--- a/revision.c
+++ b/revision.c
@@ -1833,7 +1833,7 @@ void repo_init_revisions(struct repository *r,
 	revs->commit_format = CMIT_FMT_DEFAULT;
 	revs->expand_tabs_in_log_default = 8;
 
-	grep_init(&revs->grep_filter, revs->repo, prefix);
+	grep_init(&revs->grep_filter, revs->repo);
 	revs->grep_filter.status_only = 1;
 
 	repo_diff_setup(revs->repo, &revs->diffopt);
-- 
2.34.1.841.gf15fb7e6f34


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v3 5/7] grep.c: don't pass along NULL callback value
  2021-11-29 14:50   ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                       ` (3 preceding siblings ...)
  2021-11-29 14:50     ` [PATCH v3 4/7] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
@ 2021-11-29 14:50     ` Ævar Arnfjörð Bjarmason
  2021-11-29 14:50     ` [PATCH v3 6/7] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
                       ` (3 subsequent siblings)
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-29 14:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change grep_cmd_config() to stop passing around the always-NULL "cb"
value. When this code was added in 7e8f59d577e (grep: color patterns
in output, 2009-03-07) it was non-NULL, but when that changed in
15fabd1bbd4 (builtin/grep.c: make configuration callback more
reusable, 2012-10-09) this code was left behind.

In a subsequent change I'll start using the "cb" value. This change
makes it clear which of the functions we call need it, and which don't.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index d85cbabea67..5ec4cecae45 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,8 +285,8 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, cb);
-	if (git_color_default_config(var, value, cb) < 0)
+	int st = grep_config(var, value, NULL);
+	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
 	if (!strcmp(var, "grep.threads")) {
-- 
2.34.1.841.gf15fb7e6f34


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v3 6/7] grep API: call grep_config() after grep_init()
  2021-11-29 14:50   ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                       ` (4 preceding siblings ...)
  2021-11-29 14:50     ` [PATCH v3 5/7] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
@ 2021-11-29 14:50     ` Ævar Arnfjörð Bjarmason
  2021-11-29 14:50     ` [PATCH v3 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
                       ` (2 subsequent siblings)
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-29 14:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().

So we effectively (b) initialized config, (a) then defaults, (c)
followed by user options. Usually those are ordered as "a", "b" and
"c" instead.

As the comments being removed here show, the previous behavior needed
to be carefully explained, as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.

This does not change the behavior of any of the configuration
variables or options. That would have been the case if we didn't move
around the grep_config() call in "builtin/log.c". But now that we call
"grep_config" after "git_log_config" and "git_format_config" we'll
need to pass in the already initialized "struct grep_opt *".
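
The conventional (a) defaults, (b) config, (c) options ordering, with
the config callback writing into the instance it is handed via "cb",
might look roughly like this. It is a sketch only: "struct mini_opt",
"mini.maxdepth" and the three functions are made up for illustration
and are not git's API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature options struct; not git's "struct grep_opt". */
struct mini_opt {
	int max_depth;
};

/* (a) defaults */
static void mini_init(struct mini_opt *opt)
{
	opt->max_depth = -1;
}

/* (b) config: writes into the instance passed via "cb", no static state */
static int mini_config(const char *var, const char *value, void *cb)
{
	struct mini_opt *opt = cb;

	if (!strcmp(var, "mini.maxdepth"))
		opt->max_depth = atoi(value);
	return 0;
}

/* (c) user options: applied last, so they override config */
static void mini_parse_option(struct mini_opt *opt, const char *arg)
{
	if (!strncmp(arg, "--max-depth=", 12))
		opt->max_depth = atoi(arg + 12);
}
```

Because the callback needs a valid instance to write into, the init
step has to run before the config step, which is the reordering this
patch makes.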

See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e0 (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.

The memcpy() pattern here will be optimized away and follows the
convention of other *_init() functions. See 5726a6b4012 (*.c *_init():
define in terms of corresponding *_INIT macro, 2021-07-01).
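
For readers unfamiliar with that convention, a rough sketch follows.
"struct widget", WIDGET_INIT and widget_init() are hypothetical names;
only the shape mirrors the GREP_OPT_INIT and grep_init() pair in this
patch.

```c
#include <string.h>

/* Hypothetical struct and macro, modeled on the *_INIT convention. */
struct widget {
	int relative;
	int max_depth;
	int color;
};

#define WIDGET_INIT { \
	.relative = 1, \
	.max_depth = -1, \
	.color = -1, \
}

/* For instances the caller only holds a pointer to */
static void widget_init(struct widget *w)
{
	struct widget blank = WIDGET_INIT;
	memcpy(w, &blank, sizeof(*w));
}
```

Stack or static instances can use "struct widget w = WIDGET_INIT;",
while callers holding only a pointer call widget_init(). Both end up
with the same field values because the function is defined in terms
of the macro.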

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  4 ++--
 builtin/log.c  | 13 +++++++++++--
 grep.c         | 39 +++------------------------------------
 grep.h         | 21 +++++++++++++++++++++
 4 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5ec4cecae45..0ea124321b6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,7 +285,7 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, NULL);
+	int st = grep_config(var, value, cb);
 	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
@@ -966,8 +966,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	};
 	grep_prefix = prefix;
 
-	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, the_repository);
+	git_config(grep_cmd_config, &opt);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/log.c b/builtin/log.c
index f75d87e8d7f..bfddacdfa6c 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -505,8 +505,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (grep_config(var, value, cb) < 0)
-		return -1;
 	if (git_gpg_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
@@ -521,6 +519,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.simplify_history = 0;
 	memset(&opt, 0, sizeof(opt));
@@ -635,6 +635,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 
 	memset(&match_all, 0, sizeof(match_all));
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.always_show_header = 1;
 	rev.no_walk = 1;
@@ -718,6 +720,8 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix)
 
 	repo_init_revisions(the_repository, &rev, prefix);
 	init_reflog_walk(&rev.reflog_info);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.verbose_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -751,6 +755,8 @@ int cmd_log(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.always_show_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -1833,10 +1839,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	extra_hdr.strdup_strings = 1;
 	extra_to.strdup_strings = 1;
 	extra_cc.strdup_strings = 1;
+
 	init_log_defaults();
 	init_display_notes(&notes_opt);
 	git_config(git_format_config, NULL);
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.show_notes = show_notes;
 	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
 	rev.commit_format = CMIT_FMT_EMAIL;
diff --git a/grep.c b/grep.c
index 12b202598a9..8dfa0300786 100644
--- a/grep.c
+++ b/grep.c
@@ -19,27 +19,6 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
-static struct grep_opt grep_defaults = {
-	.relative = 1,
-	.pathname = 1,
-	.max_depth = -1,
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
-	.colors = {
-		[GREP_COLOR_CONTEXT] = "",
-		[GREP_COLOR_FILENAME] = "",
-		[GREP_COLOR_FUNCTION] = "",
-		[GREP_COLOR_LINENO] = "",
-		[GREP_COLOR_COLUMNNO] = "",
-		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_SELECTED] = "",
-		[GREP_COLOR_SEP] = GIT_COLOR_CYAN,
-	},
-	.only_matching = 0,
-	.color = -1,
-	.output = std_output,
-};
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -75,20 +54,12 @@ define_list_config_array_extra(color_grep_slots, {"match"});
  */
 int grep_config(const char *var, const char *value, void *cb)
 {
-	struct grep_opt *opt = &grep_defaults;
+	struct grep_opt *opt = cb;
 	const char *slot;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	/*
-	 * The instance of grep_opt that we set up here is copied by
-	 * grep_init() to be used by each individual invocation.
-	 * When populating a new field of this structure here, be
-	 * sure to think about ownership -- e.g., you might need to
-	 * override the shallow copy in grep_init() with a deep copy.
-	 */
-
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
 		return 0;
@@ -134,14 +105,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/*
- * Initialize one instance of grep_opt and copy the
- * default values from the template we read the configuration
- * information in an earlier call to git_config(grep_config).
- */
 void grep_init(struct grep_opt *opt, struct repository *repo)
 {
-	*opt = grep_defaults;
+	struct grep_opt blank = GREP_OPT_INIT;
+	memcpy(opt, &blank, sizeof(*opt));
 
 	opt->repo = repo;
 	opt->pattern_tail = &opt->pattern_list;
diff --git a/grep.h b/grep.h
index 62deadb885f..b651eb291f7 100644
--- a/grep.h
+++ b/grep.h
@@ -177,6 +177,27 @@ struct grep_opt {
 	void *output_priv;
 };
 
+#define GREP_OPT_INIT { \
+	.relative = 1, \
+	.pathname = 1, \
+	.max_depth = -1, \
+	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
+	.colors = { \
+		[GREP_COLOR_CONTEXT] = "", \
+		[GREP_COLOR_FILENAME] = "", \
+		[GREP_COLOR_FUNCTION] = "", \
+		[GREP_COLOR_LINENO] = "", \
+		[GREP_COLOR_COLUMNNO] = "", \
+		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_SELECTED] = "", \
+		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
+	}, \
+	.only_matching = 0, \
+	.color = -1, \
+	.output = std_output, \
+}
+
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
-- 
2.34.1.841.gf15fb7e6f34


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v3 7/7] grep: simplify config parsing and option parsing
  2021-11-29 14:50   ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                       ` (5 preceding siblings ...)
  2021-11-29 14:50     ` [PATCH v3 6/7] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2021-11-29 14:50     ` Ævar Arnfjörð Bjarmason
  2021-11-29 21:06     ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Junio C Hamano
  2021-11-29 21:36     ` Junio C Hamano
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-29 14:50 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Simplify the parsing of "grep.patternType" and
"grep.extendedRegexp". This changes no behavior, but gets rid of
complex parsing logic that isn't needed anymore.

When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
grep.patternType configuration setting, 2012-08-03) we promised that:

 1. You can set "grep.patternType", and "[setting it to] 'default'
    will return to the default matching behavior".

 2. We'd support the existing "grep.extendedRegexp" option, but ignore
    it when the new "grep.patternType" option is set. We said we'd
    only ignore the older "grep.extendedRegexp" option "when the
    `grep.patternType` option is set to a value other than
    'default'".

In a preceding commit we changed grep_config() to be called after
grep_init(), which means that much of the complexity here can go
away.

Now, as before, we only understand a "grep.extendedRegexp" setting
of "true", and if "grep.patterntype=default" is set we'll interpret it
as "grep.patterntype=basic", except if we previously saw a
"grep.extendedRegexp=true", in which case it's interpreted as
"grep.patterntype=extended".

See my 07a3d411739 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.
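
The interaction described above can be modeled in isolation. The
following is a condensed, standalone approximation of the
grep_config() hunk in this patch, not the code itself: parse_bool() is
a simplified stand-in for git_config_bool(), parse_type() does not
model unknown values, and all type and function names are
hypothetical.

```c
#include <string.h>

enum pattern_type { TYPE_BRE = 0, TYPE_ERE, TYPE_FIXED, TYPE_PCRE };

struct model_opt {
	int extended_regexp_option; /* 0 unset or false, 1 true, -1 ignore */
	enum pattern_type pattern_type_option;
};

/* Simplified, hypothetical stand-in for git_config_bool() */
static int parse_bool(const char *value)
{
	return !strcmp(value, "true");
}

static enum pattern_type parse_type(const char *value)
{
	if (!strcmp(value, "extended"))
		return TYPE_ERE;
	if (!strcmp(value, "fixed"))
		return TYPE_FIXED;
	if (!strcmp(value, "perl"))
		return TYPE_PCRE;
	return TYPE_BRE; /* "basic"; unknown values are not modeled here */
}

static void model_config(struct model_opt *opt, const char *var,
			 const char *value)
{
	if (!strcmp(var, "grep.extendedregexp")) {
		if (opt->extended_regexp_option)
			return; /* already true, or told to ignore */
		opt->extended_regexp_option = parse_bool(value);
		if (opt->extended_regexp_option)
			opt->pattern_type_option = TYPE_ERE;
	} else if (!strcmp(var, "grep.patterntype")) {
		if (!strcmp(value, "default")) {
			opt->pattern_type_option =
				opt->extended_regexp_option == 1
				? TYPE_ERE : TYPE_BRE;
		} else {
			opt->extended_regexp_option = -1; /* ignore */
			opt->pattern_type_option = parse_type(value);
		}
	}
}
```

Feeding this model "grep.patterntype=extended" followed by
"grep.patterntype=default" ends up at TYPE_BRE,
"grep.extendedregexp=true" followed by "grep.patterntype=default" ends
up at TYPE_ERE, and a "grep.extendedregexp=true" that arrives after
"grep.patterntype=fixed" is ignored.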

We don't need grep_commit_pattern_type() anymore, we can instead have
OPT_SET_INT() in "builtin/grep.c" manipulate the "pattern_type_option"
member in "struct grep_opt" directly.
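
With the pattern type held in one enum member, a consumer can derive
its behavior from that value alone. A rough sketch with POSIX
regcomp() follows; "enum pattern_type" and line_matches() are
hypothetical names, and only the BRE and ERE cases are modeled.

```c
#include <regex.h>
#include <stddef.h>

enum pattern_type { TYPE_BRE = 0, TYPE_ERE };

/* Returns 1 on match, 0 on no match, -1 if the pattern fails to compile */
static int line_matches(enum pattern_type type, const char *pattern,
			const char *line)
{
	regex_t re;
	int flags = REG_NOSUB;
	int ret;

	/* The enum value alone decides the regcomp() flags */
	if (type == TYPE_ERE)
		flags |= REG_EXTENDED;
	if (regcomp(&re, pattern, flags))
		return -1;
	ret = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return ret;
}
```

With the pattern "a+b*c" this reports a match for the line "a+bc"
under TYPE_BRE, where "+" is an ordinary character, and for the line
"abc" under TYPE_ERE, where "+" is a repetition operator.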

We can also do away with the indirection of the "int fixed" and "int
pcre2" members in favor of using "pattern_type_option" directly in
"grep.c".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 10 +++----
 grep.c         | 81 +++++++++++---------------------------------------
 grep.h         |  9 ++----
 revision.c     |  2 --
 4 files changed, 24 insertions(+), 78 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 0ea124321b6..942c4b25077 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -845,7 +845,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
 	int allow_revs;
 
 	struct option options[] = {
@@ -879,16 +878,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
 		OPT_GROUP(""),
-		OPT_SET_INT('E', "extended-regexp", &pattern_type_arg,
+		OPT_SET_INT('E', "extended-regexp", &opt.pattern_type_option,
 			    N_("use extended POSIX regular expressions"),
 			    GREP_PATTERN_TYPE_ERE),
-		OPT_SET_INT('G', "basic-regexp", &pattern_type_arg,
+		OPT_SET_INT('G', "basic-regexp", &opt.pattern_type_option,
 			    N_("use basic POSIX regular expressions (default)"),
 			    GREP_PATTERN_TYPE_BRE),
-		OPT_SET_INT('F', "fixed-strings", &pattern_type_arg,
+		OPT_SET_INT('F', "fixed-strings", &opt.pattern_type_option,
 			    N_("interpret patterns as fixed strings"),
 			    GREP_PATTERN_TYPE_FIXED),
-		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
+		OPT_SET_INT('P', "perl-regexp", &opt.pattern_type_option,
 			    N_("use Perl-compatible regular expressions"),
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
@@ -982,7 +981,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, grep_usage,
 			     PARSE_OPT_KEEP_DASHDASH |
 			     PARSE_OPT_STOP_AT_NON_OPTION);
-	grep_commit_pattern_type(pattern_type_arg, &opt);
 
 	if (use_index && !startup_info->have_repository) {
 		int fallback = 0;
diff --git a/grep.c b/grep.c
index 8dfa0300786..b5342064066 100644
--- a/grep.c
+++ b/grep.c
@@ -33,9 +33,7 @@ static const char *color_grep_slots[] = {
 
 static int parse_pattern_type_arg(const char *opt, const char *arg)
 {
-	if (!strcmp(arg, "default"))
-		return GREP_PATTERN_TYPE_UNSPECIFIED;
-	else if (!strcmp(arg, "basic"))
+	if (!strcmp(arg, "basic"))
 		return GREP_PATTERN_TYPE_BRE;
 	else if (!strcmp(arg, "extended"))
 		return GREP_PATTERN_TYPE_ERE;
@@ -61,11 +59,23 @@ int grep_config(const char *var, const char *value, void *cb)
 		return -1;
 
 	if (!strcmp(var, "grep.extendedregexp")) {
+		if (opt->extended_regexp_option)
+			return 0;
 		opt->extended_regexp_option = git_config_bool(var, value);
+		if (opt->extended_regexp_option)
+			opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
+		return 0;
+	}
+
+	if (!strcmp(var, "grep.patterntype") &&
+	    !strcmp(value, "default")) {
+		opt->pattern_type_option = opt->extended_regexp_option == 1
+			? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
 		return 0;
 	}
 
 	if (!strcmp(var, "grep.patterntype")) {
+		opt->extended_regexp_option = -1; /* ignore */
 		opt->pattern_type_option = parse_pattern_type_arg(var, value);
 		return 0;
 	}
@@ -115,62 +125,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo)
 	opt->header_tail = &opt->header_list;
 }
 
-static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	/*
-	 * When committing to the pattern type by setting the relevant
-	 * fields in grep_opt it's generally not necessary to zero out
-	 * the fields we're not choosing, since they won't have been
-	 * set by anything. The extended_regexp_option field is the
-	 * only exception to this.
-	 *
-	 * This is because in the process of parsing grep.patternType
-	 * & grep.extendedRegexp we set opt->pattern_type_option and
-	 * opt->extended_regexp_option, respectively. We then
-	 * internally use opt->extended_regexp_option to see if we're
-	 * compiling an ERE. It must be unset if that's not actually
-	 * the case.
-	 */
-	if (pattern_type != GREP_PATTERN_TYPE_ERE &&
-	    opt->extended_regexp_option)
-		opt->extended_regexp_option = 0;
-
-	switch (pattern_type) {
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		/* fall through */
-
-	case GREP_PATTERN_TYPE_BRE:
-		break;
-
-	case GREP_PATTERN_TYPE_ERE:
-		opt->extended_regexp_option = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_FIXED:
-		opt->fixed = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_PCRE:
-		opt->pcre2 = 1;
-		break;
-	}
-}
-
-void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(pattern_type, opt);
-	else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(opt->pattern_type_option, opt);
-	else if (opt->extended_regexp_option)
-		/*
-		 * This branch *must* happen after setting from the
-		 * opt->pattern_type_option above, we don't want
-		 * grep.extendedRegexp to override grep.patternType!
-		 */
-		grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
-}
-
 static struct grep_pat *create_grep_pat(const char *pat, size_t patlen,
 					const char *origin, int no,
 					enum grep_pat_token t,
@@ -490,9 +444,10 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 
 	p->word_regexp = opt->word_regexp;
 	p->ignore_case = opt->ignore_case;
-	p->fixed = opt->fixed;
+	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
 
-	if (memchr(p->pattern, 0, p->patternlen) && !opt->pcre2)
+	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
+	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
 	p->is_fixed = is_fixed(p->pattern, p->patternlen);
@@ -543,14 +498,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre2) {
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
 		compile_pcre2_pattern(p, opt);
 		return;
 	}
 
 	if (p->ignore_case)
 		regflags |= REG_ICASE;
-	if (opt->extended_regexp_option)
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
 		regflags |= REG_EXTENDED;
 	err = regcomp(&p->regexp, p->pattern, regflags);
 	if (err) {
diff --git a/grep.h b/grep.h
index b651eb291f7..ab2ce833b40 100644
--- a/grep.h
+++ b/grep.h
@@ -94,8 +94,7 @@ enum grep_expr_node {
 };
 
 enum grep_pattern_type {
-	GREP_PATTERN_TYPE_UNSPECIFIED = 0,
-	GREP_PATTERN_TYPE_BRE,
+	GREP_PATTERN_TYPE_BRE = 0,
 	GREP_PATTERN_TYPE_ERE,
 	GREP_PATTERN_TYPE_FIXED,
 	GREP_PATTERN_TYPE_PCRE
@@ -143,7 +142,6 @@ struct grep_opt {
 	int unmatch_name_only;
 	int count;
 	int word_regexp;
-	int fixed;
 	int all_match;
 #define GREP_BINARY_DEFAULT	0
 #define GREP_BINARY_NOMATCH	1
@@ -152,7 +150,6 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre2;
 	int relative;
 	int pathname;
 	int null_following_name;
@@ -162,7 +159,7 @@ struct grep_opt {
 	int funcname;
 	int funcbody;
 	int extended_regexp_option;
-	int pattern_type_option;
+	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
 	unsigned pre_context;
@@ -181,7 +178,6 @@ struct grep_opt {
 	.relative = 1, \
 	.pathname = 1, \
 	.max_depth = -1, \
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
 	.colors = { \
 		[GREP_COLOR_CONTEXT] = "", \
 		[GREP_COLOR_FILENAME] = "", \
@@ -200,7 +196,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
-void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
 void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index a82e31df869..d08c7ae6c35 100644
--- a/revision.c
+++ b/revision.c
@@ -2855,8 +2855,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	diff_setup_done(&revs->diffopt);
 
-	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
-				 &revs->grep_filter);
 	if (!is_encoding_utf8(get_log_output_encoding()))
 		revs->grep_filter.ignore_locale = 1;
 	compile_grep_patterns(&revs->grep_filter);
-- 
2.34.1.841.gf15fb7e6f34


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: [PATCH v3 0/7] grep: simplify & delete "init" & "config" code
  2021-11-29 14:50   ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                       ` (6 preceding siblings ...)
  2021-11-29 14:50     ` [PATCH v3 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2021-11-29 21:06     ` Junio C Hamano
  2021-11-29 21:36     ` Junio C Hamano
  8 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-11-29 21:06 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> In v1 and v2[1] of this series more code in grep.c was deleted by
> changing what I think is a really obscure interaction between
> "grep.extendedRegexp=true" and "grep.patternType".
>
> Junio preferred having a deprecation period[2], so here's a re-roll
> that preserves all existing behavior, at the cost of a bit less code
> deletion & simplification (from "97 insertions(+), 174 deletions(-)"
> to "106 insertions(+), 131 deletions(-)").

Deprecating grep.extendedRegexp and forcing users to only use the
grep.patterntype would be a lot more sensible way forward than
giving it a new meaning and letting these two variables interact
with each other.

Depending on how clean the internal code can become, with the
former variable still supported for backward compatibility, we might
not need to break working set-up existing end-users have, though.

We'll see.

Thanks.

^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: [PATCH v3 0/7] grep: simplify & delete "init" & "config" code
  2021-11-29 14:50   ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                       ` (7 preceding siblings ...)
  2021-11-29 21:06     ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Junio C Hamano
@ 2021-11-29 21:36     ` Junio C Hamano
  8 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-11-29 21:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

>   log tests: check if grep_config() is called by "log"-like cmds
>
> Simplified the test, and it no longer depends (optionally) on PCRE. We
> just test BRE vs. ERE instead.

Hmph.  I see tests between fixed vs basic, though.


^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: [PATCH v3 3/7] grep tests: add missing "grep.patternType" config test
  2021-11-29 14:50     ` [PATCH v3 3/7] grep tests: add missing "grep.patternType" config test Ævar Arnfjörð Bjarmason
@ 2021-11-29 21:52       ` Junio C Hamano
  2021-12-03  0:48         ` Junio C Hamano
  0 siblings, 1 reply; 151+ messages in thread
From: Junio C Hamano @ 2021-11-29 21:52 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Extend the grep tests to assert that setting
> "grep.patternType=extended" followed by "grep.patternType=default"
> will behave as if "--extended-regexp" was provided, and not as
> "--basic-regexp".

grep.patternType is the usual "last-one wins".  If the last value
set to patternType is the default, the setting to grep.extendedRegexp
should take effect (so if it is set to true, we'd see ERE behaviour).

Back in the days when the "return to the default matching behavior"
part was written in 84befcd0 (grep: add a grep.patternType
configuration setting, 2012-08-03), grep.extendedRegexp was the only
way to configure the behaviour since b22520a3 (grep: allow -E and -n
to be turned on by default via configuration, 2011-03-30).  It was
understandable that we referred to the behaviour that honors the
older configuration variable as "the default matching" behaviour.
It is fairly clear in its log message:

    When grep.patternType is set to a value other than "default", the
    grep.extendedRegexp setting is ignored. The value of "default" restores
    the current default behavior, including the grep.extendedRegexp
    behavior.

So, unless your description is a typo, I am somewhat surprised by
your findings that =default that comes later does not defeat an
earlier =extended.

It should just clear that earlier extended set by grep.patternType
and only pay attention to grep.extendedRegexp variable.  Doing
anything else is a bug, I think.

Thanks.

diff --git i/Documentation/config/grep.txt w/Documentation/config/grep.txt
index 44abe45a7c..95fcb3ca29 100644
--- i/Documentation/config/grep.txt
+++ w/Documentation/config/grep.txt
@@ -8,7 +8,8 @@ grep.patternType::
 	Set the default matching behavior. Using a value of 'basic', 'extended',
 	'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
 	`--fixed-strings`, or `--perl-regexp` option accordingly, while the
-	value 'default' will return to the default matching behavior.
+	value 'default' will return to the default matching behavior, which is,
+	to honor `grep.extendedRegexp` option and choose either basic or extended.
 
 grep.extendedRegexp::
 	If set to true, enable `--extended-regexp` option by default. This


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: [PATCH v3 3/7] grep tests: add missing "grep.patternType" config test
  2021-11-29 21:52       ` Junio C Hamano
@ 2021-12-03  0:48         ` Junio C Hamano
  0 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-12-03  0:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Junio C Hamano <gitster@pobox.com> writes:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> Extend the grep tests to assert that setting
>> "grep.patternType=extended" followed by "grep.patternType=default"
>> will behave as if "--extended-regexp" was provided, and not as
>> "--basic-regexp".
>
> grep.patternType is the usual "last-one wins".  If the last value
> set to patternType is the default, the setting to grep.extendedRegexp
> should take effect (so if it is set to true, we'd see ERE behaviour).
>
> Back in the days when the "return to the default matching behavior"
> part was written in 84befcd0 (grep: add a grep.patternType
> configuration setting, 2012-08-03), grep.extendedRegexp was the only
> way to configure the behaviour since b22520a3 (grep: allow -E and -n
> to be turned on by default via configuration, 2011-03-30).  It was
> understandable that we referred to the behaviour that honors the
> older configuration variable as "the default matching" behaviour.
> It is fairly clear in its log message:
>
>     When grep.patternType is set to a value other than "default", the
>     grep.extendedRegexp setting is ignored. The value of "default" restores
>     the current default behavior, including the grep.extendedRegexp
>     behavior.
>
> So, unless your description is a typo, I am somewhat surprised by
> your findings that =default that comes later does not defeat an
> earlier =extended.
>
> It should just clear that earlier extended set by grep.patternType
> and only pay attention to grep.extendedRegexp variable.  Doing
> anything else is a bug, I think.

So, let's see how 

  $ git -c grep.patternType=extended \
	-c grep.patternType=default \
	grep foo

works today.

We start from builtin/grep.c::cmd_grep(), which calls
git_config(grep_cmd_config).  grep_cmd_config() farms out most of
the work to grep.c::grep_config(), which populates the grep_defaults
structure. grep_defaults.pattern_type_option first becomes
GREP_PATTERN_TYPE_ERE and then it gets overwritten to
GREP_PATTERN_TYPE_UNSPECIFIED.
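
As a rough illustration of the "last one wins" behaviour traced
above, here is a minimal sketch in C.  The names toy_cfg,
toy_grep_config() and the PT_* constants are made up for this
illustration; this is not the code in grep.c.  The boolean parsing is
also simplified to a plain comparison with "true", which is an
assumption of the sketch.

```c
#include <string.h>

/* Hypothetical stand-ins for the GREP_PATTERN_TYPE_* values. */
enum pattern_type { PT_UNSPECIFIED = 0, PT_BRE, PT_ERE, PT_FIXED, PT_PCRE };

/* Hypothetical holder for the two configuration-derived members. */
struct toy_cfg {
	enum pattern_type pattern_type_option; /* from grep.patternType */
	int extended_regexp_option;            /* from grep.extendedRegexp */
};

static enum pattern_type parse_pattern_type(const char *value)
{
	if (!strcmp(value, "basic"))
		return PT_BRE;
	if (!strcmp(value, "extended"))
		return PT_ERE;
	if (!strcmp(value, "fixed"))
		return PT_FIXED;
	if (!strcmp(value, "perl"))
		return PT_PCRE;
	/* "default" (and, in this sketch, anything unknown) */
	return PT_UNSPECIFIED;
}

/*
 * Called once per configuration entry, in the order the entries are
 * seen.  Each grep.patternType entry simply overwrites the previous
 * one, so "extended" followed by "default" leaves PT_UNSPECIFIED.
 */
static int toy_grep_config(const char *var, const char *value,
			   struct toy_cfg *cfg)
{
	if (!strcmp(var, "grep.patterntype")) {
		cfg->pattern_type_option = parse_pattern_type(value);
		return 0;
	}
	if (!strcmp(var, "grep.extendedregexp")) {
		/* simplified boolean parsing, see above */
		cfg->extended_regexp_option = !strcmp(value, "true");
		return 0;
	}
	return 0;
}
```

Feeding it "extended" and then "default" for grep.patterntype leaves
pattern_type_option at PT_UNSPECIFIED and never touches
extended_regexp_option.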

Then grep.c::grep_init() copies that grep_defaults to the
per-invocation "struct grep_opt opt" that is on-stack in
builtin/grep.c::cmd_grep().  

opt.pattern_type_option becomes GREP_PATTERN_TYPE_UNSPECIFIED;
opt.extended_regexp_option in the same structure is 0, because nobody has
touched the corresponding member in grep_defaults in
grep_cmd_config().

Then parse_options() gets its turn to futz with members in "opt".
-E/-G/-F/-P would be parsed into a separate variable "pattern_type";
in this case, there is no command line option, so the pattern_type
variable has GREP_PATTERN_TYPE_UNSPECIFIED.

And finally grep.c::grep_commit_pattern_type() is called to combine
what is in "pattern_type" and "opt".

It calls grep_set_pattern_type_option() to set the members in opt
that determine the final choice.

 - If pattern_type is not UNSPECIFIED, use that;

 - Otherwise, if opt->pattern_type_option is not UNSPECIFIED, use that;

 - Otherwise, i.e. if pattern_type from the command line and
   opt->pattern_type_option from the configuration are both
   UNSPECIFIED, then check if opt->extended_regexp_option (which is
   set from the config via grep.extendedRegexp) is set.  If so, call
   grep_set_pattern_type_option() to use ERE.
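
The three-step precedence above can be sketched roughly as follows.
This is only an illustration under stated assumptions: toy_cfg,
resolve_pattern_type() and the PT_* constants are hypothetical names,
not the real grep.c API, and the final fallback to BRE is an
assumption that reflects "git grep" using basic regexps when nothing
is configured.

```c
/* Hypothetical stand-ins for the GREP_PATTERN_TYPE_* values. */
enum pattern_type { PT_UNSPECIFIED = 0, PT_BRE, PT_ERE, PT_FIXED, PT_PCRE };

/* Hypothetical holder for the two configuration-derived members. */
struct toy_cfg {
	enum pattern_type pattern_type_option; /* from grep.patternType */
	int extended_regexp_option;            /* from grep.extendedRegexp */
};

static enum pattern_type resolve_pattern_type(enum pattern_type cmdline,
					      const struct toy_cfg *cfg)
{
	/* 1. -E/-G/-F/-P on the command line wins. */
	if (cmdline != PT_UNSPECIFIED)
		return cmdline;

	/* 2. Otherwise a non-default grep.patternType wins. */
	if (cfg->pattern_type_option != PT_UNSPECIFIED)
		return cfg->pattern_type_option;

	/* 3. Only when both are unspecified, honor grep.extendedRegexp. */
	if (cfg->extended_regexp_option)
		return PT_ERE;

	/* Assumed fallback: nothing asked for anything, so use BRE. */
	return PT_BRE;
}
```

With both inputs unspecified and extended_regexp_option unset, the
sketch falls through all three steps and ends up with BRE.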

Now, what does grep_set_pattern_type_option() do?

The first thing it does, when the pattern_type given is not ERE, is
to drop the opt->extended_regexp_option bit (which may have been set
by having the grep.extendedRegexp configuration set to true).  This
is because, once the opt structure is set up, the runtime code does
not look at a single "type" member that enumerates BRE, ERE, FIXED,
PCRE to determine the type of the pattern; just like the 'fixed' and
'pcre2' members, the 'extended_regexp_option' member is used, after
grep_commit_pattern_type() finishes its processing, to signal that
ERE is in effect.  But as we've seen in the design goal of the
earlier change 84befcd0 (grep: add a grep.patternType configuration
setting, 2012-08-03), the bit obtained from the grep.extendedRegexp
configuration variable is only valid when grep.patternType is
UNSPECIFIED (aka default), so this bit needs to be dropped at some
point.

But with two grep.patternType configuration settings, I do not think
the bug will trigger.  As we traced above, we just get UNSPECIFIED in
grep_defaults.pattern_type_option, that is copied to cmd_grep()::opt,
and it gets combined with UNSPECIFIED in cmd_grep()::pattern_type in
grep_commit_pattern_type().  But the three-step logic in the "commit"
will not do anything in this case.  So, I do not see any code that
makes this behave as if "git grep -E foo" was given.

I suspect that if you do

  $ git -c grep.extendedRegexp=true \
	-c grep.patternType=default \
	grep foo

it should set the .extended_regexp_option member to true and the
.pattern_type_option member to UNSPECIFIED in grep_defaults, copy it
to cmd_grep()::opt, and grep_commit_pattern_type() will try to
combine that "opt" with pattern_type==UNSPECIFIED.  The third "both
pattern_type and opt.pattern_type_option are UNSPECIFIED" case
triggers, and grep_set_pattern_type_option() would be called, with
its pattern_type parameter explicitly set to ERE.

The logic to combine these two is convoluted and I sense that it
could be simplified without breaking the established semantics, but
so far I am not seeing how the code can break in such a way that

>> Extend the grep tests to assert that setting
>> "grep.patternType=extended" followed by "grep.patternType=default"
>> will behave as if "--extended-regexp" was provided, and not as
>> "--basic-regexp".

this claim holds.

So,... after spending too much time following the code, I went back
to the actual test added by the patch and saw this:

+	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'

Here, file "ab" has three lines

        a+b*c
        a+bc
        abc

and "a+b*c" pattern is designed to hit the first line with -F, the
second one with -G (because + is literal in BRE so it must exist
literally in the haystack, b* matches single b but not literal b* in
the haystack), the last one with -E (because neither + nor * is
literal, so the first two lines do not match, but the last one
matches).  The expectation in the code, unlike what is in the log
message, is that this should match as if -G was given.
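
To make the three cases concrete, here is a small sketch using POSIX
regcomp()/regexec() and strstr().  It is not how git implements
matching; line_matches() and the MODE_* constants are hypothetical
names invented for this illustration, and it assumes a regex library
where an unescaped "+" is an ordinary character in basic regexps.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical stand-ins for -F, -G and -E. */
enum match_mode { MODE_FIXED, MODE_BASIC, MODE_EXTENDED };

/* Return 1 if "pattern" is found anywhere in "line", 0 otherwise. */
static int line_matches(enum match_mode mode, const char *pattern,
			const char *line)
{
	regex_t re;
	int flags = REG_NOSUB;
	int hit;

	if (mode == MODE_FIXED)
		return strstr(line, pattern) != NULL;

	if (mode == MODE_EXTENDED)
		flags |= REG_EXTENDED;

	/* Treat a pattern that fails to compile as "no match". */
	if (regcomp(&re, pattern, flags))
		return 0;

	hit = !regexec(&re, line, 0, NULL, 0);
	regfree(&re);
	return hit;
}
```

Calling line_matches() with the pattern "a+b*c" against each of the
three lines should hit only the first line in MODE_FIXED, only the
second in MODE_BASIC, and only the third in MODE_EXTENDED.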

So, I guess there is no bug (other than the alarming false report in
the log message).




^ permalink raw reply	[flat|nested] 151+ messages in thread

* [PATCH v4 0/7] grep: simplify & delete "init" & "config" code
  2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
                     ` (9 preceding siblings ...)
  2021-11-29 14:50   ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
@ 2021-12-03 10:19   ` Ævar Arnfjörð Bjarmason
  2021-12-03 10:19     ` [PATCH v4 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
                       ` (7 more replies)
  10 siblings, 8 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-03 10:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

A simplification and code deletion of the overly complex setup for the
grep API, no behavior changes. For v3 see:
https://lore.kernel.org/git/cover-v3-0.7-00000000000-20211129T143956Z-avarab@gmail.com/

As Junio notes in
https://lore.kernel.org/git/xmqqbl22634q.fsf@gitster.g/ my previous
3/7 had entirely the wrong commit message for the change it was
making, i.e. the code itself was correct, but I somehow got things
reversed in my head when writing the explanation.

That's fixed here, and for good measure I added another related test
to the same commit for the same existing behavior.

Ævar Arnfjörð Bjarmason (7):
  grep.h: remove unused "regex_t regexp" from grep_opt
  log tests: check if grep_config() is called by "log"-like cmds
  grep tests: add missing "grep.patternType" config test
  built-ins: trust the "prefix" from run_builtin()
  grep.c: don't pass along NULL callback value
  grep API: call grep_config() after grep_init()
  grep: simplify config parsing and option parsing

 builtin/grep.c    |  27 +++++-----
 builtin/log.c     |  13 ++++-
 builtin/ls-tree.c |   2 +-
 git.c             |   1 +
 grep.c            | 124 ++++++++--------------------------------------
 grep.h            |  33 ++++++++----
 revision.c        |   4 +-
 t/t4202-log.sh    |  24 +++++++++
 t/t7810-grep.sh   |  19 +++++++
 9 files changed, 116 insertions(+), 131 deletions(-)

Range-diff against v3:
1:  71ff51cb3c9 = 1:  d7d232b2b52 grep.h: remove unused "regex_t regexp" from grep_opt
2:  ec8e42ced1a = 2:  f853d669682 log tests: check if grep_config() is called by "log"-like cmds
3:  fcad1b1664b ! 3:  a97b7de3a3c grep tests: add missing "grep.patternType" config test
    @@ Commit message
     
         Extend the grep tests to assert that setting
         "grep.patternType=extended" followed by "grep.patternType=default"
    -    will behave as if "--extended-regexp" was provided, and not as
    -    "--basic-regexp". In a subsequent commit we'll need to treat
    +    will behave as if "--basic-regexp" was provided, and not as
    +    "--extended-regexp". In a subsequent commit we'll need to treat
         "grep.patternType=default" as a special-case, but let's make sure we
    -    don't ignore it if "grep.patternType" was set to a non-"default" value
    -    before.
    +    ignore it if it's being set to "default" following an earlier
    +    non-"default" "grep.patternType" setting.
    +
    +    Let's also test what happens when we have a sequence of "extended"
    +    followed by "default" and "fixed". In that case the "fixed" should
    +    prevail.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    @@ t/t7810-grep.sh: do
     +			grep "a+b*c" $H ab >actual &&
     +		test_cmp expected actual
     +	'
    ++
    ++	test_expect_success "grep $L with grep.patternType=[extended -> default -> fixed]" '
    ++		echo "${HC}ab:a+b*c" >expected &&
    ++		git \
    ++			-c grep.patternType=extended \
    ++			-c grep.patternType=default \
    ++			-c grep.patternType=fixed \
    ++			grep "a+b*c" $H ab >actual &&
    ++		test_cmp expected actual
    ++	'
     +
      	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
      		echo "${HC}ab:abc" >expected &&
4:  854ffe8d0b9 = 4:  f7d995a5a80 built-ins: trust the "prefix" from run_builtin()
5:  2536eae2c32 = 5:  ab1685f0dad grep.c: don't pass along NULL callback value
6:  4e1be7c165b = 6:  8ffa22df8c7 grep API: call grep_config() after grep_init()
7:  f40ab932cb1 = 7:  efbd1c50b43 grep: simplify config parsing and option parsing
-- 
2.34.1.875.gb925cffed1e


^ permalink raw reply	[flat|nested] 151+ messages in thread

* [PATCH v4 1/7] grep.h: remove unused "regex_t regexp" from grep_opt
  2021-12-03 10:19   ` [PATCH v4 " Ævar Arnfjörð Bjarmason
@ 2021-12-03 10:19     ` Ævar Arnfjörð Bjarmason
  2021-12-03 10:19     ` [PATCH v4 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
                       ` (6 subsequent siblings)
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-03 10:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This "regex_t" in grep_opt has not been used since
f9b9faf6f8a (builtin-grep: allow more than one patterns., 2006-05-02).
We still use a "regex_t" for compiling regexes, but that's in the
"grep_pat" struct.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.h b/grep.h
index 3e8815c347b..95cccb670f9 100644
--- a/grep.h
+++ b/grep.h
@@ -136,7 +136,6 @@ struct grep_opt {
 
 	const char *prefix;
 	int prefix_length;
-	regex_t regexp;
 	int linenum;
 	int columnnum;
 	int invert;
-- 
2.34.1.875.gb925cffed1e


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v4 2/7] log tests: check if grep_config() is called by "log"-like cmds
  2021-12-03 10:19   ` [PATCH v4 " Ævar Arnfjörð Bjarmason
  2021-12-03 10:19     ` [PATCH v4 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
@ 2021-12-03 10:19     ` Ævar Arnfjörð Bjarmason
  2021-12-03 10:19     ` [PATCH v4 3/7] grep tests: add missing "grep.patternType" config test Ævar Arnfjörð Bjarmason
                       ` (5 subsequent siblings)
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-03 10:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
for pattern style options & config, 2017-05-20) to check not only
whether "git log" handles "grep.patternType", but also "git show"
etc.

It's sufficient to check whether we match a "fixed" or a "basic" regex
here to see if these codepaths correctly invoked grep_config(). We
don't need to check the details of their regular expression matching
as the "log" test does.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 7884e3d46b3..11bb25440b0 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -449,6 +449,30 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
 	)
 '
 
+for cmd in show whatchanged reflog format-patch
+do
+	case "$cmd" in
+	format-patch) myarg="HEAD~.." ;;
+	*) myarg= ;;
+	esac
+
+	test_expect_success "$cmd: understands grep.patternType, like 'log'" '
+		git init "pattern-type-$cmd" &&
+		(
+			cd "pattern-type-$cmd" &&
+			test_commit 1 file A &&
+			test_commit "(1|2)" file B 2 &&
+
+			git -c grep.patternType=fixed $cmd --grep="..." $myarg >actual &&
+			test_must_be_empty actual &&
+
+			git -c grep.patternType=basic $cmd --grep="..." $myarg >actual &&
+			test_file_not_empty actual
+		)
+	'
+done
+test_done
+
 test_expect_success 'log --author' '
 	cat >expect <<-\EOF &&
 	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>
-- 
2.34.1.875.gb925cffed1e


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v4 3/7] grep tests: add missing "grep.patternType" config test
  2021-12-03 10:19   ` [PATCH v4 " Ævar Arnfjörð Bjarmason
  2021-12-03 10:19     ` [PATCH v4 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
  2021-12-03 10:19     ` [PATCH v4 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2021-12-03 10:19     ` Ævar Arnfjörð Bjarmason
  2021-12-03 10:19     ` [PATCH v4 4/7] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
                       ` (4 subsequent siblings)
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-03 10:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the grep tests to assert that setting
"grep.patternType=extended" followed by "grep.patternType=default"
will behave as if "--basic-regexp" was provided, and not as
"--extended-regexp". In a subsequent commit we'll need to treat
"grep.patternType=default" as a special-case, but let's make sure we
ignore it if it's being set to "default" following an earlier
non-"default" "grep.patternType" setting.

Let's also test what happens when we have a sequence of "extended"
followed by "default" and "fixed". In that case the "fixed" should
prevail.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7810-grep.sh | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 6b6423a07c3..113902c3bda 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -451,6 +451,25 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.patternType=[extended -> default -> fixed]" '
+		echo "${HC}ab:a+b*c" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			-c grep.patternType=fixed \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
 		echo "${HC}ab:abc" >expected &&
 		git \
-- 
2.34.1.875.gb925cffed1e


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v4 4/7] built-ins: trust the "prefix" from run_builtin()
  2021-12-03 10:19   ` [PATCH v4 " Ævar Arnfjörð Bjarmason
                       ` (2 preceding siblings ...)
  2021-12-03 10:19     ` [PATCH v4 3/7] grep tests: add missing "grep.patternType" config test Ævar Arnfjörð Bjarmason
@ 2021-12-03 10:19     ` Ævar Arnfjörð Bjarmason
  2021-12-03 10:19     ` [PATCH v4 5/7] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
                       ` (3 subsequent siblings)
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-03 10:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the
"prefix" passed from "run_builtin()". The "prefix" we get from setup.c
is either going to be NULL or a string of length >0, never "".

So we can drop the "prefix && *prefix" checks added for
"builtin/grep.c" in 0d042fecf2f (git-grep: show pathnames relative to
the current directory, 2006-08-11), and for "builtin/ls-tree.c" in
a69dd585fca (ls-tree: chomp leading directories when run from a
subdirectory, 2005-12-23).

As seen in code in revision.c that was added in cd676a51367 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that does away with this
assertion.

This makes it easier to reason about a change to the
"prefix_length" code in grep.c in a subsequent commit, and since we're
going to the trouble of doing that let's leave behind an assert() to
promise this to any future callers.

For "builtin/grep.c" it would be painful to pass the "prefix" down the
callchain of:

    cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
    grep_source_name

So for the code that needs it in grep_source_name() let's add a
"grep_prefix" variable similar to the existing "ls_tree_prefix".

While at it let's move the code in cmd_ls_tree() around so that we
assign to the "ls_tree_prefix" right after declaring the variables,
and stop assigning to "prefix". We only subsequently used that
variable later in the function after clobbering it. Let's just use our
own "grep_prefix" instead.

Let's also add an assert() in git.c, so that we'll make this promise
about the "prefix" to any current and future callers, as well as to
any readers of the code.

Code history:

 * The strlen() in "grep.c" hasn't been used since 493b7a08d80 (grep:
   accept relative paths outside current working directory, 2009-09-05).

   When that code was added in 0d042fecf2f (git-grep: show pathnames
   relative to the current directory, 2006-08-11) we used the length.

   But since 493b7a08d80 we haven't used it for anything except a
   boolean check that we could have done on the "prefix" member
   itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c    | 13 ++++++++-----
 builtin/ls-tree.c |  2 +-
 git.c             |  1 +
 grep.c            |  4 +---
 grep.h            |  4 +---
 revision.c        |  2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 9e34a820ad4..d85cbabea67 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,6 +26,8 @@
 #include "object-store.h"
 #include "packfile.h"
 
+static const char *grep_prefix;
+
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
 	NULL
@@ -315,11 +317,11 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 	strbuf_reset(out);
 
 	if (opt->null_following_name) {
-		if (opt->relative && opt->prefix_length) {
+		if (opt->relative && grep_prefix) {
 			struct strbuf rel_buf = STRBUF_INIT;
 			const char *rel_name =
 				relative_path(filename + tree_name_len,
-					      opt->prefix, &rel_buf);
+					      grep_prefix, &rel_buf);
 
 			if (tree_name_len)
 				strbuf_add(out, filename, tree_name_len);
@@ -332,8 +334,8 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 		return;
 	}
 
-	if (opt->relative && opt->prefix_length)
-		quote_path(filename + tree_name_len, opt->prefix, out, 0);
+	if (opt->relative && grep_prefix)
+		quote_path(filename + tree_name_len, grep_prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
 
@@ -962,9 +964,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_END()
 	};
+	grep_prefix = prefix;
 
 	git_config(grep_cmd_config, NULL);
-	grep_init(&opt, the_repository, prefix);
+	grep_init(&opt, the_repository);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c71..6cb554cbb0a 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -150,7 +150,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 	ls_tree_prefix = prefix;
-	if (prefix && *prefix)
+	if (prefix)
 		chomp_prefix = strlen(prefix);
 
 	argc = parse_options(argc, argv, prefix, ls_tree_options,
diff --git a/git.c b/git.c
index 5ff21be21f3..6cf48adeb1f 100644
--- a/git.c
+++ b/git.c
@@ -431,6 +431,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 			int nongit_ok;
 			prefix = setup_git_directory_gently(&nongit_ok);
 		}
+		assert(!prefix || *prefix);
 		precompose_argv_prefix(argc, argv, NULL);
 		if (use_pager == -1 && p->option & (RUN_SETUP | RUN_SETUP_GENTLY) &&
 		    !(p->option & DELAY_PAGER_CONFIG))
diff --git a/grep.c b/grep.c
index fe847a0111a..12b202598a9 100644
--- a/grep.c
+++ b/grep.c
@@ -139,13 +139,11 @@ int grep_config(const char *var, const char *value, void *cb)
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
  */
-void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
+void grep_init(struct grep_opt *opt, struct repository *repo)
 {
 	*opt = grep_defaults;
 
 	opt->repo = repo;
-	opt->prefix = prefix;
-	opt->prefix_length = (prefix && *prefix) ? strlen(prefix) : 0;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
diff --git a/grep.h b/grep.h
index 95cccb670f9..62deadb885f 100644
--- a/grep.h
+++ b/grep.h
@@ -134,8 +134,6 @@ struct grep_opt {
 	 */
 	struct repository *repo;
 
-	const char *prefix;
-	int prefix_length;
 	int linenum;
 	int columnnum;
 	int invert;
@@ -180,7 +178,7 @@ struct grep_opt {
 };
 
 int grep_config(const char *var, const char *value, void *);
-void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index 1981a0859f0..a82e31df869 100644
--- a/revision.c
+++ b/revision.c
@@ -1833,7 +1833,7 @@ void repo_init_revisions(struct repository *r,
 	revs->commit_format = CMIT_FMT_DEFAULT;
 	revs->expand_tabs_in_log_default = 8;
 
-	grep_init(&revs->grep_filter, revs->repo, prefix);
+	grep_init(&revs->grep_filter, revs->repo);
 	revs->grep_filter.status_only = 1;
 
 	repo_diff_setup(revs->repo, &revs->diffopt);
-- 
2.34.1.875.gb925cffed1e


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v4 5/7] grep.c: don't pass along NULL callback value
  2021-12-03 10:19   ` [PATCH v4 " Ævar Arnfjörð Bjarmason
                       ` (3 preceding siblings ...)
  2021-12-03 10:19     ` [PATCH v4 4/7] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
@ 2021-12-03 10:19     ` Ævar Arnfjörð Bjarmason
  2021-12-03 10:19     ` [PATCH v4 6/7] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
                       ` (2 subsequent siblings)
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-03 10:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change grep_cmd_config() to stop passing around the always-NULL "cb"
value. When this code was added in 7e8f59d577e (grep: color patterns
in output, 2009-03-07) it was non-NULL, but when that changed in
15fabd1bbd4 (builtin/grep.c: make configuration callback more
reusable, 2012-10-09) this code was left behind.

In a subsequent change I'll start using the "cb" value; this will make
it clear which of the functions we call need it, and which don't.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index d85cbabea67..5ec4cecae45 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,8 +285,8 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, cb);
-	if (git_color_default_config(var, value, cb) < 0)
+	int st = grep_config(var, value, NULL);
+	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
 	if (!strcmp(var, "grep.threads")) {
-- 
2.34.1.875.gb925cffed1e


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v4 6/7] grep API: call grep_config() after grep_init()
  2021-12-03 10:19   ` [PATCH v4 " Ævar Arnfjörð Bjarmason
                       ` (4 preceding siblings ...)
  2021-12-03 10:19     ` [PATCH v4 5/7] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
@ 2021-12-03 10:19     ` Ævar Arnfjörð Bjarmason
  2021-12-03 10:19     ` [PATCH v4 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
  2021-12-22  2:58     ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-03 10:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().

So we effectively (b) initialized config, (a) then defaults, (c)
followed by user options. Usually those are ordered as "a", "b" and
"c" instead.

As the comments being removed here show, the previous behavior needed
to be carefully explained as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.

This does not change the behavior of any of the configuration
variables or options. That would have been the case if we didn't move
around the grep_config() call in "builtin/log.c". But now that we call
"grep_config" after "git_log_config" and "git_format_config" we'll
need to pass in the already initialized "struct grep_opt *".

See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e0 (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.

The memcpy() pattern here will be optimized away and follows the
convention of other *_init() functions. See 5726a6b4012 (*.c *_init():
define in terms of corresponding *_INIT macro, 2021-07-01).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  4 ++--
 builtin/log.c  | 13 +++++++++++--
 grep.c         | 39 +++------------------------------------
 grep.h         | 21 +++++++++++++++++++++
 4 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5ec4cecae45..0ea124321b6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,7 +285,7 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, NULL);
+	int st = grep_config(var, value, cb);
 	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
@@ -966,8 +966,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	};
 	grep_prefix = prefix;
 
-	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, the_repository);
+	git_config(grep_cmd_config, &opt);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/log.c b/builtin/log.c
index f75d87e8d7f..bfddacdfa6c 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -505,8 +505,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (grep_config(var, value, cb) < 0)
-		return -1;
 	if (git_gpg_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
@@ -521,6 +519,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.simplify_history = 0;
 	memset(&opt, 0, sizeof(opt));
@@ -635,6 +635,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 
 	memset(&match_all, 0, sizeof(match_all));
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.always_show_header = 1;
 	rev.no_walk = 1;
@@ -718,6 +720,8 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix)
 
 	repo_init_revisions(the_repository, &rev, prefix);
 	init_reflog_walk(&rev.reflog_info);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.verbose_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -751,6 +755,8 @@ int cmd_log(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.always_show_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -1833,10 +1839,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	extra_hdr.strdup_strings = 1;
 	extra_to.strdup_strings = 1;
 	extra_cc.strdup_strings = 1;
+
 	init_log_defaults();
 	init_display_notes(&notes_opt);
 	git_config(git_format_config, NULL);
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.show_notes = show_notes;
 	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
 	rev.commit_format = CMIT_FMT_EMAIL;
diff --git a/grep.c b/grep.c
index 12b202598a9..8dfa0300786 100644
--- a/grep.c
+++ b/grep.c
@@ -19,27 +19,6 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
-static struct grep_opt grep_defaults = {
-	.relative = 1,
-	.pathname = 1,
-	.max_depth = -1,
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
-	.colors = {
-		[GREP_COLOR_CONTEXT] = "",
-		[GREP_COLOR_FILENAME] = "",
-		[GREP_COLOR_FUNCTION] = "",
-		[GREP_COLOR_LINENO] = "",
-		[GREP_COLOR_COLUMNNO] = "",
-		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_SELECTED] = "",
-		[GREP_COLOR_SEP] = GIT_COLOR_CYAN,
-	},
-	.only_matching = 0,
-	.color = -1,
-	.output = std_output,
-};
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -75,20 +54,12 @@ define_list_config_array_extra(color_grep_slots, {"match"});
  */
 int grep_config(const char *var, const char *value, void *cb)
 {
-	struct grep_opt *opt = &grep_defaults;
+	struct grep_opt *opt = cb;
 	const char *slot;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	/*
-	 * The instance of grep_opt that we set up here is copied by
-	 * grep_init() to be used by each individual invocation.
-	 * When populating a new field of this structure here, be
-	 * sure to think about ownership -- e.g., you might need to
-	 * override the shallow copy in grep_init() with a deep copy.
-	 */
-
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
 		return 0;
@@ -134,14 +105,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/*
- * Initialize one instance of grep_opt and copy the
- * default values from the template we read the configuration
- * information in an earlier call to git_config(grep_config).
- */
 void grep_init(struct grep_opt *opt, struct repository *repo)
 {
-	*opt = grep_defaults;
+	struct grep_opt blank = GREP_OPT_INIT;
+	memcpy(opt, &blank, sizeof(*opt));
 
 	opt->repo = repo;
 	opt->pattern_tail = &opt->pattern_list;
diff --git a/grep.h b/grep.h
index 62deadb885f..b651eb291f7 100644
--- a/grep.h
+++ b/grep.h
@@ -177,6 +177,27 @@ struct grep_opt {
 	void *output_priv;
 };
 
+#define GREP_OPT_INIT { \
+	.relative = 1, \
+	.pathname = 1, \
+	.max_depth = -1, \
+	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
+	.colors = { \
+		[GREP_COLOR_CONTEXT] = "", \
+		[GREP_COLOR_FILENAME] = "", \
+		[GREP_COLOR_FUNCTION] = "", \
+		[GREP_COLOR_LINENO] = "", \
+		[GREP_COLOR_COLUMNNO] = "", \
+		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_SELECTED] = "", \
+		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
+	}, \
+	.only_matching = 0, \
+	.color = -1, \
+	.output = std_output, \
+}
+
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
-- 
2.34.1.875.gb925cffed1e


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v4 7/7] grep: simplify config parsing and option parsing
  2021-12-03 10:19   ` [PATCH v4 " Ævar Arnfjörð Bjarmason
                       ` (5 preceding siblings ...)
  2021-12-03 10:19     ` [PATCH v4 6/7] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2021-12-03 10:19     ` Ævar Arnfjörð Bjarmason
  2021-12-22  2:58     ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-03 10:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Simplify the parsing of "grep.patternType" and
"grep.extendedRegexp". This changes no behavior, but gets rid of
complex parsing logic that isn't needed anymore.

When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
grep.patternType configuration setting, 2012-08-03) we promised that:

 1. You can set "grep.patternType", and "[setting it to] 'default'
    will return to the default matching behavior".

 2. We'd support the existing "grep.extendedRegexp" option, but ignore
    it when the new "grep.patternType" option is set. We said we'd
    only ignore the older "grep.extendedRegexp" option "when the
    `grep.patternType` option is set to a value other than
    'default'".

In a preceding commit we changed grep_config() to be called after
grep_init(), which means that much of the complexity here can go
away.

Now, as before, we only understand a "grep.extendedRegexp" setting
of "true", and if "grep.patterntype=default" is set we'll interpret it
as "grep.patterntype=basic", unless we previously saw a true
"grep.extendedRegexp", in which case it's interpreted as
"grep.patterntype=extended".
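
The interaction described above can be modelled in isolation. This is
only a sketch that mirrors the shape of the grep_config() hunk in this
patch: the "demo_*" names are hypothetical, the boolean parser only
understands "true" (git_config_bool() accepts more spellings), and
unknown pattern types fall back to basic instead of dying:

```c
#include <assert.h>
#include <string.h>

enum demo_pattern_type { DEMO_BRE = 0, DEMO_ERE, DEMO_FIXED, DEMO_PCRE };

struct demo_opt {
	int extended_regexp_option; /* 0 = unset/false, 1 = true, -1 = ignore */
	enum demo_pattern_type pattern_type_option;
};

/* Simplified: only "true" counts as true. */
static int demo_parse_bool(const char *value)
{
	return !strcmp(value, "true");
}

/* Simplified: unknown values fall back to basic. */
static enum demo_pattern_type demo_parse_type(const char *value)
{
	if (!strcmp(value, "extended"))
		return DEMO_ERE;
	if (!strcmp(value, "fixed"))
		return DEMO_FIXED;
	if (!strcmp(value, "perl"))
		return DEMO_PCRE;
	return DEMO_BRE;
}

/* Called once per config entry, in the order the entries are read. */
static void demo_config(struct demo_opt *opt, const char *var,
			const char *value)
{
	if (!strcmp(var, "grep.extendedregexp")) {
		/* Already true, or told to ignore it by grep.patternType */
		if (opt->extended_regexp_option)
			return;
		opt->extended_regexp_option = demo_parse_bool(value);
		if (opt->extended_regexp_option)
			opt->pattern_type_option = DEMO_ERE;
		return;
	}

	if (!strcmp(var, "grep.patterntype")) {
		if (!strcmp(value, "default")) {
			opt->pattern_type_option =
				opt->extended_regexp_option == 1
				? DEMO_ERE : DEMO_BRE;
			return;
		}
		opt->extended_regexp_option = -1; /* ignore */
		opt->pattern_type_option = demo_parse_type(value);
	}
}
```

Under these assumptions "extended" followed by "default" ends up as
basic, and a later "fixed" still wins, which is what the new t7810
tests in this series check.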

See my 07a3d411739 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.

We don't need grep_commit_pattern_type() anymore, we can instead have
OPT_SET_INT() in "builtin/grep.c" manipulate the "pattern_type_option"
member in "struct grep_opt" directly.

We can also do away with the indirection of the "int fixed" and "int
pcre2" members in favor of using "pattern_type_option" directly in
"grep.c".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 10 +++----
 grep.c         | 81 +++++++++++---------------------------------------
 grep.h         |  9 ++----
 revision.c     |  2 --
 4 files changed, 24 insertions(+), 78 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 0ea124321b6..942c4b25077 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -845,7 +845,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
 	int allow_revs;
 
 	struct option options[] = {
@@ -879,16 +878,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
 		OPT_GROUP(""),
-		OPT_SET_INT('E', "extended-regexp", &pattern_type_arg,
+		OPT_SET_INT('E', "extended-regexp", &opt.pattern_type_option,
 			    N_("use extended POSIX regular expressions"),
 			    GREP_PATTERN_TYPE_ERE),
-		OPT_SET_INT('G', "basic-regexp", &pattern_type_arg,
+		OPT_SET_INT('G', "basic-regexp", &opt.pattern_type_option,
 			    N_("use basic POSIX regular expressions (default)"),
 			    GREP_PATTERN_TYPE_BRE),
-		OPT_SET_INT('F', "fixed-strings", &pattern_type_arg,
+		OPT_SET_INT('F', "fixed-strings", &opt.pattern_type_option,
 			    N_("interpret patterns as fixed strings"),
 			    GREP_PATTERN_TYPE_FIXED),
-		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
+		OPT_SET_INT('P', "perl-regexp", &opt.pattern_type_option,
 			    N_("use Perl-compatible regular expressions"),
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
@@ -982,7 +981,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, grep_usage,
 			     PARSE_OPT_KEEP_DASHDASH |
 			     PARSE_OPT_STOP_AT_NON_OPTION);
-	grep_commit_pattern_type(pattern_type_arg, &opt);
 
 	if (use_index && !startup_info->have_repository) {
 		int fallback = 0;
diff --git a/grep.c b/grep.c
index 8dfa0300786..b5342064066 100644
--- a/grep.c
+++ b/grep.c
@@ -33,9 +33,7 @@ static const char *color_grep_slots[] = {
 
 static int parse_pattern_type_arg(const char *opt, const char *arg)
 {
-	if (!strcmp(arg, "default"))
-		return GREP_PATTERN_TYPE_UNSPECIFIED;
-	else if (!strcmp(arg, "basic"))
+	if (!strcmp(arg, "basic"))
 		return GREP_PATTERN_TYPE_BRE;
 	else if (!strcmp(arg, "extended"))
 		return GREP_PATTERN_TYPE_ERE;
@@ -61,11 +59,23 @@ int grep_config(const char *var, const char *value, void *cb)
 		return -1;
 
 	if (!strcmp(var, "grep.extendedregexp")) {
+		if (opt->extended_regexp_option)
+			return 0;
 		opt->extended_regexp_option = git_config_bool(var, value);
+		if (opt->extended_regexp_option)
+			opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
+		return 0;
+	}
+
+	if (!strcmp(var, "grep.patterntype") &&
+	    !strcmp(value, "default")) {
+		opt->pattern_type_option = opt->extended_regexp_option == 1
+			? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
 		return 0;
 	}
 
 	if (!strcmp(var, "grep.patterntype")) {
+		opt->extended_regexp_option = -1; /* ignore */
 		opt->pattern_type_option = parse_pattern_type_arg(var, value);
 		return 0;
 	}
@@ -115,62 +125,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo)
 	opt->header_tail = &opt->header_list;
 }
 
-static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	/*
-	 * When committing to the pattern type by setting the relevant
-	 * fields in grep_opt it's generally not necessary to zero out
-	 * the fields we're not choosing, since they won't have been
-	 * set by anything. The extended_regexp_option field is the
-	 * only exception to this.
-	 *
-	 * This is because in the process of parsing grep.patternType
-	 * & grep.extendedRegexp we set opt->pattern_type_option and
-	 * opt->extended_regexp_option, respectively. We then
-	 * internally use opt->extended_regexp_option to see if we're
-	 * compiling an ERE. It must be unset if that's not actually
-	 * the case.
-	 */
-	if (pattern_type != GREP_PATTERN_TYPE_ERE &&
-	    opt->extended_regexp_option)
-		opt->extended_regexp_option = 0;
-
-	switch (pattern_type) {
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		/* fall through */
-
-	case GREP_PATTERN_TYPE_BRE:
-		break;
-
-	case GREP_PATTERN_TYPE_ERE:
-		opt->extended_regexp_option = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_FIXED:
-		opt->fixed = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_PCRE:
-		opt->pcre2 = 1;
-		break;
-	}
-}
-
-void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(pattern_type, opt);
-	else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(opt->pattern_type_option, opt);
-	else if (opt->extended_regexp_option)
-		/*
-		 * This branch *must* happen after setting from the
-		 * opt->pattern_type_option above, we don't want
-		 * grep.extendedRegexp to override grep.patternType!
-		 */
-		grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
-}
-
 static struct grep_pat *create_grep_pat(const char *pat, size_t patlen,
 					const char *origin, int no,
 					enum grep_pat_token t,
@@ -490,9 +444,10 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 
 	p->word_regexp = opt->word_regexp;
 	p->ignore_case = opt->ignore_case;
-	p->fixed = opt->fixed;
+	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
 
-	if (memchr(p->pattern, 0, p->patternlen) && !opt->pcre2)
+	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
+	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
 	p->is_fixed = is_fixed(p->pattern, p->patternlen);
@@ -543,14 +498,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre2) {
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
 		compile_pcre2_pattern(p, opt);
 		return;
 	}
 
 	if (p->ignore_case)
 		regflags |= REG_ICASE;
-	if (opt->extended_regexp_option)
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
 		regflags |= REG_EXTENDED;
 	err = regcomp(&p->regexp, p->pattern, regflags);
 	if (err) {
diff --git a/grep.h b/grep.h
index b651eb291f7..ab2ce833b40 100644
--- a/grep.h
+++ b/grep.h
@@ -94,8 +94,7 @@ enum grep_expr_node {
 };
 
 enum grep_pattern_type {
-	GREP_PATTERN_TYPE_UNSPECIFIED = 0,
-	GREP_PATTERN_TYPE_BRE,
+	GREP_PATTERN_TYPE_BRE = 0,
 	GREP_PATTERN_TYPE_ERE,
 	GREP_PATTERN_TYPE_FIXED,
 	GREP_PATTERN_TYPE_PCRE
@@ -143,7 +142,6 @@ struct grep_opt {
 	int unmatch_name_only;
 	int count;
 	int word_regexp;
-	int fixed;
 	int all_match;
 #define GREP_BINARY_DEFAULT	0
 #define GREP_BINARY_NOMATCH	1
@@ -152,7 +150,6 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre2;
 	int relative;
 	int pathname;
 	int null_following_name;
@@ -162,7 +159,7 @@ struct grep_opt {
 	int funcname;
 	int funcbody;
 	int extended_regexp_option;
-	int pattern_type_option;
+	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
 	unsigned pre_context;
@@ -181,7 +178,6 @@ struct grep_opt {
 	.relative = 1, \
 	.pathname = 1, \
 	.max_depth = -1, \
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
 	.colors = { \
 		[GREP_COLOR_CONTEXT] = "", \
 		[GREP_COLOR_FILENAME] = "", \
@@ -200,7 +196,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
-void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
 void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index a82e31df869..d08c7ae6c35 100644
--- a/revision.c
+++ b/revision.c
@@ -2855,8 +2855,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	diff_setup_done(&revs->diffopt);
 
-	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
-				 &revs->grep_filter);
 	if (!is_encoding_utf8(get_log_output_encoding()))
 		revs->grep_filter.ignore_locale = 1;
 	compile_grep_patterns(&revs->grep_filter);
-- 
2.34.1.875.gb925cffed1e


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v5 0/7] grep: simplify & delete "init" & "config" code
  2021-12-03 10:19   ` [PATCH v4 " Ævar Arnfjörð Bjarmason
                       ` (6 preceding siblings ...)
  2021-12-03 10:19     ` [PATCH v4 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2021-12-22  2:58     ` Ævar Arnfjörð Bjarmason
  2021-12-22  2:58       ` [PATCH v5 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
                         ` (8 more replies)
  7 siblings, 9 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-22  2:58 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

A simplification and code deletion of the overly complex setup for the
grep API, no behavior changes. For v4 see:
https://lore.kernel.org/git/cover-v4-0.7-00000000000-20211203T101348Z-avarab@gmail.com/

This re-roll is rebased on the latest push-out to "master"; a now-landed
topic had a minor conflict with it in git.c.

Ævar Arnfjörð Bjarmason (7):
  grep.h: remove unused "regex_t regexp" from grep_opt
  log tests: check if grep_config() is called by "log"-like cmds
  grep tests: add missing "grep.patternType" config test
  built-ins: trust the "prefix" from run_builtin()
  grep.c: don't pass along NULL callback value
  grep API: call grep_config() after grep_init()
  grep: simplify config parsing and option parsing

 builtin/grep.c    |  27 +++++-----
 builtin/log.c     |  13 ++++-
 builtin/ls-tree.c |   2 +-
 git.c             |   1 +
 grep.c            | 124 ++++++++--------------------------------------
 grep.h            |  33 ++++++++----
 revision.c        |   4 +-
 t/t4202-log.sh    |  24 +++++++++
 t/t7810-grep.sh   |  19 +++++++
 9 files changed, 116 insertions(+), 131 deletions(-)

Range-diff against v4:
1:  d7d232b2b52 = 1:  b6a3e0e2e08 grep.h: remove unused "regex_t regexp" from grep_opt
2:  f853d669682 = 2:  c0d77b2683f log tests: check if grep_config() is called by "log"-like cmds
3:  a97b7de3a3c = 3:  f02f246aa23 grep tests: add missing "grep.patternType" config test
4:  f7d995a5a80 ! 4:  a542a352eab built-ins: trust the "prefix" from run_builtin()
    @@ builtin/ls-tree.c: int cmd_ls_tree(int argc, const char **argv, const char *pref
     
      ## git.c ##
     @@ git.c: static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
    - 			int nongit_ok;
    - 			prefix = setup_git_directory_gently(&nongit_ok);
    - 		}
    -+		assert(!prefix || *prefix);
    - 		precompose_argv_prefix(argc, argv, NULL);
    - 		if (use_pager == -1 && p->option & (RUN_SETUP | RUN_SETUP_GENTLY) &&
    - 		    !(p->option & DELAY_PAGER_CONFIG))
    + 	} else {
    + 		prefix = NULL;
    + 	}
    ++	assert(!prefix || *prefix);
    + 	precompose_argv_prefix(argc, argv, NULL);
    + 	if (use_pager == -1 && run_setup &&
    + 		!(p->option & DELAY_PAGER_CONFIG))
     
      ## grep.c ##
     @@ grep.c: int grep_config(const char *var, const char *value, void *cb)
5:  ab1685f0dad = 5:  a33b00a247e grep.c: don't pass along NULL callback value
6:  8ffa22df8c7 = 6:  92b1c3958fa grep API: call grep_config() after grep_init()
7:  efbd1c50b43 = 7:  63de643ebc2 grep: simplify config parsing and option parsing
-- 
2.34.1.1146.gb52885e7c44


^ permalink raw reply	[flat|nested] 151+ messages in thread

* [PATCH v5 1/7] grep.h: remove unused "regex_t regexp" from grep_opt
  2021-12-22  2:58     ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
@ 2021-12-22  2:58       ` Ævar Arnfjörð Bjarmason
  2021-12-22  2:58       ` [PATCH v5 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
                         ` (7 subsequent siblings)
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-22  2:58 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This "regex_t" in grep_opt has not been used since
f9b9faf6f8a (builtin-grep: allow more than one patterns., 2006-05-02).
We still use a "regex_t" for compiling regexes, but that's in the
"grep_pat" struct.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.h b/grep.h
index 3e8815c347b..95cccb670f9 100644
--- a/grep.h
+++ b/grep.h
@@ -136,7 +136,6 @@ struct grep_opt {
 
 	const char *prefix;
 	int prefix_length;
-	regex_t regexp;
 	int linenum;
 	int columnnum;
 	int invert;
-- 
2.34.1.1146.gb52885e7c44


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v5 2/7] log tests: check if grep_config() is called by "log"-like cmds
  2021-12-22  2:58     ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2021-12-22  2:58       ` [PATCH v5 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
@ 2021-12-22  2:58       ` Ævar Arnfjörð Bjarmason
  2021-12-22  2:58       ` [PATCH v5 3/7] grep tests: add missing "grep.patternType" config test Ævar Arnfjörð Bjarmason
                         ` (6 subsequent siblings)
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-22  2:58 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
for pattern style options & config, 2017-05-20) to check not only
whether "git log" handles "grep.patternType", but also "git show"
etc.

It's sufficient to check whether we match a "fixed" or a "basic" regex
here to see if these codepaths correctly invoked grep_config(). We
don't need to check the details of their regular expression matching
as the "log" test does.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 2ced7e9d817..2490f2cd5ed 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -449,6 +449,30 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
 	)
 '
 
+for cmd in show whatchanged reflog format-patch
+do
+	case "$cmd" in
+	format-patch) myarg="HEAD~.." ;;
+	*) myarg= ;;
+	esac
+
+	test_expect_success "$cmd: understands grep.patternType, like 'log'" '
+		git init "pattern-type-$cmd" &&
+		(
+			cd "pattern-type-$cmd" &&
+			test_commit 1 file A &&
+			test_commit "(1|2)" file B 2 &&
+
+			git -c grep.patternType=fixed $cmd --grep="..." $myarg >actual &&
+			test_must_be_empty actual &&
+
+			git -c grep.patternType=basic $cmd --grep="..." $myarg >actual &&
+			test_file_not_empty actual
+		)
+	'
+done
+test_done
+
 test_expect_success 'log --author' '
 	cat >expect <<-\EOF &&
 	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>
-- 
2.34.1.1146.gb52885e7c44


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v5 3/7] grep tests: add missing "grep.patternType" config test
  2021-12-22  2:58     ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2021-12-22  2:58       ` [PATCH v5 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
  2021-12-22  2:58       ` [PATCH v5 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2021-12-22  2:58       ` Ævar Arnfjörð Bjarmason
  2021-12-23 22:25         ` Junio C Hamano
  2021-12-22  2:58       ` [PATCH v5 4/7] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
                         ` (5 subsequent siblings)
  8 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-22  2:58 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the grep tests to assert that setting
"grep.patternType=extended" followed by "grep.patternType=default"
will behave as if "--basic-regexp" was provided, and not as
"--extended-regexp". In a subsequent commit we'll need to treat
"grep.patternType=default" as a special-case, but let's make sure we
ignore it if it's being set to "default" following an earlier
non-"default" "grep.patternType" setting.

Let's also test what happens when we have a sequence of "extended"
followed by "default" and "fixed". In that case the "fixed" should
prevail.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7810-grep.sh | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 6b6423a07c3..113902c3bda 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -451,6 +451,25 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.patternType=[extended -> default -> fixed]" '
+		echo "${HC}ab:a+b*c" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			-c grep.patternType=fixed \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
 		echo "${HC}ab:abc" >expected &&
 		git \
-- 
2.34.1.1146.gb52885e7c44


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v5 4/7] built-ins: trust the "prefix" from run_builtin()
  2021-12-22  2:58     ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                         ` (2 preceding siblings ...)
  2021-12-22  2:58       ` [PATCH v5 3/7] grep tests: add missing "grep.patternType" config test Ævar Arnfjörð Bjarmason
@ 2021-12-22  2:58       ` Ævar Arnfjörð Bjarmason
  2021-12-22  2:58       ` [PATCH v5 5/7] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
                         ` (4 subsequent siblings)
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-22  2:58 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the
"prefix" passed from "run_builtin()". The "prefix" we get from setup.c
is either going to be NULL or a string of length >0, never "".

So we can drop the "prefix && *prefix" checks added for
"builtin/grep.c" in 0d042fecf2f (git-grep: show pathnames relative to
the current directory, 2006-08-11), and for "builtin/ls-tree.c" in
a69dd585fca (ls-tree: chomp leading directories when run from a
subdirectory, 2005-12-23).

As seen in code in revision.c that was added in cd676a51367 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that does away with this
assertion.

This makes it easier to reason about a change to the
"prefix_length" code in grep.c in a subsequent commit, and since we're
going to the trouble of doing that let's leave behind an assert() to
promise this to any future callers.

For "builtin/grep.c" it would be painful to pass the "prefix" down the
callchain of:

    cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
    grep_source_name

So for the code that needs it in grep_source_name() let's add a
"grep_prefix" variable similar to the existing "ls_tree_prefix".

While at it let's move the code in cmd_ls_tree() around so that we
assign to the "ls_tree_prefix" right after declaring the variables,
and stop assigning to "prefix". We only subsequently used that
variable later in the function after clobbering it. Let's just use our
own "grep_prefix" instead.

Let's also add an assert() in git.c, so that we'll make this promise
about the "prefix" to any current and future callers, as well as to
any readers of the code.
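
A small sketch of what the promise buys a caller (the helper name
"demo_chomp_len()" is hypothetical; only the assert() expression is
taken from this patch):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: how many leading bytes to chomp from paths. */
static size_t demo_chomp_len(const char *prefix)
{
	/* The invariant: NULL, or a string of length > 0; never "". */
	assert(!prefix || *prefix);

	/* So a plain NULL check replaces "prefix && *prefix". */
	return prefix ? strlen(prefix) : 0;
}
```

Once the invariant is asserted at the entry point, downstream code can
treat "prefix" itself as the boolean and needs no separate length
member.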

Code history:

 * The strlen() in "grep.c" hasn't been used since 493b7a08d80 (grep:
   accept relative paths outside current working directory, 2009-09-05).

   When that code was added in 0d042fecf2f (git-grep: show pathnames
   relative to the current directory, 2006-08-11) we used the length.

   But since 493b7a08d80 we haven't used it for anything except a
   boolean check that we could have done on the "prefix" member
   itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c    | 13 ++++++++-----
 builtin/ls-tree.c |  2 +-
 git.c             |  1 +
 grep.c            |  4 +---
 grep.h            |  4 +---
 revision.c        |  2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 9e34a820ad4..d85cbabea67 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,6 +26,8 @@
 #include "object-store.h"
 #include "packfile.h"
 
+static const char *grep_prefix;
+
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
 	NULL
@@ -315,11 +317,11 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 	strbuf_reset(out);
 
 	if (opt->null_following_name) {
-		if (opt->relative && opt->prefix_length) {
+		if (opt->relative && grep_prefix) {
 			struct strbuf rel_buf = STRBUF_INIT;
 			const char *rel_name =
 				relative_path(filename + tree_name_len,
-					      opt->prefix, &rel_buf);
+					      grep_prefix, &rel_buf);
 
 			if (tree_name_len)
 				strbuf_add(out, filename, tree_name_len);
@@ -332,8 +334,8 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 		return;
 	}
 
-	if (opt->relative && opt->prefix_length)
-		quote_path(filename + tree_name_len, opt->prefix, out, 0);
+	if (opt->relative && grep_prefix)
+		quote_path(filename + tree_name_len, grep_prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
 
@@ -962,9 +964,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_END()
 	};
+	grep_prefix = prefix;
 
 	git_config(grep_cmd_config, NULL);
-	grep_init(&opt, the_repository, prefix);
+	grep_init(&opt, the_repository);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c71..6cb554cbb0a 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -150,7 +150,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 	ls_tree_prefix = prefix;
-	if (prefix && *prefix)
+	if (prefix)
 		chomp_prefix = strlen(prefix);
 
 	argc = parse_options(argc, argv, prefix, ls_tree_options,
diff --git a/git.c b/git.c
index 7edafd8ecff..575d95046f2 100644
--- a/git.c
+++ b/git.c
@@ -436,6 +436,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 	} else {
 		prefix = NULL;
 	}
+	assert(!prefix || *prefix);
 	precompose_argv_prefix(argc, argv, NULL);
 	if (use_pager == -1 && run_setup &&
 		!(p->option & DELAY_PAGER_CONFIG))
diff --git a/grep.c b/grep.c
index fe847a0111a..12b202598a9 100644
--- a/grep.c
+++ b/grep.c
@@ -139,13 +139,11 @@ int grep_config(const char *var, const char *value, void *cb)
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
  */
-void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
+void grep_init(struct grep_opt *opt, struct repository *repo)
 {
 	*opt = grep_defaults;
 
 	opt->repo = repo;
-	opt->prefix = prefix;
-	opt->prefix_length = (prefix && *prefix) ? strlen(prefix) : 0;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
diff --git a/grep.h b/grep.h
index 95cccb670f9..62deadb885f 100644
--- a/grep.h
+++ b/grep.h
@@ -134,8 +134,6 @@ struct grep_opt {
 	 */
 	struct repository *repo;
 
-	const char *prefix;
-	int prefix_length;
 	int linenum;
 	int columnnum;
 	int invert;
@@ -180,7 +178,7 @@ struct grep_opt {
 };
 
 int grep_config(const char *var, const char *value, void *);
-void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index 5390a479b30..495328e859c 100644
--- a/revision.c
+++ b/revision.c
@@ -1838,7 +1838,7 @@ void repo_init_revisions(struct repository *r,
 	revs->commit_format = CMIT_FMT_DEFAULT;
 	revs->expand_tabs_in_log_default = 8;
 
-	grep_init(&revs->grep_filter, revs->repo, prefix);
+	grep_init(&revs->grep_filter, revs->repo);
 	revs->grep_filter.status_only = 1;
 
 	repo_diff_setup(revs->repo, &revs->diffopt);
-- 
2.34.1.1146.gb52885e7c44



* [PATCH v5 5/7] grep.c: don't pass along NULL callback value
  2021-12-22  2:58     ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                         ` (3 preceding siblings ...)
  2021-12-22  2:58       ` [PATCH v5 4/7] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
@ 2021-12-22  2:58       ` Ævar Arnfjörð Bjarmason
  2021-12-22  2:58       ` [PATCH v5 6/7] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
                         ` (3 subsequent siblings)
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-22  2:58 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change grep_cmd_config() to stop passing around the always-NULL "cb"
value. When this code was added in 7e8f59d577e (grep: color patterns
in output, 2009-03-07) it was non-NULL, but when that changed in
15fabd1bbd4 (builtin/grep.c: make configuration callback more
reusable, 2012-10-09) this code was left behind.

In a subsequent change I'll start using the "cb" value; this will make
it clear which of the functions we call need it, and which don't.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index d85cbabea67..5ec4cecae45 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,8 +285,8 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, cb);
-	if (git_color_default_config(var, value, cb) < 0)
+	int st = grep_config(var, value, NULL);
+	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
 	if (!strcmp(var, "grep.threads")) {
-- 
2.34.1.1146.gb52885e7c44



* [PATCH v5 6/7] grep API: call grep_config() after grep_init()
  2021-12-22  2:58     ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                         ` (4 preceding siblings ...)
  2021-12-22  2:58       ` [PATCH v5 5/7] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
@ 2021-12-22  2:58       ` Ævar Arnfjörð Bjarmason
  2021-12-22  2:58       ` [PATCH v5 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
                         ` (2 subsequent siblings)
  8 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-22  2:58 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().

So we effectively (b) initialized config, (a) then defaults, (c)
followed by user options. Usually those are ordered as "a", "b" and
"c" instead.

As the comments being removed here show, the previous behavior needed
to be carefully explained as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.

This does not change the behavior of any of the configuration
variables or options. That would have been the case if we didn't move
around the grep_config() call in "builtin/log.c". But now that we call
"grep_config" after "git_log_config" and "git_format_config" we'll
need to pass in the already initialized "struct grep_opt *".
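
A minimal sketch of the conventional "a", "b", "c" ordering, with the
struct handed to the config callback through "cb". Everything named
"demo_*" here is hypothetical and only stands in for "struct grep_opt",
grep_init(), grep_config() and the option parser; the "demo.linenumber"
key is likewise made up for illustration:

```c
#include <string.h>

/* Hypothetical, pared-down stand-in for "struct grep_opt". */
struct demo_opt {
	int linenum;
	int max_depth;
};

/* (a) defaults: start from a known state */
static void demo_init(struct demo_opt *opt)
{
	opt->linenum = 0;
	opt->max_depth = -1;
}

/* (b) config: the callback writes into the struct passed via "cb" */
static int demo_config(const char *var, const char *value, void *cb)
{
	struct demo_opt *opt = cb;

	if (!strcmp(var, "demo.linenumber")) {
		/* simplified boolean parsing, assumed for this sketch */
		opt->linenum = !strcmp(value, "true");
		return 0;
	}
	return 0;
}

/* (c) user options: applied last, so they override the config */
static void demo_parse_option(struct demo_opt *opt, const char *arg)
{
	if (!strcmp(arg, "--line-number"))
		opt->linenum = 1;
	else if (!strcmp(arg, "--no-line-number"))
		opt->linenum = 0;
}
```

Because the config callback only ever touches the instance it is
given, nothing is shared between separately initialized structs.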

See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e0 (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.

The memcpy() pattern here will be optimized away and follows the
convention of other *_init() functions. See 5726a6b4012 (*.c *_init():
define in terms of corresponding *_INIT macro, 2021-07-01).
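
As a rough illustration of that convention (the "demo_opt" names are
hypothetical, not part of git), a *_INIT macro and an *_init() function
defined in terms of it might look like this:

```c
#include <string.h>

/* Hypothetical struct; the fields only loosely mirror "struct grep_opt". */
struct demo_opt {
	int relative;
	int max_depth;
	int color;
};

/* Single source of truth for the default values. */
#define DEMO_OPT_INIT { \
	.relative = 1, \
	.max_depth = -1, \
	.color = -1, \
}

/* Runtime initializer expressed via the macro, so the two cannot drift apart. */
void demo_opt_init(struct demo_opt *opt)
{
	struct demo_opt blank = DEMO_OPT_INIT;
	memcpy(opt, &blank, sizeof(*opt));
}
```

A caller can then use either "struct demo_opt o = DEMO_OPT_INIT;" or
"demo_opt_init(&o);" and end up with the same values.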

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  4 ++--
 builtin/log.c  | 13 +++++++++++--
 grep.c         | 39 +++------------------------------------
 grep.h         | 21 +++++++++++++++++++++
 4 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5ec4cecae45..0ea124321b6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,7 +285,7 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, NULL);
+	int st = grep_config(var, value, cb);
 	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
@@ -966,8 +966,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	};
 	grep_prefix = prefix;
 
-	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, the_repository);
+	git_config(grep_cmd_config, &opt);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/log.c b/builtin/log.c
index 93ace0dde7d..fdde77e4ebb 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -520,8 +520,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (grep_config(var, value, cb) < 0)
-		return -1;
 	if (git_gpg_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
@@ -536,6 +534,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.simplify_history = 0;
 	memset(&opt, 0, sizeof(opt));
@@ -650,6 +650,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 
 	memset(&match_all, 0, sizeof(match_all));
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.always_show_header = 1;
 	rev.no_walk = 1;
@@ -733,6 +735,8 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix)
 
 	repo_init_revisions(the_repository, &rev, prefix);
 	init_reflog_walk(&rev.reflog_info);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.verbose_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -766,6 +770,8 @@ int cmd_log(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.always_show_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -1848,10 +1854,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	extra_hdr.strdup_strings = 1;
 	extra_to.strdup_strings = 1;
 	extra_cc.strdup_strings = 1;
+
 	init_log_defaults();
 	init_display_notes(&notes_opt);
 	git_config(git_format_config, NULL);
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.show_notes = show_notes;
 	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
 	rev.commit_format = CMIT_FMT_EMAIL;
diff --git a/grep.c b/grep.c
index 12b202598a9..8dfa0300786 100644
--- a/grep.c
+++ b/grep.c
@@ -19,27 +19,6 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
-static struct grep_opt grep_defaults = {
-	.relative = 1,
-	.pathname = 1,
-	.max_depth = -1,
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
-	.colors = {
-		[GREP_COLOR_CONTEXT] = "",
-		[GREP_COLOR_FILENAME] = "",
-		[GREP_COLOR_FUNCTION] = "",
-		[GREP_COLOR_LINENO] = "",
-		[GREP_COLOR_COLUMNNO] = "",
-		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_SELECTED] = "",
-		[GREP_COLOR_SEP] = GIT_COLOR_CYAN,
-	},
-	.only_matching = 0,
-	.color = -1,
-	.output = std_output,
-};
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -75,20 +54,12 @@ define_list_config_array_extra(color_grep_slots, {"match"});
  */
 int grep_config(const char *var, const char *value, void *cb)
 {
-	struct grep_opt *opt = &grep_defaults;
+	struct grep_opt *opt = cb;
 	const char *slot;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	/*
-	 * The instance of grep_opt that we set up here is copied by
-	 * grep_init() to be used by each individual invocation.
-	 * When populating a new field of this structure here, be
-	 * sure to think about ownership -- e.g., you might need to
-	 * override the shallow copy in grep_init() with a deep copy.
-	 */
-
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
 		return 0;
@@ -134,14 +105,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/*
- * Initialize one instance of grep_opt and copy the
- * default values from the template we read the configuration
- * information in an earlier call to git_config(grep_config).
- */
 void grep_init(struct grep_opt *opt, struct repository *repo)
 {
-	*opt = grep_defaults;
+	struct grep_opt blank = GREP_OPT_INIT;
+	memcpy(opt, &blank, sizeof(*opt));
 
 	opt->repo = repo;
 	opt->pattern_tail = &opt->pattern_list;
diff --git a/grep.h b/grep.h
index 62deadb885f..b651eb291f7 100644
--- a/grep.h
+++ b/grep.h
@@ -177,6 +177,27 @@ struct grep_opt {
 	void *output_priv;
 };
 
+#define GREP_OPT_INIT { \
+	.relative = 1, \
+	.pathname = 1, \
+	.max_depth = -1, \
+	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
+	.colors = { \
+		[GREP_COLOR_CONTEXT] = "", \
+		[GREP_COLOR_FILENAME] = "", \
+		[GREP_COLOR_FUNCTION] = "", \
+		[GREP_COLOR_LINENO] = "", \
+		[GREP_COLOR_COLUMNNO] = "", \
+		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_SELECTED] = "", \
+		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
+	}, \
+	.only_matching = 0, \
+	.color = -1, \
+	.output = std_output, \
+}
+
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
-- 
2.34.1.1146.gb52885e7c44



* [PATCH v5 7/7] grep: simplify config parsing and option parsing
  2021-12-22  2:58     ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                         ` (5 preceding siblings ...)
  2021-12-22  2:58       ` [PATCH v5 6/7] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2021-12-22  2:58       ` Ævar Arnfjörð Bjarmason
  2021-12-23 22:37         ` Junio C Hamano
  2021-12-23  0:30       ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Junio C Hamano
  2021-12-26 22:37       ` [PATCH v6 " Ævar Arnfjörð Bjarmason
  8 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-22  2:58 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Simplify the parsing of "grep.patternType" and
"grep.extendedRegexp". This changes no behavior, but gets rid of
complex parsing logic that isn't needed anymore.

When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
grep.patternType configuration setting, 2012-08-03) we promised that:

 1. You can set "grep.patternType", and "[setting it to] 'default'
    will return to the default matching behavior".

 2. We'd support the existing "grep.extendedRegexp" option, but ignore
    it when the new "grep.patternType" option is set. We said we'd
    only ignore the older "grep.extendedRegexp" option "when the
    `grep.patternType` option is set to a value other than
    'default'".

In a preceding commit we changed grep_config() to be called after
grep_init(), which means that much of the complexity here can go
away.

Now as before when we only understand a "grep.extendedRegexp" setting
of "true", and if "grep.patterntype=default" is set we'll interpret it
as "grep.patterntype=basic", except if we previously saw a
"grep.extendedRegexp", then it's interpreted as
"grep.patterntype=extended".

See my 07a3d411739 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.

We don't need grep_commit_pattern_type() anymore, we can instead have
OPT_SET_INT() in "builtin/grep.c" manipulate the "pattern_type_option"
member in "struct grep_opt" directly.
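
To illustrate the idea of options writing straight into the struct
member, here is a simplified sketch. The "demo_option" table below is a
hypothetical stand-in and is not the parse-options API; it only shows a
"set this int to that value" option targeting the struct field
directly, with no temporary variable and no later "commit" step:

```c
#include <stddef.h>

enum demo_pattern_type {
	DEMO_BRE = 0,
	DEMO_ERE,
	DEMO_FIXED,
	DEMO_PCRE
};

/* Hypothetical stand-in for "struct grep_opt". */
struct demo_opt {
	int pattern_type_option;
};

/* Hypothetical "set int" option: on a match, store "value" in "*target". */
struct demo_option {
	char short_name;
	int *target;
	int value;
};

/* Returns 0 if the short option was recognized, -1 otherwise. */
static int demo_parse_short(const struct demo_option *options, size_t nr,
			    char c)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (options[i].short_name == c) {
			*options[i].target = options[i].value;
			return 0;
		}
	}
	return -1;
}
```

With a table whose entries all point at "opt.pattern_type_option", the
last option given on the command line is the one that sticks.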

We can also do away with the indirection of the "int fixed" and "int
pcre2" members in favor of using "pattern_type_option" directly in
"grep.c".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 10 +++----
 grep.c         | 81 +++++++++++---------------------------------------
 grep.h         |  9 ++----
 revision.c     |  2 --
 4 files changed, 24 insertions(+), 78 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 0ea124321b6..942c4b25077 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -845,7 +845,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
 	int allow_revs;
 
 	struct option options[] = {
@@ -879,16 +878,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
 		OPT_GROUP(""),
-		OPT_SET_INT('E', "extended-regexp", &pattern_type_arg,
+		OPT_SET_INT('E', "extended-regexp", &opt.pattern_type_option,
 			    N_("use extended POSIX regular expressions"),
 			    GREP_PATTERN_TYPE_ERE),
-		OPT_SET_INT('G', "basic-regexp", &pattern_type_arg,
+		OPT_SET_INT('G', "basic-regexp", &opt.pattern_type_option,
 			    N_("use basic POSIX regular expressions (default)"),
 			    GREP_PATTERN_TYPE_BRE),
-		OPT_SET_INT('F', "fixed-strings", &pattern_type_arg,
+		OPT_SET_INT('F', "fixed-strings", &opt.pattern_type_option,
 			    N_("interpret patterns as fixed strings"),
 			    GREP_PATTERN_TYPE_FIXED),
-		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
+		OPT_SET_INT('P', "perl-regexp", &opt.pattern_type_option,
 			    N_("use Perl-compatible regular expressions"),
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
@@ -982,7 +981,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, grep_usage,
 			     PARSE_OPT_KEEP_DASHDASH |
 			     PARSE_OPT_STOP_AT_NON_OPTION);
-	grep_commit_pattern_type(pattern_type_arg, &opt);
 
 	if (use_index && !startup_info->have_repository) {
 		int fallback = 0;
diff --git a/grep.c b/grep.c
index 8dfa0300786..b5342064066 100644
--- a/grep.c
+++ b/grep.c
@@ -33,9 +33,7 @@ static const char *color_grep_slots[] = {
 
 static int parse_pattern_type_arg(const char *opt, const char *arg)
 {
-	if (!strcmp(arg, "default"))
-		return GREP_PATTERN_TYPE_UNSPECIFIED;
-	else if (!strcmp(arg, "basic"))
+	if (!strcmp(arg, "basic"))
 		return GREP_PATTERN_TYPE_BRE;
 	else if (!strcmp(arg, "extended"))
 		return GREP_PATTERN_TYPE_ERE;
@@ -61,11 +59,23 @@ int grep_config(const char *var, const char *value, void *cb)
 		return -1;
 
 	if (!strcmp(var, "grep.extendedregexp")) {
+		if (opt->extended_regexp_option)
+			return 0;
 		opt->extended_regexp_option = git_config_bool(var, value);
+		if (opt->extended_regexp_option)
+			opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
+		return 0;
+	}
+
+	if (!strcmp(var, "grep.patterntype") &&
+	    !strcmp(value, "default")) {
+		opt->pattern_type_option = opt->extended_regexp_option == 1
+			? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
 		return 0;
 	}
 
 	if (!strcmp(var, "grep.patterntype")) {
+		opt->extended_regexp_option = -1; /* ignore */
 		opt->pattern_type_option = parse_pattern_type_arg(var, value);
 		return 0;
 	}
@@ -115,62 +125,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo)
 	opt->header_tail = &opt->header_list;
 }
 
-static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	/*
-	 * When committing to the pattern type by setting the relevant
-	 * fields in grep_opt it's generally not necessary to zero out
-	 * the fields we're not choosing, since they won't have been
-	 * set by anything. The extended_regexp_option field is the
-	 * only exception to this.
-	 *
-	 * This is because in the process of parsing grep.patternType
-	 * & grep.extendedRegexp we set opt->pattern_type_option and
-	 * opt->extended_regexp_option, respectively. We then
-	 * internally use opt->extended_regexp_option to see if we're
-	 * compiling an ERE. It must be unset if that's not actually
-	 * the case.
-	 */
-	if (pattern_type != GREP_PATTERN_TYPE_ERE &&
-	    opt->extended_regexp_option)
-		opt->extended_regexp_option = 0;
-
-	switch (pattern_type) {
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		/* fall through */
-
-	case GREP_PATTERN_TYPE_BRE:
-		break;
-
-	case GREP_PATTERN_TYPE_ERE:
-		opt->extended_regexp_option = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_FIXED:
-		opt->fixed = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_PCRE:
-		opt->pcre2 = 1;
-		break;
-	}
-}
-
-void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(pattern_type, opt);
-	else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(opt->pattern_type_option, opt);
-	else if (opt->extended_regexp_option)
-		/*
-		 * This branch *must* happen after setting from the
-		 * opt->pattern_type_option above, we don't want
-		 * grep.extendedRegexp to override grep.patternType!
-		 */
-		grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
-}
-
 static struct grep_pat *create_grep_pat(const char *pat, size_t patlen,
 					const char *origin, int no,
 					enum grep_pat_token t,
@@ -490,9 +444,10 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 
 	p->word_regexp = opt->word_regexp;
 	p->ignore_case = opt->ignore_case;
-	p->fixed = opt->fixed;
+	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
 
-	if (memchr(p->pattern, 0, p->patternlen) && !opt->pcre2)
+	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
+	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
 	p->is_fixed = is_fixed(p->pattern, p->patternlen);
@@ -543,14 +498,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre2) {
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
 		compile_pcre2_pattern(p, opt);
 		return;
 	}
 
 	if (p->ignore_case)
 		regflags |= REG_ICASE;
-	if (opt->extended_regexp_option)
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
 		regflags |= REG_EXTENDED;
 	err = regcomp(&p->regexp, p->pattern, regflags);
 	if (err) {
diff --git a/grep.h b/grep.h
index b651eb291f7..ab2ce833b40 100644
--- a/grep.h
+++ b/grep.h
@@ -94,8 +94,7 @@ enum grep_expr_node {
 };
 
 enum grep_pattern_type {
-	GREP_PATTERN_TYPE_UNSPECIFIED = 0,
-	GREP_PATTERN_TYPE_BRE,
+	GREP_PATTERN_TYPE_BRE = 0,
 	GREP_PATTERN_TYPE_ERE,
 	GREP_PATTERN_TYPE_FIXED,
 	GREP_PATTERN_TYPE_PCRE
@@ -143,7 +142,6 @@ struct grep_opt {
 	int unmatch_name_only;
 	int count;
 	int word_regexp;
-	int fixed;
 	int all_match;
 #define GREP_BINARY_DEFAULT	0
 #define GREP_BINARY_NOMATCH	1
@@ -152,7 +150,6 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre2;
 	int relative;
 	int pathname;
 	int null_following_name;
@@ -162,7 +159,7 @@ struct grep_opt {
 	int funcname;
 	int funcbody;
 	int extended_regexp_option;
-	int pattern_type_option;
+	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
 	unsigned pre_context;
@@ -181,7 +178,6 @@ struct grep_opt {
 	.relative = 1, \
 	.pathname = 1, \
 	.max_depth = -1, \
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
 	.colors = { \
 		[GREP_COLOR_CONTEXT] = "", \
 		[GREP_COLOR_FILENAME] = "", \
@@ -200,7 +196,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
-void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
 void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index 495328e859c..298d0ea7574 100644
--- a/revision.c
+++ b/revision.c
@@ -2860,8 +2860,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	diff_setup_done(&revs->diffopt);
 
-	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
-				 &revs->grep_filter);
 	if (!is_encoding_utf8(get_log_output_encoding()))
 		revs->grep_filter.ignore_locale = 1;
 	compile_grep_patterns(&revs->grep_filter);
-- 
2.34.1.1146.gb52885e7c44



* Re: [PATCH v5 0/7] grep: simplify & delete "init" & "config" code
  2021-12-22  2:58     ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                         ` (6 preceding siblings ...)
  2021-12-22  2:58       ` [PATCH v5 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2021-12-23  0:30       ` Junio C Hamano
  2021-12-26 22:37       ` [PATCH v6 " Ævar Arnfjörð Bjarmason
  8 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-12-23  0:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> A simplification and code deletion of the overly complex setup for the
> grep API, no behavior changes. For v4 see:
> https://lore.kernel.org/git/cover-v4-0.7-00000000000-20211203T101348Z-avarab@gmail.com/
>
> This re-roll is rebased on the latest push-out to "master", now-landed
> topic had a minor conflict with it in git.c.

I understand that this has no changes other than the rebase to
adjust for the "even when we are running 'git cmd -h', make sure we
try to find where the git repository is (gently)".  Am I correct?

I am not complaining for lack of improvements or anything.  I am
only making sure that I am applying the right version to a right
base.

Thanks.


* Re: [PATCH v5 3/7] grep tests: add missing "grep.patternType" config test
  2021-12-22  2:58       ` [PATCH v5 3/7] grep tests: add missing "grep.patternType" config test Ævar Arnfjörð Bjarmason
@ 2021-12-23 22:25         ` Junio C Hamano
  2021-12-25  0:06           ` Re* " Junio C Hamano
  0 siblings, 1 reply; 151+ messages in thread
From: Junio C Hamano @ 2021-12-23 22:25 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Extend the grep tests to assert that setting
> "grep.patternType=extended" followed by "grep.patternType=default"
> will behave as if "--basic-regexp" was provided, and not as
> "--extended-regexp". In a subsequent commit we'll need to treat
> "grep.patternType=default" as a special-case, but let's make sure we
> ignore it if it's being set to "default" following an earlier
> non-"default" "grep.patternType" setting.
>
> Let's also test what happens when we have a sequence of "extended"
> followed by "default" and "fixed". In that case the "fixed" should
> prevail.

The grep.patternType configuration variable has the "last one wins"
semantics just like all the usual configuration variables, but the
meaning of the variable when it is set to "default" depends on the
value of the grep.extendedRegexp variable.

If you rewrite with the above understanding, what you wrote will
become a lot more concise.

    Extend the grep tests to assert that grep.patternType is the
    usual "last one wins" variable, and specifically, setting it to
    "default" has the same meaning as setting it to "basic" when
    grep.extendedRegexp is not set (or set to false).

> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  t/t7810-grep.sh | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> index 6b6423a07c3..113902c3bda 100755
> --- a/t/t7810-grep.sh
> +++ b/t/t7810-grep.sh
> @@ -451,6 +451,25 @@ do
>  		test_cmp expected actual
>  	'
>  
> +	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
> +		echo "${HC}ab:a+bc" >expected &&
> +		git \
> +			-c grep.patternType=extended \
> +			-c grep.patternType=default \
> +			grep "a+b*c" $H ab >actual &&
> +		test_cmp expected actual
> +	'
> +
> +	test_expect_success "grep $L with grep.patternType=[extended -> default -> fixed]" '
> +		echo "${HC}ab:a+b*c" >expected &&
> +		git \
> +			-c grep.patternType=extended \
> +			-c grep.patternType=default \
> +			-c grep.patternType=fixed \
> +			grep "a+b*c" $H ab >actual &&

And from that point of view, I think the second new test has much
less value than a possible alternative to ensure grep.patternType
set to fixed and then default behaves like setting it to extended
when grep.extendedRegexp is set to true.  As written, this is just
testing that the variable we designed and know to be "last one wins"
behaves as one once more.

> +		test_cmp expected actual
> +	'
> +
>  	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
>  		echo "${HC}ab:abc" >expected &&
>  		git \


* Re: [PATCH v5 7/7] grep: simplify config parsing and option parsing
  2021-12-22  2:58       ` [PATCH v5 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2021-12-23 22:37         ` Junio C Hamano
  0 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-12-23 22:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Simplify the parsing of "grep.patternType" and
> "grep.extendedRegexp". This changes no behavior, but gets rid of
> complex parsing logic that isn't needed anymore.
>
> When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
> grep.patternType configuration setting, 2012-08-03) we promised that:
>
>  1. You can set "grep.patternType", and "[setting it to] 'default'
>     will return to the default matching behavior".

You need to call readers' attention to the fact that back then the
author of that commit meant "the default" to be "whatever the
configuration system specified before the patch was applied, i.e. with
grep.extendedRegexp".

>  2. We'd support the existing "grep.extendedRegexp" option, but ignore
>     it when the new "grep.patternType" option is set. We said we'd
>     only ignore the older "grep.extendedRegexp" option "when the
>     `grep.patternType` option is set to a value other than
>     'default'".

Yes, in short, you can think of it this way.

 - We use grep.patternType as the source of truth.  It is a usual
   last-one-wins variable, with values like fixed, basic, pcre,
   etc.

 - There however is a special value 'default', which may mean
   'basic' or 'extended', depending on what grep.extendedRegexp is
   set to.

> In a preceding commit we changed grep_config() to be called after
> grep_init(), which means that much of the complexity here can go
> away.
>
> Now as before when we only understand a "grep.extendedRegexp" setting
> of "true", and if "grep.patterntype=default" is set we'll interpret it
> as "grep.patterntype=basic",

Is that a typo?  If extendedRegexp is set to 'true', then the
'default' would mean 'extended', so I would expect that we'd
see it as the same as setting it 'grep.patternType=extended'.

> except if we previously saw a
> "grep.extendedRegexp", then it's interpreted as
> "grep.patterntype=extended".

I am not sure what this means.  grep.extendedRegexp is also a usual
last-one-wins variable.  If you had this series:

    git \
    -c grep.extendedRegexp=false \
    -c grep.extendedRegexp=true \
    -c grep.patternType=default \
    some-command-that-takes-regexp

the last grep.extendedRegexp is true, and the last grep.patternType
is default, so the command should work on extended.  The same is true
if you had

    git \
    -c grep.extendedRegexp=false \
    -c grep.patternType=default \
    -c grep.extendedRegexp=true \
    some-command-that-takes-regexp

That is why it is important to remember the fact that patternType
was given as "default" until you finish reading the configuration
and are sure you know the last value of grep.extendedRegexp.
Only after that, you can resolve what that "default" means between
"basic" and "extended".  Trying to interpret "default" as soon as
you see it in grep.patternType and trying to make it into either
"basic" or "extended" will not work unless you know you have the
final value of grep.extendedRegexp.
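
The deferred resolution described above could be sketched roughly as
follows. This is an illustration under simplifying assumptions, not the
code in grep.c: the "demo_*" names are hypothetical, boolean parsing
only recognizes "true", and unknown pattern types are treated as
"default" instead of being reported as errors.

```c
#include <string.h>

enum demo_pattern_type {
	DEMO_TYPE_DEFAULT = 0,	/* "default": resolved only at commit time */
	DEMO_TYPE_BASIC,
	DEMO_TYPE_EXTENDED,
	DEMO_TYPE_FIXED,
	DEMO_TYPE_PERL
};

struct demo_cfg {
	int extended_regexp;			/* last grep.extendedRegexp seen */
	enum demo_pattern_type pattern_type;	/* last grep.patternType seen */
};

/* Each variable is recorded independently; the last value wins. */
static void demo_read_config(struct demo_cfg *cfg, const char *var,
			     const char *value)
{
	if (!strcmp(var, "grep.extendedregexp")) {
		cfg->extended_regexp = !strcmp(value, "true");
	} else if (!strcmp(var, "grep.patterntype")) {
		if (!strcmp(value, "basic"))
			cfg->pattern_type = DEMO_TYPE_BASIC;
		else if (!strcmp(value, "extended"))
			cfg->pattern_type = DEMO_TYPE_EXTENDED;
		else if (!strcmp(value, "fixed"))
			cfg->pattern_type = DEMO_TYPE_FIXED;
		else if (!strcmp(value, "perl"))
			cfg->pattern_type = DEMO_TYPE_PERL;
		else
			cfg->pattern_type = DEMO_TYPE_DEFAULT;
	}
}

/* Called once, after all configuration has been read. */
static enum demo_pattern_type demo_commit(const struct demo_cfg *cfg)
{
	if (cfg->pattern_type != DEMO_TYPE_DEFAULT)
		return cfg->pattern_type;
	return cfg->extended_regexp ? DEMO_TYPE_EXTENDED : DEMO_TYPE_BASIC;
}
```

Feeding either of the two sequences above through demo_read_config()
and then calling demo_commit() gives "extended" in both cases, because
"default" is only interpreted once the final grep.extendedRegexp value
is known.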

> We don't need grep_commit_pattern_type() anymore,...

And I think that is what this function wanted to say: "we have now
seen everything necessary, so we can commit to the pattern type we are
going to use; before this point, we couldn't tell what 'default' meant".

So I am not sure how any change that says we do not need the
"commit" phase can be correct.


* Re* [PATCH v5 3/7] grep tests: add missing "grep.patternType" config test
  2021-12-23 22:25         ` Junio C Hamano
@ 2021-12-25  0:06           ` Junio C Hamano
  2021-12-25  0:19             ` [RFC/PATCH] grep: allow scripts to ignore configured pattern type Junio C Hamano
  2021-12-25  1:04             ` Re* [PATCH v5 3/7] grep tests: add missing "grep.patternType" config test Junio C Hamano
  0 siblings, 2 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-12-25  0:06 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Junio C Hamano <gitster@pobox.com> writes:

> The grep.patternType configuration variable has the "last one wins"
> semantics just like all the usual configuration variables, but the
> meaning of the variable when it is set to "default" depends on the
> value of the grep.extendedRegexp variable.
>
> If you rewrite with the above understanding, what you wrote will
> become a lot more concise.
>
>     Extend the grep tests to assert that grep.patternType is the
>     usual "last one wins" variable, and specifically, setting it to
>     "default" has the same meaning as setting it to "basic" when
>     grep.extendedRegexp is not set (or set to false).

Also, it is probably a good idea to stress that grep.extendedRegexp
is also "last one wins", so as I wrote in a separate message, Git
finds the last one for grep.extendedRegexp and grep.patternType
independently and combines these last values to come up with the
pattern type it uses.

I'll tentatively queue the following patch between your 3/7 and 4/7,
but it probably is a good idea to squash it into 3/7, as it belongs
to the same theme: clarify how these two variables are meant to
interact with each other.

----- >8 --------- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
Subject: [PATCH] t7810: make sure grep.extendedRegexp is also last-one-wins

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t7810-grep.sh | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 113902c3bd..2c17704e01 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -451,6 +451,16 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=default \
+			-c grep.extendedRegexp=false \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
 		echo "${HC}ab:a+bc" >expected &&
 		git \
-- 
2.34.1-563-g3368a7891b



* [RFC/PATCH] grep: allow scripts to ignore configured pattern type
  2021-12-25  0:06           ` Re* " Junio C Hamano
@ 2021-12-25  0:19             ` Junio C Hamano
  2021-12-26 23:09               ` Ævar Arnfjörð Bjarmason
  2021-12-25  1:04             ` Re* [PATCH v5 3/7] grep tests: add missing "grep.patternType" config test Junio C Hamano
  1 sibling, 1 reply; 151+ messages in thread
From: Junio C Hamano @ 2021-12-25  0:19 UTC (permalink / raw)
  To: git; +Cc: Ævar Arnfjörð Bjarmason

We made a mistake when we added the grep.extendedRegexp configuration
variable a long time ago, and made things even worse by introducing
the even more generalized grep.patternType configuration variable.

This was mostly because interactive users were lazy and wanted a way
to declare "I do not live in the ancient age, and my regexps are
always extended" and write "git grep" without having to type three
more letters " -E" on the command line.

But this in turn forces script writers to always specify what kind
of patterns they are writing, because without such command line
override, the interpretation of the patterns they write in their
scripts will be affected by the configuration variables of the user
who is running their script.

Introduce GIT_DISABLE_GREP_PATTERN_CONFIG environment variable that
script writers can set to "true" and export at the very beginning of
their script to defeat grep.extendedRegexp and grep.patternType
configuration variables.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * This is merely a weather balloon without proper documentation and
   tests.  It might be an even better idea to make such an environment
   variable _specify_ what kind of pattern the script uses, instead
   of "we defeat end-user configuration and now we are forced to
   write in basic or have to write -E/-P etc.", which is what this
   patch does.

 grep.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/grep.c b/grep.c
index fe847a0111..0cfb698b51 100644
--- a/grep.c
+++ b/grep.c
@@ -77,10 +77,15 @@ int grep_config(const char *var, const char *value, void *cb)
 {
 	struct grep_opt *opt = &grep_defaults;
 	const char *slot;
+	static int disable_pattern_type_config = -1;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
+	if (disable_pattern_type_config < 0)
+		disable_pattern_type_config =
+			git_env_bool("GIT_DISABLE_GREP_PATTERN_CONFIG", 0);
+
 	/*
 	 * The instance of grep_opt that we set up here is copied by
 	 * grep_init() to be used by each individual invocation.
@@ -90,12 +95,14 @@ int grep_config(const char *var, const char *value, void *cb)
 	 */
 
 	if (!strcmp(var, "grep.extendedregexp")) {
-		opt->extended_regexp_option = git_config_bool(var, value);
+		if (!disable_pattern_type_config)
+			opt->extended_regexp_option = git_config_bool(var, value);
 		return 0;
 	}
 
 	if (!strcmp(var, "grep.patterntype")) {
-		opt->pattern_type_option = parse_pattern_type_arg(var, value);
+		if (!disable_pattern_type_config)
+			opt->pattern_type_option = parse_pattern_type_arg(var, value);
 		return 0;
 	}
 
-- 
2.34.1-563-g3368a7891b



* Re: Re* [PATCH v5 3/7] grep tests: add missing "grep.patternType" config test
  2021-12-25  0:06           ` Re* " Junio C Hamano
  2021-12-25  0:19             ` [RFC/PATCH] grep: allow scripts to ignore configured pattern type Junio C Hamano
@ 2021-12-25  1:04             ` Junio C Hamano
  1 sibling, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-12-25  1:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Junio C Hamano <gitster@pobox.com> writes:

> I'll tentatively queue the following patch between your 3/7 and 4/7,
> but it probably is a good idea to squash it into 3/7, as it belongs
> to the same theme: clarify how these two variables are meant to
> interact with each other.

Just as I suspected earlier, the series up to [PATCH 6/7] passes this
test, and the test reveals the breakage in [PATCH 7/7].

In the review of that step, I said "I am not sure how any change
that says we do not need the "commit" phase can be correct." but
come to think of it, it was a bit too strong.  It is possible to
implement correct semantics without the "commit" phase, as long as
we do not try to decide between basic and extended too hastily when
we see grep.patternType=default, before we are sure that we will not
see any new definition of grep.extendedRegexp [*].  In the extreme,
we could keep it as 'default' until the time just before we compile
the regexp and consult the final value of grep.extendedRegexp to
decide between the two, and such an implementation would still give
correct results without having to have a "commit" phase.


[Footnote]

* The story is the same for the `--patternType=default` command line
  option, but by the time we read the command line option, we should
  be already done with the configuration, and there is no command
  line option to override the grep.extendedRegexp configuration
  variable, so we do not have to worry about that case.  When we
  read --patternType=default from the command line, we can safely
  use the value we know about grep.extendedRegexp and decide to use
  either basic or extended.

  But that is not true for the grep.patternType configuration variable,
  as the additional test I am responding to shows.
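
  The deferred approach described above can be sketched roughly as
  follows.  This is only an illustration under assumptions, not the
  code in grep.c: "struct grep_state", on_config() and
  effective_type() are names made up for the sketch.  The config
  callback only records the last value of each variable, and
  "default" is resolved at the last moment.

```c
#include <string.h>

/* Hypothetical names for illustration; not the identifiers in grep.c. */
enum pattern_type { PT_DEFAULT, PT_BASIC, PT_EXTENDED, PT_FIXED, PT_PCRE };

struct grep_state {
	enum pattern_type type; /* may stay PT_DEFAULT until compile time */
	int extended_regexp;    /* last grep.extendedRegexp value seen */
};

/* Config callback: record the last value of each variable, nothing more. */
static void on_config(struct grep_state *s, const char *key, const char *value)
{
	if (!strcmp(key, "grep.extendedregexp")) {
		s->extended_regexp = !strcmp(value, "true");
	} else if (!strcmp(key, "grep.patterntype")) {
		if (!strcmp(value, "basic"))
			s->type = PT_BASIC;
		else if (!strcmp(value, "extended"))
			s->type = PT_EXTENDED;
		else if (!strcmp(value, "fixed"))
			s->type = PT_FIXED;
		else if (!strcmp(value, "perl"))
			s->type = PT_PCRE;
		else
			s->type = PT_DEFAULT; /* sketch: anything else is "default" */
	}
}

/* Called just before compiling the regexp: only now resolve "default". */
static enum pattern_type effective_type(const struct grep_state *s)
{
	if (s->type != PT_DEFAULT)
		return s->type;
	return s->extended_regexp ? PT_EXTENDED : PT_BASIC;
}
```

  Feeding this the three settings from the test quoted below
  (extendedRegexp=true, patternType=default, extendedRegexp=false)
  ends up with basic, because the flip to "false" is seen before
  "default" is ever resolved.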

> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
> Subject: [PATCH] t7810: make sure grep.extendedRegexp is also last-one-wins
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  t/t7810-grep.sh | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> index 113902c3bd..2c17704e01 100755
> --- a/t/t7810-grep.sh
> +++ b/t/t7810-grep.sh
> @@ -451,6 +451,16 @@ do
>  		test_cmp expected actual
>  	'
>  
> +	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins" '
> +		echo "${HC}ab:a+bc" >expected &&
> +		git \
> +			-c grep.extendedRegexp=true \
> +			-c grep.patternType=default \
> +			-c grep.extendedRegexp=false \
> +			grep "a+b*c" $H ab >actual &&
> +		test_cmp expected actual
> +	'
> +
>  	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
>  		echo "${HC}ab:a+bc" >expected &&
>  		git \


* [PATCH v6 0/7] grep: simplify & delete "init" & "config" code
  2021-12-22  2:58     ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                         ` (7 preceding siblings ...)
  2021-12-23  0:30       ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Junio C Hamano
@ 2021-12-26 22:37       ` Ævar Arnfjörð Bjarmason
  2021-12-26 22:37         ` [PATCH v6 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
                           ` (7 more replies)
  8 siblings, 8 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-26 22:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

A simplification and code deletion of the overly complex setup for the
grep API, no behavior changes. For v5 see:
https://lore.kernel.org/git/cover-v5-0.7-00000000000-20211222T025214Z-avarab@gmail.com

Changes since v5:

 * As Junio pointed out there were behavior changes in v5. I
   integrated/squashed the change he added to the
   gitster/ab/grep-patterntype branch to add the tests, and the fixed
   7/7 now correctly handles the case of a flip-flopping
   grep.extendedRegexp.

 * Some commit message additions/rewording that I hope will address
   relevant comments from Junio.

Ævar Arnfjörð Bjarmason (7):
  grep.h: remove unused "regex_t regexp" from grep_opt
  log tests: check if grep_config() is called by "log"-like cmds
  grep tests: add missing "grep.patternType" config tests
  built-ins: trust the "prefix" from run_builtin()
  grep.c: don't pass along NULL callback value
  grep API: call grep_config() after grep_init()
  grep: simplify config parsing and option parsing

 builtin/grep.c    |  27 +++++-----
 builtin/log.c     |  13 ++++-
 builtin/ls-tree.c |   2 +-
 git.c             |   1 +
 grep.c            | 126 +++++++++-------------------------------------
 grep.h            |  33 ++++++++----
 revision.c        |   4 +-
 t/t4202-log.sh    |  24 +++++++++
 t/t7810-grep.sh   |  39 ++++++++++++++
 9 files changed, 138 insertions(+), 131 deletions(-)

Range-diff against v5:
1:  b6a3e0e2e08 = 1:  b62e6b6162a grep.h: remove unused "regex_t regexp" from grep_opt
2:  c0d77b2683f = 2:  0edcdb50afd log tests: check if grep_config() is called by "log"-like cmds
3:  f02f246aa23 ! 3:  1b724d5e2e9 grep tests: add missing "grep.patternType" config test
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    grep tests: add missing "grep.patternType" config test
    +    grep tests: add missing "grep.patternType" config tests
     
         Extend the grep tests to assert that setting
         "grep.patternType=extended" followed by "grep.patternType=default"
    @@ Commit message
     
         Let's also test what happens when we have a sequence of "extended"
         followed by "default" and "fixed". In that case the "fixed" should
    -    prevail.
    +    prevail, as well as tests to check that a "grep.extendedRegexp=true"
    +    followed by a "grep.extendedRegexp=false" behaves as though
    +    "grep.extendedRegexp" wasn't provided.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    +    Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      ## t/t7810-grep.sh ##
     @@ t/t7810-grep.sh: do
      		test_cmp expected actual
      	'
      
    ++	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins" '
    ++		echo "${HC}ab:a+bc" >expected &&
    ++		git \
    ++			-c grep.extendedRegexp=true \
    ++			-c grep.patternType=basic \
    ++			-c grep.extendedRegexp=false \
    ++			grep "a+b*c" $H ab >actual &&
    ++		test_cmp expected actual
    ++	'
    ++
    ++	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins & defers to grep.patternType" '
    ++		echo "${HC}ab:abc" >expected &&
    ++		git \
    ++			-c grep.extendedRegexp=true \
    ++			-c grep.patternType=extended \
    ++			-c grep.extendedRegexp=false \
    ++			grep "a+b*c" $H ab >actual &&
    ++		test_cmp expected actual
    ++	'
    ++
     +	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
     +		echo "${HC}ab:a+bc" >expected &&
     +		git \
4:  a542a352eab = 4:  f4876552771 built-ins: trust the "prefix" from run_builtin()
5:  a33b00a247e = 5:  069b0339146 grep.c: don't pass along NULL callback value
6:  92b1c3958fa = 6:  e38eca56959 grep API: call grep_config() after grep_init()
7:  63de643ebc2 ! 7:  88dfd40bf9e grep: simplify config parsing and option parsing
    @@ Commit message
          1. You can set "grep.patternType", and "[setting it to] 'default'
             will return to the default matching behavior".
     
    +        In that context "the default" meant whatever the configuration
    +        system specified before that change, i.e. via grep.extendedRegexp.
    +
          2. We'd support the existing "grep.extendedRegexp" option, but ignore
             it when the new "grep.patternType" option is set. We said we'd
             only ignore the older "grep.extendedRegexp" option "when the
    @@ Commit message
         grep_init(), which means that much of the complexity here can go
         away.
     
    -    Now as before when we only understand a "grep.extendedRegexp" setting
    -    of "true", and if "grep.patterntype=default" is set we'll interpret it
    -    as "grep.patterntype=basic", except if we previously saw a
    -    "grep.extendedRegexp", then it's interpreted as
    -    "grep.patterntype=extended".
    +    As before "grep.extendedRegexp" is a last-one-wins variable. We need
    +    to maintain state inside parse_pattern_type_arg() to ignore it if a
    +    non-"default" grep.patternType was seen, but otherwise flip between
    +    BRE and ERE for "grep.extendedRegexp=[false|true]".
     
         See my 07a3d411739 (grep: remove regflags from the public grep_opt
         API, 2017-06-29) for addition of the two comments being removed here,
    @@ grep.c: int grep_config(const char *var, const char *value, void *cb)
      		return -1;
      
      	if (!strcmp(var, "grep.extendedregexp")) {
    -+		if (opt->extended_regexp_option)
    ++		if (opt->extended_regexp_option == -1)
     +			return 0;
      		opt->extended_regexp_option = git_config_bool(var, value);
     +		if (opt->extended_regexp_option)
     +			opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
    ++		else
    ++			opt->pattern_type_option = GREP_PATTERN_TYPE_BRE;
     +		return 0;
     +	}
     +
-- 
2.34.1.1239.g84ae229c870



* [PATCH v6 1/7] grep.h: remove unused "regex_t regexp" from grep_opt
  2021-12-26 22:37       ` [PATCH v6 " Ævar Arnfjörð Bjarmason
@ 2021-12-26 22:37         ` Ævar Arnfjörð Bjarmason
  2021-12-26 22:37         ` [PATCH v6 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
                           ` (6 subsequent siblings)
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-26 22:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This "regex_t" in grep_opt has not been used since
f9b9faf6f8a (builtin-grep: allow more than one patterns., 2006-05-02).
We still use a "regex_t" for compiling regexes, but that's in the
"grep_pat" struct.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.h b/grep.h
index 3e8815c347b..95cccb670f9 100644
--- a/grep.h
+++ b/grep.h
@@ -136,7 +136,6 @@ struct grep_opt {
 
 	const char *prefix;
 	int prefix_length;
-	regex_t regexp;
 	int linenum;
 	int columnnum;
 	int invert;
-- 
2.34.1.1239.g84ae229c870



* [PATCH v6 2/7] log tests: check if grep_config() is called by "log"-like cmds
  2021-12-26 22:37       ` [PATCH v6 " Ævar Arnfjörð Bjarmason
  2021-12-26 22:37         ` [PATCH v6 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
@ 2021-12-26 22:37         ` Ævar Arnfjörð Bjarmason
  2021-12-26 22:37         ` [PATCH v6 3/7] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
                           ` (5 subsequent siblings)
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-26 22:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
for pattern style options & config, 2017-05-20) to check not only
whether "git log" handles "grep.patternType", but also "git show"
etc.

It's sufficient to check whether we match a "fixed" or a "basic" regex
here to see if these codepaths correctly invoked grep_config(). We
don't need to check the details of their regular expression matching
as the "log" test does.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 2ced7e9d817..2490f2cd5ed 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -449,6 +449,30 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
 	)
 '
 
+for cmd in show whatchanged reflog format-patch
+do
+	case "$cmd" in
+	format-patch) myarg="HEAD~.." ;;
+	*) myarg= ;;
+	esac
+
+	test_expect_success "$cmd: understands grep.patternType, like 'log'" '
+		git init "pattern-type-$cmd" &&
+		(
+			cd "pattern-type-$cmd" &&
+			test_commit 1 file A &&
+			test_commit "(1|2)" file B 2 &&
+
+			git -c grep.patternType=fixed $cmd --grep="..." $myarg >actual &&
+			test_must_be_empty actual &&
+
+			git -c grep.patternType=basic $cmd --grep="..." $myarg >actual &&
+			test_file_not_empty actual
+		)
+	'
+done
+test_done
+
 test_expect_success 'log --author' '
 	cat >expect <<-\EOF &&
 	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>
-- 
2.34.1.1239.g84ae229c870



* [PATCH v6 3/7] grep tests: add missing "grep.patternType" config tests
  2021-12-26 22:37       ` [PATCH v6 " Ævar Arnfjörð Bjarmason
  2021-12-26 22:37         ` [PATCH v6 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
  2021-12-26 22:37         ` [PATCH v6 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2021-12-26 22:37         ` Ævar Arnfjörð Bjarmason
  2021-12-26 22:37         ` [PATCH v6 4/7] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
                           ` (4 subsequent siblings)
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-26 22:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the grep tests to assert that setting
"grep.patternType=extended" followed by "grep.patternType=default"
will behave as if "--basic-regexp" was provided, and not as
"--extended-regexp". In a subsequent commit we'll need to treat
"grep.patternType=default" as a special-case, but let's make sure we
ignore it if it's being set to "default" following an earlier
non-"default" "grep.patternType" setting.

Let's also test what happens when we have a sequence of "extended"
followed by "default" and "fixed". In that case the "fixed" should
prevail. Also add tests to check that a "grep.extendedRegexp=true"
followed by a "grep.extendedRegexp=false" behaves as though
"grep.extendedRegexp" wasn't provided.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
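
The expected output in these tests follows from how POSIX basic and
extended regexps read the pattern "a+b*c": in a BRE the "+" is an
ordinary character, in an ERE it is a repetition operator. A small
sketch with regcomp(3) shows the difference; line_matches() is a
helper name made up for this illustration and is not part of git.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Returns 1 if `line` matches `pattern`, compiled as a POSIX ERE when
 * `extended` is non-zero and as a POSIX BRE otherwise.
 */
static int line_matches(const char *pattern, const char *line, int extended)
{
	regex_t re;
	int hit;

	if (regcomp(&re, pattern, REG_NOSUB | (extended ? REG_EXTENDED : 0)))
		return 0; /* sketch: a pattern that fails to compile never matches */
	hit = !regexec(&re, line, 0, NULL, 0);
	regfree(&re);
	return hit;
}
```

With the pattern "a+b*c", the BRE form matches the line "a+bc" but
not "abc", and the ERE form matches "abc" but not "a+bc". Neither
matches the literal line "a+b*c", which is why only the "fixed"
test expects that line.
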
 t/t7810-grep.sh | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 6b6423a07c3..664f884e12a 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -451,6 +451,45 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=basic \
+			-c grep.extendedRegexp=false \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins & defers to grep.patternType" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=extended \
+			-c grep.extendedRegexp=false \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.patternType=[extended -> default -> fixed]" '
+		echo "${HC}ab:a+b*c" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			-c grep.patternType=fixed \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
 		echo "${HC}ab:abc" >expected &&
 		git \
-- 
2.34.1.1239.g84ae229c870



* [PATCH v6 4/7] built-ins: trust the "prefix" from run_builtin()
  2021-12-26 22:37       ` [PATCH v6 " Ævar Arnfjörð Bjarmason
                           ` (2 preceding siblings ...)
  2021-12-26 22:37         ` [PATCH v6 3/7] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
@ 2021-12-26 22:37         ` Ævar Arnfjörð Bjarmason
  2021-12-26 22:37         ` [PATCH v6 5/7] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
                           ` (3 subsequent siblings)
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-26 22:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the
"prefix" passed from "run_builtin()". The "prefix" we get from setup.c
is either going to be NULL or a string of length >0, never "".

So we can drop the "prefix && *prefix" checks added for
"builtin/grep.c" in 0d042fecf2f (git-grep: show pathnames relative to
the current directory, 2006-08-11), and for "builtin/ls-tree.c" in
a69dd585fca (ls-tree: chomp leading directories when run from a
subdirectory, 2005-12-23).

As seen in code in revision.c that was added in cd676a51367 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that does away with this
assertion.

This makes it easier to reason about a subsequent change to the
"prefix_length" code in grep.c, and since we're going to the trouble
of doing that let's leave behind an assert() to promise this to any
future callers.

For "builtin/grep.c" it would be painful to pass the "prefix" down the
callchain of:

    cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
    grep_source_name

So for the code that needs it in grep_source_name() let's add a
"grep_prefix" variable similar to the existing "ls_tree_prefix".

While at it let's move the code in cmd_ls_tree() around so that we
assign to the "ls_tree_prefix" right after declaring the variables,
and stop assigning to "prefix". We only subsequently used that
variable later in the function after clobbering it. Let's just use our
own "grep_prefix" instead.

Let's also add an assert() in git.c, so that we'll make this promise
about the "prefix" to any current and future callers, as well as to
any readers of the code.

Code history:

 * The strlen() in "grep.c" hasn't been used since 493b7a08d80 (grep:
   accept relative paths outside current working directory, 2009-09-05).

   When that code was added in 0d042fecf2f (git-grep: show pathnames
   relative to the current directory, 2006-08-11) we used the length.

   But since 493b7a08d80 we haven't used it for anything except a
   boolean check that we could have done on the "prefix" member
   itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
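
The invariant relied on above ("prefix" is NULL or a non-empty
string, never "") can be sketched in isolation as below. The helper
chomp_length() is hypothetical and exists only for this
illustration; the real callers are in the diff that follows.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: number of leading bytes to strip from a path.
 * Relies on the caller's promise that `prefix` is NULL or non-empty,
 * so a plain NULL check is enough and "prefix && *prefix" is not needed.
 */
static size_t chomp_length(const char *prefix)
{
	assert(!prefix || *prefix); /* "" is never passed */
	return prefix ? strlen(prefix) : 0;
}
```
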
 builtin/grep.c    | 13 ++++++++-----
 builtin/ls-tree.c |  2 +-
 git.c             |  1 +
 grep.c            |  4 +---
 grep.h            |  4 +---
 revision.c        |  2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 9e34a820ad4..d85cbabea67 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,6 +26,8 @@
 #include "object-store.h"
 #include "packfile.h"
 
+static const char *grep_prefix;
+
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
 	NULL
@@ -315,11 +317,11 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 	strbuf_reset(out);
 
 	if (opt->null_following_name) {
-		if (opt->relative && opt->prefix_length) {
+		if (opt->relative && grep_prefix) {
 			struct strbuf rel_buf = STRBUF_INIT;
 			const char *rel_name =
 				relative_path(filename + tree_name_len,
-					      opt->prefix, &rel_buf);
+					      grep_prefix, &rel_buf);
 
 			if (tree_name_len)
 				strbuf_add(out, filename, tree_name_len);
@@ -332,8 +334,8 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 		return;
 	}
 
-	if (opt->relative && opt->prefix_length)
-		quote_path(filename + tree_name_len, opt->prefix, out, 0);
+	if (opt->relative && grep_prefix)
+		quote_path(filename + tree_name_len, grep_prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
 
@@ -962,9 +964,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_END()
 	};
+	grep_prefix = prefix;
 
 	git_config(grep_cmd_config, NULL);
-	grep_init(&opt, the_repository, prefix);
+	grep_init(&opt, the_repository);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c71..6cb554cbb0a 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -150,7 +150,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 	ls_tree_prefix = prefix;
-	if (prefix && *prefix)
+	if (prefix)
 		chomp_prefix = strlen(prefix);
 
 	argc = parse_options(argc, argv, prefix, ls_tree_options,
diff --git a/git.c b/git.c
index 7edafd8ecff..575d95046f2 100644
--- a/git.c
+++ b/git.c
@@ -436,6 +436,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 	} else {
 		prefix = NULL;
 	}
+	assert(!prefix || *prefix);
 	precompose_argv_prefix(argc, argv, NULL);
 	if (use_pager == -1 && run_setup &&
 		!(p->option & DELAY_PAGER_CONFIG))
diff --git a/grep.c b/grep.c
index fe847a0111a..12b202598a9 100644
--- a/grep.c
+++ b/grep.c
@@ -139,13 +139,11 @@ int grep_config(const char *var, const char *value, void *cb)
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
  */
-void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
+void grep_init(struct grep_opt *opt, struct repository *repo)
 {
 	*opt = grep_defaults;
 
 	opt->repo = repo;
-	opt->prefix = prefix;
-	opt->prefix_length = (prefix && *prefix) ? strlen(prefix) : 0;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
diff --git a/grep.h b/grep.h
index 95cccb670f9..62deadb885f 100644
--- a/grep.h
+++ b/grep.h
@@ -134,8 +134,6 @@ struct grep_opt {
 	 */
 	struct repository *repo;
 
-	const char *prefix;
-	int prefix_length;
 	int linenum;
 	int columnnum;
 	int invert;
@@ -180,7 +178,7 @@ struct grep_opt {
 };
 
 int grep_config(const char *var, const char *value, void *);
-void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index 5390a479b30..495328e859c 100644
--- a/revision.c
+++ b/revision.c
@@ -1838,7 +1838,7 @@ void repo_init_revisions(struct repository *r,
 	revs->commit_format = CMIT_FMT_DEFAULT;
 	revs->expand_tabs_in_log_default = 8;
 
-	grep_init(&revs->grep_filter, revs->repo, prefix);
+	grep_init(&revs->grep_filter, revs->repo);
 	revs->grep_filter.status_only = 1;
 
 	repo_diff_setup(revs->repo, &revs->diffopt);
-- 
2.34.1.1239.g84ae229c870



* [PATCH v6 5/7] grep.c: don't pass along NULL callback value
  2021-12-26 22:37       ` [PATCH v6 " Ævar Arnfjörð Bjarmason
                           ` (3 preceding siblings ...)
  2021-12-26 22:37         ` [PATCH v6 4/7] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
@ 2021-12-26 22:37         ` Ævar Arnfjörð Bjarmason
  2021-12-26 22:37         ` [PATCH v6 6/7] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
                           ` (2 subsequent siblings)
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-26 22:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change grep_cmd_config() to stop passing around the always-NULL "cb"
value. When this code was added in 7e8f59d577e (grep: color patterns
in output, 2009-03-07) it was non-NULL, but when that changed in
15fabd1bbd4 (builtin/grep.c: make configuration callback more
reusable, 2012-10-09) this code was left behind.

In a subsequent change I'll start using the "cb" value; this makes it
clear which of the functions we call need it, and which don't.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index d85cbabea67..5ec4cecae45 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,8 +285,8 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, cb);
-	if (git_color_default_config(var, value, cb) < 0)
+	int st = grep_config(var, value, NULL);
+	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
 	if (!strcmp(var, "grep.threads")) {
-- 
2.34.1.1239.g84ae229c870



* [PATCH v6 6/7] grep API: call grep_config() after grep_init()
  2021-12-26 22:37       ` [PATCH v6 " Ævar Arnfjörð Bjarmason
                           ` (4 preceding siblings ...)
  2021-12-26 22:37         ` [PATCH v6 5/7] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
@ 2021-12-26 22:37         ` Ævar Arnfjörð Bjarmason
  2021-12-26 22:37         ` [PATCH v6 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  7 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-26 22:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().

So we effectively (b) initialized config, (a) then defaults, (c)
followed by user options. Usually those are ordered as "a", "b" and
"c" instead.

As the comments being removed here show, the previous behavior needed
to be carefully explained as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.

This does not change the behavior of any of the configuration
variables or options. That would have been the case if we didn't move
around the grep_config() call in "builtin/log.c". But now that we call
"grep_config" after "git_log_config" and "git_format_config" we'll
need to pass in the already initialized "struct grep_opt *".

See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e0 (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.

The memcpy() pattern here will be optimized away and follows the
convention of other *_init() functions. See 5726a6b4012 (*.c *_init():
define in terms of corresponding *_INIT macro, 2021-07-01).
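
The ordering described above (defaults, then config, then user options)
can be sketched with a reduced model. This is only an illustration under
stated assumptions: "struct demo_opt", DEMO_OPT_INIT, the "demo.linenumber"
key and all demo_*() functions are hypothetical stand-ins, not git's real
definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for "struct grep_opt"; not git's real layout. */
struct demo_opt {
	int max_depth;
	int line_number;
};

#define DEMO_OPT_INIT { .max_depth = -1 }

/* (a) defaults: same shape as an *_init() defined via an *_INIT macro */
void demo_init(struct demo_opt *opt)
{
	struct demo_opt blank = DEMO_OPT_INIT;

	assert(opt);
	memcpy(opt, &blank, sizeof(*opt));
}

/* (b) config: the callback writes into the caller's struct via "cb" */
int demo_config(const char *var, const char *value, void *cb)
{
	struct demo_opt *opt = cb;

	if (!strcmp(var, "demo.linenumber"))
		opt->line_number = !strcmp(value, "true");
	return 0;
}

/* Runs (a), (b) and (c) in that order; returns the final line_number. */
int demo_setup(int cmdline_disables_line_number)
{
	struct demo_opt opt;

	demo_init(&opt);                              /* (a) defaults */
	demo_config("demo.linenumber", "true", &opt); /* (b) config */
	if (cmdline_disables_line_number)             /* (c) user option */
		opt.line_number = 0;
	return opt.line_number;
}
```

Because the callback receives the already initialized struct, there is no
shared static template that a later init call could copy stale state from.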

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  4 ++--
 builtin/log.c  | 13 +++++++++++--
 grep.c         | 39 +++------------------------------------
 grep.h         | 21 +++++++++++++++++++++
 4 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5ec4cecae45..0ea124321b6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,7 +285,7 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, NULL);
+	int st = grep_config(var, value, cb);
 	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
@@ -966,8 +966,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	};
 	grep_prefix = prefix;
 
-	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, the_repository);
+	git_config(grep_cmd_config, &opt);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/log.c b/builtin/log.c
index 93ace0dde7d..fdde77e4ebb 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -520,8 +520,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (grep_config(var, value, cb) < 0)
-		return -1;
 	if (git_gpg_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
@@ -536,6 +534,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.simplify_history = 0;
 	memset(&opt, 0, sizeof(opt));
@@ -650,6 +650,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 
 	memset(&match_all, 0, sizeof(match_all));
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.always_show_header = 1;
 	rev.no_walk = 1;
@@ -733,6 +735,8 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix)
 
 	repo_init_revisions(the_repository, &rev, prefix);
 	init_reflog_walk(&rev.reflog_info);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.verbose_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -766,6 +770,8 @@ int cmd_log(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.always_show_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -1848,10 +1854,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	extra_hdr.strdup_strings = 1;
 	extra_to.strdup_strings = 1;
 	extra_cc.strdup_strings = 1;
+
 	init_log_defaults();
 	init_display_notes(&notes_opt);
 	git_config(git_format_config, NULL);
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.show_notes = show_notes;
 	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
 	rev.commit_format = CMIT_FMT_EMAIL;
diff --git a/grep.c b/grep.c
index 12b202598a9..8dfa0300786 100644
--- a/grep.c
+++ b/grep.c
@@ -19,27 +19,6 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
-static struct grep_opt grep_defaults = {
-	.relative = 1,
-	.pathname = 1,
-	.max_depth = -1,
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
-	.colors = {
-		[GREP_COLOR_CONTEXT] = "",
-		[GREP_COLOR_FILENAME] = "",
-		[GREP_COLOR_FUNCTION] = "",
-		[GREP_COLOR_LINENO] = "",
-		[GREP_COLOR_COLUMNNO] = "",
-		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_SELECTED] = "",
-		[GREP_COLOR_SEP] = GIT_COLOR_CYAN,
-	},
-	.only_matching = 0,
-	.color = -1,
-	.output = std_output,
-};
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -75,20 +54,12 @@ define_list_config_array_extra(color_grep_slots, {"match"});
  */
 int grep_config(const char *var, const char *value, void *cb)
 {
-	struct grep_opt *opt = &grep_defaults;
+	struct grep_opt *opt = cb;
 	const char *slot;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	/*
-	 * The instance of grep_opt that we set up here is copied by
-	 * grep_init() to be used by each individual invocation.
-	 * When populating a new field of this structure here, be
-	 * sure to think about ownership -- e.g., you might need to
-	 * override the shallow copy in grep_init() with a deep copy.
-	 */
-
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
 		return 0;
@@ -134,14 +105,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/*
- * Initialize one instance of grep_opt and copy the
- * default values from the template we read the configuration
- * information in an earlier call to git_config(grep_config).
- */
 void grep_init(struct grep_opt *opt, struct repository *repo)
 {
-	*opt = grep_defaults;
+	struct grep_opt blank = GREP_OPT_INIT;
+	memcpy(opt, &blank, sizeof(*opt));
 
 	opt->repo = repo;
 	opt->pattern_tail = &opt->pattern_list;
diff --git a/grep.h b/grep.h
index 62deadb885f..b651eb291f7 100644
--- a/grep.h
+++ b/grep.h
@@ -177,6 +177,27 @@ struct grep_opt {
 	void *output_priv;
 };
 
+#define GREP_OPT_INIT { \
+	.relative = 1, \
+	.pathname = 1, \
+	.max_depth = -1, \
+	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
+	.colors = { \
+		[GREP_COLOR_CONTEXT] = "", \
+		[GREP_COLOR_FILENAME] = "", \
+		[GREP_COLOR_FUNCTION] = "", \
+		[GREP_COLOR_LINENO] = "", \
+		[GREP_COLOR_COLUMNNO] = "", \
+		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_SELECTED] = "", \
+		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
+	}, \
+	.only_matching = 0, \
+	.color = -1, \
+	.output = std_output, \
+}
+
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
-- 
2.34.1.1239.g84ae229c870


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v6 7/7] grep: simplify config parsing and option parsing
  2021-12-26 22:37       ` [PATCH v6 " Ævar Arnfjörð Bjarmason
                           ` (5 preceding siblings ...)
  2021-12-26 22:37         ` [PATCH v6 6/7] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2021-12-26 22:37         ` Ævar Arnfjörð Bjarmason
  2021-12-27  6:06           ` Junio C Hamano
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  7 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-26 22:37 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Simplify the parsing of "grep.patternType" and
"grep.extendedRegexp". This changes no behavior, but gets rid of
complex parsing logic that isn't needed anymore.

When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
grep.patternType configuration setting, 2012-08-03) we promised that:

 1. You can set "grep.patternType", and "[setting it to] 'default'
    will return to the default matching behavior".

    In that context "the default" meant whatever the configuration
    system specified before that change, i.e. via grep.extendedRegexp.

 2. We'd support the existing "grep.extendedRegexp" option, but ignore
    it when the new "grep.patternType" option is set. We said we'd
    only ignore the older "grep.extendedRegexp" option "when the
    `grep.patternType` option is set to a value other than
    'default'".

In a preceding commit we changed grep_config() to be called after
grep_init(), which means that much of the complexity here can go
away.

As before "grep.extendedRegexp" is a last-one-wins variable. We need
to maintain state inside parse_pattern_type_arg() to ignore it if a
non-"default" grep.patternType was seen, but otherwise flip between
BRE and ERE for "grep.extendedRegexp=[false|true]".

See my 07a3d411739 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.

We don't need grep_commit_pattern_type() anymore, we can instead have
OPT_SET_INT() in "builtin/grep.c" manipulate the "pattern_type_option"
member in "struct grep_opt" directly.

We can also do away with the indirection of the "int fixed" and "int
pcre2" members in favor of using "pattern_type_option" directly in
"grep.c".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 10 +++---
 grep.c         | 83 ++++++++++++--------------------------------------
 grep.h         |  9 ++----
 revision.c     |  2 --
 4 files changed, 26 insertions(+), 78 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 0ea124321b6..942c4b25077 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -845,7 +845,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
 	int allow_revs;
 
 	struct option options[] = {
@@ -879,16 +878,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
 		OPT_GROUP(""),
-		OPT_SET_INT('E', "extended-regexp", &pattern_type_arg,
+		OPT_SET_INT('E', "extended-regexp", &opt.pattern_type_option,
 			    N_("use extended POSIX regular expressions"),
 			    GREP_PATTERN_TYPE_ERE),
-		OPT_SET_INT('G', "basic-regexp", &pattern_type_arg,
+		OPT_SET_INT('G', "basic-regexp", &opt.pattern_type_option,
 			    N_("use basic POSIX regular expressions (default)"),
 			    GREP_PATTERN_TYPE_BRE),
-		OPT_SET_INT('F', "fixed-strings", &pattern_type_arg,
+		OPT_SET_INT('F', "fixed-strings", &opt.pattern_type_option,
 			    N_("interpret patterns as fixed strings"),
 			    GREP_PATTERN_TYPE_FIXED),
-		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
+		OPT_SET_INT('P', "perl-regexp", &opt.pattern_type_option,
 			    N_("use Perl-compatible regular expressions"),
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
@@ -982,7 +981,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, grep_usage,
 			     PARSE_OPT_KEEP_DASHDASH |
 			     PARSE_OPT_STOP_AT_NON_OPTION);
-	grep_commit_pattern_type(pattern_type_arg, &opt);
 
 	if (use_index && !startup_info->have_repository) {
 		int fallback = 0;
diff --git a/grep.c b/grep.c
index 8dfa0300786..e964f402472 100644
--- a/grep.c
+++ b/grep.c
@@ -33,9 +33,7 @@ static const char *color_grep_slots[] = {
 
 static int parse_pattern_type_arg(const char *opt, const char *arg)
 {
-	if (!strcmp(arg, "default"))
-		return GREP_PATTERN_TYPE_UNSPECIFIED;
-	else if (!strcmp(arg, "basic"))
+	if (!strcmp(arg, "basic"))
 		return GREP_PATTERN_TYPE_BRE;
 	else if (!strcmp(arg, "extended"))
 		return GREP_PATTERN_TYPE_ERE;
@@ -61,11 +59,25 @@ int grep_config(const char *var, const char *value, void *cb)
 		return -1;
 
 	if (!strcmp(var, "grep.extendedregexp")) {
+		if (opt->extended_regexp_option == -1)
+			return 0;
 		opt->extended_regexp_option = git_config_bool(var, value);
+		if (opt->extended_regexp_option)
+			opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
+		else
+			opt->pattern_type_option = GREP_PATTERN_TYPE_BRE;
+		return 0;
+	}
+
+	if (!strcmp(var, "grep.patterntype") &&
+	    !strcmp(value, "default")) {
+		opt->pattern_type_option = opt->extended_regexp_option == 1
+			? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
 		return 0;
 	}
 
 	if (!strcmp(var, "grep.patterntype")) {
+		opt->extended_regexp_option = -1; /* ignore */
 		opt->pattern_type_option = parse_pattern_type_arg(var, value);
 		return 0;
 	}
@@ -115,62 +127,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo)
 	opt->header_tail = &opt->header_list;
 }
 
-static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	/*
-	 * When committing to the pattern type by setting the relevant
-	 * fields in grep_opt it's generally not necessary to zero out
-	 * the fields we're not choosing, since they won't have been
-	 * set by anything. The extended_regexp_option field is the
-	 * only exception to this.
-	 *
-	 * This is because in the process of parsing grep.patternType
-	 * & grep.extendedRegexp we set opt->pattern_type_option and
-	 * opt->extended_regexp_option, respectively. We then
-	 * internally use opt->extended_regexp_option to see if we're
-	 * compiling an ERE. It must be unset if that's not actually
-	 * the case.
-	 */
-	if (pattern_type != GREP_PATTERN_TYPE_ERE &&
-	    opt->extended_regexp_option)
-		opt->extended_regexp_option = 0;
-
-	switch (pattern_type) {
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		/* fall through */
-
-	case GREP_PATTERN_TYPE_BRE:
-		break;
-
-	case GREP_PATTERN_TYPE_ERE:
-		opt->extended_regexp_option = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_FIXED:
-		opt->fixed = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_PCRE:
-		opt->pcre2 = 1;
-		break;
-	}
-}
-
-void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(pattern_type, opt);
-	else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(opt->pattern_type_option, opt);
-	else if (opt->extended_regexp_option)
-		/*
-		 * This branch *must* happen after setting from the
-		 * opt->pattern_type_option above, we don't want
-		 * grep.extendedRegexp to override grep.patternType!
-		 */
-		grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
-}
-
 static struct grep_pat *create_grep_pat(const char *pat, size_t patlen,
 					const char *origin, int no,
 					enum grep_pat_token t,
@@ -490,9 +446,10 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 
 	p->word_regexp = opt->word_regexp;
 	p->ignore_case = opt->ignore_case;
-	p->fixed = opt->fixed;
+	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
 
-	if (memchr(p->pattern, 0, p->patternlen) && !opt->pcre2)
+	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
+	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
 	p->is_fixed = is_fixed(p->pattern, p->patternlen);
@@ -543,14 +500,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre2) {
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
 		compile_pcre2_pattern(p, opt);
 		return;
 	}
 
 	if (p->ignore_case)
 		regflags |= REG_ICASE;
-	if (opt->extended_regexp_option)
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
 		regflags |= REG_EXTENDED;
 	err = regcomp(&p->regexp, p->pattern, regflags);
 	if (err) {
diff --git a/grep.h b/grep.h
index b651eb291f7..ab2ce833b40 100644
--- a/grep.h
+++ b/grep.h
@@ -94,8 +94,7 @@ enum grep_expr_node {
 };
 
 enum grep_pattern_type {
-	GREP_PATTERN_TYPE_UNSPECIFIED = 0,
-	GREP_PATTERN_TYPE_BRE,
+	GREP_PATTERN_TYPE_BRE = 0,
 	GREP_PATTERN_TYPE_ERE,
 	GREP_PATTERN_TYPE_FIXED,
 	GREP_PATTERN_TYPE_PCRE
@@ -143,7 +142,6 @@ struct grep_opt {
 	int unmatch_name_only;
 	int count;
 	int word_regexp;
-	int fixed;
 	int all_match;
 #define GREP_BINARY_DEFAULT	0
 #define GREP_BINARY_NOMATCH	1
@@ -152,7 +150,6 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre2;
 	int relative;
 	int pathname;
 	int null_following_name;
@@ -162,7 +159,7 @@ struct grep_opt {
 	int funcname;
 	int funcbody;
 	int extended_regexp_option;
-	int pattern_type_option;
+	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
 	unsigned pre_context;
@@ -181,7 +178,6 @@ struct grep_opt {
 	.relative = 1, \
 	.pathname = 1, \
 	.max_depth = -1, \
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
 	.colors = { \
 		[GREP_COLOR_CONTEXT] = "", \
 		[GREP_COLOR_FILENAME] = "", \
@@ -200,7 +196,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
-void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
 void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index 495328e859c..298d0ea7574 100644
--- a/revision.c
+++ b/revision.c
@@ -2860,8 +2860,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	diff_setup_done(&revs->diffopt);
 
-	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
-				 &revs->grep_filter);
 	if (!is_encoding_utf8(get_log_output_encoding()))
 		revs->grep_filter.ignore_locale = 1;
 	compile_grep_patterns(&revs->grep_filter);
-- 
2.34.1.1239.g84ae229c870


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: [RFC/PATCH] grep: allow scripts to ignore configured pattern type
  2021-12-25  0:19             ` [RFC/PATCH] grep: allow scripts to ignore configured pattern type Junio C Hamano
@ 2021-12-26 23:09               ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-26 23:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


On Fri, Dec 24 2021, Junio C Hamano wrote:

> We made a mistake to add grep.extendedRegexp configuration variable
> long time ago, and made things even worse by introducing an even
> more generalized grep.patternType configuration variable.
>
> This was mostly because interactive users were lazy and wanted a way
> to declare "I do not live in the ancient age, and my regexps are
> always extended" and write "git grep" without having to type three
> more letters " -E" on the command line.
>
> But this in turn forces script writers to always specify what kind
> of patterns they are writing, because without such command line
> override, the interpretation of the patterns they write in their
> scripts will be affected by the configuration variables of the user
> who is running their script.
>
> Introduce GIT_DISABLE_GREP_PATTERN_CONFIG environment variable that
> script writers can set to "true" and export at the very beginning of
> their script to defeat grep.extendedRegexp and grep.patternType
> configuration variables.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>
>  * This is merely a weather balloon without proper documentation and
>    test.  It might be even better idea to make such an environment
>    variable to _specify_ what kind of pattern the script uses,
>    instead of "we defeat end-user configuration and now we are
>    forced to write in basic or have to write -E/-P etc.", which is
>    what this patch does.

You note the lack of documentation. I do think anything in this
direction would do well to:

 * Specify what it is we're promising now exactly. The git-grep
   command is in "main porcelain" now, this change sounds like we're
   promising to make its output more plumbing-like.

 * As an aside I think a good follow-up to my series would be to
   just start warning() and eventually die()-ing on grep.extendedRegexp
   which would make this a bit simpler.

 * A "GIT_DISABLE_GREP_PATTERN_CONFIG" seems overly narrow. Just a few
   lines from the code being patched here we read the grep.lineNumber
   config, which is similarly annoying if you're parsing the "git grep"
   output, so at least a "GIT_DISABLE_GREP_CONFIG" would be handy.

 * But more generally we've had discussions on and off on-list about
   supporting a generic way to disable reading the config. Supporting
   e.g. "git --no-config" or a "GIT_NO_CONFIG" would be handy, even if
   all it did for now (and we could document it as such) would be to
   change the behavior of grep.
 

>  grep.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/grep.c b/grep.c
> index fe847a0111..0cfb698b51 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -77,10 +77,15 @@ int grep_config(const char *var, const char *value, void *cb)
>  {
>  	struct grep_opt *opt = &grep_defaults;
>  	const char *slot;
> +	static int disable_pattern_type_config = -1;
>  
>  	if (userdiff_config(var, value) < 0)
>  		return -1;
>  
> +	if (disable_pattern_type_config < 0)
> +		disable_pattern_type_config =
> +			git_env_bool("GIT_DISABLE_GREP_PATTERN_CONFIG", 0);
> +
>  	/*
>  	 * The instance of grep_opt that we set up here is copied by
>  	 * grep_init() to be used by each individual invocation.
> @@ -90,12 +95,14 @@ int grep_config(const char *var, const char *value, void *cb)
>  	 */
>  
>  	if (!strcmp(var, "grep.extendedregexp")) {
> -		opt->extended_regexp_option = git_config_bool(var, value);
> +		if (!disable_pattern_type_config)
> +			opt->extended_regexp_option = git_config_bool(var, value);
>  		return 0;
>  	}
>  
>  	if (!strcmp(var, "grep.patterntype")) {
> -		opt->pattern_type_option = parse_pattern_type_arg(var, value);
> +		if (!disable_pattern_type_config)
> +			opt->pattern_type_option = parse_pattern_type_arg(var, value);
>  		return 0;
>  	}


^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: [PATCH v6 7/7] grep: simplify config parsing and option parsing
  2021-12-26 22:37         ` [PATCH v6 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2021-12-27  6:06           ` Junio C Hamano
  2021-12-27 18:51             ` Junio C Hamano
  0 siblings, 1 reply; 151+ messages in thread
From: Junio C Hamano @ 2021-12-27  6:06 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> @@ -143,7 +142,6 @@ struct grep_opt {
>  	int unmatch_name_only;
>  	int count;
>  	int word_regexp;
> -	int fixed;
>  	int all_match;
>  #define GREP_BINARY_DEFAULT	0
>  #define GREP_BINARY_NOMATCH	1
> @@ -152,7 +150,6 @@ struct grep_opt {
>  	int allow_textconv;
>  	int extended;
>  	int use_reflog_filter;
> -	int pcre2;
>  	int relative;
>  	int pathname;
>  	int null_following_name;
> @@ -162,7 +159,7 @@ struct grep_opt {
>  	int funcname;
>  	int funcbody;
>  	int extended_regexp_option;
> -	int pattern_type_option;
> +	enum grep_pattern_type pattern_type_option;
>  	int ignore_locale;
>  	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
>  	unsigned pre_context;
> @@ -181,7 +178,6 @@ struct grep_opt {
>  	.relative = 1, \
>  	.pathname = 1, \
>  	.max_depth = -1, \
> -	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
>  	.colors = { \
>  		[GREP_COLOR_CONTEXT] = "", \
>  		[GREP_COLOR_FILENAME] = "", \

I very much like the lossage of redundant fixed and pcre2 members.

As I kept telling you, we still need a separate bit to keep track of
the last value of grep.extendedRegexp, but the primary mechanism to
determine what pattern type to use should be a single enum that is
pattern_type.  When we see "fixed", "pcre", "-G", etc. from
grep.patternType config or from command line, we can stuff their
enum values in pattern_type member of this struct, and when we see
"default", we need to leave "default" in pattern_type member until
we see the last definition of grep.extendedRegexp, at which time
we can turn it into either "basic" or "extended".

So having only two members is absolutely the right thing to do.

But this part convinces me that whatever this patch does, it cannot
possibly be capable of doing the right thing.  You cannot
implement "we have to remember that the last grep.patternType we saw
was DEFAULT and in that case we cannot decide the real pattern type
until we see the last definition of grep.extendedRegexp, which may
be well after we saw the last grep.patternType definition" without a
value in this enum to express that the last value we saw was DEFAULT.

>  enum grep_pattern_type {
> -	GREP_PATTERN_TYPE_UNSPECIFIED = 0,
> -	GREP_PATTERN_TYPE_BRE,
> +	GREP_PATTERN_TYPE_BRE = 0,
>  	GREP_PATTERN_TYPE_ERE,
>  	GREP_PATTERN_TYPE_FIXED,
>  	GREP_PATTERN_TYPE_PCRE

> @@ -982,7 +981,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>  	argc = parse_options(argc, argv, prefix, options, grep_usage,
>  			     PARSE_OPT_KEEP_DASHDASH |
>  			     PARSE_OPT_STOP_AT_NON_OPTION);
> -	grep_commit_pattern_type(pattern_type_arg, &opt);

In other words, this lossage is likely wrong.  Let's keep reading
and see how well the config reader in this patch does. 

> @@ -61,11 +59,25 @@ int grep_config(const char *var, const char *value, void *cb)
>  		return -1;
>  
>  	if (!strcmp(var, "grep.extendedregexp")) {
> +		if (opt->extended_regexp_option == -1)
> +			return 0;
>  		opt->extended_regexp_option = git_config_bool(var, value);
> +		if (opt->extended_regexp_option)
> +			opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
> +		else
> +			opt->pattern_type_option = GREP_PATTERN_TYPE_BRE;
> +		return 0;
> +	}
> +	if (!strcmp(var, "grep.patterntype") &&
> +	    !strcmp(value, "default")) {
> +		opt->pattern_type_option = opt->extended_regexp_option == 1
> +			? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
>  		return 0;
>  	}
>  
>  	if (!strcmp(var, "grep.patterntype")) {
> +		opt->extended_regexp_option = -1; /* ignore */
>  		opt->pattern_type_option = parse_pattern_type_arg(var, value);
>  		return 0;
>  	}

The above does not look correct at all.

What happens when the configuration parser sees these configuration
variables in this sequence:

 - grep.patternType set to say "pcre" (or anything not "default").
 - grep.extendedRegexp set to "true".
 - grep.patternType set to "default".

After these three variable definitions with the usual "last one
wins" (for each variable independently), the last value for the
grep.patternType variable is "default", and the last value for
the grep.extendedRegexp variable is "true".  The user wants to use
the ERE patterns.

The way the above code would work on this three variable definition
sequence, as far as I read it, would however not give us the desired
behaviour.  First we drop extended_regexp_option member to -1 while
setting pattern_type_option member to PCRE, and then grep.extendedRegexp
is totally ignored, and then we see patterntype set to default and
notice extended_regexp_option is *NOT* 1 (because you ignored it and
left it to -1), and end up using BRE, no?
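
The walk-through above can be replayed with a reduced model of the quoted
hunk. This is a sketch under stated assumptions, not git's code: the v6_*
names, the two-member struct and the simplified value parsing are all
hypothetical, and "fixed" stands in for "anything not default".

```c
#include <assert.h>
#include <string.h>

/* Mirrors the v6 enum, which no longer has an "unspecified" value. */
enum v6_pattern_type { V6_BRE = 0, V6_ERE, V6_FIXED, V6_PCRE };

/* Hypothetical reduction of "struct grep_opt" to the two relevant members. */
struct v6_opt {
	int extended_regexp_option; /* -1 marks "ignore from now on" */
	enum v6_pattern_type pattern_type_option;
};

/* Simplified stand-in for parse_pattern_type_arg(); unknown maps to BRE. */
static enum v6_pattern_type v6_parse_type(const char *value)
{
	if (!strcmp(value, "extended"))
		return V6_ERE;
	if (!strcmp(value, "fixed"))
		return V6_FIXED;
	if (!strcmp(value, "perl"))
		return V6_PCRE;
	return V6_BRE;
}

/* Same branch structure as the grep_config() hunk quoted above. */
void v6_config(struct v6_opt *opt, const char *var, const char *value)
{
	assert(opt && var && value);

	if (!strcmp(var, "grep.extendedregexp")) {
		if (opt->extended_regexp_option == -1)
			return;
		opt->extended_regexp_option = !strcmp(value, "true");
		opt->pattern_type_option =
			opt->extended_regexp_option ? V6_ERE : V6_BRE;
		return;
	}
	if (!strcmp(var, "grep.patterntype") && !strcmp(value, "default")) {
		opt->pattern_type_option =
			opt->extended_regexp_option == 1 ? V6_ERE : V6_BRE;
		return;
	}
	if (!strcmp(var, "grep.patterntype")) {
		opt->extended_regexp_option = -1; /* ignore */
		opt->pattern_type_option = v6_parse_type(value);
	}
}

/* Replays the three settings in order and returns the resolved type. */
enum v6_pattern_type v6_replay(void)
{
	struct v6_opt opt = { 0, V6_BRE };

	v6_config(&opt, "grep.patterntype", "fixed");
	v6_config(&opt, "grep.extendedregexp", "true");
	v6_config(&opt, "grep.patterntype", "default");
	return opt.pattern_type_option;
}
```

In this model v6_replay() returns V6_BRE: the second setting is dropped
because the first one left extended_regexp_option at -1, so the final
"default" has nothing to fall back on.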

I agree 100% with the direction that .fixed and .pcre2 members that
were added over time to the struct are redundant and it is a very
good idea to get rid of them.  But we need to keep track of two
configuration variables separately to allow them the "last one wins"
semantics independently, and for that, you cannot lose the "default"
value from the enum.  It is impossible not to store the fact that
"default" was the last value so far we saw for grep.patternType
because you do not know, at the point of seeing "default", what the
final value for grep.extendedRegexp will be.  If you want to correctly
implement the interaction between two variables without regression,
that is.

^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: [PATCH v6 7/7] grep: simplify config parsing and option parsing
  2021-12-27  6:06           ` Junio C Hamano
@ 2021-12-27 18:51             ` Junio C Hamano
  0 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2021-12-27 18:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Junio C Hamano <gitster@pobox.com> writes:

> The above does not look correct at all.
>
> What happens when the configuration parser sees these configuration
> variables in this sequence:
>
>  - grep.patternType set to say "pcre" (or anything not "default").
>  - grep.extendedRegexp set to "true".
>  - grep.patternType set to "default".
>
> After these three variable definitions with the usual "last one
> wins" (for each variable independently), the last value for the
> grep.patternType variable is "default", and the last value for
> the grep.extendedRegexp variable is "true".  The user wants to use
> the ERE patterns.

By the way, the example I gave you for the previous round, and
similarly the one in the message I am responding to were all written
to help you realize that it is simply a broken approach if we do not
keep "default" as default and instead resolve it to either "basic"
or "extended" too early.  The goal of these examples was *NOT* to
tell you "this single thing is broken with the code in this round so
let's fix it".

It seems I am not succeeding in conveying that point, and I
especially smell that in the change between v5 and v6.

So let me try to be a bit more explicit.  Let's not do another round
of "I think this is a moral equivalent of what you want, even though
it is not done the way you suggested." I think we wasted a reroll or
three with that attitude in changes leading to v6 already, after I
gave my review to v5, and I think the v5 review essentially was a
repeat of my review for v3's 3/7, so if I conveyed the point clearly
enough back then, perhaps we didn't have to waste your time on v4
and v5, either.  Sorry about that.

So, here is what this step of the series SHOULD do:

 * Use two members to keep track of the final configuration value we
   saw for grep.patternType and grep.extendedRegexp independently.
   The existing .fixed and .pcre2 fields are superfluous.  But no
   more "ah, we see patternType so let's ignore extendedRegexp"
   games.

 * When parsing the command line options -G, -E, etc., update the
   .patternType member with the value found.  We neither want nor
   need to touch .extendedRegexp member, whose SOLE purpose should
   be to keep track of "what the last value we saw for
   grep.extendedRegexp configuration variable".

 * Do ALL THE ABOVE while keeping "default" in the .patternType
   member as "default" as-is given by the user; do not turn it into
   "basic" or "extended" in config callback at all.

 * At some point of your choice between the time we finished parsing
   both configuration variables and command line options and the
   time we compile the pattern string to regexp objects of various
   types, look at the .patternType member and resolve it into
   basic/extended IFF it is set to default, using .extendedRegexp
   member (for this to work correctly, it is important not to let
   -E/-G command line options to touch .extendedRegexp member---it
   should be used ONLY to keep track of "what the last value we saw
   for grep.extendedRegexp configuration variable").

 * After the above step is done, .extendedRegexp member is no longer
   needed and we can compile the pattern using only the value in
   .patternType member.

The penultimate bullet point gives us wiggle room to lose the
"commit" thing and delay it until the very last moment, the function
that decides which regexp engine's regcomp to call.  The important
thing is that we cannot lose the value "default" from .patternType
field or lose the last value given to .extendedRegexp field too
early, namely, before we have read all the configuration streams and
know the last value for these two variables.
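
To make the bullet points above concrete, here is a minimal C sketch
of the "record independently, resolve late" idea.  It is only an
illustration, not code from grep.c: the names sketch_opt,
sketch_config(), sketch_cmdline() and sketch_resolve() are
hypothetical, boolean parsing is reduced to the literal string "true",
and unrecognized patternType values are simply treated as "default".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names throughout; git's own enum and struct differ. */
enum sketch_type {
	SKETCH_DEFAULT = 0,	/* kept as-is until sketch_resolve() */
	SKETCH_BRE,
	SKETCH_ERE,
	SKETCH_FIXED,
	SKETCH_PCRE
};

struct sketch_opt {
	enum sketch_type pattern_type;	/* last grep.patternType or -G/-E/-F/-P */
	int extended_regexp;		/* last grep.extendedRegexp, -1 if unset */
};

#define SKETCH_OPT_INIT { SKETCH_DEFAULT, -1 }

static enum sketch_type sketch_parse_type(const char *value)
{
	if (!strcmp(value, "basic"))
		return SKETCH_BRE;
	if (!strcmp(value, "extended"))
		return SKETCH_ERE;
	if (!strcmp(value, "fixed"))
		return SKETCH_FIXED;
	if (!strcmp(value, "perl"))
		return SKETCH_PCRE;
	return SKETCH_DEFAULT;	/* simplification: "default" and anything else */
}

/* Config callback: only records the last value seen for each variable. */
static void sketch_config(struct sketch_opt *opt, const char *var,
			  const char *value)
{
	assert(opt && var && value);
	if (!strcmp(var, "grep.extendedregexp"))
		opt->extended_regexp = !strcmp(value, "true");
	else if (!strcmp(var, "grep.patterntype"))
		opt->pattern_type = sketch_parse_type(value);
}

/* Command line (-G, -E, ...): touches .pattern_type only. */
static void sketch_cmdline(struct sketch_opt *opt, enum sketch_type type)
{
	assert(opt);
	opt->pattern_type = type;
}

/* Called once, after all config and command line options were read. */
static enum sketch_type sketch_resolve(const struct sketch_opt *opt)
{
	assert(opt);
	if (opt->pattern_type != SKETCH_DEFAULT)
		return opt->pattern_type;
	return opt->extended_regexp == 1 ? SKETCH_ERE : SKETCH_BRE;
}
```

In this sketch the sequence patternType=fixed, extendedRegexp=true,
patternType=default resolves to ERE, because "default" survives until
sketch_resolve() and the last extendedRegexp value is still known
there.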

Thanks.  Hopefully I was clear enough this time.

----- >8 ---- ----- >8 ---- ----- >8 ---- ----- >8 ---- ----- >8 -----
Subject: [PATCH] fixup! grep tests: add missing "grep.patternType" config tests

---
 t/t7810-grep.sh | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 664f884e12..2e2829ee55 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -471,6 +471,16 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.patternType=fixed \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
 		echo "${HC}ab:a+bc" >expected &&
 		git \
-- 
2.34.1-568-g69e9fd72b5



* [PATCH v7 00/10] grep: simplify & delete "init" & "config" code
  2021-12-26 22:37       ` [PATCH v6 " Ævar Arnfjörð Bjarmason
                           ` (6 preceding siblings ...)
  2021-12-26 22:37         ` [PATCH v6 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2021-12-28  1:07         ` Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 01/10] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
                             ` (10 more replies)
  7 siblings, 11 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28  1:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

A re-roll that should *hopefully* address the comments Junio had in
https://lore.kernel.org/git/xmqq4k6tyj8r.fsf@gitster.g/; to quote
those:

> Junio C Hamano <gitster@pobox.com> writes:
> [...]
> By the way, the example I gave you for the previous round, and
> similarly the one in the message I am responding to were all written
> to help you realize that it is simply a broken approach if we do not
> keep "default" as default and instead resolve it to either "basic"
> or "extended" too early.  The goal of these examples was *NOT* to
> tell you "this single thing is broken with the code in this round so
> let's fix it".

Yes, the v6 was broken. Sorry about that. Between re-rolling some
other things and coming back to this series I managed to miss some of
those subtleties.

> So let me try to be a bit more explicit.  Let's not do another round
> of "I think this is a moral equivalent of what you want, even though
> it is not done the way you suggested." I think we wasted a reroll or
> three with that attitude in changes leading to v6 already, after I
> gave my review to v5, and I think the v5 review essentially was a
> repeat of my review for v3's 3/7, so if I conveyed the point clearly
> enough back then, perhaps we didn't have to waste your time on v4
> and v5, either.  Sorry about that.
>
> So, here is what this step of the series SHOULD do:

So, at the risk of another wasted re-roll, this doesn't exactly
implement what you suggested, but I think it'll finally cover all the
bases of being bug-for-bug compatible with the old implementation now.

I started this re-roll by implementing it exactly as you suggested,
along with new tests (both your fixup, and more), but as the updated
commit message in 09/10 details, I found that I could convert it back
to a state machine around a static variable in "grep_config()", which
has the advantage of getting rid of more "struct grep_opt" fields and
narrowing the behavior to just that function.
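
As a rough picture of what "resolving while the config is streamed"
can look like, here is a minimal C sketch.  It is not the code in
09/10: the names stream_state and stream_config() are hypothetical,
the state lives in a caller-provided struct rather than in a static
variable so that it can be exercised in isolation, it uses an extra
"type_is_default" flag, and boolean parsing is reduced to the literal
string "true".

```c
#include <assert.h>
#include <string.h>

enum stream_type { STREAM_BRE, STREAM_ERE, STREAM_FIXED, STREAM_PCRE };

/* Hypothetical state for this sketch only. */
struct stream_state {
	enum stream_type effective;	/* usable at any time, no final step */
	int type_is_default;		/* last grep.patternType was "default" or unset */
	int ero;			/* last grep.extendedRegexp, -1 if unset */
};

#define STREAM_STATE_INIT { STREAM_BRE, 1, -1 }

static void stream_config(struct stream_state *s, const char *var,
			  const char *value)
{
	assert(s && var && value);
	if (!strcmp(var, "grep.extendedregexp")) {
		s->ero = !strcmp(value, "true");	/* simplified boolean */
	} else if (!strcmp(var, "grep.patterntype")) {
		s->type_is_default = 0;
		if (!strcmp(value, "basic"))
			s->effective = STREAM_BRE;
		else if (!strcmp(value, "extended"))
			s->effective = STREAM_ERE;
		else if (!strcmp(value, "fixed"))
			s->effective = STREAM_FIXED;
		else if (!strcmp(value, "perl"))
			s->effective = STREAM_PCRE;
		else
			s->type_is_default = 1;	/* "default"; unknown values too */
	} else {
		return;
	}

	/* While "default" is in effect, grep.extendedRegexp decides. */
	if (s->type_is_default)
		s->effective = s->ero == 1 ? STREAM_ERE : STREAM_BRE;
}
```

The value of "ero" is remembered even while a non-"default"
patternType is in effect, so a later patternType=default can still
pick ERE or BRE from it.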

More comments on the range-diff below:

Ævar Arnfjörð Bjarmason (10):
  grep.h: remove unused "regex_t regexp" from grep_opt
  log tests: check if grep_config() is called by "log"-like cmds

No changes.

  grep tests: add missing "grep.patternType" config tests

A couple of new tests.

  built-ins: trust the "prefix" from run_builtin()
  grep.c: don't pass along NULL callback value
  grep API: call grep_config() after grep_init()

No changes.

  grep.h: make "grep_opt.pattern_type_option" use its enum
  grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"

I split these two off from the previous tip commit to make the diff
for the next one smaller.

  grep: simplify config parsing and option parsing

Changed, as noted above.

  grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED

With that new behavior we can also get rid of
GREP_PATTERN_TYPE_UNSPECIFIED. It's still implicitly there as == 0,
but for a non-switch'd enum type treating these as "flags" makes more
sense, and makes the code in grep_config() more straightforward &
brief.

 builtin/grep.c    |  27 +++++-----
 builtin/log.c     |  13 ++++-
 builtin/ls-tree.c |   2 +-
 git.c             |   1 +
 grep.c            | 126 +++++++++-------------------------------------
 grep.h            |  34 +++++++++----
 revision.c        |   4 +-
 t/t4202-log.sh    |  24 +++++++++
 t/t7810-grep.sh   |  68 +++++++++++++++++++++++++
 9 files changed, 168 insertions(+), 131 deletions(-)

Range-diff against v6:
 1:  b62e6b6162a =  1:  b62e6b6162a grep.h: remove unused "regex_t regexp" from grep_opt
 2:  0edcdb50afd =  2:  0edcdb50afd log tests: check if grep_config() is called by "log"-like cmds
 3:  1b724d5e2e9 !  3:  e1b4b5b77e0 grep tests: add missing "grep.patternType" config tests
    @@ t/t7810-grep.sh: do
     +		test_cmp expected actual
     +	'
     +
    ++	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
    ++		echo "${HC}ab:abc" >expected &&
    ++		git \
    ++			-c grep.patternType=fixed \
    ++			-c grep.extendedRegexp=true \
    ++			-c grep.patternType=default \
    ++			grep "a+b*c" $H ab >actual &&
    ++		test_cmp expected actual
    ++	'
    ++
    ++	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
    ++		echo "${HC}ab:a+bc" >expected &&
    ++		git \
    ++			-c grep.patternType=default \
    ++			-c grep.extendedRegexp=true \
    ++			-c grep.patternType=basic \
    ++			grep "a+b*c" $H ab >actual &&
    ++		test_cmp expected actual
    ++	'
    ++
     +	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
     +		echo "${HC}ab:a+bc" >expected &&
     +		git \
    @@ t/t7810-grep.sh: do
      	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
      		echo "${HC}ab:abc" >expected &&
      		git \
    +@@ t/t7810-grep.sh: do
    + 		test_cmp expected actual
    + 	'
    + 
    ++	test_expect_success "grep $L with grep.extendedRegexp=false and grep.patternType=default" '
    ++		echo "${HC}ab:abc" >expected &&
    ++		git \
    ++			-c grep.extendedRegexp=false \
    ++			-c grep.patternType=extended \
    ++			grep "a+b*c" $H ab >actual &&
    ++		test_cmp expected actual
    ++	'
    ++
    + 	test_expect_success "grep $L with grep.extendedRegexp=true and grep.patternType=basic" '
    + 		echo "${HC}ab:a+bc" >expected &&
    + 		git \
 4:  f4876552771 =  4:  6d91a765fd7 built-ins: trust the "prefix" from run_builtin()
 5:  069b0339146 =  5:  844b4727ca3 grep.c: don't pass along NULL callback value
 6:  e38eca56959 =  6:  d9cf9bf5e37 grep API: call grep_config() after grep_init()
 -:  ----------- >  7:  57ecc5c0d65 grep.h: make "grep_opt.pattern_type_option" use its enum
 -:  ----------- >  8:  7dbeafde26b grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
 7:  88dfd40bf9e !  9:  c6ca39b4554 grep: simplify config parsing and option parsing
    @@ Commit message
         grep_init(), which means that much of the complexity here can go
         away.
     
    -    As before "grep.extendedRegexp" is a last-one-wins variable. We need
    -    to maintain state inside parse_pattern_type_arg() to ignore it if a
    -    non-"default" grep.patternType was seen, but otherwise flip between
    -    BRE and ERE for "grep.extendedRegexp=[false|true]".
    +    As before both "grep.patternType" and "grep.extendedRegexp" are
    +    last-one-wins variables, with "grep.extendedRegexp" yielding to
    +    "grep.patternType", except when "grep.patternType=default".
    +
    +    Note that this applies as we parse the config, i.e. a sequence of:
    +
    +        -c grep.patternType=perl \
    +        -c grep.extendedRegexp=true \
    +        -c grep.patternType=default
    +
    +    Should select ERE due to "grep.extendedRegexp=true" and
    +    "grep.patternType=default", not BRE, even though that's the
    +    "default" patternType. We can determine this as we parse the config,
    +    because:
    +
    +     * If we see "grep.extendedRegexp" we set the internal "ero" to its
    +       boolean value.
    +
    +     * If we see "grep.extendedRegexp" but
    +       "grep.patternType=[default|<unset>]" is in effect we *don't* set
    +       the internal "pattern_type_option" to update the pattern type.
    +
    +     * If we see "grep.patternType!=default" we can set our internal
    +       "pattern_type_option" directly, it doesn't matter what the state of
    +       "grep.extendedRegexp" is, but we don't forget what it was, in case
    +       we see a "grep.patternType=default" again.
    +
    +     * If we see a "grep.patternType=default" we can set the pattern to
    +       ERE or BRE depending on whether we last saw a
    +       "grep.extendedRegexp=true" or
    +       "grep.extendedRegexp=[false|<unset>]".
    +
    +    We could equally call this new adjust_pattern_type() in
    +    compile_regexp(), i.e. this fixup on top of this passes all our
    +    tests (with -U0 for brevity):
    +
    +        @@ -60,0 +61 @@ static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
    +        +static int ero = -1;
    +        @@ -65 +65,0 @@ int grep_config(const char *var, const char *value, void *cb)
    +        -       static int ero = -1;
    +        @@ -72 +71,0 @@ int grep_config(const char *var, const char *value, void *cb)
    +        -               adjust_pattern_type(&opt->pattern_type_option, ero);
    +        @@ -80 +78,0 @@ int grep_config(const char *var, const char *value, void *cb)
    +        -               adjust_pattern_type(&opt->pattern_type_option, ero);
    +        @@ -445,0 +444,2 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
    +        +       if (ero != -1)
    +        +               adjust_pattern_type(&opt->pattern_type_option, ero);
    +
    +    But doing it as we stream the git_config() makes it
    +    clear that we can determine the interplay between these two variables
    +    as we go. We don't need to wait until we see the last value of the two
    +    configuration variables.
    +
    +    This is true because of the rationale above, and because the
    +    subsequent code in compile_regexp() treats
    +    "pattern_type_option=GREP_PATTERN_TYPE_{UNSPECIFIED,BRE}"
    +    equally. I.e. we'll end up with different internal
    +    "pattern_type_option" values there for:
    +
    +        # UNSPECIFIED
    +        -c grep.patternType=default
    +        # BRE
    +        -c grep.extendedRegexp=false -c grep.patternType=default
    +
    +    But the difference won't matter, which simplifies some of this logic,
    +    we never need to adjust a "grep.patternType" if we didn't see a
    +    "grep.extendedRegexp" before. We can also remove the
    +    "extended_regexp_option" member from "struct grep_opt" in favor of a
    +    static variable in grep_config().
    +
    +    The command-line parsing in cmd_grep() can then completely ignore
    +    "grep.extendedRegexp". Whatever effect it had before that step won't
    +    matter if we see -G, -E, -P etc.
     
         See my 07a3d411739 (grep: remove regflags from the public grep_opt
         API, 2017-06-29) for addition of the two comments being removed here,
         i.e. the complexity noted in that commit is now going away.
     
    -    We don't need grep_commit_pattern_type() anymore, we can instead have
    -    OPT_SET_INT() in "builtin/grep.c" manipulate the "pattern_type_option"
    -    member in "struct grep_opt" directly.
    -
    -    We can also do away with the indirection of the "int fixed" and "int
    -    pcre2" members in favor of using "pattern_type_option" directly in
    -    "grep.c".
    -
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/grep.c ##
    @@ builtin/grep.c: int cmd_grep(int argc, const char **argv, const char *prefix)
      		int fallback = 0;
     
      ## grep.c ##
    -@@ grep.c: static const char *color_grep_slots[] = {
    +@@ grep.c: static int parse_pattern_type_arg(const char *opt, const char *arg)
      
    - static int parse_pattern_type_arg(const char *opt, const char *arg)
    - {
    --	if (!strcmp(arg, "default"))
    --		return GREP_PATTERN_TYPE_UNSPECIFIED;
    --	else if (!strcmp(arg, "basic"))
    -+	if (!strcmp(arg, "basic"))
    - 		return GREP_PATTERN_TYPE_BRE;
    - 	else if (!strcmp(arg, "extended"))
    - 		return GREP_PATTERN_TYPE_ERE;
    + define_list_config_array_extra(color_grep_slots, {"match"});
    + 
    ++static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
    ++{
    ++	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
    ++		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
    ++}
    ++
    + /*
    +  * Read the configuration file once and store it in
    +  * the grep_defaults template.
     @@ grep.c: int grep_config(const char *var, const char *value, void *cb)
    + {
    + 	struct grep_opt *opt = cb;
    + 	const char *slot;
    ++	static int ero = -1;
    + 
    + 	if (userdiff_config(var, value) < 0)
      		return -1;
      
      	if (!strcmp(var, "grep.extendedregexp")) {
    -+		if (opt->extended_regexp_option == -1)
    -+			return 0;
    - 		opt->extended_regexp_option = git_config_bool(var, value);
    -+		if (opt->extended_regexp_option)
    -+			opt->pattern_type_option = GREP_PATTERN_TYPE_ERE;
    -+		else
    -+			opt->pattern_type_option = GREP_PATTERN_TYPE_BRE;
    -+		return 0;
    -+	}
    -+
    -+	if (!strcmp(var, "grep.patterntype") &&
    -+	    !strcmp(value, "default")) {
    -+		opt->pattern_type_option = opt->extended_regexp_option == 1
    -+			? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
    +-		opt->extended_regexp_option = git_config_bool(var, value);
    ++		ero = git_config_bool(var, value);
    ++		adjust_pattern_type(&opt->pattern_type_option, ero);
      		return 0;
      	}
      
      	if (!strcmp(var, "grep.patterntype")) {
    -+		opt->extended_regexp_option = -1; /* ignore */
      		opt->pattern_type_option = parse_pattern_type_arg(var, value);
    ++		if (ero == -1)
    ++			return 0;
    ++		adjust_pattern_type(&opt->pattern_type_option, ero);
      		return 0;
      	}
    + 
     @@ grep.c: void grep_init(struct grep_opt *opt, struct repository *repo)
      	opt->header_tail = &opt->header_list;
      }
    @@ grep.c: static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
     -	p->fixed = opt->fixed;
     +	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
      
    --	if (memchr(p->pattern, 0, p->patternlen) && !opt->pcre2)
    +-	if (!opt->pcre2 &&
     +	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
    -+	    memchr(p->pattern, 0, p->patternlen))
    + 	    memchr(p->pattern, 0, p->patternlen))
      		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
      
    - 	p->is_fixed = is_fixed(p->pattern, p->patternlen);
     @@ grep.c: static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
      		return;
      	}
    @@ grep.c: static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
      	if (err) {
     
      ## grep.h ##
    -@@ grep.h: enum grep_expr_node {
    - };
    - 
    - enum grep_pattern_type {
    --	GREP_PATTERN_TYPE_UNSPECIFIED = 0,
    --	GREP_PATTERN_TYPE_BRE,
    -+	GREP_PATTERN_TYPE_BRE = 0,
    - 	GREP_PATTERN_TYPE_ERE,
    - 	GREP_PATTERN_TYPE_FIXED,
    - 	GREP_PATTERN_TYPE_PCRE
     @@ grep.h: struct grep_opt {
      	int unmatch_name_only;
      	int count;
    @@ grep.h: struct grep_opt {
      	int pathname;
      	int null_following_name;
     @@ grep.h: struct grep_opt {
    + 	int max_depth;
      	int funcname;
      	int funcbody;
    - 	int extended_regexp_option;
    --	int pattern_type_option;
    -+	enum grep_pattern_type pattern_type_option;
    +-	int extended_regexp_option;
    + 	enum grep_pattern_type pattern_type_option;
      	int ignore_locale;
      	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
    - 	unsigned pre_context;
    -@@ grep.h: struct grep_opt {
    - 	.relative = 1, \
    - 	.pathname = 1, \
    - 	.max_depth = -1, \
    --	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
    - 	.colors = { \
    - 		[GREP_COLOR_CONTEXT] = "", \
    - 		[GREP_COLOR_FILENAME] = "", \
     @@ grep.h: struct grep_opt {
      
      int grep_config(const char *var, const char *value, void *);
 -:  ----------- > 10:  b764c09d2b7 grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED
-- 
2.34.1.1250.g6a242c1e9ad



* [PATCH v7 01/10] grep.h: remove unused "regex_t regexp" from grep_opt
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
@ 2021-12-28  1:07           ` Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 02/10] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
                             ` (9 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28  1:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This "regex_t" in grep_opt has not been used since
f9b9faf6f8a (builtin-grep: allow more than one patterns., 2006-05-02).
We still use a "regex_t" for compiling regexes, but that's in the
"grep_pat" struct.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.h b/grep.h
index 3e8815c347b..95cccb670f9 100644
--- a/grep.h
+++ b/grep.h
@@ -136,7 +136,6 @@ struct grep_opt {
 
 	const char *prefix;
 	int prefix_length;
-	regex_t regexp;
 	int linenum;
 	int columnnum;
 	int invert;
-- 
2.34.1.1250.g6a242c1e9ad



* [PATCH v7 02/10] log tests: check if grep_config() is called by "log"-like cmds
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 01/10] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
@ 2021-12-28  1:07           ` Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 03/10] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
                             ` (8 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28  1:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
for pattern style options & config, 2017-05-20) to check not only
whether "git log" handles "grep.patternType", but also "git show"
etc.

It's sufficient to check whether we match a "fixed" or a "basic" regex
here to see if these codepaths correctly invoked grep_config(). We
don't need to check the details of their regular expression matching
as the "log" test does.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 2ced7e9d817..2490f2cd5ed 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -449,6 +449,30 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
 	)
 '
 
+for cmd in show whatchanged reflog format-patch
+do
+	case "$cmd" in
+	format-patch) myarg="HEAD~.." ;;
+	*) myarg= ;;
+	esac
+
+	test_expect_success "$cmd: understands grep.patternType, like 'log'" '
+		git init "pattern-type-$cmd" &&
+		(
+			cd "pattern-type-$cmd" &&
+			test_commit 1 file A &&
+			test_commit "(1|2)" file B 2 &&
+
+			git -c grep.patternType=fixed $cmd --grep="..." $myarg >actual &&
+			test_must_be_empty actual &&
+
+			git -c grep.patternType=basic $cmd --grep="..." $myarg >actual &&
+			test_file_not_empty actual
+		)
+	'
+done
+test_done
+
 test_expect_success 'log --author' '
 	cat >expect <<-\EOF &&
 	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>
-- 
2.34.1.1250.g6a242c1e9ad



* [PATCH v7 03/10] grep tests: add missing "grep.patternType" config tests
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 01/10] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 02/10] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2021-12-28  1:07           ` Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 04/10] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
                             ` (7 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28  1:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the grep tests to assert that setting
"grep.patternType=extended" followed by "grep.patternType=default"
will behave as if "--basic-regexp" was provided, and not as
"--extended-regexp". In a subsequent commit we'll need to treat
"grep.patternType=default" as a special-case, but let's make sure we
ignore it if it's being set to "default" following an earlier
non-"default" "grep.patternType" setting.

Let's also test what happens when we have a sequence of "extended"
followed by "default" and "fixed"; in that case the "fixed" should
prevail. Let's also add tests to check that a
"grep.extendedRegexp=true" followed by a "grep.extendedRegexp=false"
behaves as though "grep.extendedRegexp" wasn't provided.

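For readers checking the expected output in these tests, the three
candidate lines ("a+b*c", "a+bc" and "abc") each match the pattern
"a+b*c" under exactly one of fixed, BRE and ERE matching. The sketch
below illustrates that with POSIX regcomp(); the helper
line_matches() is hypothetical and not part of git, and it assumes the
platform's regcomp() treats "+" as a literal in a BRE, as POSIX
specifies.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

enum match_mode { MATCH_FIXED, MATCH_BRE, MATCH_ERE };

/*
 * Hypothetical helper: returns 1 if "line" matches "pattern" in the
 * given mode, 0 if it does not, -1 if the pattern fails to compile.
 */
static int line_matches(const char *pattern, const char *line,
			enum match_mode mode)
{
	regex_t re;
	int flags = REG_NOSUB;
	int ret;

	assert(pattern && line);
	if (mode == MATCH_FIXED)
		return strstr(line, pattern) != NULL;
	if (mode == MATCH_ERE)
		flags |= REG_EXTENDED;
	if (regcomp(&re, pattern, flags))
		return -1;
	ret = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return ret;
}
```

Under these assumptions only "a+b*c" matches in fixed mode, only
"a+bc" matches as a BRE, and only "abc" matches as an ERE, which is
why the "expected" file alone tells us which pattern type was used.
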
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t7810-grep.sh | 68 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 6b6423a07c3..dce653d749d 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -451,6 +451,65 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=basic \
+			-c grep.extendedRegexp=false \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins & defers to grep.patternType" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=extended \
+			-c grep.extendedRegexp=false \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.patternType=fixed \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=default \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=basic \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.patternType=[extended -> default -> fixed]" '
+		echo "${HC}ab:a+b*c" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			-c grep.patternType=fixed \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
 		echo "${HC}ab:abc" >expected &&
 		git \
@@ -478,6 +537,15 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.extendedRegexp=false and grep.patternType=default" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.extendedRegexp=false \
+			-c grep.patternType=extended \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.extendedRegexp=true and grep.patternType=basic" '
 		echo "${HC}ab:a+bc" >expected &&
 		git \
-- 
2.34.1.1250.g6a242c1e9ad



* [PATCH v7 04/10] built-ins: trust the "prefix" from run_builtin()
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                             ` (2 preceding siblings ...)
  2021-12-28  1:07           ` [PATCH v7 03/10] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
@ 2021-12-28  1:07           ` Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 05/10] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
                             ` (6 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28  1:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the
"prefix" passed from "run_builtin()". The "prefix" we get from setup.c
is either going to be NULL or a string of length >0, never "".

So we can drop the "prefix && *prefix" checks added for
"builtin/grep.c" in 0d042fecf2f (git-grep: show pathnames relative to
the current directory, 2006-08-11), and for "builtin/ls-tree.c" in
a69dd585fca (ls-tree: chomp leading directories when run from a
subdirectory, 2005-12-23).

As seen in code in revision.c that was added in cd676a51367 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that does away with this
assertion.

This makes it easier to reason about a change to the "prefix_length"
code in grep.c in a subsequent commit, and since we're going to the
trouble of doing that let's leave behind an assert() to promise this
to any future callers.

For "builtin/grep.c" it would be painful to pass the "prefix" down the
callchain of:

    cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
    grep_source_name

So for the code that needs it in grep_source_name() let's add a
"grep_prefix" variable similar to the existing "ls_tree_prefix".

While at it let's move the code in cmd_ls_tree() around so that we
assign to the "ls_tree_prefix" right after declaring the variables,
and stop assigning to "prefix". We only used that variable later in
the function, after clobbering it. Let's just use our own
"grep_prefix" instead.

Let's also add an assert() in git.c, so that we'll make this promise
about the "prefix" to any current and future callers, as well as to
any readers of the code.

Code history:

 * The strlen() in "grep.c" hasn't been used since 493b7a08d80 (grep:
   accept relative paths outside current working directory, 2009-09-05).

   When that code was added in 0d042fecf2f (git-grep: show pathnames
   relative to the current directory, 2006-08-11) we used the length.

   But since 493b7a08d80 we haven't used it for anything except a
   boolean check that we could have done on the "prefix" member
   itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c    | 13 ++++++++-----
 builtin/ls-tree.c |  2 +-
 git.c             |  1 +
 grep.c            |  4 +---
 grep.h            |  4 +---
 revision.c        |  2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 9e34a820ad4..d85cbabea67 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,6 +26,8 @@
 #include "object-store.h"
 #include "packfile.h"
 
+static const char *grep_prefix;
+
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
 	NULL
@@ -315,11 +317,11 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 	strbuf_reset(out);
 
 	if (opt->null_following_name) {
-		if (opt->relative && opt->prefix_length) {
+		if (opt->relative && grep_prefix) {
 			struct strbuf rel_buf = STRBUF_INIT;
 			const char *rel_name =
 				relative_path(filename + tree_name_len,
-					      opt->prefix, &rel_buf);
+					      grep_prefix, &rel_buf);
 
 			if (tree_name_len)
 				strbuf_add(out, filename, tree_name_len);
@@ -332,8 +334,8 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 		return;
 	}
 
-	if (opt->relative && opt->prefix_length)
-		quote_path(filename + tree_name_len, opt->prefix, out, 0);
+	if (opt->relative && grep_prefix)
+		quote_path(filename + tree_name_len, grep_prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
 
@@ -962,9 +964,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_END()
 	};
+	grep_prefix = prefix;
 
 	git_config(grep_cmd_config, NULL);
-	grep_init(&opt, the_repository, prefix);
+	grep_init(&opt, the_repository);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c71..6cb554cbb0a 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -150,7 +150,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 	ls_tree_prefix = prefix;
-	if (prefix && *prefix)
+	if (prefix)
 		chomp_prefix = strlen(prefix);
 
 	argc = parse_options(argc, argv, prefix, ls_tree_options,
diff --git a/git.c b/git.c
index 7edafd8ecff..575d95046f2 100644
--- a/git.c
+++ b/git.c
@@ -436,6 +436,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 	} else {
 		prefix = NULL;
 	}
+	assert(!prefix || *prefix);
 	precompose_argv_prefix(argc, argv, NULL);
 	if (use_pager == -1 && run_setup &&
 		!(p->option & DELAY_PAGER_CONFIG))
diff --git a/grep.c b/grep.c
index fe847a0111a..12b202598a9 100644
--- a/grep.c
+++ b/grep.c
@@ -139,13 +139,11 @@ int grep_config(const char *var, const char *value, void *cb)
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
  */
-void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
+void grep_init(struct grep_opt *opt, struct repository *repo)
 {
 	*opt = grep_defaults;
 
 	opt->repo = repo;
-	opt->prefix = prefix;
-	opt->prefix_length = (prefix && *prefix) ? strlen(prefix) : 0;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
diff --git a/grep.h b/grep.h
index 95cccb670f9..62deadb885f 100644
--- a/grep.h
+++ b/grep.h
@@ -134,8 +134,6 @@ struct grep_opt {
 	 */
 	struct repository *repo;
 
-	const char *prefix;
-	int prefix_length;
 	int linenum;
 	int columnnum;
 	int invert;
@@ -180,7 +178,7 @@ struct grep_opt {
 };
 
 int grep_config(const char *var, const char *value, void *);
-void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index 5390a479b30..495328e859c 100644
--- a/revision.c
+++ b/revision.c
@@ -1838,7 +1838,7 @@ void repo_init_revisions(struct repository *r,
 	revs->commit_format = CMIT_FMT_DEFAULT;
 	revs->expand_tabs_in_log_default = 8;
 
-	grep_init(&revs->grep_filter, revs->repo, prefix);
+	grep_init(&revs->grep_filter, revs->repo);
 	revs->grep_filter.status_only = 1;
 
 	repo_diff_setup(revs->repo, &revs->diffopt);
-- 
2.34.1.1250.g6a242c1e9ad



* [PATCH v7 05/10] grep.c: don't pass along NULL callback value
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                             ` (3 preceding siblings ...)
  2021-12-28  1:07           ` [PATCH v7 04/10] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
@ 2021-12-28  1:07           ` Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 06/10] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
                             ` (5 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28  1:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change grep_cmd_config() to stop passing around the always-NULL "cb"
value. When this code was added in 7e8f59d577e (grep: color patterns
in output, 2009-03-07) it was non-NULL, but when that changed in
15fabd1bbd4 (builtin/grep.c: make configuration callback more
reusable, 2012-10-09) this code was left behind.

In a subsequent change I'll start using the "cb" value; this will make
it clear which of the functions we call need it, and which don't.
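
As a rough illustration of the convention (not the code in this
patch): a config callback gets an opaque "cb" pointer, and a wrapper
decides per callee whether to forward it or to pass NULL. The names
demo_opts, needs_cb(), ignores_cb() and demo_config() below are
hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical option struct, standing in for something like "struct grep_opt". */
struct demo_opts {
	int threads;
};

/* This callee uses "cb": it writes the parsed value into the caller's struct. */
static int needs_cb(const char *var, const char *value, void *cb)
{
	struct demo_opts *opts = cb;

	if (value && !strcmp(var, "demo.threads"))
		opts->threads = atoi(value);
	return 0;
}

/* This callee never looks at "cb", so the wrapper passes NULL explicitly. */
static int ignores_cb(const char *var, const char *value, void *cb)
{
	(void)cb;
	return (var && value) ? 0 : -1;
}

/* Wrapper: forwards "cb" only to the callee that needs it. */
int demo_config(const char *var, const char *value, void *cb)
{
	int st = needs_cb(var, value, cb);

	if (ignores_cb(var, value, NULL) < 0)
		st = -1;
	return st;
}
```

Passing a literal NULL to the callee that ignores "cb" documents at the
call site that nothing is expected there.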

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index d85cbabea67..5ec4cecae45 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,8 +285,8 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, cb);
-	if (git_color_default_config(var, value, cb) < 0)
+	int st = grep_config(var, value, NULL);
+	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
 	if (!strcmp(var, "grep.threads")) {
-- 
2.34.1.1250.g6a242c1e9ad



* [PATCH v7 06/10] grep API: call grep_config() after grep_init()
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                             ` (4 preceding siblings ...)
  2021-12-28  1:07           ` [PATCH v7 05/10] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
@ 2021-12-28  1:07           ` Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 07/10] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
                             ` (4 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28  1:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().

So we effectively initialized (b) config, then (a) defaults, followed
by (c) user options. Usually those are ordered as "a", "b" and
"c" instead.

As the comments being removed here show, the previous behavior needed
to be carefully explained, as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.

This does not change the behavior of any of the configuration
variables or options. That would have been the case if we didn't move
around the grep_config() call in "builtin/log.c". But now that we call
"grep_config" after "git_log_config" and "git_format_config" we'll
need to pass in the already initialized "struct grep_opt *".

See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e0 (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.

The memcpy() pattern here will be optimized away and follows the
convention of other *_init() functions. See 5726a6b4012 (*.c *_init():
define in terms of corresponding *_INIT macro, 2021-07-01).
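
A minimal sketch of the resulting shape, under the assumption of a
much smaller struct than "struct grep_opt": defaults come from an
*_INIT macro, config is then written into the caller's struct through
"cb", and options are applied last. All demo_* and DEMO_* names are
hypothetical and only serve the illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical settings struct; not git's "struct grep_opt". */
struct demo_opt {
	int relative;
	int max_depth;
	int color;
	const char *name;
};

/* Defaults live in one initializer; members not named here are zeroed. */
#define DEMO_OPT_INIT { \
	.relative = 1, \
	.max_depth = -1, \
	.color = -1, \
}

/* (a) defaults: copy a freshly initialized struct over the caller's. */
void demo_init(struct demo_opt *opt)
{
	struct demo_opt blank = DEMO_OPT_INIT;
	memcpy(opt, &blank, sizeof(*opt));
}

/* (b) config: one key/value pair at a time, written through "cb". */
int demo_config(const char *var, const char *value, void *cb)
{
	struct demo_opt *opt = cb;

	if (!strcmp(var, "demo.maxdepth"))
		opt->max_depth = atoi(value);
	return 0;
}

/* (c) a command-line option, applied last so that it wins. */
void demo_parse_option(struct demo_opt *opt, int max_depth_arg)
{
	opt->max_depth = max_depth_arg;
}
```

Since demo_config() only touches the struct it is handed, no state is
shared between two independently initialized instances.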

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  4 ++--
 builtin/log.c  | 13 +++++++++++--
 grep.c         | 39 +++------------------------------------
 grep.h         | 21 +++++++++++++++++++++
 4 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5ec4cecae45..0ea124321b6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,7 +285,7 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, NULL);
+	int st = grep_config(var, value, cb);
 	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
@@ -966,8 +966,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	};
 	grep_prefix = prefix;
 
-	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, the_repository);
+	git_config(grep_cmd_config, &opt);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/log.c b/builtin/log.c
index 93ace0dde7d..fdde77e4ebb 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -520,8 +520,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (grep_config(var, value, cb) < 0)
-		return -1;
 	if (git_gpg_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
@@ -536,6 +534,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.simplify_history = 0;
 	memset(&opt, 0, sizeof(opt));
@@ -650,6 +650,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 
 	memset(&match_all, 0, sizeof(match_all));
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.always_show_header = 1;
 	rev.no_walk = 1;
@@ -733,6 +735,8 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix)
 
 	repo_init_revisions(the_repository, &rev, prefix);
 	init_reflog_walk(&rev.reflog_info);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.verbose_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -766,6 +770,8 @@ int cmd_log(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.always_show_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -1848,10 +1854,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	extra_hdr.strdup_strings = 1;
 	extra_to.strdup_strings = 1;
 	extra_cc.strdup_strings = 1;
+
 	init_log_defaults();
 	init_display_notes(&notes_opt);
 	git_config(git_format_config, NULL);
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.show_notes = show_notes;
 	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
 	rev.commit_format = CMIT_FMT_EMAIL;
diff --git a/grep.c b/grep.c
index 12b202598a9..8dfa0300786 100644
--- a/grep.c
+++ b/grep.c
@@ -19,27 +19,6 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
-static struct grep_opt grep_defaults = {
-	.relative = 1,
-	.pathname = 1,
-	.max_depth = -1,
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
-	.colors = {
-		[GREP_COLOR_CONTEXT] = "",
-		[GREP_COLOR_FILENAME] = "",
-		[GREP_COLOR_FUNCTION] = "",
-		[GREP_COLOR_LINENO] = "",
-		[GREP_COLOR_COLUMNNO] = "",
-		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_SELECTED] = "",
-		[GREP_COLOR_SEP] = GIT_COLOR_CYAN,
-	},
-	.only_matching = 0,
-	.color = -1,
-	.output = std_output,
-};
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -75,20 +54,12 @@ define_list_config_array_extra(color_grep_slots, {"match"});
  */
 int grep_config(const char *var, const char *value, void *cb)
 {
-	struct grep_opt *opt = &grep_defaults;
+	struct grep_opt *opt = cb;
 	const char *slot;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	/*
-	 * The instance of grep_opt that we set up here is copied by
-	 * grep_init() to be used by each individual invocation.
-	 * When populating a new field of this structure here, be
-	 * sure to think about ownership -- e.g., you might need to
-	 * override the shallow copy in grep_init() with a deep copy.
-	 */
-
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
 		return 0;
@@ -134,14 +105,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/*
- * Initialize one instance of grep_opt and copy the
- * default values from the template we read the configuration
- * information in an earlier call to git_config(grep_config).
- */
 void grep_init(struct grep_opt *opt, struct repository *repo)
 {
-	*opt = grep_defaults;
+	struct grep_opt blank = GREP_OPT_INIT;
+	memcpy(opt, &blank, sizeof(*opt));
 
 	opt->repo = repo;
 	opt->pattern_tail = &opt->pattern_list;
diff --git a/grep.h b/grep.h
index 62deadb885f..b651eb291f7 100644
--- a/grep.h
+++ b/grep.h
@@ -177,6 +177,27 @@ struct grep_opt {
 	void *output_priv;
 };
 
+#define GREP_OPT_INIT { \
+	.relative = 1, \
+	.pathname = 1, \
+	.max_depth = -1, \
+	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
+	.colors = { \
+		[GREP_COLOR_CONTEXT] = "", \
+		[GREP_COLOR_FILENAME] = "", \
+		[GREP_COLOR_FUNCTION] = "", \
+		[GREP_COLOR_LINENO] = "", \
+		[GREP_COLOR_COLUMNNO] = "", \
+		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_SELECTED] = "", \
+		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
+	}, \
+	.only_matching = 0, \
+	.color = -1, \
+	.output = std_output, \
+}
+
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
-- 
2.34.1.1250.g6a242c1e9ad



* [PATCH v7 07/10] grep.h: make "grep_opt.pattern_type_option" use its enum
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                             ` (5 preceding siblings ...)
  2021-12-28  1:07           ` [PATCH v7 06/10] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2021-12-28  1:07           ` Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 08/10] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
                             ` (3 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28  1:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change the "pattern_type_option" member of "struct grep_opt" to use
the enum type we use for it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grep.h b/grep.h
index b651eb291f7..bae2899e364 100644
--- a/grep.h
+++ b/grep.h
@@ -162,7 +162,7 @@ struct grep_opt {
 	int funcname;
 	int funcbody;
 	int extended_regexp_option;
-	int pattern_type_option;
+	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
 	unsigned pre_context;
-- 
2.34.1.1250.g6a242c1e9ad



* [PATCH v7 08/10] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                             ` (6 preceding siblings ...)
  2021-12-28  1:07           ` [PATCH v7 07/10] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
@ 2021-12-28  1:07           ` Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 09/10] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
                             ` (2 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28  1:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in compile_regexp() to check the cheaper boolean
"!opt->pcre2" condition before the "memchr()" search.

This doesn't noticeably optimize anything, but makes the code more
obvious and conventional. The line wrapping being added here also
makes a subsequent commit smaller.
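
For illustration only, the same ordering in a standalone function: the
flag is tested first, so "&&" short-circuits and memchr() is only
called when its result can matter. demo_reject_pattern() and its
"nul_ok" parameter are hypothetical names, not part of grep.c.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if "pattern" must be rejected, i.e. it contains a NUL
 * byte and the caller did not set "nul_ok"; returns 0 otherwise.
 */
int demo_reject_pattern(int nul_ok, const char *pattern, size_t len)
{
	if (!nul_ok &&
	    memchr(pattern, 0, len))
		return 1;
	return 0;
}
```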

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index 8dfa0300786..f85be8b6eac 100644
--- a/grep.c
+++ b/grep.c
@@ -492,7 +492,8 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 	p->ignore_case = opt->ignore_case;
 	p->fixed = opt->fixed;
 
-	if (memchr(p->pattern, 0, p->patternlen) && !opt->pcre2)
+	if (!opt->pcre2 &&
+	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
 	p->is_fixed = is_fixed(p->pattern, p->patternlen);
-- 
2.34.1.1250.g6a242c1e9ad



* [PATCH v7 09/10] grep: simplify config parsing and option parsing
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                             ` (7 preceding siblings ...)
  2021-12-28  1:07           ` [PATCH v7 08/10] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
@ 2021-12-28  1:07           ` Ævar Arnfjörð Bjarmason
  2021-12-28  1:07           ` [PATCH v7 10/10] grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED Ævar Arnfjörð Bjarmason
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28  1:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Simplify the parsing of "grep.patternType" and
"grep.extendedRegexp". This changes no behavior, but gets rid of
complex parsing logic that isn't needed anymore.

When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
grep.patternType configuration setting, 2012-08-03) we promised that:

 1. You can set "grep.patternType", and "[setting it to] 'default'
    will return to the default matching behavior".

    In that context "the default" meant whatever the configuration
    system specified before that change, i.e. via grep.extendedRegexp.

 2. We'd support the existing "grep.extendedRegexp" option, but ignore
    it when the new "grep.patternType" option is set. We said we'd
    only ignore the older "grep.extendedRegexp" option "when the
    `grep.patternType` option is set to a value other than
    'default'".

In a preceding commit we changed grep_config() to be called after
grep_init(), which means that much of the complexity here can go
away.

As before both "grep.patternType" and "grep.extendedRegexp" are
last-one-wins variables, with "grep.extendedRegexp" yielding to
"grep.patternType", except when "grep.patternType=default".

Note that this applies as we parse the config, i.e. a sequence of:

    -c grep.patternType=perl \
    -c grep.extendedRegexp=true \
    -c grep.patternType=default

Should select ERE due to "grep.extendedRegexp=true" and
"grep.patternType=default", not BRE, even though that's the
"default" patternType. We can determine this as we parse the config,
because:

 * If we see "grep.extendedRegexp" we set the internal "ero" to its
   boolean value.

 * If we see "grep.extendedRegexp" but a "grep.patternType" other than
   [default|<unset>] is in effect we *don't* set the internal
   "pattern_type_option" to update the pattern type.

 * If we see "grep.patternType!=default" we can set our internal
   "pattern_type_option" directly, it doesn't matter what the state of
   "grep.extendedRegexp" is, but we don't forget what it was, in case
   we see a "grep.patternType=default" again.

 * If we see a "grep.patternType=default" we can set the pattern to
   ERE or BRE depending on whether we last saw a
   "grep.extendedRegexp=true" or
   "grep.extendedRegexp=[false|<unset>]".

We could equally call this new adjust_pattern_type() in
compile_regexp(), i.e. this fixup on top of this passes all our
tests (with -U0 for brevity):

    @@ -60,0 +61 @@ static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
    +static int ero = -1;
    @@ -65 +65,0 @@ int grep_config(const char *var, const char *value, void *cb)
    -       static int ero = -1;
    @@ -72 +71,0 @@ int grep_config(const char *var, const char *value, void *cb)
    -               adjust_pattern_type(&opt->pattern_type_option, ero);
    @@ -80 +78,0 @@ int grep_config(const char *var, const char *value, void *cb)
    -               adjust_pattern_type(&opt->pattern_type_option, ero);
    @@ -445,0 +444,2 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
    +       if (ero != -1)
    +               adjust_pattern_type(&opt->pattern_type_option, ero);

But doing it as we stream the git_config() makes it
clear that we can determine the interplay between these two variables
as we go. We don't need to wait until we see the last value of the two
configuration variables.
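
The intended outcome of the rules above can be modeled as a small
state machine fed one key/value pair at a time: an explicit
"grep.patternType" wins, and "grep.extendedRegexp" only decides while
the pattern type is "default" or unset. This is a simplified model
with hypothetical names (demo_state, demo_config(), type_from_ero())
and simplified boolean parsing, not the code in this patch.

```c
#include <string.h>

enum demo_type { DEMO_BRE = 1, DEMO_ERE, DEMO_FIXED, DEMO_PCRE };

struct demo_state {
	int ero;             /* -1 until "grep.extendedregexp" is seen, then 0 or 1 */
	int explicit_type;   /* 1 while "grep.patterntype" is not "default" */
	enum demo_type type; /* pattern type in effect so far */
};

#define DEMO_STATE_INIT { .ero = -1, .explicit_type = 0, .type = DEMO_BRE }

/* ERE only if extendedRegexp was last seen as true; unset and false give BRE. */
static enum demo_type type_from_ero(int ero)
{
	return ero == 1 ? DEMO_ERE : DEMO_BRE;
}

/* Feed one key/value pair, in the order the config is read. */
void demo_config(struct demo_state *s, const char *var, const char *value)
{
	if (!strcmp(var, "grep.extendedregexp")) {
		/* Always remember the value, even if it has no effect right now. */
		s->ero = !strcmp(value, "true");
		if (!s->explicit_type)
			s->type = type_from_ero(s->ero);
	} else if (!strcmp(var, "grep.patterntype")) {
		if (!strcmp(value, "default")) {
			/* Fall back to whatever extendedRegexp last said. */
			s->explicit_type = 0;
			s->type = type_from_ero(s->ero);
			return;
		}
		if (!strcmp(value, "basic"))
			s->type = DEMO_BRE;
		else if (!strcmp(value, "extended"))
			s->type = DEMO_ERE;
		else if (!strcmp(value, "fixed"))
			s->type = DEMO_FIXED;
		else if (!strcmp(value, "perl"))
			s->type = DEMO_PCRE;
		else
			return; /* unknown value: ignored in this sketch */
		s->explicit_type = 1;
	}
}
```

Feeding it "grep.patterntype=perl", "grep.extendedregexp=true" and then
"grep.patterntype=default" leaves the model at ERE, matching the
example sequence given earlier.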

This is true because of the rationale above, and because the
subsequent code in compile_regexp() treats
"pattern_type_option=GREP_PATTERN_TYPE_{UNSPECIFIED,BRE}"
equally. I.e. we'll end up with different internal
"pattern_type_option" values there for:

    # UNSPECIFIED
    -c grep.patternType=default
    # BRE
    -c grep.extendedRegexp=false -c grep.patternType=default

But the difference won't matter, which simplifies some of this logic:
we never need to adjust a "grep.patternType" if we didn't see a
"grep.extendedRegexp" before. We can also remove the
"extended_regexp_option" member from "struct grep_opt" in favor of a
static variable in grep_config().

The command-line parsing in cmd_grep() can then completely ignore
"grep.extendedRegexp". Whatever effect it had before that step won't
matter if we see -G, -E, -P etc.

See my 07a3d411739 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 10 +++----
 grep.c         | 77 +++++++++++---------------------------------------
 grep.h         |  4 ---
 revision.c     |  2 --
 4 files changed, 20 insertions(+), 73 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 0ea124321b6..942c4b25077 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -845,7 +845,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
 	int allow_revs;
 
 	struct option options[] = {
@@ -879,16 +878,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
 		OPT_GROUP(""),
-		OPT_SET_INT('E', "extended-regexp", &pattern_type_arg,
+		OPT_SET_INT('E', "extended-regexp", &opt.pattern_type_option,
 			    N_("use extended POSIX regular expressions"),
 			    GREP_PATTERN_TYPE_ERE),
-		OPT_SET_INT('G', "basic-regexp", &pattern_type_arg,
+		OPT_SET_INT('G', "basic-regexp", &opt.pattern_type_option,
 			    N_("use basic POSIX regular expressions (default)"),
 			    GREP_PATTERN_TYPE_BRE),
-		OPT_SET_INT('F', "fixed-strings", &pattern_type_arg,
+		OPT_SET_INT('F', "fixed-strings", &opt.pattern_type_option,
 			    N_("interpret patterns as fixed strings"),
 			    GREP_PATTERN_TYPE_FIXED),
-		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
+		OPT_SET_INT('P', "perl-regexp", &opt.pattern_type_option,
 			    N_("use Perl-compatible regular expressions"),
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
@@ -982,7 +981,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, grep_usage,
 			     PARSE_OPT_KEEP_DASHDASH |
 			     PARSE_OPT_STOP_AT_NON_OPTION);
-	grep_commit_pattern_type(pattern_type_arg, &opt);
 
 	if (use_index && !startup_info->have_repository) {
 		int fallback = 0;
diff --git a/grep.c b/grep.c
index f85be8b6eac..a6712adbce7 100644
--- a/grep.c
+++ b/grep.c
@@ -48,6 +48,12 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
 
 define_list_config_array_extra(color_grep_slots, {"match"});
 
+static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
+{
+	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
+		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
+}
+
 /*
  * Read the configuration file once and store it in
  * the grep_defaults template.
@@ -56,17 +62,22 @@ int grep_config(const char *var, const char *value, void *cb)
 {
 	struct grep_opt *opt = cb;
 	const char *slot;
+	static int ero = -1;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
 	if (!strcmp(var, "grep.extendedregexp")) {
-		opt->extended_regexp_option = git_config_bool(var, value);
+		ero = git_config_bool(var, value);
+		adjust_pattern_type(&opt->pattern_type_option, ero);
 		return 0;
 	}
 
 	if (!strcmp(var, "grep.patterntype")) {
 		opt->pattern_type_option = parse_pattern_type_arg(var, value);
+		if (ero == -1)
+			return 0;
+		adjust_pattern_type(&opt->pattern_type_option, ero);
 		return 0;
 	}
 
@@ -115,62 +126,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo)
 	opt->header_tail = &opt->header_list;
 }
 
-static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	/*
-	 * When committing to the pattern type by setting the relevant
-	 * fields in grep_opt it's generally not necessary to zero out
-	 * the fields we're not choosing, since they won't have been
-	 * set by anything. The extended_regexp_option field is the
-	 * only exception to this.
-	 *
-	 * This is because in the process of parsing grep.patternType
-	 * & grep.extendedRegexp we set opt->pattern_type_option and
-	 * opt->extended_regexp_option, respectively. We then
-	 * internally use opt->extended_regexp_option to see if we're
-	 * compiling an ERE. It must be unset if that's not actually
-	 * the case.
-	 */
-	if (pattern_type != GREP_PATTERN_TYPE_ERE &&
-	    opt->extended_regexp_option)
-		opt->extended_regexp_option = 0;
-
-	switch (pattern_type) {
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		/* fall through */
-
-	case GREP_PATTERN_TYPE_BRE:
-		break;
-
-	case GREP_PATTERN_TYPE_ERE:
-		opt->extended_regexp_option = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_FIXED:
-		opt->fixed = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_PCRE:
-		opt->pcre2 = 1;
-		break;
-	}
-}
-
-void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(pattern_type, opt);
-	else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(opt->pattern_type_option, opt);
-	else if (opt->extended_regexp_option)
-		/*
-		 * This branch *must* happen after setting from the
-		 * opt->pattern_type_option above, we don't want
-		 * grep.extendedRegexp to override grep.patternType!
-		 */
-		grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
-}
-
 static struct grep_pat *create_grep_pat(const char *pat, size_t patlen,
 					const char *origin, int no,
 					enum grep_pat_token t,
@@ -490,9 +445,9 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 
 	p->word_regexp = opt->word_regexp;
 	p->ignore_case = opt->ignore_case;
-	p->fixed = opt->fixed;
+	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
 
-	if (!opt->pcre2 &&
+	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
 	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
@@ -544,14 +499,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre2) {
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
 		compile_pcre2_pattern(p, opt);
 		return;
 	}
 
 	if (p->ignore_case)
 		regflags |= REG_ICASE;
-	if (opt->extended_regexp_option)
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
 		regflags |= REG_EXTENDED;
 	err = regcomp(&p->regexp, p->pattern, regflags);
 	if (err) {
diff --git a/grep.h b/grep.h
index bae2899e364..32ff4dad3de 100644
--- a/grep.h
+++ b/grep.h
@@ -143,7 +143,6 @@ struct grep_opt {
 	int unmatch_name_only;
 	int count;
 	int word_regexp;
-	int fixed;
 	int all_match;
 #define GREP_BINARY_DEFAULT	0
 #define GREP_BINARY_NOMATCH	1
@@ -152,7 +151,6 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre2;
 	int relative;
 	int pathname;
 	int null_following_name;
@@ -161,7 +159,6 @@ struct grep_opt {
 	int max_depth;
 	int funcname;
 	int funcbody;
-	int extended_regexp_option;
 	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
@@ -200,7 +197,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
-void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
 void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index 495328e859c..298d0ea7574 100644
--- a/revision.c
+++ b/revision.c
@@ -2860,8 +2860,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	diff_setup_done(&revs->diffopt);
 
-	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
-				 &revs->grep_filter);
 	if (!is_encoding_utf8(get_log_output_encoding()))
 		revs->grep_filter.ignore_locale = 1;
 	compile_grep_patterns(&revs->grep_filter);
-- 
2.34.1.1250.g6a242c1e9ad



* [PATCH v7 10/10] grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                             ` (8 preceding siblings ...)
  2021-12-28  1:07           ` [PATCH v7 09/10] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2021-12-28  1:07           ` Ævar Arnfjörð Bjarmason
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-12-28  1:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Remove the need for "GREP_PATTERN_TYPE_UNSPECIFIED" in favor of having
the users of the "pattern_type_option" member check whether that
member is set or not.

The "UNSPECIFIED" case was already handled implicitly in
compile_regexp(), and we don't use this "enum" in a "switch"
statement, so let's not explicitly name the
"GREP_PATTERN_TYPE_UNSPECIFIED = 0" case. It is still important that
"GREP_PATTERN_TYPE_BRE != 0", as can be seen in failing tests if the
parsing for "basic" in parse_pattern_type_arg() is made to "return 0".
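
A small sketch of the idea, with hypothetical DEMO_* names rather than
the real enum: when the first named enumerator starts at 1, a
zero-initialized member reads as "not set" without needing a dedicated
"UNSPECIFIED" name.

```c
enum demo_pattern_type {
	DEMO_PATTERN_TYPE_BRE = 1,
	DEMO_PATTERN_TYPE_ERE,
	DEMO_PATTERN_TYPE_FIXED,
	DEMO_PATTERN_TYPE_PCRE
};

struct demo_opt {
	enum demo_pattern_type pattern_type_option;
};

/* Zero means "nothing chosen yet"; every named enumerator is non-zero. */
int demo_type_is_set(const struct demo_opt *opt)
{
	return opt->pattern_type_option != 0;
}
```

If "basic" mapped to 0 instead, an explicit choice of BRE could not be
told apart from no choice at all.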

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 9 ++++++---
 grep.h | 4 +---
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/grep.c b/grep.c
index a6712adbce7..914eb5dee9b 100644
--- a/grep.c
+++ b/grep.c
@@ -34,7 +34,7 @@ static const char *color_grep_slots[] = {
 static int parse_pattern_type_arg(const char *opt, const char *arg)
 {
 	if (!strcmp(arg, "default"))
-		return GREP_PATTERN_TYPE_UNSPECIFIED;
+		return 0;
 	else if (!strcmp(arg, "basic"))
 		return GREP_PATTERN_TYPE_BRE;
 	else if (!strcmp(arg, "extended"))
@@ -50,8 +50,7 @@ define_list_config_array_extra(color_grep_slots, {"match"});
 
 static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
 {
-	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
-		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
+	*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
 }
 
 /*
@@ -69,12 +68,16 @@ int grep_config(const char *var, const char *value, void *cb)
 
 	if (!strcmp(var, "grep.extendedregexp")) {
 		ero = git_config_bool(var, value);
+		if (opt->pattern_type_option)
+			return 0;
 		adjust_pattern_type(&opt->pattern_type_option, ero);
 		return 0;
 	}
 
 	if (!strcmp(var, "grep.patterntype")) {
 		opt->pattern_type_option = parse_pattern_type_arg(var, value);
+		if (opt->pattern_type_option)
+			return 0;
 		if (ero == -1)
 			return 0;
 		adjust_pattern_type(&opt->pattern_type_option, ero);
diff --git a/grep.h b/grep.h
index 32ff4dad3de..e365b689287 100644
--- a/grep.h
+++ b/grep.h
@@ -94,8 +94,7 @@ enum grep_expr_node {
 };
 
 enum grep_pattern_type {
-	GREP_PATTERN_TYPE_UNSPECIFIED = 0,
-	GREP_PATTERN_TYPE_BRE,
+	GREP_PATTERN_TYPE_BRE = 1,
 	GREP_PATTERN_TYPE_ERE,
 	GREP_PATTERN_TYPE_FIXED,
 	GREP_PATTERN_TYPE_PCRE
@@ -178,7 +177,6 @@ struct grep_opt {
 	.relative = 1, \
 	.pathname = 1, \
 	.max_depth = -1, \
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
 	.colors = { \
 		[GREP_COLOR_CONTEXT] = "", \
 		[GREP_COLOR_FILENAME] = "", \
-- 
2.34.1.1250.g6a242c1e9ad



* [PATCH v8 00/10] grep: simplify & delete "init" & "config" code
  2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                             ` (9 preceding siblings ...)
  2021-12-28  1:07           ` [PATCH v7 10/10] grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED Ævar Arnfjörð Bjarmason
@ 2022-01-18 15:55           ` Ævar Arnfjörð Bjarmason
  2022-01-18 15:55             ` [PATCH v8 01/10] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
                               ` (10 more replies)
  10 siblings, 11 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-18 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

A v8 re-roll of this series. For context and v7 see:
https://lore.kernel.org/git/cover-v7-00.10-00000000000-20211228T004707Z-avarab@gmail.com

The v7 has not been picked up yet. The ab/grep-patterntype in Junio's
tree is the v6. This v8 is rebased on "master" for a merge conflict
with the now-merged lh/use-gnu-color-in-grep.

Ævar Arnfjörð Bjarmason (10):
  grep.h: remove unused "regex_t regexp" from grep_opt
  log tests: check if grep_config() is called by "log"-like cmds
  grep tests: add missing "grep.patternType" config tests
  built-ins: trust the "prefix" from run_builtin()
  grep.c: don't pass along NULL callback value
  grep API: call grep_config() after grep_init()
  grep.h: make "grep_opt.pattern_type_option" use its enum
  grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
  grep: simplify config parsing and option parsing
  grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED

 builtin/grep.c    |  27 +++++-----
 builtin/log.c     |  13 ++++-
 builtin/ls-tree.c |   2 +-
 git.c             |   1 +
 grep.c            | 126 +++++++++-------------------------------------
 grep.h            |  34 +++++++++----
 revision.c        |   4 +-
 t/t4202-log.sh    |  24 +++++++++
 t/t7810-grep.sh   |  68 +++++++++++++++++++++++++
 9 files changed, 168 insertions(+), 131 deletions(-)

Range-diff against v7:
 1:  b62e6b6162a =  1:  010a2066656 grep.h: remove unused "regex_t regexp" from grep_opt
 2:  0edcdb50afd =  2:  e4981fa3417 log tests: check if grep_config() is called by "log"-like cmds
 3:  e1b4b5b77e0 =  3:  59092169e55 grep tests: add missing "grep.patternType" config tests
 4:  6d91a765fd7 =  4:  331c9019a0e built-ins: trust the "prefix" from run_builtin()
 5:  844b4727ca3 =  5:  25dd327b653 grep.c: don't pass along NULL callback value
 6:  d9cf9bf5e37 !  6:  3c559ad006a grep API: call grep_config() after grep_init()
    @@ grep.c: static void std_output(struct grep_opt *opt, const void *buf, size_t siz
     -	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
     -	.colors = {
     -		[GREP_COLOR_CONTEXT] = "",
    --		[GREP_COLOR_FILENAME] = "",
    +-		[GREP_COLOR_FILENAME] = GIT_COLOR_MAGENTA,
     -		[GREP_COLOR_FUNCTION] = "",
    --		[GREP_COLOR_LINENO] = "",
    --		[GREP_COLOR_COLUMNNO] = "",
    +-		[GREP_COLOR_LINENO] = GIT_COLOR_GREEN,
    +-		[GREP_COLOR_COLUMNNO] = GIT_COLOR_GREEN,
     -		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
     -		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
     -		[GREP_COLOR_SELECTED] = "",
    @@ grep.h: struct grep_opt {
     +	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
     +	.colors = { \
     +		[GREP_COLOR_CONTEXT] = "", \
    -+		[GREP_COLOR_FILENAME] = "", \
    ++		[GREP_COLOR_FILENAME] = GIT_COLOR_MAGENTA, \
     +		[GREP_COLOR_FUNCTION] = "", \
    -+		[GREP_COLOR_LINENO] = "", \
    -+		[GREP_COLOR_COLUMNNO] = "", \
    ++		[GREP_COLOR_LINENO] = GIT_COLOR_GREEN, \
    ++		[GREP_COLOR_COLUMNNO] = GIT_COLOR_GREEN, \
     +		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
     +		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
     +		[GREP_COLOR_SELECTED] = "", \
 7:  57ecc5c0d65 =  7:  daf873899c1 grep.h: make "grep_opt.pattern_type_option" use its enum
 8:  7dbeafde26b =  8:  62650a78ea9 grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
 9:  c6ca39b4554 !  9:  c211bb0c69d grep: simplify config parsing and option parsing
    @@ grep.h: struct grep_opt {
      	int word_regexp;
     -	int fixed;
      	int all_match;
    - #define GREP_BINARY_DEFAULT	0
    - #define GREP_BINARY_NOMATCH	1
    + 	int no_body_match;
    + 	int body_hit;
     @@ grep.h: struct grep_opt {
      	int allow_textconv;
      	int extended;
10:  b764c09d2b7 ! 10:  b52a0c11fa9 grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED
    @@ grep.h: struct grep_opt {
     -	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
      	.colors = { \
      		[GREP_COLOR_CONTEXT] = "", \
    - 		[GREP_COLOR_FILENAME] = "", \
    + 		[GREP_COLOR_FILENAME] = GIT_COLOR_MAGENTA, \
-- 
2.35.0.rc1.864.g57621b115b6


^ permalink raw reply	[flat|nested] 151+ messages in thread

* [PATCH v8 01/10] grep.h: remove unused "regex_t regexp" from grep_opt
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
@ 2022-01-18 15:55             ` Ævar Arnfjörð Bjarmason
  2022-01-18 15:55             ` [PATCH v8 02/10] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
                               ` (9 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-18 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This "regex_t" in grep_opt has not been used since
f9b9faf6f8a (builtin-grep: allow more than one patterns., 2006-05-02).
We still use a "regex_t" for compiling regexes, but that's in the
"grep_pat" struct.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.h b/grep.h
index 6a1f0ab0172..400172676a1 100644
--- a/grep.h
+++ b/grep.h
@@ -136,7 +136,6 @@ struct grep_opt {
 
 	const char *prefix;
 	int prefix_length;
-	regex_t regexp;
 	int linenum;
 	int columnnum;
 	int invert;
-- 
2.35.0.rc1.864.g57621b115b6


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v8 02/10] log tests: check if grep_config() is called by "log"-like cmds
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2022-01-18 15:55             ` [PATCH v8 01/10] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
@ 2022-01-18 15:55             ` Ævar Arnfjörð Bjarmason
  2022-01-18 15:55             ` [PATCH v8 03/10] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
                               ` (8 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-18 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
for pattern style options & config, 2017-05-20) to check not only
whether "git log" handles "grep.patternType", but also "git show"
etc.

It's sufficient to check whether we match a "fixed" or a "basic" regex
here to see if these codepaths correctly invoked grep_config(). We
don't need to check the details of their regular expression matching
as the "log" test does.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 50495598619..e775b378e4b 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -449,6 +449,30 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
 	)
 '
 
+for cmd in show whatchanged reflog format-patch
+do
+	case "$cmd" in
+	format-patch) myarg="HEAD~.." ;;
+	*) myarg= ;;
+	esac
+
+	test_expect_success "$cmd: understands grep.patternType, like 'log'" '
+		git init "pattern-type-$cmd" &&
+		(
+			cd "pattern-type-$cmd" &&
+			test_commit 1 file A &&
+			test_commit "(1|2)" file B 2 &&
+
+			git -c grep.patternType=fixed $cmd --grep="..." $myarg >actual &&
+			test_must_be_empty actual &&
+
+			git -c grep.patternType=basic $cmd --grep="..." $myarg >actual &&
+			test_file_not_empty actual
+		)
+	'
+done
+test_done
+
 test_expect_success 'log --author' '
 	cat >expect <<-\EOF &&
 	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>
-- 
2.35.0.rc1.864.g57621b115b6


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v8 03/10] grep tests: add missing "grep.patternType" config tests
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2022-01-18 15:55             ` [PATCH v8 01/10] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
  2022-01-18 15:55             ` [PATCH v8 02/10] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2022-01-18 15:55             ` Ævar Arnfjörð Bjarmason
  2022-01-18 15:55             ` [PATCH v8 04/10] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
                               ` (7 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-18 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the grep tests to assert that setting
"grep.patternType=extended" followed by "grep.patternType=default"
will behave as if "--basic-regexp" was provided, and not as
"--extended-regexp". In a subsequent commit we'll need to treat
"grep.patternType=default" as a special-case, but let's make sure we
ignore it if it's being set to "default" following an earlier
non-"default" "grep.patternType" setting.
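
As a rough illustration of the resolution order these tests assert,
here is a simplified model in C. It is only a sketch: the "demo_*"
names and the enum are hypothetical and are not git's implementation.
It assumes each variable is last-one-wins on its own, and that a final
"default" pattern type defers to the last "grep.extendedRegexp" value.

```c
/* Hypothetical model; none of these names exist in git. */
enum demo_type { DEMO_DEFAULT = 0, DEMO_BRE, DEMO_ERE, DEMO_FIXED };

struct demo_cfg {
	enum demo_type pattern_type; /* last grep.patternType seen */
	int extended_regexp;         /* last grep.extendedRegexp seen */
};

/* Each setter simply overwrites: last-one-wins per variable. */
void demo_set_pattern_type(struct demo_cfg *c, enum demo_type t)
{
	c->pattern_type = t;
}

void demo_set_extended(struct demo_cfg *c, int on)
{
	c->extended_regexp = on;
}

/*
 * A non-"default" pattern type wins outright. A "default" one falls
 * back to extendedRegexp: ERE if it ended up true, BRE otherwise.
 */
enum demo_type demo_resolve(const struct demo_cfg *c)
{
	if (c->pattern_type != DEMO_DEFAULT)
		return c->pattern_type;
	return c->extended_regexp ? DEMO_ERE : DEMO_BRE;
}
```

Under this model "extended" followed by "default" resolves to BRE, and
appending "fixed" to that sequence resolves to fixed.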

Let's also test what happens when we have a sequence of "extended"
followed by "default" and "fixed"; in that case the "fixed" should
prevail. Also add tests to check that a "grep.extendedRegexp=true"
followed by a "grep.extendedRegexp=false" behaves as though
"grep.extendedRegexp" wasn't provided.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t7810-grep.sh | 68 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 424c31c3287..34d8f69c1de 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -451,6 +451,65 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=basic \
+			-c grep.extendedRegexp=false \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins & defers to grep.patternType" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=extended \
+			-c grep.extendedRegexp=false \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.patternType=fixed \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=default \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=basic \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.patternType=[extended -> default -> fixed]" '
+		echo "${HC}ab:a+b*c" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			-c grep.patternType=fixed \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
 		echo "${HC}ab:abc" >expected &&
 		git \
@@ -478,6 +537,15 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.extendedRegexp=false and grep.patternType=default" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.extendedRegexp=false \
+			-c grep.patternType=extended \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.extendedRegexp=true and grep.patternType=basic" '
 		echo "${HC}ab:a+bc" >expected &&
 		git \
-- 
2.35.0.rc1.864.g57621b115b6


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v8 04/10] built-ins: trust the "prefix" from run_builtin()
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                               ` (2 preceding siblings ...)
  2022-01-18 15:55             ` [PATCH v8 03/10] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
@ 2022-01-18 15:55             ` Ævar Arnfjörð Bjarmason
  2022-01-18 15:55             ` [PATCH v8 05/10] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
                               ` (6 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-18 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the
"prefix" passed from "run_builtin()". The "prefix" we get from setup.c
is either going to be NULL or a string of length >0, never "".

So we can drop the "prefix && *prefix" checks added for
"builtin/grep.c" in 0d042fecf2f (git-grep: show pathnames relative to
the current directory, 2006-08-11), and for "builtin/ls-tree.c" in
a69dd585fca (ls-tree: chomp leading directories when run from a
subdirectory, 2005-12-23).

As seen in code in revision.c that was added in cd676a51367 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that does away with this
assertion.

This makes it easier to reason about a change to the "prefix_length"
code in grep.c in a subsequent commit, and since we're going to the
trouble of doing that let's leave behind an assert() to promise this
to any future callers.

For "builtin/grep.c" it would be painful to pass the "prefix" down the
callchain of:

    cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
    grep_source_name

So for the code that needs it in grep_source_name() let's add a
"grep_prefix" variable similar to the existing "ls_tree_prefix".

While at it let's move the code in cmd_ls_tree() around so that we
assign to the "ls_tree_prefix" right after declaring the variables,
and stop assigning to "prefix". We only subsequently used that
variable later in the function after clobbering it. Let's just use our
own "grep_prefix" instead.

Let's also add an assert() in git.c, so that we'll make this promise
about the "prefix" to any current and future callers, as well as to
any readers of the code.
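
The invariant can be sketched in isolation as follows. This is a
minimal illustration only; "demo_chomp_len()" is a hypothetical helper
and not a function in git. It shows why a plain "if (prefix)" style
check is enough once the "NULL or non-empty" promise holds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: returns how many leading
 * bytes a caller would chomp for the given prefix.
 */
size_t demo_chomp_len(const char *prefix)
{
	/* The promise: a prefix is either NULL or a non-empty string. */
	assert(!prefix || *prefix);

	/* Given that promise, "prefix" alone replaces "prefix && *prefix". */
	return prefix ? strlen(prefix) : 0;
}
```

Passing "" to this helper would trip the assert(), which is the case
the "prefix && *prefix" checks used to guard against.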

Code history:

 * The strlen() in "grep.c" hasn't been used since 493b7a08d80 (grep:
   accept relative paths outside current working directory, 2009-09-05).

   When that code was added in 0d042fecf2f (git-grep: show pathnames
   relative to the current directory, 2006-08-11) we used the length.

   But since 493b7a08d80 we haven't used it for anything except a
   boolean check that we could have done on the "prefix" member
   itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c    | 13 ++++++++-----
 builtin/ls-tree.c |  2 +-
 git.c             |  1 +
 grep.c            |  4 +---
 grep.h            |  4 +---
 revision.c        |  2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 9e34a820ad4..d85cbabea67 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,6 +26,8 @@
 #include "object-store.h"
 #include "packfile.h"
 
+static const char *grep_prefix;
+
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
 	NULL
@@ -315,11 +317,11 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 	strbuf_reset(out);
 
 	if (opt->null_following_name) {
-		if (opt->relative && opt->prefix_length) {
+		if (opt->relative && grep_prefix) {
 			struct strbuf rel_buf = STRBUF_INIT;
 			const char *rel_name =
 				relative_path(filename + tree_name_len,
-					      opt->prefix, &rel_buf);
+					      grep_prefix, &rel_buf);
 
 			if (tree_name_len)
 				strbuf_add(out, filename, tree_name_len);
@@ -332,8 +334,8 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 		return;
 	}
 
-	if (opt->relative && opt->prefix_length)
-		quote_path(filename + tree_name_len, opt->prefix, out, 0);
+	if (opt->relative && grep_prefix)
+		quote_path(filename + tree_name_len, grep_prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
 
@@ -962,9 +964,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_END()
 	};
+	grep_prefix = prefix;
 
 	git_config(grep_cmd_config, NULL);
-	grep_init(&opt, the_repository, prefix);
+	grep_init(&opt, the_repository);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c71..6cb554cbb0a 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -150,7 +150,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 	ls_tree_prefix = prefix;
-	if (prefix && *prefix)
+	if (prefix)
 		chomp_prefix = strlen(prefix);
 
 	argc = parse_options(argc, argv, prefix, ls_tree_options,
diff --git a/git.c b/git.c
index edda922ce6d..9d257e092da 100644
--- a/git.c
+++ b/git.c
@@ -436,6 +436,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 	} else {
 		prefix = NULL;
 	}
+	assert(!prefix || *prefix);
 	precompose_argv_prefix(argc, argv, NULL);
 	if (use_pager == -1 && run_setup &&
 		!(p->option & DELAY_PAGER_CONFIG))
diff --git a/grep.c b/grep.c
index 7bb0360869a..8421dc55486 100644
--- a/grep.c
+++ b/grep.c
@@ -139,13 +139,11 @@ int grep_config(const char *var, const char *value, void *cb)
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
  */
-void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
+void grep_init(struct grep_opt *opt, struct repository *repo)
 {
 	*opt = grep_defaults;
 
 	opt->repo = repo;
-	opt->prefix = prefix;
-	opt->prefix_length = (prefix && *prefix) ? strlen(prefix) : 0;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
diff --git a/grep.h b/grep.h
index 400172676a1..23a2a41d2c4 100644
--- a/grep.h
+++ b/grep.h
@@ -134,8 +134,6 @@ struct grep_opt {
 	 */
 	struct repository *repo;
 
-	const char *prefix;
-	int prefix_length;
 	int linenum;
 	int columnnum;
 	int invert;
@@ -182,7 +180,7 @@ struct grep_opt {
 };
 
 int grep_config(const char *var, const char *value, void *);
-void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index ad4286fbdde..d6e0e2b23b5 100644
--- a/revision.c
+++ b/revision.c
@@ -1838,7 +1838,7 @@ void repo_init_revisions(struct repository *r,
 	revs->commit_format = CMIT_FMT_DEFAULT;
 	revs->expand_tabs_in_log_default = 8;
 
-	grep_init(&revs->grep_filter, revs->repo, prefix);
+	grep_init(&revs->grep_filter, revs->repo);
 	revs->grep_filter.status_only = 1;
 
 	repo_diff_setup(revs->repo, &revs->diffopt);
-- 
2.35.0.rc1.864.g57621b115b6


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v8 05/10] grep.c: don't pass along NULL callback value
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                               ` (3 preceding siblings ...)
  2022-01-18 15:55             ` [PATCH v8 04/10] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
@ 2022-01-18 15:55             ` Ævar Arnfjörð Bjarmason
  2022-01-18 15:55             ` [PATCH v8 06/10] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
                               ` (5 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-18 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change grep_cmd_config() to stop passing around the always-NULL "cb"
value. When this code was added in 7e8f59d577e (grep: color patterns
in output, 2009-03-07) it was non-NULL, but when that changed in
15fabd1bbd4 (builtin/grep.c: make configuration callback more
reusable, 2012-10-09) this code was left behind.

In a subsequent change I'll start using the "cb" value. This will make
it clear which of the functions we call need it, and which don't.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index d85cbabea67..5ec4cecae45 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,8 +285,8 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, cb);
-	if (git_color_default_config(var, value, cb) < 0)
+	int st = grep_config(var, value, NULL);
+	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
 	if (!strcmp(var, "grep.threads")) {
-- 
2.35.0.rc1.864.g57621b115b6


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v8 06/10] grep API: call grep_config() after grep_init()
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                               ` (4 preceding siblings ...)
  2022-01-18 15:55             ` [PATCH v8 05/10] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
@ 2022-01-18 15:55             ` Ævar Arnfjörð Bjarmason
  2022-01-18 15:55             ` [PATCH v8 07/10] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
                               ` (4 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-18 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().

So we effectively (b) initialized config, (a) then defaults, (c)
followed by user options. Usually those are ordered as "a", "b" and
"c" instead.

As the comments being removed here show, the previous behavior needed
to be carefully explained, as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.

This does not change the behavior of any of the configuration
variables or options. That would have been the case if we didn't move
around the grep_config() call in "builtin/log.c". But now that we call
"grep_config" after "git_log_config" and "git_format_config" we'll
need to pass in the already initialized "struct grep_opt *".

See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e0 (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.

The memcpy() pattern here will be optimized away and follows the
convention of other *_init() functions. See 5726a6b4012 (*.c *_init():
define in terms of corresponding *_INIT macro, 2021-07-01).
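
The conventional ordering described above, (a) defaults, (b) config
and (c) user options, together with the blank-struct memcpy() in an
*_init() function, can be sketched like this. The "demo_*" names, the
"demo.linenumber" variable and the struct are hypothetical stand-ins,
not git's real "struct grep_opt" or config API.

```c
#include <string.h>

/* Hypothetical stand-in for "struct grep_opt". */
struct demo_opt {
	int max_depth;
	int linenum;
};

#define DEMO_OPT_INIT { .max_depth = -1 }

/* (a) defaults: copy a blank template into the caller's struct */
void demo_init(struct demo_opt *opt)
{
	struct demo_opt blank = DEMO_OPT_INIT;
	memcpy(opt, &blank, sizeof(*opt));
}

/* (b) config: the callback writes into the struct passed as "cb" */
int demo_config(const char *var, const char *value, void *cb)
{
	struct demo_opt *opt = cb;

	if (!strcmp(var, "demo.linenumber"))
		opt->linenum = !strcmp(value, "true");
	return 0;
}

/* The three steps in their usual order. */
void demo_setup(struct demo_opt *opt, int user_max_depth)
{
	demo_init(opt);                              /* (a) defaults */
	demo_config("demo.linenumber", "true", opt); /* (b) config */
	opt->max_depth = user_max_depth;             /* (c) user option */
}
```

Because the callback only touches the struct it is handed, there is no
shared static template left for different callers to modify.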

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  4 ++--
 builtin/log.c  | 13 +++++++++++--
 grep.c         | 39 +++------------------------------------
 grep.h         | 21 +++++++++++++++++++++
 4 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5ec4cecae45..0ea124321b6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,7 +285,7 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, NULL);
+	int st = grep_config(var, value, cb);
 	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
@@ -966,8 +966,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	};
 	grep_prefix = prefix;
 
-	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, the_repository);
+	git_config(grep_cmd_config, &opt);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/log.c b/builtin/log.c
index 4b493408cc5..06283b37e7a 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -520,8 +520,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (grep_config(var, value, cb) < 0)
-		return -1;
 	if (git_gpg_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
@@ -536,6 +534,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.simplify_history = 0;
 	memset(&opt, 0, sizeof(opt));
@@ -650,6 +650,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 
 	memset(&match_all, 0, sizeof(match_all));
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.always_show_header = 1;
 	rev.no_walk = 1;
@@ -733,6 +735,8 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix)
 
 	repo_init_revisions(the_repository, &rev, prefix);
 	init_reflog_walk(&rev.reflog_info);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.verbose_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -766,6 +770,8 @@ int cmd_log(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.always_show_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -1848,10 +1854,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	extra_hdr.strdup_strings = 1;
 	extra_to.strdup_strings = 1;
 	extra_cc.strdup_strings = 1;
+
 	init_log_defaults();
 	init_display_notes(&notes_opt);
 	git_config(git_format_config, NULL);
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.show_notes = show_notes;
 	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
 	rev.commit_format = CMIT_FMT_EMAIL;
diff --git a/grep.c b/grep.c
index 8421dc55486..35e12e43c09 100644
--- a/grep.c
+++ b/grep.c
@@ -19,27 +19,6 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
-static struct grep_opt grep_defaults = {
-	.relative = 1,
-	.pathname = 1,
-	.max_depth = -1,
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
-	.colors = {
-		[GREP_COLOR_CONTEXT] = "",
-		[GREP_COLOR_FILENAME] = GIT_COLOR_MAGENTA,
-		[GREP_COLOR_FUNCTION] = "",
-		[GREP_COLOR_LINENO] = GIT_COLOR_GREEN,
-		[GREP_COLOR_COLUMNNO] = GIT_COLOR_GREEN,
-		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_SELECTED] = "",
-		[GREP_COLOR_SEP] = GIT_COLOR_CYAN,
-	},
-	.only_matching = 0,
-	.color = -1,
-	.output = std_output,
-};
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -75,20 +54,12 @@ define_list_config_array_extra(color_grep_slots, {"match"});
  */
 int grep_config(const char *var, const char *value, void *cb)
 {
-	struct grep_opt *opt = &grep_defaults;
+	struct grep_opt *opt = cb;
 	const char *slot;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	/*
-	 * The instance of grep_opt that we set up here is copied by
-	 * grep_init() to be used by each individual invocation.
-	 * When populating a new field of this structure here, be
-	 * sure to think about ownership -- e.g., you might need to
-	 * override the shallow copy in grep_init() with a deep copy.
-	 */
-
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
 		return 0;
@@ -134,14 +105,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/*
- * Initialize one instance of grep_opt and copy the
- * default values from the template we read the configuration
- * information in an earlier call to git_config(grep_config).
- */
 void grep_init(struct grep_opt *opt, struct repository *repo)
 {
-	*opt = grep_defaults;
+	struct grep_opt blank = GREP_OPT_INIT;
+	memcpy(opt, &blank, sizeof(*opt));
 
 	opt->repo = repo;
 	opt->pattern_tail = &opt->pattern_list;
diff --git a/grep.h b/grep.h
index 23a2a41d2c4..3112d1c2a38 100644
--- a/grep.h
+++ b/grep.h
@@ -179,6 +179,27 @@ struct grep_opt {
 	void *output_priv;
 };
 
+#define GREP_OPT_INIT { \
+	.relative = 1, \
+	.pathname = 1, \
+	.max_depth = -1, \
+	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
+	.colors = { \
+		[GREP_COLOR_CONTEXT] = "", \
+		[GREP_COLOR_FILENAME] = GIT_COLOR_MAGENTA, \
+		[GREP_COLOR_FUNCTION] = "", \
+		[GREP_COLOR_LINENO] = GIT_COLOR_GREEN, \
+		[GREP_COLOR_COLUMNNO] = GIT_COLOR_GREEN, \
+		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_SELECTED] = "", \
+		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
+	}, \
+	.only_matching = 0, \
+	.color = -1, \
+	.output = std_output, \
+}
+
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
-- 
2.35.0.rc1.864.g57621b115b6


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v8 07/10] grep.h: make "grep_opt.pattern_type_option" use its enum
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                               ` (5 preceding siblings ...)
  2022-01-18 15:55             ` [PATCH v8 06/10] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2022-01-18 15:55             ` Ævar Arnfjörð Bjarmason
  2022-01-18 15:55             ` [PATCH v8 08/10] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
                               ` (3 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-18 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change the "pattern_type_option" member of "struct grep_opt" to use
the enum type we use for it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grep.h b/grep.h
index 3112d1c2a38..89a2ce51130 100644
--- a/grep.h
+++ b/grep.h
@@ -164,7 +164,7 @@ struct grep_opt {
 	int funcname;
 	int funcbody;
 	int extended_regexp_option;
-	int pattern_type_option;
+	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
 	unsigned pre_context;
-- 
2.35.0.rc1.864.g57621b115b6



* [PATCH v8 08/10] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                               ` (6 preceding siblings ...)
  2022-01-18 15:55             ` [PATCH v8 07/10] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
@ 2022-01-18 15:55             ` Ævar Arnfjörð Bjarmason
  2022-01-18 15:55             ` [PATCH v8 09/10] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
                               ` (2 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-18 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in compile_regexp() to check the cheaper boolean
"!opt->pcre2" condition before the "memchr()" search.

This doesn't noticeably optimize anything, but makes the code more
obvious and conventional. The line wrapping being added here also
makes a subsequent commit smaller.
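
As a minimal sketch of the reordering (rejects_nul_pattern() and
"struct pattern_check" are hypothetical stand-ins for illustration, not
names from grep.c), the cheap flag test comes first so that "&&" can
skip the scan:

```c
#include <string.h>

/* Hypothetical stand-in for the fields the check reads. */
struct pattern_check {
	int pcre2;              /* non-zero when PCRE v2 is selected */
	const char *pattern;
	size_t patternlen;
};

/*
 * Returns 1 when the pattern has to be rejected: it contains a NUL
 * byte and the engine is not PCRE v2. Because "&&" short-circuits,
 * memchr() is not called at all when pcre2 is set.
 */
static int rejects_nul_pattern(const struct pattern_check *c)
{
	return !c->pcre2 &&
	       memchr(c->pattern, 0, c->patternlen) != NULL;
}
```

The result is the same for either operand order; only the amount of
work done in the PCRE v2 case differs.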

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index 35e12e43c09..60228a95a4f 100644
--- a/grep.c
+++ b/grep.c
@@ -492,7 +492,8 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 	p->ignore_case = opt->ignore_case;
 	p->fixed = opt->fixed;
 
-	if (memchr(p->pattern, 0, p->patternlen) && !opt->pcre2)
+	if (!opt->pcre2 &&
+	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
 	p->is_fixed = is_fixed(p->pattern, p->patternlen);
-- 
2.35.0.rc1.864.g57621b115b6



* [PATCH v8 09/10] grep: simplify config parsing and option parsing
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                               ` (7 preceding siblings ...)
  2022-01-18 15:55             ` [PATCH v8 08/10] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
@ 2022-01-18 15:55             ` Ævar Arnfjörð Bjarmason
  2022-01-18 22:50               ` Junio C Hamano
  2022-01-18 15:55             ` [PATCH v8 10/10] grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED Ævar Arnfjörð Bjarmason
  2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  10 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-18 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Simplify the parsing of "grep.patternType" and
"grep.extendedRegexp". This changes no behavior, but gets rid of
complex parsing logic that isn't needed anymore.

When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
grep.patternType configuration setting, 2012-08-03) we promised that:

 1. You can set "grep.patternType", and "[setting it to] 'default'
    will return to the default matching behavior".

    In that context "the default" meant whatever the configuration
    system specified before that change, i.e. via grep.extendedRegexp.

 2. We'd support the existing "grep.extendedRegexp" option, but ignore
    it when the new "grep.patternType" option is set. We said we'd
    only ignore the older "grep.extendedRegexp" option "when the
    `grep.patternType` option is set. to a value other than
    'default'".

In a preceding commit we changed grep_config() to be called after
grep_init(), which means that much of the complexity here can go
away.

As before both "grep.patternType" and "grep.extendedRegexp" are
last-one-wins variables, with "grep.extendedRegexp" yielding to
"grep.patternType", except when "grep.patternType=default".

Note that this applies as we parse the config, i.e. a sequence of:

    -c grep.patternType=perl \
    -c grep.extendedRegexp=true \
    -c grep.patternType=default

Should select ERE due to "grep.extendedRegexp=true and
grep.extendedRegexp=default", not BRE, even though that's the
"default" patternType. We can determine this as we parse the config,
because:

 * If we see "grep.extendedRegexp" we set the internal "ero" to its
   boolean value.

 * If we see "grep.extendedRegexp" but
   "grep.patternType=[default|<unset>]" is in effect we *don't* set
   the internal "pattern_type_option" to update the pattern type.

 * If we see "grep.patternType!=default" we can set our internal
   "pattern_type_option" directly, it doesn't matter what the state of
   "grep.extendedRegexp" is, but we don't forget what it was, in case
   we see a "grep.patternType=default" again.

 * If we see a "grep.patternType=default" we can set the pattern to
   ERE or BRE depending on whether we last saw a
   "grep.extendedRegexp=true" or
   "grep.extendedRegexp=[false|<unset>]".
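
The rules above can be sketched as a small standalone model. This is
only an illustration: "struct cfg_state", see_extended_regexp() and
see_pattern_type() are hypothetical names, the PT_* enum stands in for
"enum grep_pattern_type", and the state lives in a struct here purely so
that the sketch is self-contained, whereas the patch keeps "ero" in a
static:

```c
#include <string.h>

enum pattern_type { PT_UNSPECIFIED = 0, PT_BRE, PT_ERE, PT_FIXED, PT_PCRE };

/* Hypothetical holder for the two pieces of parsing state. */
struct cfg_state {
	int ero;                /* -1 until grep.extendedRegexp is seen */
	enum pattern_type pto;  /* models "pattern_type_option" */
};

static void cfg_init(struct cfg_state *s)
{
	s->ero = -1;
	s->pto = PT_UNSPECIFIED;
}

/* This sketch treats any unrecognized value like "default". */
static enum pattern_type parse_type(const char *v)
{
	if (!strcmp(v, "basic"))
		return PT_BRE;
	if (!strcmp(v, "extended"))
		return PT_ERE;
	if (!strcmp(v, "fixed"))
		return PT_FIXED;
	if (!strcmp(v, "perl"))
		return PT_PCRE;
	return PT_UNSPECIFIED;
}

/* Only an unspecified type is resolved through "ero". */
static void adjust(struct cfg_state *s)
{
	if (s->pto == PT_UNSPECIFIED)
		s->pto = s->ero ? PT_ERE : PT_BRE;
}

/* Models seeing a grep.extendedRegexp entry. */
static void see_extended_regexp(struct cfg_state *s, int value)
{
	s->ero = value;
	adjust(s);
}

/* Models seeing a grep.patternType entry. */
static void see_pattern_type(struct cfg_state *s, const char *value)
{
	s->pto = parse_type(value);
	if (s->ero != -1)
		adjust(s);
}
```

Feeding it patternType=perl, extendedRegexp=true and
patternType=default in that order leaves "pto" at PT_ERE, as in the
example sequence.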

We could equally call this new adjust_pattern_type() in
compile_regexp(), i.e. this fixup on top of this passes all our
tests (with -U0 for brevity):

    @@ -60,0 +61 @@ static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
    +static int ero = -1;
    @@ -65 +65,0 @@ int grep_config(const char *var, const char *value, void *cb)
    -       static int ero = -1;
    @@ -72 +71,0 @@ int grep_config(const char *var, const char *value, void *cb)
    -               adjust_pattern_type(&opt->pattern_type_option, ero);
    @@ -80 +78,0 @@ int grep_config(const char *var, const char *value, void *cb)
    -               adjust_pattern_type(&opt->pattern_type_option, ero);
    @@ -445,0 +444,2 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
    +       if (ero != -1)
    +               adjust_pattern_type(&opt->pattern_type_option, ero);

But doing it as we stream the git_config() makes it
clear that we can determine the interplay between these two variables
as we go. We don't need to wait until we see the last value of the two
configuration variables.

This is true because of the rationale above, and because the
subsequent code in compile_regexp() treats
"pattern_type_option=GREP_PATTERN_TYPE_{UNSPECIFIED,BRE}"
equally. I.e. we'll end up with different internal
"pattern_type_option" values there for:

    # UNSPECIFIED
    -c grep.patternType=default
    # BRE
    -c grep.extendedRegexp=false -c grep.patternType=default

But the difference won't matter, which simplifies some of this logic:
we never need to adjust a "grep.patternType" if we didn't see a
"grep.extendedRegexp" before. We can also remove the
"extended_regexp_option" member from "struct grep_opt" in favor of a
static variable in grep_config().

The command-line parsing in cmd_grep() can then completely ignore
"grep.extendedRegexp". Whatever effect it had before that step won't
matter if we see -G, -E, -P etc.

See my 07a3d411739 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 10 +++----
 grep.c         | 77 +++++++++++---------------------------------------
 grep.h         |  4 ---
 revision.c     |  2 --
 4 files changed, 20 insertions(+), 73 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 0ea124321b6..942c4b25077 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -845,7 +845,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
 	int allow_revs;
 
 	struct option options[] = {
@@ -879,16 +878,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
 		OPT_GROUP(""),
-		OPT_SET_INT('E', "extended-regexp", &pattern_type_arg,
+		OPT_SET_INT('E', "extended-regexp", &opt.pattern_type_option,
 			    N_("use extended POSIX regular expressions"),
 			    GREP_PATTERN_TYPE_ERE),
-		OPT_SET_INT('G', "basic-regexp", &pattern_type_arg,
+		OPT_SET_INT('G', "basic-regexp", &opt.pattern_type_option,
 			    N_("use basic POSIX regular expressions (default)"),
 			    GREP_PATTERN_TYPE_BRE),
-		OPT_SET_INT('F', "fixed-strings", &pattern_type_arg,
+		OPT_SET_INT('F', "fixed-strings", &opt.pattern_type_option,
 			    N_("interpret patterns as fixed strings"),
 			    GREP_PATTERN_TYPE_FIXED),
-		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
+		OPT_SET_INT('P', "perl-regexp", &opt.pattern_type_option,
 			    N_("use Perl-compatible regular expressions"),
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
@@ -982,7 +981,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, grep_usage,
 			     PARSE_OPT_KEEP_DASHDASH |
 			     PARSE_OPT_STOP_AT_NON_OPTION);
-	grep_commit_pattern_type(pattern_type_arg, &opt);
 
 	if (use_index && !startup_info->have_repository) {
 		int fallback = 0;
diff --git a/grep.c b/grep.c
index 60228a95a4f..bb487e994d0 100644
--- a/grep.c
+++ b/grep.c
@@ -48,6 +48,12 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
 
 define_list_config_array_extra(color_grep_slots, {"match"});
 
+static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
+{
+	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
+		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
+}
+
 /*
  * Read the configuration file once and store it in
  * the grep_defaults template.
@@ -56,17 +62,22 @@ int grep_config(const char *var, const char *value, void *cb)
 {
 	struct grep_opt *opt = cb;
 	const char *slot;
+	static int ero = -1;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
 	if (!strcmp(var, "grep.extendedregexp")) {
-		opt->extended_regexp_option = git_config_bool(var, value);
+		ero = git_config_bool(var, value);
+		adjust_pattern_type(&opt->pattern_type_option, ero);
 		return 0;
 	}
 
 	if (!strcmp(var, "grep.patterntype")) {
 		opt->pattern_type_option = parse_pattern_type_arg(var, value);
+		if (ero == -1)
+			return 0;
+		adjust_pattern_type(&opt->pattern_type_option, ero);
 		return 0;
 	}
 
@@ -115,62 +126,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo)
 	opt->header_tail = &opt->header_list;
 }
 
-static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	/*
-	 * When committing to the pattern type by setting the relevant
-	 * fields in grep_opt it's generally not necessary to zero out
-	 * the fields we're not choosing, since they won't have been
-	 * set by anything. The extended_regexp_option field is the
-	 * only exception to this.
-	 *
-	 * This is because in the process of parsing grep.patternType
-	 * & grep.extendedRegexp we set opt->pattern_type_option and
-	 * opt->extended_regexp_option, respectively. We then
-	 * internally use opt->extended_regexp_option to see if we're
-	 * compiling an ERE. It must be unset if that's not actually
-	 * the case.
-	 */
-	if (pattern_type != GREP_PATTERN_TYPE_ERE &&
-	    opt->extended_regexp_option)
-		opt->extended_regexp_option = 0;
-
-	switch (pattern_type) {
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		/* fall through */
-
-	case GREP_PATTERN_TYPE_BRE:
-		break;
-
-	case GREP_PATTERN_TYPE_ERE:
-		opt->extended_regexp_option = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_FIXED:
-		opt->fixed = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_PCRE:
-		opt->pcre2 = 1;
-		break;
-	}
-}
-
-void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(pattern_type, opt);
-	else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(opt->pattern_type_option, opt);
-	else if (opt->extended_regexp_option)
-		/*
-		 * This branch *must* happen after setting from the
-		 * opt->pattern_type_option above, we don't want
-		 * grep.extendedRegexp to override grep.patternType!
-		 */
-		grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
-}
-
 static struct grep_pat *create_grep_pat(const char *pat, size_t patlen,
 					const char *origin, int no,
 					enum grep_pat_token t,
@@ -490,9 +445,9 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 
 	p->word_regexp = opt->word_regexp;
 	p->ignore_case = opt->ignore_case;
-	p->fixed = opt->fixed;
+	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
 
-	if (!opt->pcre2 &&
+	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
 	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
@@ -544,14 +499,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre2) {
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
 		compile_pcre2_pattern(p, opt);
 		return;
 	}
 
 	if (p->ignore_case)
 		regflags |= REG_ICASE;
-	if (opt->extended_regexp_option)
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
 		regflags |= REG_EXTENDED;
 	err = regcomp(&p->regexp, p->pattern, regflags);
 	if (err) {
diff --git a/grep.h b/grep.h
index 89a2ce51130..ab0f8290784 100644
--- a/grep.h
+++ b/grep.h
@@ -143,7 +143,6 @@ struct grep_opt {
 	int unmatch_name_only;
 	int count;
 	int word_regexp;
-	int fixed;
 	int all_match;
 	int no_body_match;
 	int body_hit;
@@ -154,7 +153,6 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre2;
 	int relative;
 	int pathname;
 	int null_following_name;
@@ -163,7 +161,6 @@ struct grep_opt {
 	int max_depth;
 	int funcname;
 	int funcbody;
-	int extended_regexp_option;
 	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
@@ -202,7 +199,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
-void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
 void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index d6e0e2b23b5..dd301e30147 100644
--- a/revision.c
+++ b/revision.c
@@ -2860,8 +2860,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	diff_setup_done(&revs->diffopt);
 
-	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
-				 &revs->grep_filter);
 	if (!is_encoding_utf8(get_log_output_encoding()))
 		revs->grep_filter.ignore_locale = 1;
 	compile_grep_patterns(&revs->grep_filter);
-- 
2.35.0.rc1.864.g57621b115b6



* [PATCH v8 10/10] grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                               ` (8 preceding siblings ...)
  2022-01-18 15:55             ` [PATCH v8 09/10] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2022-01-18 15:55             ` Ævar Arnfjörð Bjarmason
  2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-18 15:55 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Remove the need for "GREP_PATTERN_TYPE_UNSPECIFIED" in favor of having
the users of the "pattern_type_option" member check whether that
member is set or not.

The "UNSPECIFIED" case was already handled implicitly in
compile_regexp(), and we don't use this "enum" in a "switch"
statement, so let's not explicitly name the
"GREP_PATTERN_TYPE_UNSPECIFIED = 0" case. It is still important that
"GREP_PATTERN_TYPE_BRE != 0", as can be seen in failing tests if the
parsing for "basic" in parse_pattern_type_arg() is made to "return 0".
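
To illustrate why the zero value matters (a sketch with hypothetical
names; PT_* stands in for the GREP_PATTERN_TYPE_* values): once the enum
starts at 1, a zero-initialized member can only mean that nothing was
configured, so a plain truthiness test tells "unset" apart from an
explicit "basic":

```c
enum pattern_type {
	PT_BRE = 1,     /* deliberately non-zero */
	PT_ERE,
	PT_FIXED,
	PT_PCRE
};

struct opts {
	enum pattern_type pattern_type_option; /* 0 means "not set" */
};

/* Non-zero once any explicit pattern type has been stored. */
static int pattern_type_is_set(const struct opts *o)
{
	return o->pattern_type_option != 0;
}
```

If PT_BRE were 0 in this model, an explicit "basic" would be
indistinguishable from no setting at all.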

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 9 ++++++---
 grep.h | 4 +---
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/grep.c b/grep.c
index bb487e994d0..3497df48ca6 100644
--- a/grep.c
+++ b/grep.c
@@ -34,7 +34,7 @@ static const char *color_grep_slots[] = {
 static int parse_pattern_type_arg(const char *opt, const char *arg)
 {
 	if (!strcmp(arg, "default"))
-		return GREP_PATTERN_TYPE_UNSPECIFIED;
+		return 0;
 	else if (!strcmp(arg, "basic"))
 		return GREP_PATTERN_TYPE_BRE;
 	else if (!strcmp(arg, "extended"))
@@ -50,8 +50,7 @@ define_list_config_array_extra(color_grep_slots, {"match"});
 
 static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
 {
-	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
-		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
+	*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
 }
 
 /*
@@ -69,12 +68,16 @@ int grep_config(const char *var, const char *value, void *cb)
 
 	if (!strcmp(var, "grep.extendedregexp")) {
 		ero = git_config_bool(var, value);
+		if (opt->pattern_type_option)
+			return 0;
 		adjust_pattern_type(&opt->pattern_type_option, ero);
 		return 0;
 	}
 
 	if (!strcmp(var, "grep.patterntype")) {
 		opt->pattern_type_option = parse_pattern_type_arg(var, value);
+		if (opt->pattern_type_option)
+			return 0;
 		if (ero == -1)
 			return 0;
 		adjust_pattern_type(&opt->pattern_type_option, ero);
diff --git a/grep.h b/grep.h
index ab0f8290784..460cb75a357 100644
--- a/grep.h
+++ b/grep.h
@@ -94,8 +94,7 @@ enum grep_expr_node {
 };
 
 enum grep_pattern_type {
-	GREP_PATTERN_TYPE_UNSPECIFIED = 0,
-	GREP_PATTERN_TYPE_BRE,
+	GREP_PATTERN_TYPE_BRE = 1,
 	GREP_PATTERN_TYPE_ERE,
 	GREP_PATTERN_TYPE_FIXED,
 	GREP_PATTERN_TYPE_PCRE
@@ -180,7 +179,6 @@ struct grep_opt {
 	.relative = 1, \
 	.pathname = 1, \
 	.max_depth = -1, \
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
 	.colors = { \
 		[GREP_COLOR_CONTEXT] = "", \
 		[GREP_COLOR_FILENAME] = GIT_COLOR_MAGENTA, \
-- 
2.35.0.rc1.864.g57621b115b6



* Re: [PATCH v8 09/10] grep: simplify config parsing and option parsing
  2022-01-18 15:55             ` [PATCH v8 09/10] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2022-01-18 22:50               ` Junio C Hamano
  2022-01-18 22:55                 ` Junio C Hamano
  2022-01-19  0:17                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 151+ messages in thread
From: Junio C Hamano @ 2022-01-18 22:50 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
> grep.patternType configuration setting, 2012-08-03) we promised that:
>
>  1. You can set "grep.patternType", and "[setting it to] 'default'
>     will return to the default matching behavior".
>
>     In that context "the default" meant whatever the configuration
>     system specified before that change, i.e. via grep.extendedRegexp.
>
>  2. We'd support the existing "grep.extendedRegexp" option, but ignore
>     it when the new "grep.patternType" option is set. We said we'd
>     only ignore the older "grep.extendedRegexp" option "when the
>     `grep.patternType` option is set. to a value other than
>     'default'".

Extra period in the middle of a sentence after "set".

> As before both "grep.patternType" and "grep.extendedRegexp" are
> last-one-wins variable, with "grep.extendedRegexp" yielding to
> "grep.patternType", except when "grep.patternType=default".
>
> Note that this applies as we parse the config, i.e. a sequence of:
>
>     -c grep.patternType=perl
>     -c grep.extendedRegexp=true \
>     -c grep.patternType=default
>
> Should select ERE due to "grep.extendedRegexp=true and

Downcase "S" in "should", as this is still in the middle of the
sentence that began with "Note that".

> grep.extendedRegexp=default", not BRE, even though that's the

The second one should be "grep.patternType=default".

> "default" patternType. We can determine this as we parse the config,

Drop "even though that's the default patternType".  You've already
explained that it is not what "default" means for the "patternType" (which
any reader who has been following so far would take as a reference
to "grep.patternType") at all.  You can also drop ", not BRE," while
doing so.

> because:
>
>  * If we see "grep.extendedRegexp" we set the internal "ero" to its
>    boolean value.
>
>  * If we see "grep.extendedRegexp" but
>    "grep.patternType=[default|<unset>]" is in effect we *don't* set
>    the internal "pattern_type_option" to update the pattern type.
>
>  * If we see "grep.patternType!=default" we can set our internal
>    "pattern_type_option" directly, it doesn't matter what the state of
>    "grep.extendedRegexp" is, but we don't forget what it was, in case
>    we see a "grep.patternType=default" again.
>
>  * If we see a "grep.patternType=default" we can set the pattern to
>    ERE or BRE depending on whether we last saw a
>    "grep.extendedRegexp=true" or
>    "grep.extendedRegexp=[false|<unset>]".

That sounds complex enough, doesn't it?  The statement that opens
the proposed log message is "gets rid of complex parsing logic that
isn't needed", but the above sounds more like one piece of complex
logic being traded for another.

> diff --git a/grep.c b/grep.c
> index 60228a95a4f..bb487e994d0 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -48,6 +48,12 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
>  
>  define_list_config_array_extra(color_grep_slots, {"match"});
>  
> +static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
> +{
> +	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
> +		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
> +}
> +
>  /*
>   * Read the configuration file once and store it in
>   * the grep_defaults template.
> @@ -56,17 +62,22 @@ int grep_config(const char *var, const char *value, void *cb)
>  {
>  	struct grep_opt *opt = cb;
>  	const char *slot;
> +	static int ero = -1;

Is this new reentrancy issue worth it?  I think it makes the whole
thing unnecessarily complex, compared to a more naïve "we keep track
of the last-one-that-won for grep.extendedRegexp and
grep.patternType separately during option and config parsing inside
the grep_opt structure, and then combine the two when we compile the
pattern string into regexp or pcre object" approach.
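
A rough sketch of that naive approach, assuming hypothetical names
("struct opts", resolve_pattern_type(), PT_*) rather than the real
grep_opt members: each variable is recorded last-one-wins on its own,
and the two are only combined at the point where the pattern would be
compiled.

```c
enum pattern_type { PT_UNSPECIFIED = 0, PT_BRE, PT_ERE, PT_FIXED, PT_PCRE };

struct opts {
	int extended_regexp_option;            /* last grep.extendedRegexp */
	enum pattern_type pattern_type_option; /* last grep.patternType */
};

/*
 * Combine the two settings late: an explicit pattern type wins,
 * otherwise fall back to what grep.extendedRegexp asked for.
 */
static enum pattern_type resolve_pattern_type(const struct opts *o)
{
	if (o->pattern_type_option != PT_UNSPECIFIED)
		return o->pattern_type_option;
	return o->extended_regexp_option ? PT_ERE : PT_BRE;
}
```

Because nothing is resolved while parsing, the order in which the two
variables are seen does not need any special handling.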

Let's ask it in a different way.  What is this static, that is way
separated from all the members in the grep_opt structure, buying us?
Certainly not the ease of understanding what the code is doing.  Not
the size of the overall grep_opt structure (which we do not allocate
tons of anyway).  Other than the fact that you can say "I did it my
own way, ignoring reviewer suggestions, I won!!!", I do not see any
upside with this code.

Puzzled.


* Re: [PATCH v8 09/10] grep: simplify config parsing and option parsing
  2022-01-18 22:50               ` Junio C Hamano
@ 2022-01-18 22:55                 ` Junio C Hamano
  2022-01-19  0:17                 ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2022-01-18 22:55 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Junio C Hamano <gitster@pobox.com> writes:

>> @@ -56,17 +62,22 @@ int grep_config(const char *var, const char *value, void *cb)
>>  {
>>  	struct grep_opt *opt = cb;
>>  	const char *slot;
>> +	static int ero = -1;
>
> Is this new reentrancy issue worth it?  I think it makes the whole
> thing unnecessarily complex, compared to a more naïve "we keep track
> of the last-one-that-won for grep.extendedRegexp and
> grep.patternType separately during option and config parsing inside
> the grep_opt structure, and then combine the two when we compile the
> pattern string into regexp or pcre object" approach.

Another problem is that there are those corporate server-side folks
who are interested in providing an endpoint that lets clients ask it
to perform Git operations (like grep and blame).  Adding more statics
instead of keeping state in a dynamic runtime structure like grep_opt
is deliberately making things more difficult for them, isn't it?




* Re: [PATCH v8 09/10] grep: simplify config parsing and option parsing
  2022-01-18 22:50               ` Junio C Hamano
  2022-01-18 22:55                 ` Junio C Hamano
@ 2022-01-19  0:17                 ` Ævar Arnfjörð Bjarmason
  2022-01-19  1:09                   ` Junio C Hamano
  1 sibling, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-19  0:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, J Smith, Taylor Blau


On Tue, Jan 18 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
>> grep.patternType configuration setting, 2012-08-03) we promised that:
>>
>>  1. You can set "grep.patternType", and "[setting it to] 'default'
>>     will return to the default matching behavior".
>>
>>     In that context "the default" meant whatever the configuration
>>     system specified before that change, i.e. via grep.extendedRegexp.
>>
>>  2. We'd support the existing "grep.extendedRegexp" option, but ignore
>>     it when the new "grep.patternType" option is set. We said we'd
>>     only ignore the older "grep.extendedRegexp" option "when the
>>     `grep.patternType` option is set. to a value other than
>>     'default'".
>
> Extra period in the middle of a sentence after "set".
>
>> As before both "grep.patternType" and "grep.extendedRegexp" are
>> last-one-wins variable, with "grep.extendedRegexp" yielding to
>> "grep.patternType", except when "grep.patternType=default".
>>
>> Note that this applies as we parse the config, i.e. a sequence of:
>>
>>     -c grep.patternType=perl
>>     -c grep.extendedRegexp=true \
>>     -c grep.patternType=default
>>
>> Should select ERE due to "grep.extendedRegexp=true and
>
> Downcase "S" in "should", as this is still in the middle of the
> sentence that began with "Note that".
>
>> grep.extendedRegexp=default", not BRE, even though that's the
>
> The second one should be "grep.patternType=default".

*nod*

>> "default" patternType. We can determine this as we parse the config,
>
> Drop "even though that's the default patternType".  You've already
> explained that it is not what "default" for the "patternType" (which
> any reader who has been following so far would take as a reference
> to "grep.patternType") at all.  You can also drop ", not BRE," while
> doing so.
>
>> because:
>>
>>  * If we see "grep.extendedRegexp" we set the internal "ero" to its
>>    boolean value.
>>
>>  * If we see "grep.extendedRegexp" but
>>    "grep.patternType=[default|<unset>]" is in effect we *don't* set
>>    the internal "pattern_type_option" to update the pattern type.
>>
>>  * If we see "grep.patternType!=default" we can set our internal
>>    "pattern_type_option" directly, it doesn't matter what the state of
>>    "grep.extendedRegexp" is, but we don't forget what it was, in case
>>    we see a "grep.patternType=default" again.
>>
>>  * If we see a "grep.patternType=default" we can set the pattern to
>>    ERE or BRE depending on whether we last saw a
>>    "grep.extendedRegexp=true" or
>>    "grep.extendedRegexp=[false|<unset>]".
>
> That sounds complex enough, doesn't it?  The statement that opens
> the proposed log mesage is "gets rid of complex parsing logic that
> isn't needed", but the above sounds more like a complex logic is
> being traded with another.

The complexity this series is addressing is that you couldn't treat this
like most other built-in / library APIs, where we instantiate defaults,
do config, then CLI parsing.
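
As a sketch of that usual three-step order (hypothetical names
throughout; OPTS_INIT only loosely mirrors GREP_OPT_INIT, and the
setters stand in for the config callback and the option parser):

```c
enum pattern_type { PT_UNSPECIFIED = 0, PT_BRE, PT_ERE, PT_FIXED, PT_PCRE };

struct opts {
	enum pattern_type pattern_type_option;
	int max_depth;
};

/* Step 1: static defaults. */
#define OPTS_INIT { .pattern_type_option = PT_UNSPECIFIED, .max_depth = -1 }

/* Step 2: a value read from configuration overwrites the default. */
static void apply_config(struct opts *o, enum pattern_type from_config)
{
	o->pattern_type_option = from_config;
}

/* Step 3: a command-line flag, if any, overwrites the config value. */
static void apply_cli_flag(struct opts *o, char flag)
{
	switch (flag) {
	case 'G': o->pattern_type_option = PT_BRE; break;
	case 'E': o->pattern_type_option = PT_ERE; break;
	case 'F': o->pattern_type_option = PT_FIXED; break;
	case 'P': o->pattern_type_option = PT_PCRE; break;
	default: break; /* no pattern-type flag: keep the config value */
	}
}
```

Each step writes into the same member, so whichever runs last decides
the result and no separate "commit" step is needed afterwards.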

>> diff --git a/grep.c b/grep.c
>> index 60228a95a4f..bb487e994d0 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -48,6 +48,12 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
>>  
>>  define_list_config_array_extra(color_grep_slots, {"match"});
>>  
>> +static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
>> +{
>> +	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
>> +		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
>> +}
>> +
>>  /*
>>   * Read the configuration file once and store it in
>>   * the grep_defaults template.
>> @@ -56,17 +62,22 @@ int grep_config(const char *var, const char *value, void *cb)
>>  {
>>  	struct grep_opt *opt = cb;
>>  	const char *slot;
>> +	static int ero = -1;
>
> Is this new reentrancy issue worth it?  I think it makes the whole
> thing unnecessarily complex, compared to a more naïve "we keep track
> of the last-one-that-won for grep.extendedRegexp and
> grep.patternType separately during option and config parsing inside
> the grep_opt structure, and then combine the two when we compile the
> pattern string into regexp or pcre object" approach.

I can move back to using the variable in the struct. The post-image here
is from incrementally working on that, until I saw that it wasn't needed
outside the config parsing step itself.

Is a reentrancy issue a practical concern? This part of the grep API is
explicitly called by the whole init-once/config-once/getopt-once step in
builtin/grep.c (and revision.c).

> Let's ask it in a different way.  What is this static, that is way
> separated from all the members in the grep_opt structure, buying us?
> Certainly not the ease of understanding what the code is doing.  Not
> the size of the overall grep_opt structure (which we do not allocate
> tons anyway).  Other than that fact that you can say "I did it my
> own way, ignoring reviewer suggestions, I won!!!", I do not see any
> upside with this code.

The ease of understanding that the state isn't needed outside of the
config callback, but that's in the eye of the beholder I suppose. That's
not true of the other remaining struct members.

Then downthread you mention:

> Another problem is that there are those corporate server-side folks
> who are interested in giving an endpoint that lets clients to ask
> performing Git operations (like grep and blame).  Adding more statics
> instead of keeping track of dynamic runtime structure like grep_opt
> is deliberately making things more difficult for them, isn't it?

IIRC I'm the only person who's been advocating that as an eventual neat
thing to do, and I don't see how this would make it harder.

If we had a grep IPC call of some sort we'd surely limit it to the
"modern" config of grep.patternType, and take the opportunity to
deprecate grep.extendedRegexp from that interface.

Or it would take explicit parameters, instead of allowing the passing of
config from a remote process.

In either case the required surgery and scaffolding to make that work
will at most be trivially impacted by these changes, and likely not at
all.


* Re: [PATCH v8 09/10] grep: simplify config parsing and option parsing
  2022-01-19  0:17                 ` Ævar Arnfjörð Bjarmason
@ 2022-01-19  1:09                   ` Junio C Hamano
  2022-01-19  1:15                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 151+ messages in thread
From: Junio C Hamano @ 2022-01-19  1:09 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> If we had a grep IPC call of some sort we'd surely limit it to the
> "modern" config of grep.patternType, and take the opportunity to
> deprecate grep.extendedRegexp from that interface.

A process that visits more than one repository, with different
configuration, does not have such a choice.  As far as I can tell, there
is not yet a way to undo the static in the code after these patches
so that such a process can reset between repositories.

A member that is necessary only during configuration parsing is not
a problem as long as the field is marked clearly as such (I wouldn't
even call that "not a NEW problem", since it is not a problem to
begin with, and I am sure there are more examples in other
subsystems).  A static inside a helper function that has subtle
interactions with second and subsequent invocations makes the code
much harder to follow, on the other hand.
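
To illustrate the concern, here is a minimal sketch (not the actual
grep.c code; "demo_opt", "DEMO_OPT_INIT" and both "demo_config_*"
functions are hypothetical, and the pattern-type logic is heavily
simplified) of how a function-local static carries state from one
configuration pass into the next, while a struct member does not:

```c
#include <string.h>

/* Hypothetical, simplified stand-in for "struct grep_opt". */
struct demo_opt {
	int extended_regexp; /* -1 = unset, 0 = false, 1 = true */
	int use_ere;         /* resulting pattern type: 1 = ERE, 0 = BRE */
};

#define DEMO_OPT_INIT { .extended_regexp = -1, .use_ere = 0 }

/* Variant 1: remembers "grep.extendedRegexp" in a function-local static. */
static void demo_config_static(struct demo_opt *opt, const char *var,
			       const char *value)
{
	static int ero = -1; /* survives across calls, even for another "opt" */

	if (!strcmp(var, "grep.extendedregexp"))
		ero = !strcmp(value, "true");
	else if (!strcmp(var, "grep.patterntype") && !strcmp(value, "default"))
		opt->use_ere = (ero == 1);
}

/* Variant 2: keeps the same state in the struct being configured. */
static void demo_config_member(struct demo_opt *opt, const char *var,
			       const char *value)
{
	if (!strcmp(var, "grep.extendedregexp"))
		opt->extended_regexp = !strcmp(value, "true");
	else if (!strcmp(var, "grep.patterntype") && !strcmp(value, "default"))
		opt->use_ere = (opt->extended_regexp == 1);
}
```

If a first "repository" sets grep.extendedRegexp=true and a second one
only sets grep.patternType=default, variant 1 would pick ERE for the
second one as well, whereas variant 2 starts from a fresh
DEMO_OPT_INIT and does not.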


* Re: [PATCH v8 09/10] grep: simplify config parsing and option parsing
  2022-01-19  1:09                   ` Junio C Hamano
@ 2022-01-19  1:15                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-19  1:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, J Smith, Taylor Blau


On Tue, Jan 18 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> If we had a grep IPC call of some sort we'd surely limit it to the
>> "modern" config of grep.patternType, and take the opportunity to
>> deprecate grep.extendedRegexp from that interface.
>
> A process that visits more than one repository, with different
> configuration, does not have such a choice.  As far as I can tell, there
> is not yet a way to undo the static in the code after these patches
> so that such a process can reset between repositories.

Yes, it does assume the current one-shot API users, e.g. "git grep
--recurse-submodules" only reads the config of the one parent repo.

> A member that is necessary only during configuration parsing is not
> a problem as long as the field is marked clearly as such (I wouldn't
> even call that "not a NEW problem", since it is not a problem to
> begin with, and I am sure there are more examples in other
> subsystems).  A static inside a helper function that has subtle
> interactions with second and subsequent invocations makes the code
> much harder to follow, on the other hand.

*nod*, will change it.


* [PATCH v9 0/9] grep: simplify & delete "init" & "config" code
  2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                               ` (9 preceding siblings ...)
  2022-01-18 15:55             ` [PATCH v8 10/10] grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED Ævar Arnfjörð Bjarmason
@ 2022-01-27 11:56             ` Ævar Arnfjörð Bjarmason
  2022-01-27 11:56               ` [PATCH v9 1/9] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
                                 ` (9 more replies)
  10 siblings, 10 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-27 11:56 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This series makes using the grep.[ch] API easier, by having it follow
the usual pattern of being initialized with:

    defaults -> config -> command-line

This is to make some follow-up work easier. It is a net code
deletion if we exclude newly added tests.
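
As a rough illustration of that ordering, here is a minimal sketch. It
is not the grep.[ch] API itself: "demo_opt", "DEMO_OPT_INIT",
"demo_config" and "demo_parse_args" are hypothetical names, and the
option handling is reduced to a single flag:

```c
#include <string.h>

/* Hypothetical option struct; not the real "struct grep_opt". */
struct demo_opt {
	int max_depth;
	int linenum;
};

/* Step 1: compile-time defaults. */
#define DEMO_OPT_INIT { .max_depth = -1, .linenum = 0 }

/* Step 2: a config callback that overrides the defaults. */
static int demo_config(const char *var, const char *value, void *cb)
{
	struct demo_opt *opt = cb;

	if (!strcmp(var, "grep.linenumber"))
		opt->linenum = !strcmp(value, "true");
	return 0;
}

/* Step 3: command-line parsing overrides the configuration. */
static void demo_parse_args(struct demo_opt *opt, int argc, const char **argv)
{
	int i;

	for (i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "--line-number"))
			opt->linenum = 1;
		else if (!strcmp(argv[i], "--no-line-number"))
			opt->linenum = 0;
	}
}
```

A caller would initialize with DEMO_OPT_INIT, feed every config
key/value pair to demo_config() with the struct as the callback data,
and only then call demo_parse_args(), so that each later step can
override the one before it.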

Changes since v8:

 * Addressed Junio's comments on the last two patches: 9/9 is now
   rewritten to use "struct grep_opt" to track the
   "extended_regexp_option" member, and the previous 10/10 has been
   dropped.

For the v8 see
https://lore.kernel.org/git/cover-v8-00.10-00000000000-20220118T155211Z-avarab@gmail.com/

Ævar Arnfjörð Bjarmason (9):
  grep.h: remove unused "regex_t regexp" from grep_opt
  log tests: check if grep_config() is called by "log"-like cmds
  grep tests: add missing "grep.patternType" config tests
  built-ins: trust the "prefix" from run_builtin()
  grep.c: don't pass along NULL callback value
  grep API: call grep_config() after grep_init()
  grep.h: make "grep_opt.pattern_type_option" use its enum
  grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
  grep: simplify config parsing and option parsing

 builtin/grep.c    |  27 ++++++-----
 builtin/log.c     |  13 ++++-
 builtin/ls-tree.c |   2 +-
 git.c             |   1 +
 grep.c            | 120 ++++++++--------------------------------------
 grep.h            |  32 +++++++++----
 revision.c        |   4 +-
 t/t4202-log.sh    |  24 ++++++++++
 t/t7810-grep.sh   |  68 ++++++++++++++++++++++++++
 9 files changed, 165 insertions(+), 126 deletions(-)

Range-diff against v8:
 1:  010a2066656 =  1:  b9372cde017 grep.h: remove unused "regex_t regexp" from grep_opt
 2:  e4981fa3417 =  2:  30219a0ae9d log tests: check if grep_config() is called by "log"-like cmds
 3:  59092169e55 =  3:  a75b288340b grep tests: add missing "grep.patternType" config tests
 4:  331c9019a0e =  4:  6e7f9730a7d built-ins: trust the "prefix" from run_builtin()
 5:  25dd327b653 =  5:  fbcfea84696 grep.c: don't pass along NULL callback value
 6:  3c559ad006a =  6:  96c8cc7806e grep API: call grep_config() after grep_init()
 7:  daf873899c1 =  7:  e09616056b4 grep.h: make "grep_opt.pattern_type_option" use its enum
 8:  62650a78ea9 =  8:  61fc6a4dac8 grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
 9:  c211bb0c69d !  9:  445467e87f7 grep: simplify config parsing and option parsing
    @@ Commit message
          2. We'd support the existing "grep.extendedRegexp" option, but ignore
             it when the new "grep.patternType" option is set. We said we'd
             only ignore the older "grep.extendedRegexp" option "when the
    -        `grep.patternType` option is set. to a value other than
    +        `grep.patternType` option is set to a value other than
             'default'".
     
         In a preceding commit we changed grep_config() to be called after
    @@ Commit message
             -c grep.extendedRegexp=true \
             -c grep.patternType=default
     
    -    Should select ERE due to "grep.extendedRegexp=true and
    -    grep.extendedRegexp=default", not BRE, even though that's the
    -    "default" patternType. We can determine this as we parse the config,
    -    because:
    +    should select ERE due to "grep.extendedRegexp=true and
    +    grep.patternType=default". We can determine this as we parse the
    +    config, because:
     
    -     * If we see "grep.extendedRegexp" we set the internal "ero" to its
    -       boolean value.
    +     * If we see "grep.extendedRegexp" we set "extended_regexp_option" to
    +       its boolean value.
     
          * If we see "grep.extendedRegexp" but
            "grep.patternType=[default|<unset>]" is in effect we *don't* set
    @@ Commit message
     
          * If we see "grep.patternType!=default" we can set our internal
            "pattern_type_option" directly, it doesn't matter what the state of
    -       "grep.extendedRegexp" is, but we don't forget what it was, in case
    -       we see a "grep.patternType=default" again.
    +       "extended_regexp_option" is, but we don't forget what it was, in
    +       case we see a "grep.patternType=default" again.
     
          * If we see a "grep.patternType=default" we can set the pattern to
            ERE or BRE depending on whether we last saw a
            "grep.extendedRegexp=true" or
            "grep.extendedRegexp=[false|<unset>]".
     
    -    We could equally call this new adjust_pattern_type() in
    -    compile_regexp(), i.e. this fixup on top of this passes all our
    -    tests (with -U0 for brevity):
    -
    -        @@ -60,0 +61 @@ static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
    -        +static int ero = -1;
    -        @@ -65 +65,0 @@ int grep_config(const char *var, const char *value, void *cb)
    -        -       static int ero = -1;
    -        @@ -72 +71,0 @@ int grep_config(const char *var, const char *value, void *cb)
    -        -               adjust_pattern_type(&opt->pattern_type_option, ero);
    -        @@ -80 +78,0 @@ int grep_config(const char *var, const char *value, void *cb)
    -        -               adjust_pattern_type(&opt->pattern_type_option, ero);
    -        @@ -445,0 +444,2 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
    -        +       if (ero != -1)
    -        +               adjust_pattern_type(&opt->pattern_type_option, ero);
    -
    -    But doing it as we stream the git_config() makes it
    -    clear that we can determine the interplay between these two variables
    -    as we go. We don't need to wait until we see the last value of the two
    -    configuration variables.
    -
    -    This is true because of the rationale above, and because the
    -    subsequent code in compile_regexp() treats
    -    "pattern_type_option=GREP_PATTERN_TYPE_{UNSPECIFIED,BRE}"
    -    equally. I.e. we'll end up with different internal
    -    ""pattern_type_option" values there for:
    -
    -        # UNSPECIFIED
    -        -c grep.patternType=default
    -        # BRE
    -        -c grep.extendedRegexp=false -c grep.patternType=default
    -
    -    But the difference won't matter, which simplifies some of this logic,
    -    we never need to adjust a "grep.patternType" if we didn't see a
    -    "grep.extendedRegexp" before. We can also remove the
    -    "extended_regexp_option" member from "struct grep_opt" in favor of a
    -    static variable in grep_config().
    +    With this change the "extended_regexp_option" member is only used
    +    within grep_config(), and in the current codebase we could equally
    +    track it as a "static" variable within that function, see [1] for a
    +    version for this patch that did that. We're keeping it a struct member
    +    to make that function reentrant, in case it ends up mattering in the
    +    future.
     
         The command-line parsing in cmd_grep() can then completely ignore
         "grep.extendedRegexp". Whatever effect it had before that step won't
    @@ Commit message
         API, 2017-06-29) for addition of the two comments being removed here,
         i.e. the complexity noted in that commit is now going away.
     
    +    1. https://lore.kernel.org/git/patch-v8-09.10-c211bb0c69d-20220118T155211Z-avarab@gmail.com/
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/grep.c ##
    @@ grep.c: static int parse_pattern_type_arg(const char *opt, const char *arg)
       * Read the configuration file once and store it in
       * the grep_defaults template.
     @@ grep.c: int grep_config(const char *var, const char *value, void *cb)
    - {
    - 	struct grep_opt *opt = cb;
    - 	const char *slot;
    -+	static int ero = -1;
    - 
    - 	if (userdiff_config(var, value) < 0)
    - 		return -1;
      
      	if (!strcmp(var, "grep.extendedregexp")) {
    --		opt->extended_regexp_option = git_config_bool(var, value);
    -+		ero = git_config_bool(var, value);
    -+		adjust_pattern_type(&opt->pattern_type_option, ero);
    + 		opt->extended_regexp_option = git_config_bool(var, value);
    ++		adjust_pattern_type(&opt->pattern_type_option,
    ++				    opt->extended_regexp_option);
      		return 0;
      	}
      
      	if (!strcmp(var, "grep.patterntype")) {
      		opt->pattern_type_option = parse_pattern_type_arg(var, value);
    -+		if (ero == -1)
    ++		if (opt->extended_regexp_option == -1)
     +			return 0;
    -+		adjust_pattern_type(&opt->pattern_type_option, ero);
    ++		adjust_pattern_type(&opt->pattern_type_option,
    ++				    opt->extended_regexp_option);
      		return 0;
      	}
      
    @@ grep.h: struct grep_opt {
      	int pathname;
      	int null_following_name;
     @@ grep.h: struct grep_opt {
    - 	int max_depth;
    - 	int funcname;
    - 	int funcbody;
    --	int extended_regexp_option;
    - 	enum grep_pattern_type pattern_type_option;
    - 	int ignore_locale;
    - 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
    + 	.relative = 1, \
    + 	.pathname = 1, \
    + 	.max_depth = -1, \
    ++	.extended_regexp_option = -1, \
    + 	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
    + 	.colors = { \
    + 		[GREP_COLOR_CONTEXT] = "", \
     @@ grep.h: struct grep_opt {
      
      int grep_config(const char *var, const char *value, void *);
10:  b52a0c11fa9 <  -:  ----------- grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED
-- 
2.35.0.894.g563b84683b9



* [PATCH v9 1/9] grep.h: remove unused "regex_t regexp" from grep_opt
  2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
@ 2022-01-27 11:56               ` Ævar Arnfjörð Bjarmason
  2022-01-27 11:56               ` [PATCH v9 2/9] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
                                 ` (8 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-27 11:56 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This "regex_t" in grep_opt has not been used since
f9b9faf6f8a (builtin-grep: allow more than one patterns., 2006-05-02),
we still use a "regex_t" for compiling regexes, but that's in the
"grep_pat" struct.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.h b/grep.h
index 6a1f0ab0172..400172676a1 100644
--- a/grep.h
+++ b/grep.h
@@ -136,7 +136,6 @@ struct grep_opt {
 
 	const char *prefix;
 	int prefix_length;
-	regex_t regexp;
 	int linenum;
 	int columnnum;
 	int invert;
-- 
2.35.0.894.g563b84683b9



* [PATCH v9 2/9] log tests: check if grep_config() is called by "log"-like cmds
  2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2022-01-27 11:56               ` [PATCH v9 1/9] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
@ 2022-01-27 11:56               ` Ævar Arnfjörð Bjarmason
  2022-01-27 11:56               ` [PATCH v9 3/9] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
                                 ` (7 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-27 11:56 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
for pattern style options & config, 2017-05-20) to check not only
whether "git log" handles "grep.patternType", but also "git show"
etc.

It's sufficient to check whether we match a "fixed" or a "basic" regex
here to see if these codepaths correctly invoked grep_config(). We
don't need to check the details of their regular expression matching
as the "log" test does.
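
Why "..." distinguishes the two modes can be sketched outside of git
with plain POSIX regcomp() and strstr(). This is only an illustration
under that assumption; "demo_match_fixed" and "demo_match_basic" are
hypothetical helpers and not how grep.c matches:

```c
#include <regex.h>
#include <string.h>

/* Returns 1 if "pattern" occurs literally in "subject" (fixed-string mode). */
static int demo_match_fixed(const char *pattern, const char *subject)
{
	return strstr(subject, pattern) != NULL;
}

/* Returns 1 if "pattern", compiled as a POSIX basic regex, matches "subject". */
static int demo_match_basic(const char *pattern, const char *subject)
{
	regex_t re;
	int matched;

	if (regcomp(&re, pattern, REG_NOSUB))
		return 0; /* treat a compile failure as "no match" */
	matched = !regexec(&re, subject, 0, NULL, 0);
	regfree(&re);
	return matched;
}
```

As a fixed string "..." needs three literal dots, which neither commit
subject in the test ("1" and "(1|2)") contains. As a basic regex it
matches any three characters, so it matches "(1|2)".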

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 50495598619..e775b378e4b 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -449,6 +449,30 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
 	)
 '
 
+for cmd in show whatchanged reflog format-patch
+do
+	case "$cmd" in
+	format-patch) myarg="HEAD~.." ;;
+	*) myarg= ;;
+	esac
+
+	test_expect_success "$cmd: understands grep.patternType, like 'log'" '
+		git init "pattern-type-$cmd" &&
+		(
+			cd "pattern-type-$cmd" &&
+			test_commit 1 file A &&
+			test_commit "(1|2)" file B 2 &&
+
+			git -c grep.patternType=fixed $cmd --grep="..." $myarg >actual &&
+			test_must_be_empty actual &&
+
+			git -c grep.patternType=basic $cmd --grep="..." $myarg >actual &&
+			test_file_not_empty actual
+		)
+	'
+done
+test_done
+
 test_expect_success 'log --author' '
 	cat >expect <<-\EOF &&
 	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>
-- 
2.35.0.894.g563b84683b9



* [PATCH v9 3/9] grep tests: add missing "grep.patternType" config tests
  2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2022-01-27 11:56               ` [PATCH v9 1/9] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
  2022-01-27 11:56               ` [PATCH v9 2/9] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2022-01-27 11:56               ` Ævar Arnfjörð Bjarmason
  2022-01-27 11:56               ` [PATCH v9 4/9] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
                                 ` (6 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-27 11:56 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the grep tests to assert that setting
"grep.patternType=extended" followed by "grep.patternType=default"
will behave as if "--basic-regexp" was provided, and not as
"--extended-regexp". In a subsequent commit we'll need to treat
"grep.patternType=default" as a special-case, but let's make sure we
ignore it if it's being set to "default" following an earlier
non-"default" "grep.patternType" setting.

Let's also test what happens when we have a sequence of "extended"
followed by "default" and "fixed"; in that case "fixed" should
prevail. Also add tests to check that a "grep.extendedRegexp=true"
followed by a "grep.extendedRegexp=false" behaves as though
"grep.extendedRegexp" wasn't provided.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t7810-grep.sh | 68 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 424c31c3287..34d8f69c1de 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -451,6 +451,65 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=basic \
+			-c grep.extendedRegexp=false \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins & defers to grep.patternType" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=extended \
+			-c grep.extendedRegexp=false \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.patternType=fixed \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=default \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=basic \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.patternType=[extended -> default -> fixed]" '
+		echo "${HC}ab:a+b*c" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			-c grep.patternType=fixed \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
 		echo "${HC}ab:abc" >expected &&
 		git \
@@ -478,6 +537,15 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.extendedRegexp=false and grep.patternType=default" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.extendedRegexp=false \
+			-c grep.patternType=extended \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.extendedRegexp=true and grep.patternType=basic" '
 		echo "${HC}ab:a+bc" >expected &&
 		git \
-- 
2.35.0.894.g563b84683b9



* [PATCH v9 4/9] built-ins: trust the "prefix" from run_builtin()
  2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                 ` (2 preceding siblings ...)
  2022-01-27 11:56               ` [PATCH v9 3/9] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
@ 2022-01-27 11:56               ` Ævar Arnfjörð Bjarmason
  2022-01-27 11:56               ` [PATCH v9 5/9] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
                                 ` (5 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-27 11:56 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the
"prefix" passed from "run_builtin()". The "prefix" we get from setup.c
is either going to be NULL or a string of length >0, never "".

So we can drop the "prefix && *prefix" checks added for
"builtin/grep.c" in 0d042fecf2f (git-grep: show pathnames relative to
the current directory, 2006-08-11), and for "builtin/ls-tree.c" in
a69dd585fca (ls-tree: chomp leading directories when run from a
subdirectory, 2005-12-23).

As seen in code in revision.c that was added in cd676a51367 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that does away with this
assertion.

This makes it easier to reason about a change to the "prefix_length"
code in grep.c in a subsequent commit, and since we're going to the
trouble of doing that, let's leave behind an assert() to promise this
to any future callers.

For "builtin/grep.c" it would be painful to pass the "prefix" down the
callchain of:

    cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
    grep_source_name

So for the code that needs it in grep_source_name() let's add a
"grep_prefix" variable similar to the existing "ls_tree_prefix".

While at it let's move the code in cmd_ls_tree() around so that we
assign to the "ls_tree_prefix" right after declaring the variables,
and stop assigning to "prefix". We only subsequently used that
variable later in the function after clobbering it. Let's just use our
own "grep_prefix" instead.

Let's also add an assert() in git.c, so that we'll make this promise
about the "prefix" to any current and future callers, as well as to
any readers of the code.
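
A minimal sketch of what that promise buys a caller, using a
hypothetical helper "demo_chomp_len" rather than the real
cmd_ls_tree() code:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: how many leading bytes to chomp for "prefix". */
static size_t demo_chomp_len(const char *prefix)
{
	/* The invariant: NULL, or a string with at least one character. */
	assert(!prefix || *prefix);

	/* Given the invariant, "prefix && *prefix" reduces to "prefix". */
	return prefix ? strlen(prefix) : 0;
}
```

With the invariant asserted once at the entry point, callers only
need the NULL check, and an empty-string "prefix" would trip the
assert() instead of silently being treated like NULL.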

Code history:

 * The strlen() in "grep.c" hasn't been used since 493b7a08d80 (grep:
   accept relative paths outside current working directory, 2009-09-05).

   When that code was added in 0d042fecf2f (git-grep: show pathnames
   relative to the current directory, 2006-08-11) we used the length.

   But since 493b7a08d80 we haven't used it for anything except a
   boolean check that we could have done on the "prefix" member
   itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c    | 13 ++++++++-----
 builtin/ls-tree.c |  2 +-
 git.c             |  1 +
 grep.c            |  4 +---
 grep.h            |  4 +---
 revision.c        |  2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 9e34a820ad4..d85cbabea67 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,6 +26,8 @@
 #include "object-store.h"
 #include "packfile.h"
 
+static const char *grep_prefix;
+
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
 	NULL
@@ -315,11 +317,11 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 	strbuf_reset(out);
 
 	if (opt->null_following_name) {
-		if (opt->relative && opt->prefix_length) {
+		if (opt->relative && grep_prefix) {
 			struct strbuf rel_buf = STRBUF_INIT;
 			const char *rel_name =
 				relative_path(filename + tree_name_len,
-					      opt->prefix, &rel_buf);
+					      grep_prefix, &rel_buf);
 
 			if (tree_name_len)
 				strbuf_add(out, filename, tree_name_len);
@@ -332,8 +334,8 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 		return;
 	}
 
-	if (opt->relative && opt->prefix_length)
-		quote_path(filename + tree_name_len, opt->prefix, out, 0);
+	if (opt->relative && grep_prefix)
+		quote_path(filename + tree_name_len, grep_prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
 
@@ -962,9 +964,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_END()
 	};
+	grep_prefix = prefix;
 
 	git_config(grep_cmd_config, NULL);
-	grep_init(&opt, the_repository, prefix);
+	grep_init(&opt, the_repository);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c71..6cb554cbb0a 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -150,7 +150,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 	ls_tree_prefix = prefix;
-	if (prefix && *prefix)
+	if (prefix)
 		chomp_prefix = strlen(prefix);
 
 	argc = parse_options(argc, argv, prefix, ls_tree_options,
diff --git a/git.c b/git.c
index edda922ce6d..9d257e092da 100644
--- a/git.c
+++ b/git.c
@@ -436,6 +436,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 	} else {
 		prefix = NULL;
 	}
+	assert(!prefix || *prefix);
 	precompose_argv_prefix(argc, argv, NULL);
 	if (use_pager == -1 && run_setup &&
 		!(p->option & DELAY_PAGER_CONFIG))
diff --git a/grep.c b/grep.c
index 7bb0360869a..8421dc55486 100644
--- a/grep.c
+++ b/grep.c
@@ -139,13 +139,11 @@ int grep_config(const char *var, const char *value, void *cb)
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
  */
-void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
+void grep_init(struct grep_opt *opt, struct repository *repo)
 {
 	*opt = grep_defaults;
 
 	opt->repo = repo;
-	opt->prefix = prefix;
-	opt->prefix_length = (prefix && *prefix) ? strlen(prefix) : 0;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
diff --git a/grep.h b/grep.h
index 400172676a1..23a2a41d2c4 100644
--- a/grep.h
+++ b/grep.h
@@ -134,8 +134,6 @@ struct grep_opt {
 	 */
 	struct repository *repo;
 
-	const char *prefix;
-	int prefix_length;
 	int linenum;
 	int columnnum;
 	int invert;
@@ -182,7 +180,7 @@ struct grep_opt {
 };
 
 int grep_config(const char *var, const char *value, void *);
-void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index ad4286fbdde..d6e0e2b23b5 100644
--- a/revision.c
+++ b/revision.c
@@ -1838,7 +1838,7 @@ void repo_init_revisions(struct repository *r,
 	revs->commit_format = CMIT_FMT_DEFAULT;
 	revs->expand_tabs_in_log_default = 8;
 
-	grep_init(&revs->grep_filter, revs->repo, prefix);
+	grep_init(&revs->grep_filter, revs->repo);
 	revs->grep_filter.status_only = 1;
 
 	repo_diff_setup(revs->repo, &revs->diffopt);
-- 
2.35.0.894.g563b84683b9



* [PATCH v9 5/9] grep.c: don't pass along NULL callback value
  2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                 ` (3 preceding siblings ...)
  2022-01-27 11:56               ` [PATCH v9 4/9] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
@ 2022-01-27 11:56               ` Ævar Arnfjörð Bjarmason
  2022-01-27 11:56               ` [PATCH v9 6/9] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
                                 ` (4 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-27 11:56 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change grep_cmd_config() to stop passing around the always-NULL "cb"
value. When this code was added in 7e8f59d577e (grep: color patterns
in output, 2009-03-07) it was non-NULL, but when that changed in
15fabd1bbd4 (builtin/grep.c: make configuration callback more
reusable, 2012-10-09) this code was left behind.

In a subsequent change I'll start using the "cb" value; this will make
it clear which of the functions we call need it, and which don't.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index d85cbabea67..5ec4cecae45 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,8 +285,8 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, cb);
-	if (git_color_default_config(var, value, cb) < 0)
+	int st = grep_config(var, value, NULL);
+	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
 	if (!strcmp(var, "grep.threads")) {
-- 
2.35.0.894.g563b84683b9



* [PATCH v9 6/9] grep API: call grep_config() after grep_init()
  2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                 ` (4 preceding siblings ...)
  2022-01-27 11:56               ` [PATCH v9 5/9] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
@ 2022-01-27 11:56               ` Ævar Arnfjörð Bjarmason
  2022-01-27 11:56               ` [PATCH v9 7/9] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
                                 ` (3 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-27 11:56 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().

So we effectively (b) initialized config, (a) then defaults, (c)
followed by user options. Usually those are ordered as "a", "b" and
"c" instead.
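
The conventional "a", "b", "c" ordering can be sketched as below. This
is a minimal illustration, not git's code: "struct demo_opt", the
"demo.maxdepth" key and all three functions are hypothetical, and
atoi() stands in for git's real config value parsing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "struct grep_opt"; not git's real struct. */
struct demo_opt {
	int max_depth;
};

/* (a) defaults: always start from a known state */
static void demo_init(struct demo_opt *opt)
{
	opt->max_depth = -1;
}

/*
 * (b) config: the callback receives the already initialized struct
 * through the opaque "cb" pointer and overrides the defaults.
 */
static int demo_config(const char *var, const char *value, void *cb)
{
	struct demo_opt *opt = cb;

	assert(opt);
	if (!strcmp(var, "demo.maxdepth"))
		opt->max_depth = atoi(value);
	return 0;
}

/* (c) user options: parsed last, so they override the config */
static void demo_parse_option(struct demo_opt *opt, int max_depth)
{
	opt->max_depth = max_depth;
}
```

Calling demo_init(), then demo_config(), then demo_parse_option() on
the same struct means each later source overrides the earlier one,
and no state is shared between two separately initialized structs.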

As the comments being removed here show, the previous behavior needed
to be carefully explained, as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.

This does not change the behavior of any of the configuration
variables or options. Behavior would have changed had we not moved
the grep_config() call in "builtin/log.c". Now that we call
"grep_config" after "git_log_config" and "git_format_config", we need
to pass in the already initialized "struct grep_opt *".

See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e0 (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.

The memcpy() pattern here will be optimized away and follows the
convention of other *_init() functions. See 5726a6b4012 (*.c *_init():
define in terms of corresponding *_INIT macro, 2021-07-01).
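
A minimal sketch of that *_INIT convention, assuming a hypothetical
"struct demo_opt" with a "DEMO_OPT_INIT" macro (neither exists in
git; they only mirror the shape of GREP_OPT_INIT and grep_init()):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct; the real "struct grep_opt" has many more members. */
struct demo_opt {
	int relative;
	int max_depth;
	int color;
	const char *name;
};

/* Single source of truth for the defaults. */
#define DEMO_OPT_INIT { \
	.relative = 1, \
	.max_depth = -1, \
	.color = -1, \
}

/* The init function is defined in terms of the macro. */
static void demo_opt_init(struct demo_opt *opt, const char *name)
{
	struct demo_opt blank = DEMO_OPT_INIT;

	assert(opt);
	memcpy(opt, &blank, sizeof(*opt));
	opt->name = name;
}
```

With this shape, "struct demo_opt o = DEMO_OPT_INIT;" and
"demo_opt_init(&o, ...)" cannot drift apart, since both take their
defaults from the same macro. Members the macro does not name are
zero-initialized.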

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  4 ++--
 builtin/log.c  | 13 +++++++++++--
 grep.c         | 39 +++------------------------------------
 grep.h         | 21 +++++++++++++++++++++
 4 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5ec4cecae45..0ea124321b6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,7 +285,7 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, NULL);
+	int st = grep_config(var, value, cb);
 	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
@@ -966,8 +966,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	};
 	grep_prefix = prefix;
 
-	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, the_repository);
+	git_config(grep_cmd_config, &opt);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/log.c b/builtin/log.c
index 4b493408cc5..06283b37e7a 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -520,8 +520,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (grep_config(var, value, cb) < 0)
-		return -1;
 	if (git_gpg_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
@@ -536,6 +534,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.simplify_history = 0;
 	memset(&opt, 0, sizeof(opt));
@@ -650,6 +650,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 
 	memset(&match_all, 0, sizeof(match_all));
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.always_show_header = 1;
 	rev.no_walk = 1;
@@ -733,6 +735,8 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix)
 
 	repo_init_revisions(the_repository, &rev, prefix);
 	init_reflog_walk(&rev.reflog_info);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.verbose_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -766,6 +770,8 @@ int cmd_log(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.always_show_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -1848,10 +1854,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	extra_hdr.strdup_strings = 1;
 	extra_to.strdup_strings = 1;
 	extra_cc.strdup_strings = 1;
+
 	init_log_defaults();
 	init_display_notes(&notes_opt);
 	git_config(git_format_config, NULL);
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.show_notes = show_notes;
 	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
 	rev.commit_format = CMIT_FMT_EMAIL;
diff --git a/grep.c b/grep.c
index 8421dc55486..35e12e43c09 100644
--- a/grep.c
+++ b/grep.c
@@ -19,27 +19,6 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
-static struct grep_opt grep_defaults = {
-	.relative = 1,
-	.pathname = 1,
-	.max_depth = -1,
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
-	.colors = {
-		[GREP_COLOR_CONTEXT] = "",
-		[GREP_COLOR_FILENAME] = GIT_COLOR_MAGENTA,
-		[GREP_COLOR_FUNCTION] = "",
-		[GREP_COLOR_LINENO] = GIT_COLOR_GREEN,
-		[GREP_COLOR_COLUMNNO] = GIT_COLOR_GREEN,
-		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_SELECTED] = "",
-		[GREP_COLOR_SEP] = GIT_COLOR_CYAN,
-	},
-	.only_matching = 0,
-	.color = -1,
-	.output = std_output,
-};
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -75,20 +54,12 @@ define_list_config_array_extra(color_grep_slots, {"match"});
  */
 int grep_config(const char *var, const char *value, void *cb)
 {
-	struct grep_opt *opt = &grep_defaults;
+	struct grep_opt *opt = cb;
 	const char *slot;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	/*
-	 * The instance of grep_opt that we set up here is copied by
-	 * grep_init() to be used by each individual invocation.
-	 * When populating a new field of this structure here, be
-	 * sure to think about ownership -- e.g., you might need to
-	 * override the shallow copy in grep_init() with a deep copy.
-	 */
-
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
 		return 0;
@@ -134,14 +105,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/*
- * Initialize one instance of grep_opt and copy the
- * default values from the template we read the configuration
- * information in an earlier call to git_config(grep_config).
- */
 void grep_init(struct grep_opt *opt, struct repository *repo)
 {
-	*opt = grep_defaults;
+	struct grep_opt blank = GREP_OPT_INIT;
+	memcpy(opt, &blank, sizeof(*opt));
 
 	opt->repo = repo;
 	opt->pattern_tail = &opt->pattern_list;
diff --git a/grep.h b/grep.h
index 23a2a41d2c4..3112d1c2a38 100644
--- a/grep.h
+++ b/grep.h
@@ -179,6 +179,27 @@ struct grep_opt {
 	void *output_priv;
 };
 
+#define GREP_OPT_INIT { \
+	.relative = 1, \
+	.pathname = 1, \
+	.max_depth = -1, \
+	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
+	.colors = { \
+		[GREP_COLOR_CONTEXT] = "", \
+		[GREP_COLOR_FILENAME] = GIT_COLOR_MAGENTA, \
+		[GREP_COLOR_FUNCTION] = "", \
+		[GREP_COLOR_LINENO] = GIT_COLOR_GREEN, \
+		[GREP_COLOR_COLUMNNO] = GIT_COLOR_GREEN, \
+		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_SELECTED] = "", \
+		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
+	}, \
+	.only_matching = 0, \
+	.color = -1, \
+	.output = std_output, \
+}
+
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
-- 
2.35.0.894.g563b84683b9



* [PATCH v9 7/9] grep.h: make "grep_opt.pattern_type_option" use its enum
  2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                 ` (5 preceding siblings ...)
  2022-01-27 11:56               ` [PATCH v9 6/9] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2022-01-27 11:56               ` Ævar Arnfjörð Bjarmason
  2022-01-27 11:56               ` [PATCH v9 8/9] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
                                 ` (2 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-27 11:56 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change the "pattern_type_option" member of "struct grep_opt" to use
the enum type we use for it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grep.h b/grep.h
index 3112d1c2a38..89a2ce51130 100644
--- a/grep.h
+++ b/grep.h
@@ -164,7 +164,7 @@ struct grep_opt {
 	int funcname;
 	int funcbody;
 	int extended_regexp_option;
-	int pattern_type_option;
+	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
 	unsigned pre_context;
-- 
2.35.0.894.g563b84683b9



* [PATCH v9 8/9] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
  2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                 ` (6 preceding siblings ...)
  2022-01-27 11:56               ` [PATCH v9 7/9] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
@ 2022-01-27 11:56               ` Ævar Arnfjörð Bjarmason
  2022-01-27 11:56               ` [PATCH v9 9/9] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
  2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-27 11:56 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in compile_regexp() to check the cheaper boolean
"!opt->pcre2" condition before the "memchr()" search.

This doesn't noticeably optimize anything, but makes the code more
obvious and conventional. The line wrapping being added here also
makes a subsequent commit smaller.
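
To illustrate the evaluation order, here is a small sketch. The
helper names and the "scans" counter are hypothetical and exist only
to make the short-circuit observable; compile_regexp() itself calls
memchr() directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts how often the pattern was actually scanned. */
static int scans;

static int has_nul_byte(const char *pattern, size_t len)
{
	scans++;
	return memchr(pattern, 0, len) != NULL;
}

/*
 * Returns 1 if the pattern must be rejected. The cheap boolean is
 * tested first, so with pcre2 set the "&&" short-circuits and the
 * pattern is never scanned.
 */
static int must_reject(int pcre2, const char *pattern, size_t len)
{
	assert(pattern);
	return !pcre2 && has_nul_byte(pattern, len);
}
```

The result is the same with either operand order; only the number of
scans differs.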

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index 35e12e43c09..60228a95a4f 100644
--- a/grep.c
+++ b/grep.c
@@ -492,7 +492,8 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 	p->ignore_case = opt->ignore_case;
 	p->fixed = opt->fixed;
 
-	if (memchr(p->pattern, 0, p->patternlen) && !opt->pcre2)
+	if (!opt->pcre2 &&
+	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
 	p->is_fixed = is_fixed(p->pattern, p->patternlen);
-- 
2.35.0.894.g563b84683b9



* [PATCH v9 9/9] grep: simplify config parsing and option parsing
  2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                 ` (7 preceding siblings ...)
  2022-01-27 11:56               ` [PATCH v9 8/9] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
@ 2022-01-27 11:56               ` Ævar Arnfjörð Bjarmason
  2022-01-27 20:30                 ` Junio C Hamano
  2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  9 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-01-27 11:56 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Simplify the parsing of "grep.patternType" and
"grep.extendedRegexp". This changes no behavior, but gets rid of
complex parsing logic that isn't needed anymore.

When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
grep.patternType configuration setting, 2012-08-03) we promised that:

 1. You can set "grep.patternType", and "[setting it to] 'default'
    will return to the default matching behavior".

    In that context "the default" meant whatever the configuration
    system specified before that change, i.e. via grep.extendedRegexp.

 2. We'd support the existing "grep.extendedRegexp" option, but ignore
    it when the new "grep.patternType" option is set. We said we'd
    only ignore the older "grep.extendedRegexp" option "when the
    `grep.patternType` option is set to a value other than
    'default'".

In a preceding commit we changed grep_config() to be called after
grep_init(), which means that much of the complexity here can go
away.

As before both "grep.patternType" and "grep.extendedRegexp" are
last-one-wins variables, with "grep.extendedRegexp" yielding to
"grep.patternType", except when "grep.patternType=default".

Note that this applies as we parse the config, i.e. a sequence of:

    -c grep.patternType=perl \
    -c grep.extendedRegexp=true \
    -c grep.patternType=default

should select ERE due to "grep.extendedRegexp=true and
grep.patternType=default". We can determine this as we parse the
config, because:

 * If we see "grep.extendedRegexp" we set "extended_regexp_option" to
   its boolean value.

 * If we see "grep.extendedRegexp" but
   "grep.patternType=[default|<unset>]" is in effect we *don't* set
   the internal "pattern_type_option" to update the pattern type.

 * If we see "grep.patternType!=default" we can set our internal
   "pattern_type_option" directly, it doesn't matter what the state of
   "extended_regexp_option" is, but we don't forget what it was, in
   case we see a "grep.patternType=default" again.

 * If we see a "grep.patternType=default" we can set the pattern to
   ERE or BRE depending on whether we last saw a
   "grep.extendedRegexp=true" or
   "grep.extendedRegexp=[false|<unset>]".

With this change the "extended_regexp_option" member is only used
within grep_config(), and in the current codebase we could equally
track it as a "static" variable within that function; see [1] for a
version of this patch that did that. We're keeping it a struct member
to make that function reentrant, in case it ends up mattering in the
future.

The command-line parsing in cmd_grep() can then completely ignore
"grep.extendedRegexp". Whatever effect it had before that step won't
matter if we see -G, -E, -P etc.

See my 07a3d411739 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.

1. https://lore.kernel.org/git/patch-v8-09.10-c211bb0c69d-20220118T155211Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 10 +++----
 grep.c         | 76 +++++++++++---------------------------------------
 grep.h         |  4 +--
 revision.c     |  2 --
 4 files changed, 21 insertions(+), 71 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 0ea124321b6..942c4b25077 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -845,7 +845,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
 	int allow_revs;
 
 	struct option options[] = {
@@ -879,16 +878,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
 		OPT_GROUP(""),
-		OPT_SET_INT('E', "extended-regexp", &pattern_type_arg,
+		OPT_SET_INT('E', "extended-regexp", &opt.pattern_type_option,
 			    N_("use extended POSIX regular expressions"),
 			    GREP_PATTERN_TYPE_ERE),
-		OPT_SET_INT('G', "basic-regexp", &pattern_type_arg,
+		OPT_SET_INT('G', "basic-regexp", &opt.pattern_type_option,
 			    N_("use basic POSIX regular expressions (default)"),
 			    GREP_PATTERN_TYPE_BRE),
-		OPT_SET_INT('F', "fixed-strings", &pattern_type_arg,
+		OPT_SET_INT('F', "fixed-strings", &opt.pattern_type_option,
 			    N_("interpret patterns as fixed strings"),
 			    GREP_PATTERN_TYPE_FIXED),
-		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
+		OPT_SET_INT('P', "perl-regexp", &opt.pattern_type_option,
 			    N_("use Perl-compatible regular expressions"),
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
@@ -982,7 +981,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, grep_usage,
 			     PARSE_OPT_KEEP_DASHDASH |
 			     PARSE_OPT_STOP_AT_NON_OPTION);
-	grep_commit_pattern_type(pattern_type_arg, &opt);
 
 	if (use_index && !startup_info->have_repository) {
 		int fallback = 0;
diff --git a/grep.c b/grep.c
index 60228a95a4f..f07a21ff1aa 100644
--- a/grep.c
+++ b/grep.c
@@ -48,6 +48,12 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
 
 define_list_config_array_extra(color_grep_slots, {"match"});
 
+static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
+{
+	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
+		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
+}
+
 /*
  * Read the configuration file once and store it in
  * the grep_defaults template.
@@ -62,11 +68,17 @@ int grep_config(const char *var, const char *value, void *cb)
 
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
+		adjust_pattern_type(&opt->pattern_type_option,
+				    opt->extended_regexp_option);
 		return 0;
 	}
 
 	if (!strcmp(var, "grep.patterntype")) {
 		opt->pattern_type_option = parse_pattern_type_arg(var, value);
+		if (opt->extended_regexp_option == -1)
+			return 0;
+		adjust_pattern_type(&opt->pattern_type_option,
+				    opt->extended_regexp_option);
 		return 0;
 	}
 
@@ -115,62 +127,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo)
 	opt->header_tail = &opt->header_list;
 }
 
-static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	/*
-	 * When committing to the pattern type by setting the relevant
-	 * fields in grep_opt it's generally not necessary to zero out
-	 * the fields we're not choosing, since they won't have been
-	 * set by anything. The extended_regexp_option field is the
-	 * only exception to this.
-	 *
-	 * This is because in the process of parsing grep.patternType
-	 * & grep.extendedRegexp we set opt->pattern_type_option and
-	 * opt->extended_regexp_option, respectively. We then
-	 * internally use opt->extended_regexp_option to see if we're
-	 * compiling an ERE. It must be unset if that's not actually
-	 * the case.
-	 */
-	if (pattern_type != GREP_PATTERN_TYPE_ERE &&
-	    opt->extended_regexp_option)
-		opt->extended_regexp_option = 0;
-
-	switch (pattern_type) {
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		/* fall through */
-
-	case GREP_PATTERN_TYPE_BRE:
-		break;
-
-	case GREP_PATTERN_TYPE_ERE:
-		opt->extended_regexp_option = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_FIXED:
-		opt->fixed = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_PCRE:
-		opt->pcre2 = 1;
-		break;
-	}
-}
-
-void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(pattern_type, opt);
-	else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(opt->pattern_type_option, opt);
-	else if (opt->extended_regexp_option)
-		/*
-		 * This branch *must* happen after setting from the
-		 * opt->pattern_type_option above, we don't want
-		 * grep.extendedRegexp to override grep.patternType!
-		 */
-		grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
-}
-
 static struct grep_pat *create_grep_pat(const char *pat, size_t patlen,
 					const char *origin, int no,
 					enum grep_pat_token t,
@@ -490,9 +446,9 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 
 	p->word_regexp = opt->word_regexp;
 	p->ignore_case = opt->ignore_case;
-	p->fixed = opt->fixed;
+	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
 
-	if (!opt->pcre2 &&
+	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
 	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
@@ -544,14 +500,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre2) {
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
 		compile_pcre2_pattern(p, opt);
 		return;
 	}
 
 	if (p->ignore_case)
 		regflags |= REG_ICASE;
-	if (opt->extended_regexp_option)
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
 		regflags |= REG_EXTENDED;
 	err = regcomp(&p->regexp, p->pattern, regflags);
 	if (err) {
diff --git a/grep.h b/grep.h
index 89a2ce51130..f89324e9aa9 100644
--- a/grep.h
+++ b/grep.h
@@ -143,7 +143,6 @@ struct grep_opt {
 	int unmatch_name_only;
 	int count;
 	int word_regexp;
-	int fixed;
 	int all_match;
 	int no_body_match;
 	int body_hit;
@@ -154,7 +153,6 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre2;
 	int relative;
 	int pathname;
 	int null_following_name;
@@ -183,6 +181,7 @@ struct grep_opt {
 	.relative = 1, \
 	.pathname = 1, \
 	.max_depth = -1, \
+	.extended_regexp_option = -1, \
 	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
 	.colors = { \
 		[GREP_COLOR_CONTEXT] = "", \
@@ -202,7 +201,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
-void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
 void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index d6e0e2b23b5..dd301e30147 100644
--- a/revision.c
+++ b/revision.c
@@ -2860,8 +2860,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	diff_setup_done(&revs->diffopt);
 
-	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
-				 &revs->grep_filter);
 	if (!is_encoding_utf8(get_log_output_encoding()))
 		revs->grep_filter.ignore_locale = 1;
 	compile_grep_patterns(&revs->grep_filter);
-- 
2.35.0.894.g563b84683b9



* Re: [PATCH v9 9/9] grep: simplify config parsing and option parsing
  2022-01-27 11:56               ` [PATCH v9 9/9] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2022-01-27 20:30                 ` Junio C Hamano
  2022-01-27 21:35                   ` Junio C Hamano
  0 siblings, 1 reply; 151+ messages in thread
From: Junio C Hamano @ 2022-01-27 20:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> In a preceding commit we changed grep_config() to be called after
> grep_init(), which means that much of the complexity here can go
> away.
>
> As before both "grep.patternType" and "grep.extendedRegexp" are
> last-one-wins variable, with "grep.extendedRegexp" yielding to
> "grep.patternType", except when "grep.patternType=default".
>
> Note that this applies as we parse the config, i.e. a sequence of:
>
>     -c grep.patternType=perl \
>     -c grep.extendedRegexp=true \
>     -c grep.patternType=default
>
> should select ERE due to "grep.extendedRegexp=true and
> grep.patternType=default".

Correct.  The same reasoning would apply to:

	-c grep.extendedRegexp=true \
	-c grep.patternType=perl \
	-c grep.patternType=default

which should also select ERE due to "grep.extendedRegexp=true and
grep.patternType=default".  As we re-check grep.extendedRegexp every
time grep.patternType changes, and we keep extendedRegexp as a
separate copy without getting lost like earlier rounds of this
series, this should work.

Would this also work?

	-c grep.extendedRegexp=false \
	-c grep.patternType=default \
	-c grep.extendedRegexp=true

We do keep extendedRegexp, but as soon as we read .patternType that
is default, adjust_pattern_type() overwrites the pattern_type_option
member with BRE, and the fact that .patternType was specified as "do
whatever the .extendedRegexp says" is lost when we read the third
one.

So, no, I am not sure this is correct.
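
The problem can be reproduced with a small stand-alone model of the
callback logic in this patch. Everything here is a simplified,
hypothetical stand-in: the "demo_" names are not git's, the value
parser only knows a handful of spellings, and strcmp() against
"true" replaces git_config_bool().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; not git's real types or parser. */
enum demo_type {
	DEMO_UNSPECIFIED = 0,
	DEMO_BRE,
	DEMO_ERE,
	DEMO_FIXED,
	DEMO_PCRE
};

struct demo_opt {
	int ero;		/* grep.extendedRegexp; -1 is "not seen" */
	enum demo_type pto;	/* grep.patternType */
};

#define DEMO_OPT_INIT { .ero = -1, .pto = DEMO_UNSPECIFIED }

static enum demo_type demo_parse_type(const char *value)
{
	if (!strcmp(value, "basic"))
		return DEMO_BRE;
	if (!strcmp(value, "extended"))
		return DEMO_ERE;
	if (!strcmp(value, "fixed"))
		return DEMO_FIXED;
	if (!strcmp(value, "perl"))
		return DEMO_PCRE;
	return DEMO_UNSPECIFIED; /* "default" */
}

static void demo_adjust(enum demo_type *pto, int ero)
{
	if (*pto == DEMO_UNSPECIFIED)
		*pto = ero ? DEMO_ERE : DEMO_BRE;
}

/* Models the patch: commits to BRE/ERE while still parsing config. */
static void demo_eager_config(struct demo_opt *opt, const char *var,
			      const char *value)
{
	assert(opt);
	if (!strcmp(var, "grep.extendedregexp")) {
		opt->ero = !strcmp(value, "true");
		demo_adjust(&opt->pto, opt->ero);
	} else if (!strcmp(var, "grep.patterntype")) {
		opt->pto = demo_parse_type(value);
		if (opt->ero == -1)
			return;
		demo_adjust(&opt->pto, opt->ero);
	}
}
```

Feeding this model the three settings above in order leaves "pto" at
DEMO_BRE: the second call overwrites "default" with BRE, so the
third call finds nothing left to adjust. The earlier sequence
(true, perl, default) does end up at DEMO_ERE.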

> diff --git a/grep.c b/grep.c
> index 60228a95a4f..f07a21ff1aa 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -48,6 +48,12 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
>  
>  define_list_config_array_extra(color_grep_slots, {"match"});
>  
> +static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
> +{
> +	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
> +		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
> +}

OK.  Earlier rounds just replaced the UNSPECIFIED with hardcoded value "0",
which was more or less pointless.  I think this is easier to follow.

But as I said, "committing" ERE vs BRE in this manner is probably
way too early and produces an incorrect result.  Instead ...

> @@ -490,9 +446,9 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)

... this is the right place to do the "see if pattern_type_option is
'default' and if so use 'extended_regexp_option' to commit to either
BRE or ERE".

I guess that is what I have been repeating during the review of the
past few rounds.  Am I overlooking some other cases where that
simpler-to-explain approach does not work?

>  	p->word_regexp = opt->word_regexp;
>  	p->ignore_case = opt->ignore_case;
> -	p->fixed = opt->fixed;
> +	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
>  
> -	if (!opt->pcre2 &&
> +	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
>  	    memchr(p->pattern, 0, p->patternlen))
>  		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
>  
> @@ -544,14 +500,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
>  		return;
>  	}
>  
> -	if (opt->pcre2) {
> +	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
>  		compile_pcre2_pattern(p, opt);
>  		return;
>  	}
>  
>  	if (p->ignore_case)
>  		regflags |= REG_ICASE;
> -	if (opt->extended_regexp_option)
> +	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
>  		regflags |= REG_EXTENDED;
>  	err = regcomp(&p->regexp, p->pattern, regflags);
>  	if (err) {
> diff --git a/grep.h b/grep.h
> index 89a2ce51130..f89324e9aa9 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -143,7 +143,6 @@ struct grep_opt {
>  	int unmatch_name_only;
>  	int count;
>  	int word_regexp;
> -	int fixed;
>  	int all_match;
>  	int no_body_match;
>  	int body_hit;
> @@ -154,7 +153,6 @@ struct grep_opt {
>  	int allow_textconv;
>  	int extended;
>  	int use_reflog_filter;
> -	int pcre2;
>  	int relative;
>  	int pathname;
>  	int null_following_name;
> @@ -183,6 +181,7 @@ struct grep_opt {
>  	.relative = 1, \
>  	.pathname = 1, \
>  	.max_depth = -1, \
> +	.extended_regexp_option = -1, \

I do not think you meant this.  Uninitialized grep.extendedRegexp
defaults to 0 (BRE), I think.

>  	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
>  	.colors = { \
>  		[GREP_COLOR_CONTEXT] = "", \
> @@ -202,7 +201,6 @@ struct grep_opt {
>  
>  int grep_config(const char *var, const char *value, void *);
>  void grep_init(struct grep_opt *, struct repository *repo);
> -void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
>  
>  void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
>  void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
> diff --git a/revision.c b/revision.c
> index d6e0e2b23b5..dd301e30147 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -2860,8 +2860,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
>  
>  	diff_setup_done(&revs->diffopt);
>  
> -	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
> -				 &revs->grep_filter);
>  	if (!is_encoding_utf8(get_log_output_encoding()))
>  		revs->grep_filter.ignore_locale = 1;
>  	compile_grep_patterns(&revs->grep_filter);


* Re: [PATCH v9 9/9] grep: simplify config parsing and option parsing
  2022-01-27 20:30                 ` Junio C Hamano
@ 2022-01-27 21:35                   ` Junio C Hamano
  2022-01-27 21:39                     ` Junio C Hamano
  0 siblings, 1 reply; 151+ messages in thread
From: Junio C Hamano @ 2022-01-27 21:35 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Junio C Hamano <gitster@pobox.com> writes:

> Would this also work?
>
> 	-c grep.extendedRegexp=false \
> 	-c grep.patternType=default \
> 	-c grep.extendedRegexp=true
>
> We do keep extendedRegexp, but as soon as we read .patternType that
> is default, adjust_pattern_type() overwrites the pattern_type_option
> member with BRE, and the fact that .patternType was specified as "do
> whatever the .extendedRegexp says" is lost when we read the third
> one.
>
> So, no, I am not sure this is correct.
> ...
> But as I said, "committing" ERE vs BRE in this manner is probably
> way too early and produces an incorrect result.  Instead ...
>
>> @@ -490,9 +446,9 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
>
> ... this is the right place to do the "see if pattern_type_option is
> 'default' and if so use 'extended_regexp_option' to commit to either
> BRE or ERE".
>
> I guess that is what I have been repeating during the review of the
> past few rounds.  Am I overlooking some other cases where that
> simpler-to-explain approach does not work?
> ...
>>  	.max_depth = -1, \
>> +	.extended_regexp_option = -1, \
>
> I do not think you meant this.  Uninitialized grep.extendedRegexp
> defaults to 0 (BRE), I think.

Taking all together, here is a squashable fix with an additional
test.

In addition to squashing the following in, we must update the
proposed log message.  Given that this series is still making the
same mistake even after taking _this long_ (I think I have been
saying this since the review of the v5 iteration), the fact that the
code needs to read all the configuration variables before it can
correctly decide what type the user really means deserves to be
stressed in the log message.

Despite what the proposed log message for this round (and many other
previous iterations) claimed, it fundamentally cannot be done inside
the callback, simply because the callback will not know how many
more times it will be called with what value for grep.patternType
and grep.extendedRegexp.  It can be done anywhere after the option
parser has finished reading all the options and knows there will not
be any more grep.patternType and grep.extendedRegexp that would
affect the computation.  One of the most natural such places is at
the beginning of compile_regexp(), I would think.
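
As a rough sketch of that "record first, decide later" shape (with
hypothetical "demo_" names, a cut-down value parser, and strcmp()
against "true" standing in for git_config_bool()):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; not git's real types or parser. */
enum demo_type {
	DEMO_UNSPECIFIED = 0,
	DEMO_BRE,
	DEMO_ERE,
	DEMO_FIXED,
	DEMO_PCRE
};

struct demo_opt {
	int ero;		/* grep.extendedRegexp, defaults to false */
	enum demo_type pto;	/* grep.patternType */
};

#define DEMO_OPT_INIT { .ero = 0, .pto = DEMO_UNSPECIFIED }

static enum demo_type demo_parse_type(const char *value)
{
	if (!strcmp(value, "basic"))
		return DEMO_BRE;
	if (!strcmp(value, "extended"))
		return DEMO_ERE;
	if (!strcmp(value, "fixed"))
		return DEMO_FIXED;
	if (!strcmp(value, "perl"))
		return DEMO_PCRE;
	return DEMO_UNSPECIFIED; /* "default" */
}

/* The callback only records what it saw; last one wins per variable. */
static void demo_record_config(struct demo_opt *opt, const char *var,
			       const char *value)
{
	assert(opt);
	if (!strcmp(var, "grep.extendedregexp"))
		opt->ero = !strcmp(value, "true");
	else if (!strcmp(var, "grep.patterntype"))
		opt->pto = demo_parse_type(value);
}

/* Called once, after all config and options have been read. */
static enum demo_type demo_resolve_type(const struct demo_opt *opt)
{
	if (opt->pto != DEMO_UNSPECIFIED)
		return opt->pto;
	return opt->ero ? DEMO_ERE : DEMO_BRE;
}
```

Because nothing is decided until demo_resolve_type() runs, the order
in which the two variables appear no longer matters: the sequence
false, default, true resolves to DEMO_ERE, and with no configuration
at all the result is DEMO_BRE.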

Other than that, all the previous steps looked good, as did the
parts of this commit that the attached fix-up does not touch.  It is
great that we do not have to carry "fixed", "pcre2", etc. around as
separate members.

Thanks.

 grep.c          | 17 +++++------------
 grep.h          |  2 +-
 t/t7810-grep.sh | 10 ++++++++++
 3 files changed, 16 insertions(+), 13 deletions(-)

diff --git c/grep.c w/grep.c
index f07a21ff1a..a8f503f55c 100644
--- c/grep.c
+++ w/grep.c
@@ -48,12 +48,6 @@ static int parse_pattern_type_arg(const char *opt, const char *arg)
 
 define_list_config_array_extra(color_grep_slots, {"match"});
 
-static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
-{
-	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
-		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
-}
-
 /*
  * Read the configuration file once and store it in
  * the grep_defaults template.
@@ -68,17 +62,11 @@ int grep_config(const char *var, const char *value, void *cb)
 
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
-		adjust_pattern_type(&opt->pattern_type_option,
-				    opt->extended_regexp_option);
 		return 0;
 	}
 
 	if (!strcmp(var, "grep.patterntype")) {
 		opt->pattern_type_option = parse_pattern_type_arg(var, value);
-		if (opt->extended_regexp_option == -1)
-			return 0;
-		adjust_pattern_type(&opt->pattern_type_option,
-				    opt->extended_regexp_option);
 		return 0;
 	}
 
@@ -444,6 +432,11 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 	int err;
 	int regflags = REG_NEWLINE;
 
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_UNSPECIFIED)
+		opt->pattern_type_option = (opt->extended_regexp_option
+					    ? GREP_PATTERN_TYPE_ERE
+					    : GREP_PATTERN_TYPE_BRE);
+
 	p->word_regexp = opt->word_regexp;
 	p->ignore_case = opt->ignore_case;
 	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
diff --git c/grep.h w/grep.h
index f89324e9aa..bdc6765482 100644
--- c/grep.h
+++ w/grep.h
@@ -181,7 +181,7 @@ struct grep_opt {
 	.relative = 1, \
 	.pathname = 1, \
 	.max_depth = -1, \
-	.extended_regexp_option = -1, \
+	.extended_regexp_option = 0, \
 	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
 	.colors = { \
 		[GREP_COLOR_CONTEXT] = "", \
diff --git c/t/t7810-grep.sh w/t/t7810-grep.sh
index 34d8f69c1d..b818e656ad 100755
--- c/t/t7810-grep.sh
+++ w/t/t7810-grep.sh
@@ -491,6 +491,16 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.extendedRegexp=false \
+			-c grep.patternType=default \
+			-c grep.extendedRegexp=true \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
 		echo "${HC}ab:a+bc" >expected &&
 		git \


* Re: [PATCH v9 9/9] grep: simplify config parsing and option parsing
  2022-01-27 21:35                   ` Junio C Hamano
@ 2022-01-27 21:39                     ` Junio C Hamano
  0 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2022-01-27 21:39 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Junio C Hamano <gitster@pobox.com> writes:

> In addition to squashing the following in, we must update the
> proposed log message.  Given that, even after taking _this long_ (I
> think I have been saying this since the review of v5 iteration),

I found a reference:

    https://lore.kernel.org/git/xmqqv8zf6j86.fsf@gitster.g/

it indeed was [v5 7/7].

It is interesting that the breaking example I gave in the message is
exactly the one that I wrote (without going back to the archive) in
the "squashable fix" patch in the message I am responding to.


* [PATCH v10 0/9] grep: simplify & delete "init" & "config" code
  2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                 ` (8 preceding siblings ...)
  2022-01-27 11:56               ` [PATCH v9 9/9] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2022-02-04 21:20               ` Ævar Arnfjörð Bjarmason
  2022-02-04 21:20                 ` [PATCH v10 1/9] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
                                   ` (9 more replies)
  9 siblings, 10 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 21:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This series makes using the grep.[ch] API easier, by having it follow
the usual pattern of being initialized with:

    defaults -> config -> command-line

This is to make some follow-up work easier. It is a net code
deletion if we exclude newly added tests.
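
As a rough illustration of the "defaults -> config -> command-line"
layering, here is a minimal sketch. The "demo_opt" struct, the
"demo.ignorecase" variable and the helper names are all hypothetical
and are not part of the grep.[ch] API; they only show the order in
which each layer overrides the previous one.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical option struct; not git's "struct grep_opt". */
struct demo_opt {
	int max_depth;
	int ignore_case;
};

/* Step 1: compile-time defaults. */
#define DEMO_OPT_INIT { .max_depth = -1, .ignore_case = 0 }

/* Step 2: a config callback overrides the defaults. */
static int demo_config(const char *var, const char *value, void *cb)
{
	struct demo_opt *opt = cb;

	if (!strcmp(var, "demo.ignorecase"))
		opt->ignore_case = !strcmp(value, "true");
	return 0;
}

/* Step 3: command-line arguments override the config. */
static void demo_parse_args(struct demo_opt *opt, int argc, const char **argv)
{
	for (int i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "--ignore-case"))
			opt->ignore_case = 1;
		else if (!strcmp(argv[i], "--no-ignore-case"))
			opt->ignore_case = 0;
	}
}
```

A caller would initialize with DEMO_OPT_INIT, run the config callback
over every config entry, and only then parse the command line, so that
later layers always win over earlier ones.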

Changes since v9:

 * Integrated the proposed fixup from Junio's branch, and updated both
   the tests and commit messages accordingly. I changed "BRE" and
   "ERE" in the test descriptions to indicate what we're expecting to
   match as, and added the relevant tests from Junio's comments on an
   earlier round.

Junio: Sorry about the back & forth on this series. Each time I came
back to it, it had been a while, and between a large mail queue and
fixing some isolated issues I managed to lose the larger problem you
were pointing out. Thanks again!

Ævar Arnfjörð Bjarmason (9):
  grep.h: remove unused "regex_t regexp" from grep_opt
  log tests: check if grep_config() is called by "log"-like cmds
  grep tests: add missing "grep.patternType" config tests
  built-ins: trust the "prefix" from run_builtin()
  grep.c: don't pass along NULL callback value
  grep API: call grep_config() after grep_init()
  grep.h: make "grep_opt.pattern_type_option" use its enum
  grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
  grep: simplify config parsing and option parsing

 builtin/grep.c    |  27 +++++------
 builtin/log.c     |  13 +++++-
 builtin/ls-tree.c |   2 +-
 git.c             |   1 +
 grep.c            | 113 ++++++----------------------------------------
 grep.h            |  31 +++++++++----
 revision.c        |   4 +-
 t/t4202-log.sh    |  24 ++++++++++
 t/t7810-grep.sh   |  96 +++++++++++++++++++++++++++++++++++++++
 9 files changed, 185 insertions(+), 126 deletions(-)

Range-diff against v9:
 1:  b9372cde017 =  1:  184f7e0c5bd grep.h: remove unused "regex_t regexp" from grep_opt
 2:  30219a0ae9d =  2:  ac397cc6a18 log tests: check if grep_config() is called by "log"-like cmds
 3:  a75b288340b !  3:  3464c76cfd7 grep tests: add missing "grep.patternType" config tests
    @@ Commit message
         followed by a "grep.extendedRegexp=false" behaves as though
         "grep.extendedRegexp" wasn't provided.
     
    +    See [1] for the source of some of these tests, and their
    +    initial (pseudocode) implementation, and [2] for a later discussion
    +    about a breakage due to missing testing (which had been noted in [1]
    +    all along).
    +
    +    1. https://lore.kernel.org/git/xmqqv8zf6j86.fsf@gitster.g/
    +    2. https://lore.kernel.org/git/xmqqpmoczwtu.fsf@gitster.g/
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
    @@ t/t7810-grep.sh: do
     +		test_cmp expected actual
     +	'
     +
    -+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
    ++	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
     +		echo "${HC}ab:abc" >expected &&
     +		git \
     +			-c grep.patternType=fixed \
    @@ t/t7810-grep.sh: do
     +	'
     +
     +	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
    ++		echo "${HC}ab:abc" >expected &&
    ++		git \
    ++			-c grep.extendedRegexp=false \
    ++			-c grep.patternType=default \
    ++			-c grep.extendedRegexp=true \
    ++			grep "a+b*c" $H ab >actual &&
    ++		test_cmp expected actual
    ++	'
    ++
    ++	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
    ++		echo "${HC}ab:a+bc" >expected &&
    ++		git \
    ++			-c grep.extendedRegexp=true \
    ++			-c grep.extendedRegexp=false \
    ++			grep "a+b*c" $H ab >actual &&
    ++		test_cmp expected actual
    ++	'
    ++
    ++	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
    ++		echo "${HC}ab:abc" >expected &&
    ++		git \
    ++			-c grep.extendedRegexp=false \
    ++			-c grep.extendedRegexp=true \
    ++			-c grep.patternType=default \
    ++			grep "a+b*c" $H ab >actual &&
    ++		test_cmp expected actual
    ++	'
    ++
    ++	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
     +		echo "${HC}ab:a+bc" >expected &&
     +		git \
     +			-c grep.patternType=default \
    @@ t/t7810-grep.sh: do
     +			grep "a+b*c" $H ab >actual &&
     +		test_cmp expected actual
     +	'
    -+
     +	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
     +		echo "${HC}ab:a+bc" >expected &&
     +		git \
 4:  6e7f9730a7d =  4:  c6ada96298a built-ins: trust the "prefix" from run_builtin()
 5:  fbcfea84696 =  5:  1f09de53e07 grep.c: don't pass along NULL callback value
 6:  96c8cc7806e =  6:  ce646154538 grep API: call grep_config() after grep_init()
 7:  e09616056b4 =  7:  6446b4f0f33 grep.h: make "grep_opt.pattern_type_option" use its enum
 8:  61fc6a4dac8 =  8:  df8ba5aba68 grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
 9:  445467e87f7 !  9:  ccbdfa48315 grep: simplify config parsing and option parsing
    @@ Commit message
         last-one-wins variable, with "grep.extendedRegexp" yielding to
         "grep.patternType", except when "grep.patternType=default".
     
    -    Note that this applies as we parse the config, i.e. a sequence of:
    +    Note that as the previously added tests indicate this cannot be done
    +    on-the-fly as we see the config variables, without introducing more
    +    state keeping. I.e. if we see:
     
    -        -c grep.patternType=perl
    -        -c grep.extendedRegexp=true \
    +        -c grep.extendedRegexp=false
             -c grep.patternType=default
    +        -c grep.extendedRegexp=true
    +
    +    We need to select ERE, since grep.patternType=default unselects that
    +    variable, which normally has higher precedence, but we also need to
    +    select BRE in cases of:
     
    -    should select ERE due to "grep.extendedRegexp=true and
    -    grep.patternType=default". We can determine this as we parse the
    -    config, because:
    +        -c grep.extendedRegexp=true \
    +        -c grep.extendedRegexp=false
     
    -     * If we see "grep.extendedRegexp" we set "extended_regexp_option" to
    -       its boolean value.
    +    Which would not be the case for this, which selects ERE:
     
    -     * If we see "grep.extendedRegexp" but
    -       "grep.patternType=[default|<unset>]" is in effect we *don't* set
    -       the internal "pattern_type_option" to update the pattern type.
    +        -c grep.patternType=extended \
    +        -c grep.extendedRegexp=false
     
    -     * If we see "grep.patternType!=default" we can set our internal
    -       "pattern_type_option" directly, it doesn't matter what the state of
    -       "extended_regexp_option" is, but we don't forget what it was, in
    -       case we see a "grep.patternType=default" again.
    +    Therefore we cannot do this on-the-fly in grep_config without also
    +    introducing tracking variables for not only the pattern type, but what
    +    the source of that pattern type was.
     
    -     * If we see a "grep.patternType=default" we can set the pattern to
    -       ERE or BRE depending on whether we last saw a
    -       "grep.extendedRegexp=true" or
    -       "grep.extendedRegexp=[false|<unset>]".
    +    So we need to decide on the pattern after our config was fully
    +    parsed. Let's do that by deferring the decision on the pattern type
    +    until it's time to compile it in compile_regexp().
     
    -    With this change the "extended_regexp_option" member is only used
    -    within grep_config(), and in the current codebase we could equally
    -    track it as a "static" variable within that function, see [1] for a
    -    version for this patch that did that. We're keeping it a struct member
    -    to make that function reentrant, in case it ends up mattering in the
    -    future.
    +    By that time we've not only parsed the config, but also handled the
    +    command-line options. Those will set "opt.pattern_type_option" (*not*
    +    "opt.extended_regexp_option"!).
     
    -    The command-line parsing in cmd_grep() can then completely ignore
    -    "grep.extendedRegexp". Whatever effect it had before that step won't
    -    matter if we see -G, -E, -P etc.
    +    At that point all we need to do is see if "grep.patternType" was
    +    UNSPECIFIED in the end (including an explicit "=default"), if so we'll
    +    use the "grep.extendedRegexp" configuration, if any.
     
         See my 07a3d411739 (grep: remove regflags from the public grep_opt
         API, 2017-06-29) for addition of the two comments being removed here,
    @@ builtin/grep.c: int cmd_grep(int argc, const char **argv, const char *prefix)
      		int fallback = 0;
     
      ## grep.c ##
    -@@ grep.c: static int parse_pattern_type_arg(const char *opt, const char *arg)
    - 
    - define_list_config_array_extra(color_grep_slots, {"match"});
    - 
    -+static void adjust_pattern_type(enum grep_pattern_type *pto, const int ero)
    -+{
    -+	if (*pto == GREP_PATTERN_TYPE_UNSPECIFIED)
    -+		*pto = ero ? GREP_PATTERN_TYPE_ERE : GREP_PATTERN_TYPE_BRE;
    -+}
    -+
    - /*
    -  * Read the configuration file once and store it in
    -  * the grep_defaults template.
    -@@ grep.c: int grep_config(const char *var, const char *value, void *cb)
    - 
    - 	if (!strcmp(var, "grep.extendedregexp")) {
    - 		opt->extended_regexp_option = git_config_bool(var, value);
    -+		adjust_pattern_type(&opt->pattern_type_option,
    -+				    opt->extended_regexp_option);
    - 		return 0;
    - 	}
    - 
    - 	if (!strcmp(var, "grep.patterntype")) {
    - 		opt->pattern_type_option = parse_pattern_type_arg(var, value);
    -+		if (opt->extended_regexp_option == -1)
    -+			return 0;
    -+		adjust_pattern_type(&opt->pattern_type_option,
    -+				    opt->extended_regexp_option);
    - 		return 0;
    - 	}
    - 
     @@ grep.c: void grep_init(struct grep_opt *opt, struct repository *repo)
      	opt->header_tail = &opt->header_list;
      }
    @@ grep.c: void grep_init(struct grep_opt *opt, struct repository *repo)
      					const char *origin, int no,
      					enum grep_pat_token t,
     @@ grep.c: static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
    + 	int err;
    + 	int regflags = REG_NEWLINE;
      
    ++	if (opt->pattern_type_option == GREP_PATTERN_TYPE_UNSPECIFIED)
    ++		opt->pattern_type_option = (opt->extended_regexp_option
    ++					    ? GREP_PATTERN_TYPE_ERE
    ++					    : GREP_PATTERN_TYPE_BRE);
    ++
      	p->word_regexp = opt->word_regexp;
      	p->ignore_case = opt->ignore_case;
     -	p->fixed = opt->fixed;
    @@ grep.h: struct grep_opt {
      	int pathname;
      	int null_following_name;
     @@ grep.h: struct grep_opt {
    - 	.relative = 1, \
    - 	.pathname = 1, \
    - 	.max_depth = -1, \
    -+	.extended_regexp_option = -1, \
    - 	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
    - 	.colors = { \
    - 		[GREP_COLOR_CONTEXT] = "", \
    -@@ grep.h: struct grep_opt {
      
      int grep_config(const char *var, const char *value, void *);
      void grep_init(struct grep_opt *, struct repository *repo);
-- 
2.35.1.940.ge7a5b4b05f2



* [PATCH v10 1/9] grep.h: remove unused "regex_t regexp" from grep_opt
  2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
@ 2022-02-04 21:20                 ` Ævar Arnfjörð Bjarmason
  2022-02-04 21:20                 ` [PATCH v10 2/9] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
                                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 21:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This "regex_t" in grep_opt has not been used since
f9b9faf6f8a (builtin-grep: allow more than one patterns., 2006-05-02).
We still use a "regex_t" for compiling regexes, but that's in the
"grep_pat" struct.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.h b/grep.h
index 6a1f0ab0172..400172676a1 100644
--- a/grep.h
+++ b/grep.h
@@ -136,7 +136,6 @@ struct grep_opt {
 
 	const char *prefix;
 	int prefix_length;
-	regex_t regexp;
 	int linenum;
 	int columnnum;
 	int invert;
-- 
2.35.1.940.ge7a5b4b05f2



* [PATCH v10 2/9] log tests: check if grep_config() is called by "log"-like cmds
  2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2022-02-04 21:20                 ` [PATCH v10 1/9] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
@ 2022-02-04 21:20                 ` Ævar Arnfjörð Bjarmason
  2022-02-04 21:20                 ` [PATCH v10 3/9] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
                                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 21:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
for pattern style options & config, 2017-05-20) to check not only
whether "git log" handles "grep.patternType", but also "git show"
etc.

It's sufficient to check whether we match a "fixed" or a "basic" regex
here to see if these codepaths correctly invoked grep_config(). We
don't need to check the details of their regular expression matching
as the "log" test does.
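
To show why the "..." pattern distinguishes the two modes, here is a
small sketch using POSIX regcomp() and strstr(). The helper names are
hypothetical and this is not how grep.c does its matching; it only
demonstrates that "..." finds nothing as a fixed string in "(1|2)",
but matches any three characters as a basic regex.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Returns 1 if "pattern" occurs literally in "text" (fixed-string matching). */
static int match_fixed(const char *pattern, const char *text)
{
	return strstr(text, pattern) != NULL;
}

/* Returns 1 if "text" matches "pattern" as a POSIX basic regular expression. */
static int match_basic(const char *pattern, const char *text)
{
	regex_t re;
	int hit;

	/* Without REG_EXTENDED, regcomp() compiles a basic regex (BRE). */
	if (regcomp(&re, pattern, REG_NOSUB))
		return 0;
	hit = !regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return hit;
}
```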

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 50495598619..e775b378e4b 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -449,6 +449,30 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
 	)
 '
 
+for cmd in show whatchanged reflog format-patch
+do
+	case "$cmd" in
+	format-patch) myarg="HEAD~.." ;;
+	*) myarg= ;;
+	esac
+
+	test_expect_success "$cmd: understands grep.patternType, like 'log'" '
+		git init "pattern-type-$cmd" &&
+		(
+			cd "pattern-type-$cmd" &&
+			test_commit 1 file A &&
+			test_commit "(1|2)" file B 2 &&
+
+			git -c grep.patternType=fixed $cmd --grep="..." $myarg >actual &&
+			test_must_be_empty actual &&
+
+			git -c grep.patternType=basic $cmd --grep="..." $myarg >actual &&
+			test_file_not_empty actual
+		)
+	'
+done
+test_done
+
 test_expect_success 'log --author' '
 	cat >expect <<-\EOF &&
 	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>
-- 
2.35.1.940.ge7a5b4b05f2



* [PATCH v10 3/9] grep tests: add missing "grep.patternType" config tests
  2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2022-02-04 21:20                 ` [PATCH v10 1/9] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
  2022-02-04 21:20                 ` [PATCH v10 2/9] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2022-02-04 21:20                 ` Ævar Arnfjörð Bjarmason
  2022-02-04 23:03                   ` Junio C Hamano
  2022-02-04 23:24                   ` Junio C Hamano
  2022-02-04 21:20                 ` [PATCH v10 4/9] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
                                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 21:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the grep tests to assert that setting
"grep.patternType=extended" followed by "grep.patternType=default"
will behave as if "--basic-regexp" was provided, and not as
"--extended-regexp". In a subsequent commit we'll need to treat
"grep.patternType=default" as a special-case, but let's make sure we
ignore it if it's being set to "default" following an earlier
non-"default" "grep.patternType" setting.

Let's also test what happens when we have a sequence of "extended"
followed by "default" and "fixed". In that case the "fixed" should
prevail. Also add tests to check that a "grep.extendedRegexp=true"
followed by a "grep.extendedRegexp=false" behaves as though
"grep.extendedRegexp" wasn't provided.

See [1] for the source of some of these tests, and their
initial (pseudocode) implementation, and [2] for a later discussion
about a breakage due to missing testing (which had been noted in [1]
all along).

1. https://lore.kernel.org/git/xmqqv8zf6j86.fsf@gitster.g/
2. https://lore.kernel.org/git/xmqqpmoczwtu.fsf@gitster.g/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t7810-grep.sh | 96 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 96 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 424c31c3287..79884787da2 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -451,6 +451,93 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=basic \
+			-c grep.extendedRegexp=false \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins & defers to grep.patternType" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=extended \
+			-c grep.extendedRegexp=false \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.patternType=fixed \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.extendedRegexp=false \
+			-c grep.patternType=default \
+			-c grep.extendedRegexp=true \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.extendedRegexp=true \
+			-c grep.extendedRegexp=false \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.extendedRegexp=false \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=default \
+			-c grep.extendedRegexp=true \
+			-c grep.patternType=basic \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
+		echo "${HC}ab:a+bc" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L with grep.patternType=[extended -> default -> fixed]" '
+		echo "${HC}ab:a+b*c" >expected &&
+		git \
+			-c grep.patternType=extended \
+			-c grep.patternType=default \
+			-c grep.patternType=fixed \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
 		echo "${HC}ab:abc" >expected &&
 		git \
@@ -478,6 +565,15 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success "grep $L with grep.extendedRegexp=false and grep.patternType=extended" '
+		echo "${HC}ab:abc" >expected &&
+		git \
+			-c grep.extendedRegexp=false \
+			-c grep.patternType=extended \
+			grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+
 	test_expect_success "grep $L with grep.extendedRegexp=true and grep.patternType=basic" '
 		echo "${HC}ab:a+bc" >expected &&
 		git \
-- 
2.35.1.940.ge7a5b4b05f2



* [PATCH v10 4/9] built-ins: trust the "prefix" from run_builtin()
  2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                   ` (2 preceding siblings ...)
  2022-02-04 21:20                 ` [PATCH v10 3/9] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
@ 2022-02-04 21:20                 ` Ævar Arnfjörð Bjarmason
  2022-02-04 21:20                 ` [PATCH v10 5/9] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
                                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 21:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the
"prefix" passed from "run_builtin()". The "prefix" we get from setup.c
is either going to be NULL or a string of length >0, never "".

So we can drop the "prefix && *prefix" checks added for
"builtin/grep.c" in 0d042fecf2f (git-grep: show pathnames relative to
the current directory, 2006-08-11), and for "builtin/ls-tree.c" in
a69dd585fca (ls-tree: chomp leading directories when run from a
subdirectory, 2005-12-23).

As seen in code in revision.c that was added in cd676a51367 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that does away with this
assertion.

This makes it easier to reason about a change to the "prefix_length"
code in grep.c in a subsequent commit, and since we're going to the
trouble of doing that let's leave behind an assert() to promise this
to any future callers.

For "builtin/grep.c" it would be painful to pass the "prefix" down the
callchain of:

    cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
    grep_source_name

So for the code that needs it in grep_source_name() let's add a
"grep_prefix" variable similar to the existing "ls_tree_prefix".

While at it let's move the code in cmd_ls_tree() around so that we
assign to the "ls_tree_prefix" right after declaring the variables,
and stop assigning to "prefix". We only subsequently used that
variable later in the function after clobbering it. Let's just use our
own "grep_prefix" instead.

Let's also add an assert() in git.c, so that we'll make this promise
about the "prefix" to any current and future callers, as well as to
any readers of the code.
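
As a minimal sketch of what that promise buys callers, assuming a
hypothetical helper named prefix_len() (not a function in git.c or
grep.c): once the prefix is known to be either NULL or non-empty, a
plain NULL check is enough and the "&& *prefix" half can be dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of a prefix that is NULL or non-empty. */
static size_t prefix_len(const char *prefix)
{
	/* The invariant: "prefix" is never the empty string "". */
	assert(!prefix || *prefix);

	/* So checking for NULL alone is sufficient here. */
	return prefix ? strlen(prefix) : 0;
}
```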

Code history:

 * The strlen() in "grep.c" hasn't been used since 493b7a08d80 (grep:
   accept relative paths outside current working directory, 2009-09-05).

   When that code was added in 0d042fecf2f (git-grep: show pathnames
   relative to the current directory, 2006-08-11) we used the length.

   But since 493b7a08d80 we haven't used it for anything except a
   boolean check that we could have done on the "prefix" member
   itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c    | 13 ++++++++-----
 builtin/ls-tree.c |  2 +-
 git.c             |  1 +
 grep.c            |  4 +---
 grep.h            |  4 +---
 revision.c        |  2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 9e34a820ad4..d85cbabea67 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,6 +26,8 @@
 #include "object-store.h"
 #include "packfile.h"
 
+static const char *grep_prefix;
+
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
 	NULL
@@ -315,11 +317,11 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 	strbuf_reset(out);
 
 	if (opt->null_following_name) {
-		if (opt->relative && opt->prefix_length) {
+		if (opt->relative && grep_prefix) {
 			struct strbuf rel_buf = STRBUF_INIT;
 			const char *rel_name =
 				relative_path(filename + tree_name_len,
-					      opt->prefix, &rel_buf);
+					      grep_prefix, &rel_buf);
 
 			if (tree_name_len)
 				strbuf_add(out, filename, tree_name_len);
@@ -332,8 +334,8 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 		return;
 	}
 
-	if (opt->relative && opt->prefix_length)
-		quote_path(filename + tree_name_len, opt->prefix, out, 0);
+	if (opt->relative && grep_prefix)
+		quote_path(filename + tree_name_len, grep_prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
 
@@ -962,9 +964,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_END()
 	};
+	grep_prefix = prefix;
 
 	git_config(grep_cmd_config, NULL);
-	grep_init(&opt, the_repository, prefix);
+	grep_init(&opt, the_repository);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c71..6cb554cbb0a 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -150,7 +150,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 	ls_tree_prefix = prefix;
-	if (prefix && *prefix)
+	if (prefix)
 		chomp_prefix = strlen(prefix);
 
 	argc = parse_options(argc, argv, prefix, ls_tree_options,
diff --git a/git.c b/git.c
index edda922ce6d..9d257e092da 100644
--- a/git.c
+++ b/git.c
@@ -436,6 +436,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 	} else {
 		prefix = NULL;
 	}
+	assert(!prefix || *prefix);
 	precompose_argv_prefix(argc, argv, NULL);
 	if (use_pager == -1 && run_setup &&
 		!(p->option & DELAY_PAGER_CONFIG))
diff --git a/grep.c b/grep.c
index 7bb0360869a..8421dc55486 100644
--- a/grep.c
+++ b/grep.c
@@ -139,13 +139,11 @@ int grep_config(const char *var, const char *value, void *cb)
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
  */
-void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
+void grep_init(struct grep_opt *opt, struct repository *repo)
 {
 	*opt = grep_defaults;
 
 	opt->repo = repo;
-	opt->prefix = prefix;
-	opt->prefix_length = (prefix && *prefix) ? strlen(prefix) : 0;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
diff --git a/grep.h b/grep.h
index 400172676a1..23a2a41d2c4 100644
--- a/grep.h
+++ b/grep.h
@@ -134,8 +134,6 @@ struct grep_opt {
 	 */
 	struct repository *repo;
 
-	const char *prefix;
-	int prefix_length;
 	int linenum;
 	int columnnum;
 	int invert;
@@ -182,7 +180,7 @@ struct grep_opt {
 };
 
 int grep_config(const char *var, const char *value, void *);
-void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index ad4286fbdde..d6e0e2b23b5 100644
--- a/revision.c
+++ b/revision.c
@@ -1838,7 +1838,7 @@ void repo_init_revisions(struct repository *r,
 	revs->commit_format = CMIT_FMT_DEFAULT;
 	revs->expand_tabs_in_log_default = 8;
 
-	grep_init(&revs->grep_filter, revs->repo, prefix);
+	grep_init(&revs->grep_filter, revs->repo);
 	revs->grep_filter.status_only = 1;
 
 	repo_diff_setup(revs->repo, &revs->diffopt);
-- 
2.35.1.940.ge7a5b4b05f2


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v10 5/9] grep.c: don't pass along NULL callback value
  2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                   ` (3 preceding siblings ...)
  2022-02-04 21:20                 ` [PATCH v10 4/9] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
@ 2022-02-04 21:20                 ` Ævar Arnfjörð Bjarmason
  2022-02-04 21:20                 ` [PATCH v10 6/9] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
                                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 21:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change grep_cmd_config() to stop passing around the always-NULL "cb"
value. When this code was added in 7e8f59d577e (grep: color patterns
in output, 2009-03-07) it was non-NULL, but when that changed in
15fabd1bbd4 (builtin/grep.c: make configuration callback more
reusable, 2012-10-09) this code was left behind.

In a subsequent change I'll start using the "cb" value. Passing NULL
explicitly here makes it clear which of the functions we call need it,
and which don't.
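
As a rough illustration of the idea (not the code in this patch), the
sketch below uses made-up names: "struct demo_opt", color_config(),
threads_config() and wrapper_config() are all hypothetical. The wrapper
passes NULL to the callee that never looks at its callback data, and
forwards "cb" only to the one that does:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option struct; not git's "struct grep_opt". */
struct demo_opt {
	int threads;
};

/* Ignores its callback data, so callers can pass NULL. */
static int color_config(const char *var, const char *value, void *cb)
{
	(void)var;
	(void)value;
	(void)cb;
	return 0;
}

/* Needs the callback data: writes into the struct the caller passed. */
static int threads_config(const char *var, const char *value, void *cb)
{
	struct demo_opt *opt = cb;

	assert(opt);
	if (!strcmp(var, "demo.threads"))
		opt->threads = atoi(value);
	return 0;
}

/* Forwards "cb" only where it is used; NULL documents the other case. */
static int wrapper_config(const char *var, const char *value, void *cb)
{
	int st = threads_config(var, value, cb);

	if (color_config(var, value, NULL) < 0)
		st = -1;
	return st;
}
```

A caller would register wrapper_config() with a pointer to its own
"struct demo_opt" as the callback data.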

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index d85cbabea67..5ec4cecae45 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,8 +285,8 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, cb);
-	if (git_color_default_config(var, value, cb) < 0)
+	int st = grep_config(var, value, NULL);
+	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
 	if (!strcmp(var, "grep.threads")) {
-- 
2.35.1.940.ge7a5b4b05f2


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v10 6/9] grep API: call grep_config() after grep_init()
  2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                   ` (4 preceding siblings ...)
  2022-02-04 21:20                 ` [PATCH v10 5/9] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
@ 2022-02-04 21:20                 ` Ævar Arnfjörð Bjarmason
  2022-02-04 21:20                 ` [PATCH v10 7/9] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
                                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 21:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().

So we effectively (b) initialized config, (a) then defaults, (c)
followed by user options. Usually those are ordered as "a", "b" and
"c" instead.

As the comments being removed here show, the previous behavior needed
to be carefully explained, as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.

This does not change the behavior of any of the configuration
variables or options. That would have been the case if we didn't move
around the grep_config() call in "builtin/log.c". But now that we call
"grep_config" after "git_log_config" and "git_format_config" we'll
need to pass in the already initialized "struct grep_opt *".

See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e0 (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.

The memcpy() pattern here will be optimized away and follows the
convention of other *_init() functions. See 5726a6b4012 (*.c *_init():
define in terms of corresponding *_INIT macro, 2021-07-01).
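
The resulting "a", "b", "c" ordering can be sketched roughly as
follows. This is a minimal stand-alone example, not git's code:
"struct demo_opt", DEMO_OPT_INIT, demo_init() and demo_config() are
hypothetical stand-ins, and the boolean parsing is deliberately
simplified to a comparison against "true":

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct; git's real one lives in grep.h. */
struct demo_opt {
	int relative;
	int max_depth;
	int extended_regexp_option;
};

#define DEMO_OPT_INIT { \
	.relative = 1, \
	.max_depth = -1, \
}

/* (a) defaults: copy a blank, macro-initialized struct into "opt". */
static void demo_init(struct demo_opt *opt)
{
	struct demo_opt blank = DEMO_OPT_INIT;
	memcpy(opt, &blank, sizeof(*opt));
}

/* (b) config: modify the caller's already initialized struct via "cb". */
static int demo_config(const char *var, const char *value, void *cb)
{
	struct demo_opt *opt = cb;

	assert(opt);
	if (!strcmp(var, "demo.extendedregexp"))
		opt->extended_regexp_option = !strcmp(value, "true");
	return 0;
}
```

A caller runs demo_init() first, then feeds configuration through
demo_config() with a pointer to the same struct, and only then applies
(c) its command-line options. No state is shared between instances.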

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  4 ++--
 builtin/log.c  | 13 +++++++++++--
 grep.c         | 39 +++------------------------------------
 grep.h         | 21 +++++++++++++++++++++
 4 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5ec4cecae45..0ea124321b6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,7 +285,7 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, NULL);
+	int st = grep_config(var, value, cb);
 	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
@@ -966,8 +966,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	};
 	grep_prefix = prefix;
 
-	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, the_repository);
+	git_config(grep_cmd_config, &opt);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/log.c b/builtin/log.c
index 4b493408cc5..06283b37e7a 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -520,8 +520,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (grep_config(var, value, cb) < 0)
-		return -1;
 	if (git_gpg_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
@@ -536,6 +534,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.simplify_history = 0;
 	memset(&opt, 0, sizeof(opt));
@@ -650,6 +650,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 
 	memset(&match_all, 0, sizeof(match_all));
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.always_show_header = 1;
 	rev.no_walk = 1;
@@ -733,6 +735,8 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix)
 
 	repo_init_revisions(the_repository, &rev, prefix);
 	init_reflog_walk(&rev.reflog_info);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.verbose_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -766,6 +770,8 @@ int cmd_log(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.always_show_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -1848,10 +1854,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	extra_hdr.strdup_strings = 1;
 	extra_to.strdup_strings = 1;
 	extra_cc.strdup_strings = 1;
+
 	init_log_defaults();
 	init_display_notes(&notes_opt);
 	git_config(git_format_config, NULL);
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.show_notes = show_notes;
 	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
 	rev.commit_format = CMIT_FMT_EMAIL;
diff --git a/grep.c b/grep.c
index 8421dc55486..35e12e43c09 100644
--- a/grep.c
+++ b/grep.c
@@ -19,27 +19,6 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
-static struct grep_opt grep_defaults = {
-	.relative = 1,
-	.pathname = 1,
-	.max_depth = -1,
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
-	.colors = {
-		[GREP_COLOR_CONTEXT] = "",
-		[GREP_COLOR_FILENAME] = GIT_COLOR_MAGENTA,
-		[GREP_COLOR_FUNCTION] = "",
-		[GREP_COLOR_LINENO] = GIT_COLOR_GREEN,
-		[GREP_COLOR_COLUMNNO] = GIT_COLOR_GREEN,
-		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_SELECTED] = "",
-		[GREP_COLOR_SEP] = GIT_COLOR_CYAN,
-	},
-	.only_matching = 0,
-	.color = -1,
-	.output = std_output,
-};
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -75,20 +54,12 @@ define_list_config_array_extra(color_grep_slots, {"match"});
  */
 int grep_config(const char *var, const char *value, void *cb)
 {
-	struct grep_opt *opt = &grep_defaults;
+	struct grep_opt *opt = cb;
 	const char *slot;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	/*
-	 * The instance of grep_opt that we set up here is copied by
-	 * grep_init() to be used by each individual invocation.
-	 * When populating a new field of this structure here, be
-	 * sure to think about ownership -- e.g., you might need to
-	 * override the shallow copy in grep_init() with a deep copy.
-	 */
-
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
 		return 0;
@@ -134,14 +105,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/*
- * Initialize one instance of grep_opt and copy the
- * default values from the template we read the configuration
- * information in an earlier call to git_config(grep_config).
- */
 void grep_init(struct grep_opt *opt, struct repository *repo)
 {
-	*opt = grep_defaults;
+	struct grep_opt blank = GREP_OPT_INIT;
+	memcpy(opt, &blank, sizeof(*opt));
 
 	opt->repo = repo;
 	opt->pattern_tail = &opt->pattern_list;
diff --git a/grep.h b/grep.h
index 23a2a41d2c4..3112d1c2a38 100644
--- a/grep.h
+++ b/grep.h
@@ -179,6 +179,27 @@ struct grep_opt {
 	void *output_priv;
 };
 
+#define GREP_OPT_INIT { \
+	.relative = 1, \
+	.pathname = 1, \
+	.max_depth = -1, \
+	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
+	.colors = { \
+		[GREP_COLOR_CONTEXT] = "", \
+		[GREP_COLOR_FILENAME] = GIT_COLOR_MAGENTA, \
+		[GREP_COLOR_FUNCTION] = "", \
+		[GREP_COLOR_LINENO] = GIT_COLOR_GREEN, \
+		[GREP_COLOR_COLUMNNO] = GIT_COLOR_GREEN, \
+		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_SELECTED] = "", \
+		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
+	}, \
+	.only_matching = 0, \
+	.color = -1, \
+	.output = std_output, \
+}
+
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
-- 
2.35.1.940.ge7a5b4b05f2


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v10 7/9] grep.h: make "grep_opt.pattern_type_option" use its enum
  2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                   ` (5 preceding siblings ...)
  2022-02-04 21:20                 ` [PATCH v10 6/9] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2022-02-04 21:20                 ` Ævar Arnfjörð Bjarmason
  2022-02-04 21:20                 ` [PATCH v10 8/9] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
                                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 21:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change the "pattern_type_option" member of "struct grep_opt" to use
the enum type we use for it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grep.h b/grep.h
index 3112d1c2a38..89a2ce51130 100644
--- a/grep.h
+++ b/grep.h
@@ -164,7 +164,7 @@ struct grep_opt {
 	int funcname;
 	int funcbody;
 	int extended_regexp_option;
-	int pattern_type_option;
+	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
 	unsigned pre_context;
-- 
2.35.1.940.ge7a5b4b05f2


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v10 8/9] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
  2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                   ` (6 preceding siblings ...)
  2022-02-04 21:20                 ` [PATCH v10 7/9] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
@ 2022-02-04 21:20                 ` Ævar Arnfjörð Bjarmason
  2022-02-04 21:20                 ` [PATCH v10 9/9] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  9 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 21:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in compile_regexp() to check the cheaper boolean
"!opt->pcre2" condition before the "memchr()" search.

This doesn't noticeably optimize anything, but makes the code more
obvious and conventional. The line wrapping being added here also
makes a subsequent commit smaller.
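
For illustration only, here is the same ordering in a stand-alone
helper. The name rejects_nul_pattern() and its "pcre2" parameter are
hypothetical; the point is just that "&&" short-circuits, so the
memchr() scan is skipped whenever the flag test already fails:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if a pattern with an embedded NUL
 * byte must be rejected. "pcre2" stands in for the flag being checked.
 */
static int rejects_nul_pattern(int pcre2, const char *pattern, size_t len)
{
	assert(pattern);
	/* The flag test comes first, so memchr() only runs if pcre2 is 0. */
	return !pcre2 && memchr(pattern, 0, len) != NULL;
}
```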

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index 35e12e43c09..60228a95a4f 100644
--- a/grep.c
+++ b/grep.c
@@ -492,7 +492,8 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 	p->ignore_case = opt->ignore_case;
 	p->fixed = opt->fixed;
 
-	if (memchr(p->pattern, 0, p->patternlen) && !opt->pcre2)
+	if (!opt->pcre2 &&
+	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
 	p->is_fixed = is_fixed(p->pattern, p->patternlen);
-- 
2.35.1.940.ge7a5b4b05f2


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v10 9/9] grep: simplify config parsing and option parsing
  2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                   ` (7 preceding siblings ...)
  2022-02-04 21:20                 ` [PATCH v10 8/9] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
@ 2022-02-04 21:20                 ` Ævar Arnfjörð Bjarmason
  2022-02-04 23:41                   ` Junio C Hamano
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  9 siblings, 1 reply; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 21:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Simplify the parsing of "grep.patternType" and
"grep.extendedRegexp". This changes no behavior, but gets rid of
complex parsing logic that isn't needed anymore.

When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
grep.patternType configuration setting, 2012-08-03) we promised that:

 1. You can set "grep.patternType", and "[setting it to] 'default'
    will return to the default matching behavior".

    In that context "the default" meant whatever the configuration
    system specified before that change, i.e. via grep.extendedRegexp.

 2. We'd support the existing "grep.extendedRegexp" option, but ignore
    it when the new "grep.patternType" option is set. We said we'd
    only ignore the older "grep.extendedRegexp" option "when the
    `grep.patternType` option is set to a value other than
    'default'".

In a preceding commit we changed grep_config() to be called after
grep_init(), which means that much of the complexity here can go
away.

As before both "grep.patternType" and "grep.extendedRegexp" are
last-one-wins variables, with "grep.extendedRegexp" yielding to
"grep.patternType", except when "grep.patternType=default".

Note that as the previously added tests indicate this cannot be done
on-the-fly as we see the config variables, without introducing more
state keeping. I.e. if we see:

    -c grep.extendedRegexp=false \
    -c grep.patternType=default \
    -c grep.extendedRegexp=true

We need to select ERE, since grep.patternType=default unselects that
variable, which normally has higher precedence, but we also need to
select BRE in cases of:

    -c grep.extendedRegexp=true \
    -c grep.extendedRegexp=false

Which would not be the case for this, which selects ERE:

    -c grep.patternType=extended \
    -c grep.extendedRegexp=false

Therefore we cannot do this on-the-fly in grep_config without also
introducing tracking variables for not only the pattern type, but what
the source of that pattern type was.

So we need to decide on the pattern after our config was fully
parsed. Let's do that by deferring the decision on the pattern type
until it's time to compile it in compile_regexp().

By that time we've not only parsed the config, but also handled the
command-line options. Those will set "opt.pattern_type_option" (*not*
"opt.extended_regexp_option"!).

At that point all we need to do is see if "grep.patternType" was
UNSPECIFIED in the end (including an explicit "=default"), if so we'll
use the "grep.extendedRegexp" configuration, if any.
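
The deferred decision can be sketched in isolation as below. This is
an assumption-laden model rather than the code in this patch: the
"demo_" names are hypothetical, the config keys are assumed to arrive
lowercased, and boolean parsing is reduced to a comparison against
"true":

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names modeled on the description above. */
enum demo_pattern_type {
	DEMO_UNSPECIFIED = 0,
	DEMO_BRE,
	DEMO_ERE,
	DEMO_FIXED,
	DEMO_PCRE
};

struct demo_opt {
	enum demo_pattern_type pattern_type_option;
	int extended_regexp_option;
};

/* Each variable is last-one-wins on its own; nothing is decided here. */
static int demo_config(const char *var, const char *value, void *cb)
{
	struct demo_opt *opt = cb;

	assert(opt);
	if (!strcmp(var, "grep.extendedregexp")) {
		opt->extended_regexp_option = !strcmp(value, "true");
	} else if (!strcmp(var, "grep.patterntype")) {
		if (!strcmp(value, "default"))
			opt->pattern_type_option = DEMO_UNSPECIFIED;
		else if (!strcmp(value, "basic"))
			opt->pattern_type_option = DEMO_BRE;
		else if (!strcmp(value, "extended"))
			opt->pattern_type_option = DEMO_ERE;
		else if (!strcmp(value, "fixed"))
			opt->pattern_type_option = DEMO_FIXED;
		else if (!strcmp(value, "perl"))
			opt->pattern_type_option = DEMO_PCRE;
		else
			return -1;
	}
	return 0;
}

/* Decide late, once config and command-line options have been seen. */
static enum demo_pattern_type demo_resolve(const struct demo_opt *opt)
{
	if (opt->pattern_type_option != DEMO_UNSPECIFIED)
		return opt->pattern_type_option;
	return opt->extended_regexp_option ? DEMO_ERE : DEMO_BRE;
}
```

Feeding the three config sequences shown earlier through demo_config()
and then calling demo_resolve() yields ERE, BRE and ERE respectively,
without tracking where each value came from.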

See my 07a3d411739 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.

1. https://lore.kernel.org/git/patch-v8-09.10-c211bb0c69d-20220118T155211Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 10 +++-----
 grep.c         | 69 +++++++-------------------------------------------
 grep.h         |  3 ---
 revision.c     |  2 --
 4 files changed, 13 insertions(+), 71 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 0ea124321b6..942c4b25077 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -845,7 +845,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
 	int allow_revs;
 
 	struct option options[] = {
@@ -879,16 +878,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
 		OPT_GROUP(""),
-		OPT_SET_INT('E', "extended-regexp", &pattern_type_arg,
+		OPT_SET_INT('E', "extended-regexp", &opt.pattern_type_option,
 			    N_("use extended POSIX regular expressions"),
 			    GREP_PATTERN_TYPE_ERE),
-		OPT_SET_INT('G', "basic-regexp", &pattern_type_arg,
+		OPT_SET_INT('G', "basic-regexp", &opt.pattern_type_option,
 			    N_("use basic POSIX regular expressions (default)"),
 			    GREP_PATTERN_TYPE_BRE),
-		OPT_SET_INT('F', "fixed-strings", &pattern_type_arg,
+		OPT_SET_INT('F', "fixed-strings", &opt.pattern_type_option,
 			    N_("interpret patterns as fixed strings"),
 			    GREP_PATTERN_TYPE_FIXED),
-		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
+		OPT_SET_INT('P', "perl-regexp", &opt.pattern_type_option,
 			    N_("use Perl-compatible regular expressions"),
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
@@ -982,7 +981,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, grep_usage,
 			     PARSE_OPT_KEEP_DASHDASH |
 			     PARSE_OPT_STOP_AT_NON_OPTION);
-	grep_commit_pattern_type(pattern_type_arg, &opt);
 
 	if (use_index && !startup_info->have_repository) {
 		int fallback = 0;
diff --git a/grep.c b/grep.c
index 60228a95a4f..a8f503f55c7 100644
--- a/grep.c
+++ b/grep.c
@@ -115,62 +115,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo)
 	opt->header_tail = &opt->header_list;
 }
 
-static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	/*
-	 * When committing to the pattern type by setting the relevant
-	 * fields in grep_opt it's generally not necessary to zero out
-	 * the fields we're not choosing, since they won't have been
-	 * set by anything. The extended_regexp_option field is the
-	 * only exception to this.
-	 *
-	 * This is because in the process of parsing grep.patternType
-	 * & grep.extendedRegexp we set opt->pattern_type_option and
-	 * opt->extended_regexp_option, respectively. We then
-	 * internally use opt->extended_regexp_option to see if we're
-	 * compiling an ERE. It must be unset if that's not actually
-	 * the case.
-	 */
-	if (pattern_type != GREP_PATTERN_TYPE_ERE &&
-	    opt->extended_regexp_option)
-		opt->extended_regexp_option = 0;
-
-	switch (pattern_type) {
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		/* fall through */
-
-	case GREP_PATTERN_TYPE_BRE:
-		break;
-
-	case GREP_PATTERN_TYPE_ERE:
-		opt->extended_regexp_option = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_FIXED:
-		opt->fixed = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_PCRE:
-		opt->pcre2 = 1;
-		break;
-	}
-}
-
-void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(pattern_type, opt);
-	else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(opt->pattern_type_option, opt);
-	else if (opt->extended_regexp_option)
-		/*
-		 * This branch *must* happen after setting from the
-		 * opt->pattern_type_option above, we don't want
-		 * grep.extendedRegexp to override grep.patternType!
-		 */
-		grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
-}
-
 static struct grep_pat *create_grep_pat(const char *pat, size_t patlen,
 					const char *origin, int no,
 					enum grep_pat_token t,
@@ -488,11 +432,16 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 	int err;
 	int regflags = REG_NEWLINE;
 
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_UNSPECIFIED)
+		opt->pattern_type_option = (opt->extended_regexp_option
+					    ? GREP_PATTERN_TYPE_ERE
+					    : GREP_PATTERN_TYPE_BRE);
+
 	p->word_regexp = opt->word_regexp;
 	p->ignore_case = opt->ignore_case;
-	p->fixed = opt->fixed;
+	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
 
-	if (!opt->pcre2 &&
+	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
 	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
@@ -544,14 +493,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre2) {
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
 		compile_pcre2_pattern(p, opt);
 		return;
 	}
 
 	if (p->ignore_case)
 		regflags |= REG_ICASE;
-	if (opt->extended_regexp_option)
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
 		regflags |= REG_EXTENDED;
 	err = regcomp(&p->regexp, p->pattern, regflags);
 	if (err) {
diff --git a/grep.h b/grep.h
index 89a2ce51130..c722d25ed9d 100644
--- a/grep.h
+++ b/grep.h
@@ -143,7 +143,6 @@ struct grep_opt {
 	int unmatch_name_only;
 	int count;
 	int word_regexp;
-	int fixed;
 	int all_match;
 	int no_body_match;
 	int body_hit;
@@ -154,7 +153,6 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre2;
 	int relative;
 	int pathname;
 	int null_following_name;
@@ -202,7 +200,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
-void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
 void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index d6e0e2b23b5..dd301e30147 100644
--- a/revision.c
+++ b/revision.c
@@ -2860,8 +2860,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	diff_setup_done(&revs->diffopt);
 
-	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
-				 &revs->grep_filter);
 	if (!is_encoding_utf8(get_log_output_encoding()))
 		revs->grep_filter.ignore_locale = 1;
 	compile_grep_patterns(&revs->grep_filter);
-- 
2.35.1.940.ge7a5b4b05f2


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: [PATCH v10 3/9] grep tests: add missing "grep.patternType" config tests
  2022-02-04 21:20                 ` [PATCH v10 3/9] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
@ 2022-02-04 23:03                   ` Junio C Hamano
  2022-02-04 23:24                   ` Junio C Hamano
  1 sibling, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2022-02-04 23:03 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> +	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
> +		echo "${HC}ab:a+bc" >expected &&
> +		git \
> +			-c grep.patternType=default \
> +			-c grep.extendedRegexp=true \
> +			-c grep.patternType=basic \
> +			grep "a+b*c" $H ab >actual &&
> +		test_cmp expected actual
> +	'
> +	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '

Missing blank line between tests.

Other than that, this step looks good.

No need to resend; I'll fix this part up locally again.

Thanks.

^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: [PATCH v10 3/9] grep tests: add missing "grep.patternType" config tests
  2022-02-04 21:20                 ` [PATCH v10 3/9] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
  2022-02-04 23:03                   ` Junio C Hamano
@ 2022-02-04 23:24                   ` Junio C Hamano
  1 sibling, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2022-02-04 23:24 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> +	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins" '

The statement is true, but ...

> +		echo "${HC}ab:a+bc" >expected &&
> +		git \
> +			-c grep.extendedRegexp=true \
> +			-c grep.patternType=basic \
> +			-c grep.extendedRegexp=false \

... I do not think that is what is tested here.  It checks that
the value of grep.extendedRegexp is irrelevant when grep.patternType
is anything but default, no?  In other words, if you lose the last
one that overrides the earlier .extendedRegexp=true, this should
still pass, no?

> +			grep "a+b*c" $H ab >actual &&
> +		test_cmp expected actual
> +	'
> +
> +	test_expect_success "grep $L with grep.extendedRegexp is last-one-wins & defers to grep.patternType" '
> +		echo "${HC}ab:abc" >expected &&
> +		git \
> +			-c grep.extendedRegexp=true \
> +			-c grep.patternType=extended \
> +			-c grep.extendedRegexp=false \

Likewise.  grep.extendedRegexp being last-one-wins has no relevance
to this result, just like the previous one.  I would understand:

	grep $L with grep.patternType=extended pays no attention to grep.extendedRegexp

though.

> +			grep "a+b*c" $H ab >actual &&
> +		test_cmp expected actual
> +	'
> +
> +	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
> +		echo "${HC}ab:abc" >expected &&
> +		git \
> +			-c grep.patternType=fixed \
> +			-c grep.extendedRegexp=true \
> +			-c grep.patternType=default \
> +			grep "a+b*c" $H ab >actual &&

.patternType=default lets .extendedRegexp decide, and we get ERE.

This does have a correct name and tests the right thing (but
grep.extendedRegexp is set only once in this, so "last-one-wins" is
technically correct but may not be a very useful thing to stress
;-).

> +		test_cmp expected actual
> +	'
> +
> +	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (ERE)" '
> +		echo "${HC}ab:abc" >expected &&
> +		git \
> +			-c grep.extendedRegexp=false \
> +			-c grep.patternType=default \
> +			-c grep.extendedRegexp=true \

.patternType=default lets .extendedRegexp decide, and we get ERE.

Future readers might wonder why we are testing the same thing
again, without enough imagination to guess what faulty implementation
would make this test necessary ;-).

> +			grep "a+b*c" $H ab >actual &&
> +		test_cmp expected actual
> +	'
> +
> +	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
> +		echo "${HC}ab:a+bc" >expected &&
> +		git \
> +			-c grep.extendedRegexp=true \
> +			-c grep.extendedRegexp=false \
> +			grep "a+b*c" $H ab >actual &&

.patternType=default that is implicit is the "last" value and lets
.extendedRegexp decide, and we get BRE.

> +		test_cmp expected actual
> +	'
> +
> +	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
> +		echo "${HC}ab:abc" >expected &&
> +		git \
> +			-c grep.extendedRegexp=false \
> +			-c grep.extendedRegexp=true \
> +			-c grep.patternType=default \

Similar.  As the default for .patternType is the default anyway,
even an implementation that commits the choice too early would
get this right, as it assumes that .patternType is the default when
it sees true given to the .extendedRegexp variable.

> +			grep "a+b*c" $H ab >actual &&
> +		test_cmp expected actual
> +	'
> +
> +	test_expect_success "grep $L with grep.extendedRegexp and grep.patternType are both last-one-wins independently (BRE)" '
> +		echo "${HC}ab:a+bc" >expected &&
> +		git \
> +			-c grep.patternType=default \
> +			-c grep.extendedRegexp=true \
> +			-c grep.patternType=basic \

The last value for .patternType not being "default" makes it the
only thing that matters.  This one also tests the right thing.

> +			grep "a+b*c" $H ab >actual &&
> +		test_cmp expected actual
> +	'
> +	test_expect_success "grep $L with grep.patternType=extended and grep.patternType=default" '
> +		echo "${HC}ab:a+bc" >expected &&
> +		git \
> +			-c grep.patternType=extended \
> +			-c grep.patternType=default \

The last value for .patternType being "default" lets .extendedRegexp
decide, but .extendedRegexp is left to its default, which is
false, so we should get BRE.

> +			grep "a+b*c" $H ab >actual &&
> +		test_cmp expected actual
> +	'
> +
> +	test_expect_success "grep $L with grep.patternType=[extended -> default -> fixed]" '
> +		echo "${HC}ab:a+b*c" >expected &&
> +		git \
> +			-c grep.patternType=extended \
> +			-c grep.patternType=default \
> +			-c grep.patternType=fixed \
> +			grep "a+b*c" $H ab >actual &&

OK.

> +		test_cmp expected actual
> +	'
> +
>  	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
>  		echo "${HC}ab:abc" >expected &&
>  		git \
> @@ -478,6 +565,15 @@ do
>  		test_cmp expected actual
>  	'
>  
> +	test_expect_success "grep $L with grep.extendedRegexp=false and grep.patternType=default" '
> +		echo "${HC}ab:abc" >expected &&
> +		git \
> +			-c grep.extendedRegexp=false \
> +			-c grep.patternType=extended \

Again, .patternType not being "default" makes .extendedRegexp not
matter at all, and we get ERE.

> +			grep "a+b*c" $H ab >actual &&
> +		test_cmp expected actual
> +	'
> +
>  	test_expect_success "grep $L with grep.extendedRegexp=true and grep.patternType=basic" '
>  		echo "${HC}ab:a+bc" >expected &&
>  		git \

All tests look correct.  The first two had labels that missed the
point of what is being tested, and those need fixing.  The other
tests were correct.  I know that all of them may have helped to catch
mistakes made in past iterations of this series, but without knowing
the past mistakes, new readers may not get exactly the point of these
tests.
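
For readers who have not followed the earlier iterations, here is a
rough sketch of why the three expected strings in these tests differ.
It uses the system grep as a stand-in for "git grep" and assumes the
"ab" test file holds the three lines shown; match_as and sample are
hypothetical names, not part of the test suite.

```shell
#!/bin/sh
# Assumed fixture: these three lines stand in for the "ab" test file.
sample () {
	printf '%s\n' 'a+b*c' 'a+bc' 'abc'
}

# Hypothetical helper: filter stdin with the pattern read as BRE, ERE
# or a fixed string, using only POSIX grep options.
match_as () {
	kind=$1
	pattern=$2
	case "$kind" in
	BRE) grep -e "$pattern" ;;
	ERE) grep -E -e "$pattern" ;;
	FIX) grep -F -e "$pattern" ;;
	*)
		echo "unknown pattern type '$kind'" >&2
		return 1
		;;
	esac
}

for t in BRE ERE FIX
do
	printf '%s: %s\n' "$t" "$(sample | match_as "$t" 'a+b*c')"
done
```

With a POSIX grep this should select "a+bc" for BRE (where "+" is a
literal), "abc" for ERE (where "+" is a quantifier) and "a+b*c" for a
fixed string, which is what lets a single pattern tell the three
types apart.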

Thanks.

^ permalink raw reply	[flat|nested] 151+ messages in thread

* Re: [PATCH v10 9/9] grep: simplify config parsing and option parsing
  2022-02-04 21:20                 ` [PATCH v10 9/9] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2022-02-04 23:41                   ` Junio C Hamano
  0 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2022-02-04 23:41 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Note that as the previously added tests indicate this cannot be done
> on-the-fly as we see the config variables, without introducing more
> state keeping. I.e. if we see:
>
>     -c grep.extendedRegexp=false
>     -c grep.patternType=default
>     -c grep.extendedRegexp=true
>
> We need to select ERE, since grep.patternType=default unselects that
> variable, which normally has higher precedence, but we also need to
> select BRE in cases of:
>
>     -c grep.extendedRegexp=true \
>     -c grep.extendedRegexp=false
>
> Which would not be the case for this, which selects ERE:
>
>     -c grep.patternType=extended \
>     -c grep.extendedRegexp=false

I think the latter two examples can lose the backslash at the end
(and all of them can lose "-c").  We can rewrite the preamble of the
first one to clarify what we are trying to say with this notation,
perhaps like

	I.e. if we see these configuration variable definitions in
	this order:

> -static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
> -{
> -	/*
> -	 * When committing to the pattern type by setting the relevant
> -	 * fields in grep_opt it's generally not necessary to zero out
> -	 * the fields we're not choosing, since they won't have been
> -	 * set by anything. The extended_regexp_option field is the
> -	 * only exception to this.
> -	 *
> -	 * This is because in the process of parsing grep.patternType
> -	 * & grep.extendedRegexp we set opt->pattern_type_option and
> -	 * opt->extended_regexp_option, respectively. We then
> -	 * internally use opt->extended_regexp_option to see if we're
> -	 * compiling an ERE. It must be unset if that's not actually
> -	 * the case.
> -	 */
> -	if (pattern_type != GREP_PATTERN_TYPE_ERE &&
> -	    opt->extended_regexp_option)
> -		opt->extended_regexp_option = 0;
> -
> -	switch (pattern_type) {
> -	case GREP_PATTERN_TYPE_UNSPECIFIED:
> -		/* fall through */
> -
> -	case GREP_PATTERN_TYPE_BRE:
> -		break;
> -
> -	case GREP_PATTERN_TYPE_ERE:
> -		opt->extended_regexp_option = 1;
> -		break;
> -
> -	case GREP_PATTERN_TYPE_FIXED:
> -		opt->fixed = 1;
> -		break;
> -
> -	case GREP_PATTERN_TYPE_PCRE:
> -		opt->pcre2 = 1;
> -		break;
> -	}
> -}
>
> -void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt)
> -{
> -	if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
> -		grep_set_pattern_type_option(pattern_type, opt);
> -	else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
> -		grep_set_pattern_type_option(opt->pattern_type_option, opt);
> -	else if (opt->extended_regexp_option)
> -		/*
> -		 * This branch *must* happen after setting from the
> -		 * opt->pattern_type_option above, we don't want
> -		 * grep.extendedRegexp to override grep.patternType!
> -		 */
> -		grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
> -}

It is great that we can lose this, together with the associated
fields like fixed and pcre2.

> @@ -488,11 +432,16 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
>  	int err;
>  	int regflags = REG_NEWLINE;
>  
> +	if (opt->pattern_type_option == GREP_PATTERN_TYPE_UNSPECIFIED)
> +		opt->pattern_type_option = (opt->extended_regexp_option
> +					    ? GREP_PATTERN_TYPE_ERE
> +					    : GREP_PATTERN_TYPE_BRE);

It is nice that we can forget about the .extended_regexp_option member
after this point, and .pattern_type_option will be the only thing
that matters.
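
The rule implied by the hunk above can be modeled in a few lines of
shell.  This is only a sketch, not the C implementation:
resolve_pattern_type is a hypothetical helper, and it assumes that
each variable is last-one-wins on its own, that only the literal
"true" enables grep.extendedRegexp, and that the choice is made only
after every setting has been read.

```shell
#!/bin/sh
# Hypothetical model: resolve the effective pattern type from an
# ordered list of "key=value" settings.
resolve_pattern_type () {
	pattern_type=default	# last value seen for grep.patternType
	extended=false		# last value seen for grep.extendedRegexp

	for setting in "$@"
	do
		# Config keys are compared case-insensitively here.
		key=$(printf '%s' "${setting%%=*}" | tr '[:upper:]' '[:lower:]')
		value=${setting#*=}
		case "$key" in
		grep.patterntype) pattern_type=$value ;;
		grep.extendedregexp) extended=$value ;;
		esac
	done

	# Decide only now, after all settings have been recorded.
	case "$pattern_type" in
	basic) echo BRE ;;
	extended) echo ERE ;;
	fixed) echo FIX ;;
	perl) echo PCRE ;;
	default)
		if test "$extended" = true
		then
			echo ERE
		else
			echo BRE
		fi
		;;
	*)
		echo "unknown pattern type '$pattern_type'" >&2
		return 1
		;;
	esac
}

resolve_pattern_type grep.extendedRegexp=true grep.extendedRegexp=false
resolve_pattern_type grep.patternType=fixed grep.extendedRegexp=true \
	grep.patternType=default
resolve_pattern_type grep.extendedRegexp=false grep.patternType=extended
```

Because the decision is deferred to the end, the model does not need
any extra state to handle a "default" that arrives after an earlier
non-"default" value.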

>  	p->word_regexp = opt->word_regexp;
>  	p->ignore_case = opt->ignore_case;
> -	p->fixed = opt->fixed;
> +	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;

This makes readers wonder if we can further lose members in p
(specifically .fixed), but cleaning up the grep_opt members is
already great progress.

Looking good.  Other than minor tweaks I mentioned on the proposed
log message, I didn't see anything wrong in this version.

Thanks.

^ permalink raw reply	[flat|nested] 151+ messages in thread

* [PATCH v11 00/10] grep: simplify & delete "init" & "config" code
  2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                   ` (8 preceding siblings ...)
  2022-02-04 21:20                 ` [PATCH v10 9/9] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2022-02-16  0:00                 ` Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 01/10] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
                                     ` (10 more replies)
  9 siblings, 11 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  0:00 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This series makes using the grep.[ch] API easier, by having it follow
the usual pattern of being initialized with:

    defaults -> config -> command-line

This is to make some follow-up work easier.  It is a net code
deletion if we exclude newly added tests.
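
As a rough illustration of that layering (a sketch only, with a
hypothetical effective_setting helper that is not part of git), each
later layer overrides the earlier one only when it actually sets a
value:

```shell
#!/bin/sh
# Hypothetical helper: pick the effective value from three layers.
# An empty argument means "this layer did not set anything".
effective_setting () {
	builtin_default=$1
	from_config=$2
	from_cmdline=$3
	printf '%s\n' "${from_cmdline:-${from_config:-$builtin_default}}"
}

effective_setting basic "" ""		# only the built-in default
effective_setting basic extended ""	# config overrides the default
effective_setting basic extended fixed	# command line overrides config
```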

Changes since v10:

  * A new 3/10 and 4/10 hopefully address the comments about the test
    code. I ended up just adding a helper to reduce the existing and
    new verbosity of the tests, which should make it easier to reason
    about them.

Ævar Arnfjörð Bjarmason (10):
  grep.h: remove unused "regex_t regexp" from grep_opt
  log tests: check if grep_config() is called by "log"-like cmds
  grep tests: create a helper function for "BRE" or "ERE"
  grep tests: add missing "grep.patternType" config tests
  built-ins: trust the "prefix" from run_builtin()
  grep.c: don't pass along NULL callback value
  grep API: call grep_config() after grep_init()
  grep.h: make "grep_opt.pattern_type_option" use its enum
  grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
  grep: simplify config parsing and option parsing

 builtin/grep.c    |  27 +++----
 builtin/log.c     |  13 +++-
 builtin/ls-tree.c |   2 +-
 git.c             |   1 +
 grep.c            | 113 ++++------------------------
 grep.h            |  31 ++++++--
 revision.c        |   4 +-
 t/t4202-log.sh    |  24 ++++++
 t/t7810-grep.sh   | 186 ++++++++++++++++++++++++++--------------------
 9 files changed, 195 insertions(+), 206 deletions(-)

Range-diff against v10:
 1:  184f7e0c5bd =  1:  67af9123727 grep.h: remove unused "regex_t regexp" from grep_opt
 2:  ac397cc6a18 =  2:  b275d23f0a8 log tests: check if grep_config() is called by "log"-like cmds
 3:  3464c76cfd7 <  -:  ----------- grep tests: add missing "grep.patternType" config tests
 -:  ----------- >  3:  b0f91bf7e4a grep tests: create a helper function for "BRE" or "ERE"
 -:  ----------- >  4:  9906edd4f58 grep tests: add missing "grep.patternType" config tests
 4:  c6ada96298a =  5:  7389f767388 built-ins: trust the "prefix" from run_builtin()
 5:  1f09de53e07 =  6:  38bfa0ed5f9 grep.c: don't pass along NULL callback value
 6:  ce646154538 =  7:  a4c1ee91dc9 grep API: call grep_config() after grep_init()
 7:  6446b4f0f33 =  8:  fa0da3a9fba grep.h: make "grep_opt.pattern_type_option" use its enum
 8:  df8ba5aba68 =  9:  243ceccc1ad grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
 9:  ccbdfa48315 = 10:  38bc5dc9461 grep: simplify config parsing and option parsing
-- 
2.35.1.1028.g9479bb34b83


^ permalink raw reply	[flat|nested] 151+ messages in thread

* [PATCH v11 01/10] grep.h: remove unused "regex_t regexp" from grep_opt
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
@ 2022-02-16  0:00                   ` Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 02/10] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
                                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  0:00 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This "regex_t" in grep_opt has not been used since
f9b9faf6f8a (builtin-grep: allow more than one patterns., 2006-05-02).
We still use a "regex_t" for compiling regexes, but that's in the
"grep_pat" struct.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.h b/grep.h
index 6a1f0ab0172..400172676a1 100644
--- a/grep.h
+++ b/grep.h
@@ -136,7 +136,6 @@ struct grep_opt {
 
 	const char *prefix;
 	int prefix_length;
-	regex_t regexp;
 	int linenum;
 	int columnnum;
 	int invert;
-- 
2.35.1.1028.g9479bb34b83


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v11 02/10] log tests: check if grep_config() is called by "log"-like cmds
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 01/10] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
@ 2022-02-16  0:00                   ` Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 03/10] grep tests: create a helper function for "BRE" or "ERE" Ævar Arnfjörð Bjarmason
                                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  0:00 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
for pattern style options & config, 2017-05-20) to check not only
whether "git log" handles "grep.patternType", but also "git show"
etc.

It's sufficient to check whether we match a "fixed" or a "basic" regex
here to see if these codepaths correctly invoked grep_config(). We
don't need to check the details of their regular expression matching
as the "log" test does.
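
A sketch of why "..." is enough to tell the two apart, using the
system grep as a stand-in for the --grep matching and assuming the
pattern is matched against the two commit subjects the test creates
(subjects is a hypothetical helper):

```shell
#!/bin/sh
# Stand-in for the commit subjects created by the test.
subjects () {
	printf '%s\n' '1' '(1|2)'
}

# Fixed: "..." means three literal dots, which no subject contains.
subjects | grep -F -e '...' || echo "fixed: no match"

# Basic: "..." means any three characters, so "(1|2)" matches.
subjects | grep -e '...'
```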

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index dc884107de4..5c4385c745b 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -449,6 +449,30 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
 	)
 '
 
+for cmd in show whatchanged reflog format-patch
+do
+	case "$cmd" in
+	format-patch) myarg="HEAD~.." ;;
+	*) myarg= ;;
+	esac
+
+	test_expect_success "$cmd: understands grep.patternType, like 'log'" '
+		git init "pattern-type-$cmd" &&
+		(
+			cd "pattern-type-$cmd" &&
+			test_commit 1 file A &&
+			test_commit "(1|2)" file B 2 &&
+
+			git -c grep.patternType=fixed $cmd --grep="..." $myarg >actual &&
+			test_must_be_empty actual &&
+
+			git -c grep.patternType=basic $cmd --grep="..." $myarg >actual &&
+			test_file_not_empty actual
+		)
+	'
+done
+test_done
+
 test_expect_success 'log --author' '
 	cat >expect <<-\EOF &&
 	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>
-- 
2.35.1.1028.g9479bb34b83


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v11 03/10] grep tests: create a helper function for "BRE" or "ERE"
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 01/10] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 02/10] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2022-02-16  0:00                   ` Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 04/10] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
                                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  0:00 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Refactor the repeated test code for finding out whether a given set of
configuration will pick basic, extended or fixed into a new
"test_pattern_type" helper function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7810-grep.sh | 134 +++++++++++++++++++-----------------------------
 1 file changed, 54 insertions(+), 80 deletions(-)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 424c31c3287..6f1103b54b9 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -98,6 +98,37 @@ test_expect_success 'grep should not segfault with a bad input' '
 
 test_invalid_grep_expression --and -e A
 
+test_pattern_type () {
+	H=$1 &&
+	HC=$2 &&
+	L=$3 &&
+	type=$4 &&
+	shift 4 &&
+
+	expected_str= &&
+	case "$type" in
+	BRE)
+		expected_str="${HC}ab:a+bc"
+		;;
+	ERE)
+		expected_str="${HC}ab:abc"
+		;;
+	FIX)
+		expected_str="${HC}ab:a+b*c"
+		;;
+	*)
+		BUG "unknown pattern type '$type'"
+		;;
+	esac &&
+	config_str="$@" &&
+
+	test_expect_success "grep $L with '$config_str' interpreted as $type" '
+		echo $expected_str >expected &&
+		git $config_str grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'
+}
+
 for H in HEAD ''
 do
 	case "$H" in
@@ -393,35 +424,13 @@ do
 		git grep --no-recursive -n -e vvv $H -- t . >actual &&
 		test_cmp expected actual
 	'
-	test_expect_success "grep $L with grep.extendedRegexp=false" '
-		echo "${HC}ab:a+bc" >expected &&
-		git -c grep.extendedRegexp=false grep "a+b*c" $H ab >actual &&
-		test_cmp expected actual
-	'
 
-	test_expect_success "grep $L with grep.extendedRegexp=true" '
-		echo "${HC}ab:abc" >expected &&
-		git -c grep.extendedRegexp=true grep "a+b*c" $H ab >actual &&
-		test_cmp expected actual
-	'
 
-	test_expect_success "grep $L with grep.patterntype=basic" '
-		echo "${HC}ab:a+bc" >expected &&
-		git -c grep.patterntype=basic grep "a+b*c" $H ab >actual &&
-		test_cmp expected actual
-	'
-
-	test_expect_success "grep $L with grep.patterntype=extended" '
-		echo "${HC}ab:abc" >expected &&
-		git -c grep.patterntype=extended grep "a+b*c" $H ab >actual &&
-		test_cmp expected actual
-	'
-
-	test_expect_success "grep $L with grep.patterntype=fixed" '
-		echo "${HC}ab:a+b*c" >expected &&
-		git -c grep.patterntype=fixed grep "a+b*c" $H ab >actual &&
-		test_cmp expected actual
-	'
+	test_pattern_type "$H" "$HC" "$L" BRE -c grep.extendedRegexp=false
+	test_pattern_type "$H" "$HC" "$L" ERE -c grep.extendedRegexp=true
+	test_pattern_type "$H" "$HC" "$L" BRE -c grep.patternType=basic
+	test_pattern_type "$H" "$HC" "$L" ERE -c grep.patternType=extended
+	test_pattern_type "$H" "$HC" "$L" FIX -c grep.patternType=fixed
 
 	test_expect_success PCRE "grep $L with grep.patterntype=perl" '
 		echo "${HC}ab:a+b*c" >expected &&
@@ -433,59 +442,24 @@ do
 		test_must_fail git -c grep.patterntype=perl grep "foo.*bar"
 	'
 
-	test_expect_success "grep $L with grep.patternType=default and grep.extendedRegexp=true" '
-		echo "${HC}ab:abc" >expected &&
-		git \
-			-c grep.patternType=default \
-			-c grep.extendedRegexp=true \
-			grep "a+b*c" $H ab >actual &&
-		test_cmp expected actual
-	'
-
-	test_expect_success "grep $L with grep.extendedRegexp=true and grep.patternType=default" '
-		echo "${HC}ab:abc" >expected &&
-		git \
-			-c grep.extendedRegexp=true \
-			-c grep.patternType=default \
-			grep "a+b*c" $H ab >actual &&
-		test_cmp expected actual
-	'
-
-	test_expect_success "grep $L with grep.patternType=extended and grep.extendedRegexp=false" '
-		echo "${HC}ab:abc" >expected &&
-		git \
-			-c grep.patternType=extended \
-			-c grep.extendedRegexp=false \
-			grep "a+b*c" $H ab >actual &&
-		test_cmp expected actual
-	'
-
-	test_expect_success "grep $L with grep.patternType=basic and grep.extendedRegexp=true" '
-		echo "${HC}ab:a+bc" >expected &&
-		git \
-			-c grep.patternType=basic \
-			-c grep.extendedRegexp=true \
-			grep "a+b*c" $H ab >actual &&
-		test_cmp expected actual
-	'
-
-	test_expect_success "grep $L with grep.extendedRegexp=false and grep.patternType=extended" '
-		echo "${HC}ab:abc" >expected &&
-		git \
-			-c grep.extendedRegexp=false \
-			-c grep.patternType=extended \
-			grep "a+b*c" $H ab >actual &&
-		test_cmp expected actual
-	'
-
-	test_expect_success "grep $L with grep.extendedRegexp=true and grep.patternType=basic" '
-		echo "${HC}ab:a+bc" >expected &&
-		git \
-			-c grep.extendedRegexp=true \
-			-c grep.patternType=basic \
-			grep "a+b*c" $H ab >actual &&
-		test_cmp expected actual
-	'
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.patternType=default \
+		-c grep.extendedRegexp=true
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=default
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.patternType=extended \
+		-c grep.extendedRegexp=false
+	test_pattern_type "$H" "$HC" "$L" BRE \
+		-c grep.patternType=basic \
+		-c grep.extendedRegexp=true
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.extendedRegexp=false \
+		-c grep.patternType=extended
+	test_pattern_type "$H" "$HC" "$L" BRE \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=basic
 
 	test_expect_success "grep --count $L" '
 		echo ${HC}ab:3 >expected &&
-- 
2.35.1.1028.g9479bb34b83


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v11 04/10] grep tests: add missing "grep.patternType" config tests
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                     ` (2 preceding siblings ...)
  2022-02-16  0:00                   ` [PATCH v11 03/10] grep tests: create a helper function for "BRE" or "ERE" Ævar Arnfjörð Bjarmason
@ 2022-02-16  0:00                   ` Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 05/10] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
                                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  0:00 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Extend the grep tests to assert that setting
"grep.patternType=extended" followed by "grep.patternType=default"
will behave as if "--basic-regexp" was provided, and not as
"--extended-regexp". In a subsequent commit we'll need to treat
"grep.patternType=default" as a special-case, but let's make sure we
ignore it if it's being set to "default" following an earlier
non-"default" "grep.patternType" setting.

Let's also test what happens when we have a sequence of "extended"
followed by "default" and "fixed". In that case the "fixed" should
prevail. Let's also add tests to check that a
"grep.extendedRegexp=true" followed by a "grep.extendedRegexp=false"
behaves as though "grep.extendedRegexp" wasn't provided.

See [1] for the source of some of these tests, and their
initial (pseudocode) implementation, and [2] for a later discussion
about a breakage due to missing testing (which had been noted in [1]
all along).

1. https://lore.kernel.org/git/xmqqv8zf6j86.fsf@gitster.g/
2. https://lore.kernel.org/git/xmqqpmoczwtu.fsf@gitster.g/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t7810-grep.sh | 52 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 6f1103b54b9..69356011713 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -461,6 +461,58 @@ do
 		-c grep.extendedRegexp=true \
 		-c grep.patternType=basic
 
+	# grep.extendedRegexp is last-one-wins
+	test_pattern_type "$H" "$HC" "$L" BRE \
+		-c grep.extendedRegexp=true \
+		-c grep.extendedRegexp=false
+
+	# grep.patternType=basic pays no attention to grep.extendedRegexp
+	test_pattern_type "$H" "$HC" "$L" BRE \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=basic \
+		-c grep.extendedRegexp=false
+
+	# grep.patternType=extended pays no attention to grep.extendedRegexp
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=extended \
+		-c grep.extendedRegexp=false
+
+	# grep.extendedRegexp is used with a last-one-wins grep.patternType=default
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.patternType=fixed \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=default
+
+	# grep.extendedRegexp is used with earlier grep.patternType=default
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.extendedRegexp=false \
+		-c grep.patternType=default \
+		-c grep.extendedRegexp=true
+
+	# grep.extendedRegexp is used with a last-one-loses grep.patternType=default
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.extendedRegexp=false \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=default
+
+	# grep.extendedRegexp and grep.patternType are both last-one-wins independently
+	test_pattern_type "$H" "$HC" "$L" BRE \
+		-c grep.patternType=default \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=basic
+
+	# grep.patternType=extended and grep.patternType=default
+	test_pattern_type "$H" "$HC" "$L" BRE \
+		-c grep.patternType=extended \
+		-c grep.patternType=default
+
+	# grep.patternType=[extended -> default -> fixed]
+	test_pattern_type "$H" "$HC" "$L" FIX \
+		-c grep.patternType=extended \
+		-c grep.patternType=default \
+		-c grep.patternType=fixed
+
 	test_expect_success "grep --count $L" '
 		echo ${HC}ab:3 >expected &&
 		git grep --count -e b $H -- ab >actual &&
-- 
2.35.1.1028.g9479bb34b83


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v11 05/10] built-ins: trust the "prefix" from run_builtin()
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                     ` (3 preceding siblings ...)
  2022-02-16  0:00                   ` [PATCH v11 04/10] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
@ 2022-02-16  0:00                   ` Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 06/10] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
                                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  0:00 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the
"prefix" passed from "run_builtin()". The "prefix" we get from setup.c
is either going to be NULL or a string of length >0, never "".

So we can drop the "prefix && *prefix" checks added for
"builtin/grep.c" in 0d042fecf2f (git-grep: show pathnames relative to
the current directory, 2006-08-11), and for "builtin/ls-tree.c" in
a69dd585fca (ls-tree: chomp leading directories when run from a
subdirectory, 2005-12-23).

As seen in code in revision.c that was added in cd676a51367 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that does away with this
assertion.

This makes it easier to reason about a subsequent change to the
"prefix_length" code in grep.c, and since we're going to the trouble
of doing that let's leave behind an assert() to promise this to any
future callers.

For "builtin/grep.c" it would be painful to pass the "prefix" down the
callchain of:

    cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
    grep_source_name

So for the code that needs it in grep_source_name() let's add a
"grep_prefix" variable similar to the existing "ls_tree_prefix".

While at it let's move the code in cmd_ls_tree() around so that we
assign to the "ls_tree_prefix" right after declaring the variables,
and stop assigning to "prefix". We only subsequently used that
variable later in the function after clobbering it. Let's just use our
own "grep_prefix" instead.

Let's also add an assert() in git.c, so that we'll make this promise
about the "prefix" to any current and future callers, as well as to
any readers of the code.

Code history:

 * The strlen() in "grep.c" hasn't been used since 493b7a08d80 (grep:
   accept relative paths outside current working directory, 2009-09-05).

   When that code was added in 0d042fecf2f (git-grep: show pathnames
   relative to the current directory, 2006-08-11) we used the length.

   But since 493b7a08d80 we haven't used it for anything except a
   boolean check that we could have done on the "prefix" member
   itself.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c    | 13 ++++++++-----
 builtin/ls-tree.c |  2 +-
 git.c             |  1 +
 grep.c            |  4 +---
 grep.h            |  4 +---
 revision.c        |  2 +-
 6 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 9e34a820ad4..d85cbabea67 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,6 +26,8 @@
 #include "object-store.h"
 #include "packfile.h"
 
+static const char *grep_prefix;
+
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
 	NULL
@@ -315,11 +317,11 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 	strbuf_reset(out);
 
 	if (opt->null_following_name) {
-		if (opt->relative && opt->prefix_length) {
+		if (opt->relative && grep_prefix) {
 			struct strbuf rel_buf = STRBUF_INIT;
 			const char *rel_name =
 				relative_path(filename + tree_name_len,
-					      opt->prefix, &rel_buf);
+					      grep_prefix, &rel_buf);
 
 			if (tree_name_len)
 				strbuf_add(out, filename, tree_name_len);
@@ -332,8 +334,8 @@ static void grep_source_name(struct grep_opt *opt, const char *filename,
 		return;
 	}
 
-	if (opt->relative && opt->prefix_length)
-		quote_path(filename + tree_name_len, opt->prefix, out, 0);
+	if (opt->relative && grep_prefix)
+		quote_path(filename + tree_name_len, grep_prefix, out, 0);
 	else
 		quote_c_style(filename + tree_name_len, out, NULL, 0);
 
@@ -962,9 +964,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_END()
 	};
+	grep_prefix = prefix;
 
 	git_config(grep_cmd_config, NULL);
-	grep_init(&opt, the_repository, prefix);
+	grep_init(&opt, the_repository);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3a442631c71..6cb554cbb0a 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -150,7 +150,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
 
 	git_config(git_default_config, NULL);
 	ls_tree_prefix = prefix;
-	if (prefix && *prefix)
+	if (prefix)
 		chomp_prefix = strlen(prefix);
 
 	argc = parse_options(argc, argv, prefix, ls_tree_options,
diff --git a/git.c b/git.c
index 340665d4a04..a25940d72e8 100644
--- a/git.c
+++ b/git.c
@@ -436,6 +436,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 	} else {
 		prefix = NULL;
 	}
+	assert(!prefix || *prefix);
 	precompose_argv_prefix(argc, argv, NULL);
 	if (use_pager == -1 && run_setup &&
 		!(p->option & DELAY_PAGER_CONFIG))
diff --git a/grep.c b/grep.c
index 5bec7fd7935..8b61cbc3e09 100644
--- a/grep.c
+++ b/grep.c
@@ -139,13 +139,11 @@ int grep_config(const char *var, const char *value, void *cb)
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
  */
-void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
+void grep_init(struct grep_opt *opt, struct repository *repo)
 {
 	*opt = grep_defaults;
 
 	opt->repo = repo;
-	opt->prefix = prefix;
-	opt->prefix_length = (prefix && *prefix) ? strlen(prefix) : 0;
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 }
diff --git a/grep.h b/grep.h
index 400172676a1..23a2a41d2c4 100644
--- a/grep.h
+++ b/grep.h
@@ -134,8 +134,6 @@ struct grep_opt {
 	 */
 	struct repository *repo;
 
-	const char *prefix;
-	int prefix_length;
 	int linenum;
 	int columnnum;
 	int invert;
@@ -182,7 +180,7 @@ struct grep_opt {
 };
 
 int grep_config(const char *var, const char *value, void *);
-void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index ad4286fbdde..d6e0e2b23b5 100644
--- a/revision.c
+++ b/revision.c
@@ -1838,7 +1838,7 @@ void repo_init_revisions(struct repository *r,
 	revs->commit_format = CMIT_FMT_DEFAULT;
 	revs->expand_tabs_in_log_default = 8;
 
-	grep_init(&revs->grep_filter, revs->repo, prefix);
+	grep_init(&revs->grep_filter, revs->repo);
 	revs->grep_filter.status_only = 1;
 
 	repo_diff_setup(revs->repo, &revs->diffopt);
-- 
2.35.1.1028.g9479bb34b83


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v11 06/10] grep.c: don't pass along NULL callback value
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                     ` (4 preceding siblings ...)
  2022-02-16  0:00                   ` [PATCH v11 05/10] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
@ 2022-02-16  0:00                   ` Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 07/10] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
                                     ` (4 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  0:00 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change grep_cmd_config() to stop passing around the always-NULL "cb"
value. When this code was added in 7e8f59d577e (grep: color patterns
in output, 2009-03-07) it was non-NULL, but when that changed in
15fabd1bbd4 (builtin/grep.c: make configuration callback more
reusable, 2012-10-09) this code was left behind.

In a subsequent change I'll start using the "cb" value. This will make
it clear which of the functions we call need it, and which don't.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index d85cbabea67..5ec4cecae45 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,8 +285,8 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, cb);
-	if (git_color_default_config(var, value, cb) < 0)
+	int st = grep_config(var, value, NULL);
+	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
 	if (!strcmp(var, "grep.threads")) {
-- 
2.35.1.1028.g9479bb34b83


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v11 07/10] grep API: call grep_config() after grep_init()
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                     ` (5 preceding siblings ...)
  2022-02-16  0:00                   ` [PATCH v11 06/10] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
@ 2022-02-16  0:00                   ` Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 08/10] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
                                     ` (3 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  0:00 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().

So we effectively (b) initialized config, (a) then defaults, (c)
followed by user options. Usually those are ordered as "a", "b" and
"c" instead.

As the comments being removed here show, the previous behavior needed
to be carefully explained, as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.

This does not change the behavior of any of the configuration
variables or options. That would have been the case if we didn't move
around the grep_config() call in "builtin/log.c". But now that we call
"grep_config" after "git_log_config" and "git_format_config" we'll
need to pass in the already initialized "struct grep_opt *".

See 6ba9bb76e02 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e0 (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.

The memcpy() pattern here will be optimized away and follows the
convention of other *_init() functions. See 5726a6b4012 (*.c *_init():
define in terms of corresponding *_INIT macro, 2021-07-01).
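
As an illustrative sketch of the ordering described above (this is not
the code from this patch), the following C fragment applies defaults
first and config second. The names "struct demo_opt", "DEMO_OPT_INIT",
"demo_init()" and "demo_config()" and the "demo.*" keys are all
hypothetical, and the boolean parsing is deliberately simplified:

```c
#include <string.h>

/* Hypothetical stand-in for "struct grep_opt"; not git's real struct. */
struct demo_opt {
	int max_depth;
	int extended_regexp_option;
	int linenum;
};

/* Defaults live in a macro, in the style of the *_INIT macros. */
#define DEMO_OPT_INIT { \
	.max_depth = -1, \
}

/* (a) defaults: copy a default-initialized struct over "opt". */
void demo_init(struct demo_opt *opt)
{
	struct demo_opt blank = DEMO_OPT_INIT;
	memcpy(opt, &blank, sizeof(*opt));
}

/* (b) config: "cb" is the already initialized struct, not a global. */
int demo_config(const char *var, const char *value, void *cb)
{
	struct demo_opt *opt = cb;

	if (!strcmp(var, "demo.extendedregexp")) {
		/* Simplified: only the literal "true" counts as true. */
		opt->extended_regexp_option = !strcmp(value, "true");
		return 0;
	}
	if (!strcmp(var, "demo.linenumber")) {
		opt->linenum = !strcmp(value, "true");
		return 0;
	}
	return 0;
}
```

A caller following this sketch would run demo_init(&opt), then feed
each config key to demo_config() with &opt as the callback data, and
only afterwards apply command-line options, giving the conventional
"a", "b", "c" order.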

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  4 ++--
 builtin/log.c  | 13 +++++++++++--
 grep.c         | 39 +++------------------------------------
 grep.h         | 21 +++++++++++++++++++++
 4 files changed, 37 insertions(+), 40 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 5ec4cecae45..0ea124321b6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -285,7 +285,7 @@ static int wait_all(void)
 
 static int grep_cmd_config(const char *var, const char *value, void *cb)
 {
-	int st = grep_config(var, value, NULL);
+	int st = grep_config(var, value, cb);
 	if (git_color_default_config(var, value, NULL) < 0)
 		st = -1;
 
@@ -966,8 +966,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	};
 	grep_prefix = prefix;
 
-	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, the_repository);
+	git_config(grep_cmd_config, &opt);
 
 	/*
 	 * If there is no -- then the paths must exist in the working
diff --git a/builtin/log.c b/builtin/log.c
index 4b493408cc5..06283b37e7a 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -520,8 +520,6 @@ static int git_log_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (grep_config(var, value, cb) < 0)
-		return -1;
 	if (git_gpg_config(var, value, cb) < 0)
 		return -1;
 	return git_diff_ui_config(var, value, cb);
@@ -536,6 +534,8 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.simplify_history = 0;
 	memset(&opt, 0, sizeof(opt));
@@ -650,6 +650,8 @@ int cmd_show(int argc, const char **argv, const char *prefix)
 
 	memset(&match_all, 0, sizeof(match_all));
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.diff = 1;
 	rev.always_show_header = 1;
 	rev.no_walk = 1;
@@ -733,6 +735,8 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix)
 
 	repo_init_revisions(the_repository, &rev, prefix);
 	init_reflog_walk(&rev.reflog_info);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.verbose_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -766,6 +770,8 @@ int cmd_log(int argc, const char **argv, const char *prefix)
 	git_config(git_log_config, NULL);
 
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.always_show_header = 1;
 	memset(&opt, 0, sizeof(opt));
 	opt.def = "HEAD";
@@ -1848,10 +1854,13 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	extra_hdr.strdup_strings = 1;
 	extra_to.strdup_strings = 1;
 	extra_cc.strdup_strings = 1;
+
 	init_log_defaults();
 	init_display_notes(&notes_opt);
 	git_config(git_format_config, NULL);
 	repo_init_revisions(the_repository, &rev, prefix);
+	git_config(grep_config, &rev.grep_filter);
+
 	rev.show_notes = show_notes;
 	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
 	rev.commit_format = CMIT_FMT_EMAIL;
diff --git a/grep.c b/grep.c
index 8b61cbc3e09..2f6a01c52a5 100644
--- a/grep.c
+++ b/grep.c
@@ -19,27 +19,6 @@ static void std_output(struct grep_opt *opt, const void *buf, size_t size)
 	fwrite(buf, size, 1, stdout);
 }
 
-static struct grep_opt grep_defaults = {
-	.relative = 1,
-	.pathname = 1,
-	.max_depth = -1,
-	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED,
-	.colors = {
-		[GREP_COLOR_CONTEXT] = "",
-		[GREP_COLOR_FILENAME] = GIT_COLOR_MAGENTA,
-		[GREP_COLOR_FUNCTION] = "",
-		[GREP_COLOR_LINENO] = GIT_COLOR_GREEN,
-		[GREP_COLOR_COLUMNNO] = GIT_COLOR_GREEN,
-		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED,
-		[GREP_COLOR_SELECTED] = "",
-		[GREP_COLOR_SEP] = GIT_COLOR_CYAN,
-	},
-	.only_matching = 0,
-	.color = -1,
-	.output = std_output,
-};
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -75,20 +54,12 @@ define_list_config_array_extra(color_grep_slots, {"match"});
  */
 int grep_config(const char *var, const char *value, void *cb)
 {
-	struct grep_opt *opt = &grep_defaults;
+	struct grep_opt *opt = cb;
 	const char *slot;
 
 	if (userdiff_config(var, value) < 0)
 		return -1;
 
-	/*
-	 * The instance of grep_opt that we set up here is copied by
-	 * grep_init() to be used by each individual invocation.
-	 * When populating a new field of this structure here, be
-	 * sure to think about ownership -- e.g., you might need to
-	 * override the shallow copy in grep_init() with a deep copy.
-	 */
-
 	if (!strcmp(var, "grep.extendedregexp")) {
 		opt->extended_regexp_option = git_config_bool(var, value);
 		return 0;
@@ -134,14 +105,10 @@ int grep_config(const char *var, const char *value, void *cb)
 	return 0;
 }
 
-/*
- * Initialize one instance of grep_opt and copy the
- * default values from the template we read the configuration
- * information in an earlier call to git_config(grep_config).
- */
 void grep_init(struct grep_opt *opt, struct repository *repo)
 {
-	*opt = grep_defaults;
+	struct grep_opt blank = GREP_OPT_INIT;
+	memcpy(opt, &blank, sizeof(*opt));
 
 	opt->repo = repo;
 	opt->pattern_tail = &opt->pattern_list;
diff --git a/grep.h b/grep.h
index 23a2a41d2c4..3112d1c2a38 100644
--- a/grep.h
+++ b/grep.h
@@ -179,6 +179,27 @@ struct grep_opt {
 	void *output_priv;
 };
 
+#define GREP_OPT_INIT { \
+	.relative = 1, \
+	.pathname = 1, \
+	.max_depth = -1, \
+	.pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED, \
+	.colors = { \
+		[GREP_COLOR_CONTEXT] = "", \
+		[GREP_COLOR_FILENAME] = GIT_COLOR_MAGENTA, \
+		[GREP_COLOR_FUNCTION] = "", \
+		[GREP_COLOR_LINENO] = GIT_COLOR_GREEN, \
+		[GREP_COLOR_COLUMNNO] = GIT_COLOR_GREEN, \
+		[GREP_COLOR_MATCH_CONTEXT] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_MATCH_SELECTED] = GIT_COLOR_BOLD_RED, \
+		[GREP_COLOR_SELECTED] = "", \
+		[GREP_COLOR_SEP] = GIT_COLOR_CYAN, \
+	}, \
+	.only_matching = 0, \
+	.color = -1, \
+	.output = std_output, \
+}
+
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
-- 
2.35.1.1028.g9479bb34b83


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v11 08/10] grep.h: make "grep_opt.pattern_type_option" use its enum
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                     ` (6 preceding siblings ...)
  2022-02-16  0:00                   ` [PATCH v11 07/10] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
@ 2022-02-16  0:00                   ` Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 09/10] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
                                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  0:00 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change the "pattern_type_option" member of "struct grep_opt" to use
the enum type we use for it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grep.h b/grep.h
index 3112d1c2a38..89a2ce51130 100644
--- a/grep.h
+++ b/grep.h
@@ -164,7 +164,7 @@ struct grep_opt {
 	int funcname;
 	int funcbody;
 	int extended_regexp_option;
-	int pattern_type_option;
+	enum grep_pattern_type pattern_type_option;
 	int ignore_locale;
 	char colors[NR_GREP_COLORS][COLOR_MAXLEN];
 	unsigned pre_context;
-- 
2.35.1.1028.g9479bb34b83


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v11 09/10] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                     ` (7 preceding siblings ...)
  2022-02-16  0:00                   ` [PATCH v11 08/10] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
@ 2022-02-16  0:00                   ` Ævar Arnfjörð Bjarmason
  2022-02-16  0:00                   ` [PATCH v11 10/10] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
  2022-02-16  2:20                   ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Junio C Hamano
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  0:00 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Change code in compile_regexp() to check the cheaper boolean
"!opt->pcre2" condition before the "memchr()" search.

This doesn't noticeably optimize anything, but makes the code more
obvious and conventional. The line wrapping being added here also
makes a subsequent commit smaller.
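
A minimal sketch of the reordering, using a hypothetical helper
"pattern_has_forbidden_nul()" rather than the real compile_regexp().
Because "&&" short-circuits in C, putting the cheap flag test first
means memchr() is not called at all when NUL bytes are allowed:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of grep.c. Returns 1 if "pattern"
 * contains a NUL byte that the selected engine cannot handle.
 * "nul_allowed" plays the role of "opt->pcre2" in the patch.
 */
int pattern_has_forbidden_nul(int nul_allowed, const char *pattern, size_t len)
{
	/* Cheap flag check first; memchr() runs only when it is needed. */
	if (!nul_allowed &&
	    memchr(pattern, 0, len))
		return 1;
	return 0;
}
```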

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index 2f6a01c52a5..62f2595f68a 100644
--- a/grep.c
+++ b/grep.c
@@ -492,7 +492,8 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 	p->ignore_case = opt->ignore_case;
 	p->fixed = opt->fixed;
 
-	if (memchr(p->pattern, 0, p->patternlen) && !opt->pcre2)
+	if (!opt->pcre2 &&
+	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
 	p->is_fixed = is_fixed(p->pattern, p->patternlen);
-- 
2.35.1.1028.g9479bb34b83


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* [PATCH v11 10/10] grep: simplify config parsing and option parsing
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                     ` (8 preceding siblings ...)
  2022-02-16  0:00                   ` [PATCH v11 09/10] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
@ 2022-02-16  0:00                   ` Ævar Arnfjörð Bjarmason
  2022-02-16  2:20                   ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Junio C Hamano
  10 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  0:00 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, J Smith, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Simplify the parsing of "grep.patternType" and
"grep.extendedRegexp". This changes no behavior, but gets rid of
complex parsing logic that isn't needed anymore.

When "grep.patternType" was introduced in 84befcd0a4a (grep: add a
grep.patternType configuration setting, 2012-08-03) we promised that:

 1. You can set "grep.patternType", and "[setting it to] 'default'
    will return to the default matching behavior".

    In that context "the default" meant whatever the configuration
    system specified before that change, i.e. via grep.extendedRegexp.

 2. We'd support the existing "grep.extendedRegexp" option, but ignore
    it when the new "grep.patternType" option is set. We said we'd
    only ignore the older "grep.extendedRegexp" option "when the
    `grep.patternType` option is set to a value other than
    'default'".

In a preceding commit we changed grep_config() to be called after
grep_init(), which means that much of the complexity here can go
away.

As before both "grep.patternType" and "grep.extendedRegexp" are
last-one-wins variables, with "grep.extendedRegexp" yielding to
"grep.patternType", except when "grep.patternType=default".

Note that as the previously added tests indicate this cannot be done
on-the-fly as we see the config variables, without introducing more
state keeping. I.e. if we see:

    -c grep.extendedRegexp=false \
    -c grep.patternType=default \
    -c grep.extendedRegexp=true

We need to select ERE, since grep.patternType=default unselects that
variable, which normally has higher precedence, but we also need to
select BRE in cases of:

    -c grep.extendedRegexp=true \
    -c grep.extendedRegexp=false

Which would not be the case for this, which selects ERE:

    -c grep.patternType=extended \
    -c grep.extendedRegexp=false

Therefore we cannot do this on-the-fly in grep_config without also
introducing tracking variables for not only the pattern type, but what
the source of that pattern type was.

So we need to decide on the pattern after our config was fully
parsed. Let's do that by deferring the decision on the pattern type
until it's time to compile it in compile_regexp().

By that time we've not only parsed the config, but also handled the
command-line options. Those will set "opt.pattern_type_option" (*not*
"opt.extended_regexp_option"!).

At that point all we need to do is see if "grep.patternType" was
UNSPECIFIED in the end (including an explicit "=default"), if so we'll
use the "grep.extendedRegexp" configuration, if any.
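
The deferred decision described above can be sketched as follows. This
is a simplified illustration rather than the code in this patch: the
"demo_*" names are hypothetical, config keys are assumed to arrive
lowercased, booleans are parsed naively, and any unrecognized
"grep.patternType" value is treated like "default":

```c
#include <string.h>

/* Hypothetical mirror of the pattern types, for illustration only. */
enum demo_pattern_type {
	DEMO_UNSPECIFIED = 0,
	DEMO_BRE,
	DEMO_ERE,
	DEMO_FIXED,
	DEMO_PCRE
};

struct demo_grep_opt {
	int extended_regexp_option;
	enum demo_pattern_type pattern_type_option;
};

/* Each variable is recorded independently; the last value seen wins. */
void demo_grep_config(struct demo_grep_opt *opt,
		      const char *var, const char *value)
{
	if (!strcmp(var, "grep.extendedregexp")) {
		opt->extended_regexp_option = !strcmp(value, "true");
	} else if (!strcmp(var, "grep.patterntype")) {
		if (!strcmp(value, "basic"))
			opt->pattern_type_option = DEMO_BRE;
		else if (!strcmp(value, "extended"))
			opt->pattern_type_option = DEMO_ERE;
		else if (!strcmp(value, "fixed"))
			opt->pattern_type_option = DEMO_FIXED;
		else if (!strcmp(value, "perl"))
			opt->pattern_type_option = DEMO_PCRE;
		else /* "default"; simplified to cover unknown values too */
			opt->pattern_type_option = DEMO_UNSPECIFIED;
	}
}

/* Deferred decision, made once all config and options are known. */
enum demo_pattern_type demo_resolve(const struct demo_grep_opt *opt)
{
	if (opt->pattern_type_option != DEMO_UNSPECIFIED)
		return opt->pattern_type_option;
	return opt->extended_regexp_option ? DEMO_ERE : DEMO_BRE;
}
```

Since no decision is made while the variables are being read, no extra
state about where the pattern type came from has to be kept; the three
"-c" sequences shown earlier in this message resolve to ERE, BRE and
ERE respectively under this sketch.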

See my 07a3d411739 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.

1. https://lore.kernel.org/git/patch-v8-09.10-c211bb0c69d-20220118T155211Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 10 +++-----
 grep.c         | 69 +++++++-------------------------------------------
 grep.h         |  3 ---
 revision.c     |  2 --
 4 files changed, 13 insertions(+), 71 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 0ea124321b6..942c4b25077 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -845,7 +845,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
 	int allow_revs;
 
 	struct option options[] = {
@@ -879,16 +878,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			N_("descend at most <depth> levels"), PARSE_OPT_NONEG,
 			NULL, 1 },
 		OPT_GROUP(""),
-		OPT_SET_INT('E', "extended-regexp", &pattern_type_arg,
+		OPT_SET_INT('E', "extended-regexp", &opt.pattern_type_option,
 			    N_("use extended POSIX regular expressions"),
 			    GREP_PATTERN_TYPE_ERE),
-		OPT_SET_INT('G', "basic-regexp", &pattern_type_arg,
+		OPT_SET_INT('G', "basic-regexp", &opt.pattern_type_option,
 			    N_("use basic POSIX regular expressions (default)"),
 			    GREP_PATTERN_TYPE_BRE),
-		OPT_SET_INT('F', "fixed-strings", &pattern_type_arg,
+		OPT_SET_INT('F', "fixed-strings", &opt.pattern_type_option,
 			    N_("interpret patterns as fixed strings"),
 			    GREP_PATTERN_TYPE_FIXED),
-		OPT_SET_INT('P', "perl-regexp", &pattern_type_arg,
+		OPT_SET_INT('P', "perl-regexp", &opt.pattern_type_option,
 			    N_("use Perl-compatible regular expressions"),
 			    GREP_PATTERN_TYPE_PCRE),
 		OPT_GROUP(""),
@@ -982,7 +981,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, grep_usage,
 			     PARSE_OPT_KEEP_DASHDASH |
 			     PARSE_OPT_STOP_AT_NON_OPTION);
-	grep_commit_pattern_type(pattern_type_arg, &opt);
 
 	if (use_index && !startup_info->have_repository) {
 		int fallback = 0;
diff --git a/grep.c b/grep.c
index 62f2595f68a..d5ad9617d99 100644
--- a/grep.c
+++ b/grep.c
@@ -115,62 +115,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo)
 	opt->header_tail = &opt->header_list;
 }
 
-static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	/*
-	 * When committing to the pattern type by setting the relevant
-	 * fields in grep_opt it's generally not necessary to zero out
-	 * the fields we're not choosing, since they won't have been
-	 * set by anything. The extended_regexp_option field is the
-	 * only exception to this.
-	 *
-	 * This is because in the process of parsing grep.patternType
-	 * & grep.extendedRegexp we set opt->pattern_type_option and
-	 * opt->extended_regexp_option, respectively. We then
-	 * internally use opt->extended_regexp_option to see if we're
-	 * compiling an ERE. It must be unset if that's not actually
-	 * the case.
-	 */
-	if (pattern_type != GREP_PATTERN_TYPE_ERE &&
-	    opt->extended_regexp_option)
-		opt->extended_regexp_option = 0;
-
-	switch (pattern_type) {
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		/* fall through */
-
-	case GREP_PATTERN_TYPE_BRE:
-		break;
-
-	case GREP_PATTERN_TYPE_ERE:
-		opt->extended_regexp_option = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_FIXED:
-		opt->fixed = 1;
-		break;
-
-	case GREP_PATTERN_TYPE_PCRE:
-		opt->pcre2 = 1;
-		break;
-	}
-}
-
-void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct grep_opt *opt)
-{
-	if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(pattern_type, opt);
-	else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
-		grep_set_pattern_type_option(opt->pattern_type_option, opt);
-	else if (opt->extended_regexp_option)
-		/*
-		 * This branch *must* happen after setting from the
-		 * opt->pattern_type_option above, we don't want
-		 * grep.extendedRegexp to override grep.patternType!
-		 */
-		grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
-}
-
 static struct grep_pat *create_grep_pat(const char *pat, size_t patlen,
 					const char *origin, int no,
 					enum grep_pat_token t,
@@ -488,11 +432,16 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 	int err;
 	int regflags = REG_NEWLINE;
 
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_UNSPECIFIED)
+		opt->pattern_type_option = (opt->extended_regexp_option
+					    ? GREP_PATTERN_TYPE_ERE
+					    : GREP_PATTERN_TYPE_BRE);
+
 	p->word_regexp = opt->word_regexp;
 	p->ignore_case = opt->ignore_case;
-	p->fixed = opt->fixed;
+	p->fixed = opt->pattern_type_option == GREP_PATTERN_TYPE_FIXED;
 
-	if (!opt->pcre2 &&
+	if (opt->pattern_type_option != GREP_PATTERN_TYPE_PCRE &&
 	    memchr(p->pattern, 0, p->patternlen))
 		die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
 
@@ -544,14 +493,14 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre2) {
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_PCRE) {
 		compile_pcre2_pattern(p, opt);
 		return;
 	}
 
 	if (p->ignore_case)
 		regflags |= REG_ICASE;
-	if (opt->extended_regexp_option)
+	if (opt->pattern_type_option == GREP_PATTERN_TYPE_ERE)
 		regflags |= REG_EXTENDED;
 	err = regcomp(&p->regexp, p->pattern, regflags);
 	if (err) {
diff --git a/grep.h b/grep.h
index 89a2ce51130..c722d25ed9d 100644
--- a/grep.h
+++ b/grep.h
@@ -143,7 +143,6 @@ struct grep_opt {
 	int unmatch_name_only;
 	int count;
 	int word_regexp;
-	int fixed;
 	int all_match;
 	int no_body_match;
 	int body_hit;
@@ -154,7 +153,6 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre2;
 	int relative;
 	int pathname;
 	int null_following_name;
@@ -202,7 +200,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo);
-void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
 void append_grep_pattern(struct grep_opt *opt, const char *pat, const char *origin, int no, enum grep_pat_token t);
diff --git a/revision.c b/revision.c
index d6e0e2b23b5..dd301e30147 100644
--- a/revision.c
+++ b/revision.c
@@ -2860,8 +2860,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 
 	diff_setup_done(&revs->diffopt);
 
-	grep_commit_pattern_type(GREP_PATTERN_TYPE_UNSPECIFIED,
-				 &revs->grep_filter);
 	if (!is_encoding_utf8(get_log_output_encoding()))
 		revs->grep_filter.ignore_locale = 1;
 	compile_grep_patterns(&revs->grep_filter);
-- 
2.35.1.1028.g9479bb34b83


^ permalink raw reply related	[flat|nested] 151+ messages in thread

* Re: [PATCH v11 00/10] grep: simplify & delete "init" & "config" code
  2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
                                     ` (9 preceding siblings ...)
  2022-02-16  0:00                   ` [PATCH v11 10/10] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
@ 2022-02-16  2:20                   ` Junio C Hamano
  10 siblings, 0 replies; 151+ messages in thread
From: Junio C Hamano @ 2022-02-16  2:20 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, J Smith, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

>   * A new 3/10 and 4/10 hopefully address the comments about the test
>     code. I ended up just adding a helper to reduce the existing and
>     new verbosity of the tests, which should make it easier to reason
>     about them.

Here is an excerpt plus comments on "git diff @{1}" after queuing
the new series.  I found only a few minor "Huh?", none of them a
huge deal.

Will queue; thanks.

+test_pattern_type () {
+	H=$1 &&
+	HC=$2 &&
+	L=$3 &&
+	type=$4 &&
+	shift 4 &&
+
+	expected_str= &&
+	case "$type" in
+	BRE)
+		expected_str="${HC}ab:a+bc"
+		;;
+	ERE)
+		expected_str="${HC}ab:abc"
+		;;
+	FIX)
+		expected_str="${HC}ab:a+b*c"
+		;;
+	*)
+		BUG "unknown pattern type '$type'"
+		;;
+	esac &&

Excellent.  I always had to think twice when commenting on earlier
rounds of the patches which expected strings corresponded to what
pattern type.  Now we have a clearly defined table.

+	config_str="$@" &&

This forces each element of $@ to lose its identity, and makes it a
single string separated by a whitespace, so it is less misleading to
write

	config_str="$*"

instead, but it is not a huge deal.

+	test_expect_success "grep $L with '$config_str' interpreted as $type" '
+		echo $expected_str >expected &&
+		git $config_str grep "a+b*c" $H ab >actual &&
+		test_cmp expected actual
+	'

We must leave $config_str unquoted (because we do want the string
split at $IFS), but not quoting "$expected_str" looks a bit yucky
(because we have no intention of letting $IFS split it).

+}
+

+	test_pattern_type "$H" "$HC" "$L" BRE -c grep.extendedRegexp=false
+	test_pattern_type "$H" "$HC" "$L" ERE -c grep.extendedRegexp=true
+	test_pattern_type "$H" "$HC" "$L" BRE -c grep.patternType=basic
+	test_pattern_type "$H" "$HC" "$L" ERE -c grep.patternType=extended
+	test_pattern_type "$H" "$HC" "$L" FIX -c grep.patternType=fixed
 
This part demonstrates the beauty of the new helper very well ;-)

s/FIX/FIXED/ would be more grammatically correct.  It would break
alignment above and I suspect that was why the patch chose to say
"FIX" instead, but I am not sure if the alignment here is so
valuable.

+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.patternType=default \
+		-c grep.extendedRegexp=true
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=default
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.patternType=extended \
+		-c grep.extendedRegexp=false
+	test_pattern_type "$H" "$HC" "$L" BRE \
+		-c grep.patternType=basic \
+		-c grep.extendedRegexp=true
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.extendedRegexp=false \
+		-c grep.patternType=extended
+	test_pattern_type "$H" "$HC" "$L" BRE \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=basic

OK.  A bit redundant, knowing the implementation that parses the two
variables independently just like any other two variables, but these
are correct, which counts even more ;-).

+	# grep.extendedRegexp is last-one-wins
+	test_pattern_type "$H" "$HC" "$L" BRE \
+		-c grep.extendedRegexp=true \
+		-c grep.extendedRegexp=false
+
+	# grep.patternType=basic pays no attention to grep.extendedRegexp
+	test_pattern_type "$H" "$HC" "$L" BRE \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=basic \
+		-c grep.extendedRegexp=false
+
+	# grep.patternType=extended pays no attention to grep.extendedRegexp
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=extended \
+		-c grep.extendedRegexp=false

All correct.

+	# grep.extendedRegexp is used with a last-one-wins grep.patternType=default
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.patternType=fixed \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=default

Nice.

+	# grep.extendedRegexp is used with earlier grep.patternType=default
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.extendedRegexp=false \
+		-c grep.patternType=default \
+		-c grep.extendedRegexp=true

OK.  I would have expected "the last" instead of "earlier".  Because
these two variables are independently "the last one wins", just like
any two variables that are "the last one wins", the relative order
of their appearance does not matter.

+	# grep.extendedRegexp is used with a last-one-loses grep.patternType=default
+	test_pattern_type "$H" "$HC" "$L" ERE \
+		-c grep.extendedRegexp=false \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=default

I am not sure what last-one-loses means here.  Both variables are
independently last-one-wins, so at the end of the parsing,
grep.extendedRegexp has 'true' (because it is the last value seen
for the variable) while grep.patternType has 'default' (again, it is
the last value seen for the variable).

+	# grep.extendedRegexp and grep.patternType are both last-one-wins independently
+	test_pattern_type "$H" "$HC" "$L" BRE \
+		-c grep.patternType=default \
+		-c grep.extendedRegexp=true \
+		-c grep.patternType=basic

The title of this one gets the gist of the mistakes in the code in
earlier rounds.

+	# grep.patternType=extended and grep.patternType=default
+	test_pattern_type "$H" "$HC" "$L" BRE \
+		-c grep.patternType=extended \
+		-c grep.patternType=default
+
+	# grep.patternType=[extended -> default -> fixed] (BRE)" '
+	test_pattern_type "$H" "$HC" "$L" FIX \
+		-c grep.patternType=extended \
+		-c grep.patternType=default \
+		-c grep.patternType=fixed
 
OK.

 	test_expect_success "grep --count $L" '
 		echo ${HC}ab:3 >expected &&

^ permalink raw reply	[flat|nested] 151+ messages in thread

* Tests in t4202 are aborted early, was: Re: [PATCH v3 2/7] log
  2021-11-29 14:50     ` [PATCH v3 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
@ 2022-03-04  8:57       ` Fabian Stelzer
  2022-03-04 10:05         ` [PATCH] log tests: fix "abort tests early" regression in ff37a60c369 Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 151+ messages in thread
From: Fabian Stelzer @ 2022-03-04  8:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, J Smith, Taylor Blau

On 29.11.2021 15:50, Ævar Arnfjörð Bjarmason wrote:
>Extend the tests added in my 9df46763ef1 (log: add exhaustive tests
>for pattern style options & config, 2017-05-20) to check not only
>whether "git log" handles "grep.patternType", but also "git show"
>etc.
>
>It's sufficient to check whether we match a "fixed" or a "basic" regex
>here to see if these codepaths correctly invoked grep_config(). We
>don't need to check the details of their regular expression matching
>as the "log" test does.
>
>Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>---
> t/t4202-log.sh | 24 ++++++++++++++++++++++++
> 1 file changed, 24 insertions(+)
>
>diff --git a/t/t4202-log.sh b/t/t4202-log.sh
>index 7884e3d46b3..11bb25440b0 100755
>--- a/t/t4202-log.sh
>+++ b/t/t4202-log.sh
>@@ -449,6 +449,30 @@ test_expect_success !FAIL_PREREQS 'log with various grep.patternType configurati
> 	)
> '
>
>+for cmd in show whatchanged reflog format-patch
>+do
>+	case "$cmd" in
>+	format-patch) myarg="HEAD~.." ;;
>+	*) myarg= ;;
>+	esac
>+
>+	test_expect_success "$cmd: understands grep.patternType, like 'log'" '
>+		git init "pattern-type-$cmd" &&
>+		(
>+			cd "pattern-type-$cmd" &&
>+			test_commit 1 file A &&
>+			test_commit "(1|2)" file B 2 &&
>+
>+			git -c grep.patternType=fixed $cmd --grep="..." $myarg >actual &&
>+			test_must_be_empty actual &&
>+
>+			git -c grep.patternType=basic $cmd --grep="..." $myarg >actual &&
>+			test_file_not_empty actual
>+		)
>+	'
>+done
>+test_done

After rebasing my work from <20220302090250.590450-1-fs@gigacodes.de> on 
master I was a bit confused as to why my tests in t4202 were no longer 
executing and none of my changes did anything about it. I suppose this 
`test_done` is left over from testing and slipped into master?

>+
> test_expect_success 'log --author' '
> 	cat >expect <<-\EOF &&
> 	Author: <BOLD;RED>A U<RESET> Thor <author@example.com>
>-- 
>2.34.1.841.gf15fb7e6f34
>

* [PATCH] log tests: fix "abort tests early" regression in ff37a60c369
  2022-03-04  8:57       ` Tests in t4202 are aborted early, was: Re: [PATCH v3 2/7] log Fabian Stelzer
@ 2022-03-04 10:05         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 151+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-04 10:05 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Fabian Stelzer,
	Ævar Arnfjörð Bjarmason

Fix a regression in ff37a60c369 (log tests: check if grep_config() is
called by "log"-like cmds, 2022-02-16). A "test_done" command used
during development made it into the submitted patch, causing tests
41-136 in t/t4202-log.sh to be skipped.

Reported-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

On Fri, Mar 04 2022, Fabian Stelzer wrote:

> After rebasing my work from <20220302090250.590450-1-fs@gigacodes.de>
> on master I was a bit confused as to why my tests in t4202 were no
> longer executing and none of my changes did anything about it. I
> suppose this `test_done` is left over from testing and slipped into
> master?

Ouch! Yes, that's a rather obvious stupid mistake of mine that
shouldn't have escaped the lab. Sorry!

FWIW since we haven't run these tests in a while (not too long
though!) it's conceivable that CI would fail on them, but in addition
to passing locally, this change also passes CI:
https://github.com/avar/git/runs/5420131756

So we should be OK with this change.
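
For readers unfamiliar with why one stray line hides so many tests, here
is a minimal sketch. It assumes a hypothetical miniature harness (the
run_suite, test_expect_success and test_done functions below are
stand-ins written for this illustration, not git's real t/test-lib.sh);
the only behavior borrowed from the real thing is that test_done prints
a summary and exits the script.

```shell
#!/bin/sh
# Hypothetical miniature harness for illustration only; this is NOT
# git's t/test-lib.sh, just enough to show the control flow.
run_suite () {
	(
		count=0

		# Run the test body and report the result TAP-style.
		test_expect_success () {
			count=$((count + 1))
			if eval "$2"
			then
				echo "ok $count - $1"
			else
				echo "not ok $count - $1"
			fi
		}

		# Print the plan and exit; nothing after this call runs.
		test_done () {
			echo "1..$count"
			exit 0
		}

		test_expect_success 'first' 'true'
		test_done	# stray call left over from development
		test_expect_success 'second' 'true'
		test_done
	)
}

suite_output=$(run_suite)
printf '%s\n' "$suite_output"
```

In this sketch only "ok 1 - first" and the plan line "1..1" are printed.
The second test is never reached and the run still exits successfully,
which is why a skipped tail like this is easy to miss.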

 t/t4202-log.sh | 1 -
 1 file changed, 1 deletion(-)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 55fac644464..46e413bcc93 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -484,7 +484,6 @@ do
 		)
 	'
 done
-test_done
 
 test_expect_success 'log --author' '
 	cat >expect <<-\EOF &&
-- 
2.35.1.1230.ga6e6579e98c


Thread overview: 151+ messages
2021-11-06 21:10 [PATCH 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
2021-11-06 21:10 ` [PATCH 1/8] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
2021-11-06 21:10 ` [PATCH 2/8] git.c & grep.c: assert that "prefix" is NULL or non-zero string Ævar Arnfjörð Bjarmason
2021-11-08 20:37   ` Taylor Blau
2021-11-06 21:10 ` [PATCH 3/8] grep: remove unused "prefix_length" member Ævar Arnfjörð Bjarmason
2021-11-08 20:42   ` Taylor Blau
2021-11-06 21:10 ` [PATCH 4/8] grep.c: move "prefix" out of "struct grep_opt" Ævar Arnfjörð Bjarmason
2021-11-08 20:56   ` Taylor Blau
2021-11-09  2:10     ` Ævar Arnfjörð Bjarmason
2021-11-10  0:18       ` Taylor Blau
2021-11-06 21:10 ` [PATCH 5/8] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
2021-11-06 21:10 ` [PATCH 6/8] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
2021-11-08 21:49   ` Taylor Blau
2021-11-09  2:06     ` Ævar Arnfjörð Bjarmason
2021-11-10  0:18       ` Taylor Blau
2021-11-06 21:10 ` [PATCH 7/8] grep: simplify config parsing, change grep.<rx config> interaction Ævar Arnfjörð Bjarmason
2021-11-08 23:04   ` Taylor Blau
2021-11-09  2:01     ` Ævar Arnfjörð Bjarmason
2021-11-10  0:16       ` Taylor Blau
2021-11-06 21:10 ` [PATCH 8/8] grep: make "extendedRegexp=true" the same as "patternType=extended" Ævar Arnfjörð Bjarmason
2021-11-10  1:43 ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Ævar Arnfjörð Bjarmason
2021-11-10  1:43   ` [PATCH v2 1/8] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
2021-11-12 16:11     ` Junio C Hamano
2021-11-10  1:43   ` [PATCH v2 2/8] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
2021-11-12 16:38     ` Junio C Hamano
2021-11-10  1:43   ` [PATCH v2 3/8] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
2021-11-12 17:09     ` Junio C Hamano
2021-11-10  1:43   ` [PATCH v2 4/8] grep docs: de-duplicate configuration sections Ævar Arnfjörð Bjarmason
2021-11-12 17:15     ` Junio C Hamano
2021-11-10  1:43   ` [PATCH v2 5/8] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
2021-11-12 17:18     ` Junio C Hamano
2021-11-10  1:43   ` [PATCH v2 6/8] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
2021-11-12 17:32     ` Junio C Hamano
2021-11-10  1:43   ` [PATCH v2 7/8] grep: simplify config parsing, change grep.<rx config> interaction Ævar Arnfjörð Bjarmason
2021-11-12 19:19     ` Junio C Hamano
2021-11-13  9:55       ` Ævar Arnfjörð Bjarmason
2021-11-10  1:43   ` [PATCH v2 8/8] grep: make "extendedRegexp=true" the same as "patternType=extended" Ævar Arnfjörð Bjarmason
2021-11-12 19:32     ` Junio C Hamano
2021-11-10  2:23   ` [PATCH v2 0/8] grep: simplify & delete code by changing obscure cfg variable behavior Taylor Blau
2021-11-29 14:50   ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
2021-11-29 14:50     ` [PATCH v3 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
2021-11-29 14:50     ` [PATCH v3 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
2022-03-04  8:57       ` Tests in t4202 are aborted early, was: Re: [PATCH v3 2/7] log Fabian Stelzer
2022-03-04 10:05         ` [PATCH] log tests: fix "abort tests early" regression in ff37a60c369 Ævar Arnfjörð Bjarmason
2021-11-29 14:50     ` [PATCH v3 3/7] grep tests: add missing "grep.patternType" config test Ævar Arnfjörð Bjarmason
2021-11-29 21:52       ` Junio C Hamano
2021-12-03  0:48         ` Junio C Hamano
2021-11-29 14:50     ` [PATCH v3 4/7] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
2021-11-29 14:50     ` [PATCH v3 5/7] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
2021-11-29 14:50     ` [PATCH v3 6/7] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
2021-11-29 14:50     ` [PATCH v3 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
2021-11-29 21:06     ` [PATCH v3 0/7] grep: simplify & delete "init" & "config" code Junio C Hamano
2021-11-29 21:36     ` Junio C Hamano
2021-12-03 10:19   ` [PATCH v4 " Ævar Arnfjörð Bjarmason
2021-12-03 10:19     ` [PATCH v4 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
2021-12-03 10:19     ` [PATCH v4 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
2021-12-03 10:19     ` [PATCH v4 3/7] grep tests: add missing "grep.patternType" config test Ævar Arnfjörð Bjarmason
2021-12-03 10:19     ` [PATCH v4 4/7] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
2021-12-03 10:19     ` [PATCH v4 5/7] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
2021-12-03 10:19     ` [PATCH v4 6/7] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
2021-12-03 10:19     ` [PATCH v4 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
2021-12-22  2:58     ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
2021-12-22  2:58       ` [PATCH v5 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
2021-12-22  2:58       ` [PATCH v5 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
2021-12-22  2:58       ` [PATCH v5 3/7] grep tests: add missing "grep.patternType" config test Ævar Arnfjörð Bjarmason
2021-12-23 22:25         ` Junio C Hamano
2021-12-25  0:06           ` Re* " Junio C Hamano
2021-12-25  0:19             ` [RFC/PATCH] grep: allow scripts to ignore configured pattern type Junio C Hamano
2021-12-26 23:09               ` Ævar Arnfjörð Bjarmason
2021-12-25  1:04             ` Re* [PATCH v5 3/7] grep tests: add missing "grep.patternType" config test Junio C Hamano
2021-12-22  2:58       ` [PATCH v5 4/7] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
2021-12-22  2:58       ` [PATCH v5 5/7] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
2021-12-22  2:58       ` [PATCH v5 6/7] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
2021-12-22  2:58       ` [PATCH v5 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
2021-12-23 22:37         ` Junio C Hamano
2021-12-23  0:30       ` [PATCH v5 0/7] grep: simplify & delete "init" & "config" code Junio C Hamano
2021-12-26 22:37       ` [PATCH v6 " Ævar Arnfjörð Bjarmason
2021-12-26 22:37         ` [PATCH v6 1/7] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
2021-12-26 22:37         ` [PATCH v6 2/7] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
2021-12-26 22:37         ` [PATCH v6 3/7] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
2021-12-26 22:37         ` [PATCH v6 4/7] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
2021-12-26 22:37         ` [PATCH v6 5/7] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
2021-12-26 22:37         ` [PATCH v6 6/7] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
2021-12-26 22:37         ` [PATCH v6 7/7] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
2021-12-27  6:06           ` Junio C Hamano
2021-12-27 18:51             ` Junio C Hamano
2021-12-28  1:07         ` [PATCH v7 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
2021-12-28  1:07           ` [PATCH v7 01/10] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
2021-12-28  1:07           ` [PATCH v7 02/10] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
2021-12-28  1:07           ` [PATCH v7 03/10] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
2021-12-28  1:07           ` [PATCH v7 04/10] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
2021-12-28  1:07           ` [PATCH v7 05/10] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
2021-12-28  1:07           ` [PATCH v7 06/10] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
2021-12-28  1:07           ` [PATCH v7 07/10] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
2021-12-28  1:07           ` [PATCH v7 08/10] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
2021-12-28  1:07           ` [PATCH v7 09/10] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
2021-12-28  1:07           ` [PATCH v7 10/10] grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED Ævar Arnfjörð Bjarmason
2022-01-18 15:55           ` [PATCH v8 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
2022-01-18 15:55             ` [PATCH v8 01/10] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
2022-01-18 15:55             ` [PATCH v8 02/10] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
2022-01-18 15:55             ` [PATCH v8 03/10] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
2022-01-18 15:55             ` [PATCH v8 04/10] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
2022-01-18 15:55             ` [PATCH v8 05/10] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
2022-01-18 15:55             ` [PATCH v8 06/10] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
2022-01-18 15:55             ` [PATCH v8 07/10] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
2022-01-18 15:55             ` [PATCH v8 08/10] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
2022-01-18 15:55             ` [PATCH v8 09/10] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
2022-01-18 22:50               ` Junio C Hamano
2022-01-18 22:55                 ` Junio C Hamano
2022-01-19  0:17                 ` Ævar Arnfjörð Bjarmason
2022-01-19  1:09                   ` Junio C Hamano
2022-01-19  1:15                     ` Ævar Arnfjörð Bjarmason
2022-01-18 15:55             ` [PATCH v8 10/10] grep.[ch]: remove GREP_PATTERN_TYPE_UNSPECIFIED Ævar Arnfjörð Bjarmason
2022-01-27 11:56             ` [PATCH v9 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
2022-01-27 11:56               ` [PATCH v9 1/9] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
2022-01-27 11:56               ` [PATCH v9 2/9] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
2022-01-27 11:56               ` [PATCH v9 3/9] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
2022-01-27 11:56               ` [PATCH v9 4/9] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
2022-01-27 11:56               ` [PATCH v9 5/9] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
2022-01-27 11:56               ` [PATCH v9 6/9] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
2022-01-27 11:56               ` [PATCH v9 7/9] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
2022-01-27 11:56               ` [PATCH v9 8/9] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
2022-01-27 11:56               ` [PATCH v9 9/9] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
2022-01-27 20:30                 ` Junio C Hamano
2022-01-27 21:35                   ` Junio C Hamano
2022-01-27 21:39                     ` Junio C Hamano
2022-02-04 21:20               ` [PATCH v10 0/9] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
2022-02-04 21:20                 ` [PATCH v10 1/9] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
2022-02-04 21:20                 ` [PATCH v10 2/9] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
2022-02-04 21:20                 ` [PATCH v10 3/9] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
2022-02-04 23:03                   ` Junio C Hamano
2022-02-04 23:24                   ` Junio C Hamano
2022-02-04 21:20                 ` [PATCH v10 4/9] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
2022-02-04 21:20                 ` [PATCH v10 5/9] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
2022-02-04 21:20                 ` [PATCH v10 6/9] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
2022-02-04 21:20                 ` [PATCH v10 7/9] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
2022-02-04 21:20                 ` [PATCH v10 8/9] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
2022-02-04 21:20                 ` [PATCH v10 9/9] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
2022-02-04 23:41                   ` Junio C Hamano
2022-02-16  0:00                 ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Ævar Arnfjörð Bjarmason
2022-02-16  0:00                   ` [PATCH v11 01/10] grep.h: remove unused "regex_t regexp" from grep_opt Ævar Arnfjörð Bjarmason
2022-02-16  0:00                   ` [PATCH v11 02/10] log tests: check if grep_config() is called by "log"-like cmds Ævar Arnfjörð Bjarmason
2022-02-16  0:00                   ` [PATCH v11 03/10] grep tests: create a helper function for "BRE" or "ERE" Ævar Arnfjörð Bjarmason
2022-02-16  0:00                   ` [PATCH v11 04/10] grep tests: add missing "grep.patternType" config tests Ævar Arnfjörð Bjarmason
2022-02-16  0:00                   ` [PATCH v11 05/10] built-ins: trust the "prefix" from run_builtin() Ævar Arnfjörð Bjarmason
2022-02-16  0:00                   ` [PATCH v11 06/10] grep.c: don't pass along NULL callback value Ævar Arnfjörð Bjarmason
2022-02-16  0:00                   ` [PATCH v11 07/10] grep API: call grep_config() after grep_init() Ævar Arnfjörð Bjarmason
2022-02-16  0:00                   ` [PATCH v11 08/10] grep.h: make "grep_opt.pattern_type_option" use its enum Ævar Arnfjörð Bjarmason
2022-02-16  0:00                   ` [PATCH v11 09/10] grep.c: do "if (bool && memchr())" not "if (memchr() && bool)" Ævar Arnfjörð Bjarmason
2022-02-16  0:00                   ` [PATCH v11 10/10] grep: simplify config parsing and option parsing Ævar Arnfjörð Bjarmason
2022-02-16  2:20                   ` [PATCH v11 00/10] grep: simplify & delete "init" & "config" code Junio C Hamano
