git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / Atom feed
* [RFC PATCH] grep: allow for run time disabling of JIT in PCRE
@ 2019-07-28 23:54 Carlo Marcelo Arenas Belón
  2019-07-29  0:09 ` Carlo Arenas
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Carlo Marcelo Arenas Belón @ 2019-07-28 23:54 UTC (permalink / raw)
  To: git; +Cc: avarab, sandals

PCRE1 allowed for a compile time flag to disable JIT, but PCRE2 never
had one, forcing the use of JIT if -P was requested.

After ed0479ce3d (Merge branch 'ab/no-kwset' into next, 2019-07-15)
the PCRE2 engine will be used more broadly and therefore adding this
knob will give users a fallback for situations like the one observed
in OpenBSD with a JIT enabled PCRE2, because of W^X restrictions:

  $ git grep 'foo bar'
  fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
  $ git grep -G 'foo bar'
  fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
  $ git grep -E 'foo bar'
  fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
  $ git grep -F 'foo bar'
  fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 Documentation/git-grep.txt |  4 ++++
 grep.c                     | 15 +++++++++++++--
 grep.h                     |  1 +
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index c89fb569e3..ff544bdeec 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -69,6 +69,10 @@ grep.fallbackToNoIndex::
 	If set to true, fall back to git grep --no-index if git grep
 	is executed outside of a git repository.  Defaults to false.
 
+pcre.jit::
+	If set to false, disable JIT when using PCRE.  Defaults to
+	true.
+
 
 OPTIONS
 -------
diff --git a/grep.c b/grep.c
index c7c06ae08d..3524d353dd 100644
--- a/grep.c
+++ b/grep.c
@@ -56,6 +56,7 @@ void init_grep_defaults(struct repository *repo)
 	opt->repo = repo;
 	opt->relative = 1;
 	opt->pathname = 1;
+	opt->pcre_jit = 1;
 	opt->max_depth = -1;
 	opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
 	color_set(opt->colors[GREP_COLOR_CONTEXT], "");
@@ -125,6 +126,12 @@ int grep_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(var, "pcre.jit")) {
+		int is_bool;
+		opt->pcre_jit = git_config_bool_or_int(var, value, &is_bool);
+		return 0;
+	}
+
 	if (!strcmp(var, "color.grep"))
 		opt->color = git_config_colorbool(var, value);
 	if (!strcmp(var, "color.grep.match")) {
@@ -163,6 +170,7 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 
+	opt->pcre_jit = def->pcre_jit;
 	opt->only_matching = def->only_matching;
 	opt->color = def->color;
 	opt->extended_regexp_option = def->extended_regexp_option;
@@ -393,7 +401,8 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
 		die("%s", error);
 
 #ifdef GIT_PCRE1_USE_JIT
-	pcre_config(PCRE_CONFIG_JIT, &p->pcre1_jit_on);
+	if (opt->pcre_jit)
+		pcre_config(PCRE_CONFIG_JIT, &p->pcre1_jit_on);
 #endif
 }
 
@@ -489,7 +498,9 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 		compile_regexp_failed(p, (const char *)&errbuf);
 	}
 
-	pcre2_config(PCRE2_CONFIG_JIT, &p->pcre2_jit_on);
+	if (opt->pcre_jit)
+		pcre2_config(PCRE2_CONFIG_JIT, &p->pcre2_jit_on);
+
 	if (p->pcre2_jit_on) {
 		jitret = pcre2_jit_compile(p->pcre2_pattern, PCRE2_JIT_COMPLETE);
 		if (jitret)
diff --git a/grep.h b/grep.h
index c0c71eb4a9..fff152e606 100644
--- a/grep.h
+++ b/grep.h
@@ -151,6 +151,7 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
+	int pcre_jit;
 	int pcre1;
 	int pcre2;
 	int relative;
-- 
2.22.0


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH] grep: allow for run time disabling of JIT in PCRE
  2019-07-28 23:54 [RFC PATCH] grep: allow for run time disabling of JIT in PCRE Carlo Marcelo Arenas Belón
@ 2019-07-29  0:09 ` Carlo Arenas
  2019-07-29  4:57 ` Junio C Hamano
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Carlo Arenas @ 2019-07-29  0:09 UTC (permalink / raw)
  To: git; +Cc: avarab, sandals

On Sun, Jul 28, 2019 at 4:54 PM Carlo Marcelo Arenas Belón
<carenas@gmail.com> wrote:
> @@ -125,6 +126,12 @@ int grep_config(const char *var, const char *value, void *cb)
>                 return 0;
>         }
>
> +       if (!strcmp(var, "pcre.jit")) {
> +               int is_bool;
> +               opt->pcre_jit = git_config_bool_or_int(var, value, &is_bool);
> +               return 0;
> +       }
> +
>         if (!strcmp(var, "color.grep"))
>                 opt->color = git_config_colorbool(var, value);
>         if (!strcmp(var, "color.grep.match")) {

using git_config_bool_or_int, as I am hoping a future version will use
a third value (maybe -1) to
indicate JIT will be tried first, but then the interpreter will be
used in case JIT is not available (as
recommended in PCRE)

not sure also about the right name and where to document this flag, as
this is not only restricted to
the grep subcommand and the issue it is working around will be also
relevant for log (including pickaxe)

Carlo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH] grep: allow for run time disabling of JIT in PCRE
  2019-07-28 23:54 [RFC PATCH] grep: allow for run time disabling of JIT in PCRE Carlo Marcelo Arenas Belón
  2019-07-29  0:09 ` Carlo Arenas
@ 2019-07-29  4:57 ` Junio C Hamano
  2019-07-29  5:29   ` Carlo Arenas
  2019-07-29  8:55 ` Ævar Arnfjörð Bjarmason
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2019-07-29  4:57 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: git, avarab, sandals

Carlo Marcelo Arenas Belón  <carenas@gmail.com> writes:

> PCRE1 allowed for a compile time flag to disable JIT, but PCRE2 never
> had one, forcing the use of JIT if -P was requested.
>
> After ed0479ce3d (Merge branch 'ab/no-kwset' into next, 2019-07-15)
> the PCRE2 engine will be used more broadly and therefore adding this
> knob will give users a fallback for situations like the one observed
> in OpenBSD with a JIT enabled PCRE2, because of W^X restrictions:
>
>   $ git grep 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>   $ git grep -G 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>   $ git grep -E 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>   $ git grep -F 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'

;-) Yeah, we should have known that security-paranoid distros would
have W^X issues with this series, too.

I am not sure I like a config-only knob like this,
though---shouldn't we have a command line knob to turn jit off
first, and then for those who gets tired of having to type it all
the time add the configuration to flip the default for them?

Other than that, the feature itself makes quite a lot of sense.

>
> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---
>  Documentation/git-grep.txt |  4 ++++
>  grep.c                     | 15 +++++++++++++--
>  grep.h                     |  1 +
>  3 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> index c89fb569e3..ff544bdeec 100644
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -69,6 +69,10 @@ grep.fallbackToNoIndex::
>  	If set to true, fall back to git grep --no-index if git grep
>  	is executed outside of a git repository.  Defaults to false.
>  
> +pcre.jit::
> +	If set to false, disable JIT when using PCRE.  Defaults to
> +	true.
> +
>  
>  OPTIONS
>  -------
> diff --git a/grep.c b/grep.c
> index c7c06ae08d..3524d353dd 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -56,6 +56,7 @@ void init_grep_defaults(struct repository *repo)
>  	opt->repo = repo;
>  	opt->relative = 1;
>  	opt->pathname = 1;
> +	opt->pcre_jit = 1;
>  	opt->max_depth = -1;
>  	opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
>  	color_set(opt->colors[GREP_COLOR_CONTEXT], "");
> @@ -125,6 +126,12 @@ int grep_config(const char *var, const char *value, void *cb)
>  		return 0;
>  	}
>  
> +	if (!strcmp(var, "pcre.jit")) {
> +		int is_bool;
> +		opt->pcre_jit = git_config_bool_or_int(var, value, &is_bool);
> +		return 0;
> +	}
> +
>  	if (!strcmp(var, "color.grep"))
>  		opt->color = git_config_colorbool(var, value);
>  	if (!strcmp(var, "color.grep.match")) {
> @@ -163,6 +170,7 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
>  	opt->pattern_tail = &opt->pattern_list;
>  	opt->header_tail = &opt->header_list;
>  
> +	opt->pcre_jit = def->pcre_jit;
>  	opt->only_matching = def->only_matching;
>  	opt->color = def->color;
>  	opt->extended_regexp_option = def->extended_regexp_option;
> @@ -393,7 +401,8 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
>  		die("%s", error);
>  
>  #ifdef GIT_PCRE1_USE_JIT
> -	pcre_config(PCRE_CONFIG_JIT, &p->pcre1_jit_on);
> +	if (opt->pcre_jit)
> +		pcre_config(PCRE_CONFIG_JIT, &p->pcre1_jit_on);
>  #endif
>  }
>  
> @@ -489,7 +498,9 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
>  		compile_regexp_failed(p, (const char *)&errbuf);
>  	}
>  
> -	pcre2_config(PCRE2_CONFIG_JIT, &p->pcre2_jit_on);
> +	if (opt->pcre_jit)
> +		pcre2_config(PCRE2_CONFIG_JIT, &p->pcre2_jit_on);
> +
>  	if (p->pcre2_jit_on) {
>  		jitret = pcre2_jit_compile(p->pcre2_pattern, PCRE2_JIT_COMPLETE);
>  		if (jitret)
> diff --git a/grep.h b/grep.h
> index c0c71eb4a9..fff152e606 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -151,6 +151,7 @@ struct grep_opt {
>  	int allow_textconv;
>  	int extended;
>  	int use_reflog_filter;
> +	int pcre_jit;
>  	int pcre1;
>  	int pcre2;
>  	int relative;

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH] grep: allow for run time disabling of JIT in PCRE
  2019-07-29  4:57 ` Junio C Hamano
@ 2019-07-29  5:29   ` Carlo Arenas
  0 siblings, 0 replies; 23+ messages in thread
From: Carlo Arenas @ 2019-07-29  5:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, avarab, sandals

On Sun, Jul 28, 2019 at 9:57 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> I am not sure I like a config-only knob like this,
> though---shouldn't we have a command line knob to turn jit off
> first, and then for those who gets tired of having to type it all
> the time add the configuration to flip the default for them?

are you suggesting to add a --pcre-jit parameter to both grep and log?

guess using pcre.jit for the configuration makes sense as far as
it is documented on both man pages, or would this imply this should go
instead into core?

I was expecting this to be set system (or at least global) wide most
of the time like
(ex: core.ignorecase, credential.helper) neither of which I can relate
to a command line parameter.

> Other than that, the feature itself makes quite a lot of sense.

should I target maint/master?, next conflicts because of ab/no-kwset and it also
conflicts with other in fly features too (some not even in pu), which is why now
is based on pu and depends on ab/pcre-jit-fixes

Carlo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH] grep: allow for run time disabling of JIT in PCRE
  2019-07-28 23:54 [RFC PATCH] grep: allow for run time disabling of JIT in PCRE Carlo Marcelo Arenas Belón
  2019-07-29  0:09 ` Carlo Arenas
  2019-07-29  4:57 ` Junio C Hamano
@ 2019-07-29  8:55 ` Ævar Arnfjörð Bjarmason
  2019-07-29 10:26   ` Carlo Arenas
  2019-07-29 10:59 ` [RFC PATCH v2] " Carlo Marcelo Arenas Belón
  2019-07-31 12:24 ` [RFC PATCH] grep: allow for run time disabling of JIT in PCRE Johannes Schindelin
  4 siblings, 1 reply; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-07-29  8:55 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: git, sandals


On Mon, Jul 29 2019, Carlo Marcelo Arenas Belón wrote:

> PCRE1 allowed for a compile time flag to disable JIT, but PCRE2 never
> had one, forcing the use of JIT if -P was requested.

What's that PCRE1 compile-time flag?

> After ed0479ce3d (Merge branch 'ab/no-kwset' into next, 2019-07-15)
> the PCRE2 engine will be used more broadly and therefore adding this
> knob will give users a fallback for situations like the one observed
> in OpenBSD with a JIT enabled PCRE2, because of W^X restrictions:
>
>   $ git grep 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>   $ git grep -G 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>   $ git grep -E 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>   $ git grep -F 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'

Yeah that obviously sucks more with ab/no-kwset, but that seems like a
case where -P would have been completely broken before, and therefore I
can't imagine the package ever passed "make test". Or is W^X also
exposed as some run-time option on OpenBSD?

I.e. aside from the merits of such a setting in general these examples
seem like just working around something that should be fixed at make
all/test time, or maybe I'm missing something.

To the extent that we'd want to make this sort of thing configurable, I
wonder if a continuation of my (*NO_JIT) patch isn't better, i.e. just
adding the ability to configure some string we'd inject at the start of
every pattern.

That would allow for setting any other number of options in
pcre2syntax(3) without us needing to carry config for each one,
e.g. (*LIMIT_HEAP=d), (*LIMIT_DEPTH=d) etc. It does present a larger
foot-gun surface though...

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH] grep: allow for run time disabling of JIT in PCRE
  2019-07-29  8:55 ` Ævar Arnfjörð Bjarmason
@ 2019-07-29 10:26   ` Carlo Arenas
  2019-07-29 12:38     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 23+ messages in thread
From: Carlo Arenas @ 2019-07-29 10:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, sandals

On Mon, Jul 29, 2019 at 1:55 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Mon, Jul 29 2019, Carlo Marcelo Arenas Belón wrote:
>
> > PCRE1 allowed for a compile time flag to disable JIT, but PCRE2 never
> > had one, forcing the use of JIT if -P was requested.
>
> What's that PCRE1 compile-time flag?

NO_LIBPCRE1_JIT at GIT compile time (regardless of JIT support in the
PCRE1 library you are using)

> > After ed0479ce3d (Merge branch 'ab/no-kwset' into next, 2019-07-15)
> > the PCRE2 engine will be used more broadly and therefore adding this
> > knob will give users a fallback for situations like the one observed
> > in OpenBSD with a JIT enabled PCRE2, because of W^X restrictions:
> >
> >   $ git grep 'foo bar'
> >   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
> >   $ git grep -G 'foo bar'
> >   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
> >   $ git grep -E 'foo bar'
> >   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
> >   $ git grep -F 'foo bar'
> >   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>
> Yeah that obviously sucks more with ab/no-kwset, but that seems like a
> case where -P would have been completely broken before, and therefore I
> can't imagine the package ever passed "make test". Or is W^X also
> exposed as some run-time option on OpenBSD?

ironically, you could use PCRE1 since that is not using the JIT fast
path and therefore will fallback automatically to the interpreter

there is also a convoluted way to make your binary work by moving
it into a mount point that has been specially exempted from that W^X
restriction.

> I.e. aside from the merits of such a setting in general these examples
> seem like just working around something that should be fixed at make
> all/test time, or maybe I'm missing something.

1) before you could just avoid using -P and still be able to grep
2) there is no way to tell PCRE2 to get out of the way even if you are
    not using -P

you are right though that this is not a new problem and was reported
before with patches and the last comment saying a configuration
should be provided.

> To the extent that we'd want to make this sort of thing configurable, I
> wonder if a continuation of my (*NO_JIT) patch isn't better, i.e. just
> adding the ability to configure some string we'd inject at the start of
> every pattern.

looking at the number of lines of code, it would seem the configuration
approach is simpler.

> That would allow for setting any other number of options in
> pcre2syntax(3) without us needing to carry config for each one,
> e.g. (*LIMIT_HEAP=d), (*LIMIT_DEPTH=d) etc. It does present a larger
> foot-gun surface though...

the parameters I suspect users might need are not really accessible through
that (ex: jit stacksize).

it is important to note that currently we are not preventing any user to use
those flags themselves in their patterns either.

Carlo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v2] grep: allow for run time disabling of JIT in PCRE
  2019-07-28 23:54 [RFC PATCH] grep: allow for run time disabling of JIT in PCRE Carlo Marcelo Arenas Belón
                   ` (2 preceding siblings ...)
  2019-07-29  8:55 ` Ævar Arnfjörð Bjarmason
@ 2019-07-29 10:59 ` Carlo Marcelo Arenas Belón
  2019-07-29 11:33   ` Carlo Arenas
                     ` (3 more replies)
  2019-07-31 12:24 ` [RFC PATCH] grep: allow for run time disabling of JIT in PCRE Johannes Schindelin
  4 siblings, 4 replies; 23+ messages in thread
From: Carlo Marcelo Arenas Belón @ 2019-07-29 10:59 UTC (permalink / raw)
  To: git; +Cc: avarab, sandals, gitster, dev+git, Johannes.Schindelin

PCRE1 allowed for a compile time flag to disable JIT, but PCRE2 never
had one, forcing the use of JIT if -P was requested.

After ed0479ce3d (Merge branch 'ab/no-kwset' into next, 2019-07-15)
the PCRE2 engine will be used more broadly and therefore adding this
knob will allow users a escape from situations where JIT might be
problematic.

JIT will be used by default but it can be disabled with the --no-pcre-jit
option in `git grep` or by setting 0/false into the pcre.jit config.

If a value of -1 is used instead then the following error is prevented by
using the interpreter when a JIT failure consistent with known security
restrictions is found at regex compilation time.

  $ git grep 'foo bar'
  fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
V2: add command line to grep as suggested by Junio

 Documentation/git-grep.txt | 11 +++++++++++
 builtin/grep.c             |  4 ++++
 grep.c                     | 30 ++++++++++++++++++++++++++----
 grep.h                     |  1 +
 4 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index c89fb569e3..895c6b34ec 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -13,6 +13,7 @@ SYNOPSIS
 	   [-v | --invert-match] [-h|-H] [--full-name]
 	   [-E | --extended-regexp] [-G | --basic-regexp]
 	   [-P | --perl-regexp]
+	   [-j | --[no]-pcre-jit]
 	   [-F | --fixed-strings] [-n | --line-number] [--column]
 	   [-l | --files-with-matches] [-L | --files-without-match]
 	   [(-O | --open-files-in-pager) [<pager>]]
@@ -69,6 +70,12 @@ grep.fallbackToNoIndex::
 	If set to true, fall back to git grep --no-index if git grep
 	is executed outside of a git repository.  Defaults to false.
 
+pcre.jit::
+	If set to false, disable JIT when using PCRE.  Defaults to
+	true.
+	if set to -1 will try first to use JIT and fallback to the
+	interpreter instead of returning an error.
+
 
 OPTIONS
 -------
@@ -175,6 +182,10 @@ providing this option will cause it to die.
 	Use fixed strings for patterns (don't interpret pattern
 	as a regex).
 
+-j::
+--[no-]pcre-jit::
+	Diable JIT in PCRE with --no-pcre-jit.
+
 -n::
 --line-number::
 	Prefix the line number to matching lines.
diff --git a/builtin/grep.c b/builtin/grep.c
index 580fd38f41..b0e94875b2 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -923,6 +923,10 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		OPT_BOOL_F(0, "ext-grep", &external_grep_allowed__ignored,
 			   N_("allow calling of grep(1) (ignored by this build)"),
 			   PARSE_OPT_NOCOMPLETE),
+		OPT_GROUP("PCRE"),
+		OPT_SET_INT('j', "pcre-jit", &opt.pcre_jit,
+			N_("when to use JIT with PCRE"),
+			1),
 		OPT_END()
 	};
 
diff --git a/grep.c b/grep.c
index c7c06ae08d..d58cad0257 100644
--- a/grep.c
+++ b/grep.c
@@ -56,6 +56,7 @@ void init_grep_defaults(struct repository *repo)
 	opt->repo = repo;
 	opt->relative = 1;
 	opt->pathname = 1;
+	opt->pcre_jit = 1;
 	opt->max_depth = -1;
 	opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
 	color_set(opt->colors[GREP_COLOR_CONTEXT], "");
@@ -125,6 +126,12 @@ int grep_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(var, "pcre.jit")) {
+		int is_bool;
+		opt->pcre_jit = git_config_bool_or_int(var, value, &is_bool);
+		return 0;
+	}
+
 	if (!strcmp(var, "color.grep"))
 		opt->color = git_config_colorbool(var, value);
 	if (!strcmp(var, "color.grep.match")) {
@@ -163,6 +170,7 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
 	opt->pattern_tail = &opt->pattern_list;
 	opt->header_tail = &opt->header_list;
 
+	opt->pcre_jit = def->pcre_jit;
 	opt->only_matching = def->only_matching;
 	opt->color = def->color;
 	opt->extended_regexp_option = def->extended_regexp_option;
@@ -393,7 +401,8 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
 		die("%s", error);
 
 #ifdef GIT_PCRE1_USE_JIT
-	pcre_config(PCRE_CONFIG_JIT, &p->pcre1_jit_on);
+	if (opt->pcre_jit)
+		pcre_config(PCRE_CONFIG_JIT, &p->pcre1_jit_on);
 #endif
 }
 
@@ -489,11 +498,24 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 		compile_regexp_failed(p, (const char *)&errbuf);
 	}
 
-	pcre2_config(PCRE2_CONFIG_JIT, &p->pcre2_jit_on);
+	if (opt->pcre_jit)
+		pcre2_config(PCRE2_CONFIG_JIT, &p->pcre2_jit_on);
+
 	if (p->pcre2_jit_on) {
 		jitret = pcre2_jit_compile(p->pcre2_pattern, PCRE2_JIT_COMPLETE);
-		if (jitret)
-			die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", p->pattern, jitret);
+		if (jitret) {
+			if ((opt->pcre_jit < 0) &&
+				jitret == PCRE2_ERROR_NOMEMORY) {
+				/*
+				 * JIT compiler isn't available but we can
+				 * still fallback to the interpreter
+				 */
+				p->pcre2_jit_on = 0;
+				return;
+			}
+			else
+				die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", p->pattern, jitret);
+		}
 
 		/*
 		 * The pcre2_config(PCRE2_CONFIG_JIT, ...) call just
diff --git a/grep.h b/grep.h
index c0c71eb4a9..fff152e606 100644
--- a/grep.h
+++ b/grep.h
@@ -151,6 +151,7 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
+	int pcre_jit;
 	int pcre1;
 	int pcre2;
 	int relative;

base-commit: 870eea81669bfff4333b37b11fedd870cd05fd90
-- 
2.22.0


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2] grep: allow for run time disabling of JIT in PCRE
  2019-07-29 10:59 ` [RFC PATCH v2] " Carlo Marcelo Arenas Belón
@ 2019-07-29 11:33   ` Carlo Arenas
  2019-07-29 15:11   ` René Scharfe
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Carlo Arenas @ 2019-07-29 11:33 UTC (permalink / raw)
  To: git; +Cc: avarab, sandals, gitster, dev+git, Johannes.Schindelin

Known Issues:
* PCRE1 is broken, but fixing it would make more sense on top of the
topic[1] (not in pu)
* it depends on the current ab/pcre-jit-fixes that is missing 1
critical commit in pu
* no tests yet; would need to extend it on top of the debug from Beat
and test-tool changes from Ævar, neither of which are final
* need to build on top of pu and will need further changes to be ready
for next/master/maint

The code has been tested in OpenBSD with PCRE2 (latest from svn, but
any version should work if they are JIT enabled), it is
expected to also work in NetBSD (even with PAX enabled kernels) and
macOS 10.13.6 but haven't yet tested them.  HardenedBSD
will likely segfault unless pcre.jit=0 as described in the original report[2]

Testing with SElinux and PAX enabled for Linux encouraged

Carlo

[1] https://public-inbox.org/git/20190726202642.7986-1-carenas@gmail.com/
[2] https://public-inbox.org/git/20181209230024.43444-1-carenas@gmail.com/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH] grep: allow for run time disabling of JIT in PCRE
  2019-07-29 10:26   ` Carlo Arenas
@ 2019-07-29 12:38     ` Ævar Arnfjörð Bjarmason
  2019-07-30 13:01       ` Carlo Arenas
  0 siblings, 1 reply; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-07-29 12:38 UTC (permalink / raw)
  To: Carlo Arenas; +Cc: git, sandals


On Mon, Jul 29 2019, Carlo Arenas wrote:

> On Mon, Jul 29, 2019 at 1:55 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> On Mon, Jul 29 2019, Carlo Marcelo Arenas Belón wrote:
>>
>> > PCRE1 allowed for a compile time flag to disable JIT, but PCRE2 never
>> > had one, forcing the use of JIT if -P was requested.
>>
>> What's that PCRE1 compile-time flag?
>
> NO_LIBPCRE1_JIT at GIT compile time (regardless of JIT support in the
> PCRE1 library you are using)

Ah of course, I was reading this as "regexp
compile-time". I.e. something like (*NO_JIT). No *such* thing exists for
PCRE v1 JIT AFAIK as exposed by git-grep.

>> > After ed0479ce3d (Merge branch 'ab/no-kwset' into next, 2019-07-15)
>> > the PCRE2 engine will be used more broadly and therefore adding this
>> > knob will give users a fallback for situations like the one observed
>> > in OpenBSD with a JIT enabled PCRE2, because of W^X restrictions:
>> >
>> >   $ git grep 'foo bar'
>> >   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>> >   $ git grep -G 'foo bar'
>> >   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>> >   $ git grep -E 'foo bar'
>> >   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>> >   $ git grep -F 'foo bar'
>> >   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>>
>> Yeah that obviously sucks more with ab/no-kwset, but that seems like a
>> case where -P would have been completely broken before, and therefore I
>> can't imagine the package ever passed "make test". Or is W^X also
>> exposed as some run-time option on OpenBSD?
>
> ironically, you could use PCRE1 since that is not using the JIT fast
> path and therefore will fallback automatically to the interpreter

...because OpenBSD PCRE v1 was compiled with --disable-jit before, but
their v2 package has --enable-jit, it just doesn't work at all? Is this
your custom built git + OpenBSD packages of PCRE coming with the OS?

I don't use OpenBSD, but isn't this their recipe? Seems they use "make
test", and don't compile with PCRE at all if I'm reading it right:
https://github.com/openbsd/ports/blob/master/devel/git/Makefile

> there is also a convoluted way to make your binary work by moving
> it into a mount point that has been specially exempted from that W^X
> restriction.
>
>> I.e. aside from the merits of such a setting in general these examples
>> seem like just working around something that should be fixed at make
>> all/test time, or maybe I'm missing something.
>
> 1) before you could just avoid using -P and still be able to grep
> 2) there is no way to tell PCRE2 to get out of the way even if you are
>     not using -P

Right, no arguments at all about ab/no-kwset making this worse (re: your
#1). I just really prefer not to expose/document config for what
*should* be something purely internal if the X-Y problem is a bug being
exposed that we should just fix.

Particularly because I think it's a losing battle to provide run-time
options for what are surely a *lot* of "make test" failures.

If it really is unavoidable to detect this until runtime in some common
configurations I have no problem with it, I just haven't encountered
that so far.

> you are right though that this is not a new problem and was reported
> before with patches and the last comment saying a configuration
> should be provided.

patches = your recent
https://public-inbox.org/git/20181209230024.43444-2-carenas@gmail.com/
or something earlier?

That patch seems sane without having tested it. Seems like the
equivalent of what we do with v1 with PCRE2_JIT_COMPLETE.

I *am* curious if there's setups where fixing the code for PCRE v1 isn't
purely an academic exercise. Is there a reason for why these platforms
can't just move to PCRE v2 in principle (dumpster fires in "next"
non-withstanding)?

>> To the extent that we'd want to make this sort of thing configurable, I
>> wonder if a continuation of my (*NO_JIT) patch isn't better, i.e. just
>> adding the ability to configure some string we'd inject at the start of
>> every pattern.
>
> looking at the number of lines of code, it would seem the configuration
> approach is simpler.
>
>> That would allow for setting any other number of options in
>> pcre2syntax(3) without us needing to carry config for each one,
>> e.g. (*LIMIT_HEAP=d), (*LIMIT_DEPTH=d) etc. It does present a larger
>> foot-gun surface though...
>
> the parameters I suspect users might need are not really accessible through
> that (ex: jit stacksize).
>
> it is important to note that currently we are not preventing any user to use
> those flags themselves in their patterns either.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2] grep: allow for run time disabling of JIT in PCRE
  2019-07-29 10:59 ` [RFC PATCH v2] " Carlo Marcelo Arenas Belón
  2019-07-29 11:33   ` Carlo Arenas
@ 2019-07-29 15:11   ` René Scharfe
  2019-07-29 17:47     ` Junio C Hamano
  2019-07-31 12:32   ` Johannes Schindelin
  2019-08-04  3:14   ` [RFC PATCH v3] grep: treat PCRE2 jit compilation memory error as non fatal Carlo Marcelo Arenas Belón
  3 siblings, 1 reply; 23+ messages in thread
From: René Scharfe @ 2019-07-29 15:11 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón, git
  Cc: avarab, sandals, gitster, dev+git, Johannes.Schindelin

Am 29.07.19 um 12:59 schrieb Carlo Marcelo Arenas Belón:
> PCRE1 allowed for a compile time flag to disable JIT, but PCRE2 never
> had one, forcing the use of JIT if -P was requested.
>
> After ed0479ce3d (Merge branch 'ab/no-kwset' into next, 2019-07-15)
> the PCRE2 engine will be used more broadly and therefore adding this
> knob will allow users a escape from situations where JIT might be
> problematic.
>
> JIT will be used by default but it can be disabled with the --no-pcre-jit
> option in `git grep` or by setting 0/false into the pcre.jit config.
>
> If a value of -1 is used instead then the following error is prevented by
> using the interpreter when a JIT failure consistent with known security
> restrictions is found at regex compilation time.
>
>   $ git grep 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>
> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---
> V2: add command line to grep as suggested by Junio
>
>  Documentation/git-grep.txt | 11 +++++++++++
>  builtin/grep.c             |  4 ++++
>  grep.c                     | 30 ++++++++++++++++++++++++++----
>  grep.h                     |  1 +
>  4 files changed, 42 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> index c89fb569e3..895c6b34ec 100644
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -13,6 +13,7 @@ SYNOPSIS
>  	   [-v | --invert-match] [-h|-H] [--full-name]
>  	   [-E | --extended-regexp] [-G | --basic-regexp]
>  	   [-P | --perl-regexp]
> +	   [-j | --[no]-pcre-jit]

Do users care?  Enough so to add a short option for this?

>  	   [-F | --fixed-strings] [-n | --line-number] [--column]
>  	   [-l | --files-with-matches] [-L | --files-without-match]
>  	   [(-O | --open-files-in-pager) [<pager>]]
> @@ -69,6 +70,12 @@ grep.fallbackToNoIndex::
>  	If set to true, fall back to git grep --no-index if git grep
>  	is executed outside of a git repository.  Defaults to false.
>
> +pcre.jit::
> +	If set to false, disable JIT when using PCRE.  Defaults to
> +	true.
> +	if set to -1 will try first to use JIT and fallback to the
> +	interpreter instead of returning an error.

Why not implement only -1, without adding this config setting?

> +
>
>  OPTIONS
>  -------
> @@ -175,6 +182,10 @@ providing this option will cause it to die.
>  	Use fixed strings for patterns (don't interpret pattern
>  	as a regex).
>
> +-j::
> +--[no-]pcre-jit::
> +	Diable JIT in PCRE with --no-pcre-jit.

"Disable".

René

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2] grep: allow for run time disabling of JIT in PCRE
  2019-07-29 15:11   ` René Scharfe
@ 2019-07-29 17:47     ` Junio C Hamano
  2019-07-30  0:49       ` Carlo Arenas
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2019-07-29 17:47 UTC (permalink / raw)
  To: René Scharfe
  Cc: Carlo Marcelo Arenas Belón, git, avarab, sandals, dev+git,
	Johannes.Schindelin

René Scharfe <l.s.r@web.de> writes:

>> +pcre.jit::
>> +	If set to false, disable JIT when using PCRE.  Defaults to
>> +	true.
>> +	if set to -1 will try first to use JIT and fallback to the
>> +	interpreter instead of returning an error.
>
> Why not implement only -1, without adding this config setting?

... nor command line option.  If we have an auto-fallback, I would
think that makes the most sense.  IIRC the first iteration with only
the configuration was really about working around the (non-working)
pcre-jit---if we can self-detect and skip a non-working case, that
would allow us to drop end-user facing knobs, which is ideal.

Thanks for a doze of sanity.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2] grep: allow for run time disabling of JIT in PCRE
  2019-07-29 17:47     ` Junio C Hamano
@ 2019-07-30  0:49       ` Carlo Arenas
  2019-07-30 17:55         ` René Scharfe
  2019-07-31 12:36         ` Johannes Schindelin
  0 siblings, 2 replies; 23+ messages in thread
From: Carlo Arenas @ 2019-07-30  0:49 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: René Scharfe, git, avarab, sandals, dev+git, Johannes.Schindelin

On Mon, Jul 29, 2019 at 10:47 AM Junio C Hamano <gitster@pobox.com> wrote:
> René Scharfe <l.s.r@web.de> writes:
> >> +pcre.jit::
> >> +    If set to false, disable JIT when using PCRE.  Defaults to
> >> +    true.
> >> +    if set to -1 will try first to use JIT and fallback to the
> >> +    interpreter instead of returning an error.
> >
> > Why not implement only -1, without adding this config setting?
>
> ... nor command line option.  If we have an auto-fallback, I would
> think that makes the most sense.  IIRC the first iteration with only
> the configuration was really about working around the (non-working)
> pcre-jit---if we can self-detect and skip a non-working case, that
> would allow us to drop end-user facing knobs, which is ideal.

because that was proposed earlier[1] and wasn't accepted ;)

the main pushback though I got was that doing the fallback would degrade
performance and so it was suggested[2] that keeping the error should be
possible somehow (with the implication it will add yet another macro)

since living without grep -P was a reasonable tradeoff at that time got
punted, but the need to find a solution for this become more urgent once
it was announced[3] PCRE2 would be used also used outside -P

> Thanks for a doze of sanity.

Obviously I am biased, but I kind of like the knob as it allows the user
more flexibility to tweak the internals of grep and because we had
made those internals already visible (ex: not handling any library
errors ourselves and just aborting with a pcre error), but without any
flexibility to fix those problems themselves (unless they open the code
and rebuild, in most cases)

the comment from the user that reported[4] a regression with GNU grep
because of JIT stack size and that I quote below is representative of how
that layering violation affect users, and while git users are more likely
than grep users to do the code tweaking needed, they could use some
help.

"As using the JIT can not be turned off at runtime, nor can the stacksize
be controlled without patching + recompiling, this breaks previously
working expressions for me, so I consider this a new regression,
introduced with b06f7a29a58bbdd5866afc1e92dba3fdc9e2ed59 .

I tested that increasing the stack-size to 1 M fixes the problem for me.
A better fix could maybe consist of a better error message, allowing
stack-size control at runtime and / or making JIT optional at runtime."

making JIT optional at runtime is therefore the title of this patch and as
I mentioned in some other thread it might be even useful to us for our
own tests.

Carlo

[1] https://public-inbox.org/git/20181209230024.43444-3-carenas@gmail.com/
[2] https://public-inbox.org/git/87zhtbn5xb.fsf@evledraar.gmail.com/
[3] https://public-inbox.org/git/CAPUEspjKxQFiRgmfb2SuR_xpVu4=MN66kGEeBK1pHdBgXQbv7Q@mail.gmail.com/
[4] https://www.mail-archive.com/bug-grep@gnu.org/msg05762.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH] grep: allow for run time disabling of JIT in PCRE
  2019-07-29 12:38     ` Ævar Arnfjörð Bjarmason
@ 2019-07-30 13:01       ` Carlo Arenas
  0 siblings, 0 replies; 23+ messages in thread
From: Carlo Arenas @ 2019-07-30 13:01 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, sandals

On Mon, Jul 29, 2019 at 5:38 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Mon, Jul 29 2019, Carlo Arenas wrote:
> > On Mon, Jul 29, 2019 at 1:55 AM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >>
> >> On Mon, Jul 29 2019, Carlo Marcelo Arenas Belón wrote:
> >>
> >> > PCRE1 allowed for a compile time flag to disable JIT, but PCRE2 never
> >> > had one, forcing the use of JIT if -P was requested.
> >>
> >> What's that PCRE1 compile-time flag?
> >
> > NO_LIBPCRE1_JIT at GIT compile time (regardless of JIT support in the
> > PCRE1 library you are using)
>
> Ah of course, I was reading this as "regexp
> compile-time". I.e. something like (*NO_JIT). No *such* thing exists for
> PCRE v1 JIT AFAIK as exposed by git-grep.

correct, but there are still other knobs like (*UTF), (*UCP), (?m) or
(?i) that also
affect some of our assumptions.

> >> > After ed0479ce3d (Merge branch 'ab/no-kwset' into next, 2019-07-15)
> >> > the PCRE2 engine will be used more broadly and therefore adding this
> >> > knob will give users a fallback for situations like the one observed
> >> > in OpenBSD with a JIT enabled PCRE2, because of W^X restrictions:
> >> >
> >> >   $ git grep 'foo bar'
> >> >   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
> >> >   $ git grep -G 'foo bar'
> >> >   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
> >> >   $ git grep -E 'foo bar'
> >> >   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
> >> >   $ git grep -F 'foo bar'
> >> >   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
> >>
> >> Yeah that obviously sucks more with ab/no-kwset, but that seems like a
> >> case where -P would have been completely broken before, and therefore I
> >> can't imagine the package ever passed "make test". Or is W^X also
> >> exposed as some run-time option on OpenBSD?
> >
> > ironically, you could use PCRE1 since that is not using the JIT fast
> > path and therefore will fallback automatically to the interpreter
>
> ...because OpenBSD PCRE v1 was compiled with --disable-jit before, but
> their v2 package has --enable-jit, it just doesn't work at all? Is this
> your custom built git + OpenBSD packages of PCRE coming with the OS?

sorry for the confusion, custom builds of both PCRE and git.
and I was referring to the fact that after 685668faaa (grep: stop
using a custom JIT stack with PCRE v1, 2019-07-26)
and with my patches to avoid UTF-8 issues, git + PCRE1 was a much pleasant
experience than git + PCRE2 in OpenBSD (which is also why I didn't
even care about fixing it for pcre.jit=!1 yet)

as shown by :
$ git grep 'foo bar' Documentation | cat
Documentation/RelNotes/1.8.0.txt:   when the user says "git checkout
-b -t foo bar" (e.g. "-t" is not a
Documentation/git-rev-list.txt: $ git rev-list foo bar ^baz
Documentation/gitattributes.txt:abc foo bar baz
$ git grep -P 'foo bar' Documentation | cat
Documentation/RelNotes/1.8.0.txt:   when the user says "git checkout
-b -t foo bar" (e.g. "-t" is not a
Documentation/git-rev-list.txt: $ git rev-list foo bar ^baz
Documentation/gitattributes.txt:abc foo bar baz
$ git grep -P 'foo[ ]bar' Documentation | cat
Documentation/RelNotes/1.8.0.txt:   when the user says "git checkout
-b -t foo bar" (e.g. "-t" is not a
Documentation/git-rev-list.txt: $ git rev-list foo bar ^baz
Documentation/gitattributes.txt:abc foo bar baz
$ dmesg | grep git
git-grep(87484): mmap W^X violation

the last of which might be suppressed with `-c pcre.jit=0` once I get
to fix that

> I don't use OpenBSD, but isn't this their recipe? Seems they use "make
> test", and don't compile with PCRE at all if I'm reading it right:
> https://github.com/openbsd/ports/blob/master/devel/git/Makefile

yes, I was using OpenBSD as a testbase where issues with JIT would
be easily reproducible; their packagers are smart enough not to enable
JIT by default in PCRE or even link it with git as you pointed out.

NetBSD/HardenedBSD might be a better example of a native default package that
would be affected in its standard configuration, if that is what you
were looking for.

> >> I.e. aside from the merits of such a setting in general these examples
> >> seem like just working around something that should be fixed at make
> >> all/test time, or maybe I'm missing something.
> >
> > 1) before you could just avoid using -P and still be able to grep
> > 2) there is no way to tell PCRE2 to get out of the way even if you are
> >     not using -P
>
> Right, no arguments at all about ab/no-kwset making this worse (re: your
> #1). I just really prefer not to expose/document config for what
> *should* be something purely internal if the X-Y problem is a bug being
> exposed that we should just fix.
>
> Particularly because I think it's a losing battle to provide run-time
> options for what are surely a *lot* of "make test" failures.

not sure I understand.  The knob was there to give flexibility to the user
to decide for himself how he wants to use the application and how that
use fits in their environment.

we can't predict either of those with 100% certainty and while it makes
sense we will impose some constrains we should understand that the
more inflexible those are, the more users will be alienated.

this specific constrain is particularly silly, even PCRE recommends using
JIT only as an optimization and fallback to the interpreter but we don't
follow that advice and the least we could do is give a escape hatch to
the users.

> If it really is unavoidable to detect this until runtime in some common
> configurations I have no problem with it, I just haven't encountered
> that so far.

guess it is sort of a chicken/egg problem and we would rather have
OpenBSD never linking their git with PCRE.

FWIW we can't know ahead of time if someone setup will include
running a PAX/SELinux enabled kernel on their otherwise regular
userspace encountering this problem.

there is also the case of Linux distributions without official packages
(like Gentoo and Arch) where each user decides which options they
want to use on their packages at their own time.

> patches = your recent
> https://public-inbox.org/git/20181209230024.43444-2-carenas@gmail.com/
> or something earlier?
>
> That patch seems sane without having tested it. Seems like the
> equivalent of what we do with v1 with PCRE2_JIT_COMPLETE.

I am missing context here; that patch was obsoleted by your ab/pcre-jit-fixes
branch and has nothing to do with PCRE2_JIT_COMPLETE.

if I recall correctly, was waiting on feedback on the series on top of your
original pcre-jit-fixes branch (with only 3 patches) and that was posted in:
https://public-inbox.org/git/20190726202642.7986-1-carenas@gmail.com/T/#u

> I *am* curious if there's setups where fixing the code for PCRE v1 isn't
> purely an academic exercise. Is there a reason for why these platforms
> can't just move to PCRE v2 in principle (dumpster fires in "next"
> non-withstanding)?

how can you expect me (or anyone else) to answer that question? obviously
we don't know, I personally for sure have no problem.

brian mentioned[1] some CentOS 6 users that don't have PCRE2 in their systems,
I mentioned Xcode's git in macOS as likely not updating since it uses
a system library
and that is also used by several other applications (like Safari)

but regardless of that, I see no reason for making their life more
difficult; PCRE1 is
widely used, it is still being supported, and they can use PCRE2 as an
alternative
most of the time (at least with git)

Carlo

[1] https://public-inbox.org/git/20190615191514.GD8616@genre.crustytoothpaste.net/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2] grep: allow for run time disabling of JIT in PCRE
  2019-07-30  0:49       ` Carlo Arenas
@ 2019-07-30 17:55         ` René Scharfe
  2019-07-31 12:36         ` Johannes Schindelin
  1 sibling, 0 replies; 23+ messages in thread
From: René Scharfe @ 2019-07-30 17:55 UTC (permalink / raw)
  To: Carlo Arenas, Junio C Hamano
  Cc: git, avarab, sandals, dev+git, Johannes.Schindelin

Am 30.07.19 um 02:49 schrieb Carlo Arenas:
> On Mon, Jul 29, 2019 at 10:47 AM Junio C Hamano <gitster@pobox.com> wrote:
>> René Scharfe <l.s.r@web.de> writes:
>>>> +pcre.jit::
>>>> +    If set to false, disable JIT when using PCRE.  Defaults to
>>>> +    true.
>>>> +    if set to -1 will try first to use JIT and fallback to the
>>>> +    interpreter instead of returning an error.
>>>
>>> Why not implement only -1, without adding this config setting?
>>
>> ... nor command line option.  If we have an auto-fallback, I would
>> think that makes the most sense.  IIRC the first iteration with only
>> the configuration was really about working around the (non-working)
>> pcre-jit---if we can self-detect and skip a non-working case, that
>> would allow us to drop end-user facing knobs, which is ideal.
>
> because that was proposed earlier[1] and wasn't accepted ;)

So you add all those knobs for Ævar?

Users on OpenBSD would get an error whenever they used -P?  And they
are expected to discover an option for turning off said error?  That
could be handled automatically?

This sounds like bullying to me.  I simply wouldn't use -P anymore if
that happened to me.  If that error was returned without -P, I'd
contemplate switching to the Silver Searcher or a similar tool.

> the main pushback though I got was that doing the fallback would degrade
> performance and so it was suggested[2] that keeping the error should be
> possible somehow (with the implication it will add yet another macro)

The slowdown by the fallback should be minimal -- just the extra
wasted time to compile the pattern to machine code.  Compilation time is
probably (hopefully?) dwarfed by search time.

IIUC, Ævar's concern was more about being able to discover when JIT is
not available for some reason.  Printing a warning with --debug or
--verbose could help with that.

Personally I don't care about JIT at all and wouldn't want to see such a
warning and certainly prefer not to get any error message about it,
even more so since there is nothing I can do about it.  (Disabling
security features to get faster search sounds sounds like a no-go.)

> since living without grep -P was a reasonable tradeoff at that time got
> punted, but the need to find a solution for this become more urgent once
> it was announced[3] PCRE2 would be used also used outside -P

Right.

> Obviously I am biased, but I kind of like the knob as it allows the user
> more flexibility to tweak the internals of grep and because we had
> made those internals already visible (ex: not handling any library
> errors ourselves and just aborting with a pcre error), but without any
> flexibility to fix those problems themselves (unless they open the code
> and rebuild, in most cases)
>
> the comment from the user that reported[4] a regression with GNU grep
> because of JIT stack size and that I quote below is representative of how
> that layering violation affect users, and while git users are more likely
> than grep users to do the code tweaking needed, they could use some
> help.

So JIT can cause other problems, and some of them we don't (or can't)
handle in our code?  Turning on an unstable feature without a way to
disable it sounds like a bad idea indeed.  Is it really that bad?  Can
we grow the stack on demand, for example, as GNU grep does now in
response to the bug you mentioned (http://bugs.gnu.org/19833)?  Are
there more such examples?

> making JIT optional at runtime is therefore the title of this patch and as
> I mentioned in some other thread it might be even useful to us for our
> own tests.

Testability is a valid concern, especially if the JIT code is
considered, well, unfinished.

René

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH] grep: allow for run time disabling of JIT in PCRE
  2019-07-28 23:54 [RFC PATCH] grep: allow for run time disabling of JIT in PCRE Carlo Marcelo Arenas Belón
                   ` (3 preceding siblings ...)
  2019-07-29 10:59 ` [RFC PATCH v2] " Carlo Marcelo Arenas Belón
@ 2019-07-31 12:24 ` Johannes Schindelin
  4 siblings, 0 replies; 23+ messages in thread
From: Johannes Schindelin @ 2019-07-31 12:24 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: git, avarab, sandals

[-- Attachment #1: Type: text/plain, Size: 4359 bytes --]

Hi,

On Sun, 28 Jul 2019, Carlo Marcelo Arenas Belón wrote:

> PCRE1 allowed for a compile time flag to disable JIT, but PCRE2 never
> had one, forcing the use of JIT if -P was requested.
>
> After ed0479ce3d (Merge branch 'ab/no-kwset' into next, 2019-07-15)
> the PCRE2 engine will be used more broadly and therefore adding this
> knob will give users a fallback for situations like the one observed
> in OpenBSD with a JIT enabled PCRE2, because of W^X restrictions:

Just so that nobody else needs to feel like an idiot for not knowing
what on the burning planet "W^X" is supposed to mean:
https://en.wikipedia.org/wiki/W%5EX says it is essentially an OS-level
mechanism to prevent executing code that was written into memory by the
same process, i.e. JIT.

Makes me wonder whether node.js works on OpenBSD, or any decently fast
web browser like Firefox or Chromium...

But I digress. I just wanted to chime in with the results of my web hunt
for that "W^X" term.

Ciao,
Dscho

>
>   $ git grep 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>   $ git grep -G 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>   $ git grep -E 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>   $ git grep -F 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>
> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---
>  Documentation/git-grep.txt |  4 ++++
>  grep.c                     | 15 +++++++++++++--
>  grep.h                     |  1 +
>  3 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> index c89fb569e3..ff544bdeec 100644
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -69,6 +69,10 @@ grep.fallbackToNoIndex::
>  	If set to true, fall back to git grep --no-index if git grep
>  	is executed outside of a git repository.  Defaults to false.
>
> +pcre.jit::
> +	If set to false, disable JIT when using PCRE.  Defaults to
> +	true.
> +
>
>  OPTIONS
>  -------
> diff --git a/grep.c b/grep.c
> index c7c06ae08d..3524d353dd 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -56,6 +56,7 @@ void init_grep_defaults(struct repository *repo)
>  	opt->repo = repo;
>  	opt->relative = 1;
>  	opt->pathname = 1;
> +	opt->pcre_jit = 1;
>  	opt->max_depth = -1;
>  	opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
>  	color_set(opt->colors[GREP_COLOR_CONTEXT], "");
> @@ -125,6 +126,12 @@ int grep_config(const char *var, const char *value, void *cb)
>  		return 0;
>  	}
>
> +	if (!strcmp(var, "pcre.jit")) {
> +		int is_bool;
> +		opt->pcre_jit = git_config_bool_or_int(var, value, &is_bool);
> +		return 0;
> +	}
> +
>  	if (!strcmp(var, "color.grep"))
>  		opt->color = git_config_colorbool(var, value);
>  	if (!strcmp(var, "color.grep.match")) {
> @@ -163,6 +170,7 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
>  	opt->pattern_tail = &opt->pattern_list;
>  	opt->header_tail = &opt->header_list;
>
> +	opt->pcre_jit = def->pcre_jit;
>  	opt->only_matching = def->only_matching;
>  	opt->color = def->color;
>  	opt->extended_regexp_option = def->extended_regexp_option;
> @@ -393,7 +401,8 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
>  		die("%s", error);
>
>  #ifdef GIT_PCRE1_USE_JIT
> -	pcre_config(PCRE_CONFIG_JIT, &p->pcre1_jit_on);
> +	if (opt->pcre_jit)
> +		pcre_config(PCRE_CONFIG_JIT, &p->pcre1_jit_on);
>  #endif
>  }
>
> @@ -489,7 +498,9 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
>  		compile_regexp_failed(p, (const char *)&errbuf);
>  	}
>
> -	pcre2_config(PCRE2_CONFIG_JIT, &p->pcre2_jit_on);
> +	if (opt->pcre_jit)
> +		pcre2_config(PCRE2_CONFIG_JIT, &p->pcre2_jit_on);
> +
>  	if (p->pcre2_jit_on) {
>  		jitret = pcre2_jit_compile(p->pcre2_pattern, PCRE2_JIT_COMPLETE);
>  		if (jitret)
> diff --git a/grep.h b/grep.h
> index c0c71eb4a9..fff152e606 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -151,6 +151,7 @@ struct grep_opt {
>  	int allow_textconv;
>  	int extended;
>  	int use_reflog_filter;
> +	int pcre_jit;
>  	int pcre1;
>  	int pcre2;
>  	int relative;
> --
> 2.22.0
>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2] grep: allow for run time disabling of JIT in PCRE
  2019-07-29 10:59 ` [RFC PATCH v2] " Carlo Marcelo Arenas Belón
  2019-07-29 11:33   ` Carlo Arenas
  2019-07-29 15:11   ` René Scharfe
@ 2019-07-31 12:32   ` Johannes Schindelin
  2019-07-31 14:57     ` Ævar Arnfjörð Bjarmason
  2019-08-04  3:14   ` [RFC PATCH v3] grep: treat PCRE2 jit compilation memory error as non fatal Carlo Marcelo Arenas Belón
  3 siblings, 1 reply; 23+ messages in thread
From: Johannes Schindelin @ 2019-07-31 12:32 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: git, avarab, sandals, gitster, dev+git

[-- Attachment #1: Type: text/plain, Size: 1382 bytes --]

Hi,

On Mon, 29 Jul 2019, Carlo Marcelo Arenas Belón wrote:

>   $ git grep 'foo bar'
>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'

My immediate reaction to this error message was: That's not helpful.
What is `-48` supposed to mean? Why do we even think it sensible to
throw such an error message at the end user? Can't we do a much better
job translating that into something that makes actual sense without
knowing implementation details?

But then, I realized that -48 must be a well-known constant in PCRE2,
and my reaction transformed into something much more hopeful: why don't
we detect the situation where the JIT'ed code was not actually
executable [*1*], and fall back to the non-JIT'ed code path ourselves,
without troubling the end user (maybe warning, but maybe better not lest
we annoy the user with something pointless)?

Even after finding out that -48 disappointingly means
PCRE2_ERROR_NOMEMORY (as opposed to something like
PCRE2_ERROR_CANNOT_EXECUTE_JIT_CODE), I like the idea of not bothering
end users and doing the sensible fallback under the hood.

Ciao,
Dscho

Footnote *1*: Why anybody would think it sensible to build a PCRE2 with
JIT on an OS that does not allow executing code that was written by the
same process is beyond me. Or is there a mode in OpenBSD that *does*
allow JIT'ed code to be executed?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2] grep: allow for run time disabling of JIT in PCRE
  2019-07-30  0:49       ` Carlo Arenas
  2019-07-30 17:55         ` René Scharfe
@ 2019-07-31 12:36         ` Johannes Schindelin
  2019-07-31 16:18           ` Junio C Hamano
  1 sibling, 1 reply; 23+ messages in thread
From: Johannes Schindelin @ 2019-07-31 12:36 UTC (permalink / raw)
  To: Carlo Arenas
  Cc: Junio C Hamano, René Scharfe, git, avarab, sandals, dev+git

[-- Attachment #1: Type: text/plain, Size: 1470 bytes --]

Hi Carlo,

On Mon, 29 Jul 2019, Carlo Arenas wrote:

> On Mon, Jul 29, 2019 at 10:47 AM Junio C Hamano <gitster@pobox.com> wrote:
> > René Scharfe <l.s.r@web.de> writes:
> > >> +pcre.jit::
> > >> +    If set to false, disable JIT when using PCRE.  Defaults to
> > >> +    true.
> > >> +    if set to -1 will try first to use JIT and fallback to the
> > >> +    interpreter instead of returning an error.
> > >
> > > Why not implement only -1, without adding this config setting?
> >
> > ... nor command line option.  If we have an auto-fallback, I would
> > think that makes the most sense.  IIRC the first iteration with only
> > the configuration was really about working around the (non-working)
> > pcre-jit---if we can self-detect and skip a non-working case, that
> > would allow us to drop end-user facing knobs, which is ideal.
>
> because that was proposed earlier[1] and wasn't accepted ;)
> [...]
> [1] https://public-inbox.org/git/20181209230024.43444-3-carenas@gmail.com/

For the record, I read
https://public-inbox.org/git/xmqqh8flkgs2.fsf@gitster-ct.c.googlers.com/
as encouraging a slightly more powerful argument in favor. Junio seemed
to hope that PCRE2's own `pcre2grep` would behave that way, and that
would give us plenty reason to just imitate it.

I don't know whether `pcre2grep` behaves that way, even if it does not,
I think the benefits of the auto fallback to the end user are
considerable.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2] grep: allow for run time disabling of JIT in PCRE
  2019-07-31 12:32   ` Johannes Schindelin
@ 2019-07-31 14:57     ` Ævar Arnfjörð Bjarmason
  2019-08-04  0:25       ` Carlo Arenas
  0 siblings, 1 reply; 23+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-07-31 14:57 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Carlo Marcelo Arenas Belón, git, sandals, gitster, dev+git


On Wed, Jul 31 2019, Johannes Schindelin wrote:

> Hi,
>
> On Mon, 29 Jul 2019, Carlo Marcelo Arenas Belón wrote:
>
>>   $ git grep 'foo bar'
>>   fatal: Couldn't JIT the PCRE2 pattern 'foo bar', got '-48'
>
> My immediate reaction to this error message was: That's not helpful.
> What is `-48` supposed to mean? Why do we even think it sensible to
> throw such an error message at the end user? Can't we do a much better
> job translating that into something that makes actual sense without
> knowing implementation details?
>
> But then, I realized that -48 must be a well-known constant in PCRE2,
> and my reaction transformed into something much more hopeful: why don't
> we detect the situation where the JIT'ed code was not actually
> executable [*1*], and fall back to the non-JIT'ed code path ourselves,
> without troubling the end user (maybe warning, but maybe better not lest
> we annoy the user with something pointless)?
>
> Even after finding out that -48 disappointingly means
> PCRE2_ERROR_NOMEMORY (as opposed to something like
> PCRE2_ERROR_CANNOT_EXECUTE_JIT_CODE), I like the idea of not bothering
> end users and doing the sensible fallback under the hood.
>
> Ciao,
> Dscho
>
> Footnote *1*: Why anybody would think it sensible to build a PCRE2 with
> JIT on an OS that does not allow executing code that was written by the
> same process is beyond me. Or is there a mode in OpenBSD that *does*
> allow JIT'ed code to be executed?

We do detect if JIT isn't supported and fall back. That's what the
pcre2_config(PCRE2_CONFIG_JIT, &p->pcre2_jit_on) code in grep.c
does. This and is the subsequent pcre2_pattern_info() call is how PCRE
documents that you should do this.

What hasn't been supported is all of that saying "yes, I support JIT"
and the feature then fail whaling. I had not encountered that before.

So far that seems like because Carlo just built a completely broken PCRE
v2 package, so I don't know if that's worth supporting on our
side. I.e. this isn't something I think could plausibly happen in the
wild.

That should *not* be confused with me thinking other stuff Carlo's
raised is a non-issue, e.g. running into the JIT stack limit etc. Some
of that's clearly bugs in our/my grep.c code that need fixing.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2] grep: allow for run time disabling of JIT in PCRE
  2019-07-31 12:36         ` Johannes Schindelin
@ 2019-07-31 16:18           ` Junio C Hamano
  0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2019-07-31 16:18 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Carlo Arenas, René Scharfe, git, avarab, sandals, dev+git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> For the record, I read
> https://public-inbox.org/git/xmqqh8flkgs2.fsf@gitster-ct.c.googlers.com/
> as encouraging a slightly more powerful argument in favor. Junio seemed
> to hope that PCRE2's own `pcre2grep` would behave that way, and that
> would give us plenty reason to just imitate it.
>
> I don't know whether `pcre2grep` behaves that way, even if it does not,
> I think the benefits of the auto fallback to the end user are
> considerable.

Thanks for digging that up ;-)  I do agree with what was said there.

JIT is merely an optimization, and we should be able to work without
it and should not even bother the users with warning messages when
we have to choose non-JIT codepath.  Those who care about debugging
can use a "--debug" option or something to figure out if their build
on a particular pattern is or is not using JIT.

I think I read somebody (Carlo?) made that argument in the thread
earlier, and I agree with that sentiment.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v2] grep: allow for run time disabling of JIT in PCRE
  2019-07-31 14:57     ` Ævar Arnfjörð Bjarmason
@ 2019-08-04  0:25       ` Carlo Arenas
  0 siblings, 0 replies; 23+ messages in thread
From: Carlo Arenas @ 2019-08-04  0:25 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Johannes Schindelin, git, sandals, gitster, dev+git

On Wed, Jul 31, 2019 at 7:57 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> What hasn't been supported is all of that saying "yes, I support JIT"
> and the feature then fail whaling. I had not encountered that before.
>
> So far that seems like because Carlo just built a completely broken PCRE
> v2 package, so I don't know if that's worth supporting on our
> side. I.e. this isn't something I think could plausibly happen in the
> wild.

since you are in Debian please follow the instructions here:

  https://wiki.debian.org/SELinux/Setup

no need to rebuild git or pcre (but to enable selinux will need to
reboot twice), then type as root the following:

  # set enforce 1

Carlo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH v3] grep: treat PCRE2 jit compilation memory error as non fatal
  2019-07-29 10:59 ` [RFC PATCH v2] " Carlo Marcelo Arenas Belón
                     ` (2 preceding siblings ...)
  2019-07-31 12:32   ` Johannes Schindelin
@ 2019-08-04  3:14   ` Carlo Marcelo Arenas Belón
  2019-08-04  7:43     ` Carlo Arenas
  3 siblings, 1 reply; 23+ messages in thread
From: Carlo Marcelo Arenas Belón @ 2019-08-04  3:14 UTC (permalink / raw)
  To: git; +Cc: Johannes.Schindelin, avarab, gitster, l.s.r

94da9193a6 (grep: add support for PCRE v2, 2017-06-01) uses the
JIT fast path unless JIT support has not been compiled in the
linked library.

Starting from 10.23 of PCRE2, pcre2grep ignores any errors from
pcre2_jit_cpmpile as a workaround for their bug1749[1] and we
should do too, so that the interpreter could be used as a fallback
in cases where JIT was not available because of a security policy.

To be conservative, we are restricting initially the error to the
known error that would be returned in that case (and to be documented
as such in a future release of PCRE) and printing a warning so that
corrective action could be taken.

[1] https://bugs.exim.org/show_bug.cgi?id=1749

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 grep.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index f7c3a5803e..593a1cb7a0 100644
--- a/grep.c
+++ b/grep.c
@@ -525,7 +525,13 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	if (p->pcre2_jit_on == 1) {
 		jitret = pcre2_jit_compile(p->pcre2_pattern, PCRE2_JIT_COMPLETE);
 		if (jitret)
-			die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", p->pattern, jitret);
+			if (jitret == PCRE2_ERROR_NOMEMORY) {
+				warning("JIT couldn't be used in PCRE2");
+				p->pcre2_jit_on = 0;
+				return;
+			}
+			else
+				die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", p->pattern, jitret);
 
 		/*
 		 * The pcre2_config(PCRE2_CONFIG_JIT, ...) call just
-- 
2.23.0.rc1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v3] grep: treat PCRE2 jit compilation memory error as non fatal
  2019-08-04  3:14   ` [RFC PATCH v3] grep: treat PCRE2 jit compilation memory error as non fatal Carlo Marcelo Arenas Belón
@ 2019-08-04  7:43     ` Carlo Arenas
  2019-08-05 20:16       ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Carlo Arenas @ 2019-08-04  7:43 UTC (permalink / raw)
  To: git; +Cc: Johannes.Schindelin, avarab, gitster, l.s.r

PROs:
* it works (only for PCRE2) and tested in OpenBSD, NetBSD, macOS, Linux (Debian)
* it applies everywhere (even pu) without conflicts
* it doesn't introduce any regressions in tests (tested in Debian with
SElinux in enforcing mode)
* it is simple

CONs:
* HardenedBSD still segfaults (bugfix proposed[1] to sljit/pcre)
* warning is noisy (at least once per thread) and might be even
ineffective as it goes to stderr while stdout with most the output
goes to a pager
* too conservative (pcre2grep shows all errors from pcre2_jit_compile
should be ignored)
* no tests

Known Issues:
* code is ugly (it even triggers a warning if you have the right compiler)
* code is suspiciously similar to one[2] that was rejected, but
hopefully commit message is better
* code is incomplete (PCRE1 has too many conflicting changes in flight
to attempt a similar fix)
* there are obvious blind spots in the tests that need fixing, and a
lot more testing in other platforms/architectures
* git still will sometimes die because the non fast path has UTF-8 issues

I still think the pcre.jit flag knob might be useful to workaround
some of the issues detailed in CONs but probably with a different
definition:
unset -> fallback (try JIT but use interpreter if that didn't work)
false -> don't even try to use JIT
true -> print warning and maybe even die (if we really think that is useful)

some performance numbers below for the perl tests

with JIT enabled (in non enforcing SELinux)

Test                                            this tree
---------------------------------------------------------------
7820.3: perl grep 'how.to'                      0.56(0.29+0.60)
7820.7: perl grep '^how to'                     0.49(0.29+0.54)
7820.11: perl grep '[how] to'                   0.54(0.39+0.51)
7820.15: perl grep '(e.t[^ ]*|v.ry) rare'       0.60(0.45+0.58)
7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'          0.58(0.30+0.61)

with "fallback to interpreter" (in enforcing SELinux)

Test                                            this tree
---------------------------------------------------------------
7820.3: perl grep 'how.to'                      0.64(0.59+0.56)
7820.7: perl grep '^how to'                     1.83(2.91+0.56)
7820.11: perl grep '[how] to'                   2.07(3.33+0.61)
7820.15: perl grep '(e.t[^ ]*|v.ry) rare'       2.89(4.91+0.66)
7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'          0.78(0.86+0.55)

[1] https://github.com/zherczeg/sljit/pull/2
[2] https://public-inbox.org/git/20181209230024.43444-3-carenas@gmail.com/

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH v3] grep: treat PCRE2 jit compilation memory error as non fatal
  2019-08-04  7:43     ` Carlo Arenas
@ 2019-08-05 20:16       ` Junio C Hamano
  0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2019-08-05 20:16 UTC (permalink / raw)
  To: Carlo Arenas; +Cc: git, Johannes.Schindelin, avarab, l.s.r

Carlo Arenas <carenas@gmail.com> writes:

> * code is suspiciously similar to one[2] that was rejected, but
> hopefully commit message is better
> ...
> [2] https://public-inbox.org/git/20181209230024.43444-3-carenas@gmail.com/

I do not recall ever rejecting that one.

It did not come with a good proposed log message to be accepted
as-is, so I do not find it surprising that I did not pick it up, was
waiting for a new iteration and then everybody forgot about it.

But that is quite different from getting rejected (with the
connotation that "don't attempt this bad idea again, unless the
world changes drastically").

In any case, this round looks a lot more reasoned.  I personally do
not think the warning() is a good idea.  As I said in the old
discussion, we by default should treat JIT as a mere optimization,
and we should stay out of the way most of the time.

An additional "must have JIT or we will die" [*1*] can be added on
top of this change, if somebody really cares.

Thanks.


[Reference]

*1* https://public-inbox.org/git/87pnu9yekk.fsf@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2019-08-05 20:17 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-28 23:54 [RFC PATCH] grep: allow for run time disabling of JIT in PCRE Carlo Marcelo Arenas Belón
2019-07-29  0:09 ` Carlo Arenas
2019-07-29  4:57 ` Junio C Hamano
2019-07-29  5:29   ` Carlo Arenas
2019-07-29  8:55 ` Ævar Arnfjörð Bjarmason
2019-07-29 10:26   ` Carlo Arenas
2019-07-29 12:38     ` Ævar Arnfjörð Bjarmason
2019-07-30 13:01       ` Carlo Arenas
2019-07-29 10:59 ` [RFC PATCH v2] " Carlo Marcelo Arenas Belón
2019-07-29 11:33   ` Carlo Arenas
2019-07-29 15:11   ` René Scharfe
2019-07-29 17:47     ` Junio C Hamano
2019-07-30  0:49       ` Carlo Arenas
2019-07-30 17:55         ` René Scharfe
2019-07-31 12:36         ` Johannes Schindelin
2019-07-31 16:18           ` Junio C Hamano
2019-07-31 12:32   ` Johannes Schindelin
2019-07-31 14:57     ` Ævar Arnfjörð Bjarmason
2019-08-04  0:25       ` Carlo Arenas
2019-08-04  3:14   ` [RFC PATCH v3] grep: treat PCRE2 jit compilation memory error as non fatal Carlo Marcelo Arenas Belón
2019-08-04  7:43     ` Carlo Arenas
2019-08-05 20:16       ` Junio C Hamano
2019-07-31 12:24 ` [RFC PATCH] grep: allow for run time disabling of JIT in PCRE Johannes Schindelin

git@vger.kernel.org list mirror (unofficial, one of many)

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 git git/ https://public-inbox.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for the project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git